Hello, this is my first post on your forum. I thought hard about which thread to use, and I hope I'm posting in the correct one.
I'm making a program that will interpret partial sentences, whole sentences, and paragraphs, and come up with some sort of reasonable response. Most likely the response will be a question, but I have my hopes up.
I've already got my program to count the words and put each one into a table. Each variable holds an individual word, so I now have a table holding the input text with each word separated.
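A minimal sketch of that word-splitting step in Python, assuming the input is plain English text and that lowercasing and dropping punctuation are acceptable. The function name `split_words` is made up for illustration; the program's actual language and table layout are not stated in this thread.

```python
import re

def split_words(text):
    """Return the words of `text` in order, lowercased, punctuation dropped."""
    # [a-z']+ keeps simple contractions such as "don't" together.
    return re.findall(r"[a-z']+", text.lower())

# One list slot per word, in the order the words appeared.
words = split_words("The quick brown fox jumps over the lazy dog.")
print(len(words), words)
```

A plain list keeps the original word order, which matters if the sentence structure is analysed later.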
Now I want it to return what each word in the sentence is: verb, noun, adjective, etc. I'm pretty sure I need to find some sort of list, preferably a database, so I can search through it, do some other things I would rather not say at the moment, and output the part of speech of each word from the input text.
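To make the lookup idea concrete, here is a rough sketch using Python's built-in `sqlite3` module. The table name `lexicon`, the helper names, and the four sample rows are all hypothetical; a real word list would have to come from an actual data source.

```python
import sqlite3

# Hypothetical, hand-written sample data; a real lexicon would be far larger.
SAMPLE_LEXICON = [
    ("the", "determiner"),
    ("dog", "noun"),
    ("barks", "verb"),
    ("loud", "adjective"),
]

def build_lexicon(rows):
    """Create an in-memory database with one (word, pos) row per entry."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE lexicon (word TEXT NOT NULL, pos TEXT NOT NULL)")
    conn.executemany("INSERT INTO lexicon (word, pos) VALUES (?, ?)", rows)
    conn.commit()
    return conn

def tag_words(conn, words):
    """Pair each word with its part of speech, or "unknown" if it is not listed."""
    tagged = []
    for word in words:
        row = conn.execute(
            "SELECT pos FROM lexicon WHERE word = ?", (word.lower(),)
        ).fetchone()
        tagged.append((word, row[0] if row else "unknown"))
    return tagged

conn = build_lexicon(SAMPLE_LEXICON)
print(tag_words(conn, ["The", "dog", "barks", "twice"]))
```

Many English words belong to more than one part of speech ("run" can be a noun or a verb), so a real table would likely need several rows per word and some rule for choosing between them. This sketch only returns the first match.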
I'm not even sure such a list or database exists. I've tried to search for one, but Google is pointing me in all the wrong directions. Could I get some advice/help from someone here?
Thanks! :D
7 replies to this topic
#1
Posted 29 June 2011 - 02:56 PM
#2
Posted 29 June 2011 - 03:35 PM
You could use the thesaurus files for OpenOffice.
#3
Posted 29 June 2011 - 04:27 PM
Aha!! That is almost perfect! Except... there is a license agreement that restricts redistribution. :(
Any other ideas, anyone?
#4
Posted 29 June 2011 - 05:22 PM
What?
Here's the license text from the Lingucomponent source code in the MyThes~1.zip file. Lingucomponent is the component used by OpenOffice.org to drive its thesaurus, among other features.
/*
 * Copyright 2003 Kevin B. Hendricks, Stratford, Ontario, Canada
 * And Contributors. All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * 3. All modifications to the source code must be clearly marked as
 *    such. Binary redistributions based on modified source code
 *    must be clearly marked as modified versions in the documentation
 *    and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY KEVIN B. HENDRICKS AND CONTRIBUTORS
 * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
 * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
 * KEVIN B. HENDRICKS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
 * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
 * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 *
 */
Where did you read a restrictive license?
#5
Posted 29 June 2011 - 05:48 PM
Hmmm, I must have gone to the wrong location, oopsies xD Where should I get it from?
And this is where I went:
http://wiki.services.openoffice.org/wiki/Dictionaries#English_.28AU.2CCA.2CGB.2CNZ.2CUS.2CZA.29
#6
Posted 29 June 2011 - 07:42 PM
Phew, thanks a lot! That is exactly what I needed. Now I've just gotta figure out how to use it >.< xD
I would much prefer a huge .sql file, if anyone knows whether one exists?
Edited by Kyle Joseph Klouzal, 29 June 2011 - 07:47 PM.
Adding Info
#7
Posted 29 June 2011 - 09:29 PM
I'm not sure there are any out there, at least none I can find that aren't direct conversions from plaintext files. Do you need information on each word, i.e. whether it is an adjective, verb, or noun? The OpenOffice dictionaries you linked to use the Hunspell format, in which each entry is a word followed by its affix flags (word/flags). You could either use the Hunspell library or work out what each affix flag does and use that in your application. The data could also be written to a database in your own format; you would just need to write a script to extract it.
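As a rough illustration of the "write a script to extract it" route, here is a sketch in Python. It assumes the usual Hunspell `.dic` layout (an entry count on the first line, then one `word` or `word/FLAGS` entry per line) and ignores complications such as escaped slashes. The sample entries, the table name `dic_entries`, and the function names are invented for the example. Note that the flags refer to affix rules defined in the companion `.aff` file; as far as I know they do not name a part of speech by themselves.

```python
import sqlite3

# Hypothetical sample in the style of a Hunspell .dic file: the first line is
# an entry count, and each later line is a word optionally followed by /FLAGS.
SAMPLE_DIC = """3
hello
work/DGS
cat/S
"""

def parse_dic(text):
    """Return (word, flags) pairs; flags is "" when an entry has none."""
    entries = []
    for line in text.splitlines()[1:]:  # skip the entry-count line
        line = line.strip()
        if not line:
            continue
        entry = line.split()[0]  # ignore any trailing fields after the entry
        word, _, flags = entry.partition("/")
        entries.append((word, flags))
    return entries

def store_entries(entries):
    """Write the parsed pairs into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dic_entries (word TEXT NOT NULL, flags TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO dic_entries (word, flags) VALUES (?, ?)", entries
    )
    conn.commit()
    return conn

entries = parse_dic(SAMPLE_DIC)
conn = store_entries(entries)
print(conn.execute("SELECT word, flags FROM dic_entries ORDER BY word").fetchall())
```

Passing a file path instead of `":memory:"` to `sqlite3.connect` would keep the table on disk, which gets close to the single importable database file asked about earlier in the thread.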
#8
Posted 30 June 2011 - 11:08 AM
Yes, all I need is a complete list of English words and each word's part of speech. Possibly synonyms and antonyms too. I suppose definitions may come in handy as well?
I'm not entirely sure what I'm going to need. I have a few theories about how it will interpret the input and a general idea of the big picture; I'm just taking it step by step and seeing how that goes. :thumbup1:
Edited by Kyle Joseph Klouzal, 30 June 2011 - 11:10 AM.
Clarification