How to find inflected forms of a word in a large String? [closed]

I have arbitrary text in a String variable. I want to find all inflections of a word that the user specifies.

Example: if the user searches for "assist", it should find every occurrence of "assist", "assists", "assisted", and "assisting" in the String.

Is there a Java library that can detect such inflections automatically in a given String?

Note: I have seen a Java library called WolframAlpha that claims to do this, and here is its web interface, but I cannot get the library to work, and there is no guide for using it.



Solution 1:[1]

First of all, it is not a Java library; it is the Wolfram Language, the language behind Mathematica. It does have J/Link and can be called from Java, but you must have a Wolfram kernel running to execute the code.

This is called Natural Language Processing, and it is a huge, complex field. I have fiddled with a few problems in it, and all I can say is that it is harder than merely complex if you want a reliable solution.

Something you might want to take a look at is Stanford NLP (the Stanford CoreNLP library), which includes a lemmatizer.

Solution 2:[2]

It is called word stemming. First you need to derive the stem (for a specific language):

assisting -> assist by stripping suffixes such as -ance, -ing, -ly, -s, -ed, etc.
sought -> seek using an exception list

Then do a search, for example with a regular expression (Matcher.find). Patterns:

"\\bassist\\p{L}*"
"\\b(seek|sought)\\p{L}*"

Prefixes such as un-, dis-, and inter- make things more complicated, but in general English inflections are word endings. Beyond that there is synonym searching.
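As a rough illustration of the suffix stripping and exception list described above, here is a minimal sketch. The class name NaiveStemmer, the suffix list, and the tiny irregular-form map are all hypothetical and chosen for illustration; this is not a real stemming algorithm such as Porter's, and it will mangle many words (for example "class" becomes "clas").

```java
import java.util.List;
import java.util.Map;

class NaiveStemmer {
    // Exception list for irregular forms (illustrative only, far from complete).
    private static final Map<String, String> IRREGULAR = Map.of(
            "sought", "seek",
            "went", "go");

    // Suffixes to strip, checked in this order.
    private static final List<String> SUFFIXES = List.of("ance", "ing", "ed", "ly", "s");

    static String stem(String word) {
        String w = word.toLowerCase();

        // Irregular forms are looked up before any suffix is stripped.
        String irregular = IRREGULAR.get(w);
        if (irregular != null) {
            return irregular;
        }

        for (String suffix : SUFFIXES) {
            // Keep at least three letters so that "sing" is not reduced to "s".
            if (w.endsWith(suffix) && w.length() - suffix.length() >= 3) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    public static void main(String[] args) {
        for (String w : List.of("assist", "assists", "assisted", "assisting", "sought")) {
            System.out.println(w + " -> " + stem(w));
        }
    }
}
```

A production solution would more likely rely on an established stemmer or lemmatizer than on a hand-written suffix list like this one.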

Word lists and text collections suitable for this are often called corpora. A search for "free English corpus" will yield results.

\\b = word boundary
\\p{L}* = 0 or more (*) letters
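The search step could look roughly like the sketch below, which builds a pattern of the shape shown above from one or more stems and collects every match with Matcher.find. The class name InflectionSearch and the method findForms are hypothetical names used only for this example; the stems are assumed to have been derived already.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InflectionSearch {
    // Returns every word in text that starts with one of the given stems.
    static List<String> findForms(String text, String... stems) {
        // Quote each stem so that regex metacharacters in user input stay literal.
        StringBuilder alternatives = new StringBuilder();
        for (String stem : stems) {
            if (alternatives.length() > 0) {
                alternatives.append('|');
            }
            alternatives.append(Pattern.quote(stem));
        }

        // \b = word boundary, \p{L}* = zero or more trailing letters.
        Pattern pattern = Pattern.compile(
                "\\b(?:" + alternatives + ")\\p{L}*",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

        Matcher matcher = pattern.matcher(text);
        List<String> hits = new ArrayList<>();
        while (matcher.find()) {
            hits.add(matcher.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "She assisted him; he assists her. Assisting is what an assistant does.";
        System.out.println(findForms(text, "assist"));
    }
}
```

Note that a prefix pattern like this also matches derived words such as "assistant", so the hits may need filtering against a list of accepted endings if only true inflections are wanted.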

Solution 3:[3]

Check this out.

I don't know how big your requirement is, but you could always use Wiktionary and parse its data.

Check this question as well; it may be of help.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Margus
Solution 2 Joop Eggen
Solution 3 Community