How to overwrite a tag for a named entity with CoreNLP's RegexNER without specifying the original tag

I know that CoreNLP's RegexNER allows me to overwrite a tag using the mapping file. For example, I have the word EGFR, which CoreNLP recognizes as an ORGANIZATION. If I have the following line in my mapping file, CoreNLP still tags EGFR as an ORGANIZATION.

EGFR GENE

If I change that line to look like the following:

EGFR GENE ORGANIZATION

Then CoreNLP tags it as a GENE.

To do this, though, I have to know that CoreNLP tags EGFR as an ORGANIZATION, and I can't know the original tag for every word in my mapping file. Is there a way to tell RegexNER to overwrite the tag for EGFR no matter what the original tag is? Something like

EGFR GENE .*



Solution 1:[1]

In the third column of a rule, you can provide a comma-separated list of tags that the rule is allowed to overwrite.

For instance:

ORGANIZATION,PERSON,LOCATION,MISC

will allow it to overwrite all of those tags.

I don't think there is an overwrite-all option at the moment, so you do have to list each type you want overwritten.
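If your mapping file has many two-column rules, one way to avoid typing the list by hand is to append it programmatically. The following is only a minimal sketch in plain Java (no CoreNLP calls), assuming the mapping file is tab-separated with the pattern, the target tag, and then the overwritable tags. The class name, the method name, and the BRCA1 rule are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class MappingOverwriteColumn {

    // Tags a rule may replace; extend this with any other labels your NER models emit.
    static final String OVERWRITABLE = "ORGANIZATION,PERSON,LOCATION,MISC";

    // Appends the overwritable-tags column to every rule that has exactly two
    // columns (pattern and target tag). All other lines are returned unchanged.
    static List<String> addOverwriteColumn(List<String> rules, String overwritable) {
        List<String> result = new ArrayList<>();
        for (String rule : rules) {
            // Limit -1 keeps trailing empty columns, so "A\tB\t" counts as three columns.
            String[] columns = rule.split("\t", -1);
            if (columns.length == 2) {
                result.add(rule + "\t" + overwritable);
            } else {
                result.add(rule);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical rules: the second one already carries its own overwrite list.
        List<String> rules = List.of("EGFR\tGENE", "BRCA1\tGENE\tPERSON");
        for (String line : addOverwriteColumn(rules, OVERWRITABLE)) {
            System.out.println(line);
        }
    }
}
```

Rules that already have a third column are left alone, so a hand-written list on a specific rule still wins over the generated one.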

If you always want to overwrite everything with what is in your rules, you can set this option on the TokensRegexNERAnnotator:

regexner.backgroundSymbol ORGANIZATION,PERSON,LOCATION,MISC,O

Then the individual rules don't each need their own list.
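In Java, that option can be set through the properties passed to the pipeline. This is a rough sketch rather than a tested CoreNLP setup: it only builds the Properties object, the class name and the file name gene_rules.tsv are placeholders, and it assumes the standalone regexner annotator reads its rules from regexner.mapping.

```java
import java.util.Properties;

public class RegexNerOverwriteProps {

    // Builds pipeline properties that let every rule in the mapping file
    // overwrite the listed tags, without a per-rule overwrite list.
    static Properties build(String mappingPath) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,regexner");
        props.setProperty("regexner.mapping", mappingPath);
        props.setProperty("regexner.backgroundSymbol", "ORGANIZATION,PERSON,LOCATION,MISC,O");
        return props;
    }

    public static void main(String[] args) {
        // "gene_rules.tsv" is a hypothetical path to your mapping file.
        Properties props = build("gene_rules.tsv");
        props.list(System.out);
        // With CoreNLP on the classpath, props would then go to new StanfordCoreNLP(props).
    }
}
```

Any tag your models produce that is missing from the list will still not be overwritten, so the list has to cover every label you expect to see.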

Solution 2:[2]

Great answer by @StanfordNLPHelp.

However, if you are using ner.fine for your mappings, use the properties below to get the overriding behavior:

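import java.util.Properties;

// rulesFiles holds the path(s) to your mapping file(s) and is defined elsewhere.
Properties props = new Properties();
props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, regexner");
props.setProperty("ner.fine.regexner.mapping", rulesFiles);
// For the standalone regexner annotator, the property would be this one instead:
// props.setProperty("regexner.backgroundSymbol", "ORGANIZATION,PERSON,LOCATION,MISC,O");
props.setProperty("ner.fine.regexner.backgroundSymbol", "ORGANIZATION,PERSON,LOCATION,MISC,O");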

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 StanfordNLPHelp
Solution 2 Jatin Sutaria