FastText 0.9.2 - why is recall 'nan'?
I trained a supervised model in FastText using the Python interface and I'm getting weird results for precision and recall.
First, I trained a model:
import fasttext

model = fasttext.train_supervised("train.txt", wordNgrams=3, epoch=100, pretrainedVectors=pretrained_model)
Then I get results for the test data:
def print_results(N, p, r):
    print("N\t" + str(N))
    print("P@{}\t{:.3f}".format(1, p))
    print("R@{}\t{:.3f}".format(1, r))

print_results(*model.test('test.txt'))
But the results look odd: precision and recall at 1 are always identical, even across different datasets. For example, one output is:
N 46425
P@1 0.917
R@1 0.917
Then, when I look at the precision and recall for each label, recall is always 'nan':
print(model.test_label('test.txt'))
And the output is:
{'__label__1': {'precision': 0.9202150724134941, 'recall': nan, 'f1score': 1.8404301448269882}, '__label__5': {'precision': 0.9134956983264135, 'recall': nan, 'f1score': 1.826991396652827}}
Does anyone know why this might be happening?
P.S.: To try a reproducible example of this behavior, please refer to https://github.com/facebookresearch/fastText/issues/1072 and run it with FastText 0.9.2
Solution 1:[1]
It looks like FastText 0.9.2 has a bug in the computation of recall, which should be fixed by a later commit in the GitHub repository.
Installing a "bleeding edge" version of FastText, e.g. with
pip install git+https://github.com/facebookresearch/fastText.git@b64e359d5485dda4b4b5074494155d18e25c8d13 --quiet
and rerunning your code should get rid of the nan values in the recall computation.
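If you want to cross-check the numbers independently of model.test_label, here is a minimal sketch in plain Python, assuming every test example has exactly one gold label and you take only the top prediction. The function name per_label_metrics and the toy label lists are made up for illustration and are not part of the FastText API. It also shows why P@1 and R@1 coincide in that setting: both divide the same number of correct predictions by the same number of examples, so identical values there are expected and not a sign of the bug.

```python
from collections import Counter


def per_label_metrics(gold, predicted):
    """Per-label precision, recall and F1 for single-label data.

    gold, predicted: equal-length lists holding one label string per example.
    (Hypothetical helper for illustration, not a FastText function.)
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    gold_count = Counter(gold)        # how often each label is the true label
    pred_count = Counter(predicted)   # how often each label is predicted
    true_pos = Counter(g for g, p in zip(gold, predicted) if g == p)

    def safe_div(num, den):
        # Undefined ratios are reported as nan instead of raising.
        return num / den if den else float("nan")

    metrics = {}
    for label in sorted(set(gold) | set(predicted)):
        tp = true_pos[label]
        metrics[label] = {
            "precision": safe_div(tp, pred_count[label]),
            "recall": safe_div(tp, gold_count[label]),
            # F1 = 2*TP / (predicted + gold); the denominator is never zero
            # here because the label occurs in at least one of the two lists.
            "f1score": 2 * tp / (pred_count[label] + gold_count[label]),
        }
    return metrics


# Toy data, invented for this example.
gold = ["__label__1", "__label__1", "__label__5", "__label__5", "__label__1"]
pred = ["__label__1", "__label__5", "__label__5", "__label__5", "__label__1"]

metrics = per_label_metrics(gold, pred)

# Overall P@1 and R@1: same numerator, and with one gold label and one
# prediction per example the denominators are the same as well.
correct = sum(g == p for g, p in zip(gold, pred))
micro_p = correct / len(pred)   # correct / number of predictions made
micro_r = correct / len(gold)   # correct / number of gold labels

for label, m in metrics.items():
    print(label, m)
print("P@1", micro_p, "R@1", micro_r)
```

To use it on real data, the gold labels would come from the __label__ prefix of each line in test.txt, and the predictions from model.predict(text)[0][0] on the remaining text of that line.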
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow