Benford - Dataset with NA strings returns an error in extract.digits

I have a dataset of macroeconomic data (GDP, inflation, etc.) where rows are different macroeconomic indicators and columns are years.

Since some values are missing (e.g. the GDP of a given country in a given year), they are read in as "NA".

When I perform these operations:

#
data = read.table("14varnumeros.txt", header = FALSE, sep = "", na.strings = "NA", dec = ".", strip.white = TRUE)

benford(data, number.of.digits = 1, sign = "both", discrete=TRUE, round=3)
#

It gives me this error:

Error in extract.digits(data, number.of.digits, sign, second.order, discrete = discrete, :
Data must be a numeric vector

I assume that this is because of the NA strings, but I do not know how to solve it.



Solution 1:[1]

I came across this issue, too. In my case, the cause wasn't missing data but a quirk in the extract.digits() function of the benford.analysis package. The function checks whether the data supplied to it is numeric, but it does so using class(dat) != "numeric" rather than is.numeric(), so an integer vector, whose class is "integer", is rejected.

This produces unexpected errors. Consider the code below:

library(benford.analysis)

dat <- data.frame(v1 = 1:5, v2 = c(1, 2, 3, 4, 5))

benford(dat$v1)          # produces error: v1 is an integer vector
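
To see why the integer column trips the check, here is a small base R sketch comparing the two tests. It does not call the package at all; the variable names are just for illustration.

```r
# Base R only: compare the two ways of testing for numeric data.
int_vec <- 1:5                 # stored as integer
dbl_vec <- c(1, 2, 3, 4, 5)    # stored as double

class(int_vec)   # "integer"
class(dbl_vec)   # "numeric"

# Strict class comparison: only the double vector passes.
strict_int <- class(int_vec) == "numeric"
strict_dbl <- class(dbl_vec) == "numeric"

# is.numeric() accepts both integer and double vectors.
loose_int <- is.numeric(int_vec)
loose_dbl <- is.numeric(dbl_vec)
```

Only the strict comparison distinguishes between the two vectors, which matches the behaviour seen with dat$v1 above.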

I've submitted an issue on GitHub, but in the meantime you can simply wrap your data in as.numeric() and you should be fine.
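
For a whole data frame like the one in the question, as.numeric() alone may not be enough, since it expects a vector. A minimal sketch, assuming you want to analyse all values together: flatten the data frame first, then convert. The data frame macro below is a made-up stand-in for the result of read.table(), and dropping the NA values is my assumption about what is wanted.

```r
# Hypothetical stand-in for the data frame returned by read.table():
# rows are indicators, columns are years, with some missing values.
macro <- data.frame(y2000 = c(1250L, 31L, NA),
                    y2001 = c(1310.5, NA, 47.2))

# Flatten the data frame into one vector and make sure it is double.
values <- as.numeric(unlist(macro, use.names = FALSE))

# Drop missing values explicitly (an assumption, see above).
values <- values[!is.na(values)]

class(values)    # "numeric"
length(values)   # 4

# With the package installed, the call from the question would become:
# library(benford.analysis)
# benford(values, number.of.digits = 1, sign = "both", discrete = TRUE, round = 3)
```

If you would rather analyse one year at a time, the same as.numeric() wrapping can be applied to a single column instead of the flattened vector.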

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 paulstey