How to check file size before opening?

How can I check the size of a file before I load it into R?

For example:

http://math.ucdenver.edu/RTutorial/titanic.txt

I'd like to use the optimal command to open a file based on the file's size.



Solution 1:[1]

Use file.info()

file.info("data/ullyses.txt")

                    size isdir mode               mtime               ctime               atime  uid  gid
data/ullyses.txt 1573151 FALSE  664 2015-06-01 15:25:55 2015-06-01 15:25:55 2015-06-01 15:25:55 1008 1008

Then extract the column called size:

file.info("data/ullyses.txt")$size
[1] 1573151
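For a path that does not exist, file.info() returns NA for the size rather than raising an error. A minimal sketch of a wrapper that fails early in that case is shown here; the helper name local_file_size and the temporary file are illustrative assumptions, not part of the original answer.

```r
# Hypothetical helper: return the size of a local file in bytes,
# stopping with an error if the path does not exist.
local_file_size <- function(path) {
  size <- file.info(path)$size
  if (is.na(size)) {
    stop("File not found: ", path)
  }
  size
}

# Demonstrate on a temporary file holding exactly five bytes.
tmp <- tempfile(fileext = ".txt")
writeBin(charToRaw("hello"), tmp)
size_bytes <- local_file_size(tmp)
```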

Solution 2:[2]

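library(RCurl)
url = "http://math.ucdenver.edu/RTutorial/titanic.txt"
# nobody = 1L requests the headers only, so the file body is not transferred
xx = getURL(url, nobody=1L, header=1L)
# split into one header per element; the size is in the Content-Length line
strsplit(xx, "\r\n")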
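The header text still has to be searched for the Content-Length line to get a number. A possible sketch of that step follows. The helper name parse_content_length and the sample header string are made up for illustration (the byte count is the one reported for titanic.txt in Solution 4), and a server that omits Content-Length yields NA.

```r
# Hypothetical helper: extract Content-Length (in bytes) from raw header text.
parse_content_length <- function(header_text) {
  lines <- strsplit(header_text, "\r?\n")[[1]]
  # Header names are case-insensitive, so match without regard to case.
  hit <- grep("^content-length:", lines, ignore.case = TRUE, value = TRUE)
  if (length(hit) == 0) {
    return(NA_real_)
  }
  # Drop everything up to the first colon, then trim the whitespace.
  as.numeric(trimws(sub("^[^:]*:", "", hit[length(hit)])))
}

# Sample header text, invented for this example.
sample_header <- "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 55692\r\n\r\n"
size_bytes <- parse_content_length(sample_header)
```

With the RCurl code above, the call would be parse_content_length(xx).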

Solution 3:[3]

Perhaps it has been added since this discussion, but at least for R 3.4+, the answer is file.size().
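A short sketch of how file.size() behaves: it is vectorized over paths and returns NA for files that do not exist. The temporary files and the missing path are assumptions made up for this demonstration.

```r
# Create a 10-byte file and an empty file in the session's temp directory.
tmp_a <- tempfile()
tmp_b <- tempfile()
writeBin(as.raw(1:10), tmp_a)
file.create(tmp_b)

# A path that is assumed not to exist.
missing_path <- file.path(tempdir(), "does-not-exist.txt")

# One size in bytes per path; NA for the missing file.
sizes <- file.size(c(tmp_a, tmp_b, missing_path))
```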

Solution 4:[4]

If you don't want to download the file before knowing its size, you can try something like this:

Note: This will only work on macOS or Linux.

file_url = 'http://math.ucdenver.edu/RTutorial/titanic.txt'
curl_cmd = paste('curl -X HEAD -i', file_url)
system_cmd = paste(curl_cmd, '|grep Content-Length |cut -d : -f 2')

The code above builds a command string to be executed with system(). The curl_cmd string tells curl to fetch only the header of the file.

The system_cmd string appends extra shell commands that parse the header and extract just the file size.

Now call system() with intern = TRUE so that R captures the output instead of printing it.

b <- system(system_cmd, intern = TRUE)
##  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current 
##                              Dload  Upload   Total   Spent    Left  Speed
##   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:-- 0   
## curl: (18) transfer closed

This downloads only the header of the file and parses it to get the file size. b now holds the file size in bytes, as a character string.
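The progress meter and the "curl: (18) transfer closed" message in the output are most likely a side effect of -X HEAD, which changes only the request method while curl still waits for a response body. A possible variant, sketched here on the assumption that curl is on the PATH, uses curl's -I (HEAD request) and -s (silent) options and does the parsing in R instead of with grep and cut. The function names are illustrative.

```r
# Build the shell command; shQuote() protects special characters in the URL.
build_head_cmd <- function(file_url) {
  paste("curl -sI", shQuote(file_url))
}

# Run the command and return Content-Length in bytes, or NA if it is absent.
# Defined but not called here, since it needs network access and curl.
remote_file_size <- function(file_url) {
  header_lines <- system(build_head_cmd(file_url), intern = TRUE)
  hit <- grep("^content-length:", header_lines, ignore.case = TRUE, value = TRUE)
  if (length(hit) == 0) {
    return(NA_real_)
  }
  # Strip the header name, then trim spaces and the trailing carriage return.
  as.numeric(trimws(sub("^[^:]*:", "", hit[length(hit)])))
}

cmd <- build_head_cmd("http://math.ucdenver.edu/RTutorial/titanic.txt")
```

Calling remote_file_size(file_url) would then return the size as a number rather than a string.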


Then you can decide how to open the file, or print something friendly like:

print(paste("There are", as.numeric(b)/1e6, "MB in the file:", file_url))
## [1] "There are 0.055692 MB in the file: http://math.ucdenver.edu/RTutorial/titanic.txt"
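One way to act on the size is a small dispatcher that picks a reading strategy from a threshold. The sketch below is only illustrative: the 100 MB cutoff, the function name choose_reader, and the strategy labels are assumptions, not recommendations from the answer.

```r
# Hypothetical helper: pick a reading strategy label from a size in bytes.
choose_reader <- function(size_bytes, threshold_bytes = 100e6) {
  if (is.na(size_bytes)) {
    "unknown"        # size could not be determined
  } else if (size_bytes <= threshold_bytes) {
    "read.table"     # small enough to load in one go
  } else {
    "chunked"        # large file: read in pieces instead
  }
}

small <- choose_reader(55692)  # the titanic.txt size reported above
large <- choose_reader(5e9)    # a made-up 5 GB file
```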

Solution 5:[5]

Besides file.size mentioned above, you can also use file_size from the fs package, which prints the size in a more human-readable form, showing MB or GB instead of bytes.

As an example, compare the output returned by the two functions:

library(fs)

file.size(system.file("data/Rdata.rdb", package = "datasets"))
#> [1] 114974
fs::file_size(system.file("data/Rdata.rdb", package = "datasets"))
#> 112K

file.size(system.file("data/Rdata.rdb", package = "spData"))
#> [1] 2676333
fs::file_size(system.file("data/Rdata.rdb", package = "spData"))
#> 2.55M
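If adding the fs dependency is not desirable, similar output can be approximated in base R. The helper below is a hypothetical sketch (the name human_size and the two-decimal format are assumptions); it uses 1024-based units, and its rounding and unit labels will not match fs exactly.

```r
# Hypothetical helper: format a byte count with 1024-based units.
human_size <- function(bytes) {
  units <- c("B", "KB", "MB", "GB", "TB")
  idx <- 1
  # Divide by 1024 until the value fits the current unit.
  while (bytes >= 1024 && idx < length(units)) {
    bytes <- bytes / 1024
    idx <- idx + 1
  }
  sprintf("%.2f %s", bytes, units[idx])
}

small <- human_size(114974)   # byte count from the datasets example above
large <- human_size(2676333)  # byte count from the spData example above
```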

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Andrie
Solution 2 Anthony Damico
Solution 3 M--
Solution 4
Solution 5 Matifou