How to check file size before opening?
How can I check the size of a file before I load it into R?
For example:
http://math.ucdenver.edu/RTutorial/titanic.txt
I'd like to use the optimal command to open a file based on the file's size.
Solution 1:[1]
Use file.info()
file.info("data/ullyses.txt")
                    size isdir mode               mtime               ctime               atime  uid  gid
data/ullyses.txt 1573151 FALSE  664 2015-06-01 15:25:55 2015-06-01 15:25:55 2015-06-01 15:25:55 1008 1008
Then extract the column called size:
file.info("data/ullyses.txt")$size
[1] 1573151
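To connect this back to the question (picking a reader based on size), here is a minimal sketch. The helper name choose_reader, the labels it returns, and the 100 MB cutoff are all made up for illustration; pick whatever threshold and readers fit your machine.

```r
# Hypothetical helper: pick a reading strategy from the size in bytes.
# The 100 MB cutoff is an arbitrary illustration, not a recommendation.
choose_reader <- function(path, cutoff_bytes = 100 * 1024^2) {
  size <- file.info(path)$size
  if (is.na(size)) stop("file not found: ", path)
  if (size < cutoff_bytes) "read.table" else "chunked"
}

# Small temporary file so the example runs anywhere.
tmp <- tempfile(fileext = ".txt")
writeLines(c("a b", "1 2"), tmp)

choose_reader(tmp)                    # small file, below the cutoff
choose_reader(tmp, cutoff_bytes = 1)  # force the large-file branch
```

The function only returns a label; in practice you would switch on it to call the reader you actually want.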
Solution 2:[2]
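library(RCurl)
url = "http://math.ucdenver.edu/RTutorial/titanic.txt"
# nobody = 1L asks for the headers only (no body); header = 1L includes them in the result
xx = getURL(url, nobody=1L, header=1L)
# split the raw header text into one element per header line
strsplit(xx, "\r\n")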
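The split header still has to be searched for the size. A rough sketch of that step follows, assuming the server sends a Content-Length header at all (not every server does). The helper name content_length and the sample header text are invented for the example, so it runs without a network connection.

```r
# Hypothetical helper: extract Content-Length (in bytes) from raw HTTP header text.
content_length <- function(header_text) {
  lines <- strsplit(header_text, "\r\n", fixed = TRUE)[[1]]
  # Header names are case-insensitive, so match accordingly.
  hit <- grep("^content-length:", lines, ignore.case = TRUE, value = TRUE)
  if (length(hit) == 0L) return(NA_real_)
  # After redirects several header blocks may appear; keep the last match.
  as.numeric(trimws(sub("^[^:]*:", "", hit[length(hit)])))
}

# Made-up header text standing in for the value of xx.
sample_header <- paste0(
  "HTTP/1.1 200 OK\r\n",
  "Content-Type: text/plain\r\n",
  "Content-Length: 55692\r\n\r\n"
)
content_length(sample_header)
```

With a live request you would call content_length(xx) instead; it returns NA when the header is missing.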
Solution 3:[3]
Perhaps it has been added since this discussion, but at least for R 3.4+, the answer is file.size.
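A quick way to see what it returns, using a throwaway temporary file rather than any real data:

```r
tmp <- tempfile()
writeBin(as.raw(1:10), tmp)  # write exactly 10 bytes

file.size(tmp)               # size in bytes, as a number
file.size(tempfile())        # a path that does not exist gives NA
```

It returns a plain number, so it is handy when you only need the size and not the rest of the file.info() columns.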
Solution 4:[4]
If you don't want to download the file before knowing its size, you can try something like this:
Note: This will only work in Mac or Linux.
file_url = 'http://math.ucdenver.edu/RTutorial/titanic.txt'
# -s silences the progress meter; -I sends a HEAD request, so only the headers come back
# (curl -X HEAD makes curl wait for a body that never arrives and fail with error 18)
curl_cmd = paste('curl -sI', file_url)
# keep the Content-Length line (case-insensitively) and take the value after the colon
system_cmd = paste(curl_cmd, '| grep -i Content-Length | cut -d : -f 2')
The above builds a command string to be executed with system(). The curl_cmd string tells curl to fetch just the header of the file. The system_cmd string appends extra commands that parse the header and extract just the file size.
Now call system() with the intern = TRUE argument to tell R to capture the output.
b <- system(system_cmd, intern = TRUE)
This downloads just the header for the file and parses it to get the file size. Now b holds the file size in bytes, as a character string.
Then you can decide how to open the file, or print something friendly like:
print(paste("There are", as.numeric(b)/1e6, "MB in the file:", file_url))
## [1] "There are 0.055692 MB in the file: http://math.ucdenver.edu/RTutorial/titanic.txt"
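If the server sends no Content-Length header, the captured value comes back empty and as.numeric() has nothing to work with. A small sketch of a more defensive version is below; describe_size is a hypothetical helper, and it borrows base R's object_size formatting to choose the unit automatically.

```r
# Hypothetical helper: turn the raw text captured from the header into a label.
describe_size <- function(raw) {
  # Strip the leading space and trailing carriage return left by cut.
  bytes <- suppressWarnings(as.numeric(trimws(raw)))
  if (length(bytes) != 1L || is.na(bytes)) {
    return("size unknown")
  }
  # Reuse base R's object_size formatter to pick a sensible unit.
  format(structure(bytes, class = "object_size"), units = "auto")
}

describe_size(" 55692\r")    # value shaped like the output of cut
describe_size(character(0))  # what you get when the header is missing
```

The second call shows the fallback, which lets you decide how to proceed instead of hitting an error further down.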
Solution 5:[5]
Besides file.size mentioned above, you can also use file_size from package fs, which prints the size in a more human-readable form, showing MB or GB instead of bytes.
As an example, compare the output returned by the two functions:
library(fs)
file.size(system.file("data/Rdata.rdb", package = "datasets"))
#> [1] 114974
fs::file_size(system.file("data/Rdata.rdb", package = "datasets"))
#> 112K
file.size(system.file("data/Rdata.rdb", package = "spData"))
#> [1] 2676333
fs::file_size(system.file("data/Rdata.rdb", package = "spData"))
#> 2.55M
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Andrie |
Solution 2 | Anthony Damico |
Solution 3 | M-- |
Solution 4 | |
Solution 5 | Matifou |