'SQLite3 database or disk is full on csv imports
This issue has been discussed on a number of threads, but none of the proposals seem to apply to my case.
I have a very large sqlite database (4Tb). I am trying to import csv files from the terminal
sqlite3 -csv -separator " " /data/mydb.db ".import '|cat *.csv' mytable"
I intermittently receive SQLite3 database or disk is full errors. Re-running the command after an error usually succeeds.
Some notes:
- /data has 3.2Tb free
- /tmp has 1.8Tb free.
- *.csv takes up approximately 802Gb.
- Both /tmp and /data are using ext4 which has a maximum file size of 16tb.
- The only process accessing the database is the one mentioned above.
- PRAGMA integrity_check returns ok.
- Test on both -sqlite3 --version - 3.38.1 2022-03-12 13:37:29 38c210fdd258658321c85ec9c01a072fda3ada94540e3239d29b34dc547a8cbc and 3.31.1 2020-01-27 19:55:54 3bfa9cc97da10598521b342961df8f5f68c7388fa117345eeb516eaa837balt1
- OS - Ubuntu 20.04
Any thoughts on what could be happening?
(Unless there is an informed reason for why I am exceeding the limits sqlite, I would prefer to avoid suggestions that I move to a client/server RDBMS.)
Solution 1:[1]
i didn't figure it out, but someone else did, am pretty sure this will "fix it" until you reach 8TB-ish:
sqlite3 ... "PRAGMA main.max_page_count=2147483647; .import '|cat *.csv' mytable"
However the invocation
sqlite3 ... "PRAGMA main.journal_mode=DELETE; PRAGMA main.max_page_count; PRAGMA main.max_page_count=2147483647; PRAGMA main.page_size=65536;VACUUM; import '|cat *.csv' mytable;"
should allow the db to grow to ~200TB, but that VACUUM command, which is needed to apply the new page_size, requires a lot of free space to run, and will probably use a long time =/
good news is that you only need to run that once and it should be a permanent change to your db, your next invocation only needs sqlite3 ... "import '|cat *.csv' mytable;"
notably, this will probably break again around ~200TB
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |