Unzip operation taking several hours
I am using the following shell script to loop over 90 zip files and unarchive them on a Linux box hosted with Hostinger (shared web hosting):
#!/bin/bash
SOURCE_DIR="<path_to_archives>"
cd "${SOURCE_DIR}" || exit 1
for f in *.zip
do
    # unzip -oqq "$f" -d "${f%.zip}" &
    # Launch one extractor process per archive; all 90 jobs run concurrently.
    python3 scripts/extract_archives.py "${f}" &
done
wait
The Python script called by the shell script is below:
import os
import shutil
import sys

source_path = "<path to source dir>"

def extract_files(in_file):
    # splitext, unlike split('.'), keeps any dots inside the filename,
    # matching the shell's ${f%.zip} behaviour
    base, _ = os.path.splitext(in_file)
    shutil.unpack_archive(os.path.join(source_path, in_file),
                          os.path.join(source_path, base))
    print('Extracted :', in_file)

extract_files(sys.argv[1].strip())
Irrespective of whether I use the built-in unzip command or the Python script, it takes about 2.5 hours to unzip all the files. Unarchiving all the zip files produces 90 folders with about 170,000 files overall. I would have thought anywhere between 15 and 20 minutes was a reasonable timeframe.
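One structural cost in the approach above: the shell loop pays Python interpreter startup 90 times and backgrounds all 90 extractions at once, so every job competes for the same disk. A single process with a bounded worker pool avoids both. This is a minimal sketch rather than my original setup; SOURCE_DIR mirrors the placeholder above, and MAX_WORKERS = 4 is an assumption to tune against whatever the host allows:

import os
import shutil
from concurrent.futures import ProcessPoolExecutor

SOURCE_DIR = "<path_to_archives>"  # placeholder, as in the original script
MAX_WORKERS = 4                    # assumption: tune to the host's limits

def extract(in_file):
    # Target folder is the archive name minus its extension.
    base, _ = os.path.splitext(in_file)
    shutil.unpack_archive(os.path.join(SOURCE_DIR, in_file),
                          os.path.join(SOURCE_DIR, base))
    return in_file

if __name__ == "__main__":
    zips = sorted(f for f in os.listdir(SOURCE_DIR) if f.endswith(".zip"))
    # One interpreter startup in total, at most MAX_WORKERS extractions at once.
    with ProcessPoolExecutor(max_workers=MAX_WORKERS) as pool:
        for done in pool.map(extract, zips):
            print("Extracted :", done)

Whether bounded concurrency helps depends on whether the box is CPU-bound or I/O-bound; on a throttled shared host, fewer simultaneous writers is usually the safer bet.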
I've tried a few variations. I tried tarring the folders instead of zipping them, on the theory that un-tarring might be faster than unzipping. I've also used the tar command to stream the files over SSH from the source server and untar them on the fly, like this:
time tar zcf - . | ssh -p <port> user@host "tar xzf - -C <dest dir>"
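For reference, the receiving end of that pipeline can also be written in Python using the tarfile module's streaming mode, which unpacks straight from stdin without landing a temporary archive. A minimal sketch; dest_dir stands in for the <dest dir> placeholder above, and the script name is hypothetical:

import sys
import tarfile

dest_dir = "<dest dir>"  # placeholder matching the tar command above

# mode="r|gz" treats stdin as a forward-only gzipped tar stream,
# playing the same role as `tar xzf - -C <dest dir>` on the remote side.
with tarfile.open(fileobj=sys.stdin.buffer, mode="r|gz") as tf:
    tf.extractall(dest_dir)

It would be invoked in place of the remote tar, e.g. ssh user@host "python3 untar_stream.py" at the end of the same pipe.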
Nothing is helping. I am open to using any other programming language, such as Perl or Go, if that would speed things up.
Can someone please help me solve this performance problem?
Solution 1:[1]
Thank you, everyone, for your answers. As you indicated, this was due to throttling on the servers in the shared hosting environment.
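If anyone else hits the same symptom, a quick way to confirm this kind of throttling before rewriting any extraction code is to measure raw disk throughput on the host. A rough sketch, assuming a writable working directory; the 256 MB test size is arbitrary, and the read figure can be inflated by the page cache:

import os
import time

path = "io_test.bin"   # hypothetical scratch file
size_mb = 256          # arbitrary test size
chunk = b"\0" * (1024 * 1024)

t0 = time.time()
with open(path, "wb") as f:
    for _ in range(size_mb):
        f.write(chunk)
    f.flush()
    os.fsync(f.fileno())  # force data to disk before the timer stops
print(f"write: {size_mb / (time.time() - t0):.1f} MB/s")

t0 = time.time()
with open(path, "rb") as f:
    while f.read(1024 * 1024):
        pass  # read back; may be served from page cache
print(f"read:  {size_mb / (time.time() - t0):.1f} MB/s")
os.remove(path)

Single-digit MB/s on a host whose disks should manage far more is a strong hint that the bottleneck is the hosting plan, not the unzip code.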
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source
---|---
Solution 1 | usert4jju7