Python zipfile: file name with new line characters
Somebody somehow managed to add a newline sequence (\r\n) to the name of a file inside a zip archive, and that makes ZipFile fail when it tries to extract the archive:
2019-07-23 14:05:12,285 - __main__ - ERROR - Error desconocido: [Errno 22] Invalid argument: 'descargados\\03_26298_19\\ANEXO\r\n.pdf'. Saliendo.
Traceback (most recent call last):
  File "motor.py", line 51, in main
    procesar_descarga(zip_object, ruta_temp, ruta_final)
  File "C:\Users\david\pycharmProjects\descargueitor2\volcado.py", line 90, in procesar_descarga
    zip_object.extractall(str(ruta_temp))
  File "C:\Users\david\Anaconda3\lib\zipfile.py", line 1616, in extractall
    self._extract_member(zipinfo, path, pwd)
  File "C:\Users\david\Anaconda3\lib\zipfile.py", line 1670, in _extract_member
    open(targetpath, "wb") as target:
OSError: [Errno 22] Invalid argument: 'descargados\\03_26298_19\\ANEXO\r\n.pdf'
I tried the same file with several programs:
- The built-in compressed-folder reader in Windows Explorer just ignores the file: it is neither listed nor extracted.
- WinZip lists the file, but throws an error when opening or extracting it.
- 7-Zip can read and extract the file: it just converts the bad characters to underscores.
Is there any way to deal with this in Python? It looks like files in a zip cannot be renamed using the library.
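One possible workaround, sketched here rather than taken from an accepted answer, is to skip extractall() and copy each member out by hand, choosing the output name yourself. The sketch below mimics what 7-Zip does by replacing control characters (including \r and \n) and the characters Windows rejects in file names with underscores. The helper names sanitize_member_name and extract_sanitized and the _INVALID_CHARS pattern are hypothetical, made up for this illustration; only ZipFile.infolist(), ZipFile.open() and ZipInfo.is_dir() come from the standard library.

```python
import re
import shutil
import zipfile
from pathlib import Path

# Assumption: these are the characters to replace. The set covers ASCII
# control characters (\r and \n included) plus < > : " | ? * which Windows
# does not allow in file names.
_INVALID_CHARS = re.compile(r'[<>:"|?*\x00-\x1f]')


def sanitize_member_name(name):
    """Return a relative Path with bad characters replaced by underscores.

    Returns None when nothing usable is left of the name.
    """
    parts = []
    for part in name.replace("\\", "/").split("/"):
        if part in ("", ".", ".."):
            continue  # drop empty, current-dir and parent-dir components
        parts.append(_INVALID_CHARS.sub("_", part))
    return Path(*parts) if parts else None


def extract_sanitized(zip_object, destination):
    """Extract every member of an open ZipFile under sanitized names."""
    destination = Path(destination)
    for info in zip_object.infolist():
        relative = sanitize_member_name(info.filename)
        if relative is None:
            continue
        target = destination / relative
        if info.is_dir():
            target.mkdir(parents=True, exist_ok=True)
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        # Read the member through the public API and write it ourselves,
        # so the name stored in the archive never reaches the file system.
        with zip_object.open(info) as source, open(target, "wb") as output:
            shutil.copyfileobj(source, output)
```

With this in place, the call zip_object.extractall(str(ruta_temp)) could be swapped for extract_sanitized(zip_object, ruta_temp). Two caveats: distinct archive names can collapse to the same sanitized name, in which case the later member overwrites the earlier one, and other Windows restrictions (reserved names such as CON, trailing dots or spaces) are not handled here.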
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow