How to unzip a Reqwest/Hyper response using streams?
I need to download a 60MB ZIP file and extract the only file that comes within it. I want to download it and extract it using streams. How can I achieve this using Rust?
fn main() {
    let mut res = reqwest::get("myfile.zip").unwrap();
    // extract the response body to myfile.txt
}
In Node.js I would do something like this:
http.get('myfile.zip', response => {
  response.pipe(unzip.Parse())
    .on('entry', entry => {
      if (entry.path.endsWith('.txt')) {
        entry.pipe(fs.createWriteStream('myfile.txt'))
      }
    })
})
Solution 1:[1]
With reqwest you can get the .zip file:
reqwest::get("myfile.zip")
Since reqwest can only be used for retrieving the file, ZipArchive from the zip crate can be used for unpacking it. It's not possible to stream the .zip file into ZipArchive, since ZipArchive::new(reader: R) requires R to implement both Read (which the Response of reqwest does) and Seek (which Response does not).
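The reason a seekable reader matters can be illustrated with the ZIP layout itself: the archive's index (the central directory) is located through a record at the very end of the file. What follows is a minimal standard-library sketch, not the zip crate's implementation; find_eocd is a hypothetical helper, and it assumes an archive without a trailing comment.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Signature that opens the ZIP end-of-central-directory record ("PK\x05\x06").
const EOCD_SIGNATURE: [u8; 4] = [0x50, 0x4b, 0x05, 0x06];
/// Size of that record when the archive comment is empty.
const EOCD_MIN_LEN: u64 = 22;

/// Hypothetical helper: returns the offset of the end-of-central-directory
/// record, assuming the archive carries no trailing comment.
fn find_eocd<R: Read + Seek>(reader: &mut R) -> io::Result<Option<u64>> {
    // Seeking to the end is how the total length is discovered.
    let len = reader.seek(SeekFrom::End(0))?;
    if len < EOCD_MIN_LEN {
        return Ok(None);
    }
    // Jump to where the record would start and check its signature.
    let offset = len - EOCD_MIN_LEN;
    reader.seek(SeekFrom::Start(offset))?;
    let mut signature = [0u8; 4];
    reader.read_exact(&mut signature)?;
    Ok(if signature == EOCD_SIGNATURE {
        Some(offset)
    } else {
        None
    })
}
```

Calling find_eocd on a std::io::Cursor that wraps an in-memory archive works because Cursor implements both traits. A one-way network body cannot jump to the end and back, which is why it cannot satisfy the Seek bound.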
As a workaround you may use a temporary file:
copy_to(&mut tmpfile)
As File implements both Seek and Read, zip can be used here:
zip::ZipArchive::new(tmpfile)
This is a working example of the described method:
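extern crate reqwest;
extern crate tempfile;
extern crate zip;

fn main() {
    // Unnamed temporary file; it is cleaned up once the handle is closed.
    let mut tmpfile = tempfile::tempfile().unwrap();
    // Note: with reqwest 0.10 and later the blocking call is reqwest::blocking::get
    // (behind the "blocking" feature); this example uses the older synchronous API.
    // Write the response body to the temporary file and fail loudly on error.
    reqwest::get("myfile.zip").unwrap().copy_to(&mut tmpfile).unwrap();
    // File implements Read + Seek, so ZipArchive can open it.
    let zip = zip::ZipArchive::new(tmpfile).unwrap();
    println!("{:#?}", zip);
}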
tempfile is a handy crate that lets you create a temporary file without having to think of a name for it.
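The spooling step of this workaround (drain a one-way reader into a file, then treat the file as a seekable source) can also be sketched with the standard library alone. This is only an illustration under stated assumptions: spool_to_file is a hypothetical helper, and the caller supplies the path instead of relying on the tempfile crate.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Hypothetical helper: drains `source` into a file at `path` and returns the
/// file rewound to the start, ready for code that needs `Read + Seek`.
fn spool_to_file<R: Read>(mut source: R, path: &Path) -> io::Result<File> {
    // Open for both reading and writing, truncating any previous content.
    let mut file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;
    // io::copy moves the data in fixed-size chunks, so memory use stays small.
    io::copy(&mut source, &mut file)?;
    // The file position is now at the end; rewind so that a consumer reading
    // from the current position sees the whole content.
    file.seek(SeekFrom::Start(0))?;
    Ok(file)
}
```

Compared with buffering the whole download in memory, this keeps memory use roughly constant at the cost of disk I/O. Unlike tempfile::tempfile(), the file created here is not removed automatically, so the caller has to delete it.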
Solution 2:[2]
This is how I'd read the file hello.txt, with content hello world, from the archive hello.zip located on a local server:
extern crate reqwest;
extern crate zip;

use std::io::Read;

fn main() {
    let mut res = reqwest::get("http://localhost:8000/hello.zip").unwrap();
    // Buffer the whole archive in memory; this fails loudly if the download breaks.
    let mut buf: Vec<u8> = Vec::new();
    res.read_to_end(&mut buf).unwrap();
    // Cursor adds Seek on top of the in-memory buffer, as ZipArchive requires.
    let reader = std::io::Cursor::new(buf);
    let mut zip = zip::ZipArchive::new(reader).unwrap();
    let mut file_zip = zip.by_name("hello.txt").unwrap();
    // Decompress the entry into a second buffer and decode it as UTF-8.
    let mut file_buf: Vec<u8> = Vec::new();
    file_zip.read_to_end(&mut file_buf).unwrap();
    let content = String::from_utf8(file_buf).unwrap();
    println!("{}", content);
}
This will output hello world
Solution 3:[3]
async solution using Tokio
It's a bit convoluted, but you can do this using tokio, futures, tokio_util::compat and async_compression. The key is to create a futures::io::AsyncRead stream using .into_async_read() and then convert it into a tokio::io::AsyncRead using .compat().
For simplicity, this example downloads a .txt.gz file (gzip rather than ZIP) and prints it line by line.
use async_compression::tokio::bufread::GzipDecoder;
use futures::stream::TryStreamExt;
use tokio::io::AsyncBufReadExt;
use tokio_util::compat::FuturesAsyncReadCompatExt;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let url = "https://f001.backblazeb2.com/file/korteur/hello-world.txt.gz";
    let response = reqwest::get(url).await?;
    let stream = response
        .bytes_stream()
        .map_err(|e| futures::io::Error::new(futures::io::ErrorKind::Other, e))
        .into_async_read()
        .compat();
    let gzip_decoder = GzipDecoder::new(stream);

    // Print decompressed txt content
    let buf_reader = tokio::io::BufReader::new(gzip_decoder);
    let mut lines = buf_reader.lines();
    while let Some(line) = lines.next_line().await? {
        println!("{line}");
    }

    Ok(())
}
Credit to Benjamin Kay.
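The central conversion in this answer (a stream of byte chunks exposed as a single reader) can be illustrated without any async machinery. The sketch below is a synchronous, standard-library-only analogy and is not how futures implements .into_async_read(); ChunkReader is a hypothetical name, and a plain iterator stands in for the response's byte stream.

```rust
use std::io::{self, Read};

/// Hypothetical adapter: exposes an iterator of byte chunks as one `Read`.
struct ChunkReader<I> {
    chunks: I,
    current: Vec<u8>, // chunk currently being drained
    pos: usize,       // how much of `current` was already handed out
}

impl<I: Iterator<Item = io::Result<Vec<u8>>>> ChunkReader<I> {
    fn new(chunks: I) -> Self {
        ChunkReader { chunks, current: Vec::new(), pos: 0 }
    }
}

impl<I: Iterator<Item = io::Result<Vec<u8>>>> Read for ChunkReader<I> {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        // Pull chunks until one has unread bytes or the source is exhausted.
        while self.pos == self.current.len() {
            match self.chunks.next() {
                Some(chunk) => {
                    // A failed chunk surfaces as a read error.
                    self.current = chunk?;
                    self.pos = 0;
                }
                None => return Ok(0), // end of stream
            }
        }
        // Copy as much as fits into the caller's buffer.
        let n = out.len().min(self.current.len() - self.pos);
        out[..n].copy_from_slice(&self.current[self.pos..self.pos + n]);
        self.pos += n;
        Ok(n)
    }
}
```

Because the adapter only ever holds one chunk, a decoder layered on top of it can start producing output before the download has finished, which is the property the async version above relies on.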
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | mgul |
Solution 3 | |