How to unzip a Reqwest/Hyper response using streams?

I need to download a 60 MB ZIP file and extract the single file it contains. I want to do both the download and the extraction using streams. How can I achieve this in Rust?

fn main() {
    let mut res = reqwest::get("myfile.zip").unwrap();
    // extract the response body to myfile.txt
}

In Node.js I would do something like this:

http.get('myfile.zip', response => {
  response.pipe(unzip.Parse())
  .on('entry', entry => {
    if (entry.path.endsWith('.txt')) {
      entry.pipe(fs.createWriteStream('myfile.txt'))
    }
  })
})


Solution 1:[1]

With reqwest you can get the .zip file:

reqwest::get("myfile.zip")

reqwest only retrieves the file; to unpack it you can use ZipArchive from the zip crate. The .zip file cannot be streamed directly into ZipArchive, because ZipArchive::new(reader: R) requires R to implement both Read and Seek. The Response of reqwest implements Read, but not Seek.
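To make the Seek requirement concrete: a ZIP archive keeps its index (the central directory) at the end of the file, so a reader has to jump there before it can locate any entry. The following is a minimal sketch using only the standard library; read_tail is a hypothetical helper written for illustration and is not part of the zip crate.

```rust
use std::io::{Read, Seek, SeekFrom};

/// Hypothetical helper (not part of the zip crate): returns the last `n`
/// bytes of `reader`. A ZIP reader has to do something similar to find the
/// central directory, which is why it needs `Seek` in addition to `Read`.
fn read_tail<R: Read + Seek>(mut reader: R, n: u64) -> std::io::Result<Vec<u8>> {
    // Jump to the end to learn the total length.
    let len = reader.seek(SeekFrom::End(0))?;
    // Move back at most `n` bytes, without going before the start.
    reader.seek(SeekFrom::Start(len.saturating_sub(n)))?;
    let mut tail = Vec::new();
    reader.read_to_end(&mut tail)?;
    Ok(tail)
}
```

A std::fs::File or a std::io::Cursor over a Vec<u8> can be passed to read_tail, because both implement Read and Seek. A network response body cannot, since bytes that have already been consumed are gone.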

As a workaround, you can copy the response body into a temporary file:

copy_to(&mut tmpfile)

As File implements both Seek and Read, zip can be used here:

zip::ZipArchive::new(tmpfile)

This is a working example of the described method:

extern crate reqwest;
extern crate tempfile;
extern crate zip;

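fn main() {
    // Anonymous temporary file; File implements both Read and Seek.
    let mut tmpfile = tempfile::tempfile().unwrap();
    // Note: in reqwest 0.10 and later, the blocking call is
    // reqwest::blocking::get (behind the "blocking" feature).
    reqwest::get("myfile.zip")
        .unwrap()
        .copy_to(&mut tmpfile)
        .unwrap(); // do not silently ignore a failed download
    let zip = zip::ZipArchive::new(tmpfile).unwrap();
    println!("{:#?}", zip);
}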

tempfile is a handy crate that creates a temporary file for you, so you don't have to come up with a name.
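If adding the tempfile crate is not an option, the spooling step itself can be sketched with the standard library alone. This is only an illustration under stated assumptions: spool_to_file is a hypothetical helper, the caller chooses the path, and unlike a file from tempfile::tempfile() the result is not removed automatically.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Hypothetical helper: copies everything from `source` into the file at
/// `path` and returns that file rewound to the start, so it can be handed
/// to a consumer that needs `Read + Seek`.
fn spool_to_file<R: Read>(mut source: R, path: &Path) -> io::Result<File> {
    let mut file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;
    // `io::copy` streams in chunks, so the whole body is never held in memory.
    io::copy(&mut source, &mut file)?;
    // Rewind so that subsequent plain reads start at the first byte.
    file.seek(SeekFrom::Start(0))?;
    Ok(file)
}
```

Any Read value works as the source, for example a blocking HTTP response body. The caller is responsible for deleting the file at path once the archive has been processed.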

Solution 2:[2]

This is how I'd read the file hello.txt, whose content is hello world, from the archive hello.zip served by a local server:

extern crate reqwest;
extern crate zip;

use std::io::Read;

fn main() {
    let mut res = reqwest::get("http://localhost:8000/hello.zip").unwrap();

    // Buffer the whole archive in memory; fail loudly on I/O errors.
    let mut buf: Vec<u8> = Vec::new();
    res.read_to_end(&mut buf).unwrap();

    // Cursor adds Seek on top of the in-memory bytes.
    let reader = std::io::Cursor::new(buf);
    let mut zip = zip::ZipArchive::new(reader).unwrap();

    let mut file_zip = zip.by_name("hello.txt").unwrap();
    let mut file_buf: Vec<u8> = Vec::new();
    file_zip.read_to_end(&mut file_buf).unwrap();

    let content = String::from_utf8(file_buf).unwrap();

    println!("{}", content);
}

This will output hello world.
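Because this approach holds the entire archive in memory, it may be worth capping how much is read when the size of the download is not known in advance. The following is a minimal sketch using only the standard library; read_capped and its limit parameter are hypothetical names introduced for illustration.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical helper: reads `source` fully into memory, but fails if it
/// yields more than `limit` bytes. The returned `Cursor` implements
/// `Read + Seek`.
fn read_capped<R: Read>(source: R, limit: u64) -> io::Result<Cursor<Vec<u8>>> {
    let mut buf = Vec::new();
    // Allow one byte past the limit so an oversized body can be detected.
    source.take(limit.saturating_add(1)).read_to_end(&mut buf)?;
    if buf.len() as u64 > limit {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "body exceeds the configured size limit",
        ));
    }
    Ok(Cursor::new(buf))
}
```

The returned cursor could then be passed to zip::ZipArchive::new in place of the cursor built by hand in the example above.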

Solution 3:[3]

Async solution using Tokio

It's a bit convoluted, but you can do this using tokio, futures, tokio_util::compat and async_compression. The key is to create a futures::io::AsyncRead stream using .into_async_read() and then convert it into a tokio::io::AsyncRead using .compat().

For simplicity, the example downloads a gzip-compressed text file (.txt.gz) rather than a ZIP archive and prints it line by line.

use async_compression::tokio::bufread::GzipDecoder;
use futures::stream::TryStreamExt;
use tokio::io::AsyncBufReadExt;
use tokio_util::compat::FuturesAsyncReadCompatExt;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let url = "https://f001.backblazeb2.com/file/korteur/hello-world.txt.gz";
    let response = reqwest::get(url).await?;
    let stream = response
        .bytes_stream()
        .map_err(|e| futures::io::Error::new(futures::io::ErrorKind::Other, e))
        .into_async_read()
        .compat();
    let gzip_decoder = GzipDecoder::new(stream);

    // Print decompressed txt content
    let buf_reader = tokio::io::BufReader::new(gzip_decoder);
    let mut lines = buf_reader.lines();
    while let Some(line) = lines.next_line().await? {
        println!("{line}");
    }

    Ok(())
}

Credit to Benjamin Kay.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 mgul
Solution 3