Decompress archives directly with cURL and command line

Downloading and decompressing archives is a common task. Typically, you'd download the archive first, then decompress it. But did you know you can streamline this process by combining cURL and command-line tools? Let's explore how.

Basic cURL commands for downloading

cURL is a powerful command-line tool for transferring data with URLs. To simply download a file, you typically use:
```shell
curl -fsSLo archive.zip https://example.com/archive.zip
```
The flags used have specific purposes:

- `-f`: Fail silently on server errors (HTTP response codes >= 400). cURL won't output HTML or other error content.
- `-s`: Silent mode. Don't show the progress meter or error messages. Makes cURL quiet.
- `-S`: Show error. When used with `-s`, it will still show an error message if cURL fails.
- `-L`: Follow redirects. If the server responds with a redirect (3xx), cURL will follow it.
- `-o <filename>`: Write output to `<filename>` instead of stdout.
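As a minimal sketch of these flags in a script, the example below checks curl's exit status so a failed download stops the script. The URL is a stand-in: it uses a local `file://` source purely so the snippet can run without network access; in practice you'd point it at your real `https://` URL.

```shell
#!/bin/sh
# Sketch only: a local file:// source stands in for a real https:// URL.
src="$(mktemp)"
printf 'hello\n' > "$src"

# -f makes curl exit non-zero on HTTP errors, so the if-branch catches failures
if curl -fsSLo downloaded.txt "file://$src"; then
  echo "ok: $(cat downloaded.txt)"
else
  echo "Download failed" >&2
  exit 1
fi
rm -f "$src"
```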
Now, let's see how to combine downloading and decompression for different archive formats.
Handling different archive formats
The approach varies depending on the archive type and the capabilities of the decompression tool.
Zip archives (`.zip`)

The standard `unzip` command doesn't reliably support reading from standard input (stdin), which prevents direct piping from cURL. The most robust method is to download to a temporary file first, then extract, and finally remove the temporary file.
```shell
# Download, extract to current directory, then clean up
curl -fsSL https://example.com/archive.zip -o temp.zip || { echo "Download failed"; exit 1; }
unzip temp.zip || { echo "Extraction failed"; rm temp.zip; exit 1; }
rm temp.zip
```
To extract the contents to a specific directory, use the `-d` option with `unzip`:

```shell
# Download, extract to a specific directory, then clean up
curl -fsSL https://example.com/archive.zip -o temp.zip || { echo "Download failed"; exit 1; }
unzip temp.zip -d /path/to/extract || { echo "Extraction failed"; rm temp.zip; exit 1; }
rm temp.zip
```
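One way to make the temp-file dance more robust is a `trap` that removes the temporary archive on every exit path, so a failed extraction can't leave stray files behind. This is a sketch, not a drop-in recipe: it substitutes a local `file://` source for the real URL and comments out the `unzip` step so it runs offline.

```shell
#!/bin/sh
# Sketch: trap-based cleanup for the temporary archive.
# A local file:// source stands in for a real https:// URL.
src="$(mktemp)"
printf 'pretend zip bytes' > "$src"

tmpzip="$(mktemp)"
echo "$tmpzip" > tmpzip.path            # recorded only so cleanup can be verified
trap 'rm -f "$tmpzip" "$src"' EXIT      # runs on normal exit and on failures alike

curl -fsSL "file://$src" -o "$tmpzip" || { echo "Download failed"; exit 1; }
# unzip "$tmpzip" -d /path/to/extract   # the real extraction step would go here
echo "downloaded ok"
```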
Tar gzip archives (`.tar.gz` or `.tgz`)

For `.tar.gz` files, you can directly pipe the output of cURL to the `tar` command, since `tar` can read the archive data from stdin.
```shell
# Download and extract to current directory
curl -fsSL https://example.com/archive.tar.gz | tar xzf - || { echo "Download or extraction failed"; exit 1; }
```
Here, the `tar` options are:

- `x`: Extract files from an archive.
- `z`: Filter the archive through `gzip` (for `.gz` compression).
- `f -`: Read the archive from the specified file; `-` signifies standard input.
To extract to a specific directory, use the `-C` option (note the capital C):

```shell
# Download and extract to a specific directory
curl -fsSL https://example.com/archive.tar.gz | tar xzf - -C /path/to/extract || { echo "Download or extraction failed"; exit 1; }
```
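To see this pipeline end to end without touching the network, the sketch below builds a small `.tar.gz` locally and streams it through the same `curl | tar` pipe via a `file://` URL. The file and directory names are made up for illustration.

```shell
#!/bin/sh
# Sketch: exercise the curl-to-tar pipeline offline with a locally built archive.
workdir="$(mktemp -d)"
mkdir -p "$workdir/src"
printf 'hello from tar\n' > "$workdir/src/greeting.txt"
tar czf "$workdir/archive.tar.gz" -C "$workdir" src

# Same pipeline as above, with file:// standing in for https://
mkdir -p extracted
curl -fsSL "file://$workdir/archive.tar.gz" | tar xzf - -C extracted \
  || { echo "Download or extraction failed"; exit 1; }

cat extracted/src/greeting.txt
```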
Gzip archives (`.gz`)

For single files compressed with `gzip` (ending in `.gz`), you can pipe the cURL output to `gunzip`. Since `gunzip` writes the decompressed content to stdout by default, you need to redirect it to a file.
```shell
# Download and decompress to 'outputfile'
curl -fsSL https://example.com/file.gz | gunzip > outputfile || { echo "Download or decompression failed"; exit 1; }
```
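As with the tar example, this round trip can be demonstrated offline by gzipping a file locally and fetching it over `file://` (the content and names here are placeholders):

```shell
#!/bin/sh
# Sketch: exercise curl | gunzip offline with a locally gzipped file.
src="$(mktemp)"
printf 'compressed greeting\n' | gzip > "$src"

curl -fsSL "file://$src" | gunzip > outputfile \
  || { echo "Download or decompression failed"; exit 1; }

cat outputfile
rm -f "$src"
```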
If the `.gz` archive itself contains a tar archive (like a `.tar.gz` but named just `.gz`), you would pipe to `tar` instead, similar to the `.tar.gz` example:

```shell
# If file.gz actually contains a tar archive
curl -fsSL https://example.com/file.gz | tar xzf - || { echo "Download or extraction failed"; exit 1; }
```
Security considerations

When downloading and extracting archives directly from URLs, especially in automated scripts, keep these security practices in mind:

- Verify Sources: Only download archives from trusted sources. Malicious archives can contain malware or exploit vulnerabilities in decompression tools.
- Dedicated Directories: Extract archives into dedicated, empty directories whenever possible. This prevents accidental overwriting of existing files.
- Path Traversal: Be cautious of archives containing files with absolute paths or paths that traverse upwards (`../`). Some extraction tools have options to mitigate this (e.g., `tar`'s `--strip-components` can sometimes help, but careful review is needed). Malicious archives might try to overwrite system files.
- Resource Exhaustion: Very large archives or "zip bombs" (small archives that decompress to enormous sizes) can exhaust disk space or memory. Set limits or monitor resource usage if dealing with untrusted archives.
- Permissions: Avoid running download and extraction commands as root or with unnecessary privileges. Extracted files might inherit permissions that could be insecure.
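A practical safeguard that complements the points above is verifying a published checksum before extracting anything. In the sketch below the "expected" SHA-256 is computed locally so the example runs offline; in a real script it would be a constant copied from the vendor's release page, and the download would come from a real URL rather than the placeholder `file://` source.

```shell
#!/bin/sh
# Sketch: refuse to extract unless the download matches a known SHA-256.
src="$(mktemp)"
printf 'release payload\n' > "$src"
expected="$(sha256sum "$src" | awk '{print $1}')"   # normally a published constant

curl -fsSLo download.bin "file://$src" || { echo "Download failed"; exit 1; }

actual="$(sha256sum download.bin | awk '{print $1}')"
if [ "$actual" = "$expected" ]; then
  echo "checksum verified"
else
  echo "checksum mismatch, refusing to extract" >&2
  rm -f download.bin
  exit 1
fi
rm -f "$src"
```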
Common pitfalls and troubleshooting tips

- Incorrect URL or Network Issues: Double-check the URL. Use `curl -v` (verbose) to diagnose connection problems if the download fails.
- Permission Errors: Ensure your script or user has write permissions in the target extraction directory. Check the output of the `unzip` or `tar` command for permission-denied errors.
- Unsupported Archive Format: Make sure you're using the correct tool (`unzip`, `tar`, `gunzip`) for the archive type. The `file archive.ext` command can help identify the type.
- Corrupted Downloads: Network issues can lead to incomplete or corrupted downloads. Add the `--retry 3` flag to cURL to automatically retry failed downloads a few times, which can help on unstable connections.
- Disk Space: Ensure sufficient disk space is available before starting the download and extraction, especially for large archives.
- Tool Not Found: Ensure `curl`, `unzip`, `tar`, and `gunzip` are installed on your system.
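For that last pitfall, a script can fail early with a clear message instead of dying mid-pipeline. A minimal preflight check (the tool list here is trimmed to `curl`, `tar`, and `gzip`; add `unzip` or `gunzip` as your script requires):

```shell
#!/bin/sh
# Sketch: verify required tools exist before doing any work.
for tool in curl tar gzip; do
  command -v "$tool" >/dev/null 2>&1 || { echo "missing required tool: $tool" >&2; exit 1; }
done
echo "all tools present"
```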
Streamlining your workflow

Combining cURL with command-line decompression tools is particularly useful in scenarios like:

- Setting up development environments.
- Automating software installations in CI/CD pipelines.
- Fetching and processing data sets.
- Updating application dependencies fetched as archives.

This approach avoids saving the intermediate archive file, saving disk space and potentially speeding up workflows, especially for `.tar.gz` and `.gz` files where direct streaming is possible.
If you need a more robust, programmatic solution for handling various archive formats within your application, consider using a dedicated service. For instance, Transloadit's File Compressing service includes a 🤖 /file/decompress Robot that supports multiple formats (ZIP, 7-Zip, RAR, GNU tar, ISO9660, CAB, LHA/LZH, XAR) and incorporates security measures like preventing symlink-based attacks.