Secure API file uploads with magic numbers

Validating file uploads is crucial for API security. Relying solely on file extensions or MIME types can leave your application vulnerable to spoofing attacks. A more secure method involves checking the file's magic numbers, also known as file signatures.
Why file extension validation isn't enough
File extensions and MIME types are easily spoofed. Attackers can rename malicious files to appear harmless, bypassing basic validation checks. For example, a malicious executable could be renamed document.pdf, fooling systems that only check the extension.
MIME type validation is equally unreliable because browsers and clients can set arbitrary Content-Type headers. This makes magic-number validation essential for robust file verification.
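To make the gap concrete, here is a minimal sketch in Python using only the standard library. The filename, header value, and payload are made-up illustration data: the payload starts with the MZ bytes of a Windows executable but is presented as a PDF.

```python
import mimetypes

# Hypothetical upload: executable-style bytes ("MZ" header) sent under a PDF name.
filename = "document.pdf"
declared_content_type = "application/pdf"  # client-controlled header value
payload = b"MZ\x90\x00\x03\x00\x00\x00" + b"\x00" * 56

# Both metadata checks only inspect strings supplied by the client.
guessed_from_name, _ = mimetypes.guess_type(filename)
extension_check_passes = guessed_from_name == "application/pdf"
header_check_passes = declared_content_type == "application/pdf"

# The content check inspects the bytes themselves: PDFs start with "%PDF-".
content_check_passes = payload.startswith(b"%PDF-")

print(extension_check_passes, header_check_passes, content_check_passes)
# True True False
```

Only the check that reads the bytes notices the mismatch.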
Understand magic numbers and file signatures
Magic numbers are short byte sequences, usually found right at the start of a file, that identify its format. Because they are part of the file's content rather than client-supplied metadata, they are a more trustworthy signal than a filename or a Content-Type header, although an attacker can still prepend a valid signature to other data. That makes them a first line of defense for secure upload endpoints, not the only one.
Common magic numbers
Below is a short but frequently used list. For a comprehensive overview, see Wikipedia's “List of file signatures.”
Format | Hex (offset 0 unless noted) | Notes |
---|---|---|
PNG | 89 50 4E 47 0D 0A 1A 0A | Always 8 bytes |
JPEG | FF D8 FF DB, FF D8 FF E0, FF D8 FF E1 | Covers JFIF and EXIF variants |
GIF | 47 49 46 38 37 61 (GIF87a), 47 49 46 38 39 61 (GIF89a) | Six bytes |
PDF | 25 50 44 46 2D (%PDF-) | Five bytes |
ZIP | 50 4B 03 04, 50 4B 05 06 | DOCX, ODT, and APK are ZIP containers |
MP4 | 66 74 79 70 (offset 4) | Preceded by four-byte size field |
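The table can be turned into a lookup directly. The following is a minimal Python sketch, not a replacement for a maintained detection library: the SIGNATURES list and the sniff_mime function are names invented for this example, and the MIME strings are the conventional ones for each format.

```python
from typing import Optional

# (offset, signature bytes, MIME type), transcribed from the table above.
SIGNATURES = [
    (0, bytes.fromhex("89504E470D0A1A0A"), "image/png"),
    (0, bytes.fromhex("FFD8FFDB"), "image/jpeg"),
    (0, bytes.fromhex("FFD8FFE0"), "image/jpeg"),
    (0, bytes.fromhex("FFD8FFE1"), "image/jpeg"),
    (0, b"GIF87a", "image/gif"),
    (0, b"GIF89a", "image/gif"),
    (0, b"%PDF-", "application/pdf"),
    (0, bytes.fromhex("504B0304"), "application/zip"),
    (0, bytes.fromhex("504B0506"), "application/zip"),
    # "ftyp" sits after a four-byte size field; other ISO media formats
    # share this marker, so treat the MP4 label as an approximation.
    (4, b"ftyp", "video/mp4"),
]


def sniff_mime(head: bytes) -> Optional[str]:
    """Return the MIME type whose signature matches the leading bytes, if any."""
    for offset, signature, mime in SIGNATURES:
        if head[offset:offset + len(signature)] == signature:
            return mime
    return None


print(sniff_mime(b"%PDF-1.7\n"))                    # application/pdf
print(sniff_mime(b"\x00\x00\x00\x18ftypmp42"))      # video/mp4
print(sniff_mime(b"MZ\x90\x00"))                    # None
```

A ZIP match only says the container is a ZIP archive. Telling DOCX from ODT or APK requires inspecting the archive's entries.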
Implement magic-number validation in Node.js
The popular file-type package (v22 at the time of writing) detects signatures from a Buffer. The library is ESM-only, so make sure your project’s package.json contains "type": "module" or use .mjs files.
Read only the first n bytes
import fs from 'node:fs/promises'
import { fileTypeFromBuffer } from 'file-type'

/**
 * Validate a file by magic number.
 * @param {string} filePath Absolute or relative path to the file on disk
 * @param {string[]} allowList Array of allowed MIME types
 */
export async function validateFileType(filePath, allowList = []) {
  // Default allow-list: PNG, JPEG, and PDF
  const allowedTypes = allowList.length ? allowList : ['image/png', 'image/jpeg', 'application/pdf']
  const MAX_BYTES = 8192 // 8 KiB is enough for almost all formats

  const fh = await fs.open(filePath, 'r')
  let head
  try {
    const buffer = Buffer.alloc(MAX_BYTES)
    const { bytesRead } = await fh.read(buffer, 0, MAX_BYTES, 0)
    // Keep only the bytes that were actually read (short files)
    head = buffer.subarray(0, bytesRead)
  } finally {
    // Release the file handle even if the read fails
    await fh.close()
  }

  const type = await fileTypeFromBuffer(head)
  if (!type) throw new Error('Unknown or unsupported file type')
  if (!allowedTypes.includes(type.mime)) {
    throw new Error(`Disallowed file type: ${type.mime}`)
  }
  return { ...type, valid: true }
}

// Example
// (async () => {
//   await validateFileType('uploads/avatar.png')
// })()
Validate uploads with Express.js
import express from 'express'
import multer from 'multer'
import { fileTypeFromBuffer } from 'file-type'

const app = express()

// memoryStorage keeps the whole upload in RAM, so cap the size.
// The 5 MiB limit is an example value; pick one that suits your API.
const upload = multer({
  storage: multer.memoryStorage(),
  limits: { fileSize: 5 * 1024 * 1024 },
})

app.post('/upload', upload.single('file'), async (req, res) => {
  try {
    if (!req.file) return res.status(400).json({ error: 'No file uploaded' })

    const type = await fileTypeFromBuffer(req.file.buffer)
    if (!type || !['image/png', 'image/jpeg'].includes(type.mime)) {
      return res.status(400).json({ error: 'Invalid file type' })
    }

    // TODO: Store file or forward to further processing
    res.json({ message: 'File validated', mime: type.mime, size: req.file.size })
  } catch (err) {
    res.status(500).json({ error: err.message })
  }
})

app.listen(3000, () => console.log('API listening on :3000'))
Validate files in Python with python-magic
python-magic is a thin wrapper around the libmagic C library, so installation differs by platform:
# Debian/Ubuntu
sudo apt-get install python3-magic libmagic1
# macOS (homebrew)
brew install libmagic
pip install python-magic
# Windows (pre-built binaries)
pip install python-magic-bin
from __future__ import annotations  # lets "str | Path" work before Python 3.10

from pathlib import Path

import magic


class FileValidator:
    """Validate MIME type using libmagic signatures."""

    def __init__(self, allowed=None):
        self.allowed = set(allowed or {
            'image/png',
            'image/jpeg',
            'application/pdf',
        })
        self._mime = magic.Magic(mime=True)

    def validate(self, file_path: str | Path) -> dict:
        path = Path(file_path)
        if not path.is_file():
            raise FileNotFoundError(path)

        # libmagic inspects the file content, not its name
        mime_type = self._mime.from_file(str(path))
        if mime_type not in self.allowed:
            raise ValueError(f'Blocked MIME: {mime_type}')

        return {
            'mime': mime_type,
            'size': path.stat().st_size,
            'valid': True,
        }


# Example
# validator = FileValidator()
# print(validator.validate('uploads/report.pdf'))
Handle edge cases and polyglot files
Some attackers craft “polyglot” files that carry multiple valid headers (for example, image + ZIP). After the initial signature check, search deeper into the buffer for suspicious patterns:
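import { fileTypeFromBuffer } from 'file-type'

export async function strictValidation(buffer, allow) {
  const type = await fileTypeFromBuffer(buffer)
  if (!type || !allow.includes(type.mime)) throw new Error('Invalid type')

  // Byte patterns that should not appear deep inside a plain image or PDF.
  // Short patterns such as MZ can occur by chance in compressed data, and
  // ZIP-based formats legitimately contain many local headers, so tune this
  // list for each format you allow.
  const redFlags = [
    Buffer.from('4d5a', 'hex'), // MZ (Windows PE header)
    Buffer.from('3c68746d6c', 'hex'), // <html
    Buffer.from('504b0304', 'hex'), // Local ZIP header far from offset 0
  ]
  const SKIP_BYTES = 16 // Ignore the file's own leading signature

  for (const flag of redFlags) {
    // Search the bytes directly: matching on a hex string can hit at
    // half-byte positions and report patterns that are not in the file
    if (buffer.indexOf(flag, SKIP_BYTES) !== -1) {
      throw new Error('Polyglot content detected')
    }
  }
  return type
}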
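For the image + ZIP case specifically, a structural check can be more precise than pattern scanning. The sketch below is in Python and assumes the upload was already identified as a PNG. It walks the PNG chunk list (four-byte length, four-byte type, data, four-byte CRC) and reports how many bytes follow the final IEND chunk. The function name and the sample data are invented for illustration, and the sketch does not verify CRCs or look inside ancillary chunks.

```python
import struct
import zlib

PNG_SIGNATURE = bytes.fromhex("89504E470D0A1A0A")


def png_trailing_bytes(data: bytes) -> int:
    """Return the number of bytes found after the PNG IEND chunk.

    Raises ValueError if the chunk structure is incomplete.
    """
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 8 + length + 4  # chunk header + data + CRC
        if end > len(data):
            raise ValueError("truncated chunk")
        if chunk_type == b"IEND":
            return len(data) - end
        pos = end
    raise ValueError("IEND chunk not found")


def _chunk(chunk_type: bytes, body: bytes) -> bytes:
    """Build one PNG chunk (helper for the demo data only)."""
    crc = zlib.crc32(chunk_type + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)


# Structural skeleton of a PNG (not a decodable image), then the same bytes
# with a ZIP local header appended, as an image + ZIP polyglot would have.
clean_png = PNG_SIGNATURE + _chunk(b"IHDR", bytes(13)) + _chunk(b"IEND", b"")
polyglot = clean_png + b"PK\x03\x04" + b"zip payload"

print(png_trailing_bytes(clean_png))  # 0
print(png_trailing_bytes(polyglot))   # 15
```

Rejecting any PNG with a non-zero result is a strict policy. Whether that suits your API depends on how much benign trailing data your legitimate uploads contain.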
Performance tips for large files
- Read only the first few KiB. file-type has documented 4,100 bytes as the minimum sample it needs, though a few formats may require more.
- Process uploads as streams to avoid buffering entire gigabyte-sized files in memory.
- Cache allowed-type arrays and regular expressions, especially in serverless environments where cold starts are expensive.
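The first two tips can be combined: sniff a small head of the stream, then pass the rest through in chunks. Here is a minimal Python sketch that works on any binary file-like object. The names read_head and iter_upload, the chunk size, and the prefix-based allow-list are assumptions made for this example.

```python
import io
from typing import BinaryIO, Iterable, Iterator

SNIFF_BYTES = 4100       # sample size mentioned above; tune as needed
CHUNK_SIZE = 64 * 1024   # arbitrary chunk size for this sketch


def read_head(stream: BinaryIO, n: int = SNIFF_BYTES) -> bytes:
    """Read up to n bytes, looping because read() may return short results."""
    parts = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            break
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)


def iter_upload(stream: BinaryIO, allowed_prefixes: Iterable[bytes]) -> Iterator[bytes]:
    """Yield the upload in chunks, rejecting it before any data is passed on."""
    head = read_head(stream)
    if not head.startswith(tuple(allowed_prefixes)):
        raise ValueError("disallowed file signature")
    yield head  # the sniffed bytes are part of the file, so forward them too
    while True:
        chunk = stream.read(CHUNK_SIZE)
        if not chunk:
            break
        yield chunk


png_signature = bytes.fromhex("89504E470D0A1A0A")
upload = io.BytesIO(png_signature + bytes(200_000))
chunks = list(iter_upload(upload, [png_signature]))
print(len(chunks[0]), sum(len(c) for c in chunks))  # 4100 200008
```

Because iter_upload is a generator, the signature check runs when the consumer requests the first chunk, so nothing is written downstream for a rejected file.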
Combine magic-number validation with other controls
Magic numbers show that a file's leading bytes match the expected format, nothing more. Strengthen your upload pipeline by layering additional guards:
- File-size limits (Content-Length and on-disk checks)
- Virus/malware scanning (ClamAV or a commercial API)
- Rate limiting and authentication on upload endpoints
- Content-Security-Policy headers when you serve user-supplied media
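As an illustration of the first guard, the following Python sketch rejects an upload early when the declared Content-Length is too large, then counts the bytes actually received, since the header can be missing or wrong. The 5 MiB cap, the UploadTooLarge exception, and the copy_with_limit function are hypothetical names and values chosen for this example.

```python
import io
from typing import BinaryIO, Optional

MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # hypothetical 5 MiB cap


class UploadTooLarge(Exception):
    """Raised when an upload exceeds the configured size limit."""


def copy_with_limit(
    src: BinaryIO,
    dst: BinaryIO,
    declared_length: Optional[int] = None,
    limit: int = MAX_UPLOAD_BYTES,
    chunk_size: int = 64 * 1024,
) -> int:
    """Copy src to dst, enforcing the limit on both declared and actual size."""
    # Cheap early rejection based on the client-supplied Content-Length.
    if declared_length is not None and declared_length > limit:
        raise UploadTooLarge(f"declared {declared_length} bytes, limit is {limit}")

    written = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return written
        written += len(chunk)
        # The header is not trustworthy, so count what actually arrives.
        if written > limit:
            raise UploadTooLarge(f"received more than {limit} bytes")
        dst.write(chunk)


sink = io.BytesIO()
print(copy_with_limit(io.BytesIO(b"x" * 10), sink, declared_length=10, limit=100))  # 10
```

When the limit is hit mid-transfer, the caller still has to discard whatever was already written to dst.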
Test your implementation
A few unit tests (using Vitest, for example) help ensure future refactors don’t break validation. Test with valid files, malformed files, and edge cases like empty files or files with incorrect extensions.
import { describe, expect, it } from 'vitest'
import { fileTypeFromBuffer } from 'file-type'

const png = Buffer.from('89504e470d0a1a0a', 'hex')
const empty = Buffer.alloc(0)
const jpeg = Buffer.from(
  'ffd8ffe000104a46494600010100000100010000ffdb0043000101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101ffc00011080001000103012200021101031101ffc4001f0000010501010101010100000000000000000102030405060708090a0bffda000c03010002110311003f00f2a900',
  'hex',
) // A small valid JPEG

describe('magic-number validation', () => {
  it('detects PNG correctly', async () => {
    const t = await fileTypeFromBuffer(png)
    expect(t?.mime).toBe('image/png')
  })

  it('detects JPEG correctly', async () => {
    const t = await fileTypeFromBuffer(jpeg)
    expect(t?.mime).toBe('image/jpeg')
  })

  it('rejects empty buffers', async () => {
    const t = await fileTypeFromBuffer(empty)
    expect(t).toBeUndefined()
  })
})
Troubleshoot common issues
Symptom | Possible cause | Fix |
---|---|---|
Error: Unknown or unsupported file type | file-type cannot match any signature | Increase read size to 16 KiB; make sure the file is not encrypted or truncated |
Module not found: file-type | Using CommonJS require() | Switch to ESM or use import('file-type') dynamic import |
ImportError: failed to find libmagic | libmagic missing on OS | Install using package manager or use the -bin wheel on Windows |
Wrap-up
Magic-number checks offer a low-latency, high-confidence way to verify uploads before you store or process them. When combined with size limits, malware scanning, and robust error handling, they form an effective shield against many file upload attacks.
Need an easier way to handle uploads at scale? Check out our handling uploads service at Transloadit.