Validating file uploads is crucial for API security. Relying solely on file extensions or MIME types can leave your application vulnerable to spoofing attacks. A more secure method involves checking the file's magic numbers, also known as file signatures.

Why file extension validation isn't enough

File extensions and MIME types are easily spoofed. Attackers can rename malicious files to appear harmless, bypassing basic validation checks. For example, a malicious executable could be renamed document.pdf, fooling systems that only check the extension.

MIME type validation is equally unreliable because browsers and clients can set arbitrary Content-Type headers. This makes magic-number validation essential for robust file verification.
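
To make the spoofing risk concrete, here is a minimal sketch that compares what a filename claims with what the leading bytes actually say. The helper name `extensionMatchesContent` and the two-entry signature map are illustrative assumptions, not part of any library.

```javascript
// Hypothetical lookup: extension -> expected leading bytes.
// Only two formats are covered, purely for illustration.
const EXPECTED_SIGNATURES = new Map([
  ['pdf', Buffer.from('%PDF-', 'latin1')],
  ['png', Buffer.from('89504e470d0a1a0a', 'hex')],
])

function extensionMatchesContent(fileName, buffer) {
  const ext = fileName.split('.').pop().toLowerCase()
  const signature = EXPECTED_SIGNATURES.get(ext)
  if (!signature) return false // unknown extension: treat as a mismatch
  // Compare the first bytes of the upload with the expected signature
  return buffer.subarray(0, signature.length).equals(signature)
}

// A Windows executable (starts with "MZ") renamed to document.pdf
const disguised = Buffer.from('4d5a900003000000', 'hex')
console.log(extensionMatchesContent('document.pdf', disguised)) // false

// A buffer that really begins with the PDF signature
const realPdf = Buffer.from('%PDF-1.7\n', 'latin1')
console.log(extensionMatchesContent('document.pdf', realPdf)) // true
```

An extension-only check would accept both files; the byte comparison rejects the first one.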

Understand magic numbers and file signatures

Magic numbers are short byte sequences, usually found right at the start of a file, that identify its format. Because they are part of the file's binary structure, an attacker cannot change them as casually as a filename or a Content-Type header: a file with the wrong leading bytes is usually rejected by the parser that is supposed to open it. That makes them a strong first check for upload endpoints, though not proof that a file is safe.

Common magic numbers

Below is a short but frequently used list. For a comprehensive overview, see Wikipedia's “List of file signatures.”

| Format | Hex signature (offset 0 unless noted) | Notes |
| --- | --- | --- |
| PNG | 89 50 4E 47 0D 0A 1A 0A | Always 8 bytes |
| JPEG | FF D8 FF DB, FF D8 FF E0, FF D8 FF E1 | Covers JFIF and EXIF variants |
| GIF | 47 49 46 38 37 61 (GIF87a), 47 49 46 38 39 61 (GIF89a) | Six bytes |
| PDF | 25 50 44 46 2D (%PDF-) | Five bytes |
| ZIP | 50 4B 03 04, 50 4B 05 06 | DOCX, ODT, and APK are ZIP containers |
| MP4 | 66 74 79 70 (offset 4) | Preceded by four-byte size field |
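
The list above can be turned into a small lookup table. The following is a hand-rolled sketch for illustration only: the function name `sniffMime` and the table layout are assumptions, and a maintained library covers far more formats and edge cases.

```javascript
// Signature table mirroring the list above: [mime, offset, hex bytes].
const SIGNATURE_TABLE = [
  ['image/png', 0, '89504e470d0a1a0a'],
  ['image/jpeg', 0, 'ffd8ff'], // shared prefix of the JPEG variants
  ['image/gif', 0, '474946383761'], // GIF87a
  ['image/gif', 0, '474946383961'], // GIF89a
  ['application/pdf', 0, '255044462d'], // %PDF-
  ['application/zip', 0, '504b0304'],
  ['application/zip', 0, '504b0506'], // empty archive
  // "ftyp" follows the 4-byte size field. Other ISO base media formats
  // (for example HEIC) also use it, so this is a coarse match.
  ['video/mp4', 4, '66747970'],
].map(([mime, offset, hex]) => ({ mime, offset, bytes: Buffer.from(hex, 'hex') }))

function sniffMime(buffer) {
  for (const { mime, offset, bytes } of SIGNATURE_TABLE) {
    const end = offset + bytes.length
    if (buffer.length >= end && buffer.subarray(offset, end).equals(bytes)) {
      return mime
    }
  }
  return undefined // no known signature matched
}

console.log(sniffMime(Buffer.from('89504e470d0a1a0a0000000d', 'hex'))) // image/png
console.log(sniffMime(Buffer.from('0000002066747970', 'hex'))) // video/mp4
console.log(sniffMime(Buffer.from('hello world'))) // undefined
```

Storing the offset next to each signature keeps formats like MP4, whose marker does not sit at byte 0, in the same loop as the others.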

Implement magic-number validation in Node.js

The popular file-type package (v22 at the time of writing) detects signatures from a Buffer. The library is ESM-only, so make sure your project’s package.json contains "type": "module" or use .mjs files.

Read only the first n bytes

import fs from 'node:fs/promises'
import { fileTypeFromBuffer } from 'file-type'

/**
 * Validate a file by magic number.
 * @param {string} filePath Absolute or relative path to the file on disk
 * @param {string[]} allowList Array of allowed MIME types
 */
export async function validateFileType(filePath, allowList = []) {
  // Default allow-list: PNG, JPEG, and PDF
  const allowedTypes = allowList.length ? allowList : ['image/png', 'image/jpeg', 'application/pdf']

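  const MAX_BYTES = 8192 // 8 KiB is a generous sample for signature detection
  const fh = await fs.open(filePath, 'r')
  const buffer = Buffer.alloc(MAX_BYTES)
  let bytesRead = 0
  try {
    const result = await fh.read(buffer, 0, MAX_BYTES, 0)
    bytesRead = result.bytesRead
  } finally {
    await fh.close() // always release the handle, even if the read fails
  }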

  const type = await fileTypeFromBuffer(buffer.subarray(0, bytesRead))
  if (!type) throw new Error('Unknown or unsupported file type')
  if (!allowedTypes.includes(type.mime)) {
    throw new Error(`Disallowed file type: ${type.mime}`)
  }
  return { ...type, valid: true }
}

// Example
// (async () => {
//   await validateFileType('uploads/avatar.png')
// })()

Validate uploads with Express.js

import express from 'express'
import multer from 'multer'
import { fileTypeFromBuffer } from 'file-type'

const app = express()
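// memoryStorage buffers the whole upload in RAM, so cap the accepted size
const upload = multer({
  storage: multer.memoryStorage(),
  limits: { fileSize: 5 * 1024 * 1024 }, // 5 MiB; adjust to your needs
})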

app.post('/upload', upload.single('file'), async (req, res) => {
  try {
    if (!req.file) return res.status(400).json({ error: 'No file uploaded' })

    const type = await fileTypeFromBuffer(req.file.buffer)
    if (!type || !['image/png', 'image/jpeg'].includes(type.mime)) {
      return res.status(400).json({ error: 'Invalid file type' })
    }

    // TODO: Store file or forward to further processing
    res.json({ message: 'File validated', mime: type.mime, size: req.file.size })
  } catch (err) {
    res.status(500).json({ error: err.message })
  }
})

app.listen(3000, () => console.log('API listening on :3000'))

Validate files in Python with python-magic

python-magic is a thin wrapper around the libmagic C library, so installation differs by platform:

# Debian/Ubuntu
sudo apt-get install python3-magic libmagic1

# macOS (homebrew)
brew install libmagic
pip install python-magic

# Windows (pre-built binaries)
pip install python-magic-bin

Once libmagic is available, a small validator class can wrap it:

from pathlib import Path
import magic

class FileValidator:
    """Validate MIME type using libmagic signatures."""

    def __init__(self, allowed=None):
        self.allowed = set(allowed or {
            'image/png',
            'image/jpeg',
            'application/pdf',
        })
        self._mime = magic.Magic(mime=True)

    def validate(self, file_path: str | Path) -> dict:
        path = Path(file_path)
        if not path.is_file():
            raise FileNotFoundError(path)

        mime_type = self._mime.from_file(str(path))
        if mime_type not in self.allowed:
            raise ValueError(f'Blocked MIME: {mime_type}')

        return {
            'mime': mime_type,
            'size': path.stat().st_size,
            'valid': True,
        }

# Example
# validator = FileValidator()
# print(validator.validate('uploads/report.pdf'))

Handle edge cases and polyglot files

Some attackers craft “polyglot” files that carry multiple valid headers (for example, image + ZIP). After the initial signature check, search deeper into the buffer for suspicious patterns:

import { fileTypeFromBuffer } from 'file-type'

export async function strictValidation(buffer, allow) {
  const type = await fileTypeFromBuffer(buffer)
  if (!type || !allow.includes(type.mime)) throw new Error('Invalid type')

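  // Search raw bytes rather than a hex string, so a match can never
  // start in the middle of a byte. Short patterns such as "MZ" can occur
  // by chance in compressed data, so treat a hit as a reason to
  // quarantine the file, not as proof of an attack.
  const redFlags = [
    Buffer.from('4d5a', 'hex'), // MZ: Windows PE header
    Buffer.from('<html', 'latin1'),
    Buffer.from('504b0304', 'hex'), // Local ZIP header far from offset 0
  ]

  for (const flag of redFlags) {
    // Skip the first 16 bytes so we don't flag legitimate ZIP/PE starting bytes
    if (buffer.indexOf(flag, 16) !== -1) {
      throw new Error('Polyglot content detected')
    }
  }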
  return type
}
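
For formats with an explicit end marker, a complementary check is to confirm that nothing follows the end of the file's own structure. The sketch below walks PNG chunks up to IEND and counts any trailing bytes, which is where an appended ZIP archive would sit. The function name `pngTrailingBytes` is hypothetical, and the sketch deliberately skips CRC verification.

```javascript
const PNG_SIGNATURE = Buffer.from('89504e470d0a1a0a', 'hex')

// Returns the number of bytes after the IEND chunk, or -1 if the buffer
// is not a structurally complete PNG. CRCs are not verified here.
function pngTrailingBytes(buffer) {
  if (!buffer.subarray(0, 8).equals(PNG_SIGNATURE)) return -1
  let offset = 8
  // Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC
  while (offset + 12 <= buffer.length) {
    const length = buffer.readUInt32BE(offset)
    const type = buffer.toString('latin1', offset + 4, offset + 8)
    const next = offset + 12 + length
    if (next > buffer.length) return -1 // chunk runs past the end
    if (type === 'IEND') return buffer.length - next
    offset = next
  }
  return -1 // ran out of data before reaching IEND
}

// 1x1 PNG used only as sample data
const pixel = Buffer.from(
  'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNkYPhfDwAChwGA60e6kgAAAABJRU5ErkJggg==',
  'base64',
)
// The same image with a ZIP local-file header appended
const appended = Buffer.concat([pixel, Buffer.from('504b0304', 'hex')])

console.log(pngTrailingBytes(pixel)) // 0
console.log(pngTrailingBytes(appended)) // 4
```

A non-zero result does not prove malicious intent, since some tools append metadata, but it is a reasonable trigger for rejecting or re-encoding the image.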

Performance tips for large files

  1. Read only the first few KiB. file-type treats roughly 4,100 bytes as a reasonable sample size for detection, although a few formats may need more.
  2. Process uploads as streams to avoid buffering entire gigabyte-sized files in memory.
  3. Cache allowed-type arrays and regular expressions, especially in serverless environments where cold starts are expensive.
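
The first two tips can be combined by reading just the head of a stream and then stopping. The sketch below assumes the source is an async iterable of Buffers, which Node.js readable streams are; the function name `readHead` is hypothetical.

```javascript
// Collect at most `limit` bytes from any async iterable of Buffers
// (Node.js readable streams qualify), then stop reading.
async function readHead(source, limit = 4100) {
  const chunks = []
  let total = 0
  for await (const chunk of source) {
    chunks.push(chunk)
    total += chunk.length
    if (total >= limit) break // leaving the loop ends iteration early
  }
  return Buffer.concat(chunks).subarray(0, limit)
}

// Simulated upload: two small chunks followed by a large one
async function* fakeUpload() {
  yield Buffer.from('89504e47', 'hex')
  yield Buffer.from('0d0a1a0a', 'hex')
  yield Buffer.alloc(1024 * 1024)
}

readHead(fakeUpload(), 8).then((head) => {
  console.log(head.toString('hex')) // 89504e470d0a1a0a
})
```

Breaking out of `for await` on a Node.js readable stream destroys that stream, so this form suits sniff-then-reject flows. If the rest of the upload must be forwarded, the head has to be kept and re-sent ahead of the remaining data.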

Combine magic-number validation with other controls

Magic numbers show that a file starts like the format it claims to be, nothing more. They say nothing about whether the rest of the content is safe. Strengthen your upload pipeline by layering additional guards:

  • File-size limits (Content-Length and on-disk checks)
  • Virus/malware scanning (ClamAV or a commercial API)
  • Rate limiting and authentication on upload endpoints
  • Content-Security-Policy headers when you serve user-supplied media
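
As a rough sketch of how the size limit and the signature result might be evaluated together, the function below collects every problem it finds instead of stopping at the first. The policy object, the function name `checkUpload`, and the message strings are all hypothetical; `detectedMime` stands for whatever your signature check returned.

```javascript
// Hypothetical policy; tune the values for your own service.
const UPLOAD_POLICY = {
  maxBytes: 5 * 1024 * 1024, // 5 MiB
  allowedMimes: ['image/png', 'image/jpeg', 'application/pdf'],
}

// Returns a list of problems; an empty list means the upload passed.
function checkUpload({ size, declaredMime, detectedMime }, policy = UPLOAD_POLICY) {
  const problems = []
  if (size > policy.maxBytes) problems.push('file too large')
  if (!detectedMime) {
    problems.push('unrecognized signature')
  } else if (!policy.allowedMimes.includes(detectedMime)) {
    problems.push('type not allowed')
  }
  // A mismatch between the Content-Type header and the bytes is worth logging
  if (detectedMime && declaredMime && declaredMime !== detectedMime) {
    problems.push('declared type does not match content')
  }
  return problems
}

console.log(checkUpload({ size: 1024, declaredMime: 'image/png', detectedMime: 'image/png' }))
// []
console.log(checkUpload({
  size: 1024,
  declaredMime: 'application/pdf',
  detectedMime: 'application/x-msdownload',
}))
// [ 'type not allowed', 'declared type does not match content' ]
```

Returning all problems at once makes audit logs more useful than a single early error, at the cost of a slightly longer function.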

Test your implementation

A few unit tests (using Vitest, for example) help ensure future refactors don’t break validation. Test with valid files, malformed files, and edge cases like empty files or files with incorrect extensions.

import { describe, expect, it } from 'vitest'
import { fileTypeFromBuffer } from 'file-type'

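// Use a complete 1x1 PNG rather than the bare 8-byte signature: recent
// file-type versions read past the signature to tell PNG from APNG, so a
// signature-only buffer may come back as undefined.
const png = Buffer.from(
  'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNkYPhfDwAChwGA60e6kgAAAABJRU5ErkJggg==',
  'base64',
)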
const empty = Buffer.alloc(0)
const jpeg = Buffer.from(
  'ffd8ffe000104a46494600010100000100010000ffdb0043000101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101ffc00011080001000103012200021101031101ffc4001f0000010501010101010100000000000000000102030405060708090a0bffda000c03010002110311003f00f2a900',
  'hex',
) // A small valid JPEG

describe('magic-number validation', () => {
  it('detects PNG correctly', async () => {
    const t = await fileTypeFromBuffer(png)
    expect(t?.mime).toBe('image/png')
  })

  it('detects JPEG correctly', async () => {
    const t = await fileTypeFromBuffer(jpeg)
    expect(t?.mime).toBe('image/jpeg')
  })

  it('rejects empty buffers', async () => {
    const t = await fileTypeFromBuffer(empty)
    expect(t).toBeUndefined()
  })
})

Troubleshoot common issues

| Symptom | Possible cause | Fix |
| --- | --- | --- |
| Error: Unknown or unsupported file type | file-type cannot match any signature | Increase read size to 16 KiB; make sure the file is not encrypted or truncated |
| Module not found: file-type | Using CommonJS require() | Switch to ESM or use import('file-type') dynamic import |
| ImportError: failed to find libmagic | libmagic missing on OS | Install using package manager or use the -bin wheel on Windows |

Wrap-up

Magic-number checks offer a low-latency, high-confidence way to verify uploads before you store or process them. When combined with size limits, malware scanning, and robust error handling, they form an effective shield against many file upload attacks.

Need an easier way to handle uploads at scale? Check out our handling uploads service at Transloadit.