Boost js file uploads using Web Workers and streams

Uploading files efficiently is crucial for modern web applications. Traditional approaches often read entire files into memory, which can freeze the UI, consume excessive RAM, and lead to a poor user experience, especially with large files. By offloading heavy processing to Web Workers and streaming data in manageable chunks using JavaScript Streams, you can maintain a responsive main thread—even when users upload multi-gigabyte files.
Challenges with traditional file uploads
A common, yet inefficient, approach involves:
- Reading the entire `File` object into memory using `FileReader`.
- Constructing a `FormData` object with the file data.
- Sending the `FormData` via a `fetch` or `XMLHttpRequest` POST request.
Uploading a single 2 GB video this way can cause significant memory spikes and make the page unresponsive. Attempting multiple large uploads in parallel using this method will likely crash the browser tab due to memory exhaustion.
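For contrast, here is a minimal sketch of that naive pattern (the element ID and endpoint are illustrative placeholders, not part of the original example):
// Naive approach: the whole file is buffered in memory before anything is sent
const input = document.querySelector('#file-input')
input.addEventListener('change', () => {
  const file = input.files[0]
  const reader = new FileReader()
  reader.onload = () => {
    // reader.result now holds the ENTIRE file as an ArrayBuffer in memory
    const formData = new FormData()
    formData.append('file', new Blob([reader.result]), file.name)
    fetch('/upload', { method: 'POST', body: formData })
  }
  reader.readAsArrayBuffer(file)
})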
Meet Web Workers and streams
Web Workers
Web Workers allow you to run JavaScript code in background threads, separate from the main execution
thread that handles the UI. This means computationally intensive tasks like hashing, compression, or
chunking data for uploads won't block rendering, scrolling, or user input, resulting in a smoother
experience and improved file upload performance.
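As a minimal illustration (the worker file name and message shape here are assumptions), offloading work amounts to posting a message to a background script:
// Main thread: spawn a background thread and hand it some work to do off the UI thread
const hashWorker = new Worker('hash-worker.js') // hypothetical worker script
hashWorker.onmessage = (event) => {
  console.log('Result from worker:', event.data)
}
hashWorker.postMessage({ type: 'hash', payload: new ArrayBuffer(1024) })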
JavaScript streams
The Streams API enables processing data incrementally as chunks. Instead of loading an entire file
into memory, you can read and process small pieces (typically `Uint8Array`s) as they become available. This drastically reduces memory usage and allows data to be sent over the network almost immediately after being read, making JavaScript Streams ideal for large js file upload tasks.
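In its simplest form, reading a `File` incrementally looks roughly like this sketch (the byte counting is only there to show that chunks arrive one at a time):
// Stream a File chunk by chunk instead of buffering it whole
async function measureStreamed(file) {
  const reader = file.stream().getReader()
  let bytesSeen = 0
  while (true) {
    const { value, done } = await reader.read()
    if (done) break
    bytesSeen += value.length // value is a Uint8Array chunk
  }
  reader.releaseLock()
  console.log(`Streamed ${bytesSeen} of ${file.size} bytes`)
}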
Architecture overview
A robust parallel processing upload system using these technologies typically involves:
- Main Thread: Handles UI interactions (like drag-and-drop), file selection, and displaying progress updates. It passes `File` objects to the worker pool.
- Worker Pool: A set of Web Workers manages the file processing tasks. Each available worker receives a `File` reference.
- Individual Worker: Uses the `Blob.stream()` or `File.stream()` API to read the file chunk by chunk. Each chunk is then POSTed to the back-end upload endpoint. Progress messages (percentage complete) and status updates (completion, errors) are sent back to the main thread.
- Back-end: Receives the chunks and reassembles them into the complete file. This often involves protocols like Tus, cloud storage multipart uploads (e.g., S3 Multipart Upload), or custom server-side logic (a minimal sketch follows this list).
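The back-end is outside the scope of this post, but as a rough orientation, a minimal reassembly endpoint could look like the Node sketch below. Express and multer are assumptions, the paths are placeholders, and a separate "complete" step would concatenate the stored chunks; the field names match the client code shown later in this post.
const express = require('express')
const multer = require('multer')
const fs = require('fs')
const path = require('path')

const app = express()
const upload = multer({ dest: 'tmp/' }) // multer writes each incoming chunk to a temp file

app.post('/upload', upload.single('fileChunk'), (req, res) => {
  const { chunkIndex } = req.body
  // Derive a per-upload directory from the original file name (simplified for this sketch)
  const dir = path.join('uploads', path.basename(req.file.originalname).split('.part')[0])
  fs.mkdirSync(dir, { recursive: true })
  // Store each chunk under its index so the file can be reassembled in order later
  fs.renameSync(req.file.path, path.join(dir, `chunk_${chunkIndex}`))
  res.sendStatus(200)
})

app.listen(3000)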
Streaming a file without freezing the UI
The following examples demonstrate a minimal but practical pattern for chunked uploads using a worker pool. Note the inclusion of error handling and cleanup mechanisms.
Main thread (`main.js`)
This script sets up the worker pool and handles file input events, delegating the processing of each file to the pool.
// Assumes WorkerPool class is defined elsewhere (see below)
const pool = new WorkerPool('upload-worker.js')
const fileInput = document.querySelector('#file-input')

fileInput.addEventListener('change', (evt) => {
  const files = Array.from(evt.target.files)
  files.forEach((file) => {
    console.log(`Queueing ${file.name} for upload...`)
    pool.processFile(file, {
      onProgress: (pct, msg) => updateProgressUI(file.name, pct, msg),
      onComplete: (msg) => showSuccess(file.name, msg),
      onError: (err) => showError(file.name, err),
    })
  })
})

function updateProgressUI(filename, pct, message) {
  // Update your progress bar or UI element here
  console.log(`${filename}: ${pct.toFixed(1)}% – ${message}`)
}

function showSuccess(filename, message) {
  // Update UI to show completion
  console.info(`${filename}: ${message}`)
}

function showError(filename, error) {
  // Update UI to show error state
  console.error(`${filename}: Upload failed - ${error}`)
}

// Example: Clean up the pool when the page unloads
window.addEventListener('unload', () => {
  pool.terminate()
})
Worker implementation (`upload-worker.js`)
This worker script receives a `File` object, reads it as a stream, uploads chunks, and sends progress/status messages back.
self.onmessage = async (event) => {
  const file = event.data
  let reader // Declare reader outside the try block for finally access

  try {
    if (!file || typeof file.stream !== 'function') {
      throw new Error('Invalid file object received.')
    }

    const stream = file.stream()
    reader = stream.getReader()
    const total = file.size
    let uploaded = 0
    let chunkIndex = 0

    while (true) {
      let value, done
      try {
        // Read the next chunk from the stream
        ;({ value, done } = await reader.read())
      } catch (readError) {
        // Handle potential errors during stream reading
        throw new Error(`Error reading file stream: ${readError.message}`)
      }

      if (done) {
        // End of stream reached
        break
      }

      try {
        // Upload the current chunk
        await uploadChunk(value, file.name, chunkIndex++)
        uploaded += value.length

        // Calculate and report progress
        const pct = total > 0 ? (uploaded / total) * 100 : 100
        self.postMessage({
          type: 'progress',
          progress: pct,
          message: `Uploading chunk ${chunkIndex}...`,
        })
      } catch (uploadError) {
        // Handle chunk upload errors
        // Consider implementing retry logic here before throwing
        console.error(
          `Chunk ${chunkIndex - 1} upload failed for ${file.name}: ${uploadError.message}`,
        )
        throw new Error(`Chunk upload failed: ${uploadError.message}`)
      }
    }

    // Signal completion if all chunks were uploaded successfully
    self.postMessage({ type: 'complete', message: 'Upload finished successfully.' })
  } catch (err) {
    // Catch any errors from the process and report them back
    self.postMessage({
      type: 'error',
      message: err.message || 'An unknown error occurred in the worker.',
    })
  } finally {
    // IMPORTANT: Ensure the stream lock is released regardless of success or failure
    if (reader) {
      try {
        await reader.releaseLock()
      } catch (releaseLockError) {
        // Log error if releasing the lock fails, but don't block completion/error reporting
        console.error('Error releasing stream lock:', releaseLockError)
      }
    }
    // Optional: Close the worker if it's designed for single use
    // self.close();
  }
}

async function uploadChunk(chunk, filename, index) {
  const formData = new FormData()
  // Send chunk index for server-side reassembly
  formData.append('chunkIndex', index.toString())
  // Send the actual chunk data as a Blob
  formData.append('fileChunk', new Blob([chunk]), `${filename}.part${index}`)
  // Add any other necessary data (e.g., unique upload ID, total chunks)
  // formData.append('uploadId', 'unique-upload-identifier');
  // formData.append('totalChunks', totalChunks.toString());

  // Replace '/upload' with your actual back-end endpoint
  const res = await fetch('/upload', {
    method: 'POST',
    body: formData,
    // Consider adding an AbortSignal here for cancellation support
    // signal: abortController.signal
  })

  if (!res.ok) {
    // Try to get more details from the server response on failure
    let errorText = `Server responded with status ${res.status}`
    try {
      const serverError = await res.text()
      if (serverError) {
        errorText += `: ${serverError}`
      }
    } catch (e) {
      /* Ignore errors reading response body */
    }
    throw new Error(errorText)
  }

  // Optionally process the server response if needed
  // const result = await res.json();
  // return result;
}
Why not read the whole file once in the worker?
While reading the entire file within the worker avoids blocking the main thread, it still consumes significant memory within the worker thread itself. Streaming the file chunk-by-chunk inside the worker offers several advantages:
- Lower Memory Footprint: Only small chunks reside in memory at any given time.
- Resumability: If an upload fails mid-way (e.g., network drop), you only need to retry the failed chunks, not the entire file.
- Parallel Chunk Uploads: Advanced implementations could potentially upload multiple chunks concurrently (though this adds complexity in ordering and server-side handling).
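As an illustration of that last point, here is a bounded-concurrency sketch that slices the file with `Blob.slice()` and reuses the `uploadChunk()` helper from the worker example above; the chunk size and concurrency limit are arbitrary choices, not values from the original example.
const CHUNK_SIZE = 5 * 1024 * 1024 // 5 MiB, arbitrary
const CONCURRENCY = 3 // arbitrary cap on simultaneous requests

async function uploadFileInParallel(file) {
  const totalChunks = Math.ceil(file.size / CHUNK_SIZE)
  let nextIndex = 0
  // Each "lane" keeps pulling the next chunk index until none remain
  async function lane() {
    while (nextIndex < totalChunks) {
      const index = nextIndex++
      const start = index * CHUNK_SIZE
      const slice = file.slice(start, Math.min(start + CHUNK_SIZE, file.size))
      // Blob.slice() is lazy, so only in-flight chunks occupy memory
      await uploadChunk(slice, file.name, index)
    }
  }
  await Promise.all(Array.from({ length: CONCURRENCY }, lane))
}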
Building a tiny worker pool
Using a single worker can still become a bottleneck if you need to process many files simultaneously. A `WorkerPool` distributes tasks among a limited number of workers (ideally matching CPU cores) and queues pending tasks.
class WorkerPool {
  constructor(script, size = navigator.hardwareConcurrency || 4) {
    this.workers = []
    this.idleWorkers = []
    this.taskQueue = []
    this.taskCallbacks = new Map() // Map task ID to callbacks

    console.log(`Initializing WorkerPool with size ${size}`)
    for (let i = 0; i < size; i++) {
      const worker = new Worker(script)
      worker.id = `worker_${i}`
      // Handle messages from the worker
      worker.onmessage = (e) => this.handleWorkerMessage(worker, e.data)
      // Handle errors occurring within the worker itself
      worker.onerror = (e) => this.handleWorkerError(worker, e)
      this.workers.push(worker)
      this.idleWorkers.push(worker)
    }
  }

  generateTaskId() {
    // Simple unique ID generator for tasks
    return Date.now().toString(36) + Math.random().toString(36).substring(2)
  }

  processFile(file, callbacks) {
    const taskId = this.generateTaskId()
    const task = { id: taskId, file }
    this.taskCallbacks.set(taskId, callbacks)

    const idleWorker = this.idleWorkers.pop()
    if (idleWorker) {
      // An idle worker is available, run the task immediately
      this.runTask(idleWorker, task)
    } else {
      // All workers are busy, add task to the queue
      this.taskQueue.push(task)
      console.log(
        `Worker pool busy. Queued task ${taskId} for ${file.name}. Queue size: ${this.taskQueue.length}`,
      )
    }
  }

  runTask(worker, task) {
    console.log(`Assigning task ${task.id} (${task.file.name}) to ${worker.id}`)
    worker.currentTask = task // Associate task metadata with the worker
    // Send the file object to the worker to start processing
    // For large files, consider if Transferable Objects are applicable/needed
    worker.postMessage(task.file)
  }

  handleWorkerMessage(worker, data) {
    const task = worker.currentTask
    if (!task) {
      console.warn(`Received message from worker ${worker.id} without an assigned task.`)
      return
    }

    const callbacks = this.taskCallbacks.get(task.id)
    if (!callbacks) {
      console.warn(`Received message for unknown or completed task ${task.id}`)
      return // Task might have been cancelled or already completed/failed
    }

    // Process messages based on their type
    switch (data.type) {
      case 'progress':
        if (callbacks.onProgress) callbacks.onProgress(data.progress, data.message)
        break
      case 'complete':
        if (callbacks.onComplete) callbacks.onComplete(data.message)
        this.finishTask(worker, task.id) // Mark task as finished
        break
      case 'error':
        if (callbacks.onError) callbacks.onError(data.message)
        this.finishTask(worker, task.id) // Also finish task on error
        break
      default:
        console.warn(`Received unknown message type from worker ${worker.id}:`, data.type)
    }
  }

  handleWorkerError(worker, errorEvent) {
    console.error(`Error in worker ${worker.id}:`, errorEvent.message, errorEvent)
    const task = worker.currentTask
    if (task) {
      const callbacks = this.taskCallbacks.get(task.id)
      if (callbacks && callbacks.onError) {
        // Report the error back via the task's error callback
        callbacks.onError(`Worker script error: ${errorEvent.message}`)
      }
      // Finish the task as it cannot continue
      this.finishTask(worker, task.id)
    } else {
      console.error(
        `Unhandled error in idle worker ${worker.id}. Consider worker replacement strategy.`,
      )
      // If an idle worker errors, you might want to remove it or try respawning
      // For simplicity, we just log it here.
    }
  }

  finishTask(worker, taskId) {
    const task = worker.currentTask
    if (task && task.id === taskId) {
      console.log(`Task ${taskId} (${task.file.name}) finished on ${worker.id}`)
      worker.currentTask = null // Disassociate task from worker
    }
    this.taskCallbacks.delete(taskId) // Remove callbacks

    // Check the queue for the next task
    if (this.taskQueue.length > 0) {
      const nextTask = this.taskQueue.shift()
      console.log(`Dequeuing task ${nextTask.id}. Queue size: ${this.taskQueue.length}`)
      this.runTask(worker, nextTask) // Assign the next task to this now-free worker
    } else {
      this.idleWorkers.push(worker) // No pending tasks, return worker to the idle pool
      console.log(`Worker ${worker.id} is now idle. Idle workers: ${this.idleWorkers.length}`)
    }
  }

  terminate() {
    console.log('Terminating worker pool...')
    this.workers.forEach((worker) => {
      console.log(`Terminating worker ${worker.id}`)
      worker.terminate()
    })
    // Clear internal state
    this.workers = []
    this.idleWorkers = []
    this.taskQueue = []
    this.taskCallbacks.clear()
  }
}
Browser compatibility
Web APIs evolve, so always verify browser support for the features you rely on.
Feature | Chrome | Firefox | Safari | Edge | Notes |
---|---|---|---|---|---|
Web Workers | 4+ | 3.5+ | 4+ | 12+ | Widely supported. |
`File.stream()` | 76+ | 69+ | 15.2+ | 79+ | The core API for reading file contents as a stream. |
`Blob.stream()` | 76+ | 69+ | 14.1+ | 79+ | Similar to `File.stream()`. |
Transferable Streams | 77+ | 111+* | ✗ | 79+ | Allows transferring stream ownership between threads efficiently. |

\* Transferable streams might require enabling flags in some Firefox versions or might have limitations. Check current support on MDN or Can I Use.
For browsers lacking support for `File.stream()`, you might need to fall back to a `FileReader`-based approach (potentially within the worker to avoid blocking the main thread, but still using more memory) or use established libraries like Tus-JS-Client or Uppy, which handle compatibility and provide features like resumability.
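A minimal fallback sketch, assuming the same chunk size and `uploadChunk()` helper as the worker example above, could slice the file manually and read each slice with `FileReader`:
const CHUNK_SIZE = 5 * 1024 * 1024 // 5 MiB, arbitrary

function readSlice(blob) {
  // Wrap the callback-based FileReader API in a promise
  return new Promise((resolve, reject) => {
    const reader = new FileReader()
    reader.onload = () => resolve(new Uint8Array(reader.result))
    reader.onerror = () => reject(reader.error)
    reader.readAsArrayBuffer(blob)
  })
}

async function uploadWithFileReader(file) {
  const totalChunks = Math.ceil(file.size / CHUNK_SIZE)
  for (let index = 0; index < totalChunks; index++) {
    const start = index * CHUNK_SIZE
    const slice = file.slice(start, Math.min(start + CHUNK_SIZE, file.size))
    const chunk = await readSlice(slice) // only one chunk is held in memory at a time
    await uploadChunk(chunk, file.name, index)
  }
}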
Memory management best practices
- Limit Worker Count: Spawn workers based on available CPU cores, typically `navigator.hardwareConcurrency`. Creating too many workers can lead to excessive context switching and memory overhead.
- Terminate Workers: Explicitly call `worker.terminate()` or `pool.terminate()` when the workers are no longer needed (e.g., after all uploads complete, or on page unload) to release resources. Use `try...finally` blocks in your application logic to ensure termination happens even if errors occur during the upload process.
- Release References: In both the main thread and workers, nullify references to large objects (like `File` objects, `Blob`s, `ArrayBuffer`s, or stream readers) once they are no longer needed (`reader = null`, `file = null`, `chunk = null`) to allow garbage collection. Ensure stream readers are released using `reader.releaseLock()`.
- Chunk Size: Choose a sensible chunk size (e.g., 1-10 MiB). Very small chunks increase network overhead (more HTTP requests per file), while very large chunks negate some of the memory-saving benefits of streaming.
- Monitor Memory: Use browser developer tools (like Chrome's Memory tab or Firefox's Memory tool) or the `performance.memory` API (where available and applicable) during development and testing to monitor memory usage under load and identify potential leaks.
// Example using try...finally for pool termination in application code
const pool = new WorkerPool('upload-worker.js')
try {
  // ... use pool to process files ...
  // Example: await Promise.all(files.map(file => pool.processFileAsync(file)));
} finally {
  // Ensure pool is terminated regardless of success or failure
  console.log('Cleaning up worker pool...')
  pool.terminate()
}
Security and resilience
- CORS: Configure your upload endpoint's Cross-Origin Resource Sharing (CORS) policy carefully on the server. Allow only necessary HTTP methods (POST, potentially OPTIONS for preflight requests), required headers (like `Content-Type`, `Content-Range`, or custom headers like `X-Chunk-Index`), and restrict origins (`Access-Control-Allow-Origin`) to your application's domain. A server-side sketch appears at the end of this section.
- Authentication/Authorization: Secure your upload endpoint. For chunked uploads, ensure each chunk request is authenticated and authorized. Methods include using secure HTTP-only session cookies, bearer tokens (JWTs) sent in the `Authorization` header, or generating pre-signed URLs for each chunk or the entire upload session (common with cloud storage).
- Retries: Network issues are common. Implement a retry mechanism in your `uploadChunk` function for failed chunk uploads. Use exponential back-off (waiting progressively longer between retries: e.g., 1s, 2s, 4s) to avoid overwhelming the server or network. Abort retrying after a reasonable number of attempts (e.g., 3-5). A back-off sketch follows this list.
- Cancellation: Provide users with a way to cancel ongoing uploads. Use the `AbortController` API. Create an `AbortController` instance before starting the upload, pass its `signal` to each `fetch` request, and call `controller.abort()` when the user cancels. Ensure your error handling catches the `AbortError`.
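A minimal exponential back-off wrapper around the `uploadChunk()` helper could look like this sketch (the attempt count and delays are illustrative):
async function uploadChunkWithRetry(chunk, filename, index, maxAttempts = 4) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await uploadChunk(chunk, filename, index)
    } catch (err) {
      // Give up immediately on user cancellation, and after the final attempt
      if (err.name === 'AbortError' || attempt === maxAttempts) throw err
      const delay = 1000 * 2 ** (attempt - 1) // 1s, 2s, 4s, ...
      console.warn(`Chunk ${index} failed (attempt ${attempt}), retrying in ${delay} ms`)
      await new Promise((resolve) => setTimeout(resolve, delay))
    }
  }
}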
// Example: Using AbortController for fetch cancellation
const ctrl = new AbortController()
const signal = ctrl.signal

// In uploadChunk function:
try {
  const res = await fetch('/upload', { method: 'POST', body: formData, signal })
  // ... handle response ...
} catch (err) {
  if (err.name === 'AbortError') {
    console.log('Chunk upload fetch aborted')
    // Propagate cancellation signal or specific error
    throw new Error('Upload cancelled by user')
  } else {
    console.error('Chunk upload fetch error:', err)
    throw err // Re-throw other errors
  }
}

// To cancel the upload associated with this controller:
// ctrl.abort();
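On the server side, a rough CORS sketch for the upload endpoint might look like this (Express is an assumption, continuing the hypothetical back-end sketch from the architecture overview; the origin and custom header name are placeholders):
// Hand-rolled CORS headers for the /upload route; adjust the origin and headers to your app
app.use('/upload', (req, res, next) => {
  res.setHeader('Access-Control-Allow-Origin', 'https://app.example.com') // not '*'
  res.setHeader('Access-Control-Allow-Methods', 'POST, OPTIONS')
  res.setHeader('Access-Control-Allow-Headers', 'Content-Type, Content-Range, X-Chunk-Index, Authorization')
  if (req.method === 'OPTIONS') return res.sendStatus(204) // answer preflight requests early
  next()
})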
Debugging Web Workers
Debugging workers can be slightly different from main thread debugging:
- Browser DevTools: Most modern browsers provide tools to inspect active workers. In Chrome DevTools, look under the "Sources" tab (you might see worker scripts listed) or the dedicated "Application" -> "Workers" section. In Firefox, check the "Debugger" tab (worker scripts appear in the sources list) or "Application" -> "Workers". You can set breakpoints, inspect variables, and view `console.log` messages from workers.
- Error Handling: Robust `postMessage` communication for errors (as shown in the examples) is crucial for understanding issues occurring within the worker, as a direct `try...catch` from the main thread won't catch worker errors. Ensure worker errors are explicitly caught and posted back.
Common pitfalls
- Excessive Workers: Spawning a new worker for every file instead of using a pool can overwhelm the system's resources (CPU and memory).
- Stream Locks: Forgetting to call `reader.releaseLock()` on a `ReadableStreamDefaultReader` after finishing reading or encountering an error. This prevents the stream from being properly closed or potentially read again. Always use a `finally` block for `releaseLock()`.
- Large Message Payloads: Avoid posting very large data objects between the main thread and workers using `postMessage`, as this involves serialization and deserialization overhead (or structured cloning). For large binary data, investigate using `Transferable` objects (like `ArrayBuffer`) for more efficient zero-copy transfers where supported and appropriate (see the sketch after this list).
- Chunk Ordering: Assuming the server will receive chunks in the exact order they were sent. Network latency and concurrent requests can cause reordering. Always include an index or byte offset with each chunk so the server can reassemble the file correctly.
- Unhandled Errors: Lack of proper `try...catch` blocks within the worker, especially around asynchronous operations like stream reading (`reader.read()`) and network requests (`fetch`), can cause silent failures or unhandled promise rejections within the worker.
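For illustration, transferring an `ArrayBuffer` instead of cloning it is a one-argument change to `postMessage`; the worker script name and message shape below are assumptions.
// Main thread: move the buffer to the worker instead of copying it
const transferWorker = new Worker('upload-worker.js')
const buffer = new ArrayBuffer(8 * 1024 * 1024) // e.g., 8 MiB of chunk data
transferWorker.postMessage({ type: 'chunk', buffer }, [buffer])
console.log(buffer.byteLength) // 0, because ownership has moved to the worker

// Worker: the buffer arrives without a structured-clone copy
self.onmessage = (event) => {
  if (event.data.type === 'chunk') {
    const chunk = new Uint8Array(event.data.buffer)
    // ... hash, compress, or upload the chunk ...
  }
}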
Key takeaways
- Web Workers are essential for moving CPU-intensive file processing (like chunking) off the main thread, keeping the UI responsive during js file uploads.
- JavaScript Streams allow efficient handling of large files by processing data in chunks, significantly reducing peak memory usage and enabling features like resumability.
- A worker pool manages parallel processing, preventing the system from being overloaded when handling multiple simultaneous uploads and improving overall file upload performance.
- Robust error handling (including network retries and stream error handling), proper resource cleanup (`terminate()`, `releaseLock()`), security considerations (CORS, auth), and attention to browser compatibility are vital for production-ready implementations.
For a production-ready solution that handles chunking, resumability, retries, and parallel uploads out of the box, consider using libraries like Uppy with its various upload plugins, or explore services designed for robust file handling. Transloadit's handling uploads service integrates these concepts for reliable large file uploads. Happy uploading!