
How We Process 50 Video Uploads Simultaneously (Without Melting Our Servers)

A mental health research platform needed to transcode hundreds of interview videos per day. Here's how we built a system that processes multiple videos in parallel—turning a 6-hour bottleneck into a 20-minute workflow.

backend · product-engineering · performance · video

The Problem: Video Processing Was the Bottleneck

The platform lets mental health researchers upload video interviews with study participants. Each video typically runs 45-60 minutes, recorded in 1080p on a smartphone.

The workflow researchers needed:

  1. Upload original video (often 2-3 GB per file)
  2. System creates web-friendly versions (480p and 720p, for different bandwidth scenarios)
  3. Extract audio as a separate file (for transcription services)
  4. All files available within minutes, not hours

The challenge: They were uploading 50+ videos per day. Peak times saw 10-15 simultaneous uploads.

Their initial attempt: A simple sequential processing system.

  • Upload video → wait for processing to finish → upload next video
  • Each video took 15-20 minutes to process
  • Processing 10 videos took 3+ hours
  • Researchers sat idle waiting for videos to finish

The requirement: Process multiple videos at the same time, without overwhelming the server or running out of disk space.

Why Sequential Processing Fails

Most developers start with the simple approach: process one video at a time. This creates a massive bottleneck.

The sequential problem:

  • Video 1: Finishes at 2:18 PM
  • Video 2: Finishes at 2:36 PM
  • Video 10: Finishes at 5:00 PM

The researcher who uploaded Video 10 at 2:00 PM waits 3 hours to see their processed file.

The server's perspective: CPU usage hovers at 25-30%. The server is bored, mostly idle, waiting for the current video to finish before starting the next one.

This is like having a 4-lane highway but forcing all cars into a single lane.

Our Solution: Parallel Processing with Smart Queue Management

We built a system that processes multiple videos simultaneously—but with controls to prevent server overload.

How it works:

  1. Researcher uploads 10 videos through web interface
  2. Videos go directly to cloud storage (AWS S3)
  3. All 10 videos enter a processing queue immediately
  4. We run 3 dedicated video processing workers (configurable based on server capacity)
  5. Each worker automatically picks jobs from the queue
  6. For each video, 3 operations run simultaneously: transcode to 480p, transcode to 720p, and extract the audio track (a minimal sketch follows this list)
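
A minimal sketch of that worker pool, using Python's standard library just to show the shape. The queue here is in-memory and the names (NUM_WORKERS, process_video) are ours, not from the production system; a real deployment uses a persistent job queue, which is what makes the re-queue-on-crash behavior described later possible.

```python
import queue
import threading

NUM_WORKERS = 3  # dedicated video workers; tuned to server capacity

job_queue: queue.Queue = queue.Queue()

def process_video(video_path: str) -> None:
    """Stand-in for the real work: two transcodes plus audio extraction."""
    print(f"processing {video_path}")

def worker() -> None:
    # Each worker loops forever, pulling the next job the moment it's free.
    while True:
        video_path = job_queue.get()  # blocks until a job is available
        try:
            process_video(video_path)
        finally:
            job_queue.task_done()

# Start the fixed pool once, at application startup.
for _ in range(NUM_WORKERS):
    threading.Thread(target=worker, daemon=True).start()

# Uploads just enqueue and return immediately; no one waits in line.
for name in ("interview_01.mp4", "interview_02.mp4", "interview_03.mp4"):
    job_queue.put(name)

job_queue.join()  # wait until every queued job has been processed
```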

The result:

  • 10 videos uploaded at 2:00 PM
  • Workers 1, 2, 3 process Videos 1, 2, 3 immediately (done by 2:18 PM)
  • Workers pick up Videos 4, 5, 6 (done by 2:36 PM)
  • Workers pick up Videos 7, 8, 9 (done by 2:54 PM)
  • A worker picks up Video 10 at 2:54 PM (done by 3:12 PM)
  • All 10 videos fully processed in 72 minutes

Time savings: 3 hours → 72 minutes (2.5x faster)

The Technical Challenge: Smart Worker Configuration

Instead of one giant queue for everything, we created separate queues for different types of work:

Video Processing Queue:

  • Heavy CPU usage (video encoding)
  • 3 dedicated workers, each handling 1 video at a time
  • Each worker uses 2 CPU cores (for the 3 parallel operations per video)
  • Total: 6 CPU cores actively encoding (see the per-video sketch below)
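
To make the per-video parallelism concrete, here is a hedged sketch of what one worker does with a single video. The post doesn't name the encoder, so this assumes FFmpeg; the flags, presets, and output paths are illustrative, not the production values:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def process_video(src: str) -> None:
    """One worker's job: produce all three outputs for one video at once."""
    jobs = [
        # 480p web rendition (scale=-2:480 keeps the width divisible by 2)
        ["ffmpeg", "-y", "-i", src, "-vf", "scale=-2:480",
         "-c:v", "libx264", "-preset", "fast", "-c:a", "aac", "out_480p.mp4"],
        # 720p web rendition
        ["ffmpeg", "-y", "-i", src, "-vf", "scale=-2:720",
         "-c:v", "libx264", "-preset", "fast", "-c:a", "aac", "out_720p.mp4"],
        # Audio-only track for the transcription service
        ["ffmpeg", "-y", "-i", src, "-vn", "-c:a", "aac", "out_audio.m4a"],
    ]
    # Three ffmpeg processes run side by side; each is independent,
    # reading the same source file and writing its own output.
    with ThreadPoolExecutor(max_workers=3) as pool:
        for result in pool.map(subprocess.run, jobs):
            result.check_returncode()  # surface any failed encode
```

The three operations are safe to run side by side precisely because they are independent: each reads the source on its own and writes a separate output file.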

Why this configuration works:

  • 8-core server handles 3 videos + other tasks simultaneously
  • Leaves 2 cores for the web app and database
  • Workers automatically pick jobs from queue—no manual intervention needed
  • If a worker crashes, the job manager re-queues the incomplete video and another worker picks it up

Handling failures gracefully:

  • Corrupted video? Worker marks job as "failed" with error message, moves on to next video
  • Server out of disk space? Workers check available disk before downloading, job stays in queue until space frees up
  • Automatic cleanup of temporary files after each job prevents disk buildup (sketched below)
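
A sketch of those guardrails. The job interface (requeue, mark_failed) and the helpers are hypothetical stand-ins for whatever queue library you use:

```python
import shutil

MIN_FREE_BYTES = 10 * 1024**3  # keep ~10 GB headroom (illustrative threshold)
SCRATCH_DIR = "/tmp/video-work"

def run_job(job) -> None:
    # Check disk before downloading the 2-3 GB original from S3; if space
    # is tight, the job simply stays queued and is retried later.
    if shutil.disk_usage(SCRATCH_DIR).free < MIN_FREE_BYTES:
        job.requeue(delay_seconds=60)
        return
    try:
        process_video(job.source_path)
    except Exception as exc:
        # A corrupted upload shouldn't poison the queue: record the error
        # on the job and let the worker move on to the next video.
        job.mark_failed(str(exc))
    finally:
        cleanup_temp_files(job)  # always reclaim scratch space between jobs
```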

Real-World Performance

Before parallel processing:

  • ⏱️ Processing time: 15-20 minutes per video (sequential)
  • 📊 Throughput: ~3 videos per hour
  • 💻 CPU usage: 25-30% (server mostly idle)
  • 😓 User experience: "Why is this taking so long?"

After parallel processing:

  • ⏱️ Processing time: Still 15-20 minutes per video, but 3 videos process at once
  • 📊 Throughput: ~9 videos per hour (3x improvement)
  • 💻 CPU usage: 70-80% (server actually working)
  • 😊 User experience: "My videos are ready in 20 minutes even during peak hours"

Peak load handling:

  • Before: 50 videos uploaded → 16+ hours of processing time
  • After: 50 videos uploaded → 6 hours of processing time
  • Real-world: Most videos finish within 30 minutes of upload during normal hours

What We Learned

1. Parallelism is limited by resources, not code

You can't just keep adding workers indefinitely. We tested 3 workers (perfect balance, 70-80% CPU usage), 6 workers (server struggled, 95%+ CPU, system became unresponsive), and 2 workers (underutilized, 50% CPU, throughput suffered). Test with realistic load to find the optimal worker count for your server.
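
In practice that means treating the worker count as configuration, not code, so it can be re-tuned per server without a deploy. A minimal sketch (VIDEO_WORKERS is our name, not from the original system):

```python
import os

# Tune per machine: 3 was the sweet spot on this 8-core box, but the
# optimum depends on cores, encoder settings, and what else the box runs.
NUM_WORKERS = int(os.environ.get("VIDEO_WORKERS", "3"))
```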

2. Separate queues prevent one bottleneck from blocking everything

An early version had one queue for all jobs (videos, audio, documents, thumbnails). When videos backed up during peak hours, simple document jobs waited unnecessarily. Separate queues mean quick jobs finish fast regardless of the video backlog, so users get some feedback quickly even while a video is still processing.
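
A sketch of that split, reusing the standard-library queue shape from the first sketch; process_video and process_light_job stand in for the real handlers:

```python
import queue
import threading

def start_pool(q: queue.Queue, handler, count: int) -> None:
    """Spin up `count` dedicated workers that only ever drain `q`."""
    def loop():
        while True:
            job = q.get()
            try:
                handler(job)
            finally:
                q.task_done()
    for _ in range(count):
        threading.Thread(target=loop, daemon=True).start()

# One queue per workload class: a video backlog can never starve the
# quick jobs, because the light pool never touches the video queue.
video_queue: queue.Queue = queue.Queue()
light_queue: queue.Queue = queue.Queue()
start_pool(video_queue, process_video, count=3)      # heavy encoding
start_pool(light_queue, process_light_job, count=2)  # documents, thumbnails
```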

3. Visibility into queue status changes user behavior

We added a simple progress indicator: "Your video is #3 in the processing queue. Estimated time: 25 minutes." Impact: Support tickets about "slow processing" dropped by 80%. Users just needed to know the system was working.
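
The estimate behind that banner can be plain arithmetic. A rough model, assuming the system keeps a rolling average of recent processing times (both constants here are illustrative):

```python
import math

NUM_WORKERS = 3
AVG_MINUTES_PER_VIDEO = 18  # rolling average of recent jobs (illustrative)

def eta_minutes(position_in_queue: int) -> int:
    """Rough time-to-done estimate for the 'You're #N in the queue' banner."""
    # With 3 workers draining the queue, positions 1-3 finish one
    # video-length from now, positions 4-6 a batch later, and so on.
    batches_ahead = math.ceil(position_in_queue / NUM_WORKERS)
    return batches_ahead * AVG_MINUTES_PER_VIDEO

print(eta_minutes(3))   # 18
print(eta_minutes(10))  # 72 -> matches the 10-video walkthrough above
```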


Building a platform that handles media uploads, document processing, or any batch processing workload? Let's talk →

We've built parallel processing systems for video transcoding, image optimization, AI inference, and document generation. The pattern is always the same: identify independent work, distribute it across workers, and monitor to prevent overload.

Have questions about this post? Get in touch.