
How To Make A Video Smaller

Mar 09, 2026

This guide explains how to make a video smaller in practical, production terms — whether you need a smaller MP4 for web delivery, a lower-bandwidth live stream, or shrunken mobile uploads that still play back cleanly. You'll get concrete encoder targets, ffmpeg recipes, latency/architecture budgets, testing checklists and common fixes used in real-world streaming and VOD systems. If HEVC is your main use case, this practical walkthrough helps: Hevc Video. Pricing path: validate your targets with the bitrate calculator. Before full production rollout, run a Test and QA pass with Generate test videos, streaming quality check, video preview, and a test app for end-to-end validation.

What it means (definitions and thresholds)

"Make a video smaller" can mean several different objectives. Pick the one that matches your use case before changing settings. For an implementation variant, compare the approach in Your Sports Stream.

  • Reduce file size (VOD): lower the bytes on disk for downloads and progressive playback. Typical thresholds: small web preview < 1 MB for a 10–15 s clip; standard 720p video files often 1–5 MB/min; 1080p VOD files commonly 30–60 MB/min depending on quality.
  • Reduce streaming bandwidth (ABR / live): limit the instantaneous bitrate consumed by a viewer. Target bitrates are normally measured in kbps or Mbps (see practical targets below).
  • Shrink upload size / transfer payload: client-side encoding (mobile/browser) or server-side recompress before storage/ingest.
  • Reduce startup time / memory footprint: smaller container metadata (moov atom at the start of the file) and shorter segments or parts, so the player fetches less data before the first frame.

Compression is lossy in most production cases — you trade bits for visual quality, CPU, latency and compatibility. If you need a deeper operational checklist, use Stream Live. Recommended working thresholds you can use immediately:

  • H.264/AVC targets (typical perceptual quality): 1080p30 = 3.5–6 Mbps; 1080p60 = 5–8 Mbps; 720p30 = 1.5–3 Mbps; 480p30 = 700–1,200 kbps; 360p30 = 400–700 kbps; 240p = 200–400 kbps.
  • HEVC/H.265: expect ~30–50% bitrate reduction for similar quality over H.264 (hardware/browser support varies).
  • AV1: can be 30–60% more efficient than H.264 but encoding is CPU-intensive and hardware support is still limited.
  • Segment and part sizes (for streaming): segments 2–6 s (classic HLS/DASH); CMAF part sizes (low-latency) 100–500 ms.
  • GOP / keyframe interval: live low-latency = 1–2 s; standard streaming = 2–4 s; VOD can be 4–10 s to increase compression efficiency.
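
A quick way to sanity-check these targets: file size in MB ≈ (video_kbps + audio_kbps) × duration_s / 8000. A minimal shell sketch (bitrate and duration values are placeholders):

```shell
# Back-of-envelope size estimate for a 60 s clip at 720p30, ~2.5 Mbps video + 128 kbps audio.
video_kbps=2500; audio_kbps=128; duration_s=60
# kbps * seconds / 8 = kilobytes; divide by 1000 more for MB (integer shell arithmetic)
size_mb=$(( (video_kbps + audio_kbps) * duration_s / 8000 ))
echo "${size_mb} MB"   # prints "19 MB"
```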

Decision guide

Choose one of these paths and follow the recipes later in the article. A related implementation reference is Low Latency.

  • Fast web delivery of recorded content (VOD):
    • Re-encode to an adaptive set of MP4/HLS bitrates and use mp4 "faststart" or CMAF for progressive playback.
    • Use two strategies: CRF for single-file quality control; constrained VBR (target + maxrate + bufsize) for packaging into ABR ladders.
  • Live streaming with lower bandwidth per viewer:
    • Transcode to a multi-bitrate ladder and serve via ABR with CDN. Use shorter GOPs (1–2 s) if you need low latency.
    • Consider transport: RTMP/SRT/WebRTC depending on latency and network reliability. For lossy networks, SRT is a good balanced choice — tune jitter buffer and latency.
  • Mobile or client-side upload shrink:
    • Prefer fast presets, moderate CRF (x264 crf 23–28), and hardware encoders (NVENC, QuickSync) if available. Limit resolution and frame rate at capture to reduce work.
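
The hardware-encoder path above can be sketched as follows; a minimal example assuming an ffmpeg build with NVENC (preset and flag names vary across ffmpeg and driver versions, so check `ffmpeg -h encoder=h264_nvenc` first):

```shell
# GPU-accelerated shrink: NVENC constant-quality mode (-cq) with VBR rate control.
# -b:v 0 lets -cq drive the size; raise -cq to shrink further (quality drops).
ffmpeg -i input.mp4 \
  -c:v h264_nvenc -preset p5 -rc vbr -cq 28 -b:v 0 \
  -vf "scale=-2:720" -r 30 \
  -c:a aac -b:a 96k -movflags +faststart output_720p_gpu.mp4
```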

Latency budget / architecture budget

When you change bitrate/GOP/segment sizes you also affect latency. Explicit budgets are useful to make trade-offs predictable.

Example target budgets (total end-to-end):

  • VOD progressive playback: startup = 0.5–3 s (player buffer 0.5–2 s).
  • Classic HLS / DASH live: 15–30 s (segment 3–6 s, playlist holdback 3–5 segments).
  • Low-latency HLS/CMAF: 1–5 s. Typical config: segment 2 s, part 200–300 ms, player holdback 2–4 parts.
  • SRT transport (reliable low-latency over public Internet): one-way latency typically configurable from 120 ms up to several seconds depending on jitter and packet-loss tolerance. Practical live deployments use 200–800 ms buffer for unstable networks.
  • WebRTC (ultra low-latency): sub-500 ms with direct peer or SFU-based transcode; CPU/encode adds 50–200 ms typically.

Budgeting example for a low-latency live stream target E2E = 2 s:

  • Encoder GOP/keyframe alignment: 400–800 ms (1–2 s GOP recommended, keyframes aligned to segments)
  • Network (ingest to cloud): 200–500 ms
  • Packaging & CDN edge: 200–400 ms
  • Player buffer and decode: 200–400 ms
  • Total: ~1.0–2.1 s

To make a video smaller without increasing latency, target encoder efficiency (better codec/preset) and reduce resolution or frame rate rather than increasing segment packing or buffering.
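
In ffmpeg terms, that usually means a frame-rate or resolution cut rather than bigger buffers. A hedged sketch (the CRF value and 720p cap are illustrative):

```shell
# Cut bandwidth without touching segmentation or buffering:
# drop to 30 fps and cap height at 720 while preserving aspect ratio.
ffmpeg -i input.mp4 \
  -vf "fps=30,scale=-2:'min(720,ih)'" \
  -c:v libx264 -preset medium -crf 24 \
  -c:a copy output_30fps_720p.mp4
```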

Practical recipes

Below are production-tested recipes. Replace input/output names and parameters to match your environment.

Recipe A — Recompress a recorded MP4 for web delivery (single file, CRF)

When you have a recorded file and need a smaller MP4 quickly with good perceptual quality, use a CRF encode with conservative presets.

ffmpeg -i input.mp4 -c:v libx264 -preset slow -crf 23 -profile:v high -level 4.1 -pix_fmt yuv420p \
  -vf "scale='min(1280,iw)':'min(720,ih)':force_original_aspect_ratio=decrease:force_divisible_by=2" \
  -c:a aac -b:a 128k -movflags +faststart output_720p_crf23.mp4
  • Notes:
    • CRF 18–20 is visually high quality, CRF 23–28 reduces file size further. For web previews CRF 24–28 is reasonable.
    • Use -preset slower or veryslow for better compression at the cost of more CPU time.
    • -movflags +faststart moves the moov atom so playback can start immediately during progressive download.
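
When you are unsure which CRF hits your size budget, a quick sweep is cheaper than guessing. A small batch sketch (output names are placeholders):

```shell
# Encode the same clip at several CRF values, then compare sizes and spot-check quality.
for crf in 20 23 26 28; do
  ffmpeg -y -i input.mp4 -c:v libx264 -preset slow -crf "$crf" \
    -c:a aac -b:a 128k -movflags +faststart "out_crf${crf}.mp4"
done
ls -l out_crf*.mp4   # pick the highest CRF that still looks acceptable on target devices
```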

Recipe B — Create an adaptive bitrate ladder for streaming (multi-bitrate)

Encode multiple renditions for ABR using constrained VBR settings (target, maxrate, bufsize) and align keyframes to segments. Example ladder: 1080p30, 720p30, 480p30, 360p30.

# Example for one rendition (720p30 ~ 2500k target)
ffmpeg -i input.mp4 -c:v libx264 -preset medium -b:v 2500k -maxrate 2750k -bufsize 5000k \
  -g 60 -keyint_min 60 -sc_threshold 0 -pix_fmt yuv420p -vf "scale=1280:720" \
  -c:a aac -b:a 128k -f hls -hls_time 4 -hls_segment_type fmp4 -hls_playlist_type vod 720p.m3u8
  • Notes:
    • Set g (GOP) = framerate * keyframe_seconds (e.g., 30 fps * 2 s = 60).
    • maxrate and bufsize: typically maxrate = target * 1.1 and bufsize = target * 2 (both in kbps).
    • Package as fMP4 (CMAF-compatible) for easier low-latency options and CDN compatibility.
  • Use a workflow orchestration or API to spawn encodes for each ladder item — see multi-streaming and video API for automation.
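
The per-rendition command scales to a full ladder in one ffmpeg invocation via the HLS muxer's variant mapping. A sketch assuming a recent ffmpeg (muxer options such as -var_stream_map and -master_pl_name are version-dependent; check the hls muxer docs):

```shell
# One-pass 720p/480p/360p ladder with shared GOP settings and a master playlist.
# Audio is mapped once per variant so -var_stream_map can pair v:N with a:N.
ffmpeg -i input.mp4 \
  -filter_complex "[0:v]split=3[a][b][c];[a]scale=-2:720[v720];[b]scale=-2:480[v480];[c]scale=-2:360[v360]" \
  -map "[v720]" -map "[v480]" -map "[v360]" -map 0:a -map 0:a -map 0:a \
  -c:v libx264 -preset medium -g 60 -keyint_min 60 -sc_threshold 0 -pix_fmt yuv420p \
  -b:v:0 2500k -maxrate:v:0 2750k -bufsize:v:0 5000k \
  -b:v:1 1000k -maxrate:v:1 1100k -bufsize:v:1 2000k \
  -b:v:2 600k  -maxrate:v:2 660k  -bufsize:v:2 1200k \
  -c:a aac -b:a 128k \
  -f hls -hls_time 4 -hls_segment_type fmp4 -hls_playlist_type vod \
  -master_pl_name master.m3u8 \
  -var_stream_map "v:0,a:0 v:1,a:1 v:2,a:2" \
  -hls_segment_filename "seg_%v_%03d.m4s" "index_%v.m3u8"
```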

Recipe C — Reduce live bandwidth on constrained networks (using SRT ingest)

When viewers and/or the origin have limited bandwidth, reduce each output bitrate and provide smaller renditions. Use SRT for resilient ingest and tune latency/jitter buffer.

  • Ingest: configure your encoder to send an SRT stream with a latency allowance based on measured jitter. Example SRT latency: 300–800 ms for public Internet with moderate loss.
  • Transcode ladder: drop the top rung (e.g., no 1080p) and send 720p30 at 1.5–2.5 Mbps and 480p30 at 700–1,200 kbps for low-bandwidth viewers.
  • Player: ensure ABR and a fast initial buffer (0.5–1 s) and allow seamless switching between renditions.

Operational tip: measure packet-loss and RTT between encoder and origin; increase SRT latency (jitter buffer) in 100 ms steps until packet loss is corrected without rebuffering.
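
The same tuning shows up directly in the ingest URL. A contribution-feed sketch assuming an ffmpeg build with libsrt (origin.example.com is a placeholder; note that ffmpeg's srt latency option is in microseconds, so 400000 = 400 ms):

```shell
# Push a 720p30 feed over SRT with a 400 ms latency (jitter-buffer) allowance.
ffmpeg -re -i input.mp4 \
  -c:v libx264 -preset veryfast -b:v 2000k -maxrate 2200k -bufsize 4000k \
  -g 60 -keyint_min 60 -sc_threshold 0 \
  -c:a aac -b:a 96k \
  -f mpegts "srt://origin.example.com:9000?mode=caller&latency=400000"
```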

Practical configuration targets

Use these targets as starting points. Tune for motion, content complexity and viewer devices.

  • Encoder general settings:
    • Codec: H.264 (libx264) for max compatibility; HEVC (libx265) or AV1 for bitrate savings where supported.
    • Keyframe interval (GOP): 1–2 s for low-latency live; 2–4 s for general streaming; 4–10 s for VOD.
    • Profile/level: use High/Main profile for H.264; use level 4.1 for 1080p30 compatibility.
    • Ratecontrol:
      • VOD: CRF mode (x264 CRF 18–28 depending on quality target).
      • Streaming: constrained VBR with -b:v target, -maxrate = target * 1.1, -bufsize = target * 2.
  • Resolution / bitrate targets (H.264, good visual quality):
    • 2160p30/60: 12–25 Mbps (rare for streaming; use only when needed)
    • 1080p30: 3.5–6 Mbps; 1080p60: 5–8 Mbps
    • 720p30: 1.5–3 Mbps; 720p60: 2.5–5 Mbps
    • 480p30: 700–1,200 kbps
    • 360p30: 400–700 kbps
    • 240p: 200–400 kbps
  • Audio targets:
    • Stereo AAC-LC: 96–192 kbps (128 kbps common)
    • Mono or voice-only: 32–64 kbps
  • Segment & packaging:
    • HLS/DASH segment duration: 2–6 s (lower for faster start/seek).
    • CMAF part size for low-latency: 100–500 ms (200–300 ms is common).
    • For live low-latency set player holdback to 2–4 parts.

Limitations and trade-offs

  • Codec vs compatibility: HEVC and AV1 reduce bitrate but have limited browser/hardware decoder support; H.264 remains the broadest compatibility option.
  • Encode CPU vs bitrate: slower encoder presets (x264 veryslow) give better compression but cost more CPU and time — not suitable for real-time live unless hardware encoders or extremely powerful servers are used.
  • Quality vs latency: aggressive compression and very short GOPs increase bitrate variability and visible artifacts, and short keyframe intervals also cost compression efficiency. To reduce bandwidth without sacrificing latency, reduce resolution or frame rate first.
  • Adaptive streaming overhead: More renditions increase storage and outbound CDN cost; balance ladder size against viewer metrics.

Common mistakes and fixes

  • Wrong keyframe alignment: If players can’t switch renditions cleanly, align keyframes to segment boundaries. Fix: set -g equal to segment_duration_in_seconds * framerate and disable scene cut (-sc_threshold 0).
  • Using CRF for live ABR: CRF is for single-file quality; use constrained VBR (target + maxrate + bufsize) for live ABR packaging.
  • Not moving moov atom: Outputs without faststart stall progressive playback. Fix: add -movflags +faststart.
  • Too-high CPU preset for live: Using veryslow on live encoders causes dropped frames. Fix: use medium/fast or hardware encoders (NVENC, QuickSync) and measure latency/quality trade-off.
  • Expecting codec miracles: Switching to AV1 without checking client support will not help. Always provide an H.264 fallback.
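
Several of these fixes can be verified with one ffprobe call that lists only keyframes (`-skip_frame nokey` makes the decoder skip everything else):

```shell
# Print keyframe timestamps; with -g 120 at 30 fps (4 s segments) they should land on 0.0, 4.0, 8.0, ...
ffprobe -v error -skip_frame nokey -select_streams v:0 \
  -show_entries frame=pts_time -of csv=p=0 720p_rendition.mp4
```

If the timestamps drift off segment boundaries, re-encode with -g, -keyint_min and -sc_threshold 0 as described above.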

Rollout checklist

Before you ship changes to production, run this checklist and automate tests where possible.

  • Encoding & packaging
    • Generate ABR renditions with aligned GOPs and validate segment boundaries with tools like ffprobe/segmenter.
    • Verify -movflags +faststart for VOD files.
    • Check HLS/DASH manifests for correct bandwidth attributes and codecs.
  • Quality & measurements
    • Compute objective quality (VMAF/SSIM) between original and encoded files for representative clips — target VMAF > 80 for acceptable quality depending on resolution.
    • Run ABR simulations: force network throttling (e.g., 2G/3G/4G profiles) and confirm smooth switching.
  • Latency & performance
    • Measure end-to-end latency using timestamps; verify buffer and jitter across regions.
    • Monitor encoder CPU/GPU and memory under target concurrency.
  • Delivery & monitoring
    • Run CDN smoke tests from multiple regions; verify correct MIME types and fMP4 compatibility.
    • Deploy metrics collection: startup time, bitrate switching events, rebuffer ratio, VMAF trends.
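
For the objective-quality step, ffmpeg can compute VMAF directly if it was built with libvmaf (first input is the encoded file, second is the reference; scale the encoded file back to the reference resolution first):

```shell
# VMAF for a 720p encode against the 1080p original; the score appears in the log output.
ffmpeg -i encoded_720p.mp4 -i original_1080p.mp4 \
  -lavfi "[0:v]scale=1920:1080:flags=bicubic[d];[d][1:v]libvmaf" \
  -f null -
```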

Example architectures

Two common patterns with actionable detail.

1) VOD pipeline (batch encoding, small-file targets)

  • Upload or ingestion (client/mobile/browser).
  • Transcoding cluster (container servers or cloud instances):
    • Jobs: create multiple renditions using constrained VBR and faststart; generate HLS and DASH manifests.
    • Recommended artifact outputs: fMP4 segments, master.m3u8, manifest.mpd.
  • Storage & CDN: origin objects and CDN configuration for caching and compression. For automated workflows see video-on-demand.
  • API & orchestration: manage encoding jobs, monitor progress and deliver asset URLs — see video API.

2) Live low-latency pipeline (SRT ingest, cloud transcode, CDN)

  • Encoder / field device sends video to origin via SRT; configure SRT latency (e.g., 300–800 ms) according to network quality.
  • Cloud transcoders create an ABR ladder with GOP = framerate * 1–2 s and package into CMAF fragmented fMP4 parts sized 200–300 ms.
  • Origin publishes CMAF/HLS/DASH to CDN with low-latency configuration.
    • Player uses low-latency CMAF/HLS or WebRTC depending on client support. For fallback, provide classic HLS with longer segments.
  • Operational tooling: monitor ingest packet loss/RTT and automatically adjust SRT jitter buffer or fallback bitrates. See low-level setup notes in /docs/srt-setup.
  • When you must distribute to multiple destinations (social platforms, other CDNs), consider using a multi-streaming approach — see multi-streaming.

Troubleshooting quick wins

  • Check actual bitrate and resolution:
    ffprobe -v error -select_streams v:0 -show_entries stream=width,height,r_frame_rate,bit_rate -of default=noprint_wrappers=1 input.mp4
    
  • If output file is bigger than expected:
    • Confirm CRF or target bitrate used; use a higher CRF or lower target bitrate.
    • Ensure you didn't accidentally enable lossless mode (e.g., -crf 0 or -qp 0) or a -tune option (such as grain) that preserves noise and inflates output size.
  • If playback stalls at start:
    • Check for moov atom at end of file; fix with -movflags +faststart.
    • Verify segment durations and manifest are correct for ABR player expectations.
  • If rendition switching is choppy:
    • Align keyframes to segments and ensure timelines are monotonic across renditions.
    • Check that codec profiles/levels are compatible across renditions (avoid mixing HEVC and AVC in same ladder unless client supports both).

Next step

Run a quick experiment: pick a 60 s high-motion clip and produce two versions — (A) smaller file via CRF 26 and (B) scaled 720p with constrained VBR target 2.5 Mbps. Compare filesize, VMAF/SSIM, and subjective quality on devices. Use the checklist above to validate ABR packaging and player behavior.

If you need tools or integrations to run these workflows at scale, explore Callaba product pages:

  • Multi-streaming — to distribute multiple renditions or destinations efficiently.
  • Video-on-demand — for batch VOD encoding and packaging workflows.
  • Video API — to automate transcodes, manifests and asset delivery.

Further reading and setup docs:

If you prefer to run an on-prem or self-managed stack tuned for lower bandwidth and control, look at our self-hosted option: /self-hosted-streaming-solution, or deploy marketplace solutions via AWS: AWS Marketplace listing.