
Video Hosting

Mar 06, 2026

This is a practical, engineer-focused guide to building and operating video hosting for live and VOD where low-latency SRT is part of the workflow. It covers definitions, decision criteria, latency budgets, concrete recipes, configuration targets, common failures and fixes, rollout steps, example architectures, and immediate troubleshooting actions. For this workflow, teams usually start with Player & embed and combine it with 24/7 streaming channels. For step-by-step follow-ups, read Video Platforms, Rtmp, Obs, Drm Protected, Akamai Cdn, Free Video Hosting, Aws Elastic Ip, Live Streaming Software, Html Video Player, Video Sharing Platforms, and Upload Video.

What it means (definitions and thresholds)

Before configuring systems, be precise about terms and thresholds you will use for success criteria.

  • Video hosting: storing, transcoding, packaging, serving and recording of live or on-demand video. Hosting includes origin servers, storage backends, transcoding pipelines and CDN distribution.
  • Contribution vs distribution:
    • Contribution: transport of camera/encoder output to an origin or production facility (often SRT, RTMP, WebRTC).
    • Distribution: delivery to viewers (HTTP-based HLS/DASH, LL-HLS/CMAF, WebRTC, or proprietary players).
  • SRT (Secure Reliable Transport): a UDP-based contribution protocol with ARQ retransmissions and optional FEC. It is used for resilient, encrypted contribution over the public internet.
  • Latency classes (practical thresholds):
    • Ultra-low-latency: < 500 ms (interactive applications: auctions, real-time gaming, remote control).
    • Low-latency: 500 ms – 3 s (sports, live events, social viewing where sub-3s is desirable).
    • Near-real-time: 3 s – 10 s (reduced interactivity, acceptable for many live broadcasts).
    • Traditional HTTP streaming: > 10 s (older HLS/DASH settings).
  • Operational thresholds:
    • Target packet loss at the origin < 0.5% for stable streams; if packet loss > 1% begin defensive measures (increase latency, enable FEC).
    • Jitter under 30–50 ms for sub-second targets; under 100 ms for 1–3 s targets.
    • RTT for SRT peers ideally < 200 ms for sub-second uses; acceptable up to 1,000 ms with higher configured latency.
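The operational thresholds above can be encoded as a simple link health check that suggests defensive actions. This is a minimal sketch; the function name and the way you obtain loss/jitter/RTT figures are assumptions, not part of any SRT API.

```python
def check_srt_health(loss_pct: float, jitter_ms: float, rtt_ms: float,
                     target_latency_ms: int) -> list:
    """Compare measured link stats against the operational thresholds above.

    Returns a list of recommended defensive actions (empty list = healthy).
    """
    actions = []
    if loss_pct > 1.0:
        actions.append("increase SRT latency and enable FEC")
    elif loss_pct > 0.5:
        actions.append("loss above 0.5% target; prepare to raise latency/FEC")
    # Jitter and RTT budgets depend on the latency class being targeted.
    jitter_budget = 50 if target_latency_ms < 1000 else 100
    if jitter_ms > jitter_budget:
        actions.append("jitter exceeds budget for this latency class")
    rtt_budget = 200 if target_latency_ms < 1000 else 1000
    if rtt_ms > rtt_budget:
        actions.append("RTT too high for target; raise configured SRT latency")
    return actions
```

Wiring this into an alerting loop fed by your endpoint's statistics gives you the "begin defensive measures" trigger automatically.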

Decision guide

Use the following questions to choose an architecture and hosting components. Each recommendation maps to product capabilities you should provision.

  1. What is your target end-to-end latency (ms)?
    • < 500 ms — target WebRTC for viewer delivery; use SRT for contribution into a WebRTC gateway or ingest + packager that can output RTC.
    • 500 ms–3 s — SRT contribution to a cloud origin + chunked CMAF/LL-HLS distribution to CDN edges is a good balance of scale and latency.
    • > 3 s — classic HLS/DASH with larger segments is acceptable and simpler.
  2. How many concurrent viewers?
    • Small < 10k: origin + CDN with moderate autoscaling.
    • 10k–1M+: rely on global CDN peering and multi-region origins; design stateless packaging/transcoding and autoscale workers.
  3. Is interactivity required (two-way audio/video)?
    • Two-way: use WebRTC or native RTC stacks.
    • One-way low-latency: SRT contribution + WebRTC gateway or LL-HLS distribution to players.
  4. Network quality and reliability of contributors?
    • Unreliable networks: enable SRT with higher latency (500–1,500 ms) and FEC 10–30%.
    • Controlled networks (WAN, managed facilities): use lower SRT latency (100–300 ms).
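The latency and interactivity questions above reduce to a small decision function. The returned strings are illustrative labels for the architectures described in this guide, not product identifiers.

```python
def recommend_stack(target_ms: int, two_way: bool = False) -> str:
    """Map a target end-to-end latency (and interactivity need) to a
    delivery architecture, following the decision guide above."""
    if two_way or target_ms < 500:
        return "SRT contribution -> WebRTC gateway -> WebRTC players"
    if target_ms <= 3000:
        return "SRT contribution -> cloud origin -> chunked CMAF/LL-HLS via CDN"
    return "classic HLS/DASH with larger segments"
```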

Map choices to product pages to provision resources and endpoints: Video Hosting, Video API, and Worldwide CDN.

Latency budget / architecture budget

Define an end-to-end budget and allocate milliseconds to each stage. Below are two example budgets and ways to measure each component.

Example A — target 2,000 ms (2 s) end-to-end

  • Capture + encoder (frame capture, encode, mux): 150–400 ms — measure with encoder timestamps.
  • Contribution network (SRT): 300–700 ms configured latency — measure SRT RTT and retransmissions.
  • Origin transcode + packager: 200–500 ms — depends on hardware acceleration; measure processing time per segment.
  • CDN edge + TCP/TLS handshake + delivery to player: 100–400 ms — measure using real client probes from regions.
  • Player demux/decode + buffer: 100–200 ms — depends on player and buffer strategy.
  • Total: 850–2,200 ms (tune each component so total ≤ 2,000 ms).

Example B — target 400 ms (ultra-low)

  • Capture + encode (hardware NVENC or similar): 50–120 ms.
  • Contribution (SRT internal/peered link): 100–150 ms (aggressive SRT latency setting).
  • Packager/gateway (SRT→WebRTC): 50–100 ms.
  • Player decode/buffer (WebRTC): 50–80 ms.
  • Total: 250–450 ms (requires tight control of each step and higher CPU/bandwidth).
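A latency budget is easiest to keep honest as data you can sum and check in CI. The sketch below encodes Example B's per-stage allocations (stage names are just labels) and verifies the end-to-end range matches the quoted total.

```python
# Per-stage (min_ms, max_ms) allocations for the 400 ms ultra-low example above.
BUDGET_B = {
    "capture_encode": (50, 120),
    "srt_contribution": (100, 150),
    "srt_to_webrtc_gateway": (50, 100),
    "player_decode_buffer": (50, 80),
}

def budget_range(budget: dict) -> tuple:
    """Sum per-stage (min, max) allocations into an end-to-end range."""
    lo = sum(b[0] for b in budget.values())
    hi = sum(b[1] for b in budget.values())
    return lo, hi

# Worst case must stay at or under the 450 ms upper bound quoted above.
assert budget_range(BUDGET_B) == (250, 450)
```

Swap in Example A's numbers the same way and assert the upper bound stays at or under 2,000 ms after tuning.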

How to measure each leg:

  • Encoder: log PTS at capture and output; measure encoder internal latency.
  • SRT: query SRT statistics (RTT, packet loss, retransmits) from endpoint logs.
  • Transcoder/packager: instrument processing time per manifest/segment.
  • CDN: run synthetic clients from target regions and measure TTFB and first-frame times.
  • Player: collect startup and rebuffer metrics via in-player metrics SDK.
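For the SRT leg, cumulative counters from endpoint logs convert directly into the loss and retransmit ratios you alert on. The field names below (`pktSent`, `pktSndLoss`, `pktRetrans`) follow libsrt's statistics naming but should be treated as assumptions about your endpoint's log format.

```python
def srt_link_metrics(stats: dict) -> dict:
    """Derive loss and retransmit percentages from cumulative SRT counters."""
    sent = stats.get("pktSent", 0)
    if sent == 0:
        # No traffic yet; avoid division by zero.
        return {"loss_pct": 0.0, "retrans_pct": 0.0}
    return {
        "loss_pct": 100.0 * stats.get("pktSndLoss", 0) / sent,
        "retrans_pct": 100.0 * stats.get("pktRetrans", 0) / sent,
    }
```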

Practical recipes

Working recipes you can deploy and iterate on. Each recipe includes the minimum configuration and an operational checklist.

Recipe 1 — Scalable low-latency live event (target 1–3 s)

  1. Ingest: contributors push to SRT endpoints on the origin. Configure SRT latency to 400 ms and enable FEC at 10%.
    • SRT settings: latency=400 ms; FEC=10%; MTU=1200–1400 bytes.
  2. Encoding at contributor:
    • Codec: H.264 (Main profile). Keyframe interval = 1 s (for 30 fps → 30 frames).
    • B-frames = 0 (or 1 for <3 s targets), rate control = CBR or constrained VBR with VBV buffer <= 2 s.
    • Bitrate ladder example:
      • 1080p30: 4,500–6,500 kbps
      • 720p30: 2,500–4,000 kbps
      • 480p30: 800–1,200 kbps
      • Audio: AAC 96–128 kbps
  3. Origin/transcoding: run stateless transcoder workers that feed a CMAF/LL-HLS packager.
    • Segment target: 2 s. Part size: 200 ms–300 ms (LL-HLS/CMAF parts).
    • Ensure closed GOP alignment across renditions to enable seamless ABR switching.
  4. Distribution: push to a global CDN with cache TTLs tuned for live manifests (very short, e.g. 1–2 s for playlists) — see Worldwide CDN.
  5. Player: configure player to request LL-HLS and use a playback buffer target of 1.5–3 s.

Operational checklist: verify SRT stats, validate closed GOP, run multi-region synthetic viewers, and adjust FEC/latency if packet loss > 0.5%.
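Recipe 1's contributor settings can be sketched as an ffmpeg invocation built programmatically. This assumes an ffmpeg build with libsrt; the host, port, and input file are placeholders, and note that ffmpeg's srt:// `latency` query parameter is expressed in microseconds.

```python
def srt_push_cmd(host: str, port: int, bitrate_kbps: int = 4500,
                 fps: int = 30, latency_ms: int = 400) -> list:
    """Build an ffmpeg argument list matching Recipe 1's contributor
    settings: H.264 Main, 1 s keyframes, no B-frames, CBR-ish, SRT push."""
    vbv_kbits = bitrate_kbps * 2          # VBV buffer <= 2 s of video
    url = f"srt://{host}:{port}?latency={latency_ms * 1000}"
    return [
        "ffmpeg", "-re", "-i", "input.mp4",
        "-c:v", "libx264", "-profile:v", "main",
        "-g", str(fps), "-bf", "0",       # 1 s GOP at the given fps, no B-frames
        "-b:v", f"{bitrate_kbps}k",
        "-maxrate", f"{bitrate_kbps}k", "-bufsize", f"{vbv_kbits}k",
        "-c:a", "aac", "-b:a", "128k",
        "-f", "mpegts", url,
    ]
```

FEC and MTU are usually set on the SRT endpoint or via additional srt:// query parameters; check your build's protocol options before relying on them.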

Recipe 2 — Interactive Q&A / auctions (target < 500 ms)

  1. Use SRT for contribution from remote presenters to a central origin and bridge to WebRTC for viewer delivery.
  2. Contributor encoder:
    • Keyframe interval = 1 s or 0.5 s if the encoder supports fast keyframes.
    • B-frames = 0; lookahead disabled; profile = main.
    • SRT latency = 150–250 ms; avoid excessive FEC — try 5–10% only if needed.
  3. Gateway: an SRT→WebRTC gateway reduces overall latency by packaging RTP for WebRTC clients. Ensure the gateway supports Opus audio for best interactive audio quality.
  4. Player: WebRTC player targeted for sub-500 ms; maintain small jitter buffer < 100 ms where network allows.

Notes: expect higher CPU usage per stream and more complex signaling. Use hardware encode where possible.

Recipe 3 — Multi-camera remote production

  1. Each camera/encoder sends an SRT stream to the production origin; configure redundancy (primary/backup) by provisioning two SRT endpoints and enabling automatic failover in the switcher.
  2. Keep identical encoder profiles across cameras: same codec, GOP/keyframe interval, frame rate, and packet MTU.
  3. Perform audio/video alignment with PTS or SMPTE timecode where possible; otherwise use a sync pulse strategy on ingest to compensate in the switcher.

Result: fast switching (the mixer/transcoder typically adds 100–300 ms of program-output latency); record ISO tracks in parallel for VOD.

Recipe 4 — Live-to-VOD with continuous CMAF recording

  1. Ingest via SRT to origin; simultaneously record CMAF chunks to object storage.
    • Segment duration = 2 s; chunk size 200 ms for immediate availability; write chunks to storage as they complete.
  2. After event, generate final MP4/MPD/HLS manifests from recorded CMAF segments for VOD hosting and long-term archive.

Benefits: short time-to-VOD and consistent bitrate ladders between live and VOD.

Practical configuration targets

Concrete targets you can apply directly to encoders, SRT endpoints, packagers and players.

  • Encoder (contributor):
    • Codec: H.264 Main or High for compatibility; H.265 only for controlled clients. For interactive, prefer H.264.
    • Keyframe interval: 1 s (30 fps → 30 frames). For aggressive setups use 0.5 s.
    • B-frames: 0 for <1 s; 0–2 for 1–3 s targets.
    • Rate control: CBR or constrained VBR. VBV buffer <= 2 s recommended for low-latency.
    • MTU / packet size: 1,200–1,400 bytes to avoid fragmentation on the internet.
    • Bitrate examples (kbps): 1080p30 = 4,500–6,500; 720p30 = 2,500–4,000; 480p = 800–1,200; audio = 96–128.
  • SRT ingest:
    • latency = 200–800 ms for low-latency workflows. Set lower (100–250 ms) only on reliable networks.
    • FEC = 0–20% initial; raise to 30–50% only for very lossy last-mile links.
    • MTU = 1200–1400 bytes. Monitor retransmit counters and RTT.
  • Packager / LL-HLS:
    • Segment length = 2 s (1–4 s acceptable). Part size = 200–400 ms for LL-HLS/CMAF.
    • Playlist update frequency set to match part lengths; ensure CDN caches allow low TTL for manifests (1–3 s).
  • Transcoding:
    • Prefer hardware (NVENC/QuickSync) for per-stream latency of 50–150 ms; software x264/x265 may be 100–500+ ms depending on presets.
    • Disable lookahead and large motion-estimation windows for low-latency jobs.
  • Player settings:
    • Startup buffer target: 1.5–3 s for LL-HLS; < 500 ms for WebRTC where possible.
    • Switching: ensure closed GOP and consistent codecs for seamless ABR.
    • Instrumentation: emit first-byte, first-frame, rebuffer events, and bandwidth estimates.
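The encoder targets above lend themselves to automated validation before a profile ships. A minimal sketch, assuming a flat config dict whose keys are illustrative rather than any real encoder API:

```python
def validate_encoder_profile(cfg: dict, max_vbv_s: float = 2.0) -> list:
    """Check a contributor encoder profile against the practical targets
    above. Returns a list of problems (empty list = profile passes)."""
    problems = []
    # Keyframe interval in frames must match fps x interval (e.g. 30 fps x 1 s).
    expected_keyint = round(cfg["fps"] * cfg["keyframe_interval_s"])
    if cfg["keyint_frames"] != expected_keyint:
        problems.append("keyint does not match fps x keyframe interval")
    # VBV buffer should hold at most 2 s of video at the target bitrate.
    if cfg["vbv_buffer_kbits"] > cfg["bitrate_kbps"] * max_vbv_s:
        problems.append("VBV buffer exceeds 2 s of video")
    if cfg["target_latency_s"] < 1 and cfg["b_frames"] != 0:
        problems.append("B-frames must be 0 for sub-second targets")
    if not 1200 <= cfg["mtu"] <= 1400:
        problems.append("MTU outside the 1,200-1,400 byte range")
    return problems
```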

For implementation details see the encoder and SRT configuration docs: Encoder settings, SRT configuration, and LL-HLS setup.

Limitations and trade-offs

Every optimization has costs. Be explicit about what you accept and what you won’t.

  • Latency vs reliability: reducing SRT latency increases retransmits and risk of perceived glitches. If you need aggressive latency (<200 ms), prioritize network stability or accept higher CPU/bitrate overhead.
  • FEC overhead: adding 10–20% FEC consumes extra bandwidth. For mobile viewers with metered plans be mindful of cost.
  • Encoder complexity vs CPU: low-latency requires disabling features that improve compression efficiency (B-frames, large lookahead), which increases bitrate or decreases quality.
  • Device compatibility: LL-HLS and CMAF are still being adopted across platforms; fallback to classic HLS for unsupported clients adds complexity.
  • CDN caching: very short manifest TTLs increase origin load—design origin autoscaling and stateless packaging to absorb higher manifest QPS.

Common mistakes and fixes

Real operational failures and how to fix them quickly.

  1. Symptom: Long startup (10+ seconds). Cause: segments too long (6–10 s) or player buffering. Fix:
    • Reduce segment length to 2 s and part size to 200–300 ms; set playlist TTL to 1–3 s.
  2. Symptom: Frequent stalls on mobile. Cause: bitrates too high or adaptive ladder not configured. Fix:
    • Provide a low-bitrate rendition (360p < 800 kbps), enable ABR, and lower initial bitrate for startup (start with 480p or 360p).
  3. Symptom: Audio/video drift after switching encoders. Cause: mismatched GOP/keyframe intervals. Fix:
    • Standardize keyframe interval across encoders (1 s) and enable closed GOP alignment in packager/transcoder.
  4. Symptom: High packet loss/retransmits at origin. Cause: SRT latency set too low for network conditions. Fix:
    • Increase SRT latency by 200–500 ms increments and enable FEC 10% as a first step.
  5. Symptom: ABR switch shows black frames. Cause: non-aligned keyframes across renditions. Fix:
    • Enable closed GOP and force keyframe alignment during transcode.
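Mistake 5 (black frames on ABR switches) is cheap to catch before an event by comparing keyframe timestamps across renditions, e.g. as extracted with ffprobe. A sketch, assuming each rendition is a sorted list of keyframe PTS values in seconds:

```python
def keyframes_aligned(renditions: dict, tolerance_s: float = 0.001) -> bool:
    """Return True if every rendition's keyframe timestamps line up
    within tolerance, which is required for seamless ABR switching."""
    reference = None
    for name, pts_list in renditions.items():
        if reference is None:
            reference = pts_list      # first rendition is the reference
            continue
        if len(pts_list) != len(reference):
            return False
        if any(abs(a - b) > tolerance_s for a, b in zip(pts_list, reference)):
            return False
    return True
```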

Rollout checklist

Checklist to move from pilot to production with measurable gates.

  1. Define SLOs: target latency class, acceptable rebuffer rate (<1%), and packet loss threshold (<0.5%).
  2. Provision resources: create SRT endpoints, transcoder pools, packagers and CDN distribution via Video Hosting and Worldwide CDN.
  3. Implement metrics and tracing: collect SRT stats, encoder timings, packager latency, player metrics (startup, rebuffer, fps).
  4. Pilot: run a staged pilot with increasing concurrency (10 → 100 → 1k → 10k) and validate metrics at each step.
  5. Chaos testing: simulate 1–5% packet loss and verify FEC/latency adjustments behave as expected.
  6. Fallbacks and rollback: implement automatic failover between ingest endpoints and a fallback HLS playlist for unsupported viewers.
  7. Post-run audit: collect recordings, analyze PTS alignment, and adjust encoder/packager settings for final event.

Example architectures

High-level architectures and the expected latencies and scale considerations.

Architecture A — Single-region low-latency live

SRT ingests → origin SRT gateway → stateless hardware-accelerated transcoders → CMAF/LL-HLS packager → global CDN edges → players (LL-HLS or fallback HLS).

  • Scale: CDN handles viewer scale; origin autoscale for transcoding.
  • Expected latency: 1–3 s with default settings above.

Architecture B — Ultra-low interactive

SRT ingests (presenters) → real-time mixer/gateway → WebRTC edge clusters → direct RTC clients.

  • Scale: heavy CPU per concurrent peer; use SFU-scale-out and regionally distributed RTC edges.
  • Expected latency: 150–500 ms depending on network.

Architecture C — Global resilient distribution

Multi-region SRT peering into regional origins; each origin runs local transcoders and packagers and pushes to CDN with signed manifests.

  • Scale: supports millions of viewers with reduced tail latency by regionalizing packaging.
  • Expected latency: 0.5–3 s depending on client region and origin proximity.

Troubleshooting quick wins

Immediate actions when a live stream shows high rebuffering, stalls or poor quality.

  1. Check SRT statistics (RTT, packet loss, retransmits).
    • If packet loss > 1%: increase SRT latency by 200–500 ms and enable FEC 10–20%.
  2. Reduce ingest bitrate temporarily (e.g., step down one rung on the ladder) to see if network congestion improves viewer experience.
  3. Lower encoder preset quality (faster encode) to reduce per-frame latency when CPU is saturated.
  4. Shorten segment length / parts and reduce player buffer if you can tolerate slight increases in rebuffer risk.
  5. Confirm MTU not fragmenting packets; set packet size to 1,200–1,400 bytes and avoid larger MTU.
  6. Monitor CDN edge availability and origin health; if origin CPU is throttling, spin up additional transcoding workers.

Next step

If you want to test these patterns against real traffic, provision a focused pilot plan now:

  1. Create an SRT ingest endpoint and try the low-latency recipe in a controlled network test. See the SRT configuration guide: SRT configuration.
  2. Choose the hosting plan that fits expected scale and features: Video Hosting, API-based integration: Video API, and CDN distribution: Worldwide CDN.
  3. Follow the encoder configuration checklist: Encoder settings, and set up LL-HLS packaging using our packaging doc: LL-HLS setup.

When you are ready, contact our platform team for an architecture review or to provision a pilot environment. Use the product pages above to start a trial or request onboarding.