Live Video Streaming
This is a practical, engineering-focused guide to live video streaming using SRT for contribution and low-latency distribution. It includes concrete latency budgets, encoder and packaging targets, production recipes, common mistakes and fixes, rollout checklists and product mapping to help you move from PoC to reliable live events. Before full production rollout, run a test and QA pass: generate test videos, run streaming quality checks and video previews, and use a test app for end-to-end validation. For pricing, validate with the bitrate calculator.
What it means (definitions and thresholds)
Before designing a pipeline you must be precise about what "low latency" means for your product. Below are operational definitions tied to production thresholds.
- Glass-to-glass latency: the measurable time from physical camera exposure to the frame being visible to an end user. All budgets below are glass-to-glass targets.
- Ultra-low / interactive: <500 ms. Typical for WebRTC-based calls, auctions, bidding and gaming where sub-second responsiveness is mandatory.
- Low latency: 0.5 s – 5 s. Achievable with SRT for contribution + chunked-CMAF / LL-HLS or short-segment HLS for distribution. This is the target for most live sports, commentary, and live commerce where interaction is present but not real-time conversational.
- Near real-time: 5 s – 15 s. Works with short HLS segments (1–3 s) without full chunked CMAF/LL-HLS tooling everywhere.
- Standard / legacy: 15 s – 45+ s. Typical for traditional HLS with 6 s segments and larger player buffer.
Key protocol distinction: SRT is primarily a contribution (encoder-to-ingest) transport that provides packet recovery via ARQ and a tunable latency parameter (120 ms by default in many stacks). SRT is not a CDN distribution protocol; pair it with packaging into LL-HLS/CMAF or another CDN-friendly format for viewer distribution.
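As a quick reference, the thresholds above can be encoded as a small classifier. This is an illustrative sketch; the class names and boundaries come directly from the definitions above.

```python
def latency_class(glass_to_glass_s: float) -> str:
    """Map a measured glass-to-glass latency (seconds) to the
    operational classes defined above."""
    if glass_to_glass_s < 0.5:
        return "ultra-low / interactive"
    if glass_to_glass_s < 5:
        return "low latency"
    if glass_to_glass_s < 15:
        return "near real-time"
    return "standard / legacy"

print(latency_class(2.0))  # → low latency
```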
Decision guide
Choose the pipeline by answering four questions: interactivity requirement, scale, network reliability, and output destinations. The short decision guide below maps common needs to architectures and product pages. A related implementation reference is Low Latency.
- Do you need sub-500 ms interactivity?
- Yes: use WebRTC/SFU and keep state/room servers near users. For API-driven ingest, consider Video API for session orchestration.
- No: continue.
- Is the event highly interactive but requires many viewers (10k+)?
- Use SRT for contribution from the venue to a cloud mixing/transcoding layer, then publish via chunked-CMAF/LL-HLS to CDN endpoints. This keeps contribution resilient while enabling scale.
- Do you need to push to a dozen social platforms or destinations?
- Use a central ingest (SRT or RTMP) plus a distribution layer such as our multi-streaming product to handle per-platform RTMP/RTMPS endpoints and rate limits.
- Do you require recording and subsequent VOD publishing?
- Ensure packaging into fragmented MP4/CMAF during the live window so segments are immediately usable for VOD. See Video on Demand for integration and retention options.
For programmatic control of streams, ingest endpoints and metadata integration, link orchestration into CI/CD and event systems with Video API.
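The four questions can also be expressed as a small routing function — a sketch with illustrative labels, not a product API:

```python
def pick_pipeline(sub_500ms: bool, viewers: int,
                  social_destinations: int, needs_vod: bool) -> list[str]:
    """Translate the four decision-guide questions into a pipeline outline."""
    steps = []
    if sub_500ms:
        # Sub-500 ms interactivity forces WebRTC; SRT/HLS cannot get there.
        steps.append("WebRTC/SFU near users")
    else:
        steps.append("SRT contribution to cloud ingest")
        if viewers >= 10_000:
            steps.append("chunked-CMAF/LL-HLS to CDN")
    if social_destinations > 1:
        steps.append("multi-streaming layer for per-platform RTMP(S)")
    if needs_vod:
        steps.append("record fragmented MP4/CMAF for VOD")
    return steps

print(pick_pipeline(False, 50_000, 3, True))
```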
Latency budget / architecture budget
Build a latency budget from camera to player. Two practical targets below demonstrate how component budgets add up.
Budget example: 2.0 s glass-to-glass target (typical low-latency SRT pipeline)
- Camera & capture: 10–40 ms
- Encoder capture pipeline & frame queuing: 50–150 ms
- SRT contribution transport (tunable latency window): 120–500 ms (we recommend 200–400 ms for production networks)
- Ingest gateway & packet reassembly: 50–150 ms
- Transcode or transmux for packaging: 150–400 ms (hardware accel NVENC/QuickSync typically faster; software x264 slower)
- Packaging (CMAF/LL-HLS partials): 200–400 ms
- CDN edge fetch + player buffer: 200–500 ms
Total: ~0.8 s – 2.1 s depending on choices (the component minima sum to 0.78 s, the maxima to 2.14 s). To reliably hit the 2 s mark, plan for the higher end of each item and test under load.
Budget example: 5.0 s glass-to-glass target (scale-first pipeline)
- Camera & capture: 10–40 ms
- Encoder: 50–200 ms
- SRT contribution: 200–800 ms
- Batch packaging into HLS segments (2 s segments with a 3-segment startup buffer): 4–6 s
- CDN edge fetch + player buffer: 200–800 ms
Total: typically 5–8 s. Use this when you trade latency for reliability and compatibility.
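Both budgets are plain sums of component ranges, so a small helper makes it easy to sanity-check a target before an event. The component values below come from the 2 s table above:

```python
def budget_range(components: dict[str, tuple[int, int]]) -> tuple[float, float]:
    """Sum per-component (min_ms, max_ms) ranges into glass-to-glass seconds."""
    lo = sum(v[0] for v in components.values())
    hi = sum(v[1] for v in components.values())
    return lo / 1000, hi / 1000

# Component ranges from the 2 s glass-to-glass budget above, in milliseconds.
low_latency = {
    "camera":     (10, 40),
    "encoder":    (50, 150),
    "srt":        (120, 500),
    "ingest":     (50, 150),
    "transcode":  (150, 400),
    "packaging":  (200, 400),
    "cdn+player": (200, 500),
}
print(budget_range(low_latency))  # → (0.78, 2.14)
```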
Practical recipes
Below are three operational recipes with step-by-step targets. Each recipe includes encoder, transport and packaging targets you can implement immediately.
Recipe A: Live sports — 2 s target, large audience
- On-site capture & encoder
- Hardware or dedicated server encoder. Encode H.264 High profile (or H.265 if you control endpoints), GOP 2 s, keyframe interval = 2 s.
- Video presets: x264 tune=zerolatency; preset=veryfast (or hardware NVENC for lower CPU/latency).
- Resolution/bitrate ladder (example):
- 1080p60: 8 000–12 000 kbps (CBR or constrained VBR), max 13 500 kbps
- 1080p30: 4 500–6 500 kbps
- 720p30: 2 500–4 000 kbps
- 480p: 1 200–1 800 kbps
- Audio: AAC-LC 48 kHz, 128 kbps stereo
- SRT contribution
- Use SRT from encoder to cloud ingest. Set SRT latency to 200–400 ms for production; the 120 ms default is fine on stable networks but leaves a smaller ARQ window.
- Use packet_size=1316 (seven 188-byte MPEG-TS packets) to avoid IP fragmentation on typical 1500-byte MTUs, and ensure encoder output is a fragmented MP4 or MPEG-TS stream aligned to keyframes.
- Cloud ingest & transcoding
- Ingest via Video API endpoints that accept SRT. Transcode to the ladder above using hardware acceleration if possible.
- Package to chunked-CMAF / LL-HLS with partials (part) = 200 ms and target segment = 2 s (10 parts). Keep manifest updates frequent for player responsiveness.
- Distribution
- Push packaged assets to a CDN with short caching windows for manifests and parts. Use multi-CDN if you need high availability and reach.
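As a concrete starting point for the contribution leg of this recipe, the settings above map onto an FFmpeg invocation roughly like the following. The host, port and input device are illustrative assumptions, and option names (especially the SRT URL parameters) should be verified against your FFmpeg build; note that FFmpeg's SRT `latency` option is expressed in microseconds:

```python
# Hypothetical FFmpeg command for the Recipe A contribution leg:
# x264 zerolatency, 2 s GOP at 60 fps, ~10 Mbps constrained VBR,
# AAC-LC audio, MPEG-TS over SRT with a 300 ms latency window.
# FFmpeg's SRT "latency" option is in microseconds: 300 ms = 300000.
srt_url = "srt://ingest.example.com:9000?latency=300000&pkt_size=1316"

cmd = [
    "ffmpeg", "-i", "input.sdi",
    "-c:v", "libx264", "-preset", "veryfast", "-tune", "zerolatency",
    "-g", "120", "-keyint_min", "120",           # 2 s GOP at 60 fps
    "-b:v", "10000k", "-maxrate", "13500k", "-bufsize", "13500k",
    "-c:a", "aac", "-ar", "48000", "-b:a", "128k",
    "-f", "mpegts", srt_url,
]
print(" ".join(cmd))
```

Assembled as a list so each flag is easy to audit; pass it to your process runner of choice.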
Recipe B: Remote production (REMI) — reliable contribution, switching in cloud
- Encoder settings
- GOP 1–2 s, keyframe aligned across all remote encoders (important for fast switch cuts).
- Lower latency encoder presets; avoid B-frames or limit to 1 if your encoder and packager can handle decoder delay.
- SRT settings
- Set latency=300–600 ms if networks are variable; prioritize ARQ over immediate drop.
- Use caller/listener mode appropriate to your NAT/topology and set a secure passphrase for encrypted SRT sessions.
- Cloud mixing & switcher
- Use a real-time mixer that accepts SRT and outputs aligned streams. Ensure all inputs use the same GOP and clock alignment to avoid frame-drops on cuts.
- Recording and VOD
- Record as fragmented MP4/CMAF during the event so the same asset can be published later via Video on Demand workflows.
Recipe C: Social multistream + VOD
- Ingest: RTMP or SRT to cloud ingest
- For social restreaming, RTMP to an ingest is often simpler from common encoders. For contribution resilience use SRT to Video API.
- Transcode and restream
- Use a multi-streaming engine to manage per-platform profiles, RTMPS endpoints and bitrate rules.
- Record for VOD
- Record a high-bitrate master and auto-generate VOD derivatives using Video on Demand.
Practical configuration targets
These are concrete encoder, packaging and network targets to use as defaults in production manifests and configuration files.
- GOP / keyframe
- GOP length: 1–2 s (recommended 2 s for efficient compression with fast seek/manifest update).
- Keyframe interval: equal to GOP length; align across all encoders feeding a switcher/transcoder.
- Parts & segment sizes
- Chunked-CMAF/LL-HLS part size: 200 ms recommended for stable low-latency delivery; 200–400 ms acceptable.
- Target segment duration: 2 s for low-latency pipelines; 2–4 s for near-real-time. Avoid 6 s segments when aiming <10 s latency.
- Player buffer
- Set player target buffer: 1–3 s for low-latency; p95 startup buffer <3 s.
- Bitrate ladders (example)
- 1080p60: 8 000–12 000 kbps
- 1080p30: 4 500–6 500 kbps
- 720p30: 2 500–4 000 kbps
- 480p: 1 000–1 800 kbps
- Audio: AAC-LC, 48 kHz, 64–192 kbps depending on voice/music
- SRT transport
- Latency parameter: 120 ms default; production 200–600 ms depending on network. Remember higher latency enlarges the ARQ window and tolerates more loss at the cost of glass-to-glass delay.
- MTU / packet_size: 1316 recommended to avoid IP fragmentation on most networks.
- Network & monitoring targets
- Round-trip time (RTT) target: <150–200 ms for best SRT performance.
- Packet loss target: <0.5% average; SRT can recover from bursts if latency window is sufficient.
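Several of these defaults can be derived rather than memorized. The sketch below computes the keyframe interval in frames, LL-HLS parts per segment, and an SRT latency sized by the common rule of thumb of roughly 4× RTT (floored at the 120 ms default); treat the multiplier as a starting assumption to tune per network:

```python
def keyframe_interval_frames(fps: float, gop_s: float = 2.0) -> int:
    """Keyframe interval in frames, e.g. 120 at 60 fps with a 2 s GOP."""
    return round(fps * gop_s)

def parts_per_segment(segment_s: float = 2.0, part_ms: int = 200) -> int:
    """Number of LL-HLS partial segments per full segment."""
    return round(segment_s * 1000 / part_ms)

def srt_latency_ms(rtt_ms: float, multiplier: float = 4.0) -> int:
    """SRT latency window sized to allow several retransmission round trips."""
    return max(120, round(rtt_ms * multiplier))

print(keyframe_interval_frames(60))  # → 120
print(parts_per_segment())           # → 10
print(srt_latency_ms(80))            # → 320
```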
Limitations and trade-offs
Every design choice creates trade-offs. Be explicit about the ones that matter for low-latency SRT pipelines.
- Reliability vs latency: ARQ-based recovery (SRT) uses time to retransmit lost packets. Lower SRT latency windows leave less time for retransmission and so increase visible loss and artifacts.
- Quality vs bandwidth: Smaller GOP and more frequent keyframes increase bandwidth but improve recovery and stream switch stability.
- Codec efficiency vs compatibility: H.265/AV1 offer lower bitrates but have limited device support and longer encode latency; H.264 is the safest for wide reach.
- Player behavior: many players add safety buffers and re-buffer to avoid stalls; you must configure player-policy (max buffer, rebuffer thresholds) to realize the architecture budget.
- CDN behavior: standard CDNs are optimized for 30 s+ caching windows. For LL-HLS/CMAF you need CDN features that support chunked transfer and small TTL for manifests/parts.
Common mistakes and fixes
These are recurring issues we see in production and the straightforward fixes.
- Problem: GOPs not aligned across encoders causing black frames on cuts.
- Fix: enforce keyframe alignment (same keyframe interval and GOP structure) across all upstream encoders.
- Problem: large segment durations (6 s) used for a low-latency goal.
- Fix: switch to part-based CMAF/LL-HLS with 200 ms parts and 2 s segment targets.
- Problem: SRT latency set too low for variable networks, causing visible artifacts.
- Fix: increase SRT latency to 300–600 ms to give the ARQ window room; monitor packet loss and RTT.
- Problem: encoder tuned for quality (high-latency presets) instead of zerolatency.
- Fix: use encoder presets for low latency (x264 tune=zerolatency, hardware NVENC low-latency modes). Reduce B-frames.
- Problem: player buffer defaults are large, masking backend improvements.
- Fix: adjust player settings to match target glass-to-glass; test with controlled start and p95/p99 latency measurements.
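The first fix (keyframe alignment) is easy to automate as a pre-show check. This sketch assumes you can query each encoder's configured keyframe interval; the encoder names are illustrative:

```python
from collections import Counter

def check_keyframe_alignment(encoder_intervals: dict[str, float]) -> list[str]:
    """Return names of encoders whose keyframe interval differs from the
    most common value across the fleet — candidates to fix before
    cutting between sources."""
    if not encoder_intervals:
        return []
    expected, _ = Counter(encoder_intervals.values()).most_common(1)[0]
    return [name for name, ival in encoder_intervals.items() if ival != expected]

print(check_keyframe_alignment({"cam-1": 2.0, "cam-2": 2.0, "cam-3": 1.0}))
# → ['cam-3']
```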
Rollout checklist
Use this checklist before going live for the first time and before every major event.
- Design & planning
- Define target glass-to-glass latency (explicit number).
- Choose transport (SRT for contribution, WebRTC for ultra-low interactivity, HLS/CMAF for distribution).
- Staging tests
- Unit tests for encoder config, SRT handshake, transcode chain, and packaging.
- Latency tests: measure p50/p95/p99 glass-to-glass under nominal and degraded network conditions.
- Load & chaos testing
- Load test CDN and origin to target expected concurrent viewers + 50% headroom.
- Simulate packet loss and increased RTT; verify SRT latency window recovers streams without collapsing the transcoder pool.
- Operational readiness
- Alerting for CPU, queue depth, SRT connection count, packet loss, and player-side error rates.
- Failover plan for CDN or origin node failure and rapid re-route to backup ingest.
- Pre-show checklist (30–60 minutes before)
- Confirm encoder clocks and NTP sync.
- Validate SRT handshake and observed RTT per encoder; check ARQ retransmissions and adjust latency if needed.
Example architectures
Three example architectures with component roles and why they work.
Architecture 1: Small event (up to 5k viewers) — minimal cost, low-latency
- On-site encoder (SRT) → Cloud ingest (single region) → Transcoder (single cluster) → Single CDN endpoint
- Latency expectation: 1–3 s when tuned.
- When to use: one-off events, low concurrency.
Architecture 2: Large event (100k+ viewers) — resilient low-latency
- On-site SRT encoders → Multi-region ingestion (active-active) → Transcoding farms with autoscale (hardware accelerated) → Multi-CDN and edge packaging → Client
- Include origin-layer caching for manifests/parts and geo-redirection. Use multi-streaming to push to socials from the cloud layer so encoder bandwidth is preserved.
Architecture 3: Interactive show with talkbacks — mixed transport
- Guest endpoints use WebRTC for ultra-low interactivity → SFU mix → Cloud mixer outputs SRT to production/packager → packaged to LL-HLS for viewers
- Rationale: WebRTC for guest experience, SRT for reliable, jitter-tolerant contribution into production.
Troubleshooting quick wins
If latency or quality is worse than expected, run these quick checks in order.
- Network basics
- Check RTT (ping) and jitter with ping/iperf3. Target RTT <200 ms for best SRT behavior. If RTT >200–300 ms, increase SRT latency window.
- Check packet loss with iperf3 or SRT stats. Average loss <0.5% is a good production target. For bursts, increase latency window.
- Encoder health
- Verify CPU/GPU utilization stays below ~70% on average; sustained high CPU increases encode latency and drops frames.
- Check encoder buffer/VBV settings; set VBV buffer to 1–2 s for constrained VBR.
- SRT stats
- Inspect SRT metrics: rtt, rttVar, pktSndLoss, pktSndDrop. A rising pktSndLoss suggests network or MTU problems.
- Packaging alignment
- Confirm keyframe alignment between encoder and packager. If segments start misaligned you will see delay and poor switch seams.
- Player-side
- Turn on player diagnostics to measure segment arrival times, buffer fill, and ABR switches. Reduce initial buffer target to test lower-latency path.
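These checks can be folded into one triage helper. The field names follow libsrt's statistics naming (`msRTT`, `pktSndLoss`), but treat the stats shape and thresholds as assumptions to adapt to your monitoring stack:

```python
def triage_srt_stats(stats: dict, latency_ms: int) -> list[str]:
    """Apply the quick-win thresholds above to a snapshot of SRT stats."""
    warnings = []
    rtt = stats.get("msRTT", 0)
    if rtt > 200:
        warnings.append(f"RTT {rtt} ms > 200 ms: increase SRT latency window")
    if stats.get("pktSndLoss", 0) > 0:
        warnings.append("send-side loss observed: check network or MTU")
    if latency_ms < 4 * rtt:
        warnings.append("latency window < 4x RTT: little room for ARQ recovery")
    return warnings

for w in triage_srt_stats({"msRTT": 250, "pktSndLoss": 12}, latency_ms=300):
    print(w)
```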
Next step
Start with a focused pilot: pick a single venue or feed and prove your target glass-to-glass under realistic network conditions, then scale. Use these resources and products to accelerate the pilot and roll-out:
- For programmatic ingest, orchestration and control APIs: Video API.
- For pushing a single live feed to many social platforms and RTMP endpoints: multi-streaming.
- To take live recordings and publish VOD quickly: Video on Demand.
- Detailed implementation guides: Low-latency architecture, SRT setup, Encoder best practices.
- If you need a self-hosted option or full control over the stack: review our self-host option at /self-hosted-streaming-solution or evaluate the AWS Marketplace appliance at AWS Marketplace.
If you want hands-on help mapping these targets into your deployment, schedule a technical review with our engineering team through the Video API page or request a demo from product. Implement one recipe end-to-end in staging, validate p95/p99 latency, then roll to production with the rollout checklist above.


