
Live Streaming Sites

Mar 09, 2026

This is an engineer-level playbook for building and operating live streaming sites, with practical, measurable targets. It covers low-latency delivery using SRT for contribution where appropriate, decision guidance for protocol choice (WebRTC / SRT / LL-HLS / HLS), explicit latency budgets, configuration targets, recipes you can implement immediately, common mistakes, and a rollout checklist. Product mappings and docs are provided for fast proof-of-concept and production deployment. Before full production rollout, run a test and QA pass with Generate test videos, streaming quality check, and video preview, and validate your pricing path with the bitrate calculator. For monetization in this workflow, Paywall & access is the most direct fit. For a deeper practical guide, see Callaba cloud vs self hosted: a practical pricing and operations guide.

What it means (definitions and thresholds)

When people search for "live streaming sites" they mean a spectrum of systems: live linear channels, one-to-many event streams, interactive shows, and social re-streams. Architectures and configuration targets differ by required end-to-end latency and scale. Define terms up front so decisions are measurable.

  • Ultra‑low / real‑time: sub-500 ms glass-to-glass. Usually requires WebRTC or local SFU clusters and is appropriate for two‑way interactions (live auctions, remote contribution that needs conversation).
  • Low‑latency live: 500 ms – 3 s. Common target for sports, live commerce, betting—fast enough for low perceptible delay while supporting CDN scale. Achieved by contribution via SRT or WebRTC and distribution via LL‑HLS or CMAF low‑latency pipelines.
  • Near‑real‑time: 3 s – 10 s. Practical, easier to operate; standard LL‑HLS or tuned HLS with short segments fits here.
  • Traditional HLS/DASH: 10 s – 45 s. Good for standard OTT where compression efficiency and broad compatibility win over latency.

Key distinctions:

  • Contribution: how the encoder/field source reaches your origin (SRT, RTMP, WebRTC). Contribution paths must tolerate packet loss and provide encryption and failover.
  • Processing / Transcoding: origin/transcoder cluster—where you repackage and create ABR renditions.
  • Distribution: CDN and player behaviour (segment size, buffer, ABR switching).

SRT in this context is a contribution and point-to-point transport protocol that uses ARQ (retransmission) and a configurable receive latency (in milliseconds). Typical reliable settings vary by network (see the Low Latency guide for a related implementation reference):

  • Managed LAN / dedicated links: latency 50–200 ms
  • Good public internet: latency 200–800 ms
  • Mobile/4G/5G and unstable links: latency 800–2000 ms

Decision guide

Pick the simplest stack that meets your latency and scale targets. Use this checklist to decide.

  1. Is end-to-end interactivity required?
    • Yes, sub-500 ms: design on WebRTC and SFU. Expect higher operational complexity and stateful scaling.
    • No, but you need fast live (<=3s): SRT for contribution into a packager that outputs LL‑HLS or CMAF low-latency.
  2. Will you need social destinations / many outputs?
    • If yes, centralize ingest and use an origin that can output RTMP to social and low-latency HLS for your site. See /products/multi-streaming for multi-destination patterns.
  3. Do you need server‑side programmatic control, recording, or server-side mixing?
    • Use an API-driven control plane. Examine /products/video-api and integrate health checks, ingest rules and automated transcoding triggers.
  4. Do you require VOD generation and per-view paywalls?
    • Ingest to an origin that writes CMAF/fMP4 segments and HLS playlists; post-event use those files for VOD. Map to /products/video-on-demand for storage and monetization workflows.
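
The checklist above can be sketched as a small shell function; the thresholds mirror the latency tiers defined earlier (the function name and labels are illustrative, not a product API):

```shell
# Map a target end-to-end latency (ms) to the simplest viable stack,
# following the tiers defined in "What it means" above.
pick_stack() {
  target_ms=$1
  if   [ "$target_ms" -le 500 ];   then echo "WebRTC + SFU"
  elif [ "$target_ms" -le 3000 ];  then echo "SRT + LL-HLS"
  elif [ "$target_ms" -le 10000 ]; then echo "SRT/RTMP + tuned HLS"
  else                                  echo "Standard HLS/DASH"
  fi
}

pick_stack 2000   # prints: SRT + LL-HLS
```

The point is not the function itself but that the decision is a pure threshold on your documented latency SLO, so it belongs in a runbook, not in someone's head.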

If you want a quick PoC: push a single SRT stream to an origin, transcode to an LL‑HLS packager and test in browsers. For configuration references see /docs/latency-guidelines and /docs/srt-setup.
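
That PoC can be a hedged two-command sketch (hostnames and ports are placeholders; assumes an ffmpeg build compiled with libsrt):

```shell
# 1) Push a synthetic test stream over SRT to the origin (caller mode, 200 ms latency)
ffmpeg -re -f lavfi -i "testsrc2=size=1280x720:rate=30" -f lavfi -i "sine=frequency=440" \
  -c:v libx264 -preset veryfast -tune zerolatency -g 60 -b:v 2500k \
  -c:a aac -b:a 128k -f mpegts "srt://origin.example.net:10000?mode=caller&latency=200"

# 2) Confirm the LL-HLS playlist is being produced and advancing at the edge
curl -s "https://cdn.example.net/live/index.m3u8" | tail -n 5
```

Run the curl check twice, a few seconds apart; the media sequence and part numbers should advance between fetches.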

Latency budget / architecture budget

Always build a latency budget: allocate a target for each hop and verify with instrumentation. Below are component budgets and two worked examples.

Typical component latencies (glass-to-glass)

  • Capture + ingest encoder (frame capture, encode, packetization): 50–400 ms. Hardware encoders sit at the low end; software encoders tend toward the high end.
  • Transport / network one-way delay: local 10–80 ms, regional 50–200 ms, cross-continent 150–300+ ms. Use RTT/2 as baseline.
  • SRT receive buffer + retransmit window: 50 ms – 2,000 ms depending on latency setting and network conditions.
  • Origin/transcoding (decode + encoding multiple renditions): 50–500 ms depending on hardware acceleration and number of renditions.
  • Packaging (CMAF fragments, part creation): 50–300 ms. Small part sizes increase CPU and IO pressure.
  • CDN edge propagation and player fetches: 50–600 ms (depends on edge proximity and manifest TTL). LL‑HLS requires TTLs near 0–2 s.
  • Player buffer + decode: 50–800 ms (player target buffer). Lower buffers increase risk of rebuffering under jitter.

Budget example A — Target 2.0 s end-to-end (typical sports/live commerce)

  • Capture + encode: 150 ms (CPU + x264 tuned for low latency)
  • SRT network (one-way incl. ARQ): 200 ms
  • Transcoding & packaging: 300 ms
  • CDN edge + HTTP fetch: 300 ms
  • Player buffer (LL‑HLS targeting ~2s): 1,050 ms
  • Total ≈ 2,000 ms
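
As a sanity check, the example-A budget can be summed mechanically; if the stages don't add up to your target, adjust before building:

```shell
# Stage budgets from example A, in milliseconds
capture=150; network=200; transcode=300; cdn=300; player=1050
total=$((capture + network + transcode + cdn + player))
echo "${total} ms"   # prints: 2000 ms
```

Keep this arithmetic next to your SLO definition; when a stage budget changes (e.g. a new CDN), re-sum before promising a latency number.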

Budget example B — Target sub‑500 ms (interactive)

  • Capture + encode (hardware): 50–80 ms
  • Network (regional, WebRTC UDP): 60–100 ms
  • SFU mixing/forwarding: 40–80 ms
  • Player decode: 40–80 ms
  • Total ≈ 190–340 ms

Use these budgets to determine feasibility. If a single stage consumes more than 30–40% of the total budget, optimise that stage first (encoder settings, network path, CDN edge placement).

Practical recipes (implementable now)

Recipe 1 — Scalable low‑latency site (target 1–3 s) using SRT + LL‑HLS

  1. Contribution: have encoders (hardware or ffmpeg) push to SRT ingest on the origin. Set mode=caller or listener based on network topology.
  2. Origin: accept SRT, decode and transcode to ABR renditions, package to CMAF fMP4 with LL‑HLS playlists and parts (segment target 1–2 s, part size 200–300 ms).
  3. Distribution: CDN configured with edge TTL & cache control tuned to 0–2 s for playlists and parts.
  4. Player: LL‑HLS capable player with playback buffer target of 1.0–2.0 s and ABR smoothing enabled.

Key config snippets (example encoder -> SRT using ffmpeg):

ffmpeg -re -i input -c:v libx264 -preset veryfast -tune zerolatency -profile:v high -level 4.2 -g 60 -keyint_min 60 -sc_threshold 0 -b:v 4500k -maxrate 4950k -bufsize 9000k -c:a aac -b:a 128k -f mpegts "srt://ingest.example.net:10000?mode=caller&latency=200&pkt_size=1316"

Packaging must align keyframes with segment boundaries; use a packager that supports LL‑HLS partial segments (Bento4 or Shaka Packager).
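
As a minimal packaging sketch, ffmpeg's hls muxer can produce short fMP4 (CMAF) segments from an SRT listener input. Note this writes whole segments only, not Apple LL-HLS partial segments, so it is a stepping stone, not the full low-latency pipeline; the ingest port and output path are placeholders:

```shell
# Accept SRT, pass streams through, and write 2 s fMP4 HLS segments
# (assumes an ffmpeg build with libsrt)
ffmpeg -i "srt://0.0.0.0:10000?mode=listener" \
  -c copy -f hls -hls_time 2 -hls_segment_type fmp4 \
  -hls_flags independent_segments+delete_segments -hls_list_size 6 \
  /var/www/live/index.m3u8
```

For true LL-HLS parts, swap this stage for a packager with explicit partial-segment support as noted above.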

Recipe 2 — Interactive Q&A or talk show (sub‑500 ms) using WebRTC SFU

  1. Browser clients use WebRTC to connect to regional SFUs. Use TURN servers for NAT traversal; ensure ICE trickle is enabled for faster connect.
  2. SFU provides selective forwarding and optionally records the mixed output to the origin for later VOD. For a mass audience, output the SFU stream to a packager as SRT or RTMP then package as LL‑HLS for viewers.
  3. Scale SFU horizontally by region; use a control plane for session affinity and autoscaling.

Participant bitrate targets:

  • 720p: 2.5–4 Mbps
  • 480p: 1–2 Mbps
  • Audio: 48–128 kbps per participant

Recipe 3 — Resilient contribution from remote journalists with SRT primary/backup

  1. Configure two SRT destinations (primary and backup) on different public IPs/ISPs. Use matchable stream IDs so origin picks the first live stream and keeps the backup suppressed until failover.
  2. Set SRT latency = 800–2000 ms for mobile connections, enable SRT encryption (AES‑128 or AES‑256) and monitor packet loss and RTT.
  3. Record locally at the source as a backup file (e.g., MP4 fragmenter) and upload if the live feed fails.
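
A hedged sketch of the dual-destination push in step 1, using ffmpeg's tee muxer (hostnames are placeholders): one encode is duplicated to primary and backup ingest, and `onfail=ignore` keeps the surviving leg alive if one destination drops.

```shell
# Duplicate a single encode to primary and backup SRT ingests.
# latency=1200 suits an unstable mobile uplink per the guidance above.
ffmpeg -re -i input -c:v libx264 -preset veryfast -tune zerolatency \
  -g 60 -keyint_min 60 -b:v 4500k -c:a aac -b:a 128k \
  -map 0:v -map 0:a -f tee \
  "[f=mpegts:onfail=ignore]srt://primary.example.net:10000?mode=caller&latency=1200|[f=mpegts:onfail=ignore]srt://backup.example.net:10000?mode=caller&latency=1200"
```

Because the encode happens once, both legs carry identical timestamps, which simplifies failover splicing at the origin.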

Recipe 4 — Multi‑destination live streaming site (site + socials)

  1. Ingest via SRT or WebRTC. Use a central origin to create ABR HLS/LL‑HLS outputs and to generate RTMP/RTMPS outputs for social destinations.
  2. Use a managed multi‑streaming solution to fan out RTMP to social endpoints and preserve low‑latency LL‑HLS for your site. See /products/multi-streaming.
  3. Monitor each social output separately for bitrate and disconnects; create automatic re-route rules if a social endpoint fails.

Practical configuration targets

Copy and tune these targets. They are intended to be conservative starting points.

Encoder / capture

  • Codec: H.264 (baseline compatibility) or HEVC/AV1 for closed environments. For browsers, H.264 remains the broadest compatibility.
  • Frame rate: keep the native frame rate. For low-latency sports, consider 50–60 fps if available and plan bitrate accordingly.
  • GOP / keyframe interval: 1–2 s. (Example: 30 fps → -g 30 for 1 s, 60 fps → -g 60 for 1 s.) Keyframes must align with segment boundaries.
  • x264 tuner: -tune zerolatency, preset veryfast or faster for CPU-limited encoders.
  • VBV / buffer sizing: set -maxrate and -bufsize to control bitrate spikes. Example: for 4.5 Mbps target use -maxrate 4950k -bufsize 9000k.
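
The GOP setting above is just frame rate times the segment-aligned GOP duration; deriving it mechanically avoids off-by-one mistakes when switching frame rates:

```shell
# Derive -g / -keyint_min from frame rate and desired GOP duration (seconds)
fps=30; gop_seconds=2
g=$((fps * gop_seconds))
echo "-g ${g} -keyint_min ${g}"   # prints: -g 60 -keyint_min 60
```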

Bitrate ladder (example)

  • 2160p60: 15–30 Mbps
  • 1080p60: 6–10 Mbps
  • 1080p30: 4.5–6 Mbps
  • 720p60: 3.5–5 Mbps
  • 720p30: 2.5–4 Mbps
  • 480p: 1–2 Mbps
  • 360p: 500–800 kbps
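
A hedged sketch producing three rungs of this ladder as files (bitrates taken from the 30 fps rows above; in ffmpeg, per-output options apply to the output file that follows them):

```shell
# One decode, three scaled encodes; -maxrate/-bufsize cap bitrate spikes
ffmpeg -i input.mp4 \
  -map 0 -vf scale=-2:1080 -c:v libx264 -b:v 5000k -maxrate 5500k -bufsize 10000k -c:a aac -b:a 128k 1080p.mp4 \
  -map 0 -vf scale=-2:720  -c:v libx264 -b:v 3000k -maxrate 3300k -bufsize 6000k  -c:a aac -b:a 128k 720p.mp4 \
  -map 0 -vf scale=-2:480  -c:v libx264 -b:v 1500k -maxrate 1650k -bufsize 3000k  -c:a aac -b:a 96k  480p.mp4
```

For live ABR you would feed these renditions into the packager rather than files, but the scaling and rate-control flags carry over unchanged.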

SRT transport parameters

  • latency: choose based on network stability: 120–200 ms (LAN), 200–800 ms (public internet), 800–2000 ms (mobile)
  • mode: caller vs listener depending on whether the encoder initiates the connection
  • pkt_size: default 1316 bytes is common; increase if your network and MTU allow it.
  • Encryption: enable AES encryption and rotate keys. Keep key exchange out-of-band.
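
The parameters above combine into a single SRT URL; a hedged example (hypothetical endpoint; all query options are standard libsrt URL parameters):

```shell
# pbkeylen=32 selects AES-256; the passphrase must be 10–79 characters
# and should be distributed out-of-band, never in a playbook or repo.
ffmpeg -re -i input.mp4 -c copy -f mpegts \
  "srt://ingest.example.net:10000?mode=caller&latency=800&pkt_size=1316&passphrase=CHANGE_ME_1234567&pbkeylen=32"
```

The receiver must be configured with the same passphrase and key length, or the handshake will fail.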

Packaging / segments

  • LL‑HLS segment target: 1–2 seconds
  • LL‑HLS part size: 200–300 ms (smaller parts increase HTTP request rate)
  • CMAF chunk: 250–500 ms
  • Playlist TTL / CDN caching: Cache-Control with short s-maxage (0–2 s) for playlists and parts
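
A quick way to verify the playlist caching target at the edge (CDN URL is a placeholder):

```shell
# Inspect caching headers on the live playlist; s-maxage should be 0–2 s
curl -sI "https://cdn.example.net/live/index.m3u8" | grep -iE '^(cache-control|age):'
# expect something like: Cache-Control: public, max-age=0, s-maxage=1
```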

Player

  • Target buffer: 0.5–2.0 s for LL‑HLS; 2–6 s for near‑real‑time or traditional HLS
  • ABR switching: align keyframes across renditions and prefer smooth upswitching (avoid aggressive rate jumps)

Limitations and trade-offs

Every low-latency decision has trade-offs. Document these with explicit monitoring and fallback plans.

  • Compression efficiency vs latency: Smaller GOPs and zerolatency tuning reduce compression efficiency. Expect 10–40% higher bitrates to maintain quality.
  • CDN load: LL‑HLS increases request rates at the edge because of smaller parts and more frequent playlist updates. Ensure your CDN supports HTTP/2 or HTTP/3 and configure request limits.
  • Operational complexity: WebRTC provides the lowest latency but requires SFU infrastructure, TURN, and per‑session state. SRT is point-to-point and simpler to scale for contribution, but still needs robust monitoring and failover.
  • Retransmit vs latency: Lower SRT latency reduces replay buffer for retransmits; packet loss will more directly impact video quality. If you need to prioritise reliability, increase the SRT latency.

Common mistakes and fixes

Fix these first when diagnosing playback or latency issues.

  • Misaligned keyframes
    • Symptom: ABR switches cause stalls or black frames.
    • Fix: Set encoder -g to match segment boundary and ensure -keyint_min is equal to -g. For example, for 2 s segments at 30 fps: -g 60 -keyint_min 60.
  • SRT latency set too low
    • Symptom: frequent retransmit or drop, stalls on variable mobile networks.
    • Fix: increase latency to 800–2000 ms for mobile; monitor SRT stats for packet loss & RTT.
  • Large HLS segments for low-latency use
    • Symptom: 10–30 s delay despite other low-latency pieces.
    • Fix: reduce segment duration to 1–2 s and enable partial segments (LL‑HLS).
  • Ignoring CDN configuration
    • Symptom: sudden stalls or >10 s additional delay at edge.
    • Fix: set short edge TTLs for manifests and parts, enable origin shielding to reduce origin load and verify HTTP/2 or HTTP/3 is supported for small object delivery.
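
To verify the keyframe-alignment fix above, count the frames between consecutive I-frames in a local capture of the stream (assumes ffprobe; `input.ts` is a placeholder recording):

```shell
# Print the distance (in frames) between consecutive keyframes;
# every printed value should equal your intended GOP (e.g. 60)
ffprobe -v error -select_streams v:0 -show_entries frame=pict_type -of csv=p=0 input.ts \
  | awk '/I/ { if (last) print NR - last; last = NR }'
```

Run it against each ABR rendition; any rendition whose values differ is the one causing stalls on switches.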

Rollout checklist

Follow this checklist before going live with a new live streaming site or workflow.

  1. Set your target end‑to‑end latency (sub‑500 ms, 0.5–3 s, 3–10 s). Document SLOs.
  2. Choose contribution protocol (SRT / WebRTC) and document failover rules.
  3. Define encoder profiles and align keyframe intervals across all outputs. Commit ffmpeg/hardware encoder configs to version control.
  4. Provision origin/transcoder cluster and packager. Verify hardware acceleration where needed.
  5. Configure CDN with short TTLs for playlists/parts; enable HTTP/2 or HTTP/3. Run synthetic tests from multiple regions.
  6. Implement monitoring: stream health, packet loss %, SRT RTT, CDN edge hit rates, player stall rate, startup time. Create alerts for threshold breaches.
  7. Execute progressive load tests: 10 users → 1k → target concurrency. Validate ABR switching and latency targets under load.
    • Record packet-level metrics during tests (pcap or SRT stats).
  8. Set rollback paths: increase player buffer, switch to higher latency SRT settings, or fall back to non-LL HLS during incidents.
  9. Document operational playbooks for encoder failures, origin failover, and CDN issues. Link job runbooks to your alerting system.

For on-premise options and self-hosted deployment patterns see /self-hosted-streaming-solution and consider marketplace appliances such as the one in AWS Marketplace for rapid testbeds.

Example architectures

Small live site (few thousand concurrent)

  • Field encoders → SRT ingest (single origin VM) → Transcoder with hardware GPU → LL‑HLS packager → CDN
  • Budget: segment 1–2 s, SRT latency 200–500 ms, player target buffer 1.5 s
  • Use /products/video-api to automate start/stop and ingest routing.

Large OTT / multi-region

  • Regional SRT edge ingests → regional transcoding clusters → origin replication → CDN global
  • Use a central control plane to issue packaged outputs for VOD and use /products/video-on-demand for archive workflows.
  • Set CDN configuration for LL‑HLS; origin shielding with regional caches is recommended.

Interactive production (Game shows / auctions)

  • Browsers / remote guests → regional WebRTC SFU cluster → mixing/recording → forward to origin via SRT for distribution
  • SFU autoscale by sessions; use TURN fallback and regional affinity for minimum RTT.

Troubleshooting quick wins

Fast checks you can run in the first 15 minutes.

  1. Verify encoder keyframes:
    • Run ffprobe -show_frames -select_streams v to confirm keyframe interval matches your segment duration.
  2. Check SRT stats:
    • Inspect SRT receiver stats for RTT, packet loss and retransmits. If loss >1–2% and latency <800 ms, increase latency.
  3. Measure CDN edge behavior:
    • Fetch playlists and parts from multiple regions; verify last segment / part timestamps and that TTLs are near zero for LL‑HLS.
  4. Player debug:
    • Enable player logs (many players expose a debug console) and look for buffer underrun events, manifest parse errors or ABR oscillation.
  5. Quick ffmpeg sanity check:
    ffmpeg -re -i source -c:v libx264 -preset veryfast -tune zerolatency -g 60 -keyint_min 60 -b:v 4500k -maxrate 4950k -bufsize 9000k -c:a aac -b:a 128k -f mpegts "srt://ingest.example.net:10000?mode=caller&latency=200"
            

Next step

If you’re building or optimizing a live streaming site, pick one of these immediate next steps:

  • Run a one‑stream PoC: push SRT from a field encoder to an origin, produce LL‑HLS outputs, and validate end‑to‑end latency and ABR switching. Use the setup guidance in /docs/srt-setup and /docs/ll-hls-guide.
  • If you need multi-destination (social + web) or programmatic control, evaluate a managed multi‑streaming route via /products/multi-streaming and integration with /products/video-api.
  • To archive and turn live into VOD reliably, wire outputs into your VOD pipeline described at /products/video-on-demand.

Operational help: if you run a self-hosted deployment, see /self-hosted-streaming-solution for patterns, and consider marketplace appliances such as the AWS Marketplace listing for rapidly provisioned environments.

If you want a short checklist to hand to your infra team, start with these four items: 1) define latency SLO, 2) align encoder keyframes and GOP to segments, 3) set SRT latency appropriate to network conditions, 4) configure CDN TTLs for manifests/parts. For architecture references and operational runbooks see /docs/latency-guidelines.