
Cast Software

Mar 09, 2026

This is a practical engineering guide for choosing and configuring cast software when your contribution and distribution path uses SRT. It covers definitions and thresholds, decision criteria, latency budgets, concrete recipes with ffmpeg examples, configuration targets (GOP, part size, buffer, bitrates), trade-offs, rollout checklists, and product mapping for deployment. For a deeper practical guide, see Callaba cloud vs self hosted: a practical pricing and operations guide.

What it means (definitions and thresholds)

"Cast software" here means the set of software components that take a live camera or encoder feed, secure and transport it to a cloud origin, transcode/pack it, and distribute it to viewers or endpoints. The common components are: encoder (hardware or software), contribution transport (SRT, RTMP, WebRTC), a cloud packager/transcoder, CDN/edge, and the player. SRT (Secure Reliable Transport) is typically used for the contribution leg (field encoder -> origin) for its reliability and configurable de-jitter buffer.

Latency thresholds you should care about (end-to-end, camera capture to viewer render):

  • Interactive / conversational: <500 ms — requires WebRTC or direct SFU-based paths and tight control over capture/encode/transport/decoding. Not a typical SRT use-case unless you control both edges and are on low-latency networks.
  • Ultra-low latency (UL): 500–1000 ms — achievable in tightly controlled environments with optimized encoders and WebRTC or carefully tuned SRT->WebRTC gateways.
  • Low latency (broadcast): 1–3 s — a pragmatic, reliable target using SRT for contribution plus CMAF LL-HLS or DASH-LL for distribution.
  • Conventional HLS/DASH: >3 s — broadly compatible but not suitable for low-latency interactivity.

Operational measure: when we say "SRT-based low-latency", expect realistic E2E targets of 1–3 seconds in production across the public Internet; sub-1s is possible only with tightly controlled networks, co-located processing, or WebRTC gateways.

For implementation details and SRT connection options, see the quick setup guide at /docs/srt-setup.

Decision guide

Choose the cast software pattern based on two questions: where does the feed originate (field encoder vs browser) and what is the viewer expectation (interactive vs broadcast). Use the following decision matrix.

  • Remote field encoders and cameras (hardware/OBS)
    • Primary transport: SRT. Use SRT when the encoder runs on an unstable network (cellular, public WAN) because ARQ gives packet recovery.
    • Choose cast software that supports SRT ingest, transcoding and CMAF packaging at the origin.
  • Browser-to-browser or many interactive participants
    • Primary transport: WebRTC. Use WebRTC when you need sub-second interaction or local interactivity among participants.
    • WebRTC scales via SFU architectures; you need cast software that integrates an SFU or gateway to convert SRT to WebRTC if you also have remote SRT contributors.
  • Mass scale broadcast with wide compatibility
    • Primary transport: CMAF LL-HLS/DASH for viewers, SRT for contribution. Cast software must repack to HLS/DASH and push to CDN.

Latency budget (architecture planning)

You must budget latency across capture, encode, transport, packager/CDN, and player. The following table-like guidance gives practical ranges; these are typical numbers you should measure in your environment.

  • Capture (camera, framegrab): 10–60 ms
  • Encoder (frame buffering, B-frames disabled): 20–300 ms
    • Software encoder (libx264, libx265): 50–300 ms depending on preset and CPU.
    • Hardware encoder: 20–120 ms.
  • Transport (SRT contribution): 100–800 ms depending on SRT "latency" setting and RTT.
  • Transcoding / packager: 50–800 ms (depends on VM instance size, number of renditions).
  • CDN edge + request overhead: 50–500 ms (smaller if using an originless edge).
  • Player buffer / decoding: 100–600 ms (configurable to trade startup time against stability).

Simple formula for end-to-end latency (E2E):

E2E ≈ capture + encode + transport + packager + CDN + player

Practical examples:

  • Target E2E ≈ 1,200 ms (1.2s): capture 30 + encode 150 + transport 250 + packager 300 + CDN 120 + player 350 = 1,200 ms.
  • Target E2E ≈ 2,500 ms (2.5s): capture 30 + encode 300 + transport 600 + packager 700 + CDN 200 + player 670 = 2,500 ms.
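
The budget arithmetic above is easy to script so you can plug in your own measured numbers. A minimal sketch (the values are the illustrative figures from the 1.2 s example, not measurements):

```python
# Sum per-component latencies (in ms) into an end-to-end estimate.
def e2e_latency_ms(capture, encode, transport, packager, cdn, player):
    """E2E ~ capture + encode + transport + packager + CDN + player."""
    return capture + encode + transport + packager + cdn + player

# Illustrative 1.2 s budget from the example above.
budget = dict(capture=30, encode=150, transport=250, packager=300, cdn=120, player=350)
print(e2e_latency_ms(**budget))  # 1200
```

Replace the dictionary values with your own measurements per stage and re-check the sum against your target before tuning individual components.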

How to pick SRT "latency" parameter: measure round-trip time (RTT) between encoder and origin. Set SRT latency to at least 3× RTT + 100 ms margin when you want reliable ARQ recovery under moderate jitter. For well-managed paths (RTT < 50 ms) you can try latency = 150–250 ms for lower E2E delay.
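
The rule of thumb above can be expressed directly; `srt_latency_ms` is a hypothetical helper name, and the 120 ms floor is an assumption chosen so ARQ always has a usable retransmit window:

```python
# SRT latency rule of thumb: at least 3 x RTT + 100 ms margin,
# clamped to a practical floor so ARQ has a retransmit window.
def srt_latency_ms(rtt_ms, margin_ms=100, floor_ms=120):
    return max(3 * rtt_ms + margin_ms, floor_ms)

print(srt_latency_ms(50))   # 250 -> matches the 150-250 ms range for well-managed paths
print(srt_latency_ms(200))  # 700 -> long-haul paths need a much larger window
```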

More on budgeting and measurement at /docs/latency-budget.

Practical recipes

Concrete recipes you can copy into your pipeline. All examples assume a 30 fps source unless noted and are focused on contribution with SRT.

Recipe A — Reliable broadcast: SRT ingest -> cloud transcoder -> CMAF LL-HLS -> CDN (target 1.5–3 s)

  1. Encoder settings (ffmpeg example for a 30 fps 720p feed):
    ffmpeg -re -i input -c:v libx264 -preset veryfast -tune zerolatency -g 60 -keyint_min 60 -sc_threshold 0 -b:v 3000k -maxrate 3300k -bufsize 6000k -c:a aac -b:a 128k -f mpegts "srt://origin.example.com:4200?pkt_size=1316&latency=250000"

    Notes:

    • -g 60 sets a 2 s GOP at 30 fps; for a 1 s GOP use -g 30.
    • -tune zerolatency disables B-frames and encoder lookahead buffering, which reduces encoder latency.
    • ffmpeg's srt protocol expects the latency URL option in microseconds (latency=250000 for 250 ms); srt-live-transmit and most hardware encoders take milliseconds.

  2. Origin/transcoder:
    • Accept SRT caller connections on a public IP and port (e.g. 4200). Use SRT passphrase (AES-128/256) for encryption.
    • Transcode to ABR ladder (for example, 1080p 5–6 Mbps, 720p 3–4 Mbps, 480p 1.2–1.8 Mbps, 360p 600–900 kbps).
    • Package to CMAF LL-HLS with part duration = 200–500 ms and segment duration = 2 s.
  3. CDN: Use a CDN that supports CMAF LL-HLS or use an edge repackager. Configure cache-control and low TTLs (e.g. 1–3 s) and ensure edge origin requests are fast.
  4. Player: Use native LL-HLS players or JS players that support CMAF partial segments (parts). Configure the initial buffer to 1–3 parts (200–500 ms per part) and add handling for rebuffer events.
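
A quick way to validate the Recipe A numbers is to check that the durations nest cleanly: the GOP must divide the segment evenly, and the segment must be a whole number of parts. A sketch with a hypothetical helper (values are Recipe A's: 30 fps, -g 60, 2 s segments, 0.5 s parts):

```python
# Check that GOP, segment, and part durations nest as LL-HLS packaging expects.
def check_alignment(fps, gop_frames, segment_s, part_s):
    gop_s = gop_frames / fps                  # GOP length in seconds
    seg_per_gop = segment_s / gop_s           # segments must contain whole GOPs
    parts_per_seg = segment_s / part_s        # segments must contain whole parts
    return seg_per_gop == int(seg_per_gop) and parts_per_seg == int(parts_per_seg)

print(check_alignment(fps=30, gop_frames=60, segment_s=2, part_s=0.5))  # True
```

Run this against your own encoder and packager settings before deploying; misalignment here is the root cause of mistake #1 in the common-mistakes list below.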

Recipe B — Ultra-low interactive path: SRT ingest -> WebRTC gateway (SFU) -> WebRTC clients (target <1 s in LAN or co-located)

  1. Use SRT to bring field encoder to a cloud gateway that can transcode SRT->RTP and inject into an SFU (e.g., mediasoup, Janus, or commercial SFU).
    • Set SRT latency = 120–200 ms if network RTT < 50 ms. If RTT > 50 ms, increase latency to 300–500 ms to preserve packet retransmit window.
  2. At the gateway, use a transcode profile with GOP <= 1 s and no B-frames. Use Opus audio at 48 kHz (24–64 kbps for speech, 96–128 kbps for music); WebRTC endpoints require Opus, so AAC contribution audio must be transcoded at the gateway.
  3. The SFU handles distribution to many WebRTC clients. Set per-client receive bandwidth ceilings and use simulcast/SVC where available to reduce CPU load.

Recipe C — Multi-destination social cast: SRT ingest -> origin -> multi-stream outputs (RTMP/RTMPS) -> social platforms

  1. Accept SRT on origin. Transcode to the social platform bitrates and push via RTMPS with stable keys.
    • Use CBR or constrained VBR tuned to the platform's required rate. For instance, target ~3000 kbps CBR for a 720p social stream and ~6000 kbps CBR for 1080p.
  2. When distribution to many social endpoints is required, use a multi-streaming product capability to fan-out rather than running many separate encoders at the edge. See /products/multi-streaming for platform mapping and workflow templates.

Recipe D — Back-to-back recordings and VOD: ingest via SRT -> record to CMAF fragments -> finalize VOD

  1. On ingest, write CMAF fragmented MP4 segments in parallel with live packaging to ensure lower latency VOD assembly and faster post-event availability.
  2. Use the VOD pipeline to transcode final renditions and store them with proper chaptering and thumbnails. See /products/video-on-demand for pipeline mapping.

Practical configuration targets

Below are actionable configuration targets you can apply quickly. Tweak them after measuring real RTT, jitter, and packet loss.

Encoder recommendations

  • GOP / keyframe interval: 1–2 s. Example: 30 fps -> -g 30 (1 s) or -g 60 (2 s).
  • B-frames: 0 for lowest latency; 0–1 if you can accept slightly higher latency.
  • Rate control: CBR or constrained-VBR with a bufsize = 2× to 4× bitrate (e.g., bitrate 3000k -> bufsize 6000k–12000k).
  • Preset/tune: x264 preset = veryfast or faster; -tune zerolatency to minimize frame reordering buffering.
  • Audio: 48 kHz, AAC-LC, 64–192 kbps depending on content (voice 64–96 kbps, music 128–192 kbps).
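
The bufsize recommendation above maps to a simple range you can script; `bufsize_range_kbps` is a hypothetical helper, and the 2x–4x bounds are the ones stated above:

```python
# bufsize = 2x to 4x bitrate; returns (min, max) in kbps so you can pick a
# value inside the range for the ffmpeg -bufsize flag.
def bufsize_range_kbps(bitrate_kbps, low=2, high=4):
    return bitrate_kbps * low, bitrate_kbps * high

print(bufsize_range_kbps(3000))  # (6000, 12000), i.e. -bufsize 6000k to 12000k
```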

SRT parameters

  • latency: 120–800 ms. Use at least 3× RTT + 100 ms margin when packet loss >1%.
  • mode: caller/listener/rendezvous — pick caller on the encoder when origin is reachable by public IP.
  • pkt_size: 1316 bytes for MPEG-TS payloads (7 × 188-byte TS packets); SRT's maximum payload is 1456 bytes on a standard 1500-byte MTU.
  • encryption: enable passphrase to use AES encryption (SRT supports built-in AES).
  • monitor: measure RTT and packet loss with SRT stats and log retransmit counts; adjust latency accordingly.
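
Assembling these parameters into a URL is error-prone, mostly because of the latency units: ffmpeg's srt protocol takes latency in microseconds, while srt-live-transmit and most hardware encoders take milliseconds. A hypothetical helper (not part of any SRT library) that builds an ffmpeg-style URL:

```python
from urllib.parse import urlencode

# Build an SRT URL for ffmpeg from the parameters above. Note the ms -> us
# conversion: ffmpeg's srt protocol expects latency in MICROSECONDS.
def srt_url(host, port, latency_ms, pkt_size=1316, passphrase=None):
    params = {"pkt_size": pkt_size, "latency": latency_ms * 1000}  # ms -> us
    if passphrase:
        params["passphrase"] = passphrase  # enables SRT's built-in AES encryption
    return f"srt://{host}:{port}?{urlencode(params)}"

print(srt_url("origin.example.com", 4200, latency_ms=250))
# srt://origin.example.com:4200?pkt_size=1316&latency=250000
```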

Packaging (CMAF LL-HLS) targets

  • part (partial segment) duration: 0.2–0.5 s (200–500 ms) for LL-HLS.
  • segment duration: 2 s (align segment duration to GOP length where possible).
  • playlist holdback: 3× part duration + safety margin. Example: parts 0.25 s -> holdback ≈ 0.75–1.0 s.
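
The holdback rule above is a one-liner; the 0.25 s default margin is an assumption consistent with the 0.75–1.0 s example range:

```python
# LL-HLS playlist holdback: 3x part duration plus a safety margin.
def holdback_s(part_s, margin_s=0.25):
    return 3 * part_s + margin_s

print(holdback_s(0.25))  # 1.0 -> within the 0.75-1.0 s range for 250 ms parts
```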

ABR ladder (example)

  • 1080p 30 fps — 4500–6000 kbps
  • 720p 30 fps — 2500–3500 kbps
  • 480p 30 fps — 1200–1800 kbps
  • 360p 30 fps — 600–900 kbps
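
When building a ladder like this, it helps to check that adjacent rungs keep a meaningful bitrate gap, since narrow gaps cause the ABR oscillation described in the troubleshooting section. A sketch using the top-of-range values from the list above; the 1.5x minimum ratio is an assumption, not a standard:

```python
# Example ABR ladder (name, top-of-range kbps) and a gap-ratio sanity check.
ladder = [
    ("1080p30", 6000),
    ("720p30", 3500),
    ("480p30", 1800),
    ("360p30", 900),
]

def gaps_ok(ladder, min_ratio=1.5):
    rates = [kbps for _, kbps in ladder]
    return all(hi / lo >= min_ratio for hi, lo in zip(rates, rates[1:]))

print(gaps_ok(ladder))  # True: ratios are ~1.71, ~1.94, and 2.0
```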

Limitations and trade-offs

Understand the trade-offs before optimizing for the lowest possible latency:

  • SRT is a contribution technology. It is not native in browsers, so for large-scale browser delivery you need a packager or a gateway that converts SRT to HLS/DASH or WebRTC. That conversion adds latency.
  • ARQ (SRT retransmissions) trades time for reliability. If you want ARQ to recover lost packets, you need a larger latency window — increasing your end-to-end delay.
  • Tightening encoder settings (shorter GOP, no B-frames, zerolatency tune) reduces compression efficiency; you will need more bitrate for the same visual quality.
  • Small part / segment sizes increase request overhead (more HTTP requests) and pressure on packagers and CDNs. Ensure your infrastructure can handle higher request rates.
  • Ultra-low-latency setups require stricter operational monitoring (RTT, jitter, packet loss). They are more brittle on mobile networks with high variance.

Common mistakes and fixes

  1. Mismatch between keyframe interval and HLS segment size
    • Fix: Make keyframe interval equal to or divide evenly into your segment duration. Example: 2 s segment -> GOP = 2 s (g=60 for 30 fps).
  2. SRT latency set too low for the observed RTT and jitter -> frequent retransmits and dropouts
    • Fix: Measure RTT, then set latency >= 3× RTT + 100 ms. Increase slowly and test under packet loss.
  3. Using B-frames with short GOPs -> decoder stalls on some players
    • Fix: Disable or limit B-frames for low-latency viewers (e.g. -bf 0 in ffmpeg, bframes=0 in x264 options).
  4. Too-large player buffer -> perceived high latency
    • Fix: Tune player startup and buffer targets. For LL-HLS aim for player buffer = 2–4 parts (e.g. 400–1600 ms depending on part size).
  5. Firewall blocks UDP / SRT ports
    • Fix: Ensure the UDP port is open and test with srt-live-transmit or an ffmpeg SRT URL. If one side is behind NAT, put the caller on that side (outbound UDP usually passes), or use rendezvous mode.

Rollout checklist

Use this checklist when you deploy cast software for SRT-based live streams:

  1. Baseline network tests between all encoders and origin: measure RTT, jitter, and packet loss over 1 minute and 1 hour windows.
  2. Set initial SRT latency using measured RTT (latency = 3× RTT + 100 ms) and test under simulated packet loss (1–5%).
  3. Configure the encoder GOP to match packager segment durations (the GOP equals the segment duration, or divides evenly into it).
  4. Deploy a small-scale staging run with real viewers and measure E2E latency per viewer and by CDN region.
  5. Instrument metrics: encode latency, SRT stats (RTT, packet loss, retransmits), packager latency, CDN edge latency, client startup time, rebuffer rate.
  6. Run failover tests: origin failure, network loss, encoder reconnect, and CDN region shift.
  7. Prepare fallback pathways: RTMP fallback for legacy encoders, and WebRTC fallback for interactive viewers if available.

Example architectures

Textual diagrams with example latency budgets, plus the product pages that map to each implementation.

Architecture 1 — Small event, single origin

Encoder (SRT caller, latency=250 ms) -> Origin Transcoder VM -> CMAF LL-HLS packager -> CDN edge -> Player

Example latency budget: capture 30 + encode 150 + SRT 250 + transcoder 300 + CDN 100 + player 200 = 1,030 ms (≈1.0 s under ideal network conditions). For a reliable public-Internet target of 1.5–3 s, increase SRT latency to 400–600 ms and tune CDN TTLs accordingly.

Architecture 2 — Interactive webinar with remote contributor

Remote contributor -> SRT to gateway -> WebRTC SFU -> WebRTC clients

Mapping: use /products/video-api to build programmable SRT ingestion and interface with an SFU for distribution. Where scalable WebRTC is required, run SFU clusters and use simulcast to reduce per-client CPU. Expect E2E 300–800 ms in good network conditions.

Architecture 3 — Mass scale broadcast + social

Multiple SRT encoders -> origin cluster -> transcoding farm -> CDN + RTMPS outputs to social (fan-out via multi-streaming)

Use /products/multi-streaming to manage social outputs and rate-limited pushes. For post-event assets link to /products/video-on-demand.

Troubleshooting quick wins

  • High packet loss on SRT: increase SRT latency and monitor retransmit counts; if packet loss > 5% consider a higher-latency but more reliable path or move to a managed contribution network.
  • Start-up delay too high in players: reduce player buffer and ensure packager exposes enough low-latency parts (e.g. 200 ms parts, and 2–3 part initial buffer).
  • Frequent ABR switches and quality oscillation: widen bitrate gaps in ABR ladder and set minimum switch duration to 2–5 s. Use constrained-VBR with proper bufsize.
  • Stuttering after CDN edge switch: verify segment alignment and consistent timestamps delivered by packager; ensure consistent codec parameters across renditions.
  • Encoders dropping frames: check CPU and I/O utilization, use hardware encoder if required, or lower resolution/bitrate.

Next step

If you want a practical deployment path for SRT-based casting workflows, pick what you need and follow the mapped product pages and documentation:

For configuration examples and deeper encoder/packager settings, consult the technical docs at /docs/srt-setup, /docs/encoding-guidelines, and /docs/latency-budget. When you're ready, run a staged test: one encoder, one origin, one CDN edge, and a small set of real viewers in each target region and iterate settings until your measured E2E latency and rebuffer rate meet the SLA.

If you'd like help mapping your specific event (bitrate, number of viewers, expected network conditions) to a cast software architecture and cost estimate, contact the team via the product pages above or evaluate the self-hosted listing in AWS Marketplace.