normalize media pipeline at client boundary

- AudioParams.framing field: client declares "raw" or "adts"
- Client strips ADTS from audio before sending (strip_adts)
- Client does H.264 NAL inspection for keyframe detection (h264_is_keyframe)
- Server uses declared sample_rate/channels for ADTS synthesis instead of hardcoded 48kHz/stereo
- Server gates ADTS wrapping on framing field instead of per-packet sniffing
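
The two client-side helpers named above could be sketched roughly as follows. This is a minimal illustration, not the code in this commit: the signatures are assumptions, strip_adts is assumed to receive exactly one frame per call, and h264_is_keyframe is assumed to receive an Annex B byte stream and to treat only IDR slices (NAL type 5) as keyframes.

```rust
/// Hypothetical signature: returns the AAC payload without its ADTS header,
/// or the input unchanged when no ADTS header is detected.
fn strip_adts(frame: &[u8]) -> &[u8] {
    // ADTS starts with a 12-bit syncword (0xFFF) followed by ID (1 bit),
    // layer (2 bits, always 0) and protection_absent (1 bit).
    if frame.len() < 7 || frame[0] != 0xFF || (frame[1] & 0xF6) != 0xF0 {
        return frame;
    }
    // protection_absent == 1 means no CRC: 7-byte header, otherwise 9 bytes.
    let header_len = if frame[1] & 0x01 == 1 { 7 } else { 9 };
    if frame.len() < header_len {
        return frame;
    }
    &frame[header_len..]
}

/// Hypothetical signature: scans an Annex B buffer for start codes and
/// reports whether any NAL unit is an IDR slice (nal_unit_type == 5).
fn h264_is_keyframe(data: &[u8]) -> bool {
    let mut i = 0;
    while i + 3 < data.len() {
        // A 4-byte start code (00 00 00 01) also ends in 00 00 01,
        // so matching the 3-byte form covers both.
        if data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1 {
            if data[i + 3] & 0x1F == 5 {
                return true;
            }
            i += 3;
        } else {
            i += 1;
        }
    }
    false
}
```

A payload that is already raw passes through strip_adts untouched, so the helper is safe to call on every audio packet regardless of what the backend emits.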

New backends only need to pipe their output to demux_and_send(); the server and Python code need no changes.
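
On the server side, ADTS synthesis from the declared parameters, gated on the framing field, might look like the sketch below. The function names adts_header and frame_for_output are hypothetical, and the sketch assumes AAC-LC, no CRC, and at most six channels; the actual server code may differ.

```rust
/// Sampling-frequency index table from the MPEG-4 audio specification.
const ADTS_SAMPLE_RATES: [u32; 13] = [
    96000, 88200, 64000, 48000, 44100, 32000, 24000, 22050, 16000, 12000, 11025, 8000, 7350,
];

/// Hypothetical helper: builds a 7-byte ADTS header (AAC-LC assumed, no CRC)
/// from the declared sample rate and channel count.
fn adts_header(sample_rate: u32, channels: u16, payload_len: usize) -> Option<[u8; 7]> {
    let freq_idx = ADTS_SAMPLE_RATES.iter().position(|&r| r == sample_rate)? as u8;
    if channels == 0 || channels > 6 {
        return None; // assumption: only channel configurations 1..=6 are handled
    }
    let chan = channels as u8;
    let frame_len = payload_len.checked_add(7)?;
    if frame_len > 0x1FFF {
        return None; // the frame length field is 13 bits wide
    }
    let profile: u8 = 1; // AAC-LC (audio object type 2, stored minus one)
    Some([
        0xFF,
        0xF1, // syncword low bits, MPEG-4, layer 0, protection_absent = 1
        (profile << 6) | (freq_idx << 2) | (chan >> 2),
        ((chan & 0x03) << 6) | ((frame_len >> 11) as u8 & 0x03),
        ((frame_len >> 3) & 0xFF) as u8,
        (((frame_len & 0x07) as u8) << 5) | 0x1F, // buffer fullness 0x7FF (high bits)
        0xFC, // buffer fullness (low bits), one raw data block per frame
    ])
}

/// Hypothetical helper: wraps only when the client declared "raw" framing,
/// passes "adts" through untouched, and rejects unknown framing values.
fn frame_for_output(framing: &str, sample_rate: u32, channels: u16, payload: &[u8]) -> Option<Vec<u8>> {
    match framing {
        "adts" => Some(payload.to_vec()),
        "raw" => {
            let header = adts_header(sample_rate, channels, payload.len())?;
            let mut out = Vec::with_capacity(header.len() + payload.len());
            out.extend_from_slice(&header);
            out.extend_from_slice(payload);
            Some(out)
        }
        _ => None,
    }
}
```

Deciding once from the declared framing, rather than sniffing each packet, avoids misclassifying a raw AAC payload that happens to begin with bytes resembling an ADTS syncword.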

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 13:51:11 -03:00
parent e92ab933ce
commit e9e1d14e6b
5 changed files with 102 additions and 20 deletions


@@ -166,6 +166,14 @@ pub struct AudioParams {
pub sample_rate: u32,
pub channels: u16,
pub codec: String,
/// Audio framing on the wire: "raw" (no container headers) or "adts".
/// Default "raw" — client strips ADTS before sending.
#[serde(default = "default_framing")]
pub framing: String,
}
fn default_framing() -> String {
"raw".into()
}
impl ControlMessage {
@@ -231,6 +239,7 @@ mod tests {
sample_rate: 48000,
channels: 2,
codec: "aac".into(),
framing: "raw".into(),
},
};
let wire = msg.to_wire_packet().unwrap();