comparison doc/HR-codec-library @ 633:3ab76caba41c

doc/HR-codec-library: document stateless utility functions
author Mychaela Falconia <falcon@freecalypso.org>
date Fri, 20 Mar 2026 02:17:55 +0000
parents 7fc57e2a6784
children 723265aea9f8
632:7fc57e2a6784 633:3ab76caba41c
246 functions in traditional TRAUs. This flag can be changed mid-session: as
247 explained in HR-codec-Rx-logic article, our implementation of TFO transform
248 proceeds by applying classic Rx front end processing that only emits speech
249 frames, and then replacing output with SID frames under certain conditions if
250 DTXd is enabled.
251
252 Stateless utility functions
253 ===========================
254
255 All functions in this section are stateless (no encoder, decoder or RxFE state
256 structure is needed); they merely manipulate data formats.
257
258 void gsmhr_pack_ts101318(const int16_t *param, uint8_t *payload);
259
260 This function converts a 112-bit GSM-HR codec frame from an array of speech
261 parameters (18 16-bit words) into the packed format of ETSI TS 101 318, which
262 is a buffer of 14 octets with every bit used for payload. Any extraneous bits
263 in input 16-bit words (beyond the size of each parameter in bits) are ignored.
264
265 void gsmhr_unpack_ts101318(const uint8_t *payload, int16_t *param);
266
267 This function converts a 112-bit GSM-HR codec frame from the packed format of
268 TS 101 318 into an array of 18 speech parameters.
269
270 void gsmhr_encoder_twts002_out(const int16_t *param, uint8_t *payload);
271
272 This function converts a cod-style frame (output from gsmhr_encode_frame() or
273 gsmhr_tfo_xfrm(), or read from an ETSI *.cod file) into TW-TS-002 format. The
274 output is always 15 octets long (the buffer must have this much room), and is
275 valid per both RFC 5993 and TW-TS-002 specs. The only two possible frame types
276 in this context are good speech and good SID, distinguished by SP flag in the
277 cod-style input and by FT field in RFC 5993 output.
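
The frame layout just described can be illustrated with a short sketch. This is
not the library's code: the helper name sketch_twts002_out() is hypothetical,
and it takes an already packed 14-octet TS 101 318 frame plus the SP flag as
separate arguments instead of a cod-style array. The FT values used (0 for good
speech, 2 for good SID) are those defined in RFC 5993.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of libgsmhr1: builds a 15-octet frame
 * from a packed 14-octet TS 101 318 frame and the SP flag. */
static void sketch_twts002_out(const uint8_t *ts101318, int sp, uint8_t *out)
{
    assert(ts101318 != NULL && out != NULL);
    /* ToC octet: F bit (0x80) clear, FT in bits 6..4, low 4 bits zero */
    out[0] = sp ? 0x00 : 0x20;  /* FT=0 good speech, FT=2 good SID */
    /* the 14 octets of frame payload follow the ToC octet */
    memcpy(out + 1, ts101318, 14);
}
```

The output buffer must have room for 15 octets, matching the requirement stated
for gsmhr_encoder_twts002_out().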
278
279 int gsmhr_decoder_twts002_in(const uint8_t *payload, int16_t *param);
280
281 This function reads a super-5993 frame in TW-TS-002 format from a buffer and
282 converts it into the required form for input to gsmhr_decode_frame() or
283 gsmhr_tfo_xfrm(), which is an extended form of ETSI's *.dec format. The input
284 must be a valid super-5993 in the following sense:
285
286 * The first octet in the buffer must be valid ToC per TW-TS-002 section 5.1;
287
288 * F bit in this ToC octet must be cleared;
289
290 * FT field must equal 0, 1, 2, 6 or 7 per TW-TS-002 section 5.2;
291
292 * If FT equals 0, 2 or 6, the ToC octet must be followed by 14 octets of frame
293 payload.
294
295 If any of these rules are violated, gsmhr_decoder_twts002_in() returns a
296 negative value (-1 if F bit is set or -2 if FT is invalid) and does not write
297 anything into the output array. Otherwise, the function returns 0 (indicating
298 success) and the output array is filled as follows:
299
300 * For frame types 0, 2 and 6, the 18 speech parameters are filled from the
301 TS-101-318-like payload portion of super-5993 input.
302
303 * For frame types 1 and 7, the 18 speech parameters are set to all zeros, with
304 the expectation that gsmhr_decode_frame() or gsmhr_tfo_xfrm() will ignore
305 them. Please note that "verbose" invalid SID bits that may be present in
306 TW-TS-002 transport are ignored.
307
308 * The 4 metadata flags BFI, UFI, SID and TAF are set based on FT and the
309 additional ToC flags defined in TW-TS-002 section 5.3.
310
311 * Themyscira extension of BFI=2, described earlier in this document, is used
312 to represent FT=7.
313
314 * Invalid SID frames (FT=1) are converted to BFI=1 SID=1.
315
316 int gsmhr_rtp_in_preen(const uint8_t *rtp_in, unsigned rtp_in_len,
317 uint8_t *canon_pl);
318
319 This function performs initial processing of RTP input that is expected to be
320 one of the defined RTP formats for GSM-HR codec. It accepts all possibilities
321 of TW-TS-002, RFC 5993 or TS 101 318 (listed in ThemWi order of preference) and
322 writes canonical TW-TS-002 super-5993 format into a buffer. The output buffer
323 must have 15 bytes of space, and the frame written into this buffer will ALWAYS
324 be a valid input to gsmhr_decoder_twts002_in() function described above.
325
326 The input arguments are RTP payload and its length. The return value is 0 if
327 RTP input was in a recognized format, or -1 if it is invalid. In the case of
328 invalid RTP input, the output is filled with ToC of 0x70 (BFI with no data) -
329 the output is always valid.
330
331 Zero-length RTP payloads are acceptable; if rtp_in_len is 0, then rtp_in pointer
332 may be NULL. The output in this case is filled with ToC of 0x70 (BFI with no
333 data), but the return value is 0, indicating success. The intent is that truly
334 invalid RTP payloads are error events which should be counted, while NULL input
335 is a normal occurrence when ThemWi jitter buffer (twjit) does not hold a
336 previously received RTP packet that maps to the current tick. (Actually
337 transmitted RTP packets with zero-length payloads are also possible: they are
338 the ThemWi-preferred alternative to the IETF approach of intentional gaps in the RTP
339 stream.)
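
The intended caller pattern (count invalid payloads as errors, treat empty
input as normal, use the output in every case) might look like the following
sketch. To keep it compilable without libgsmhr1, the preening function is
passed in as a pointer with the same signature as gsmhr_rtp_in_preen(); the
statistics structure, sketch_rx_tick() and the stand-in sketch_fake_preen() are
all hypothetical. The stand-in mimics only the documented 0x70 fill and return
values, and treats every non-empty payload as invalid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same signature as gsmhr_rtp_in_preen() */
typedef int (*preen_fn)(const uint8_t *rtp_in, unsigned rtp_in_len,
                        uint8_t *canon_pl);

/* Hypothetical per-stream counter, not a library structure */
struct sketch_rx_stats {
    unsigned invalid_payloads;
};

/* Stand-in for demonstration only: empty input yields 0x70 and success,
 * anything else is reported as invalid (the real function accepts more). */
static int sketch_fake_preen(const uint8_t *rtp_in, unsigned rtp_in_len,
                             uint8_t *canon_pl)
{
    (void) rtp_in;
    canon_pl[0] = 0x70;
    return rtp_in_len ? -1 : 0;
}

/* Called once per tick; rtp_in may be NULL when rtp_in_len is 0 */
static void sketch_rx_tick(preen_fn preen, const uint8_t *rtp_in,
                           unsigned rtp_in_len, uint8_t *canon_pl,
                           struct sketch_rx_stats *stats)
{
    assert(preen != NULL && canon_pl != NULL && stats != NULL);
    if (preen(rtp_in, rtp_in_len, canon_pl) < 0)
        stats->invalid_payloads++;  /* error event, but output is valid */
    /* canon_pl can be handed to the decoder input stage in either case */
}
```

In a real application the first argument would be gsmhr_rtp_in_preen itself,
and canon_pl would point to a 15-byte buffer.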
340
341 int gsmhr_rtp_in_direct(const uint8_t *rtp_in, unsigned rtp_in_len,
342 int16_t *param);
343
344 This function is fully equivalent to calling first gsmhr_rtp_in_preen(), then
345 gsmhr_decoder_twts002_in(). It is however slightly more efficient, as it avoids
346 the intermediate buffer and some copying. The return value is the same as
347 gsmhr_rtp_in_preen(), and just like with that function, the output is always
348 valid.
349
350 Reading *.cod and *.dec files
351 -----------------------------
352
353 The most native representation format for GSM-HR codec frames in libgsmhr1 is
354 arrays of broken-down speech parameters. However, unlike TS 101 318 format in
355 which every possible bit pattern is a plausible GSM-HR codec frame, an array of
356 broken-down parameters that purports to be a GSM-HR frame can contain garbage.
357 The additional metadata flags in the canonical decoder input format can also
358 contain garbage - which our speech decoder and TFO transform engines are NOT
359 prepared for! There is no potential for malfunction if these arrays of
360 parameters and metadata flags come only from libgsmhr1 functions - but if an
361 application needs to read *.cod or *.dec files, or otherwise accept external
362 input in any of these formats, then an explicit validation step is required.
363
364 int gsmhr_check_common_params(const int16_t *params);
365
366 This function examines an array of 18 codec parameters in the int16_t
367 representation used in this library, and checks if the unused upper bits of
368 each int16_t word are cleared as they should be. The return value is 0 if the
369 frame is valid or -1 if some extraneous high bits are set.
370
371 int gsmhr_check_encoder_params(const int16_t *params);
372
373 This function examines a frame of 20 int16_t words that corresponds to GSM-HR
374 encoder output format, and checks if the unused upper bits of each int16_t word
375 are cleared as they should be. This function should be used when reading from
376 ETSI-format *.cod files, to guard against reading garbage or data in the wrong byte order. The
377 return value is 0 if the frame is valid or -1 if some extraneous high bits are
378 set.
379
380 int gsmhr_check_decoder_params(const int16_t *params);
381
382 This function examines a frame of 22 int16_t words that corresponds to GSM-HR
383 decoder input format, and checks if the unused upper bits of each int16_t word
384 are cleared as they should be. This function should be used when reading from
385 ETSI-format *.dec files, to guard against reading garbage or data in the wrong byte order. The
386 return value is 0 if the frame is valid or -1 if some extraneous high bits are
387 set. Both BFI and SID words are limited to range [0,2], i.e., Themyscira BFI=2
388 extension is accepted.
389
390 SID field manipulation
391 ----------------------
392
393 Unlike FR and EFR, GSM-HR codec lacks fixed rules for Rx frame classification
394 as valid SID, invalid SID or non-SID speech. The BTS makes this classification
395 decision according to its internal private rules, and the SID flag then needs
396 to be carried out of band in Abis, Ater and TFO. GSM 08.61 and TW-TS-002
397 (extended 5993) formats provide the necessary out-of-band SID indication, but
398 the bare format of TS 101 318 does not. Therefore, the only kind of GSM-HR SID
399 that can be represented in TS 101 318 format is a perfect, 100% error-free SID
400 frame in which all 79 bits of the SID field are set to 1.
401
402 int gsmhr_ts101318_is_perfect_sid(const uint8_t *payload);
403
404 This function checks the given TS 101 318 payload for the possibility of
405 perfect SID. The return value is 2 (GSM 06.41 code for valid SID) if the frame
406 is indeed a perfect SID, or 0 (GSM 06.41 code for non-SID speech) otherwise.
407
408 void gsmhr_ts101318_set_sid_codeword(uint8_t *payload);
409
410 This function sets all 79 bits of the SID field to 1s, forming a perfect SID
411 frame in the 14-byte buffer. The first 33 bits that carry R0 and LPC parameters
412 must already be filled correctly.
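
Both packed-format operations can be sketched in a few lines. The sketch below
is not the library's code, and the function names are hypothetical. It assumes
that bits are packed MSB first, so that the first 33 bits occupy octets 0
through 3 plus the top bit of octet 4, and the 79-bit SID field occupies the
low 7 bits of octet 4 plus octets 5 through 13.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 2 (valid SID) if all 79 bits of the SID
 * field are 1, or 0 (non-SID speech) otherwise. */
static int sketch_is_perfect_sid(const uint8_t *payload)
{
    unsigned i;

    assert(payload != NULL);
    if ((payload[4] & 0x7F) != 0x7F)
        return 0;
    for (i = 5; i < 14; i++) {
        if (payload[i] != 0xFF)
            return 0;
    }
    return 2;
}

/* Hypothetical helper: sets the 79 SID field bits, leaving the first
 * 33 bits (R0 and LPC parameters) untouched. */
static void sketch_set_sid_codeword(uint8_t *payload)
{
    assert(payload != NULL);
    payload[4] |= 0x7F;             /* 7 bits */
    memset(payload + 5, 0xFF, 9);   /* 72 bits */
}
```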
413
414 void gsmhr_set_sid_cw_params(int16_t *params);
415
416 This function fills parameters 4 through 17 of generated SID frames, setting
417 them to the required SID codeword. It can also be used to transform a speech
418 frame into a SID frame with the same R0 and LPC parameters. It is logically
419 equivalent to gsmhr_ts101318_set_sid_codeword(), but operates on the
420 array-of-parameters form rather than the TS 101 318 packed format.