comparison doc/PCM8-conversions @ 235:0ee1a66c1846

doc/PCM8-conversions: beginning of document
author Mychaela Falconia <falcon@freecalypso.org>
date Mon, 08 May 2023 00:45:26 +0000

What is the authoritatively correct, officially endorsed bidirectional mapping
between G.711 A-law and mu-law encodings on one side and 16-bit 2's complement
linear PCM on the other side? Surprisingly, there is no official answer to this
problem anywhere in the specs! Instead the specs provide the following partial
answers:

* The G.711 spec itself provides one mapping from A-law code octets to linear
  numeric values in range [-4032,4032] and another mapping from mu-law code
  octets to linear numeric values in range [-8031,8031]. The output from each
  of these mappings is given in "pure mathematical" form, without specifying any
  bit-level encoding, and furthermore, mu-law decoder output in its pure
  "conceptual" form has both +0 and -0 values. (The same signed zero problem
  does not occur in A-law because it's a mid-riser code rather than mid-tread,
  and thus has no quantized values equal to 0.)

* If one takes the "pure mathematical" output from the spec-prescribed G.711
  decoder and represents it in 2's complement form, squashing +0 and -0 outputs
  from the canonical mu-law decoder into "plain 0" at this step, the result is
  a 13 bits wide 2's complement value for A-law decoding and a 14 bits wide 2's
  complement value for mu-law.

* All GSM speech encoders take 13-bit 2's complement linear PCM samples as their
  input. How should this 13-bit GSM codec input be derived from A-law or mu-law
  code octets? GSM specs refer to ITU's G.726 spec for ADPCM - it just so
  happens that inside the ADPCM algorithm of G.726 (a totally unrelated codec of
  no relevance to GSM codec work outside of this reference) there is a pair of
  functions for expanding A-law and mu-law to linear PCM and compressing linear
  PCM back to A-law or mu-law.

* Following this obscure G.726 reference, we eventually conclude that in the
  case of A-law, GSM specs call for the obvious treatment: take the "natural"
  output from the canonical A-law decoder, represent it in 2's complement form
  (the result is 13 bits wide), and just feed that 13-bit 2's complement form to
  the input of GSM speech encoders. However, in the case of mu-law the
  "natural" G.711 decoder output is one sign bit plus 13 bits of magnitude,
  requiring 14 bits in 2's complement representation - and none of the specs I
  could find says anything about exactly how this 14-bit input should be reduced
  to 13 bits for feeding to GSM speech encoders. Canonical C implementations
  of all GSM speech encoders take their input in 16-bit words and clear the 3
  least significant bits as their first step; if the 14-bit mu-law decoder
  output is represented in 16-bit words by padding 2 zero bits on the right and
  this output is then fed to GSM speech encoder functions, the end effect is
  that the least-significant bit of the 14-bit decoder output is simply cut off.
  This form of mu-law-to-GSM transcoder implementation is consistent with
  TESTx-U.INP and TESTx-U.COD sequences provided in the GSM 06.54 package for
  EFR.

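The bit arithmetic behind that last point can be illustrated with a small C
sketch. It is not taken from any GSM codec source: the function names
pad14_to_16() and clear_low3() are hypothetical, and clear_low3() is only a
stand-in for the "clear the 3 least significant bits" first step described
above. The sketch assumes the usual 2's complement behaviour when a 16-bit
unsigned value is converted back to int16_t.

```c
#include <stdint.h>

/* Hypothetical helper: place a 14-bit 2's complement sample into a 16-bit
 * word by padding 2 zero bits on the right (multiplying by 4). */
int16_t pad14_to_16(int v14)
{
    return (int16_t)(v14 * 4);
}

/* Hypothetical stand-in for the first step of a GSM speech encoder:
 * clear the 3 least significant bits of the 16-bit input word.
 * Assumes 2's complement conversion from uint16_t back to int16_t. */
int16_t clear_low3(int16_t s)
{
    return (int16_t)((uint16_t)s & 0xFFF8u);
}
```

For example, the 14-bit value -3 becomes -12 after padding and -16 after
clearing, which is the 14-bit value -4 (that is, -3 with its least-significant
bit cut off) in the same 16-bit alignment.
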
Based on the above considerations, we have our answer for how we should convert
from G.711 to 16-bit 2's complement linear PCM:

* For A-law, we emit the "natural" output in 13-bit 2's complement form and
  append 3 zero bits on the right; this transformation is fully lossless.

* For mu-law, we emit the "natural" output in 14-bit 2's complement form and
  append 2 zero bits on the right. This transformation is almost lossless,
  with just one exception: the "pure" decoder's -0 output (resulting from PCMU
  octet 0x7F) is squashed to "plain 0", and will be re-emitted as PCMU octet
  0xFF rather than 0x7F on subsequent re-encoding to G.711 PCMU.

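The two rules above can be sketched in C as follows. This is an illustrative
sketch only, not the code of any program in this package: the function names
alaw_to_linear16() and ulaw_to_linear16() are made up for this example, and the
arithmetic is the common segment-and-mantissa expansion of G.711, arranged so
that the A-law result is the 13-bit value with 3 zero bits appended and the
mu-law result is the 14-bit value with 2 zero bits appended.

```c
#include <stdint.h>

/* A-law octet -> 16-bit linear PCM (13-bit value, 3 zero bits appended).
 * Hypothetical function name. */
int16_t alaw_to_linear16(uint8_t octet)
{
    unsigned a = octet ^ 0x55;          /* undo the even-bit inversion */
    unsigned seg = (a >> 4) & 7;        /* 3-bit segment number */
    int mag = (int)(a & 0x0F) << 4;     /* 4-bit mantissa */

    if (seg == 0)
        mag += 8;                       /* mid-riser: no output equals 0 */
    else
        mag = (mag + 0x108) << (seg - 1);
    return (int16_t)((a & 0x80) ? mag : -mag);  /* sign bit set = positive */
}

/* mu-law octet -> 16-bit linear PCM (14-bit value, 2 zero bits appended).
 * Hypothetical function name. */
int16_t ulaw_to_linear16(uint8_t octet)
{
    unsigned u = octet ^ 0xFF;          /* mu-law octets are fully inverted */
    unsigned seg = (u >> 4) & 7;        /* 3-bit segment number */
    int mag = ((((int)(u & 0x0F)) << 3) + 0x84) << seg;

    mag -= 0x84;                        /* remove the bias again */
    /* After inversion, sign bit set = negative; -0 and +0 both become 0. */
    return (int16_t)((u & 0x80) ? -mag : mag);
}
```

With these functions, A-law octets 0xAA and 0x2A give +32256 and -32256
(4032 shifted left by 3), mu-law octets 0x80 and 0x00 give +32124 and -32124
(8031 shifted left by 2), and mu-law octets 0xFF and 0x7F both give 0.
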
For anyone needing a G.711 to 16-bit linear PCM decoder, the present package
provides ready-made decoding tables (following the above rules) in
dev/a2s-regen.out and dev/u2s-regen.out, generated by the dev/a2s-regen.c and
dev/u2s-regen.c programs.

Now for the opposite problem: what is the most correct way to compress 16-bit
2's complement linear PCM to A-law or mu-law? In this direction the official
specs leave even more ambiguity than in the G.711 decoding direction:

* The G.711 spec itself says: "The conversion to A-law or mu-law values from
  uniform PCM values corresponding to the decision values, is left to the
  individual equipment specification." The specific implementation used in the
  guts of the G.726 ADPCM codec is referred to only as a non-normative example.

* GSM specs likewise refer to this G.726 section 4.2.8 (for compression of
  13-bit speech decoder output to G.711) with language that suggests a
  non-normative example.

After painstakingly comparing the C implementation of G.726 in the ITU-T G.191
STL against the language of G.726 spec itself and convincing myself that they
really do match, and then painstakingly comparing this approach against the one
implemented in the same G.191 STL for G.711 in alaw_compress() and
ulaw_compress() and against the table lookup method implemented in libgsm/toast
(my first reference, before I went down the rabbit hole of tracking down
official specs), I reached the following conclusions:

* For A-law encoding all 3 parties (G.191 STL alaw_compress() function, G.726
  "compress" block and toast_alaw.c) agree on the same mapping. In this
  mapping only the most significant 12 bits of the 2's complement input word
  (equivalent to one sign bit and 11 bits of magnitude) are relevant, leading
  to the following two interesting properties:

  - the least-significant bit of GSM speech decoder output is always discarded
    when converting to A-law;

  - conversion can be easily implemented with a 4096-byte look-up table based
    on the upper 12 bits of input, exactly as was done in toast_alaw.c in the
    venerable libgsm source.

* Mu-law encoding is the real hair-raiser: if the input to the to-be-implemented
  encoder has 14 or more bits (including the most practical problem of 16-bit
  2's complement input), there are no fewer than 3 different ways to implement
  this encoder!

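Returning to the A-law case above, the 4096-byte look-up table approach can be
sketched in C as shown below. This is a minimal sketch under stated
assumptions, not a copy of toast_alaw.c or of the G.191 STL: all names
(alaw_enc_table, alaw_from_12bit(), alaw_enc_table_init(), linear16_to_alaw())
are hypothetical, the table is filled at run time rather than shipped as a
precomputed array, and the index layout (the upper 12 bits of the input word
taken as an unsigned number) is an assumption made for this example.

```c
#include <stdint.h>

static uint8_t alaw_enc_table[4096];    /* hypothetical name */

/* Compress one 12-bit signed value (range -2048..2047) to an A-law octet. */
static uint8_t alaw_from_12bit(int s12)
{
    /* Negative inputs use the 1's complement magnitude, giving 0..2047. */
    int ix = (s12 < 0) ? -s12 - 1 : s12;
    int iexp = 1;

    if (ix > 15) {                      /* segment 0 needs no further work */
        while (ix > 31) {               /* find segment number and mantissa */
            ix >>= 1;
            iexp++;
        }
        ix = (ix - 16) + (iexp << 4);   /* drop leading 1, add segment bits */
    }
    if (s12 >= 0)
        ix |= 0x80;                     /* sign bit set = positive */
    return (uint8_t)(ix ^ 0x55);        /* toggle even bits */
}

/* Fill the table once; index i is the upper 12 bits of the input word. */
void alaw_enc_table_init(void)
{
    for (int i = 0; i < 4096; i++)
        alaw_enc_table[i] = alaw_from_12bit(i < 2048 ? i : i - 4096);
}

/* 16-bit linear PCM -> A-law octet; the low 4 bits of the input are ignored. */
uint8_t linear16_to_alaw(int16_t sample)
{
    return alaw_enc_table[(uint16_t)sample >> 4];
}
```

After alaw_enc_table_init() has been called once, linear16_to_alaw(8) and
linear16_to_alaw(15) both return 0xD5, showing that the low 4 bits of the
16-bit input (and thus the least-significant bit of a 13-bit GSM decoder
output) have no effect on the result.
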