comparison doc/FR1-Rx-DTX @ 303:4034c2b06ec8

doc/FR1-Rx-DTX: update for libgsmfr2 and the new landscape
author Mychaela Falconia <falcon@freecalypso.org>
date Mon, 15 Apr 2024 22:07:00 +0000
parents 731c98b67da1
302:f469bad44c0e 303:4034c2b06ec8
24 SID flag, but it is determined from frame payload bits), and then the 24 SID flag, but it is determined from frame payload bits), and then the
25 interface from the Rx DTX handler to the GSM 06.10 decoder is another FR frame 25 interface from the Rx DTX handler to the GSM 06.10 decoder is another FR frame
26 of 260 bits. 26 of 260 bits.
27 27
28 What are the implications of this situation for the GSM published-source 28 What are the implications of this situation for the GSM published-source
29 software community? Prior to the present libgsmfrp offering, there has always 29 software community? Prior to the present Themyscira offering, there has always
30 been libgsm, but no Rx DTX handler. If you are working with a GSM uplink RTP 30 been libgsm, but no Rx DTX handler. If you are working with a GSM uplink RTP
31 stream from a BTS or a GSM downlink frame stream read out of TI Calypso DSP or 31 stream from a BTS or a GSM downlink frame stream read out of TI Calypso DSP or
32 some other GSM MS PHY, feeding that stream directly to libgsm (without passing 32 some other GSM MS PHY, feeding that stream directly to libgsm (without passing
33 through an Rx DTX handler) is NOT acceptable: a "bare" GSM 06.10 decoder won't 33 through an Rx DTX handler) is NOT acceptable: a "bare" GSM 06.10 decoder won't
34 recognize SID frames and won't produce the expected comfort noise output, and 34 recognize SID frames and won't produce the expected comfort noise output, and
39 will be garbage during those frame windows when no good frame was received; 39 will be garbage during those frame windows when no good frame was received;
40 feeding that garbage to libgsm produces noises that are very unkind to the ears. 40 feeding that garbage to libgsm produces noises that are very unkind to the ears.
41 41
42 The correct solution is to implement an Rx DTX handler, pass the stream of 42 The correct solution is to implement an Rx DTX handler, pass the stream of
43 frames and flags from the BTS or the MS PHY to this handler first, and then pass 43 frames and flags from the BTS or the MS PHY to this handler first, and then pass
44 the output of this handler to libgsm 06.10 decoder. Themyscira libgsmfrp is a 44 the output of this handler to the standard GSM 06.10 decoder (classic libgsm or
45 Free Software implementation of Rx DTX handler for GSM FR, implementing SID 45 some updated port thereof). Themyscira libgsmfrp was our first Free Software
46 classification, comfort noise generation and error concealment. 46 implementation of Rx DTX handler for GSM-FR, implementing SID classification,
47 comfort noise generation and error concealment. Our new libgsmfr2 offering
48 takes the harmonization effort (between GSM-FR and other GSM codecs) one step
49 further, eliminating the dependency on old libgsm and putting all GSM-FR codec
50 functions "under one roof".
51
52 libgsmfrp/libgsmfr2 API documentation
53 =====================================
54
55 The Rx DTX component of libgsmfr2 has the same API as our previous libgsmfrp,
56 except for dropping the use of <gsm.h> and its types and needing to include our
57 new API header <tw_gsmfr.h>. The present article previously contained the full
58 description of this API; that description has now been moved to the
59 FR1-library-API article, where the whole of libgsmfr2 is documented.
60
61 Standalone exerciser utility
62 ============================
63
64 The present GSM codec libraries and utilities package includes a standalone
65 utility that exercises our Rx DTX handler for GSM-FR. This utility is
66 gsmfr-preproc, to be run as follows:
67
68 gsmfr-preproc input.gsmx output.gsm
69
70 The input is an extended-libgsm file that can contain SIDs and BFI frame gaps
71 in addition to regular GSM 06.10 speech frames (see the Binary-file-format article);
72 the output is GSM 06.10 speech frames only.
73
74 False SID detection
75 ===================
76
77 The intent of GSM-FR spec authors was that the sets of possible speech frames
78 and possible SID frames be disjoint. Prior to introduction of DTX, there were
79 only regular speech frames per GSM 06.10, no SID, and a receiver had to deal
80 with only two possibilities: either a good speech frame was received, or the
81 frame was lost to radio errors or FACCH stealing (unusable frame). When SID
82 frames were introduced for the purpose of intentional DTX as distinct from
83 radio errors, the intent was that SID was to be a "new animal" not seen before,
84 distinct from regular speech frames. There is, however, a small blemish in the
85 actual system as realized: if the SID frame detector and the Rx DTX handler
86 that follows it in the Rx chain follow the rules of GSM 06.31 sections 6.1.1
87 and 6.1.2, respectively (like our implementation does), then some speech frames
88 may be mistaken for invalid SID, or perhaps even for valid SID, producing a
89 nonzero failure rate in this mechanism.
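
The classification rule discussed here can be illustrated with a small sketch. It assumes the thresholds as this article describes them: the SID codeword is 95 bit positions (all in the xMc part of the frame) that should all be zero, a frame deviating from it in 16 or more positions is a speech frame (SID=0), a frame deviating in at most 1 position is a valid SID (SID=2), and anything in between is an invalid SID (SID=1). The enum and function names are hypothetical and are not part of libgsmfrp or libgsmfr2; extraction of the 95 bits from the frame is deliberately left out.

```c
#include <assert.h>

/* Hypothetical names; the numeric values follow the SID=0/1/2
 * convention used in this article. */
enum sid_class {
    SID_CLASS_SPEECH  = 0,  /* SID=0: treated as a regular speech frame */
    SID_CLASS_INVALID = 1,  /* SID=1: invalid SID */
    SID_CLASS_VALID   = 2   /* SID=2: valid SID */
};

#define SID_CODEWORD_BITS 95

/* Count how many of the 95 codeword bit positions are nonzero.
 * Assumption: the caller has already pulled those bits out of the
 * xMc fields into an array of 0/1 values. */
static int sid_count_deviations(const unsigned char bits[SID_CODEWORD_BITS])
{
    int i, n = 0;

    for (i = 0; i < SID_CODEWORD_BITS; i++)
        n += (bits[i] != 0);
    return n;
}

/* Map a deviation count to a class using the thresholds above. */
static enum sid_class sid_classify_by_deviations(int deviations)
{
    assert(deviations >= 0 && deviations <= SID_CODEWORD_BITS);
    if (deviations >= 16)
        return SID_CLASS_SPEECH;
    if (deviations >= 2)
        return SID_CLASS_INVALID;
    return SID_CLASS_VALID;
}
```

Seen this way, the blemish is easy to state: nothing prevents a legitimate speech encoder from emitting a frame with 15 or fewer nonzero bits in those 95 positions, and such a frame lands in one of the two SID classes.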
90
91 Official test sequence 02 in the set of 5 provided by ETSI exhibits this effect:
92 Seq02.inp is a legitimate 13-bit linear PCM input to the speech encoder, and the
93 corresponding output of GSM 06.10 encoder is contained in Seq02.cod. However,
94 that output contains some frames that are mistakenly classified as SID=1
95 (invalid SID) by the rules of GSM 06.31 section 6.1.1! It is true that these
96 ancient test sequences chronologically predate the invention of DTX and
97 GSM 06.31, but we still need to bear in mind that this problematic Seq02.cod is
98 not an artificially constructed sequence of 06.10 codec parameters: it is the
99 required output of the prescribed bit-exact encoder given a legitimate PCM
100 input! There does not exist a perfect solution to this problem: as usual,
101 real-world engineering is all about trade-offs and compromises, and occasionally
102 a gear will slip. The best we can do is to model the probability of such
103 gear-slip or wrong detection events, and engineer our systems to reduce this
104 probability to a level that is deemed acceptable - which is exactly what GSM
105 spec designers did here.
106
107 As of gsm-codec-lib-r3, the gsmrec-dump utility shows the SID classification
108 result (GSM 06.31 section 6.1.1) in addition to parsed 06.10 codec parameters for
109 each frame; thus one can inspect FR-encoded streams and check for this blemish.
47 110
48 Effect of extra preprocessing 111 Effect of extra preprocessing
49 ============================= 112 =============================
50 113
51 One key detail deserves extra emphasis before going into library API details: 114 What will happen if the output of our Rx DTX preprocessor (e.g., the output of
52 if the input to libgsmfrp consists entirely of good speech frames (no SID frames 115 gsmfr-preproc utility) is fed to another utility such as gsmfr-decode that also
53 and no BFIs), then the preprocessor becomes an identity transform. Therefore, 116 applies the same preprocessor to its input? In other words, what is the effect
54 if the output of our libgsmfrp preprocessor were to be fed to an additional 117 of a secondary preprocessor application to previous preprocessor output?
55 instance of the same further down the processing chain, no extra transformation
56 of any kind will happen.
57 118
58 Using libgsmfrp 119 Most of the time, the second preprocessor pass will be an identity transform
59 =============== 120 under these conditions, as the input to that second pass will consist entirely
60 121 of good speech frames, no SIDs and no BFIs. Any speech frames in the original
61 The external public interface to Themyscira libgsmfrp consists of a single 122 input that were mistakenly classified as SID (valid or invalid) have already
62 header file <gsm_fr_preproc.h>; it should be installed in the same system 123 been converted to comfort noise (or to the silence frame in one corner case of
63 include directory as <gsm.h> from libgsm. Please note that <gsm_fr_preproc.h> 124 invalid SID), hence they are no longer present in the output to trigger this
64 includes <gsm.h>, as needed for gsm_byte and gsm_frame defined types. 125 effect a second time. However, there is still a small possibility that a
65 126 second pass will be a non-identity transform: pseudorandom RPE pulse parameters
66 The dialect of C we chose for libgsmfrp is ANSI C (function prototypes), const 127 in our comfort noise output are uniformly distributed between 1 and 6 (GSM 06.12
67 qualifier is used where appropriate; however, unlike libgsmefr, the interface 128 section 6.1), and if PRNG dice roll such that at least 80 out of 95 SID codeword
68 to libgsmfrp is defined in terms of gsm_byte type defined in <gsm.h>, included 129 bit positions (all in the xMc part of the frame) are all zeros, the resulting
69 from <gsm_fr_preproc.h>. 130 CN frame will be liable to misinterpretation as SID (invalid SID most of the
70 131 time, or even more rarely valid SID if at least 94 out of 95 SID codeword bit
71 State allocation and freeing 132 positions are all zeros) if fed to the preprocessor a second time. That second
72 ============================ 133 pass would then further alter those affected frames, but no others.
73
74 The Rx DTX handler is stateful, hence you will need to allocate a preprocessor
75 state structure in addition to the usual libgsm state structure for your GSM FR
76 Rx session. The necessary function is:
77
78 extern struct gsmfr_preproc_state *gsmfr_preproc_create(void);
79
80 struct gsmfr_preproc_state is an opaque structure to library users: you only get
81 a pointer which you remember and pass around, but <gsm_fr_preproc.h> does not
82 give you a full definition of this struct. As a library user, you don't even
83 get to know the size of this struct, hence the necessary malloc() operation
84 happens inside gsmfr_preproc_create(). However, the structure is malloc'ed as
85 a single chunk, hence when you are done with it, simply call free() on the
86 pointer you got from gsmfr_preproc_create().
87
88 gsmfr_preproc_create() can fail if the malloc() call inside fails, in which case
89 it returns NULL.
90
91 Preprocessing good frames
92 =========================
93
94 For every good traffic frame (BFI=0) you receive from the radio subsystem, you
95 need to call this preprocessor function:
96
97 extern void gsmfr_preproc_good_frame(struct gsmfr_preproc_state *state,
98 gsm_byte *frame);
99
100 The second argument is both input and output, i.e., the frame is modified in
101 place. If the received frame is not SID (specifically, if the SID field
102 deviates from the SID codeword by 16 or more bits, per GSM 06.31 section 6.1.1),
103 then the frame (considered a good speech frame) will be left unmodified (i.e.,
104 it is to be passed unchanged to the GSM 06.10 decoder), but preprocessor state
105 will be updated. OTOH, if the received frame is classified as either valid or
106 invalid SID per GSM 06.31, then the output frame will contain comfort noise
107 generated by the preprocessor using a PRNG, or a silence frame in one particular
108 corner case.
109
110 GSM-FR RTP (or libgsm) 0xD magic: the upper nibble of the first byte can be
111 anything on input to gsmfr_preproc_good_frame(), but the output frame will
112 always have the correct magic in it.
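
At the byte level, "correct magic" means that the upper nibble of the first byte is 0xD while the lower nibble, which carries payload bits, is untouched. A minimal sketch of that operation follows; the helper names are hypothetical, and plain unsigned char stands in for gsm_byte so that the fragment does not depend on <gsm.h>.

```c
#include <assert.h>
#include <stddef.h>

#define FR_MAGIC_NIBBLE 0xD0    /* 0xD in the upper nibble of byte 0 */

/* Hypothetical helper: force the signature, keep the payload nibble. */
static void fr_set_magic(unsigned char *frame)
{
    assert(frame != NULL);
    frame[0] = (unsigned char) ((frame[0] & 0x0F) | FR_MAGIC_NIBBLE);
}

/* Hypothetical helper: returns 1 if the signature is present. */
static int fr_has_magic(const unsigned char *frame)
{
    assert(frame != NULL);
    return (frame[0] & 0xF0) == FR_MAGIC_NIBBLE;
}
```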
113
114 Handling BFI conditions
115 =======================
116
117 If you received a lost/missing frame indication instead of a good traffic frame,
118 call this preprocessor function:
119
120 extern void gsmfr_preproc_bfi(struct gsmfr_preproc_state *state, int taf,
121 gsm_byte *frame_out);
122
123 TAF is a flag defined in GSM 06.31 section 6.1.1; if you don't have this flag,
124 pass 0 - you will lose the function of comfort noise muting in the event of
125 prolonged SID loss, but all other Rx DTX functions will still work the same.
126
127 With this function the 33-byte frame buffer is only an output, i.e., prior
128 buffer content is a don't-care and there is no provision for making any use of
129 erroneous frames like in EFR. The frame generated by the preprocessor may be
130 substitution/muting, comfort noise or silence depending on the state.
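
The two calls described so far combine into a per-frame dispatch step, sketched below. The prototypes are the ones quoted in this article, but everything between the STAND-INS markers is a dummy that exists only so the fragment compiles and runs without the library: with libgsmfrp installed, that part would be replaced by including <gsm_fr_preproc.h>. The rx_frame_input structure and the rx_dtx_step() function are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FR_FRAME_BYTES 33       /* RTP-format GSM-FR frame size */

/* ---- STAND-INS, not the real library ------------------------------
 * Dummy bodies only; they do NOT perform real Rx DTX processing. */
typedef unsigned char gsm_byte;
struct gsmfr_preproc_state { int dummy; };

struct gsmfr_preproc_state *gsmfr_preproc_create(void)
{
    return calloc(1, sizeof(struct gsmfr_preproc_state));
}

void gsmfr_preproc_good_frame(struct gsmfr_preproc_state *state,
                              gsm_byte *frame)
{
    (void) state;
    frame[0] = (frame[0] & 0x0F) | 0xD0;    /* dummy: only fixes magic */
}

void gsmfr_preproc_bfi(struct gsmfr_preproc_state *state, int taf,
                       gsm_byte *frame_out)
{
    (void) state;
    (void) taf;
    memset(frame_out, 0, FR_FRAME_BYTES);   /* dummy, NOT real muting */
    frame_out[0] = 0xD0;
}
/* ---- end of stand-ins ---------------------------------------------- */

/* Hypothetical container for what the radio side delivers per frame. */
struct rx_frame_input {
    int bfi;                        /* 1 = lost/missing frame */
    int taf;                        /* pass 0 if not available */
    gsm_byte data[FR_FRAME_BYTES];  /* meaningful only when bfi == 0 */
};

/* One Rx DTX step: 'out' always ends up holding a frame that can be
 * handed to the GSM 06.10 decoder. */
static void rx_dtx_step(struct gsmfr_preproc_state *st,
                        const struct rx_frame_input *in, gsm_byte *out)
{
    assert(st != NULL && in != NULL && out != NULL);
    if (in->bfi) {
        /* buffer is output only; prior content is a don't-care */
        gsmfr_preproc_bfi(st, in->taf, out);
    } else {
        /* good_frame works in place, so copy the input first */
        memcpy(out, in->data, FR_FRAME_BYTES);
        gsmfr_preproc_good_frame(st, out);
    }
}
```

The state pointer comes from gsmfr_preproc_create() (check for NULL) and is released with free() when the Rx session ends, as described under "State allocation and freeing".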
131
132 Other miscellaneous functions
133 =============================
134
135 extern void gsmfr_preproc_reset(struct gsmfr_preproc_state *state);
136
137 This function resets the preprocessor state to what it is right out of
138 gsmfr_preproc_create(), which is naturally just a combination of malloc() and
139 gsmfr_preproc_reset(). Given that our Rx DTX handler state is much simpler
140 than, for example, EFR codec state, there does not seem to be any need for
141 explicit resets, but the reset function is made public for the sake of
142 completeness.
143
144 extern int gsmfr_preproc_sid_classify(const gsm_byte *frame);
145
146 This function analyzes an RTP-encoded FR frame (the upper nibble of the first
147 byte is NOT checked for 0xD signature) for the SID codeword of GSM 06.12 and
148 classifies the frame as SID=0, SID=1 or SID=2 per the rules of GSM 06.31
149 section 6.1.1.
150
151 Silence frame datum
152 ===================
153
154 extern const gsm_frame gsmfr_preproc_silence_frame;
155
156 Many implementors make the mistake of thinking that a GSM FR silence frame is a
157 frame of 260 zero bits, but the official specs disagree: the silence frame given
158 in GSM 06.11 (3GPP TS 46.011, at the very end of the spec) is quite different.
159 Themyscira libgsmfrp implements the correct silence frame per the spec, and that
160 datum is also made public.
161
162 libgsmfrp change history: version 1.0.1 to version 1.0.2
163 ========================================================
164
165 There are only two changes, both involving corner cases with invalid SID frames
166 being received:
167
168 1) An invalid SID frame was received immediately following a good speech frame.
169 In this case we start CN generation, but we take the needed LARc and Xmaxc
170 parameters from the last speech frame, instead of the usual procedure of
171 extracting them from a valid SID frame. The change from 1.0.1 to 1.0.2
172 concerns the Xmaxc parameter in this corner case: in 1.0.1 we took Xmaxc
173 from the last subframe and used it for ensuing CN generation, but in 1.0.2
174 we compute a more proper mean Xmaxc from all 4 subframes, by dequantizing,
175 summing and requantizing.
176
177 2) An invalid SID frame was received in the speech muting state. The sequence
178 of inputs would have to be:
179
180 - a good speech frame;
181 - one or more BFIs, but not too many, so that the cached speech frame
182 does not decay fully by Xmaxc reduction;
183 - an invalid SID frame.
184
185 In version 1.0.1 we handled this even more obscure corner case by entering
186 the CN muting state, i.e., the state that is normally entered upon the
187 second lost SID. In version 1.0.2 we ignore invalid SID in the speech
188 muting state and act as if we got BFI, i.e., continue speech muting rather
189 than switch to CN muting.
190
191 libgsmfrp change history: version 1.0.0 to version 1.0.1
192 ========================================================
193
194 Version 1.0.0 exhibited the following defects, which are fixed in 1.0.1:
195
196 1) The last received valid SID was cached forever for the purpose of
197 handling future invalid SIDs - we could have received some valid
198 SID ages ago, then lots of speech or NO_DATA, and if we then get
199 an invalid SID, we would resurrect the last valid SID from ancient
200 history - a bad design. In our new design, we handle invalid SID
201 based on the current state, much like BFI.
202
203 2) GSM 06.11 spec says clearly that after the second lost SID
204 (received BFI=1 && TAF=1 in CN state) we need to gradually decrease
205 the output level, rather than jump directly to emitting silence
206 frames - we previously failed to implement such logic.
207
208 3) Per GSM 06.12 section 5.2, Xmaxc should be the same in all 4 subframes
209 in a SID frame. What should we do if we receive an otherwise valid
210 SID frame with different Xmaxc? Our previous approach would
211 replicate this Xmaxc oddity in every subsequent generated CN frame,
212 which is rather bad. In our new design, the very first CN frame
213 (which can be seen as a transformation of the SID frame itself)
214 retains the original 4 distinct Xmaxc, but all subsequent CN frames
215 are based on the Xmaxc from the last subframe of the most recent SID.