comparison doc/HR-codec-Rx-logic @ 632:7fc57e2a6784
beginning of GSM-HR documentation
| author | Mychaela Falconia <falcon@freecalypso.org> |
|---|---|
| date | Thu, 19 Mar 2026 04:13:45 +0000 |
| 631:6bad9af66f69 | 632:7fc57e2a6784 |
|---|---|
| 1 Rx DTX handler logic for GSM-HR speech codec | |
| 2 ============================================ | |
| 3 | |
| 4 With all 3 classic GSM speech codecs (FR, HR and EFR), as TCH UL Rx traffic on | |
| 5 the network side passes from the BTS to the TRAU, the first processing step | |
| 6 performed by the TRAU prior to actual speech decoding is an Rx DTX handler. | |
| 7 (For TCH DL Rx on the mobile side, exactly the same processing steps happen in | |
| 8 total, but because everything is integrated into a single device, interfaces | |
| 9 between steps may be implemented more loosely.) | |
| 10 | |
| 11 For the GSM-HR codec, the 3 controlling specs for different parts of the Rx DTX | |
| 12 handler logic are GSM 06.21, GSM 06.22 and GSM 06.41 - however, for the full details | |
| 13 these specs defer to the reference C code in GSM 06.06. This article explains | |
| 14 this logic from all aspects which we find important: what the Rx DTX logic was | |
| 15 in the original reference code from ETSI and how we adapted it in libgsmhr1, | |
| 16 both for the full speech decoder and for our implementation of TFO transform. | |
| 17 | |
| 18 Normative vs freely changeable aspects | |
| 19 ====================================== | |
| 20 | |
| 21 In the case of error-free transmission, such that the receiver never encounters | |
| 22 a frame with BFI or UFI set except during continuation of a DTX pause (after | |
| 23 receiving a valid SID that begins comfort noise insertion) and is never asked | |
| 24 to begin CN insertion with an invalid SID, the full behaviour of the speech | |
| 25 decoder to the final linear PCM output is required to be bit-exact and gets | |
| 26 exercised by test sequences. This bit-exact behaviour includes non-error- | |
| 27 handling aspects of the Rx DTX handler and comfort noise generation, complete | |
| 28 with interpolation for periodic CN updates via subsequent SID frames. | |
| 29 | |
| 30 However, the reference C implementation becomes a non-normative example | |
| 31 (allowing changes in logic without violating spec requirements) in the | |
| 32 following aspects: | |
| 33 | |
| 34 * Handling of BFI and UFI outside of DTX pauses previously entered via a valid | |
| 35 SID, including most aspects of error concealment; | |
| 36 | |
| 37 * Exact manner of comfort noise muting when expected SID updates fail to arrive; | |
| 38 | |
| 39 * The exact logic to be applied when a CN insertion period begins with an | |
| 40 invalid SID frame. | |
| 41 | |
| 42 Almost-modular nature of GSM-HR Rx DTX handler | |
| 43 ============================================== | |
| 44 | |
| 45 An Rx DTX handler can be considered fully modular if its output (which is then | |
| 46 passed as input to the main body of the speech decoder) is a potentially | |
| 47 modified set of speech parameters that can be packed into a new speech frame | |
| 48 and transmitted through a second radio leg with no change in the final output | |
| 49 of the speech decoder. The Rx DTX handler implemented in the reference code | |
| 50 from ETSI (both spec-normative and "example" aspects as broken down above) | |
| 51 _almost_ meets this modularity criterion, but not fully. The following aspects | |
| 52 are non-modular: | |
| 53 | |
| 54 * The interpolation of R0 and LPC parameters during comfort noise insertion | |
| 55 (bit-exact implementation considered normative) happens after expansion of | |
| 56 transmitted parameter bits into linear form. In the general case one cannot | |
| 57 produce a new set of encoded parameters (that can be transmitted through a | |
| 58 second radio leg) that will produce the same bit-exact result upon final | |
| 59 decoding. | |
| 60 | |
| 61 * Handling of speech frames (not SID, outside of DTX pause state) that are | |
| 62 marked with BFI=0 and UFI=1 (unreliable frames) has both a modular and a | |
| 63 non-modular aspect. If R0 increment is either small enough to not trigger | |
| 64 any mitigation or large enough that UFI is converted into BFI, the applied | |
| 65 handling is fully modular. However, if R0 increment falls into the narrow | |
| 66 window between the two thresholds, the applied handling (output signal | |
| 67 concealment per GSM 06.21 section 5.1.2) is non-modular: it happens deep in | |
| 68 the guts of the speech decoder and cannot be represented via a modified set | |
| 69 of speech parameters. | |
| 70 | |
| 71 TFO transform derived from the reference Rx DTX handler | |
| 72 ======================================================= | |
| 73 | |
| 74 If one extracts the reference Rx DTX handler from GSM 06.06 code and removes | |
| 75 the two non-modular aspects detailed above, leaving only fully modular logic, | |
| 76 the result can be used as a TFO transform that implements the functions of | |
| 77 TS 28.062 section C.3.2.1.1, specifically Case 1 in which UL may have DTX, but | |
| 78 DL is required to consist of speech frames only. | |
| 79 | |
| 80 How does one address the two non-modular aspects of the standard GSM-HR Rx DTX | |
| 81 handler that are not possible in TFO? The simplest implementation is to remove | |
| 82 them altogether: | |
| 83 | |
| 84 * Comfort noise parameters are not interpolated; instead, an abrupt change in R0 | |
| 85 and LPC parameters occurs every 240 ms when a new SID frame arrives. | |
| 86 | |
| 87 * UFI is simply dropped in the case when the standard decoder would apply output | |
| 88 signal concealment, i.e., the latter feature is given up. | |
| 89 | |
| 90 Obviously this approach constitutes a functional regression relative to the | |
| 91 standard speech decoder - thus we were initially hesitant to adopt it. However, | |
| 92 experiments with a real historical TRAU that supports TFO (Nokia TCSM2) reveal | |
| 93 that Nokia implemented exactly the same approach (minimal complexity at the | |
| 94 price of slight functional degradation) in their TRAU DSP firmware. Seeing | |
| 95 that a major classic vendor of GSM infrastructure implemented this simplistic | |
| 96 approach, we are now comfortable with doing the same - especially considering | |
| 97 the work scope limits explained in the HR-codec-limits article. | |
| 98 | |
| 99 In Themyscira libgsmhr1 implementation, a component has been factored out which | |
| 100 we call the Rx front end (RxFE). This RxFE is our cleaned-up reimplementation | |
| 101 of those parts of the original Rx DTX handler that are fully modular (including | |
| 102 the speech ECU and all CN parameters that aren't interpolated), plus some | |
| 103 additional internal flag inputs and outputs. Out of the latter internal flags, | |
| 104 some are used only by the full speech decoder, while others are used only by | |
| 105 the TFO transform. RxFE state, which also serves as the API-visible TFO | |
| 106 transform state, is a subset of full speech decoder state. However, the core | |
| 107 RxFE function is not exported directly as API; instead the TFO transform API | |
| 108 function is a TFO-specific wrapper around the RxFE. | |
| 109 | |
| 110 Detailed RxFE logic and its evolution | |
| 111 ===================================== | |
| 112 | |
| 113 Now that we have covered the background of the previous sections, we can | |
| 114 properly examine the actual logic of our RxFE, the follow-up logic for CN | |
| 115 interpolation that exists only in the full decoder, and their origins in the | |
| 116 reference GSM 06.06 code. | |
| 117 | |
| 118 Unless noted otherwise, all logic described in the following sections is the | |
| 119 same between ETSI original and the present Themyscira implementation. The | |
| 120 internal representation and code structure may be different, but the behavioral | |
| 121 logic remains the same unless explicitly called out otherwise. | |
| 122 | |
| 123 Input frame classification | |
| 124 -------------------------- | |
| 125 | |
| 126 As the very first processing step for every incoming frame, BFI, UFI and SID | |
| 127 flags are combined per GSM 06.41 Table 1 to classify the frame as good speech, | |
| 128 valid SID, invalid SID or unusable for DTX purposes. Note that UFI turns valid | |
| 129 SID into invalid just like BFI, and for DTX purposes all non-SID frames marked | |
| 130 with UFI are considered "unusable". But as we shall see shortly, this | |
| 131 "unusable" classification matters only for DTX and not for speech ECU logic, | |
| 132 which is separate. | |
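The classification step described above can be sketched in C as follows. This is an illustration only, not the libgsmhr1 code: the enum and function names are hypothetical, and the 3-valued SID input (0 = not a SID frame, 1 = damaged SID codeword, 2 = clean SID codeword) is an assumption about how the SID flag is encoded, not something stated in this article.

```c
/* Frame classes for DTX purposes (hypothetical names). */
enum frame_class {
    FRAME_GOOD_SPEECH,
    FRAME_VALID_SID,
    FRAME_INVALID_SID,
    FRAME_UNUSABLE
};

/*
 * Combine BFI, UFI and SID flags into a DTX frame class.
 * Assumption: sid is 0 (not SID), 1 (damaged SID) or 2 (clean SID).
 * bfi is an int so that a nonzero value other than 1 still counts as bad.
 */
enum frame_class classify_frame(int bfi, int ufi, int sid)
{
    int bad = (bfi != 0) || (ufi != 0);  /* UFI counts like BFI here */

    if (sid == 0)
        return bad ? FRAME_UNUSABLE : FRAME_GOOD_SPEECH;
    if (sid == 2 && !bad)
        return FRAME_VALID_SID;
    /* Damaged SID codeword, or a clean one spoiled by BFI or UFI */
    return FRAME_INVALID_SID;
}
```

For example, under these assumptions classify_frame(0, 1, 2) yields FRAME_INVALID_SID, and classify_frame(0, 1, 0) yields FRAME_UNUSABLE.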
| 133 | |
| 134 Speech vs CNI state | |
| 135 ------------------- | |
| 136 | |
| 137 RxFE state that carries from one frame to the next includes one very important | |
| 138 two-state flag: either speech or CNI (comfort noise insertion) mode. By | |
| 139 combining the 4 possible frame classifications from GSM 06.41 Table 1 (see | |
| 140 above) with these two possible carry-over states, we get 4 possible ways in | |
| 141 which the current frame may be handled: | |
| 142 | |
| 143 Input frame class          Previously speech    Previously CNI | |
| 144 -------------------------------------------------------------- | |
| 145 SID (valid or invalid)     CNIFIRSTSID          CNICONT | |
| 146 Good speech                SPEECH               SPEECH | |
| 147 Unusable                   SPEECH               CNIBFI | |
| 148 | |
| 149 Here we can see that unless we enter DTX/CNI state, neither BFI nor UFI moves | |
| 150 RxFE logic out of SPEECH handling. This SPEECH handling mode includes the ECU | |
| 151 and handles both good and bad speech frames. However, once DTX/CNI state has | |
| 152 been entered, then only a (BFI==0 && UFI==0 && SID==0) good speech frame can | |
| 153 effect exit from this state! | |
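The mode selection table above maps directly onto a small switch statement. The following is a minimal sketch with hypothetical type and function names; it only restates the table and is not taken from the libgsmhr1 source.

```c
#include <stdbool.h>

enum frame_class {
    FRAME_GOOD_SPEECH,
    FRAME_VALID_SID,
    FRAME_INVALID_SID,
    FRAME_UNUSABLE
};

enum handling_mode {
    MODE_SPEECH,
    MODE_CNIFIRSTSID,
    MODE_CNICONT,
    MODE_CNIBFI
};

/* Pick the handling mode from the frame class and the carried-over state. */
enum handling_mode select_mode(enum frame_class fc, bool prev_cni)
{
    switch (fc) {
    case FRAME_VALID_SID:
    case FRAME_INVALID_SID:
        return prev_cni ? MODE_CNICONT : MODE_CNIFIRSTSID;
    case FRAME_UNUSABLE:
        /* BFI/UFI alone never moves us out of speech handling */
        return prev_cni ? MODE_CNIBFI : MODE_SPEECH;
    case FRAME_GOOD_SPEECH:
    default:
        return MODE_SPEECH;
    }
}

/* The state carried to the next frame: CNI unless handled as speech. */
bool next_state_is_cni(enum handling_mode mode)
{
    return mode != MODE_SPEECH;
}
```

Note how the only path from CNI back to speech in this sketch is FRAME_GOOD_SPEECH, matching the exit condition stated above.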
| 154 | |
| 155 Speech ECU logic | |
| 156 ================ | |
| 157 | |
| 158 The frame-to-frame persistent state for the ECU consists of the state counter | |
| 159 variable (range [0,7]) described in GSM 06.21 section 6.3 and a saved copy of | |
| 160 the last good speech frame. The just-referenced spec section describes the | |
| 161 logic quite well, but a few additional notes are in order: | |
| 162 | |
| 163 * The last good speech frame that gets regurgitated in substitution/muting | |
| 164 states of the ECU is not exactly the same as the actual last good speech frame | |
| 165 that went through: | |
| 166 | |
| 167 + GSP0 parameters for the first 3 subframes are replaced with GSP0 parameter | |
| 168 for the last subframe; | |
| 169 | |
| 170 + If the frame is voiced, LTP lag parameters are modified - read the code for | |
| 171 the details. | |
| 172 | |
| 173 In the original ETSI implementation, these modifications are applied at the | |
| 174 time of substitution/muting output; in our implementation, they are applied | |
| 175 at the time when a good speech frame is saved. Our implementation approach | |
| 176 makes it clearer what state is actually retained, but the functional behaviour | |
| 177 is exactly the same. | |
| 178 | |
| 179 * When that last good speech frame gets regurgitated during bad frame handling, | |
| 180 codevector parameters may be taken either from that saved last good speech | |
| 181 frame or from the current bad frame. Use of codevector parameters from the | |
| 182 current bad frame is possible only when the current bad frame and the saved | |
| 183 last good speech frame have the same voiced vs unvoiced mode. If this mode | |
| 184 matches for one frame and bad-frame codevector parameters get passed on, but | |
| 185 the next bad frame has incompatible mode, the saved last good speech frame | |
| 186 gets used in its entirety once again, subject only to the modifications | |
| 187 described above. | |
| 188 | |
| 189 * Our Themyscira version features an extension: if BFI equals 2 instead of 1, | |
| 190 indicating BFI without payload bits, then there are no bad-frame codevector | |
| 191 parameters and the saved last good speech frame is used in its entirety, | |
| 192 just as if BFI frames always have the wrong voiced vs unvoiced mode. | |
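The choice of codevector source described in the last two notes can be sketched as below. All names are hypothetical. The sketch assumes that "same voiced vs unvoiced mode" means both frames fall in the same class (Mode 0 being unvoiced, nonzero modes voiced) rather than requiring identical mode values; consult the actual code before relying on that reading.

```c
#include <stdbool.h>

enum cv_source {
    CV_FROM_SAVED_FRAME,  /* saved last good speech frame, used whole */
    CV_FROM_BAD_FRAME     /* codevector params of the current bad frame */
};

/* GSM-HR voicing mode 0 is unvoiced; modes 1..3 are voiced. */
static bool is_voiced(int mode)
{
    return mode != 0;
}

/*
 * Decide where codevector parameters come from while the ECU is
 * substituting or muting.  Call only for frames handled as bad.
 * bfi == 2 is the Themyscira extension: BFI without payload bits.
 */
enum cv_source pick_codevector_source(int bfi, int bad_frame_mode,
                                      int saved_frame_mode)
{
    if (bfi == 2)
        return CV_FROM_SAVED_FRAME;  /* no payload to borrow from */
    if (is_voiced(bad_frame_mode) == is_voiced(saved_frame_mode))
        return CV_FROM_BAD_FRAME;
    return CV_FROM_SAVED_FRAME;
}
```

Because the decision is made anew for every bad frame, a later bad frame with an incompatible mode falls back to the saved frame, as described above.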
| 193 | |
| 194 BFI out of reset | |
| 195 ================ | |
| 196 | |
| 197 What happens if the very first input frame in reset state (after external reset | |
| 198 or after a decoder homing frame) is a bad frame per BFI, or per UFI treated as | |
| 199 BFI - what is the default "last" good speech frame? In ETSI original code it | |
| 200 is a frame of all zero parameters, but this oddity is not readily visible - the | |
| 201 final output of linear PCM is also all zeros, and all is well. In Themyscira | |
| 202 implementation, the output of our RxFE may be visible externally if it is used | |
| 203 as a TFO transform - hence more attention was given to this issue. | |
| 204 | |
| 205 If we feed all zeros as PCM input to a homed standard GSM-HR speech encoder, we | |
| 206 get this frame, repeating endlessly as long as all-zeros PCM input continues: | |
| 207 | |
| 208 R0=00 LPC=164,171,cb Int=0 Mode=0 | |
| 209 s1=00,00,00 s2=00,00,00 s3=00,00,00 s4=00,00,00 | |
| 210 | |
| 211 This frame differs from all-zero params only in the LPC set, and this sane-LPC | |
| 212 silence frame is the one we have adopted as our reset-default fallback frame. | |
| 213 | |
| 214 When libgsmhr1 full speech decoder engine is used, as opposed to TFO transform, | |
| 215 there is an additional check. If the current state is the special home state | |
| 216 (logic required for spec-mandated EHF output with repeated DHF input) and the | |
| 217 input frame has BFI flag set (no other flags are considered in this case), the | |
| 218 PCM output is set to all zero samples without leaving the home state. However, | |
| 219 the regular speech ECU and its last good frame default can still be reached if | |
| 220 BFI is clear, UFI is set and R0 is high. | |
| 221 | |
| 222 Comfort noise logic in RxFE | |
| 223 =========================== | |
| 224 | |
| 225 GSM 06.22 spec treats the required bit-exact CN generator as a single entity - | |
| 226 however, in our implementation it is split between the RxFE and the main body | |
| 227 of the full speech decoder. The bit-exact result in the case of full speech | |
| 228 decoding remains the same, but our arrangement allows non-interpolated CN | |
| 229 generation in the TFO transform as well. | |
| 230 | |
| 231 When our RxFE is used as a TFO transform with DTXd=0 (the mode that includes CN | |
| 232 generation), CN output from the transform matches GSM 06.22 Table 2, with the | |
| 233 exception of R0 and LPC parameters. These R0 and LPC parameters will be filled | |
| 234 as follows: | |
| 235 | |
| 236 * If CN insertion period begins with a valid SID, R0 and LPC are taken from | |
| 237 that SID. | |
| 238 | |
| 239 * If CN insertion period begins with an invalid SID, R0 and LPC are taken from | |
| 240 the last good speech frame, the one used by the speech ECU. Directly out of | |
| 241 reset (or after a DHF), these parameters are as shown above: | |
| 242 | |
| 243 R0=00 LPC=164,171,cb | |
| 244 | |
| 245 * Any time a new valid SID frame arrives during a CN insertion period, R0 and | |
| 246 LPC parameters change to this new SID. | |
| 247 | |
| 248 * Any time the input during CN insertion is either an unusable frame or an | |
| 249 invalid SID, R0 and LPC parameters remain unchanged from the most recently | |
| 250 received valid SID, or from the last good speech frame if only invalid SID | |
| 251 frames have been received in the entire CN insertion period so far. | |
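The four rules above amount to a small update function over the stored R0 and LPC values. The sketch below uses hypothetical struct and function names and deliberately leaves out CN muting, which modifies R0 separately; it is meant only to restate the rules, not to mirror the libgsmhr1 internals.

```c
#include <stdbool.h>
#include <stdint.h>

/* R0 plus the three LPC codewords of a GSM-HR frame. */
struct r0_lpc {
    uint8_t  r0;
    uint16_t lpc[3];
};

enum handling_mode {
    MODE_SPEECH,
    MODE_CNIFIRSTSID,
    MODE_CNICONT,
    MODE_CNIBFI
};

/* Reset-default values quoted in this article: R0=00 LPC=164,171,cb */
const struct r0_lpc reset_default = { 0x00, { 0x164, 0x171, 0x0cb } };

/*
 * Update the R0/LPC set used for non-interpolated CN output.
 * sid_frame is only meaningful when the current frame is a SID.
 */
void cn_update_r0_lpc(struct r0_lpc *cn, enum handling_mode mode,
                      bool sid_valid,
                      const struct r0_lpc *sid_frame,
                      const struct r0_lpc *last_good_speech)
{
    switch (mode) {
    case MODE_CNIFIRSTSID:
        /* Start of CN insertion: valid SID wins, else fall back */
        *cn = sid_valid ? *sid_frame : *last_good_speech;
        break;
    case MODE_CNICONT:
        if (sid_valid)
            *cn = *sid_frame;  /* abrupt change, no interpolation */
        break;
    case MODE_CNIBFI:
    case MODE_SPEECH:
        break;                 /* parameters stay as they are */
    }
}
```

Directly out of reset, last_good_speech would point at reset_default, which gives the R0=00 LPC=164,171,cb output mentioned above.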
| 252 | |
| 253 Comfort noise muting | |
| 254 ==================== | |
| 255 | |
| 256 Per GSM 06.21 sections 5.2.3 and 5.2.4, when SID frames fail to arrive for 3 | |
| 257 consecutive TAF positions, generated comfort noise needs to be muted. We | |
| 258 implement this logic in our RxFE, and the actual logic is unchanged from ETSI | |
| 259 reference code - it is described in GSM 06.21 section 6.4. | |
| 260 | |
| 261 This SID aging and CN muting logic works by counting unusable frames received | |
| 262 in between SID updates. In the original GSM 06.06 code the criterion to start | |
| 263 CN muting is: | |
| 264 | |
| 265 TAF == 1 && CNIBFI_count >= 25 | |
| 266 | |
| 267 In our version we changed it to: | |
| 268 | |
| 269 CNIBFI_count >= (TAF ? 25 : 36) | |
| 270 | |
| 271 When TAF is indicated correctly, once every 12 frames and with the flag always | |
| 272 present at least in BFI frames (consider GSM 08.61 TRAU-8k format), our extended | |
| 273 criterion is equivalent to the original; however, our version will also produce | |
| 274 eventual CN muting if TAF is missing. | |
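The equivalence claim can be checked with a small simulation. The two predicates below are the criteria quoted above; the helper first_mute_count() is a hypothetical test harness that assumes the counter reads 1 on the first unusable frame and that TAF recurs at a fixed phase every 12 frames. Those counting conventions are assumptions made for illustration and may not match the exact bookkeeping in the real code.

```c
#include <stdbool.h>

/* Original GSM 06.06 criterion, as quoted above. */
bool cn_mute_original(bool taf, unsigned cnibfi_count)
{
    return taf && cnibfi_count >= 25;
}

/* Extended libgsmhr1 criterion, as quoted above. */
bool cn_mute_extended(bool taf, unsigned cnibfi_count)
{
    return cnibfi_count >= (taf ? 25u : 36u);
}

/*
 * Feed a run of unusable frames and return the counter value at which
 * muting first starts, or 0 if it never starts within max_frames.
 * Frames whose counter value is congruent to taf_phase mod 12 carry
 * TAF=1; taf_present=false models a TAF flag that never arrives.
 */
unsigned first_mute_count(bool extended, bool taf_present,
                          unsigned taf_phase, unsigned max_frames)
{
    for (unsigned count = 1; count <= max_frames; count++) {
        bool taf = taf_present && (count % 12u == taf_phase % 12u);
        bool mute = extended ? cn_mute_extended(taf, count)
                             : cn_mute_original(taf, count);
        if (mute)
            return count;
    }
    return 0;
}
```

With TAF present, both criteria fire on the same frame for every phase, because any 12 consecutive counter values from 25 through 36 contain exactly one TAF position. With TAF absent, the original criterion never fires, while the extended one fires at a count of 36.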
| 275 | |
| 276 For the purpose of this logic, invalid SID is as good as valid: while it is | |
| 277 treated just like unusable frames (CNIBFI) for the purpose of R0 and LPC | |
| 278 parameters and their interpolation (see next section), for the purpose of SID | |
| 279 aging and CN muting, invalid SID resets the count of unusable frames, and if | |
| 280 muting already started previously, it is halted at the current (partially muted) | |
| 281 R0 value. | |
| 282 | |
| 283 Comfort noise interpolation | |
| 284 =========================== | |
| 285 | |
| 286 When our RxFE is invoked internally by our full speech decoder, the RxFE passes | |
| 287 some additional flags to the main body of the decoder. One of these flags | |
| 288 controls interpolation of R0 and LPC parameters for CNI, a function that is | |
| 289 required by the specs with bit-exact stipulation, but which cannot be | |
| 290 implemented at the level of speech parameters. | |
| 291 | |
| 292 The only case in which the behaviour of our libgsmhr1 full speech decoder | |
| 293 differs from ETSI original is when an invalid SID frame arrives immediately out | |
| 294 of reset, not preceded by any good speech, valid SID or even unusable frames. | |
| 295 In this case the original GSM 06.06 code uses initialized all-zero state of | |
| 296 pswOldFrmKsDec[] array, which cannot happen in any other case. In our | |
| 297 implementation we use LPC=164,171,cb instead, as already explained. | |
| 298 | |
| 299 Outside of this corner case, invalid SID frames are handled as follows | |
| 300 (unchanged between ETSI original and our version): | |
| 301 | |
| 302 * If CN insertion period begins with an invalid SID, R0 and LPC are taken from | |
| 303 the last good speech frame, the one used by the speech ECU. These R0 and LPC | |
| 304 params are then fed into the prescribed bit-exact interpolation mechanism as | |
| 305 if CN insertion started with a valid SID frame with these parameters. | |
| 306 | |
| 307 * Any invalid SID frames that occur in the middle of a CN insertion period are | |
| 308 treated just like unusable frames for the purpose of interpolation. | |
| 309 | |
| 310 Return from CN insertion to speech state | |
| 311 ======================================== | |
| 312 | |
| 313 Exit from DTX/CNI state happens upon receipt of a good speech frame, i.e., a | |
| 314 frame that meets this criterion: | |
| 315 | |
| 316 BFI == 0 && UFI == 0 && SID == 0 | |
| 317 | |
| 318 However, the original implementation in GSM 06.06 reference code exhibits this | |
| 319 flaw: if the speech ECU is in state 6 (see GSM 06.21 section 6.3) and then an | |
| 320 accepted SID frame (valid or invalid) puts us into DTX state, the first good | |
| 321 speech frame after this DTX pause will be dropped and replaced with a fully muted | |
| 322 form of the last good speech frame from before the CN insertion period. This | |
| 323 effect happens no matter how long that DTX pause was - thus the last good speech | |
| 324 frame being regurgitated (with R0 reduced to 0) may be indefinitely old and out | |
| 325 of place. Furthermore, if the CNI-exiting good speech frame that is dropped | |
| 326 here is followed by BFI unusable frames, the ECU will return to state 6 and the | |
| 327 parameters (other than muted R0) of the last good speech frame from before the | |
| 328 DTX pause will continue being reused indefinitely. | |
| 329 | |
| 330 In our libgsmhr1 version, the state counter for the speech ECU is reset to 7 | |
| 331 (the initial home state) whenever our RxFE passes through DTX/CNI state. Since | |
| 332 only a good speech frame with BFI=0 and UFI=0 can effect exit from CN insertion | |
| 333 state, this reset of ECU state ensures that this good speech frame will pass | |
| 334 through, and then the ECU will be in state 0 after this talkspurt-opening good | |
| 335 speech frame. | |
| 336 | |
| 337 Fully muted state after unusable frames in input | |
| 338 ================================================ | |
| 339 | |
| 340 If the input to the speech decoder or TFO transform becomes nothing but BFI | |
| 341 unusable frames, what is the final fully muted or "decayed" output at the level | |
| 342 of modified speech parameters? In GSM-FR codec there is a special silence frame | |
| 343 defined in GSM 06.11 Table 1, and the final decayed state is a continuous output | |
| 344 of these fixed silence frames - irrespective of whether the Rx DTX handler got | |
| 345 to this fully decayed state from speech or CN muting. | |
| 346 | |
| 347 However, no equivalent fully decayed state with fixed output is defined for | |
| 348 GSM-HR. While this aspect is a non-normative "example" implementation detail, | |
| 349 in both GSM 06.06 reference code and Themyscira libgsmhr1 the fundamental state | |
| 350 of speech vs CNI persists indefinitely even when fully muted: | |
| 351 | |
| 352 * If an indefinitely long string of unusable frames occurs in speech state, | |
| 353 the speech ECU will be in state 6, and the output from the RxFE (externally | |
| 354 visible in the case of TFO) will endlessly repeat parameters of the last good | |
| 355 speech frame, except for R0 reduced to 0. | |
| 356 | |
| 357 * If an indefinitely long string of unusable frames occurs in DTX/CNI state, | |
| 358 the output form shown in GSM 06.22 Table 2, complete with bit-exact | |
| 359 pseudorandom sequence in unvoiced codevector parameters, will likewise | |
| 360 continue indefinitely. LPC parameters will remain from the most recently | |
| 361 received valid SID frame (or from the last good speech frame if CNI period | |
| 362 began with invalid SID and no valid SID was received afterward), but R0 will | |
| 363 be reduced to 0 by the CN muting logic. | |
| 364 | |
| 365 Because R0 is reduced to 0 in both cases, the above details are generally | |
| 366 invisible with full endpoint speech decoding. However, they become fully | |
| 367 visible in the case of TFO transform with DTXd=0. | |
| 368 | |
| 369 TFO transform with DTXd=1 | |
| 370 ========================= | |
| 371 | |
| 372 The internal RxFE block that emits CN parameters during DTX/CNI state is correct | |
| 373 for the full endpoint speech decoder application and for TFO transform with | |
| 374 DTXd=0. The case of TFO transform with DTXd=1 is implemented by calling the | |
| 375 same RxFE block, then applying this simple modification to its output: if the | |
| 376 current frame was processed in DTX/CNI mode, the frame of CN parameters is | |
| 377 transformed into a downlink SID frame by replacing all speech parameters beyond | |
| 378 R0 and LPC with all-ones SID codeword. | |
| 379 | |
| 380 The internal RxFE block tells the TFO wrapper when this just-described | |
| 381 modification should be applied by way of an internal flag. This flag is set | |
| 382 in two cases: | |
| 383 | |
| 384 1) When the current frame was processed in DTX/CNI mode, or | |
| 385 | |
| 386 2) When the speech ECU applied substitution/muting handling to the current | |
| 387 frame, and the ECU state was 6 or 7 at the beginning of current frame | |
| 388 processing. | |
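Expressed as code, the two conditions for this flag could look like the sketch below. The function and parameter names are hypothetical; the real RxFE computes the flag as part of its frame processing rather than through a separate helper.

```c
#include <stdbool.h>

/*
 * Should the TFO wrapper (DTXd=1) turn this output frame into a DL SID?
 * processed_in_cni:     frame was handled in DTX/CNI mode
 * ecu_substituted:      speech ECU applied substitution/muting handling
 * ecu_state_at_start:   ECU state counter (0..7) before this frame
 */
bool emit_dl_sid(bool processed_in_cni, bool ecu_substituted,
                 int ecu_state_at_start)
{
    if (processed_in_cni)
        return true;
    /* Fully decayed (6) or home (7) ECU state also maps to SID */
    return ecu_substituted &&
           (ecu_state_at_start == 6 || ecu_state_at_start == 7);
}
```

A good speech frame passing through the ECU unmodified never sets the flag in this sketch, regardless of the ECU state it started from.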
| 389 | |
| 390 The effects of this logic are as follows: | |
| 391 | |
| 392 1) DTX pauses in UL pass through into DTX pauses in DL, with unusable frames | |
| 393 and invalid SID replaced with the most recent valid SID, or with R0+LPC from | |
| 394 the last good speech frame in the case of initial invalid SID. The | |
| 395 spec-compliant Rx DTX handler in the destination MS can then produce the | |
| 396 most correct form of comfort noise, including interpolation of R0 and LPC | |
| 397 parameters. | |
| 398 | |
| 399 2) When the input to TFO transform is nothing but unusable frames, the downlink | |
| 400 radio leg should go into DTXd state in order to produce the desired reduction | |
| 401 in radio interference and BTS power consumption. This effect should happen | |
| 402 irrespective of whether the "fully decayed" state of RxFE is DTX/CNI muting | |
| 403 or speech ECU, as covered in the previous section. Our logic of turning | |
| 404 "fully decayed" ECU state into DTXd SID achieves the desired effect. | |
| 405 | |
| 406 Finally, there is one more modification applied only in the case of TFO | |
| 407 transform with DTXd=1 and not in other cases: muting of comfort noise. In the | |
| 408 case of full endpoint speech decoding or TFO transform with DTXd=0, when the | |
| 409 criterion for CN muting is first reached, the muting proceeds by decrementing | |
| 410 R0 by 2 on every frame, i.e., gradually. (See GSM 06.21 section 6.4.) However, | |
| 411 in the case of TFO transform with DTXd=1, CN muting is effected by reducing R0 | |
| 412 to 0 immediately as soon as CN muting criterion is reached. The rationale is | |
| 413 as follows: | |
| 414 | |
| 415 * A TRAU (or TRAU-emulating MGW that feeds Abis to a BTS) has no way of knowing | |
| 416 exactly which of its continuously emitted DL SID frames will actually get | |
| 417 transmitted on the air and seen by the MS. Therefore, a muting process that | |
| 418 gradually decrements R0 with every emitted SID frame would make no sense. | |
| 419 | |
| 420 * If the destination MS receives a SID update with R0=0 subsequent to whatever | |
| 421 previous SID it received with non-zero R0, the spec-required CN interpolation | |
| 422 logic in that MS will produce the desired effect of gradual muting over 240 ms | |
| 423 - not too far from the 320 ms muting time called for in GSM 06.21 section | |
| 424 5.2.4. | |
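The timing figures above follow from simple arithmetic, illustrated by this sketch. The function names are hypothetical, and clamping R0 at 0 when the decrement would go negative is an assumption. R0 is a 5-bit parameter, so its maximum value is 31, and each speech frame lasts 20 ms.

```c
/*
 * One muting step applied to R0 once the CN muting criterion is met.
 * dtxd != 0: TFO transform with DTXd=1, R0 drops to 0 at once.
 * dtxd == 0: full decoder or TFO with DTXd=0, R0 decreases by 2.
 */
unsigned mute_r0_step(unsigned r0, int dtxd)
{
    if (dtxd)
        return 0;
    return r0 > 2 ? r0 - 2 : 0;  /* assumed clamp at 0 */
}

/* Number of frames needed to bring R0 all the way down to 0. */
unsigned frames_to_full_mute(unsigned r0, int dtxd)
{
    unsigned frames = 0;

    while (r0 != 0) {
        r0 = mute_r0_step(r0, dtxd);
        frames++;
    }
    return frames;
}
```

Starting from the maximum R0 of 31, gradual muting takes 16 frames, i.e. 16 x 20 ms = 320 ms, which is consistent with the 320 ms figure cited above. With DTXd=1 the transform needs a single frame, and the 240 ms ramp is then produced by interpolation in the receiving MS over one SID update period of 12 frames (12 x 20 ms).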
| 425 | |
| 426 TFO transform homing | |
| 427 ==================== | |
| 428 | |
| 429 ThemWi implementation of TFO transform includes the feature of in-band homing: | |
| 430 if the input to the transform is the spec-defined decoder homing frame (DHF), | |
| 431 this DHF is passed through to the output just like any other good speech frame, | |
| 432 but the internal state is reset to the initial "home" state. | |
| 433 | |
| 434 The check for DHF (all bits must match, plus (BFI == 0 && SID == 0) criterion) | |
| 435 and the resulting state reset happen at the end of frame processing, after the | |
| 436 output for the current frame has been generated. In the case of ThemWi TFO | |
| 437 transform for GSM-HR, there are two corner cases in which an incoming DHF may | |
| 438 be acted upon (produce state reset), but not appear in the output: | |
| 439 | |
| 440 1) The overall state of RxFE was speech (as opposed to DTX/CNI) and the speech | |
| 441 ECU state was 6 - the state in which the first received good speech frame | |
| 442 gets dropped. | |
| 443 | |
| 444 2) The overall state of RxFE was DTX/CNI and the incoming DHF is marked with | |
| 445 UFI=1. UFI is not a criterion for DHF detection, only BFI is, but UFI in | |
| 446 DTX/CNI state will cause current frame processing to treat the frame as | |
| 447 unusable. |
