comparison doc/TFO-xform/Theory @ 33:e828468b0afd
doc/TFO-xform/Theory: article written
| author | Mychaela Falconia <falcon@freecalypso.org> |
|---|---|
| date | Sat, 31 Aug 2024 20:45:25 +0000 |
| 32:f6bb790e186a | 33:e828468b0afd |
|---|---|
| 1 TFO transform from uplink to downlink | |
| 2 ===================================== | |
| 3 | |
| 4 With all 3 classic GSM codecs (FRv1, HRv1, EFR) the original architecture calls | |
| 5 for a network-side transcoder (TRAU) on each individual call leg. The | |
| 6 implications are: | |
| 7 | |
| 8 * The uplink runs from the MS to the speech decoder in the TRAU that turns the | |
| 9 mobile-generated speech into 64 kbit/s G.711. The Rx DTX handler, a subblock | |
| 10 of that speech decoder in the TRAU, handles error concealment (substitution | |
| 11 and muting of lost frames) and comfort noise insertion during DTXu pauses, | |
| 12 and once this speech stream has been transcoded to G.711, all trace of these | |
| 13 GSM-specific effects disappears. | |
| 14 | |
| 15 * The downlink runs from the speech encoder in the TRAU to TCH DL radio output | |
| 16 from the BTS. Because the DL frame stream comes from a free-running speech | |
| 17 encoder, it never contains errored frames or invalid SID or any other | |
| 18 aberrations: without DTXd, this frame stream is 100% good speech frames, and | |
| 19 with DTXd, it is a mixture of good speech and valid SID frames. | |
| 20 | |
| 21 But suppose you have two mobile call legs (mobile user Alice calls mobile user | |
| 22 Bob), and you wish to eliminate the quality-degrading effect of double or tandem | |
| 23 transcoding by passing compressed speech frames directly from Alice to Bob and | |
| 24 vice-versa - what happens now? The UL frame stream from each call leg will | |
| 25 contain BFI frame gaps that are never allowed in DL, and if the network deploys | |
| 26 DTX only in the UL direction (DTXu without DTXd, a very sensible choice for | |
| 27 small-capacity single-carrier cells), the representation of DTXu pauses coming | |
| 28 from each call leg (SID frames followed by prolonged BFI gaps) is also not | |
| 29 suitable for direct passing to the DL of the opposite call leg. | |
| 30 | |
| 31 The solution offered in the TFO spec (GSM 08.62) is a special transform from | |
| 32 call leg A UL to call leg B DL. This transform has no official name that I | |
| 33 could find, but I call it "TFO transform". In the original GSM 08.62 spec (up | |
| 34 to R99) this TFO transform is described in sections 8.2.1 and 8.2.2; when the | |
| 35 spec became 3GPP TS 28.062 in Release 4 (adding AMR in GSM and AMR-only | |
| 36 UMTS), the description of the TFO transform for classic GSM codecs moved to | |
| 37 section C.3.2.1.1. | |
| 38 | |
| 39 However, both spec versions only say what "shall" be done without any guidance | |
| 40 on how to do it algorithmically: the spec language is "subject to manufacturer | |
| 41 dependent future improvements and is not part of this recommendation." | |
| 42 Distilling the problem to its essence, the addition of TFO introduces a new type | |
| 43 of logical transform on codec frames (and a stateful one at that!) that never | |
| 44 appeared previously anywhere in classic GSM architecture, is not mentioned in | |
| 45 any other spec, and is not addressed at all by any of the reference codec | |
| 46 sources. This new transform is implemented only in the TFO block in TRAUs and | |
| 47 nowhere else (in classic GSM architecture), and can be exercised only by | |
| 48 establishing a TFO call between two interworking TRAUs. | |
| 49 | |
| 50 There are 3 main parts to this TFO transform, 3 main areas where anyone who | |
| 51 seeks to implement this transform has to think hard and come up with an | |
| 52 innovative solution: | |
| 53 | |
| 54 1) Error concealment in non-DTX speech: if an errored frame (BFI) appears after | |
| 55 non-SID speech frames (meaning non-DTX speech), the transform has to fill in | |
| 56 substitution/muting "speech" frames (meaning codec frames that look like | |
| 57 valid speech frames) in the stream going to call leg B DL. | |
| 58 | |
| 59 2) Comfort noise insertion: if the incoming frame stream from call leg A UL | |
| 60 contains SID frames (DTXu) but SID frames are not allowed on call leg B DL | |
| 61 (no DTXd), the transform has to insert "speech" frames (in the same sense | |
| 62 as in item 1) that represent comfort noise, as intended by Alice's phone | |
| 63 when it transmitted SID with certain CN parameters. | |
| 64 | |
| 65 3) Comfort noise muting: handling the case where the incoming UL frame stream | |
| 66 enters CN insertion state (via one or more SID frames), but then becomes all | |
| 67 BFI, with no more SID update frames appearing in TAF positions. In | |
| 68 the case of a single codec leg from a source encoder to an end decoder, | |
| 69 standard decoders are required by their respective DTX specs to gradually | |
| 70 mute their CN output, to indicate channel breakdown to the user - the TFO | |
| 71 transform has to produce the same effect. | |
| 72 | |
| 73 All 3 of the just-listed functions are explicitly called out in the TFO spec, in | |
| 74 each case with the same language of "shall" followed by "subject to manufacturer | |
| 75 dependent future improvements and is not part of this recommendation." | |
| 76 | |
| 77 DTXd or no DTXd | |
| 78 =============== | |
| 79 | |
| 80 When the destination call leg operates without DTXd, the TFO transform can only | |
| 81 emit frames that are well-formed speech frames for the respective codec, no SID | |
| 82 frames. In this case the transform has to do "everything", all 3 of the listed | |
| 83 functions, although the last function of CN muting may be either separate or | |
| 84 absorbed into the CN generation function, depending on the codec. | |
| 85 | |
| 86 On the other hand, when call leg B has DTXd enabled/allowed, there is more | |
| 87 room for additional complexity. The simplest solution would be to not make use | |
| 88 of DTXd capability and always emit speech frames - but the problem with this | |
| 89 simple approach is teleological. If a GSM network operator runs with DTXd | |
| 90 enabled, presumably that operator seeks to reap the benefit of DTXd, namely | |
| 91 reduced radio interference, in which case a TFO transform that fails to make | |
| 92 use of DTXd would defeat the purpose. Hence if someone sets out to implement a | |
| 93 TFO transform that supports full utilization of DTXd, they would have to do | |
| 94 additional work: | |
| 95 | |
| 96 * The function of CN insertion in the transform _mostly_ goes away: if a valid | |
| 97 SID frame comes, the TRAU caches it and repeats it continuously until the | |
| 98 next SID update, allowing the BTS to select which SID frames it will actually | |
| 99 transmit based on its SACCH alignment. But more complex handling is still | |
| 100 needed if the first SID frame (the one that begins the CN insertion period) | |
| 101 arrives as invalid SID, and CN muting takes on new significance. | |
| 102 | |
| 103 * CN muting: when the cached SID expires and no new SID updates arrive in TAF | |
| 104 positions, the TFO transform has to indicate somehow to Bob that Alice's call | |
| 105 leg is having trouble, which will be easy or difficult depending on what rules | |
| 106 are specified in the codec specs for SID interpolation in the final receiver. | |
| 107 | |
| 108 * Error concealment in non-DTX speech: at first glance this function appears to | |
| 109 be exactly the same whether DTXd is used or not. But consider the case of | |
| 110 total channel breakdown, such that the incoming frame stream becomes all BFI: | |
| 111 how should this case be handled? In the absence of DTXd, the output of the | |
| 112 TFO transform becomes a stream of silence frames, meaning some kind of | |
| 113 "speech" frames that produce total silence at the end decoder. But if the | |
| 114 network operates with DTXd with the aim of reducing radio interference, these | |
| 115 silence "speech" frames should be replaced with SIDs whose parameters are | |
| 116 chosen to produce silent output. | |
| 117 | |
| 118 Current approach in Themyscira libraries | |
| 119 ======================================== | |
| 120 | |
| 121 There is a desire to implement the TFO transform for all 3 classic GSM codecs | |
| 122 in the Themyscira Wireless suite of GSM codec libraries, and the first question | |
| 123 to be decided is the policy with regard to DTXd. | |
| 124 | |
| 125 The current approach is to not implement any DTXd support, i.e., implement the | |
| 126 TFO transform only in its basic no-DTXd form. This decision is based on the | |
| 127 reality of small-capacity single-carrier cells: given that the | |
| 128 total number of humans who actually _want_ to use GSM (as opposed to whatever | |
| 129 latest 4G/5G/etc is peddled by Big Tech mafia) is vanishingly small, there is | |
| 130 currently no justification for building higher-capacity GSM cells that use more | |
| 131 than a single 200 kHz radio carrier. And if each GSM cell consists of only one | |
| 132 radio carrier (the BCCH carrier, also called C0 in the specs), then physical | |
| 133 DTXd (as in actually turning off radio Tx, as opposed to "logical" DTXd where | |
| 134 that effect is merely faked for the MS by transmitting dummy bursts or | |
| 135 induced-BFI frames) is simply impossible. Therefore, in the present state of the | |
| 136 human condition, there is no justification for expending the effort to implement | |
| 137 additional complexity for proper DTXd. |
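
The basic no-DTXd form of the transform can be pictured as a small per-frame
state machine that decides which of the 3 listed functions applies to each
frame arriving from call leg A UL. The Python sketch below is illustrative
only: it is not taken from GSM 08.62 or from any Themyscira library. All names
(Frame, Action, TfoNoDtxd, max_missed_taf) are hypothetical, and both the rule
for when CN muting begins and the treatment of invalid SID are assumed
placeholders, since the real rules depend on each codec's DTX spec. The sketch
only selects an action; producing the actual substitution, comfort noise or
muted "speech" frames is codec-specific and is not shown.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Frame(Enum):
    """Classification of one frame from call leg A UL (hypothetical names)."""
    SPEECH = auto()       # good speech frame
    SID_VALID = auto()    # valid SID frame (DTXu)
    SID_INVALID = auto()  # invalid SID frame
    BFI = auto()          # errored or missing frame


class Action(Enum):
    """What to emit toward call leg B DL; every action yields a speech-format frame."""
    PASS_SPEECH = auto()  # forward the good speech frame unchanged
    CONCEAL = auto()      # function 1: substitution/muting in non-DTX speech
    CN_INSERT = auto()    # function 2: "speech" frame representing comfort noise
    CN_MUTE = auto()      # function 3: gradually muted comfort noise


@dataclass
class TfoNoDtxd:
    # ASSUMPTION: number of TAF positions that may pass without a valid SID
    # update before CN muting starts; not a value taken from any spec.
    max_missed_taf: int = 1
    in_cn: bool = False   # True while in CN insertion state
    missed_taf: int = 0   # TAF positions seen without a valid SID update

    def step(self, frame: Frame, taf: bool = False) -> Action:
        """Select the action for one incoming frame; taf marks a TAF position."""
        if frame is Frame.SPEECH:
            self.in_cn = False
            self.missed_taf = 0
            return Action.PASS_SPEECH
        if frame is Frame.SID_VALID:
            self.in_cn = True
            self.missed_taf = 0
            return Action.CN_INSERT
        if frame is Frame.SID_INVALID:
            # ASSUMPTION: an invalid SID enters (or stays in) CN insertion
            # state but does not count as a SID update.
            self.in_cn = True
        if not self.in_cn:
            # BFI following non-SID speech frames: error concealment.
            return Action.CONCEAL
        if taf:
            self.missed_taf += 1
        if self.missed_taf > self.max_missed_taf:
            return Action.CN_MUTE
        return Action.CN_INSERT
```

Because every action corresponds to emitting a frame in speech format, the
output of such a state machine never contains SID or BFI, which is exactly the
property required of a DL frame stream when DTXd is not in use.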
