comparison doc/AMR-EFR-hybrid-emu @ 467:ad032051166a
doc: AMR-EFR-hybrid-emu new article
| author | Mychaela Falconia <falcon@freecalypso.org> |
|---|---|
| date | Sun, 12 May 2024 23:54:43 +0000 |
| parents | doc/AMR-EFR-philosophy@9bcf65088006 |
| children |
| 466:0c4e1bc06740 | 467:ad032051166a |
|---|---|
| 1 Emulation of other people's AMR-EFR hybrid implementations | |
| 2 ========================================================== | |
| 3 | |
| 4 [Please see the AMR-EFR-philosophy article for background information on the | |
| 5 differences between classic GSM-EFR and the 12k2 mode of AMR, and how ETSI/3GPP | |
| 6 loosened their regulation on bit-exactness of EFR, then continue here.] | |
| 7 | |
| 8 Experiments reveal that the extant commercial GSM networks of T-Mobile USA and | |
| 9 Telcel Mexico (and likely other countries' GSM networks too) use a GSM speech | |
| 10 transcoder implementation that performs EFR encoding and decoding (for times | |
| 11 when the MS declares no support for AMR and the network falls back to EFR) per | |
| 12 the alternative which we call AMR-EFR hybrid. The needed experiments are done | |
| 13 by using a FreeCalypso phone or devboard as the MS (declaring yourself to the | |
| 14 network as non-AMR-capable via AT%SPVER), capturing TCH DL and feeding TCH UL | |
| 15 with FreeCalypso tools, and using a SIP-to-PSTN connectivity provider (BulkVS | |
| 16 or Anveo) on the other end of the test call that allows the experimenter to | |
| 17 receive the PCMU or PCMA sample stream coming out of the GSM network's speech | |
| 18 transcoder and feed a crafted PCMU/PCMA sample stream in the other direction. | |
| 19 | |
| 20 In this experimental setup, bit-exact details of how the GSM network under study | |
| 21 (NUS) implements EFR decoding can be tested by feeding a controlled sequence of EFR | |
| 22 codec frames (beginning with at least two DHFs) to GSM Um uplink and observing | |
| 23 the PCMU or PCMA sample stream received on the IP-PSTN end of the call. | |
| 24 Similarly, bit-exact details of how the NUS implements EFR encoding can be | |
| 25 tested by feeding controlled PCMU/PCMA sample streams into the call from IP-PSTN | |
| 26 and observing what the network emits on GSM Um downlink. In the latter case, | |
| 27 frame synchronization finding tricks described in ETSI/3GPP test sequence specs | |
| 28 need to be included as part of the experiment. | |
| 29 | |
| 30 When these experiments were performed on the GSM networks of T-Mobile USA and | |
| 31 Telcel Mexico, it was immediately apparent that they do not implement EFR | |
| 32 following the original bit-exact code of GSM 06.53: feeding any of the original | |
| 33 EFR test sequences from GSM 06.54 to the NUS does not produce matching results. | |
| 34 However, when I tried feeding EFR codec frame sequences from amr122_efr.zip | |
| 35 (the late addendum to GSM 06.54 for the AMR-EFR hybrid option) to GSM UL, the | |
| 36 PCMU (T-Mobile USA) or PCMA (Telcel Mexico) output from the GSM network's EFR | |
| 37 decoder matched _those_ test sequences, indicating that these networks use the | |
| 38 AMR-EFR alternative implementation. | |
| 39 | |
| 40 Creating tinkerer-oriented FOSS tools that can emulate or replicate the poorly | |
| 41 defined "EFR alternative 2" implemented by these extant commercial networks has | |
| 42 been a sportive challenge ever since. The present development in the Themyscira | |
| 43 GSM codec libraries and utilities suite is a step toward conquering that | |
| 44 challenge: we are now able to replicate the mystery commercial transcoder in | |
| 45 non-DTX operation, specifically: | |
| 46 | |
| 47 a) We can feed a SID-free stream of EFR codec frames to GSM UL, beginning with | |
| 48 DHF, and get the expected result on PCMU or PCMA; | |
| 49 | |
| 50 b) In the encoder direction, for the first 7 frames after EHF, before DTX is | |
| 51 allowed to kick in, we can get GSM DL output from the network that matches | |
| 52 our expectations. | |
| 53 | |
| 54 Encoder 5 ms delay and DHF transformation | |
| 55 ========================================= | |
| 56 | |
| 57 One of the diffs between classic EFR and MR122 in the encoder direction is the | |
| 58 artificial delay of 5 ms introduced in the AMR version. In true multirate | |
| 59 operation this delay is needed to support seamless switching between codec | |
| 60 modes, but when the only allowed codec rate is 12k2 (which is the case with EFR | |
| 61 by definition), this delay is pure waste. (Needless to say, an extra delay of | |
| 62 5 ms is nothing compared to the egregious latencies introduced by today's ugly | |
| 63 and horrible world of IP-based transport everywhere, but still...) This | |
| 64 artificial 5 ms delay in the encoder is the reason for the DHF difference | |
| 65 between EFR and MR122 - but here is the wild part: instead of recognizing this | |
| 66 artificial delay as unnecessary and wasteful for 12k2-only EFR and removing it | |
| 67 from the AMR-EFR hybrid contraption, those commercial transcoder vendors and | |
| 68 the people who prepared amr122_efr.zip for ETSI/3GPP (were they the same | |
| 69 people?) kept this 5 ms encoder delay, keeping the whole encoder unchanged AMR | |
| 70 except for whatever insane trickery they did to fit EFR DTX logic and EFR SID | |
| 71 generation into it, but added special DHF transformation logic on the output of | |
| 72 this AMR encoder to produce compliant EFR DHF when the input is EHF. | |
| 73 | |
| 74 Exactly how this DHF transformation is done in those actually-deployed AMR-EFR | |
| 75 hybrid encoders is a bit of a mystery. My first thought was to compare the | |
| 76 speech parameters emitted by the AMR encoder against MR122 DHF, and if the | |
| 77 result is a match, replace that MR122-DHF parameter set with EFR DHF. This | |
| 78 approach is implemented in the simple amr_dhf_subst_efr() function in libtwamr. | |
| 79 One distinctive signature of this approach is that the output of a hybrid | |
| 80 encoder following this method can never equal MR122 DHF: this one particular | |
| 81 bit pattern is precluded from the set of possible outputs under all conditions. | |
| 82 | |
| 83 However, subsequent experiments quickly revealed that the logic implemented by | |
| 84 the transcoder in the network of T-Mobile USA must be different. One of the | |
| 85 counter-intuitive effects of the 5 ms artificial delay in the MR122 encoder is | |
| 86 what happens when the encoder is in its homed state and you feed it an input | |
| 87 frame whose first 120 samples are all 0x0008, but some (as few as one or as many | |
| 88 as all) of the last 40 samples are different. This frame does not meet the | |
| 89 definition of EHF and won't be recognized as such - the encoder won't get | |
| 90 rehomed once again after processing this frame - yet the output will be | |
| 91 bit-exact MR122 DHF. How do those AMR-EFR hybrid encoders handle *this* case? | |
| 92 | |
| 93 Experiments on T-Mobile reveal that in the case in question, the encoded frame | |
| 94 is emitted with the bit pattern of MR122 DHF, *not* transformed into EFR DHF. | |
| 95 Because MR122-DHF output is impossible with an encoder that implements logic | |
| 96 like our amr_dhf_subst_efr() first cut, we know (by modus tollens) that | |
| 97 T-Mobile's implementation uses some different logic. | |
| 98 | |
| 99 Our new (current) working model is implemented in amr_dhf_subst_efr2(): we | |
| 100 replace the output of the AMR encoder with EFR DHF if the raw encoder output | |
| 101 was MR122 DHF *and* the input frame was EHF. This version appears to match | |
| 102 the observed behavior of T-Mobile USA so far. | |
| 103 | |
| 104 EFR DHF in the decoder direction | |
| 105 ================================ | |
| 106 | |
| 107 The way decoder homing works in all ETSI/3GPP-defined speech codecs, there is | |
| 108 an explicit check against the known DHF bit pattern (up to the first subframe only) at | |
| 109 the beginning of the decoder (if the decoder is homed and the input is DHF per | |
| 110 this reduced check, artificially emit EHF, stay homed and do nothing more), and | |
| 111 a second similar check against the known DHF bit pattern (full frame comparison | |
| 112 this time) at the end of the decoder, triggering the state reset function on | |
| 113 match. These checks are (and can only be) implemented by explicit comparison | |
| 114 against a known hard-coded DHF pattern - hence it doesn't matter in the decoder | |
| 115 case whether the DHF is natural (as in all properly ETSI-defined codecs) or | |
| 116 artificial as in AMR-EFR hybrid. Thus the "correct" handling of DHF in the | |
| 117 AMR-EFR hybrid decoder is a matter of replacing the check against the MR122 DHF bit | |
| 118 pattern with a check against the different bit pattern of EFR DHF. | |
| 119 | |
| 120 The decoder engine in libtwamr supports this different-DHF option for MR122 | |
| 121 decoding by way of a bit set in the mode field in struct amr_param_frame - see | |
| 122 the detailed description in the AMR-library-API article. | |
| 123 | |
| 124 Command line utilities for AMR-EFR hybrid | |
| 125 ========================================= | |
| 126 | |
| 127 The present package includes a small set of command line utilities that work | |
| 128 with the AMR-EFR hybrid described above: | |
| 129 | |
| 130 amrefr-encode-r | |
| 131 amrefr-decode-r | |
| 132 | |
| 133 These two utilities function just like gsmefr-encode-r and | |
| 134 gsmefr-decode-r described in the Codec-utils article, but implement the | |
| 135 AMR-EFR hybrid version of the codec instead of original EFR. The | |
| 136 no-DTX limitation applies: amrefr-encode-r lacks the -d option, and the | |
| 137 input to amrefr-decode-r must not contain any SID frames. | |
| 138 | |
| 139 amrefr-tseq-enc | |
| 140 amrefr-tseq-dec | |
| 141 | |
| 142 These two utilities are AMR-EFR counterparts to gsmefr-etsi-enc and | |
| 143 gsmefr-etsi-dec test programs described in the EFR-testing article. They | |
| 144 pass all tests on the non-DTX t??_efr.* sequences in ETSI's | |
| 145 amr122_efr.zip, but not on any of the DTX sequences included in the | |
| 146 same ZIP. Just like amrefr-encode-r, amrefr-tseq-enc lacks the -d option, | |
| 147 and amrefr-tseq-dec rejects input containing SID frames. |
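
Illustration: the two DHF substitution models discussed in the "Encoder 5 ms delay and DHF transformation" section can be sketched in C as follows. This is a minimal sketch and not the libtwamr code: the function names subst_model1() and subst_model2(), their signatures, and the flat array-of-parameters frame representation are assumptions made for this example only, and the two DHF tables hold placeholder values rather than the real MR122 and EFR DHF parameter sets. The only outside facts relied upon are that EHF is a frame of 160 samples all equal to 0x0008, and that a 12k2 frame carries 57 speech parameters.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FRAME_SAMPLES 160     /* one 20 ms speech frame at 8 kHz */
#define NUM_PARAMS    57      /* speech parameters in one 12k2 frame */
#define EHF_SAMPLE    0x0008  /* value of every sample in an EHF */

/*
 * PLACEHOLDER tables (assumption): these are NOT the real DHF values.
 * A real implementation would carry the MR122 and EFR DHF parameter
 * sets taken from the respective reference sources.
 */
static const int16_t mr122_dhf[NUM_PARAMS] = { 0x0001, 0x0002, 0x0003 };
static const int16_t efr_dhf[NUM_PARAMS]   = { 0x0011, 0x0012, 0x0013 };

/* Return 1 if all 160 input samples equal 0x0008, i.e. the frame is EHF. */
static int is_ehf(const int16_t *pcm)
{
    int i;

    assert(pcm != NULL);
    for (i = 0; i < FRAME_SAMPLES; i++)
        if (pcm[i] != EHF_SAMPLE)
            return 0;
    return 1;
}

/*
 * Model 1 (first cut): replace the encoder output with EFR DHF whenever
 * it equals MR122 DHF, regardless of the input.  With this model the
 * hybrid encoder can never emit MR122 DHF.
 * Returns 1 if the substitution was made, 0 otherwise.
 */
static int subst_model1(int16_t *params)
{
    assert(params != NULL);
    if (memcmp(params, mr122_dhf, sizeof mr122_dhf) != 0)
        return 0;
    memcpy(params, efr_dhf, sizeof efr_dhf);
    return 1;
}

/*
 * Model 2 (current working model): substitute only if the raw encoder
 * output was MR122 DHF *and* the input frame was EHF.
 * Returns 1 if the substitution was made, 0 otherwise.
 */
static int subst_model2(int16_t *params, const int16_t *pcm_in)
{
    if (!is_ehf(pcm_in))
        return 0;
    return subst_model1(params);
}
```

For an input frame that is all 0x0008 except for its last sample, subst_model2() leaves an MR122-DHF parameter set untouched, whereas subst_model1() would rewrite it to EFR DHF. That difference is exactly what the T-Mobile experiment distinguishes.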
