comparison doc/AMR-EFR-philosophy @ 311:83408f67a96c
doc/AMR-EFR-philosophy: new article
| author | Mychaela Falconia <falcon@freecalypso.org> |
|---|---|
| date | Wed, 17 Apr 2024 20:53:10 +0000 |
| parents | doc/AMR-EFR-conversion@8eb0e7a39409 |
| children | 9bcf65088006 |
| 310:8ad5d5adb848 | 311:83408f67a96c |
|---|---|
| 1 Relation between GSM-EFR and 12k2 mode of AMR | |
| 2 ============================================= | |
| 3 | |
| 4 What are the differences between the GSM-EFR codec and the highest 12k2 mode | |
| 5 of AMR, or MR122 for short? The most obvious difference is in DTX: the format | |
| 6 of SID frames and even the very paradigm of how DTX works are completely | |
| 7 different between EFR and AMR. But what about non-DTX operation? If a codec | |
| 8 session consists solely of good speech frames, no SIDs and no BFI frame gaps, | |
| 9 are EFR and MR122 strictly identical? | |
| 10 | |
| 11 The correct answer is that in the absence of SIDs, EFR and MR122 are directly | |
| 12 interoperable in that the output of an EFR encoder can be fed to the input of | |
| 13 an AMR decoder, and vice-versa. However, the two codecs are NOT identical at | |
| 14 the bit-exact level! The differences are subtle, such that finding them | |
| 15 requires some intense study; the following article documents some of the | |
| 16 findings of that study: | |
| 17 | |
| 18 https://www.freecalypso.org/hg/efr-experiments/file/tip/Theory-and-mystery | |
| 19 | |
| 20 What other DSP/transcoder vendors have done | |
| 21 =========================================== | |
| 22 | |
| 23 ETSI had a tradition of defining standard GSM codecs (FR, HR, EFR) in bit-exact | |
| 24 form, and every production implementation was required to match the output of | |
| 25 the official reference bit for bit. However, once AMR came out, the regulation | |
| 26 on EFR was loosened. The GSM 06.54 document from 2000-08 (ETSI TS 100 725 | |
| 27 V5.2.0) has an appendix-like chapter (chapter 10) whose first paragraph reads: | |
| 28 | |
| 29 The 12.2 kbit/s mode of the Adaptive Multi Rate speech coder described | |
| 30 in TS 26.071 is functionally equivalent to the GSM Enhanced Full Rate | |
| 31 speech coder. An alternative implementation of the Enhanced Full Rate | |
| 32 speech service based on the 12.2 kbit/s mode of the Adaptive Multi Rate | |
| 33 coder is allowed. Alternative implementations shall implement the | |
| 34 functionality specified in TS 26.071 for the 12.2 kbit/s mode, with the | |
| 35 exception that the DTX transmission format (GSM 06.81) and the comfort | |
| 36 noise generation (GSM 06.62) shall be used. | |
| 37 | |
| 38 It appears that DSP vendors (for GSM MS or for network transcoders, or perhaps | |
| 39 both) weren't too happy with the prospect of having to include two different | |
| 40 versions of _almost_ the same codec algorithm with a bunch of interspersed | |
| 41 subtle diffs, and so the rules were bent: EFR implementors were given permission | |
| 42 to deviate from the original bit-exact definition of EFR in order to have more | |
| 43 commonality with MR122. | |
| 44 | |
| 45 Approach adopted for the Themyscira GSM codec libraries suite | |
| 46 ============================================================= | |
| 47 | |
| 48 I (Mother Mychaela) previously entertained the idea of creating a unified codec | |
| 49 library that supports both AMR and EFR with common code, producing a published- | |
| 50 source, FOSS-culture equivalent of what most proprietary vendors have done. | |
| 51 However, on further reflection, that idea has been rejected. The current vision | |
| 52 (as of 2024-04) is that libgsmefr (stable since early 2023) and libtwamr | |
| 53 (currently a work in progress) shall remain separate and independent libraries, | |
| 54 the former implementing GSM-EFR (the original bit-exact definition) and the | |
| 55 latter implementing AMR. My reasons for this decision are: | |
| 56 | |
| 57 * Libgsmefr already exists, and it is already a bit of a jewel compared to the | |
| 58 sorry state of true GSM codec support in the world of FOSS outside Themyscira. | |
| 59 Giving up on this library and moving to some nebulous new one does not sound | |
| 60 appealing. | |
| 61 | |
| 62 * There does not exist any formal, bit-exact definition for what we informally | |
| 63 call "EFR version 2": the realization of EFR as implemented by post-AMR-era | |
| 64 proprietary vendors, some sort of AMR-EFR hybrid. As I see it, it is not my | |
| 65 place to try to innovate in speech codec design; instead, it is my job to | |
| 66 provide 100% correct, bit-exact implementations of existing solid standards - | |
| 67 and there is no bit-exact standard to follow for "EFR version 2". | |
| 68 | |
| 69 * Libtwamr project: the task of turning the original AMR code from 3GPP into a | |
| 70 proper library, style-consistent with Themyscira libgsmfr2 and libgsmefr, | |
| 71 without the ugliness of opencore-amr, is already a lot of work as it is. | |
| 72 There is no need to make it harder by adding the task of supporting AMR-based | |
| 73 EFR, especially when the latter lacks formal definition. | |
| 74 | |
| 75 Performance issues | |
| 76 ================== | |
| 77 | |
| 78 Right now the only significant downside of libgsmefr compared to | |
| 79 libopencore-amrnb is that our library is much slower: almost 7 times slower | |
| 80 on non-DTX encode and a little over 3 times slower on SID-free decode. This | |
| 81 performance problem will need to be solved by profiling the code to find the | |
| 82 slowest spots, comparing the code of individual blocks between ours and | |
| 83 theirs, and porting over whatever performance-optimizing strategies were | |
| 84 implemented in the OpenCORE code base. The latter code base is a derivative | |
| 85 work based on the 3GPP AMR source, hence the guts of the codec are largely the | |
| 86 same between 3GPP AMR and libopencore-amrnb; the latter has been significantly | |
| 87 performance-optimized, but also heavily uglified. But there is no reason why | |
| 88 the same performance fixes can't be applied to the EFR code base - it will | |
| 89 simply take work. This work is currently part of our future roadmap. | |
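
The "7 times" and "3 times" figures quoted in the Performance issues section are ratios of run times over the same input. As a rough illustration only, the sketch below shows one way such a ratio could be measured in C. Every name in it (encode_frame_fn, time_encoder, slowdown_ratio, count_frames_stub) is hypothetical and is not part of libgsmefr, libtwamr or libopencore-amrnb; a real harness would wrap each library's actual per-frame encode call in an adapter of this shape. The 32-byte output buffer is an assumption, chosen to be large enough for one 244-bit frame (12.2 kbit/s over 20 ms).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define FRAME_SAMPLES   160  /* one 20 ms frame of 8 kHz PCM */
#define MAX_FRAME_BYTES 32   /* assumption: room for one 244-bit frame */

/* Hypothetical adapter type: encode one PCM frame into out[].
 * Not an API of any real codec library. */
typedef void (*encode_frame_fn)(void *state, const int16_t *pcm,
                                uint8_t *out);

/* Run fn over nframes consecutive frames of pcm and return the CPU
 * time consumed, in seconds, as reported by the standard clock(). */
double time_encoder(encode_frame_fn fn, void *state,
                    const int16_t *pcm, size_t nframes)
{
    uint8_t out[MAX_FRAME_BYTES];
    clock_t start = clock();

    for (size_t i = 0; i < nframes; i++)
        fn(state, pcm + i * FRAME_SAMPLES, out);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* How many times slower "ours" is than "theirs" on the same input. */
double slowdown_ratio(double ours_sec, double theirs_sec)
{
    assert(theirs_sec > 0.0);
    return ours_sec / theirs_sec;
}

/* Stand-in "encoder" for trying out the harness: it does no codec
 * work and merely counts calls through the state pointer. */
void count_frames_stub(void *state, const int16_t *pcm, uint8_t *out)
{
    (void)pcm;
    out[0] = 0;
    ++*(size_t *)state;
}
```

Timing both libraries on the same long speech recording and passing the two results to slowdown_ratio() gives a single figure of the kind quoted above. This only measures the whole encoder; finding the slowest individual blocks would still call for a real profiler.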
