comparison doc/AMR-EFR-conversion @ 136:8eb0e7a39409
doc: document command line utilities
| author | Mychaela Falconia <falcon@freecalypso.org> |
|---|---|
| date | Sun, 11 Dec 2022 22:20:36 +0000 |
| parents | |
| children | 78739fda2856 |
| 135:22601ae99434 | 136:8eb0e7a39409 |
|---|---|
| 1 We have two simple utilities that allow one to experiment with "dumb" bit- | |
| 2 shuffling conversion between AMR 12k2 and EFR codec formats, to explore | |
| 3 capabilities and limitations of this approach. | |
| 4 | |
| 5 gsm-amr2efr reads an AMR speech recording in RFC 4867 storage format (the common | |
| 6 .amr format) and converts it to EFR in gsmx format. The AMR input to this | |
| 7 utility must consist of MR122 frames only - no other AMR modes, no SID and no | |
| 8 NO_DATA gaps. The intent is that one can take a starting speech sample in WAV | |
| 9 format, encode it into AMR with amrnb-enc from opencore-amrnb (by default that | |
| 10 utility produces MR122 encoding without DTX), and then convert the AMR output to | |
| 11 EFR with gsm-amr2efr. One can then encode the same starting-point WAV speech | |
| 12 sample with gsmefr-encode (matching official EFR from ETSI) and compare the two | |
| 13 EFR outputs. When you do this experiment, you will see that the two EFR outputs | |
| 14 will be different (you can then analyze encoded speech parameter diffs with | |
| 15 gsmrec-dump), but each version can be fed to an EFR decoder, resulting in | |
| 16 OK-sounding speech. | |
| 17 | |
| 18 gsm-efr2amr performs the opposite conversion: it reads an EFR session recording | |
| 19 in gsmx format and converts it to AMR storage format. The input to gsm-efr2amr | |
| 20 is allowed to contain Themyscira BFI markers in addition to EFR frames; these | |
| 21 BFI markers will be turned into AMR NO_DATA frames. The same input can also | |
| 22 contain EFR SID frames - however, gsm-efr2amr will not detect them and won't | |
| 23 give them any special handling; instead they will be bit-reshuffled into MR122 | |
| 24 just like EFR speech frames. The result of such "dumb" conversion is invalid | |
| 25 AMR, and when you decode it with amrnb-dec, you will hear some strange noises. |
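
The MR122-only input requirement of gsm-amr2efr can be checked before conversion. The following is a minimal Python sketch of such a pre-check, not a description of how gsm-amr2efr itself is implemented. It assumes the single-channel RFC 4867 storage format: a `#!AMR\n` magic header followed by frames that each begin with a one-byte header carrying the 4-bit frame type in bits 3 to 6. The function names `amr_frame_types` and `is_mr122_only` are hypothetical, chosen for illustration only.

```python
# Pre-check sketch: does an .amr file (RFC 4867 storage format, single
# channel) consist of MR122 frames only?  All names here are hypothetical.

AMR_MAGIC = b"#!AMR\n"

# Speech/SID payload size in bytes for each AMR-NB frame type,
# not counting the one-byte frame header that precedes it.
PAYLOAD_BYTES = {
    0: 12,   # MR475
    1: 13,   # MR515
    2: 15,   # MR59
    3: 17,   # MR67
    4: 19,   # MR74
    5: 20,   # MR795
    6: 26,   # MR102
    7: 31,   # MR122 (12.2 kbit/s, the mode that maps onto EFR)
    8: 5,    # AMR SID
    15: 0,   # NO_DATA
}
FT_MR122 = 7


def amr_frame_types(data):
    """Return the list of frame types found in an AMR storage-format blob."""
    if not data.startswith(AMR_MAGIC):
        raise ValueError("missing #!AMR magic header")
    pos = len(AMR_MAGIC)
    types = []
    while pos < len(data):
        ft = (data[pos] >> 3) & 0x0F      # frame type field of the header byte
        if ft not in PAYLOAD_BYTES:
            raise ValueError("unsupported frame type %d at offset %d" % (ft, pos))
        end = pos + 1 + PAYLOAD_BYTES[ft]
        if end > len(data):
            raise ValueError("truncated frame at offset %d" % pos)
        types.append(ft)
        pos = end
    return types


def is_mr122_only(data):
    """True if the blob holds at least one frame and every frame is MR122."""
    types = amr_frame_types(data)
    return bool(types) and all(ft == FT_MR122 for ft in types)
```

To check a recording, read the file in binary mode and pass its contents to `is_mr122_only`. A `False` result means the file contains other AMR modes, SID frames or NO_DATA gaps, and so does not meet the input requirement described above.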
