view doc/Voice-memo-utils @ 909:1e9fe07f8f09

doc/Voice-memo-utils: new article
author Mychaela Falconia <falcon@freecalypso.org>
date Thu, 29 Dec 2022 21:03:11 +0000

The full Calypso hw+fw solution as delivered by TI (the relevant components here
are the DSP, the official L1 code and RiViera Audio Service) implements an
interesting feature called voice memos.  The voice memo feature itself, plus
FreeCalypso-added AT commands that exercise it, are described in the
Voice-memo-feature article in our separate freecalypso-docs repository; the
present document describes the utilities available in FC host tools for working
with these voice memo recordings.

FreeCalypso tools for decoding voice memo files
===============================================

If you have recorded a voice memo with AT@VMR and then read it out with fc-fsio,
you can use additional FC tools to analyze it.  The following tools are
available, split between FC host tools and GSM codec libs & utilities packages:

* fc-vm2gsmx (new with fc-host-tools-r18) takes a binary VM recording (as you
  would read out with fc-fsio) and converts it into extended-libgsm (gsmx)
  format defined in our GSM codec libraries & utilities package.  This gsmx
  format is an extension of the classic libgsm (GSM 06.10) format, adding the
  possibility of SID frames and BFI markers (frame gaps) in addition to regular
  speech frames, thus it can represent the content of a voice memo recording
  made in DTX mode.  These gsmx files can then be decoded into playable WAV
  with our gsmfr-decode utility.

* fc-vm2hex (dates back to fc-host-tools-r5) converts a binary VM recording into
  ASCII hex format, similar to the old (2016) TCH DL recording format before it
  was extended in late 2022.  Every fully-written frame is emitted in the hex
  output as 3 space-separated hex status words followed by a block of 66 hex
  digits giving the FR1 codec frame in the unchanged bit order of TI's DSP, and
  every skipped frame (one for which only status word 0 was written into the
  memo file) is emitted in the hex output as just that one word.  The hex output
  from fc-vm2hex can be further fed to the gsmfr-dlcap-parse utility
  (gsm-codec-lib package) for deeper analysis.
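
As an illustration of the hex layout just described, here is a minimal Python
sketch that tells fully-written frames apart from skipped ones.  It is not part
of FC host tools: the function names are made up for this example, and it
assumes (this article does not spell it out) that each frame occupies one line
of the hex output, with the 3 status words and the 66-digit frame block
separated by whitespace.

```python
import re

HEX_WORD = re.compile(r"^[0-9A-Fa-f]+$")
FRAME_HEX_DIGITS = 66  # 66 hex digits = 33 bytes, one FR1 frame


def classify_line(line):
    """Classify one line of (assumed) fc-vm2hex output.

    Returns ("full", [w0, w1, w2], frame_bytes) for a fully-written frame,
    ("skipped", [w0], None) for a frame with only status word 0,
    or ("other", [], None) for anything that fits neither pattern.
    """
    fields = line.split()
    if not fields or not all(HEX_WORD.match(f) for f in fields):
        return ("other", [], None)
    if len(fields) == 4 and len(fields[3]) == FRAME_HEX_DIGITS:
        status = [int(w, 16) for w in fields[:3]]
        return ("full", status, bytes.fromhex(fields[3]))
    if len(fields) == 1:
        return ("skipped", [int(fields[0], 16)], None)
    return ("other", [], None)


def summarize(lines):
    """Count full, skipped and unrecognized lines, ignoring blank ones."""
    counts = {"full": 0, "skipped": 0, "other": 0}
    for line in lines:
        if not line.strip():
            continue
        kind, _, _ = classify_line(line)
        counts[kind] += 1
    return counts
```

Passing an open text file to summarize() gives a quick count of how many frames
in a recording were fully written and how many were skipped; the frame bytes
returned by classify_line() stay in the bit order of TI's DSP and are not
reordered by this sketch.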

FreeCalypso tools for external generation of voice memo files
=============================================================

Using FreeCalypso tools, you can produce an external speech recording in GSM
06.10 FR1 codec format, convert it into TCS211 VM format, upload it into the FFS
of an FC device with fc-fsio, and then play this externally produced voice memo
with AT@VMP.  The conversion steps are as follows:

1) You can use gsmfr-encode to FR1-encode a speech sample from WAV into classic
   .gsm format, or gsmfr-encode-r if the source is raw BE instead of WAV.
   Alternatively, you can use any other off-the-shelf software that can encode
   FR1 and write libgsm format; SoX shipped with Slackware includes the
   necessary support.

2) fc-gsm2vm (unchanged since fc-host-tools-r5) converts a .gsm recording into
   non-DTX TCS211 VM format.
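
Between steps 1 and 2 it can be handy to sanity-check a .gsm file, especially
one produced by third-party software.  The following Python sketch relies only
on the well-known properties of the classic libgsm format: each frame is 33
bytes long, carries 20 ms of speech, and starts with the signature nibble 0xD.
The function name is made up for this example, and whether fc-gsm2vm performs
similar checks of its own is not stated here, so treat this as an optional
pre-check rather than a required step.

```python
GSM_FRAME_BYTES = 33   # one GSM 06.10 frame in classic libgsm format
GSM_SIGNATURE = 0xD    # upper nibble of the first byte of every frame
FRAME_MS = 20          # each frame carries 160 samples at 8 kHz


def check_gsm(data):
    """Validate libgsm-format bytes; return (frame_count, duration_seconds).

    Raises ValueError if the length is not a whole number of frames or if
    any frame lacks the 0xD signature nibble.
    """
    if len(data) % GSM_FRAME_BYTES != 0:
        raise ValueError("length is not a multiple of 33 bytes")
    nframes = len(data) // GSM_FRAME_BYTES
    for i in range(nframes):
        first = data[i * GSM_FRAME_BYTES]
        if first >> 4 != GSM_SIGNATURE:
            raise ValueError(f"frame {i}: bad signature nibble {first >> 4:#x}")
    return nframes, nframes * FRAME_MS / 1000.0
```

Reading a file with open(path, "rb").read() and passing the result to
check_gsm() yields the number of frames and the playing time in seconds.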

At the present time we don't have any tools for producing external DTX-enabled
VM recordings: the main limitation is that, at least to this Mother's
knowledge, the published source software community does not currently possess a
GSM 06.10 encoding library that has been extended with VAD and DTX functions.
There is classic libgsm from the 1990s, used by everyone in the FOSS community
who needs a GSM 06.10 encoder or decoder, but it doesn't do DTX.  We
(FreeCalypso and Themyscira Wireless) have produced our own libgsmfrp front-end
that implements Rx DTX handler functions (that's how we can properly decode FR1
streams that contain SIDs and/or missing frames), but it doesn't help with DTX
encoding.
Therefore, our ability to produce TCS211-compatible VM recordings externally is
currently limited to non-DTX mode.