changeset 303:4034c2b06ec8

doc/FR1-Rx-DTX: update for libgsmfr2 and the new landscape
author Mychaela Falconia <falcon@freecalypso.org>
date Mon, 15 Apr 2024 22:07:00 +0000
parents f469bad44c0e
children 03b0702f4463
files doc/FR1-Rx-DTX
diffstat 1 files changed, 86 insertions(+), 168 deletions(-)
--- a/doc/FR1-Rx-DTX	Mon Apr 15 18:45:08 2024 +0000
+++ b/doc/FR1-Rx-DTX	Mon Apr 15 22:07:00 2024 +0000
@@ -26,7 +26,7 @@
   of 260 bits.
 
 What are the implications of this situation for the GSM published-source
-software community?  Prior to the present libgsmfrp offering, there has always
+software community?  Prior to the present Themyscira offering, there has always
 been libgsm, but no Rx DTX handler.  If you are working with a GSM uplink RTP
 stream from a BTS or a GSM downlink frame stream read out of TI Calypso DSP or
 some other GSM MS PHY, feeding that stream directly to libgsm (without passing
@@ -41,175 +41,93 @@
 
 The correct solution is to implement an Rx DTX handler, pass the stream of
 frames and flags from the BTS or the MS PHY to this handler first, and then pass
-the output of this handler to libgsm 06.10 decoder.  Themyscira libgsmfrp is a
-Free Software implementation of Rx DTX handler for GSM FR, implementing SID
-classification, comfort noise generation and error concealment.
+the output of this handler to the standard GSM 06.10 decoder (classic libgsm or
+some updated port thereof).  Themyscira libgsmfrp was our first Free Software
+implementation of Rx DTX handler for GSM-FR, implementing SID classification,
+comfort noise generation and error concealment.  Our new libgsmfr2 offering
+takes the harmonization effort (between GSM-FR and other GSM codecs) one step
+further, eliminating the dependency on old libgsm and putting all GSM-FR codec
+functions "under one roof".
+
+libgsmfrp/libgsmfr2 API documentation
+=====================================
+
+The Rx DTX component of libgsmfr2 has the same API as our previous libgsmfrp,
+except for dropping the use of <gsm.h> and its types and needing to include our
+new API header <tw_gsmfr.h>.  The present article previously contained the full
+description of this API; that description has now been moved to FR1-library-API
+article, where the whole of libgsmfr2 is documented.
+
+Standalone exerciser utility
+============================
+
+The present GSM codec libraries and utilities package includes a standalone
+utility that exercises our Rx DTX handler for GSM-FR.  This utility is
+gsmfr-preproc, to be run as follows:
+
+gsmfr-preproc input.gsmx output.gsm
+
+The input is an extended-libgsm file that can contain SIDs and BFI frame gaps
+in addition to regular GSM 06.10 speech frames (see Binary-file-format article);
+the output is GSM 06.10 speech frames only.
+
+False SID detection
+===================
+
+The intent of GSM-FR spec authors was that the sets of possible speech frames
+and possible SID frames be disjoint.  Prior to introduction of DTX, there were
+only regular speech frames per GSM 06.10, no SID, and a receiver had to deal
+with only two possibilities: either a good speech frame was received, or the
+frame was lost to radio errors or FACCH stealing (unusable frame).  When SID
+frames were introduced for the purpose of intentional DTX as distinct from
+radio errors, the intent was that SID was to be a "new animal" not seen before,
+distinct from regular speech frames.  There is, however, a small blemish in the
+actual system as realized: if the SID frame detector and the Rx DTX handler
+that follows it in the Rx chain follow the rules of GSM 06.31 sections 6.1.1
+and 6.1.2, respectively (like our implementation does), then some speech frames
+may be mistaken for invalid SID, or perhaps even for valid SID, producing a
+nonzero failure rate in this mechanism.
+
+Official test sequence 02 in the set of 5 provided by ETSI exhibits this effect:
+Seq02.inp is a legitimate 13-bit linear PCM input to the speech encoder, and the
+corresponding output of GSM 06.10 encoder is contained in Seq02.cod.  However,
+that output contains some frames that are mistakenly classified as SID=1
+(invalid SID) by the rules of GSM 06.31 section 6.1.1!  It is true that these
+ancient test sequences chronologically predate the invention of DTX and
+GSM 06.31, but we still need to bear in mind that this problematic Seq02.cod is
+not an artificially constructed sequence of 06.10 codec parameters: it is the
+required output of the prescribed bit-exact encoder given a legitimate PCM
+input!  There does not exist a perfect solution to this problem: as usual,
+real-world engineering is all about trade-offs and compromises, and occasionally
+a gear will slip.  The best we can do is to model the probability of such
+gear-slip or wrong detection events, and engineer our systems to reduce this
+probability to a level that is deemed acceptable - which is exactly what GSM
+spec designers did here.
+
+As of gsm-codec-lib-r3, gsmrec-dump utility shows the SID classification result
+(GSM 06.31 section 6.1.1) in addition to parsed 06.10 codec parameters for each
+frame, thus one can inspect FR-encoded streams and check for this blemish.
 
 Effect of extra preprocessing
 =============================
 
-One key detail deserves extra emphasis before going into library API details:
-if the input to libgsmfrp consists entirely of good speech frames (no SID frames
-and no BFIs), then the preprocessor becomes an identity transform.  Therefore,
-if the output of our libgsmfrp preprocessor were to be fed to an additional
-instance of the same further down the processing chain, no extra transformation
-of any kind will happen.
-
-Using libgsmfrp
-===============
-
-The external public interface to Themyscira libgsmfrp consists of a single
-header file <gsm_fr_preproc.h>; it should be installed in the same system
-include directory as <gsm.h> from libgsm.  Please note that <gsm_fr_preproc.h>
-includes <gsm.h>, as needed for gsm_byte and gsm_frame defined types.
-
-The dialect of C we chose for libgsmfrp is ANSI C (function prototypes), const
-qualifier is used where appropriate; however, unlike libgsmefr, the interface
-to libgsmfrp is defined in terms of gsm_byte type defined in <gsm.h>, included
-from <gsm_fr_preproc.h>.
-
-State allocation and freeing
-============================
-
-The Rx DTX handler is stateful, hence you will need to allocate a preprocessor
-state structure in addition to the usual libgsm state structure for your GSM FR
-Rx session.  The necessary function is:
-
-extern struct gsmfr_preproc_state *gsmfr_preproc_create(void);
-
-struct gsmfr_preproc_state is an opaque structure to library users: you only get
-a pointer which you remember and pass around, but <gsm_fr_preproc.h> does not
-give you a full definition of this struct.  As a library user, you don't even
-get to know the size of this struct, hence the necessary malloc() operation
-happens inside gsmfr_preproc_create().  However, the structure is malloc'ed as
-a single chunk, hence when you are done with it, simply call free() on the
-pointer you got from gsmfr_preproc_create().
-
-gsmfr_preproc_create() can fail if the malloc() call inside fails, in which case
-it returns NULL.
-
-Preprocessing good frames
-=========================
-
-For every good traffic frame (BFI=0) you receive from the radio subsystem, you
-need to call this preprocessor function:
-
-extern void gsmfr_preproc_good_frame(struct gsmfr_preproc_state *state,
-				     gsm_byte *frame);
-
-The second argument is both input and output, i.e., the frame is modified in
-place.  If the received frame is not SID (specifically, if the SID field
-deviates from the SID codeword by 16 or more bits, per GSM 06.31 section 6.1.1),
-then the frame (considered a good speech frame) will be left unmodified (i.e.,
-it is to be passed unchanged to the GSM 06.10 decoder), but preprocessor state
-will be updated.  OTOH, if the received frame is classified as either valid or
-invalid SID per GSM 06.31, then the output frame will contain comfort noise
-generated by the preprocessor using a PRNG, or a silence frame in one particular
-corner case.
-
-GSM-FR RTP (or libgsm) 0xD magic: the upper nibble of the first byte can be
-anything on input to gsmfr_preproc_good_frame(), but the output frame will
-always have the correct magic in it.
-
-Handling BFI conditions
-=======================
-
-If you received a lost/missing frame indication instead of a good traffic frame,
-call this preprocessor function:
-
-extern void gsmfr_preproc_bfi(struct gsmfr_preproc_state *state, int taf,
-			      gsm_byte *frame_out);
-
-TAF is a flag defined in GSM 06.31 section 6.1.1; if you don't have this flag,
-pass 0 - you will lose the function of comfort noise muting in the event of
-prolonged SID loss, but all other Rx DTX functions will still work the same.
-
-With this function the 33-byte frame buffer is only an output, i.e., prior
-buffer content is a don't-care and there is no provision for making any use of
-erroneous frames like in EFR.  The frame generated by the preprocessor may be
-substitution/muting, comfort noise or silence depending on the state.
+What will happen if the output of our Rx DTX preprocessor (e.g., the output of
+gsmfr-preproc utility) is fed to another utility such as gsmfr-decode that also
+applies the same preprocessor to its input?  In other words, what is the effect
+of a secondary preprocessor application to previous preprocessor output?
 
-Other miscellaneous functions
-=============================
-
-extern void gsmfr_preproc_reset(struct gsmfr_preproc_state *state);
-
-This function resets the preprocessor state to what it is right out of
-gsmfr_preproc_create(), which is naturally just a combination of malloc() and
-gsmfr_preproc_reset().  Given that our Rx DTX handler state is much simpler
-than, for example, EFR codec state, there does not seem to be any need for
-explicit resets, but the reset function is made public for the sake of
-completeness.
-
-extern int gsmfr_preproc_sid_classify(const gsm_byte *frame);
-
-This function analyzes an RTP-encoded FR frame (the upper nibble of the first
-byte is NOT checked for 0xD signature) for the SID codeword of GSM 06.12 and
-classifies the frame as SID=0, SID=1 or SID=2 per the rules of GSM 06.31
-section 6.1.1.
-
-Silence frame datum
-===================
-
-extern const gsm_frame gsmfr_preproc_silence_frame;
-
-Many implementors make the mistake of thinking that a GSM FR silence frame is a
-frame of 260 zero bits, but the official specs disagree: the silence frame given
-in GSM 06.11 (3GPP TS 46.011, at the very end of the spec) is quite different.
-Themyscira libgsmfrp implements the correct silence frame per the spec, and that
-datum is also made public.
-
-libgsmfrp change history: version 1.0.1 to version 1.0.2
-========================================================
-
-There are only two changes, both involving corner cases with invalid SID frames
-being received:
-
-1) An invalid SID frame was received immediately following a good speech frame.
-   In this case we start CN generation, but we take the needed LARc and Xmaxc
-   parameters from the last speech frame, instead of the usual procedure of
-   extracting them from a valid SID frame.  The change from 1.0.1 to 1.0.2
-   concerns the Xmaxc parameter in this corner case: in 1.0.1 we took Xmaxc
-   from the last subframe and used it for ensuing CN generation, but in 1.0.2
-   we compute a more proper mean Xmaxc from all 4 subframes, by dequantizing,
-   summing and requantizing.
-
-2) An invalid SID frame was received in the speech muting state.  The sequence
-   of inputs would have to be:
-
-   - a good speech frame;
-   - one or more BFIs, but not too many, so that the cached speech frame
-     does not decay fully by Xmaxc reduction;
-   - an invalid SID frame.
-
-   In version 1.0.1 we handled this even more obscure corner case by entering
-   the CN muting state, i.e., the state that is normally entered upon the
-   second lost SID.  In version 1.0.2 we ignore invalid SID in the speech
-   muting state and act as if we got BFI, i.e., continue speech muting rather
-   than switch to CN muting.
-
-libgsmfrp change history: version 1.0.0 to version 1.0.1
-========================================================
-
-Version 1.0.0 exhibited the following defects, which are fixed in 1.0.1:
-
-1) The last received valid SID was cached forever for the purpose of
-   handling future invalid SIDs - we could have received some valid
-   SID ages ago, then lots of speech or NO_DATA, and if we then get
-   an invalid SID, we would resurrect the last valid SID from ancient
-   history - a bad design.  In our new design, we handle invalid SID
-   based on the current state, much like BFI.
-
-2) GSM 06.11 spec says clearly that after the second lost SID
-   (received BFI=1 && TAF=1 in CN state) we need to gradually decrease
-   the output level, rather than jump directly to emitting silence
-   frames - we previously failed to implement such logic.
-
-3) Per GSM 06.12 section 5.2, Xmaxc should be the same in all 4 subframes
-   in a SID frame.  What should we do if we receive an otherwise valid
-   SID frame with different Xmaxc?  Our previous approach would
-   replicate this Xmaxc oddity in every subsequent generated CN frame,
-   which is rather bad.  In our new design, the very first CN frame
-   (which can be seen as a transformation of the SID frame itself)
-   retains the original 4 distinct Xmaxc, but all subsequent CN frames
-   are based on the Xmaxc from the last subframe of the most recent SID.
+Most of the time, the second preprocessor pass will be an identity transform
+under these conditions, as the input to that second pass will consist entirely
+of good speech frames, no SIDs and no BFIs.  Any speech frames in the original
+input that were mistakenly classified as SID (valid or invalid) have already
+been converted to comfort noise (or to the silence frame in one corner case of
+invalid SID), hence they are no longer present in the output to trigger this
+effect a second time.  However, there is still a small possibility that a
+second pass will be a non-identity transform: pseudorandom RPE pulse parameters
+in our comfort noise output are uniformly distributed between 1 and 6 (GSM 06.12
+section 6.1), and if PRNG dice roll such that at least 80 out of 95 SID codeword
+bit positions (all in the xMc part of the frame) are all zeros, the resulting
+CN frame will be liable to misinterpretation as SID (invalid SID most of the
+time, or even more rarely valid SID if at least 94 out of 95 SID codeword bit
+positions are all zeros) if fed to the preprocessor a second time.  That second
+pass would then further alter those affected frames, but no others.
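
Reviewer note on the thresholds quoted in the last paragraph: the "at least 80 out of 95" and "at least 94 out of 95" figures correspond to at most 15 and at most 1 deviating bits in the SID codeword field, respectively. The following is a minimal sketch of that decision only, not the libgsmfr2 implementation. The function name and the pre-extracted bit array are hypothetical, and the step of locating the 95 codeword bit positions inside a packed frame is deliberately not shown.

```c
#include <stddef.h>

#define SID_CODEWORD_BITS 95

/*
 * Hypothetical helper (not part of libgsmfr2): classify a frame from the
 * 95 SID codeword bit positions, assumed to be already extracted into
 * bits[], one bit per array element.  The SID codeword is all zeros, so
 * every nonzero element counts as one deviation.
 *
 * Returns 2 (valid SID), 1 (invalid SID) or 0 (speech frame), using the
 * thresholds stated in the text: fewer than 2 deviations, fewer than 16
 * deviations, or 16 and more.
 */
int sid_classify_bits(const unsigned char bits[SID_CODEWORD_BITS])
{
    unsigned ndev = 0;
    size_t i;

    for (i = 0; i < SID_CODEWORD_BITS; i++) {
        if (bits[i])
            ndev++;
    }
    if (ndev < 2)
        return 2;       /* at least 94 of 95 positions are zero */
    if (ndev < 16)
        return 1;       /* at least 80 of 95 positions are zero */
    return 0;           /* too far from the codeword: treat as speech */
}
```

Under this rule a comfort noise frame whose pseudorandom pulse bits happen to leave 80 or more of those positions at zero would be reclassified on a second pass, which is the residual non-identity case described above.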