diff doc/Audio-mode-config @ 847:6e137995c9c8

doc/Audio-mode-config: elaborate on AEC and FIR blocks
author Mychaela Falconia <falcon@freecalypso.org>
date Tue, 10 Aug 2021 01:50:41 +0000
parents 6a0fcbca8ac7
children 6c306705f503
--- a/doc/Audio-mode-config	Sat Jul 31 22:37:34 2021 +0000
+++ b/doc/Audio-mode-config	Tue Aug 10 01:50:41 2021 +0000
@@ -268,6 +268,228 @@
   loading code in our firmwares, these 164 byte mode files can still be used
   with current Tourmaline fw, with AEC set to its default disabled state.
 
+New AEC parameter words
+=======================
+
+The 12 words that configure AEC of the L1_NEW_AEC flavor (appearing on the
+aec-new line in tiaud-compile input or in an fc-tmsh auw 12 command) map as
+follows:
+
+Word 0: aec_enable
+
+This word must be set to 0 to disable AEC or 2 to enable it.  The Audio Service
+layer translates this word to a single DSP control bit, so no other values may
+be written into it.
+
+Word 1: continuous_filtering
+
+This word is written directly into the DSP, and we have no documentation for it
+beyond "enable (1) or disable (0) continuous mode filtering".
+
+Word 2: granularity_attenuation
+
+This word is written directly into the DSP, and we have no documentation for it
+beyond "granularity of the smoothed attenuation".
+
+Word 3: smoothing_coefficient
+
+This word is written directly into the DSP, and we have no documentation for it
+beyond "smoothing coefficient".
+
+Word 4: max_echo_suppression_level
+
+This word is written directly into the DSP; it is described as "maximum
+attenuation level", and the following constants are defined for it:
+
+  #define AUDIO_MAX_ECHO_0dB              (0x7FFF)
+  #define AUDIO_MAX_ECHO_2dB              (0x65AA)
+  #define AUDIO_MAX_ECHO_3dB              (0x59AD)
+  #define AUDIO_MAX_ECHO_6dB              (0x4000)
+  #define AUDIO_MAX_ECHO_12dB             (0x1FFF)
+  #define AUDIO_MAX_ECHO_18dB             (0x0FFF)
+  #define AUDIO_MAX_ECHO_24dB             (0x07FF)
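+
+As a side observation (not something stated in the available documentation),
+these constants are numerically consistent with a Q15 linear amplitude factor,
+i.e., attenuation in dB = -20*log10(word/32768).  The Python sketch below checks
+that arithmetic; the Q15 interpretation and the helper name q15_attenuation_db
+are assumptions made for illustration only.
+
```python
import math

# Constants copied from the #define list above, keyed by their nominal dB value.
AUDIO_MAX_ECHO = {
    0: 0x7FFF,
    2: 0x65AA,
    3: 0x59AD,
    6: 0x4000,
    12: 0x1FFF,
    18: 0x0FFF,
    24: 0x07FF,
}


def q15_attenuation_db(word):
    """Attenuation in dB, ASSUMING the word is a Q15 linear amplitude factor."""
    return -20.0 * math.log10(word / 32768.0)


if __name__ == "__main__":
    for nominal_db, word in AUDIO_MAX_ECHO.items():
        print("0x%04X  nominal %2d dB  computed %.2f dB"
              % (word, nominal_db, q15_attenuation_db(word)))
```
+
+Under this assumption every constant lands within about 0.1 dB of its nominal
+name, which suggests (but does not prove) that intermediate values could be
+computed the same way.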
+
+Word 5: vad_factor
+
+This word is written directly into the DSP, and we have no documentation for it
+beyond "VAD factor relative to the current estimated energy".  VAD presumably
+stands for "voice activity detector", but our knowledge ends here.
+
+Word 6: absolute_threshold
+
+This word is written directly into the DSP, and we have no documentation for it
+beyond "VAD absolute offset relative to the current estimated energy".
+
+Word 7: factor_asd_filtering
+
+This word is written directly into the DSP, and we have no documentation for it
+beyond "modifying factor of d_far_end_noise for filtering decision".
+
+Word 8: factor_asd_muting
+
+This word is written directly into the DSP, and we have no documentation for it
+beyond "modifying factor of d_far_end_noise for muting decision".
+
+Word 9: aec_visibility
+
+This word must be set to 0 for normal operation or 0x200 for "AEC visibility"
+debug mode.  The Audio Service layer translates this word to a single L1
+control bit, so no other values may be written into it.
+
+Word 10: noise_suppression_enable
+
+This word must be set to 0 to disable the SPENH algorithm or 4 to enable it.
+The Audio Service layer translates this word to a single DSP control bit, so
+no other values may be written into it.  We don't know what this "speech
+enhancement" algorithm does, or whether it is the same as "noise suppression".
+
+Word 11: noise_suppression_level
+
+This config word is mapped to just two bits in the actual DSP control word by
+the Audio Service layer, thus there are only 4 possible valid values here:
+
+  #define AUDIO_NOISE_NO_LIMITATION       (0x0000)
+  #define AUDIO_NOISE_6dB                 (0x0020)
+  #define AUDIO_NOISE_12dB                (0x0040)
+  #define AUDIO_NOISE_18dB                (0x0060)
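+
+The value restrictions listed above for words 0, 9, 10 and 11 can be checked
+mechanically before a parameter set is sent to a device.  The Python sketch
+below is only an illustration: check_aec_words is a hypothetical helper, not
+part of any FreeCalypso tool, and it checks nothing about the eight words that
+go to the DSP unmodified beyond their fitting in 16 bits.
+
```python
# Allowed values for the four words that the Audio Service layer translates
# into control bits, as listed in the word descriptions above.
ALLOWED = {
    0: {0, 2},                             # aec_enable
    9: {0, 0x200},                         # aec_visibility
    10: {0, 4},                            # noise_suppression_enable
    11: {0x0000, 0x0020, 0x0040, 0x0060},  # noise_suppression_level
}


def check_aec_words(words):
    """Return a list of problems found in a 12-word aec-new parameter set."""
    if len(words) != 12:
        return ["expected 12 words, got %d" % len(words)]
    problems = []
    for index, word in enumerate(words):
        if not 0 <= word <= 0xFFFF:
            problems.append("word %d: %r does not fit in 16 bits" % (index, word))
        elif index in ALLOWED and word not in ALLOWED[index]:
            problems.append("word %d: 0x%04X is not an allowed value"
                            % (index, word))
    return problems


if __name__ == "__main__":
    # Hypothetical input: AEC enabled, every tuning word left at zero.
    print(check_aec_words([2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]))
```
+
+An empty list means the four restricted words hold permitted values; it says
+nothing about whether the resulting AEC tuning is any good.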
+
+Some known-good AEC configurations
+==================================
+
+The terse descriptions of parameter words given above unfortunately constitute
+the total extent of our knowledge of the AEC block in our dear Calypso DSP and
+its tuning parameters - we don't know anything more.  However, we do have 3
+example configurations to look at: we have the default values of the tuning
+parameters that appear to be initialized by the DSP itself on boot, and we have
+two AEC-enabled configurations set by Pirelli DP-L10 firmware: one for handheld
+and wired headset modes, the other for the hands-free loudspeaker mode.  Here
+are the 12 parameter words in the 3 available configurations:
+
+Parameter word			Default		Pirelli		Pirelli
+				(AEC disabled)	handheld	hands-free
+--------------------------------------------------------------------------
+aec_enable			0 (off)		2 (on)		2 (on)
+continuous_filtering		0 (off)		1 (on)		1 (on)
+granularity_attenuation		0x0001		0x0014		0x0014
+smoothing_coefficient		0x7FFF		0x0CCC		0x0CCC
+max_echo_suppression_level	0x1FFF (12 dB)	0x59AD (3 dB)	0x0FFF (18 dB)
+vad_factor			0x4000		0x4000		0x4000
+absolute_threshold		0x0032		0x0032		0x0032
+factor_asd_filtering		0x1000		0x1000		0x1000
+factor_asd_muting		0x1000		0x1000		0x1000
+aec_visibility			0 (off)		0 (off)		0 (off)
+noise_suppression_enable	0 (off)		4 (on)		4 (on)
+noise_suppression_level		0 (none)	0 (none)	0x0060 (18 dB)
+
+The following observations can be made:
+
+* The 4 parameters vad_factor, absolute_threshold, factor_asd_filtering and
+  factor_asd_muting remain unchanged between TI's DSP default and Pirelli's
+  production configs.  On the basis of this observation, I (Mother Mychaela)
+  get the feeling that these four should be left alone.
+
+* Besides the obvious steps of enabling AEC and SPENH, Pirelli did change
+  continuous_filtering (from off to on), granularity_attenuation and
+  smoothing_coefficient.  Unfortunately, unless we recover the source code for
+  our Calypso DSP ROM or some documents explaining this version of AEC in
+  detail, we have no way of understanding what these parameters do, let alone
+  evaluating the merits of Pirelli's change.
+
+* max_echo_suppression_level and noise_suppression_level seem to be the two
+  parameters most amenable to tuning.
+
+* It is interesting to note that Pirelli's fw enables AEC not only in the
+  loudspeaker mode, but also in the more basic handheld and wired headset
+  modes.  The two AEC configs differ only in max_echo_suppression_level and
+  noise_suppression_level parameters, with the loudspeaker mode AEC config
+  being more aggressive.
+
+Prior to seeing what Pirelli's fw does, my (Mychaela's) own thinking was that
+AEC is only needed in loudspeaker configurations, not handheld or headset.
+After seeing Pirelli's AEC configs, I reason that enabling a less aggressive
+AEC configuration in those less echo-prone modes probably doesn't hurt - thus
+until and unless we recover more documentation or other knowledge, the plan for
+our own FreeCalypso Libre Dumbphone handset is to do what Pirelli does: use a
+less aggressive AEC config in handheld and headset modes, and a more aggressive
+one in the hands-free loudspeaker mode.
+
+On our current FCDEV3B setup with a SparkFun COM-09151 loudspeaker and a CUI
+CMC-9745-130T microphone, applying Pirelli's loudspeaker-mode AEC config
+produces echo cancellation that sounds acceptable to our subjective human
+evaluator on the far end of test calls.
+
+FIR filter details
+==================
+
+Calypso DSP has two FIR filters in the voice paths, one in the uplink path and
+one in the downlink path.  Aside from their placement, the two FIR filters are
+identical.  Each FIR block has 31 taps (making a 30th order filter), and each of
+the 31 coefficients is a 16-bit fixed-point number.  The fixed point format is
+F2.14 aka Q14: to get the real coefficient from the physical 16 bits, treat the
+16-bit datum as a two's complement signed integer, then divide by 16384.
+Examples: 0x4000 means 1, 0x2000 means 0.5, 0xC000 means -1, 0xE000 means -0.5.
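+
+The decoding rule just described can be sketched in a few lines of Python.
+The function name q14_to_float is made up for this illustration.
+
```python
def q14_to_float(word):
    """Convert a 16-bit Q14 coefficient word to its real value."""
    word &= 0xFFFF
    # Reinterpret the 16 bits as a two's complement signed integer.
    if word & 0x8000:
        word -= 0x10000
    # Q14: 14 fractional bits, so 16384 represents 1.0.
    return word / 16384.0


if __name__ == "__main__":
    for word in (0x4000, 0x2000, 0xC000, 0xE000):
        print("0x%04X -> %g" % (word, q14_to_float(word)))
```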
+
+In principle you can set all 31 coefficients to whatever you like, but in
+practice only two possible configurations are used:
+
+* When the FIR filter is disabled (identity transform), coefficient 0 is set to
+  0x4000 (unity) and all other coefficients are set to 0.  In this configuration
+  the FIR block does not introduce any extra delay: all delayed samples are
+  multiplied by 0 and thus produce no effect.
+
+* When some non-identity frequency response transformation is desired, a linear
+  phase filter is set up: coefficient #15 becomes the main tap (significantly
+  greater in absolute value than all others) and all other coefficients mirror
+  around it symmetrically: #0 equals #30, #1 equals #29 and so forth, until #14
+  equals #16.  At the 8 kHz voice sample rate, this filter adds 1.875 ms of
+  delay (15 sample times) to the voice path in which it is active, and an equal
+  amount of "pre-ringing".
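+
+The two configurations described above are easy to tell apart
+programmatically.  The following Python sketch classifies a list of 31 raw
+coefficient words; the helper names are invented for this example, and the
+8 kHz sample rate is an assumption inferred from 15 sample times equalling
+1.875 ms.
+
```python
NUM_TAPS = 31
CENTER_TAP = 15
SAMPLE_RATE_HZ = 8000  # assumption: 15 samples / 8000 Hz = 1.875 ms


def is_identity(coeffs):
    """True if coefficient 0 is unity (0x4000) and all others are 0."""
    return coeffs[0] == 0x4000 and all(c == 0 for c in coeffs[1:])


def is_linear_phase(coeffs):
    """True if the coefficients mirror around tap #15 (#0 == #30 and so on)."""
    return all(coeffs[i] == coeffs[NUM_TAPS - 1 - i] for i in range(CENTER_TAP))


def added_delay_ms(coeffs):
    """Delay added by the filter in ms, or None if it fits neither pattern."""
    if len(coeffs) != NUM_TAPS:
        raise ValueError("expected %d coefficients" % NUM_TAPS)
    if is_identity(coeffs):
        return 0.0
    if is_linear_phase(coeffs):
        return CENTER_TAP * 1000.0 / SAMPLE_RATE_HZ
    return None


if __name__ == "__main__":
    identity = [0x4000] + [0] * 30
    print(added_delay_ms(identity))
```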
+
+The presumed purpose of these two FIR filters (uplink and downlink) is to
+flatten the frequency response of the speaker and microphone transducers, or
+perhaps even more ambitiously, the frequency response of the modeled acoustic
+environment.  However, actually coming up with a good set of FIR filter
+coefficients given a desired frequency response is a hard problem, one where
+forward engineering is much more difficult than reverse.
+
+When it comes to reverse engineering of existing Calypso DSP FIR filters, a
+total of 7 specimens have been captured out in the wild so far: one downlink FIR
+filter from Openmoko's non-functional para0.cfg (no way of knowing which speaker
+it was once designed for), and a set of 6 filters extracted from Pirelli DP-L10,
+3 uplink and 3 downlink, corresponding to the 3 audio routing modes supported on
+this phone model (handheld, hands-free and wired headset).  All 7 are linear
+phase filters as described above.  Analyzing the frequency response of a given
+already existing FIR filter is easy: just use the fir2freq program in our
+freecalypso-reveng Hg repository.  OTOH, coming up with a new set of FIR filter
+coefficients for some desired frequency response (e.g., for a new phone handset
+being designed) is a much harder problem, one which we will probably have to
+outsource to a hired DSP/FIR expert.
+
+Calypso FIR support in FC host tools
+------------------------------------
+
+Uplink and downlink FIR filter coefficients can be included in the input to
+tiaud-compile.  Each coefficient is given as the actual 16-bit word going into
+the DSP (Q14 scaling included), and can be specified either in hex or as a
+signed decimal integer.
+
+We also have a dedicated ASCII file format for a FIR filter coefficient set by
+itself, like this example:
+
+fir-coeff-table
+
+0x0178 0x0AB5 0xF43D 0xFED5 0xFCA7 0x04D8 0x00B8 0x0371
+0x032F 0x0007 0x151C 0xF24C 0x19A6 0xE918 0xF7CD 0x7D0C
+0xF7CD 0xE918 0x19A6 0xF24C 0x151C 0x0007 0x032F 0x0371
+0x00B8 0x04D8 0xFCA7 0xFED5 0xF43D 0x0AB5 0x0178
+
+(This example is the FIR filter extracted from Openmoko's non-functional
+para0.cfg.)  This standalone FIR coefficient set format is accepted as input to
+the auw-fir command in fc-tmsh (allowing experimental FIR filters to be uploaded
+to a running Calypso device for testing) and to our fir2freq analysis program.
+We will probably use the same format if and when we embark on a venture to
+design our own FIR filters for our own handset hardware.
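+
+For readers who want to experiment outside of our tools, here is a rough
+Python sketch that reads text shaped like the example above and evaluates the
+filter's gain at a given frequency.  It is not the parser used by fc-tmsh or
+fir2freq: the real tools may accept or reject inputs differently, the function
+names are invented, and the 8 kHz sample rate is an assumption.
+
```python
import cmath
import math

# The Openmoko para0.cfg example from above.
EXAMPLE = """fir-coeff-table

0x0178 0x0AB5 0xF43D 0xFED5 0xFCA7 0x04D8 0x00B8 0x0371
0x032F 0x0007 0x151C 0xF24C 0x19A6 0xE918 0xF7CD 0x7D0C
0xF7CD 0xE918 0x19A6 0xF24C 0x151C 0x0007 0x032F 0x0371
0x00B8 0x04D8 0xFCA7 0xFED5 0xF43D 0x0AB5 0x0178
"""


def parse_fir_table(text):
    """Parse a fir-coeff-table text into 31 signed integers (Q14 scaled)."""
    tokens = text.split()
    if not tokens or tokens[0] != "fir-coeff-table":
        raise ValueError("missing fir-coeff-table keyword")
    # int(tok, 0) accepts 0x-prefixed hex and plain decimal; treating both
    # as valid here is an assumption about the format.
    words = [int(tok, 0) & 0xFFFF for tok in tokens[1:]]
    if len(words) != 31:
        raise ValueError("expected 31 coefficients, got %d" % len(words))
    return [w - 0x10000 if w & 0x8000 else w for w in words]


def gain_db(coeffs, freq_hz, sample_rate_hz=8000):
    """Magnitude response in dB at freq_hz (16384 counts as unity gain)."""
    omega = 2.0 * math.pi * freq_hz / sample_rate_hz
    h = sum(c / 16384.0 * cmath.exp(-1j * omega * n)
            for n, c in enumerate(coeffs))
    magnitude = abs(h)
    return 20.0 * math.log10(magnitude) if magnitude > 0 else float("-inf")


if __name__ == "__main__":
    coeffs = parse_fir_table(EXAMPLE)
    for freq in (300, 1000, 2000, 3400):
        print("%4d Hz: %.2f dB" % (freq, gain_db(coeffs, freq)))
```
+
+For real analysis work the fir2freq program remains the tool to use; this
+sketch only illustrates how little is needed to read the format.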
+
 fc-tmsync aur and aur-all addition
 ==================================