FCDEV3B hardware bug: sleep mode self-reboot

Kent del Pino kentdelpino at gmail.com
Sat Jul 22 15:34:07 UTC 2017


I hear you, and I usually have access to a RIGOL DS1052E here in London.
However, I cannot help with the problem before next weekend, though I
would love to.

One remark (I'm sure you already know this): going from 100 nF to 1 uF
with a part from the same capacitor series could make the problem worse,
because the larger part normally has a higher Equivalent Series
Resistance (ESR).
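
A rough way to compare two decoupling parts is to model each one as a
series R-L-C and look at the impedance magnitude over frequency.  The
Python sketch below does that.  The ESR and ESL figures in it are
illustrative placeholders, not datasheet values for any part on the
FCDEV3B; real numbers from the manufacturer should be substituted.

```python
import math

def cap_impedance(freq_hz, c_farad, esr_ohm, esl_henry):
    """Impedance magnitude of a capacitor modelled as series R-L-C."""
    w = 2 * math.pi * freq_hz
    reactance = w * esl_henry - 1.0 / (w * c_farad)
    return math.hypot(esr_ohm, reactance)

def self_resonant_freq(c_farad, esl_henry):
    """Frequency at which the inductive and capacitive reactances cancel."""
    return 1.0 / (2 * math.pi * math.sqrt(esl_henry * c_farad))

# HYPOTHETICAL part parameters, chosen only to make the sketch runnable.
PARTS = {
    "100nF": dict(c_farad=100e-9, esr_ohm=0.05, esl_henry=0.5e-9),
    "1uF":   dict(c_farad=1e-6,   esr_ohm=0.05, esl_henry=0.5e-9),
}

if __name__ == "__main__":
    for name, p in PARTS.items():
        f0 = self_resonant_freq(p["c_farad"], p["esl_henry"])
        print(f"{name}: self-resonance near {f0 / 1e6:.1f} MHz")
        for f in (1e5, 1e6, 1e7, 1e8):
            print(f"  |Z| at {f:.0e} Hz = {cap_impedance(f, **p):.3f} ohm")
```

At self-resonance the impedance of this model equals the ESR, so which
part decouples better at a given frequency depends on its actual ESR
and ESL, not only on its nominal capacitance.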

On the PCB, C306 catches my eye, among several hot spots; it is not a
physically small PCB. When you say the design/layout is the same as in
other products, it is very important that the decoupling capacitors
really are the same, not only in capacitance but also in dielectric
type. If we do not know, we go for X7R.

Never economize on the decoupling capacitor (the 100 nF across Vcc): its
ESR, which is frequency dependent, comes down to production methods, and
better parts have lower ESR. In addition, modern small capacitors are
very temperature sensitive (large variation in capacitance), so go for
X7R.

Digital noise on the power rail (Vcc) will go into all subsystems and in
this case ends up on the antenna.


On Sat, Jul 22, 2017 at 1:23 AM, Mychaela Falconia <
mychaela.falconia at gmail.com> wrote:

> Hello FreeCalypso community,
>
> When I originally set out to build our own FCDEV3B modem board, one of
> my goals has been to produce a proper replacement for the no-longer-
> made, surplus-exhausted and not very convenient Neo FreeRunner, i.e.,
> produce a modem that is strictly no worse than Openmoko's on every
> point.  This goal has now been achieved on all points except two:
>
> * Our current FCDEV3B modems have a problem with sleep modes as
>   described in detail below.  This problem has never been seen on
>   Openmoko hardware or on any of our other pre-existing hw targets
>   (Mot C1xx or Pirelli), thus it is a regression defect in our FCDEV3B
>   hardware.
>
> * Testing voice calls and Calypso audio features is very inconvenient
>   but possible on the FreeRunner.  On our own FCDEV3B hardware I still
>   haven't got around to adding the loudspeaker and microphone, hence
>   those circuits on our modem boards have not been exercised at all
>   yet.
>
> Note that RF calibration is no longer in the list of our own hardware's
> regressions relative to Openmoko: even though my CMU200 still has not
> been properly repaired and I still have to use the Aux Tx generator
> (B96 variant) instead of the broken main Tx, I have completed the
> development of fully automated calibration software that produces
> calibration results which I believe to be no worse than Openmoko's.  I
> can now take a freshly assembled board, connect it to the CMU200 and
> run a single shell script (fc-rfcal-tri900) - this script will run all
> 7 required calibration parts (VCXO, 3 Rx bands and 3 Tx bands), saving
> all produced calibration results in the flash file system.  The accuracy
> of this calibration is probably less than ideal because I don't have a
> fancy metrology-grade cabling setup with precisely measured insertion
> loss and because the calibration status of my CMU200 itself is unknown,
> but it should definitely be within the GSM 05.05 spec tolerances.
>
> Now onto the sleep mode problem, which is definitely a regression in
> our current hw relative to Openmoko and all other known pre-existing
> Calypso hw targets.  Before I get into further details, I need to
> emphasize that the sleep mode problem on our current FCDEV3B boards is
> quite different from the infamous deep sleep problem (bug #1024) that
> used to plague Openmoko devices.  Openmoko's bug #1024 affected only
> deep sleep, not other sleep modes, and its manifestation was that
> while camped on a cell in idle mode, the modem would wake up from deep
> sleep in a messed-up state and not be able to receive the paging
> channel correctly any more until it gave up and reestablished a new
> network sync, causing the network registration status to bounce.
>
> Our sleep mode problem is quite different: the manifestation of our
> sleep mode hw bug is that certain sleep-wake sequences cause the modem
> to suddenly reboot: yes, a total reboot out of the blue, completely
> blowing away whatever you were doing at the time.  This behaviour is
> something that was never experienced by Openmoko, nor by us on any of
> our pre-existing Calypso hw targets, thus it is a new problem.
>
> Out of the 5 FCDEV3B boards I have left, I have 3 that aren't broken
> in other ways, and on all 3 boards the sleep mode self-reboot bug is
> reproducible 100% of the time under the following conditions:
>
> * The firmware needs to be Magnetite-l1reconst.  I have heard reports
>   that the self-reboot bug does not happen with Citrine fw (and thus
>   Magnetite-hybrid may be similarly avoiding the bug), but it is a
>   *hardware* bug, and one of the first steps in debugging such is
>   getting it reproducible.  Magnetite-l1reconst fw does the job of
>   making the hw bug 100% reproducible.
>
> * This Magnetite-l1reconst fw needs to be flashed, not loaded via
>   fc-xram, as the manifestation of the bug involves the modem self-
>   rebooting.
>
> * There needs to be a SIM inserted in the socket.  It doesn't matter
>   whether or not this SIM is recognized as valid by any network
>   operator, as reproducing the sleep mode self-reboot bug does not
>   involve connecting to a network or even bringing up the radio at
>   all, i.e., the antenna can be disconnected.  The SIM only needs to
>   be valid for the AT+CFUN=1 command.
>
> * Boot the board with Magnetite-l1reconst fw, and see the boot output
>   in the rvinterf window.  Without changing any sleep modes with
>   AT%SLEEP commands, i.e., with all sleep modes enabled by default,
>   give it an AT+CFUN=1 command, either through fc-shell or through the
>   dedicated AT command UART.  Instead of the command completing with
>   an OK response, the modem will self-reboot, which you should observe
>   in the rvinterf window.
>
> Disabling all sleep modes with AT%SLEEP=0 stops this self-reboot from
> happening, allowing us to get past the AT+CFUN=1 command and on to
> subsequent radio bring-up, which is what we've been doing since early
> April when I first brought our FCDEV3B boards home and immediately hit
> the self-reboot bug.  However, only in the last few days have I done a
> deeper investigation.
>
> The first noteworthy observation is that the sleep mode that triggers
> the self-reboot (at least in this scenario) is not deep sleep or even
> big sleep, but rather small sleep.  If I issue AT%SLEEP=3 (big sleep
> and deep sleep enabled, small sleep disabled) before AT+CFUN=1, the
> latter command always succeeds, and I can then proceed to radio
> bring-up (AT+COPS=0) in this state.  On rare occasions the modem will
> self-reboot on a sleep-wake sequence while connected to a network and
> listening for paging with big sleep and deep sleep enabled; I haven't
> tried big sleep only, but even with both big and deep sleep enabled,
> the reboot is quite rare; most of the time the modem is fine like that.
>
> On the other hand, if I issue AT%SLEEP=1 (enable small sleep only)
> before AT+CFUN=1, the latter always triggers the self-reboot, just
> like with all sleep modes enabled, thus we know that small sleep is
> the culprit in this self-reboot scenario.
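
The observations reported so far can be collected into a small
reproduction matrix.  The Python sketch below only builds the AT
command sequences and records the outcomes stated in the message; it
does not talk to a modem.  The names OBSERVED and command_sequence are
made up for this illustration, and only the AT%SLEEP arguments
mentioned in the message (0, 1, 3, and the firmware default) are
listed.

```python
# Outcomes of AT+CFUN=1 as reported in this thread, keyed by the
# AT%SLEEP argument issued beforehand (None = no AT%SLEEP command).
OBSERVED = {
    None: ("firmware default, all sleep modes enabled", "self-reboot"),
    0:    ("all sleep modes disabled",                  "OK"),
    1:    ("small sleep only",                          "self-reboot"),
    3:    ("big and deep sleep, small sleep disabled",  "OK"),
}

def command_sequence(sleep_arg):
    """AT commands to issue, in order, for one reproduction attempt."""
    cmds = []
    if sleep_arg is not None:
        cmds.append(f"AT%SLEEP={sleep_arg}")
    cmds.append("AT+CFUN=1")
    return cmds

if __name__ == "__main__":
    for arg, (meaning, outcome) in OBSERVED.items():
        print(f"{' ; '.join(command_sequence(arg)):24s} "
              f"{meaning:44s} -> {outcome}")
```

Every row where small sleep is enabled ends in a self-reboot, and every
row where it is disabled completes with OK.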
>
> At this point I need to explain the 3 Calypso sleep modes.  Big sleep
> and deep sleep involve L1 code calculating how long the system is
> going to remain idle and programming the hardware to disable TDMA
> frame interrupts for that long, and in the case of deep sleep, also
> stopping the VCXO for that duration.  Small sleep OTOH happens in the
> idle thread of Nucleus RTOS scheduler: when all tasks are blocked
> waiting for something and the task scheduler falls into the idle
> thread waiting for the next interrupt, if small sleep is enabled, TI's
> modified version of Nucleus' idle thread will stop the ARM7 CPU by
> cutting off its clock with a control register bit; this CPU clock is
> then re-enabled by the hardware when the next interrupt occurs.
>
> Small sleep cannot have a duration longer than one TDMA frame
> (4.615 ms), as it always ends as soon as an interrupt occurs, and
> unless some other interrupt occurs sooner, the wake-up event will be
> the next TDMA frame interrupt - except when suppressed by big sleep or
> deep sleep logic, these interrupts always occur on every TDMA frame.
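
The 4.615 ms figure follows from GSM frame timing: a 26-frame
multiframe lasts 120 ms, and each TDMA frame carries 8 timeslots.  A
short Python check of that arithmetic (the variable names are only for
this illustration):

```python
from fractions import Fraction

MULTIFRAME_MS = Fraction(120)   # duration of a 26-frame multiframe
FRAMES_PER_MULTIFRAME = 26
SLOTS_PER_FRAME = 8

# Exact frame duration: 120/26 = 60/13 ms.
frame_ms = MULTIFRAME_MS / FRAMES_PER_MULTIFRAME
# Exact timeslot duration in microseconds: 7500/13 us.
slot_us = frame_ms / SLOTS_PER_FRAME * 1000

if __name__ == "__main__":
    print(f"TDMA frame: {float(frame_ms):.3f} ms")
    print(f"timeslot:   {float(slot_us):.1f} us")
    print(f"frame interrupts per second: {float(1000 / frame_ms):.1f}")
```

So with only the TDMA frame interrupt active, small sleep is entered
and left a little over 200 times per second.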
>
> But here is what happens on the AT+CFUN=1 command: this command brings
> up the SIM interface, and while bits are being transferred to and from
> the SIM, the responsible Nucleus task is blocked waiting on a timer.
> But while the Nucleus task is blocked thusly, interrupts occur quite
> frequently, as the SIM interface hardware interrupts on every byte and
> the interrupt handler feeds it the next byte to be sent.  Because no
> Nucleus tasks are in runnable state, when the SIM interrupt handler
> returns, control falls back into the idle thread which activates small
> sleep, only to be woken up by the next SIM interrupt one byte later.
> Thus the execution of the AT+CFUN=1 command with all sleep modes
> enabled involves a lot of back-to-back sleep-wake sequences in rapid
> succession, and it is my hypothesis that the latter act as the
> triggering condition for our hardware bug.
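
To get a feel for how rapid these sleep-wake sequences are, the
per-byte SIM interrupt interval can be estimated from ISO/IEC 7816-3
timing.  The Python sketch below assumes the default F/D ratio of 372
and 12 etu per character (T=0 with minimum guard time).  The SIM clock
of 3.25 MHz (13 MHz / 4) is an assumption for illustration, not a
value taken from the message, and a negotiated faster speed would
shorten the interval further.

```python
TDMA_FRAME_MS = 4.615    # from the message above
SIM_CLOCK_HZ = 3.25e6    # ASSUMPTION: SIM clock of 13 MHz / 4
F_OVER_D = 372           # ISO/IEC 7816-3 default clock cycles per etu
ETU_PER_CHAR = 12        # 10 bit slots + 2 etu minimum guard time

def sim_char_ms(clock_hz=SIM_CLOCK_HZ, f_over_d=F_OVER_D):
    """Time between successive SIM characters, in milliseconds."""
    etu_s = f_over_d / clock_hz
    return ETU_PER_CHAR * etu_s * 1e3

def wakeups_per_second(interval_ms):
    """Wake-up rate if one interrupt arrives every interval_ms."""
    return 1000.0 / interval_ms

if __name__ == "__main__":
    byte_ms = sim_char_ms()
    print(f"SIM byte interval:   {byte_ms:.3f} ms")
    print(f"bytes per TDMA frame: {TDMA_FRAME_MS / byte_ms:.2f}")
    print(f"wake-ups/s, SIM active: {wakeups_per_second(byte_ms):.0f}")
    print(f"wake-ups/s, TDMA only:  {wakeups_per_second(TDMA_FRAME_MS):.0f}")
```

Under these assumptions a SIM transfer wakes the CPU several times per
TDMA frame, which fits the picture of back-to-back sleep-wake
sequences during AT+CFUN=1.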
>
> The next question becomes: what do we do about it?  We know that we have
> a hardware bug on our hands because the exact same firmware works
> without a hitch on Openmoko-made FreeRunners and other pre-existing
> Calypso hw targets.  The first thought that naturally comes to mind is
> that the rapid back-to-back sleep-wake sequences cause an increased
> current draw on one of the power supply rails, which in turn causes
> too great of a voltage drop somewhere.  But the perplexing thing is
> that our entire modem core design comes straight from Openmoko,
> virtually unchanged at the physical layout level, and at the schematic
> level (including all capacitor values) Openmoko's modem is completely
> unchanged from TI's Leonardo reference in this area.  So how come we
> have a hw problem in a section that directly follows both TI and
> Openmoko known-good references?
>
> My other first thought from months back was that perhaps the series
> jumpers I inserted into VBAT power paths for current measurement
> purposes were the culprit, by adding too much series resistance into
> those power current paths, but the following experiments strongly
> suggest that the problem ought to be elsewhere:
>
> * I tried removing the two-post headers with shorting blocks from JP2
>   and JP3, and replaced them with solid wire jumpers soldered directly
>   into the PCB holes - no improvement.
>
> * I tried upping the 0402 caps at C220 and C221 (on the VBAT net near
>   the inputs to Iota LDO regulators) from 100 nF (Leonardo value that
>   always worked fine for Openmoko) to 1 uF - no improvement.
>
> * I tried feeding higher voltages to our board's "battery" power input,
>   up to 5 V, to cancel out any voltage drop in the VBAT path before
>   the regulators - no improvement.
>
> The above results all suggest that the problem is not likely to be on
> the VBAT input side of the LDO regulators in the Iota chip, but is
> more likely to be on the regulator output side, i.e., between the
> regulated voltage outputs from Iota and the corresponding power
> consumers in the Calypso itself or perhaps in the flash+XRAM chip.
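
For a first estimate of what to look for on a regulator output, the
sag while the output capacitor alone supplies a current step is roughly
dV = I * dt / C.  The Python sketch below evaluates that for two
capacitor values.  The step current and the regulator response time
are hypothetical numbers picked for illustration, not measurements or
Iota datasheet figures, and the model ignores ESR and ESL.

```python
def droop_volts(step_amps, response_s, c_farad):
    """Voltage sag when the capacitor alone supplies a current step."""
    return step_amps * response_s / c_farad

STEP_AMPS = 0.050    # HYPOTHETICAL load step at CPU clock re-enable
RESPONSE_S = 1e-6    # HYPOTHETICAL time before the LDO loop responds

if __name__ == "__main__":
    for label, c_farad in (("100 nF", 100e-9), ("1 uF", 1e-6)):
        dv = droop_volts(STEP_AMPS, RESPONSE_S, c_farad)
        print(f"{label}: about {dv * 1e3:.0f} mV of droop")
```

The point of the estimate is only that the droop scales inversely with
the output capacitance, so a scope capture on the regulated nets would
show whether a sag of that kind coincides with the reboot.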
>
> I have a sinking feeling that we may not be able to fix this hw bug,
> and we may have to publicly admit that our hardware is defective in a
> way which we are not able to fully understand, let alone fix, and tell
> our users to disable all sleep modes as a workaround.  Or try to hide
> the embarrassment by checking in a code change to our firmwares that
> disables sleep by default if the hardware target is FCDEV3B, and hope
> that no one notices...  However, if it is possible to "fix" the sleep
> mode self-reboot bug allopathically by upping some other
> capacitor(s) on one or more regulated power nets, then I would like to
> apply that allopathic fix to our next batch of boards.
>
> Right now I could really use some help from one of the other community
> members who has an FCDEV3B board and would be able to do some probing
> with an oscilloscope.  At the moment there are only two other FC
> community members who have FCDEV3B boards: Das Signal and whoever in
> the Serg+Kent+possible_others team currently has the board I sent to
> them.  (Harald Welte also has a board, but I doubt that he would be
> interested in helping fix something that only affects FreeCalypso and
> not OsmocomBB.)  If either of you happen to have access to an
> oscilloscope and the knowledge of how to operate it, as well as sharp
> eyes and a steady hand (or an assistant who can offer such) to hold an
> o'scope probe on one side of an 0402 capacitor, you could help the
> project by probing at several points to see if a voltage drop can be
> spotted when an AT+CFUN=1 command triggers a self-reboot as described
> earlier.  If you are able and willing to help, please let me know and
> I'll provide more detailed instructions as to exactly where to probe
> and what to look for.
>
> Hasta la Victoria, Siempre,
> Mychaela aka The Mother
> _______________________________________________
> Community mailing list
> Community at freecalypso.org
> https://www.freecalypso.org/mailman/listinfo/community
>



-- 
Regards,

Kent del Pino

   - - Embedded Programmer and Microelectronics Engineering - -

 Phone: +44 (0)7 544 141 228
 Skype: kent.del.pino
 Mail: kentdelpino at gmail.com

