# HG changeset patch
# User Mychaela Falconia
# Date 1582656000 0
# Node ID 6824c4d5584890ab5495273754d7f8e26950b507
# Parent  97fe41e9242ad3f308b13b4d8aa5bd1cb1442ec2
doc/Loadtools-performance: program-m0 slowness documented

diff -r 97fe41e9242a -r 6824c4d55848 doc/Loadtools-performance
--- a/doc/Loadtools-performance	Tue Feb 25 07:01:28 2020 +0000
+++ b/doc/Loadtools-performance	Tue Feb 25 18:40:00 2020 +0000
@@ -19,7 +19,8 @@
 operations are implemented in our architecture.
 
 Here are some examples of expected flash programming times, all obtained on the
-Mother's Slackware 14.2 host system:
+Mother's Slackware 14.2 host system, using the flash program-bin command as
+opposed to program-m0 or program-srec:
 
 Flashing an Openmoko GTA02 modem (K5A3281CTM flash chip) with a new firmware
 image (2376448 bytes), using a PL2303 USB-serial cable at 115200 baud: 7m35s
@@ -46,14 +47,37 @@
   target (physics inside the flash chip);
 * The overhead of command-response exchanges between fc-loadtool and loadagent.
 
-XRAM loading via fc-xram is similar to flash programming in that fc-xram sends
-a separate ML command to loadagent for each S-record, thus the total XRAM image
-loading time is not only the serial bit transfer time, but also the overhead of
-command-response exchanges between fc-xram and loadagent.  The flash programming
-times listed above include flashing an FC Magnetite fw image into an FCDEV3B,
-which took 2m11s; doing an fc-xram load of the same FC Magnetite fw image (built
-as ramimage.srec) into the same FCDEV3B via the same FT2232D adapter at 812500
-baud takes 2m54s.
+If you are starting out with a firmware image in m0 format, converting it to
+binary with mokosrec2bin (like our FC Magnetite build system always does) and
+then flashing via program-bin is faster than flashing the original m0 image
+directly via program-m0.  Following the last example above of flashing a
+Magnetite hybrid fw image into an FCDEV3B, the flashing operation via
+program-bin took 2m11s; flashing the same image via program-m0 took 3m54s.
+
+Flashing via program-bin is faster than program-m0 or program-srec because the
+program-bin operation uses a larger unit size internally.  fc-loadtool
+implements all flash programming operations by sending AMFW or INFW commands to
+loadagent; each AMFW or INFW command carries a string of 16-bit words to be
+programmed.  Our program-bin operation programs 256 bytes at a time, i.e.,
+sends one AMFW or INFW command per 256 bytes of image payload; our program-m0
+and program-srec operations program one S-record at a time, i.e., each S-record
+in the source image turns into its own AMFW or INFW command to loadagent.  In
+the case of m0 images produced by TI's hex470 post-linker, each S-record carries
+30 bytes of payload, thus flashing that m0 image directly with program-m0 will
+proceed in 30-byte units, whereas converting it to binary and then flashing with
+program-bin will proceed in 256-byte units.  The smaller unit size slows down
+the overall operation by increasing the overhead of command-response exchanges.
+
+XRAM loading via fc-xram is similar to flash program-m0 and program-srec in that
+fc-xram sends a separate ML command to loadagent for each S-record, thus the
+total XRAM image loading time is not only the serial bit transfer time, but also
+the overhead of command-response exchanges between fc-xram and loadagent.  Going
+back to the same FC Magnetite fw image that can be flashed into an FCDEV3B in
+2m11s via program-bin or in 3m54s via program-m0, doing an fc-xram load of that
+same fw image (built as ramimage.srec) into the same FCDEV3B via the same
+FT2232D adapter at 812500 baud takes 2m54s - thus we can see that fc-xram
+loading is faster than flash program-m0 or program-srec, but slower than flash
+program-bin.
 
 Why does XRAM loading take longer than flashing?  Shouldn't it be faster because
 the flash programming step on the target is replaced with a simple memcpy()?
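
As a side note to the unit-size explanation added by this patch, here is a rough Python sketch of how the number of command-response exchanges scales with the programming unit. It is only an illustration under stated assumptions: the function name command_count is hypothetical, the image is assumed to be contiguous, every S-record is assumed to carry exactly 30 bytes, and the 2376448-byte size is borrowed from the GTA02 example purely as a sample figure (the size of the Magnetite image is not given here).

```python
def command_count(image_bytes: int, unit_bytes: int) -> int:
    """Number of AMFW/INFW-style commands needed if each command
    carries at most unit_bytes of payload (hypothetical helper)."""
    if unit_bytes <= 0:
        raise ValueError("unit_bytes must be positive")
    # Integer ceiling division: a partial final unit still costs one command.
    return (image_bytes + unit_bytes - 1) // unit_bytes


if __name__ == "__main__":
    image_size = 2376448  # sample size only, taken from the GTA02 example

    bin_cmds = command_count(image_size, 256)  # program-bin: 256-byte units
    m0_cmds = command_count(image_size, 30)    # program-m0: 30-byte S-records

    print("program-bin commands:", bin_cmds)
    print("program-m0 commands: ", m0_cmds)
    print("ratio: %.2f" % (m0_cmds / bin_cmds))
```

The sketch counts exchanges only; it does not model serial transfer time or flash physics, so it cannot predict the measured 2m11s versus 3m54s figures, only why the per-command overhead grows when the unit shrinks.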