# HG changeset patch
# User Mychaela Falconia
# Date 1583638991 0
# Node ID e66fafeeb37701133a16a7f22abbb5651b091be5
# Parent 815c3f8bcff1d7bc3f96b170a5ba40a00cd70a0c
doc/Loadtools-performance: new faster flash operations

diff -r 815c3f8bcff1 -r e66fafeeb377 doc/Loadtools-performance
--- a/doc/Loadtools-performance	Sun Mar 08 01:47:57 2020 +0000
+++ b/doc/Loadtools-performance	Sun Mar 08 03:43:11 2020 +0000
@@ -1,15 +1,20 @@
-Dumping and programming flash
-=============================
+Memory dump performance
+=======================
 
 Here are the expected run times for the flash dump2bin operation of dumping the
-entire flash content of a Calypso GSM device:
+entire flash content of a Calypso GSM device with the current version of
+fc-loadtool which uses the new binary transfer protocol:
 
 Dump of 4 MiB flash (e.g., Openmoko GTA01/02 or Mot C139/140) at 115200 baud:
-12m53s
+6m4s
+
+The same 4 MiB flash dump at 812500 baud: 0m52s
 
-The same 4 MiB flash dump at 812500 baud: 1m50s
+Dump of 8 MiB flash (e.g., Mot C155/156) at 812500 baud: 1m44s
 
-Dump of 8 MiB flash (e.g., Mot C155/156) at 812500 baud: 3m40s
+These times are a 2x improvement compared to all previous versions of
+fc-loadtool (prior to fc-host-tools-r13) which used a hex-based transfer
+protocol.
 
 Because of the architecture of fc-loadtool and its loadagent back-end, the run
 time of a flash dump operation depends only on the serial baud rate and the
@@ -21,34 +26,65 @@
 port hardware - this host system dependency exists because of the way these
 operations are implemented in our architecture.
 
+Flash programming operations
+============================
+
 Here are some examples of expected flash programming times, all obtained on the
 Mother's Slackware 14.2 host system:
 
 Flashing an Openmoko GTA02 modem (K5A3281CTM flash chip) with a new firmware
-image (2376448 bytes), using a PL2303 USB-serial cable at 115200 baud: 7m35s
+image (2376448 bytes), using a PL2303 USB-serial cable at 115200 baud: 0m19s to
+erase 37 sectors, 3m45s to program the image.
 
 Flashing the same OM GTA02 modem with the same fw image, using a CP2102
-USB-serial cable at 812500 baud: 1m52s
+USB-serial cable at 812500 baud: 0m19s to erase, 0m51s to program.
 
 Flashing a Magnetite hybrid fw image (2378084 bytes) into an FCDEV3B board
-(S71PL129N flash chip) via an FT2232D adapter at 812500 baud: 2m11s
+(S71PL129N flash chip) via an FT2232D adapter at 812500 baud: 0m24s to erase
+13 sectors (4 small and 9 large), 1m27s to program the image.
+
+Regardless of whether you execute these two steps separately or use one of our
+new flash e-program-{bin,m0,srec} commands, flash programming is always done in
+two steps: first the erase operation covering the needed range of sectors, then
+the actual programming operation that includes the data transfer.
 
-These times are just for the flash program-bin operation, not counting the
-flash erase which must be done first. Flash erase times are determined
-entirely by physical processes inside the flash chip and are not affected by
-software design or the serial link: for each sector to be erased, fc-loadtool
-issues the sector erase command to the flash chip and then polls the chip for
-operation completion status; the polling is done over the serial link and thus
-may seem very slow, but the extra bit of latency added by the finite polling
-speed is still negligible compared to the time of the actual sector erase
-operation inside the flash chip. In contrast, the execution time of a flash
-program-bin operation is a sum of 3 components:
+Flash erase times are determined entirely by physical processes inside the
+flash chip and thus should not be affected by software design or the serial
+link: for each sector to be erased, fc-loadtool issues the sector erase command
+to the flash chip and then polls the chip for operation completion status; the
+polling is done over the serial link and thus may seem very slow, but the extra
+bit of latency added by the finite polling speed is still negligible (at least
+on the Mother's Slackware system) compared to the time of the actual sector
+erase operation inside the flash chip. One remaining flaw is that in our
+current implementation the issuance of each individual sector erase command to
+the flash chip takes 6 command-response exchanges between fc-loadtool and
+loadagent; on my Slackware host system this extra overhead is still negligible
+compared to the 0.5s or more for the actual erase operation time, but this
+overhead may become more significant on host systems with higher latency.
+
+After the erase operation, the execution time of the main flash programming
+operation is a sum of 3 components:
 
 * The time it takes for the bits to be transferred over the serial link;
 * The time it takes for the flash programming operation to complete on the
   target (physics inside the flash chip);
 * The overhead of command-response exchanges between fc-loadtool and loadagent.
 
+Because image data transfer is taking place in this step, flash programming at
+812500 baud is faster than 115200 baud, although it is not the same 7x
+improvement as happens with flash dumps. The present version of fc-loadtool
+also uses a new binary transfer protocol instead of the hex-based one used in
+previous versions (prior to fc-host-tools-r13); this change produces a 2x
+improvement for OM GTA02 flashing, but only a smaller improvement for FCDEV3B
+flashing.
+
+Notice the difference in flash programming times between GTA02 and FCDEV3B: the
+fw image size is almost exactly the same, any difference in latency between
+CP2102 and FT2232D is less likely to produce such significant time difference
+given our current 2048 byte transfer block size, thus the difference in physical
+flash program operation times between K5A3281CTM and S71PL129N flash chips seems
+to be the most likely explanation.
+
 
 Programming flash using program-m0 or program-srec
 ==================================================