comparison doc/Loadtools-performance @ 618:6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
| author | Mychaela Falconia <falcon@freecalypso.org> |
|---|---|
| date | Tue, 25 Feb 2020 18:40:00 +0000 |
| parents | 39b74c39d914 |
| children | 8c6e7b7e701c |
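
The text added in this changeset attributes the program-m0 slowdown to unit size: program-bin sends one AMFW or INFW command per 256 bytes of payload, whereas program-m0 sends one per S-record, which is 30 bytes for hex470-produced m0 images. A minimal sketch of that arithmetic follows, using the 2376448-byte GTA02 image size quoted in the document. The function name `command_count` is hypothetical and not part of fc-loadtool, and the sketch assumes every S-record carries a full 30 bytes of payload.

```python
def command_count(image_bytes: int, unit_bytes: int) -> int:
    """Return how many commands are needed if each carries unit_bytes of payload.

    Hypothetical helper for illustration only; it models nothing but
    the payload-per-command figures stated in the document.
    """
    if image_bytes < 0 or unit_bytes <= 0:
        raise ValueError("image_bytes must be >= 0 and unit_bytes must be > 0")
    # Integer ceiling division: a partial final unit still needs its own command.
    return -(-image_bytes // unit_bytes)


if __name__ == "__main__":
    image = 2376448  # GTA02 fw image size quoted in the document
    per_bin = command_count(image, 256)  # program-bin: 256-byte units
    per_m0 = command_count(image, 30)    # program-m0: one 30-byte S-record each
    print(f"program-bin commands: {per_bin}")
    print(f"program-m0 commands:  {per_m0}")
    print(f"ratio: {per_m0 / per_bin:.2f}")
```

Under these assumptions, the m0 path issues roughly 8.5 times as many command-response exchanges for the same payload, which matches the direction of the measured difference. The sketch does not attempt to predict the timings themselves.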
comparison
| 617:97fe41e9242a | 618:6824c4d55848 |
|---|---|
| 17 run times do depend on the host system and USB-serial adapter or other serial | 17 run times do depend on the host system and USB-serial adapter or other serial |
| 18 port hardware - this host system dependency exists because of the way these | 18 port hardware - this host system dependency exists because of the way these |
| 19 operations are implemented in our architecture. | 19 operations are implemented in our architecture. |
| 20 | 20 |
| 21 Here are some examples of expected flash programming times, all obtained on the | 21 Here are some examples of expected flash programming times, all obtained on the |
| 22 Mother's Slackware 14.2 host system: | 22 Mother's Slackware 14.2 host system, using the flash program-bin command as |
| 23 opposed to program-m0 or program-srec: | |
| 23 | 24 |
| 24 Flashing an Openmoko GTA02 modem (K5A3281CTM flash chip) with a new firmware | 25 Flashing an Openmoko GTA02 modem (K5A3281CTM flash chip) with a new firmware |
| 25 image (2376448 bytes), using a PL2303 USB-serial cable at 115200 baud: 7m35s | 26 image (2376448 bytes), using a PL2303 USB-serial cable at 115200 baud: 7m35s |
| 26 | 27 |
| 27 Flashing the same OM GTA02 modem with the same fw image, using a CP2102 | 28 Flashing the same OM GTA02 modem with the same fw image, using a CP2102 |
| 44 * The time it takes for the bits to be transferred over the serial link; | 45 * The time it takes for the bits to be transferred over the serial link; |
| 45 * The time it takes for the flash programming operation to complete on the | 46 * The time it takes for the flash programming operation to complete on the |
| 46 target (physics inside the flash chip); | 47 target (physics inside the flash chip); |
| 47 * The overhead of command-response exchanges between fc-loadtool and loadagent. | 48 * The overhead of command-response exchanges between fc-loadtool and loadagent. |
| 48 | 49 |
| 49 XRAM loading via fc-xram is similar to flash programming in that fc-xram sends | 50 If you are starting out with a firmware image in m0 format, converting it to |
| 50 a separate ML command to loadagent for each S-record, thus the total XRAM image | 51 binary with mokosrec2bin (like our FC Magnetite build system always does) and |
| 51 loading time is not only the serial bit transfer time, but also the overhead of | 52 then flashing via program-bin is faster than flashing the original m0 image |
| 52 command-response exchanges between fc-xram and loadagent. The flash programming | 53 directly via program-m0. Following the last example above of flashing a |
| 53 times listed above include flashing an FC Magnetite fw image into an FCDEV3B, | 54 Magnetite hybrid fw image into an FCDEV3B, the flashing operation via |
| 54 which took 2m11s; doing an fc-xram load of the same FC Magnetite fw image (built | 55 program-bin took 2m11s; flashing the same image via program-m0 took 3m54s. |
| 55 as ramimage.srec) into the same FCDEV3B via the same FT2232D adapter at 812500 | 56 |
| 56 baud takes 2m54s. | 57 Flashing via program-bin is faster than program-m0 or program-srec because the |
| 58 program-bin operation uses a larger unit size internally. fc-loadtool | |
| 59 implements all flash programming operations by sending AMFW or INFW commands to | |
| 60 loadagent; each AMFW or INFW command carries a string of 16-bit words to be | |
| 61 programmed. Our program-bin operation programs 256 bytes at a time, i.e., | |
| 62 sends one AMFW or INFW command per 256 bytes of image payload; our program-m0 | |
| 63 and program-srec operations program one S-record at a time, i.e., each S-record | |
| 64 in the source image turns into its own AMFW or INFW command to loadagent. In | |
| 65 the case of m0 images produced by TI's hex470 post-linker, each S-record carries | |
| 66 30 bytes of payload, thus flashing that m0 image directly with program-m0 will | |
| 67 proceed in 30-byte units, whereas converting it to binary and then flashing with | |
| 68 program-bin will proceed in 256-byte units. The smaller unit size slows down | |
| 69 the overall operation by increasing the overhead of command-response exchanges. | |
| 70 | |
| 71 XRAM loading via fc-xram is similar to flash program-m0 and program-srec in that | |
| 72 fc-xram sends a separate ML command to loadagent for each S-record, thus the | |
| 73 total XRAM image loading time is not only the serial bit transfer time, but also | |
| 74 the overhead of command-response exchanges between fc-xram and loadagent. Going | |
| 75 back to the same FC Magnetite fw image that can be flashed into an FCDEV3B in | |
| 76 2m11s via program-bin or in 3m54s via program-m0, doing an fc-xram load of that | |
| 77 same fw image (built as ramimage.srec) into the same FCDEV3B via the same | |
| 78 FT2232D adapter at 812500 baud takes 2m54s - thus we can see that fc-xram | |
| 79 loading is faster than flash program-m0 or program-srec, but slower than flash | |
| 80 program-bin. | |
| 57 | 81 |
| 58 Why does XRAM loading take longer than flashing? Shouldn't it be faster because | 82 Why does XRAM loading take longer than flashing? Shouldn't it be faster because |
| 59 the flash programming step on the target is replaced with a simple memcpy()? | 83 the flash programming step on the target is replaced with a simple memcpy()? |
| 60 Answer: fc-xram is currently slower than flash program-bin because the latter | 84 Answer: fc-xram is currently slower than flash program-bin because the latter |
| 61 sends 256 bytes at a time to loadagent, whereas fc-xram sends one S-record at a | 85 sends 256 bytes at a time to loadagent, whereas fc-xram sends one S-record at a |
