comparison doc/Firmware_Architecture @ 868:d92b110e06e0
doc/Firmware_Architecture written
| author | Space Falcon <falcon@ivan.Harhan.ORG> |
|---|---|
| date | Sun, 17 May 2015 03:45:19 +0000 |
| parents | |
| children |
comparison
| 867:c4da570dca83 | 868:d92b110e06e0 |
|---|---|
| 1 Our FreeCalypso GSM firmware follows the same architecture as TI's TCS211; | |
| 2 this document is an attempt to describe this architecture. | |
| 3 | |
| 4 Nucleus environment | |
| 5 =================== | |
| 6 | |
| 7 Like all classic TI firmwares, ours is based on the Nucleus PLUS RTOS. Just | |
| 8 like TI's original code on which we are based, we use only a small subset of | |
| 9 the functionality provided by Nucleus - but because the latter is a library, | |
| 10 the pieces we don't use simply don't get pulled into the link. The main | |
| 11 function we get out of Nucleus is the scheduling of threads, or tasks as | |
| 12 Nucleus calls them. | |
| 13 | |
| 14 Our entry point code as we receive control from the Calypso boot ROM or from | |
| 15 other bootloaders on crippled targets or from loadagent in the case of fc-xram | |
| 16 loadable builds does some absolutely minimal initialization (set up sensible | |
| 17 memory access timings, copy iram.text to IRAM and .data to XRAM if we are | |
| 18 booting from flash, zero out our two bss segments (int.bss and ext.bss)) and | |
| 19 jumps to Nucleus' assembly init entry point. Prior to jumping to Nucleus, we | |
| 20 don't even have a stack (all init code prior to that point is pure assembly and | |
| 21 uses only ARM registers); Nucleus then sets up the stack pointer for everything | |
| 22 running under its control. | |
| 23 | |
| 24 Aside from just a few exceptions (ARM exception handlers come to mind, never | |
| 25 mind the pun), every piece of code in the firmware executes in one of the | |
| 26 following contexts: | |
| 27 | |
| 28 * Application_Initialize(): this function and everything called from it execute | |
| 29 just before Nucleus' thread scheduler starts; at this point interrupts are | |
| 30 disabled at the ARM7 core level (in the CPSR) and must not be enabled; the | |
| 31 stack is Nucleus' "system stack" which is also used by the scheduler and LISRs | |
| 32 as explained below. | |
| 33 | |
| 34 * Regular threads or tasks: once Application_Initialize() finishes, all code | |
| 35 with the exception of interrupt handlers (LISRs and HISRs as explained below) | |
| 36 runs in the context of some Nucleus task. Whenever you are trying to debug | |
| 37 or simply understand some piece of code in the firmware, the first question | |
| 38 you should ask is "which task does this code execute in?". Most functional | |
| 39 components run in their own tasks, i.e., a given piece of code is only | |
| 40 intended to run within the Nucleus task that belongs to the component in | |
| 41 question. On the other hand, some components are implemented as APIs, | |
| 42 functions to be called from other components: these don't have their own task | |
| 43 associated with them, and instead they run in the context of whatever task | |
| 44 they were called from. Some only get called from one task: for example, the | |
| 45 "uartfax" driver API calls only get called from the protocol stack's UART | |
| 46 entity, which is its own task. Other component API functions like FFS and | |
| 47 trace can get called from just about any task in the system. Many components | |
| 48 have both their own task and some API functions to be called from other tasks, | |
| 49 and the API functions oftentimes post messages to the task to be worked on by | |
| 50 the latter; the just-mentioned FFS and trace functions work in this manner. | |
| 51 | |
| 52 In our current GSM firmware (just like in TCS211) every Nucleus task is | |
| 53 created either through Riviera or through GPF, and not in any other way - see | |
| 54 the description of Riviera and GPF below. | |
| 55 | |
| 56 * LISRs (Low level Interrupt Service Routines): these are the interrupt handlers | |
| 57 that run immediately when an ARM IRQ or FIQ comes in. The code at the IRQ and | |
| 58 FIQ vector entry points calls Nucleus' magic stack switching function | |
| 59 (switches the CPU from IRQ/FIQ into SVC mode, saves the interrupted thread's | |
| 60 registers on that thread's stack, and switches to the "system" stack) and | |
| 61 then calls TI's IRQ dispatcher implemented in C. The latter figures out | |
| 62 which Calypso interrupt needs to be handled and calls the handler configured | |
| 63 in the compiled-in table. Nucleus' LISR registration framework is not used | |
| 64 by the GSM fw, but these interrupt handlers should be viewed as LISRs | |
| 65 nonetheless. | |
| 66 | |
| 67 There is one additional difference between canonical Nucleus and TI's version | |
| 68 (we've replicated the latter): canonical Nucleus was designed to support | |
| 69 nested LISRs, i.e., IRQs re-enabled in the magic stack switching function, | |
| 70 but in TI's version which we follow this IRQ re-enabling is removed: each LISR | |
| 71 runs with interrupts disabled and cannot be interrupted. (The corner case of | |
| 72 an FIQ interrupting an IRQ remains to be looked at more closely as bugs may be | |
| 73 hiding there, but Calypso doesn't really use FIQ interrupts.) There is really | |
| 74 no need for LISR nesting in our GSM fw, as each LISR is very short: most LISRs | |
| 75 do nothing more than trigger the corresponding HISR. | |
| 76 | |
| 77 * HISRs (High level Interrupt Service Routines): these hold an intermediate | |
| 78 place between LISRs and tasks, similar to softirqs in the Linux kernel. A | |
| 79 HISR can be activated by a LISR calling NU_Activate_HISR(), and when the LISR | |
| 80 returns, the HISR will run before the interrupted task (or some higher | |
| 81 priority task, see below) can resume. HISRs run with CPU interrupts enabled, | |
| 82 thus more interrupts can occur, with their LISRs executing and possibly | |
| 83 triggering other HISRs. All triggered HISRs must complete and thereby go | |
| 84 "quiescent" before task scheduling resumes, i.e., all HISRs as a group have a | |
| 85 higher scheduling priority than tasks. | |
| 86 | |
| 87 Nucleus implements priority scheduling for tasks. Tasks have their priority set | |
| 88 when they are created (through Riviera or GPF, see below), and a higher priority | |
| 89 task will run until it gets blocked waiting for something, at which time lower | |
| 90 priority tasks will run. If a lower priority task sends a message to a higher | |
| 91 priority task, unblocking the latter which was waiting for incoming messages, | |
| 92 the lower priority task will effectively suspend itself immediately while the | |
| 93 higher priority task runs to process the message it was sent. | |
| 94 | |
| 95 HISRs oftentimes post messages to their associated tasks as well; if one of | |
| 96 these messages unblocks a higher priority task, that unblocked task will run | |
| 97 upon the completion of the HISR instead of the original lower priority task | |
| 98 that was interrupted by the LISR that triggered the HISR. Nucleus' scheduler | |
| 99 is fun! | |
| 100 | |
| 101 Major functional blocks | |
| 102 ======================= | |
| 103 | |
| 104 At the highest level, all code in TI's classic firmwares and in our FreeCalypso | |
| 105 fw can be divided into 3 broad groupings: | |
| 106 | |
| 107 * GSM Layer 1: this code was developed by TI, is highly specific to TI's | |
| 108 baseband chipset family in general and to specific individual chips in | |
| 109 particular (the code is liberally sprinkled with conditional compilation | |
| 110 based on DBB type, ABB type, DSP ROM version and so on), and is absolutely | |
| 111 necessary in order to operate a Calypso device as a GSM MS (mobile station) | |
| 112 and not merely as a general purpose microprocessor platform. This code can | |
| 113 be considered to be the most important part of the entire firmware. | |
| 114 | |
| 115 L1 interties with Nucleus and with the G23M stack (with which it needs to | |
| 116 communicate) in a very peculiar way described later in this article. | |
| 117 | |
| 118 * G23M protocol stack: at the beginning of TI's involvement in the GSM baseband | |
| 119 chipset business, they only developed and maintained their own L1 code, while | |
| 120 the rest of the protocol stack (which is hardware-independent) was licensed | |
| 121 from another company called Condat. Later Condat as a company was fully | |
| 122 acquired by TI, and the once-customer of this code became its owner. The | |
| 123 name of TI/Condat's implementation of GSM layers 2&3 for the MS side is G23M, | |
| 124 and it forms its own major division of the overall fw architecture. | |
| 125 | |
| 126 Underlying the G23M stack is a special layer called GPF, which was originally | |
| 127 Condat's Generic Protocol stack Framework. Apparently Condat was in the | |
| 128 business of developing and maintaining a whole bunch of protocol stacks: GSM | |
| 129 MS side, GSM network side, TETRA and who knows what else. GPF was their | |
| 130 common underpinning for all of their protocol stack projects, which ran on top | |
| 131 of many different OS environments: Nucleus, pSOS, VxWorks, Unix/Linux, Win32 | |
| 132 and who knows what else. | |
| 133 | |
| 134 In the case of FreeCalypso GSM fw, both the protocol stack and the underlying | |
| 135 OS environment are fixed: GSM and Nucleus, respectively. But GPF is still a | |
| 136 critically important layer in the firmware architecture: in addition to | |
| 137 serving as the glue between the G23M stack and Nucleus, it provides some | |
| 138 important support infrastructure for the protocol stack. | |
| 139 | |
| 140 * Miscellaneous peripheral accessories: under this category I (Space Falcon) | |
| 141 place everything implemented through TI's Riviera framework. Historical | |
| 142 evidence indicates that TI's earliest firmwares did not have this part, i.e., | |
| 143 Riviera and everything built on top of it is a "non-essential" later | |
| 144 addition. It appears that TI originally invented Riviera in order to support | |
| 145 the development of fancy "feature phone" UI/application layers, complete with | |
| 146 Java, MMS, WAP, games and whatnot - things upon which our FreeCalypso project | |
| 147 looks with disdain - but in the TCS211 firmware from 2007 which I used as the | |
| 148 reference for FreeCalypso this Riviera framework serves as the foundation for | |
| 149 some small but essential pieces of functionality: the FFS implementation, the | |
| 150 SPI-based ABB access driver, the RTC driver and the debug trace facility. | |
| 151 | |
| 152 While it is certain that TI had some non-Riviera implementation of the just- | |
| 153 listed essential pieces in their earliest pre-Riviera days, trying to find | |
| 154 surviving sources from those days would be a "mission impossible" task. OTOH, | |
| 155 reusing the Riviera code from TCS211 was quite easy, as the copy of TCS211 we | |
| 156 got has it in full source form with nothing omitted. Therefore, I took the | |
| 157 sensible easy road and kept Riviera in FreeCalypso. | |
| 158 | |
| 159 The above division of the firmware into 3 broad functional groupings also | |
| 160 corresponds quite neatly with where each piece of our source code originally | |
| 161 came from. Our versions of L1 and G23M came in their entirety from TI's TCS3.2 | |
| 162 program targeting their later LoCosto chipset (specifically from the | |
| 163 TCS3.2_N5.24_M18_V1.11_M23BTH_PSL1_src.zip release from Peek/FGW), whereas | |
| 164 everything in the 3rd division (Riviera and everything built on top of it) came | |
| 165 from our TCS211/Leonardo source from Sotovik. | |
| 166 | |
| 167 The just-listed divisions of the firmware are really separate software | |
| 168 environments which are linked together into one final image, but which have | |
| 169 very little in the way of interties. Each of the 3 realms has its own very | |
| 170 different coding style, its own set of header files and its own defined types. | |
| 171 It is very rare for a module from one realm to include any header files or call | |
| 172 any functions from another realm, and while they all ultimately run on top of | |
| 173 Nucleus, they interface with Nucleus in different ways: G23M goes through GPF, | |
| 174 everything in Riviera land goes through Riviera, and L1 uses its own bizarre | |
| 175 mechanism which in our fw ends up going through GPF but hasn't always been this | |
| 176 way - to be explained later in this article. | |
| 177 | |
| 178 Also note that there is no mention of any handset UI code (or MMI in the GSM | |
| 179 industry's sexist speak) in the above breakdown of code divisions. This | |
| 180 document describes the architecture of TI's modem firmware in which the highest | |
| 181 layer is the AT command interface (part of the G23M suite, or its uppermost | |
| 182 layer to be precise), and which does not include any UI code. Our TI reference | |
| 183 sources do include their "MMI" code, but I haven't studied it closely enough | |
| 184 yet to comment on it properly, and the version of TCS211 which serves as our | |
| 185 primary reference is set up for the modem configuration without this "MMI" part. | |
| 186 Making sense of TI's "MMI" code is a task to be tackled later in the project | |
| 187 when we have a working modem and are ready to start building a usable handset | |
| 188 with UI. | |
| 189 | |
| 190 Riviera and GPF | |
| 191 =============== | |
| 192 | |
| 193 Riviera and GPF are two parallel/independent/competing wrappers around or | |
| 194 layers above Nucleus. The way in which they are treated in our FreeCalypso fw | |
| 195 architecture is somewhat inverted: originally GPF was the essential framework | |
| 196 underlying the G23M stack (and to which L1 was also attached in a hacky way) | |
| 197 while Riviera was added to support non-essential frills, but in our current FC | |
| 198 fw Riviera is always included just like Nucleus, whereas GPF only needs to be | |
| 199 included in the build when building with feature gsm (full GSM MS functionality) | |
| 200 or feature l1stand (L1 standalone) - but is not needed if one wishes to build | |
| 201 an "in vivo" FFS editing agent, for example. | |
| 202 | |
| 203 This peculiar arrangement happened because of the source code availability | |
| 204 situation we found ourselves in. TCS211 uses real Riviera that is fully | |
| 205 independent of GPF (see below), and our copy thereof came with this part in | |
| 206 full source form. On the other hand, we never got the complete original source | |
| 207 for GPF in one piece, thus our FC version of GPF had to be reconstructed from | |
| 208 bits and pieces. For this reason I made the decision early on to include | |
| 209 Riviera and some RV-based components in the "mandatory core" part of our FC fw | |
| 210 architecture, while leaving GPF to be worked on later. And when I did get to | |
| 211 reintegrating GPF, at that point it was natural to make it into an "optional" | |
| 212 component that is included only when needed. | |
| 213 | |
| 214 At some point in their post-Calypso TCS3.x program TI decided to eliminate | |
| 215 Riviera as an independent framework and to reimplement Riviera APIs (used by | |
| 216 peripheral but necessary code such as FFS, ETM, various drivers etc) over GPF. | |
| 217 This arrangement is used in the TCS3.2 LoCosto code from which we lifted our | |
| 218 versions of L1 and G23M. However, I (Space Falcon) chose not to adopt this | |
| 219 approach for FreeCalypso, and mimic the TCS211 way (Riviera entirely | |
| 220 independent of GPF) instead. The reasons were twofold: (1) there was no full | |
| 221 source for GPF and a painstaking reconstruction effort was required before we | |
| 222 could have our own working version of GPF in our gcc-built fw, and (2) I felt | |
| 223 more comfortable and familiar with following TCS211. | |
| 224 | |
| 225 Start-up process | |
| 226 ================ | |
| 227 | |
| 228 I mentioned earlier that every Nucleus task in our firmware gets created and | |
| 229 started either through Riviera or through GPF. All GPF tasks are created and | |
| 230 placed into the runnable state in the Application_Initialize() context: the work | |
| 231 is done by GPF init code in gsm-fw/gpf/frame/frame.c, and the top level GPF | |
| 232 init function called from Application_Initialize() is StartFrame(). Thus when | |
| 233 Application_Initialize() finishes and the Nucleus thread scheduler starts | |
| 234 running for the first time, all GPF tasks are there to be scheduled. | |
| 235 | |
| 236 There is a compiled-in table of all protocol stack entities and the tasks in | |
| 237 which they need to run which (in our fw) lives under gsm-fw/gpf/conf and which | |
| 238 logically belongs to GPF. Canonically each protocol stack entity runs in its | |
| 239 own task, but sometimes two or more are combined to run in the same task: for | |
| 240 example, in the minimal GSM "voice only" configuration (no CSD, fax or GPRS) | |
| 241 CC, SMS and SS entities share the same task named CM. Unlike Riviera, GPF does | |
| 242 not support dynamic starting and stopping of tasks. | |
| 243 | |
| 244 As each GPF task starts running (immediately upon entry into Nucleus' scheduling | |
| 245 loop as Application_Initialize() finishes), the pf_TaskEntry() function in | |
| 246 gsm-fw/gpf/frame/frame.c is the first code it runs. This function creates the | |
| 247 queue for messages to be sent to all entities running within the task in | |
| 248 question, calls each entity's pei_init() function (repeatedly until it succeeds: | |
| 249 it will fail until the other entities to which this entity needs to send | |
| 250 messages have created their message queues), and then falls into the main body | |
| 251 of the task: for all "regular" entities/tasks except L1, this main body consists | |
| 252 of waiting for messages (or signals or timeouts) to arrive on the queue and | |
| 253 dispatching each received message to the appropriate handler in the right | |
| 254 entity. | |
| 255 | |
| 256 Riviera tasks get started in a different way. The same Application_Initialize() | |
| 257 function that calls StartFrame() to create and start all GPF tasks also calls | |
| 258 create_tasks() (found in gsm-fw/riviera/init/create_RVtasks.c), the appinit-time | |
| 259 function for starting the Riviera environment. But this function does not | |
| 260 create and start every configured Riviera task like StartFrame() does for GPF. | |
| 261 Instead it creates a special helper task which will do this work once scheduled. | |
| 262 Thus at the completion of Application_Initialize() and the beginning of | |
| 263 scheduling the set of runnable Nucleus tasks consists of all GPF ones plus the | |
| 264 special RV starter task. Once the RV starter task gets scheduled, it will call | |
| 265 rvm_start_swe() to launch every configured Riviera SWE (SoftWare Entity), which | |
| 266 in turn entails creating the tasks in which these SWEs are to run. | |
| 267 | |
| 268 Dynamic memory allocation | |
| 269 ========================= | |
| 270 | |
| 271 All dynamic memory allocation (i.e., all RAM usage beyond statically allocated | |
| 272 variables and buffers) is once again done either through Riviera or through GPF, | |
| 273 and in no other way. Ultimately all areas of the physical RAM that will ever | |
| 274 be used by the fw in any way are allocated when the fw is compiled and linked: | |
| 275 the areas from which Riviera and GPF serve their dynamic memory allocations are | |
| 276 statically allocated as char arrays in the respective C modules and placed in | |
| 277 the int.ram or ext.ram section as appropriate; Riviera and GPF then provide | |
| 278 API functions that allocate memory dynamically from these statically allocated | |
| 279 large pools. | |
| 280 | |
| 281 Riviera and GPF have entirely separate memory pools from which they serve their | |
| 282 respective clients, hence there is no possibility of one affecting the other. | |
| 283 Riviera's memory allocation scheme is very much like the classic malloc&free: | |
| 284 there is one large unstructured pool from which all allocations are made, one | |
| 285 can allocate a chunk of any size, free chunks are merged when physically | |
| 286 adjacent, and fragmentation is an issue: a memory allocation request may fail | |
| 287 even when there is enough memory available in total if it is too fragmented. | |
| 288 | |
| 289 GPF's dynamic memory allocation facility is considerably more robust: while it | |
| 290 does maintain one or two (depending on configuration) memory pools of the | |
| 291 traditional "dynamic" kind (like malloc&free, susceptible to fragmentation), | |
| 292 most GPF memory allocation works on "partition" memory instead. Here GPF | |
| 293 maintains 3 separate groups of pools: PRIM, TEST and DMEM; each allocation | |
| 294 request must specify the appropriate pool group and cannot affect the others. | |
| 295 Within each pool there is a fixed number of partitions of a fixed size: for | |
| 296 example, in TI's TCS211 GSM+GPRS configuration the PRIM pool group consists of | |
| 297 190 partitions of 60 bytes, 110 partitions of 128 bytes, 50 partitions of 632 | |
| 298 bytes and 7 partitions of 1600 bytes. An allocation request from a given pool | |
| 299 group (e.g., PRIM) can request any arbitrary size in bytes, but it gets rounded | |
| 300 up to the nearest partition size and allocated out of the respective pool. If | |
| 301 no free partitions are available, the requesting task is suspended until another | |
| 302 task frees one. Because these partitions are used primarily for intertask | |
| 303 communication, if none are free, it can only mean (assuming that the firmware | |
| 304 functions correctly) that all partitions have been allocated and sent to some | |
| 305 queue for some task to work on, hence eventually they will get freed. | |
| 306 | |
| 307 This scheme implemented in GPF is extremely robust in the opinion of this | |
| 308 author, and the other purely "dynamic" scheme is used (in the case of GPF) only | |
| 309 for init-time allocations which are never freed, such as task stacks - hence | |
| 310 the GPF-based part of the firmware is not susceptible at all to the problem of | |
| 311 memory fragmentation. But Riviera does suffer from this problem, and the | |
| 312 concern is more than just theoretical: one major user of Riviera-based dynamic | |
| 313 memory allocation is the trace facility (described in its own section below), | |
| 314 and my observation of the trace output from Pirelli's proprietary fw (which | |
| 315 appears to use the same architecture with separate Riviera and GPF) suggests | |
| 316 that after the fw has been running for a while, Riviera memory gets fragmented | |
| 317 to a point where many traces are being dropped. Replacing Riviera's poor | |
| 318 dynamic memory allocation scheme with a GPF-like partition-based one is a to-do | |
| 319 item for our project. | |
| 320 | |
| 321 Message-based intertask communication | |
| 322 ===================================== | |
| 323 | |
| 324 Even though all entities of the G23M protocol stack are linked together into | |
| 325 one monolithic fw image and there is nothing to stop them from calling each | |
| 326 other's functions and accessing each other's variables, they don't work that | |
| 327 way. Instead all communication between entities is done through messages, just | |
| 328 as if they ran in separate address spaces or even on separate processors. | |
| 329 Buffers for this message exchange are allocated from a GPF partition pool: an | |
| 330 entity that needs to send a message to another entity allocates a buffer of the | |
| 331 needed size, fills it with the message to be sent, and posts it on the recipient | |
| 332 entity's message queue, all through GPF services. The other entity simply | |
| 333 processes the stream of messages that arrives on its message queue, freeing each | |
| 334 message (returning the buffer to the partition pool it came from) as it is | |
| 335 processed. | |
| 336 | |
| 337 Riviera-based tasks use a similar mechanism: unlike G23M protocol stack | |
| 338 entities, most Riviera-based functional modules provide APIs that are called as | |
| 339 functions from other tasks, but these API functions typically allocate a memory | |
| 340 buffer (through Riviera), fill it with the call parameters, and post it to the | |
| 341 associated task's message queue (also in the Riviera land) to be worked on. | |
| 342 Once the worker task gets the job done, it will either call a callback function | |
| 343 or post a response message back to the requestor - the latter option is only | |
| 344 possible if the requesting entity is also Riviera-based. | |
| 345 | |
| 346 A closer look at GPF | |
| 347 ==================== | |
| 348 | |
| 349 There are certain sublayers within GPF which need to be pointed out. The 3 | |
| 350 major subdivisions within GPF are: | |
| 351 | |
| 352 * The meaty core of GPF: this part is the code under gsm-fw/gpf/frame in our | |
| 353 source tree. It appears that this part was originally intended to be both | |
| 354 project-independent (same for GSM, TETRA etc) and OS-independent (same for | |
| 355 Nucleus, pSOS, VxWorks etc). This is the part of GPF that matters for the | |
| 356 G23M stack: all APIs called by PS entities are implemented here, and so are | |
| 357 all other PS-facing functions such as startup. (PS = protocol stack) | |
| 358 | |
| 359 * OS adaptation layer (OSL): this is the part of GPF that adapts it to a given | |
| 360 underlying OS, in our case Nucleus. | |
| 361 | |
| 362 * Test interface: see the code under gsm-fw/gpf/tst_drv and gsm-fw/gpf/tst_pei. | |
| 363 This part handles the trace output from all entities that run under GPF and | |
| 364 the mechanism for sending external debug commands to the GPF+PS subsystem. | |
| 365 | |
| 366 GPF was a difficult step in our GSM firmware reintegration process because no | |
| 367 complete source for it could be found anywhere: apparently GPF was so stable | |
| 368 and so independent of firmware particulars (Calypso or LoCosto, GSM only or | |
| 369 GSM+GPRS, modem or complete phone with UI etc) that it appears to have been | |
| 370 used and distributed as prebuilt binary libraries even inside TI. All TI fw | |
| 371 (semi-)sources we have use GPF in prebuilt library form and are not set up to | |
| 372 recompile any part of it from source. (They had to include all GPF header | |
| 373 files though, as most of them are included by G23M C modules, and it would be | |
| 374 too much hassle to figure out which ones are or aren't needed, hence all were | |
| 375 included.) | |
| 376 | |
| 377 Fortunately though, we were able to find the sources for most parts of GPF: | |
| 378 | |
| 379 * The LoCosto source in TCS3.2_N5.24_M18_V1.11_M23BTH_PSL1_src.zip features the | |
| 380 source for the "core" part of GPF under gpf/FRAME - these sources aren't | |
| 381 actually used by that fw's build system (it only uses the prebuilt binary | |
| 382 libs for GPF), but they are there. | |
| 383 | |
| 384 * Our TCS211 semi-src doesn't have any sources for the core part of GPF, but | |
| 385 instead it features the source for the test interface and some "misc" parts: | |
| 386 under gpf/MISC and gpf/tst in that source tree - these sources are not present | |
| 387 in the LoCosto version from Peek. | |
| 388 | |
| 389 But one critical piece was still missing: the OS adaptation layer. It appears | |
| 390 that the GPF core (vsi_??? modules) and OSL (os_??? modules) were maintained | |
| 391 and built together, ending up together in frame_<blah>.lib files in the binary | |
| 392 form used to build firmwares, but the source for the "frame" part in the Peek | |
| 393 find contained only vsi_*.c and others, but not any of os_*.c. | |
| 394 | |
| 395 Thus we had to reconstruct GPF from the shattered bits and pieces we had. I | |
| 396 took the frame sources from Peek and the misc and tst sources from Sotovik, and | |
| 397 saw that they compiled w/o problems in our gcc environment. Attempting to link | |
| 398 any firmware that uses GPF would have been futile at this point, as it would | |
| 399 have failed with undefined references to os_*() functions. Then I had to do | |
| 400 the hard work: disassemble the missing os_??? modules from the binary libs in | |
| 401 the TCS211 version (hey, at least this one was known to work reliably) and write | |
| 402 new C code replicating the exact logic found in the disassembly of the known | |
| 403 working and fitting binary. This work is now mostly done (some non-essential | |
| 404 functions have been stubbed out to be revisited later), and the version of GPF | |
| 405 used by FreeCalypso is a significant work of reconstruction, not merely lifted | |
| 406 from a readily available source and plopped in. | |
| 407 | |
| 408 A closer look at L1 | |
| 409 =================== | |
| 410 | |
| 411 The L1 code is remarkable in how little intertie it has with the rest of the | |
| 412 firmware it is linked into. It is almost entirely self-contained, expecting | |
| 413 only 4 functions to be provided by the underlying OS environment: | |
| 414 | |
| 415 os_alloc_sig -- allocate message buffer | |
| 416 os_free_sig -- free message buffer | |
| 417 os_send_sig -- send message to upper layers | |
| 418 os_receive_sig -- receive message from upper layers | |
| 419 | |
| 420 It helps to remember that at the beginning of TI's involvement in the GSM | |
| 421 baseband chipset business, L1 was the only thing they "owned", while Condat, | |
| 422 the maintainers of the higher level protocol stack, was a separate company. | |
| 423 TI's "turnkey" solution must have consisted of their own L1 code plus G23M code | |
| 424 (including GPF etc) licensed from Condat, but I'm guessing that TI probably | |
| 425 wanted to retain the ability to sell their chips with their L1 without being | |
| 426 entangled by Condat: let the customer use their own GSM L23 stack, or perhaps | |
| 427 work out their own independent licensing arrangements with Condat. I'm | |
| 428 guessing that L1 was maintained as its own highly independent and at least | |
| 429 conceptually portable entity for these reasons. | |
| 430 | |
| 431 The way in which L1 is intertied into our FreeCalypso GSM fw is the same as how | |
| 432 it is done in TI's production firmwares, including both our TCS211 reference | |
| 433 and the TCS3.2 version from which we got our L1 source. There is a module | |
| 434 called OSX, which is an extremely thin adaptation layer that implements the | |
| 435 APIs expected by L1 in terms of GPF. Furthermore, this OSX layer provides | |
| 436 header file isolation: the only "outside" (non-L1) header included by L1 is | |
| 437 cust_os.h, and it defines the necessary interface to OSX *without* including | |
| 438 any other headers (no GPF headers in particular), using only the C language's | |
| 439 native types. Apart from this cust_os.h header, the entire OSX layer is | |
| 440 implemented in one C module (osx.c, which we had to reconstruct from osx.obj as | |
| 441 the source was missing - but it's very simple) which does include some GPF | |
| 442 headers and implements the OSX API in terms of GPF services. Thus in TI's | |
| 443 production firmwares and in our FC GSM fw L1 does sit on top of GPF, but very | |
| 444 indirectly. | |
| 445 | |
| 446 More specifically, the "production" version of OSX implements its API in terms | |
| 447 of *high-level* GPF functions, i.e., VSI. However, they also had an interesting | |
| 448 OP_L1_STANDALONE configuration which omitted not only all of G23M, but also the | |
| 449 core of GPF and possibly the Riviera environment as well. We don't have a way | |
| 450 to recreate this configuration exactly as it existed inside TI because we don't | |
| 451 have the source bits specific to this configuration (our own standalone L1 | |
| 452 configuration is implemented differently, see below), but we do have a little | |
| 453 bit of insight into how it worked. | |
| 454 | |
| 455 It appears that TI's OP_L1_STANDALONE build used a special "gutted" version of | |
| 456 GPF in which the "meaty core" (VSI etc) was removed. The OS layer (os_??? | |
| 457 modules implementing os_*() functions) that interfaces to Nucleus was kept, and | |
| 458 so was OSX used by L1 - but this time the OSX API functions were implemented in | |
| 459 terms of os_*() ones (low-level wrappers around Nucleus) instead of the higher- | |
| 460 level VSI APIs provided by the "meaty core" of GPF. It is purely a guess on my | |
| 461 part, but perhaps this hack was also done in the days before TI's acquisition | |
| 462 of Condat, and by omitting the "meaty core" of GPF, TI could claim that their | |
| 463 OP_L1_STANDALONE configuration did not contain any of Condat's "intellectual | |
| 464 property". | |
| 465 | |
| 466 In FreeCalypso we do have a way to build a firmware image that includes L1 but | |
| 467 not G23M: it is our own L1 standalone configuration, enabled with a | |
| 468 feature l1stand line in build.conf. However, because IP considerations don't | |
| 469 apply to us (we operate under the doctrine of eminent domain), we are not | |
| 470 replicating TI's gutting of GPF: *our* L1 standalone configuration includes the | |
| 471 full GPF (with OSX for L1 implemented in terms of VSI), but with a greatly | |
| 472 reduced set of tasks when G23M is omitted. | |
| 473 | |
| 474 Run-time structure of L1 | |
| 475 ======================== | |
| 476 | |
| 477 L1 consists of two major parts: L1S and L1A. L1S is the synchronous part where | |
| 478 the most time-critical functions are performed; it runs as a Nucleus HISR. The | |
| 479 hardware in the Calypso generates an interrupt on every TDMA frame (4.615 ms), | |
| 480 and the LISR handler for this interrupt triggers the L1S HISR. L1S communicates | |
| 481 with L1A through a shared memory data structure, and also sometimes allocates | |
| 482 message buffers and posts them to L1A's incoming message queue (both via OSX | |
| 483 API functions, i.e., via GPF in disguise). | |
| 484 | |
| 485 L1A runs as a regular task under Nucleus, and includes a blocking call (to GPF | |
| 486 via OSX) to wait for incoming messages on its queue. It is one big loop that | |
| 487 waits for incoming messages, then processes each received message and commands | |
| 488 L1S to do most of the work. The entry point to L1A in the L1 code proper is | |
| 489 l1a_task(), although the responsibility for running it as a task falls on some | |
| 490 "glue" code outside of L1 proper. TI's production firmwares with G23M included | |
| 491 have an L1 protocol stack entity within G23M whose only job (aside from some | |
| 492 initialization) is to run l1a_task() in the Nucleus task created by GPF for | |
| 493 that protocol stack entity; we do the same in our firmware. | |
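
The division of labour between the two halves can be sketched as a toy model
in C. This is not L1 code: all demo_l1_* names, the message encoding (positive
numbers for requests, negative for confirmations) and the single-field shared
structure are assumptions invented for the example, and the concurrency
between HISR and task contexts is not modelled at all.

```c
#include <assert.h>

#define DEMO_L1_QDEPTH 8
static int demo_l1_queue[DEMO_L1_QDEPTH];   /* asynchronous part's incoming queue */
static unsigned demo_l1_qhead, demo_l1_qcount;

/* Shared-memory area through which the two halves exchange state. */
struct demo_l1_shared {
    unsigned frame_number;   /* advanced by the synchronous part */
    int      active_cmd;     /* set by the asynchronous part, 0 = idle */
};
static struct demo_l1_shared demo_l1_com;
static unsigned demo_l1_confirmed;          /* confirmations seen so far */

/* Post a message to the asynchronous part's queue. */
int demo_l1_post(int msg)
{
    if (demo_l1_qcount == DEMO_L1_QDEPTH)
        return -1;
    demo_l1_queue[(demo_l1_qhead + demo_l1_qcount) % DEMO_L1_QDEPTH] = msg;
    demo_l1_qcount++;
    return 0;
}

/* Synchronous part: would be run once per TDMA frame from the HISR. */
void demo_l1s_frame(void)
{
    demo_l1_com.frame_number++;
    if (demo_l1_com.active_cmd != 0) {
        int done = demo_l1_com.active_cmd;

        demo_l1_com.active_cmd = 0;     /* the "work" is done instantly here */
        (void)demo_l1_post(-done);      /* report back as a confirmation */
    }
}

/* Asynchronous part: the real thing loops forever around a blocking
 * receive; this version returns once the queue is empty. */
unsigned demo_l1a_run(void)
{
    unsigned handled = 0;

    while (demo_l1_qcount > 0) {
        int msg = demo_l1_queue[demo_l1_qhead];

        demo_l1_qhead = (demo_l1_qhead + 1) % DEMO_L1_QDEPTH;
        demo_l1_qcount--;
        assert(msg != 0);
        if (msg > 0)
            demo_l1_com.active_cmd = msg;   /* command the synchronous part */
        else
            demo_l1_confirmed++;            /* confirmation from it */
        handled++;
    }
    return handled;
}
```

A request posted with demo_l1_post(7) is picked up by demo_l1a_run(), acted
upon by the next demo_l1s_frame() call, and comes back to the queue as a
confirmation for the following demo_l1a_run().
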
| 494 | |
| 495 Communication between L1 and G23M | |
| 496 ================================= | |
| 497 | |
| 498 It is remarkable that L1 and G23M don't have any header files in common: L1 | |
| 499 uses its own (almost fully self-contained), whereas the G23M+GPF realm is its | |
| 500 own world with its own header files. One has to ask then: how do they | |
| 501 communicate? OK, we know they communicate through primitives (messages in | |
| 502 buffers allocated from GPF's PRIM partition memory pool) passed via message | |
| 503 queues, but what about the data structures in these messages? Where are those | |
| 504 defined if there are no header files in common between L1 and G23M? | |
| 505 | |
| 506 The answer is that there are separate definitions of the L1<->G23M interface on | |
| 507 each side, and TI must have kept them in sync manually. Not exactly a | |
| 508 recommended programming or software maintenance practice for sure, but TI took | |
| 509 care of it, and the existing proprietary products based on TI's firmware are | |
| 510 rock solid, so it is not really our place to complain. | |
| 511 | |
| 512 TI's firmwares from the era we are working with (the TCS3.2/LoCosto source from | |
| 513 20090327 from which we took our L1 and G23M and the binary libs version of | |
| 514 TCS211 from 20070608 which serves as our reference) also include a component | |
| 515 called ALR. It resides in the G23M code realm: G23M coding style, uses Condat | |
| 516 header files, runs as its own protocol stack entity under GPF. This component | |
| 517 appears to serve as a glue layer between the rest of the G23M stack (which is | |
| 518 supposed to be truly hardware-independent) and TI's L1. | |
| 519 | |
| 520 Speaking of ALR, it is worth mentioning that there is a little naming | |
| 521 inconsistency here. ALR is known to the connect-by-name logic in GPF as "PL" | |
| 522 (physical layer, apparently), while the ACI entity (Application Control | |
| 523 Interface, the top level entity) is known to the same logic as "MMI". No big | |
| 524 deal really, but hopefully knowing this quirk will save someone some confusion. | |
| 525 | |
| 526 Debug trace facility | |
| 527 ==================== | |
| 528 | |
| 529 See the RVTMUX document in the same directory as this one for general background | |
| 530 information about the debug and development interface provided by TI-based | |
| 531 firmwares. Our FreeCalypso GSM firmware implements an RVTMUX interface as well, | |
| 532 and the most immediate use to which it is put is debug trace output. In this | |
| 533 section I'm going to describe how this debug trace output is generated inside | |
| 534 the fw. | |
| 535 | |
| 536 The firmware component that "owns" the physical UART channel assigned to RVTMUX | |
| 537 is RVT, implemented in gsm-fw/riviera/rvt. It is a Riviera-based component, | |
| 538 and it has a Nucleus task that is created and started through Riviera. All | |
| 539 calls to the actual driver for the UART are made from RVT. In the case of | |
| 540 output from the Calypso GSM device to an external host, all such output is | |
| 541 performed in the context of RVT's Nucleus task; this task drains RVT's message | |
| 542 queue and emits the content of allocated buffers posted to it, freeing them | |
| 543 afterward. (The dynamic memory allocation system in this case is Riviera's, | |
| 544 which is susceptible to fragmentation - see discussion earlier in this article.) | |
| 545 Therefore, every trace or other output packet emitted from a GSM device running | |
| 546 our fw (or any of the proprietary firmwares based on the same architecture) | |
| 547 appears as a result of a message in a dynamically allocated buffer having been | |
| 548 posted to RVT's queue. | |
| 549 | |
| 550 RVT exports several API functions that are intended to be called from other | |
| 551 tasks; it is by way of these functions that most output is submitted to RVT. | |
| 552 One can call rvt_send_trace_cpy() with a fully prepared output message, and | |
| 553 that function will allocate a buffer from Riviera's dynamic memory allocator | |
| 554 properly accounted to RVT, fill it and post it to the RVT task's queue. | |
| 555 Alternatively, one can call rvt_mem_alloc() to allocate the buffer, fill it in | |
| 556 and then pass it to rvt_send_trace_no_cpy(). | |
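
The difference between the copy and no-copy submission paths can be sketched
as follows. The demo_rvt_* functions below are simplified stand-ins written
for this illustration: they take fewer arguments than the real RVT functions
(no user ID or format, for instance), use malloc() in place of Riviera's
allocator, and model the task's queue as a plain linked list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One queued output buffer; the caller only ever sees the data[] part. */
struct demo_rvt_buf {
    struct demo_rvt_buf *next;
    size_t len;
    char   data[];
};
static struct demo_rvt_buf *demo_rvt_head, *demo_rvt_tail;

/* No-copy path, step 1: the caller obtains a buffer to fill in place. */
char *demo_rvt_mem_alloc(size_t len)
{
    struct demo_rvt_buf *b = malloc(sizeof *b + len);

    if (b == NULL)
        return NULL;
    b->next = NULL;
    b->len = len;
    return b->data;
}

/* No-copy path, step 2: ownership passes to the output queue. */
void demo_rvt_send_no_cpy(char *data)
{
    struct demo_rvt_buf *b;

    assert(data != NULL);
    b = (struct demo_rvt_buf *)(data - offsetof(struct demo_rvt_buf, data));
    if (demo_rvt_tail != NULL)
        demo_rvt_tail->next = b;
    else
        demo_rvt_head = b;
    demo_rvt_tail = b;
}

/* Copy path: allocate, copy the caller's message, then queue it. */
int demo_rvt_send_cpy(const char *msg, size_t len)
{
    char *data = demo_rvt_mem_alloc(len);

    if (data == NULL)
        return -1;
    memcpy(data, msg, len);
    demo_rvt_send_no_cpy(data);
    return 0;
}

/* What the output task does: drain the queue, emit each buffer, free it.
 * "Emitting" here means appending to out[]; returns the bytes emitted. */
size_t demo_rvt_task_drain(char *out, size_t out_size)
{
    size_t total = 0;

    while (demo_rvt_head != NULL) {
        struct demo_rvt_buf *b = demo_rvt_head;

        demo_rvt_head = b->next;
        if (demo_rvt_head == NULL)
            demo_rvt_tail = NULL;
        if (b->len <= out_size - total) {
            memcpy(out + total, b->data, b->len);
            total += b->len;
        }
        free(b);
    }
    return total;
}
```

The copy variant is convenient when the message already exists somewhere
(a local array, a string constant); the no-copy variant saves one memcpy when
the caller can format its output directly into the allocated buffer.
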
| 557 | |
| 558 At higher levels, there are a total of 3 kinds of debug traces that can be | |
| 559 emitted: | |
| 560 | |
| 561 * Riviera traces: these are generated by various components implemented in | |
| 562 Riviera land, although in reality any component can generate a trace of this | |
| 563 form by calling rvf_send_trace() - this function can be called from any task. | |
| 564 | |
| 565 * L1 traces: L1 has its own trace facility implemented in | |
| 566 gsm-fw/L1/cfile/l1_trace.c; it generates its traces as ASCII messages and | |
| 567 sends them out via rvt_send_trace_cpy(). | |
| 568 | |
| 569 * GPF traces: code that runs in GPF/G23M land and uses those header files and | |
| 570 coding conventions etc can emit traces through GPF. GPF's trace functions | |
| 571 (implemented in gsm-fw/gpf/frame/vsi_trc.c) allocate a memory partition from | |
| 572 GPF's TEST pool, format the trace into it, and send the trace primitive to | |
| 573 GPF's special test interface task. That task receives trace and other GPF | |
| 574 test interface primitives on its queue, performs some manipulations on them, | |
| 575 and ultimately generates RVT trace output, i.e., a new dynamic memory buffer | |
| 576 is allocated in the Riviera land, the trace is copied there, and the Riviera | |
| 577 buffer goes to the RVT task for the actual output. | |
| 578 | |
| 579 Trace masking | |
| 580 ============= | |
| 581 | |
| 582 The RV trace facility invoked via rvf_send_trace() has a crude masking ability, | |
| 583 but by default all traces are enabled. In TI's standard firmwares most of the | |
| 584 trace output comes from L1: L1's trace output is very voluminous, and appears | |
| 585 to be fully enabled by default. I have yet to look more closely at whether | |
| 586 there is any trace masking functionality in L1 and what the default trace | |
| 587 verbosity level should be. | |
| 588 | |
| 589 On the other hand, GPF and therefore G23M traces are mostly disabled by default. | |
| 590 One can turn the trace verbosity level from any GPF-based entity up or down by | |
| 591 sending a "system primitive" command to the running fw, and another such command | |
| 592 can be used to save these masks in FFS, so that they will be restored on the | |
| 593 next boot cycle and be effective at the earliest possible time. Enabling *all* | |
| 594 GPF trace output for all entities is generally not useful though, as it is so | |
| 595 verbose that a developer trying to make sense of it will likely drown in it. | |
| 596 | |
| 597 GPF compressed trace hack | |
| 598 ========================= | |
| 599 | |
| 600 TI's Windows-based GSM firmware build systems include a hack called str2ind. | |
| 601 Seeking to reduce the fw image size by eliminating trace ASCII strings from it, | |
| 602 and seeking to reduce the load on the RVTMUX serial interface by eliminating | |
| 603 the transmission time of these strings, they passed their sources through an | |
| 604 ad hoc preprocessor that replaces these ASCII strings with numeric indices. | |
| 605 The compilation process with this str2ind hack becomes very messy: each source | |
| 606 file is first passed through the C preprocessor, then the intermediate form is | |
| 607 passed through str2ind, and finally the de-string-ified form is compiled, with | |
| 608 the compiler being told not to run the C preprocessor again. | |
| 609 | |
| 610 TI's str2ind tool maintains a table of correspondence between the original trace | |
| 611 ASCII strings and the indices they've been turned into, and a copy of this table | |
| 612 becomes essential for making sense of GPF trace output: the firmware now emits | |
| 613 only numeric indices which are useless without this str2ind.tab mapping table. | |
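
The trade-off can be shown with a small sketch. It makes no claim about how
TI's tool actually works: the demo_s2i_* names, the example strings, the
two-byte big-endian index encoding and the in-memory table are all invented
for illustration, and the real str2ind.tab file format is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Host-side copy of the string table; with compressed traces these strings
 * would no longer be present in the firmware image. (Invented contents.) */
static const char *const demo_s2i_tab[] = {
    "demo: entity started",
    "demo: timer expired",
    "demo: primitive received",
};
#define DEMO_S2I_COUNT (sizeof demo_s2i_tab / sizeof demo_s2i_tab[0])

/* Uncompressed trace: the whole string goes over the wire.
 * Returns the number of bytes written. */
size_t demo_s2i_emit_full(const char *text, unsigned char *wire, size_t wire_size)
{
    size_t len = strlen(text);

    assert(len <= wire_size);
    memcpy(wire, text, len);
    return len;
}

/* Compressed trace: only a two-byte index goes over the wire. */
size_t demo_s2i_emit_index(unsigned index, unsigned char *wire, size_t wire_size)
{
    assert(index < DEMO_S2I_COUNT && wire_size >= 2);
    wire[0] = (unsigned char)(index >> 8);
    wire[1] = (unsigned char)(index & 0xFF);
    return 2;
}

/* Host side: without the matching table the index is meaningless.
 * Returns NULL for a malformed packet or an unknown index. */
const char *demo_s2i_decode(const unsigned char *wire, size_t len)
{
    unsigned index;

    if (len != 2)
        return NULL;
    index = ((unsigned)wire[0] << 8) | wire[1];
    return index < DEMO_S2I_COUNT ? demo_s2i_tab[index] : NULL;
}
```

The savings are obvious (2 bytes instead of 19 for the second string), and so
is the cost: decoding against a table from a different build silently yields
the wrong text or nothing at all.
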
| 614 | |
| 615 Our FreeCalypso firmware does not currently implement this str2ind aka | |
| 616 compressed trace hack, i.e., all GPF trace output from our fw is in full ASCII | |
| 617 string form. I have not bothered to implement compressed traces because: | |
| 618 | |
| 619 * We have not yet encountered a case of the full ASCII strings causing a problem | |
| 620 either with fw images not fitting into the available memory or excessive load | |
| 621 on the RVTMUX interface; | |
| 622 | |
| 623 * Implementing the hack in question would require extra work: the str2ind tool | |
| 624 would have to be reimplemented anew, as of the original we have no source, | |
| 625 only a Windows binary, and requiring our free fw build process to run a | |
| 626 Windows binary under Wine is a no-no; | |
| 627 | |
| 628 * I don't feel like doing all that extra work for what appears to be no real | |
| 629 gain; | |
| 630 | |
| 631 * Having to run gcc with separate cpp and actual compilation steps with str2ind | |
| 632 sandwiched in between would be ugly and gross; | |
| 633 | |
| 634 * Having to keep track of which str2ind.tab goes with which fw image and supply | |
| 635 the right table to our rvinterf tools would likely be a pita. | |
| 636 | |
| 637 So we shall stick with full ASCII string traces until and unless we run into an | |
| 638 actual (as opposed to hypothetical) problem with either fw image size or serial | |
| 639 interface load. | |
| 640 | |
| 641 RVTMUX command input | |
| 642 ==================== | |
| 643 | |
| 644 RVTMUX is not just debug trace output: it is also possible for an external host | |
| 645 to send commands to the running fw via RVTMUX. | |
| 646 | |
| 647 Inside the fw RVTMUX input is handled by the RVT entity by way of a Nucleus | |
| 648 HISR. This HISR gets triggered when Rx bytes arrive at the designated UART, | |
| 649 and it calls the UART driver to collect the input. RVT code running in this | |
| 650 HISR parses the message structure and figures out which fw component the | |
| 651 incoming message is addressed to. Any fw component can register to receive | |
| 652 RVTMUX packets, and provides a callback function with this registration; this | |
| 653 callback function is called in the context of the HISR. | |
| 654 | |
| 655 In our current FC GSM fw there are two components that register to receive | |
| 656 external host commands via RVTMUX: ETM and GPF. ETM is described in my earlier | |
| 657 RVTMUX write-up. ETM is implemented as a Riviera SWE and has its own Nucleus | |
| 658 task; the callback function that gets called from the RVT HISR posts received | |
| 659 messages onto ETM's own queue drained by its task. The ETM task gets scheduled, | |
| 660 picks up the command posted to its queue, executes it, and sends a response | |
| 661 message back to the external host through RVT. | |
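
The registration and dispatch arrangement described above can be sketched in
C as follows. The demo_mux_* names are hypothetical, the recipient is assumed
here to be identified by the first byte of the packet, and the ETM-like
callback only counts pending commands instead of posting real messages to a
real queue.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MUX_MAX_USERS 8

/* Callback type: invoked in the receiving (HISR-like) context. */
typedef void (*demo_mux_callback)(const unsigned char *payload, size_t len);

static demo_mux_callback demo_mux_users[DEMO_MUX_MAX_USERS];

/* A component registers to receive packets addressed to its ID. */
int demo_mux_register(unsigned id, demo_mux_callback cb)
{
    if (id >= DEMO_MUX_MAX_USERS || cb == NULL || demo_mux_users[id] != NULL)
        return -1;
    demo_mux_users[id] = cb;
    return 0;
}

/* Would run in the HISR: look at the addressing byte and hand the rest of
 * the packet to whoever registered for it; drop it if nobody did. */
int demo_mux_dispatch(const unsigned char *pkt, size_t len)
{
    if (len < 1 || pkt[0] >= DEMO_MUX_MAX_USERS || demo_mux_users[pkt[0]] == NULL)
        return -1;
    demo_mux_users[pkt[0]](pkt + 1, len - 1);
    return 0;
}

/* ETM-like recipient: the callback does no real work, it only queues the
 * command for the component's own task to pick up later. */
static size_t demo_mux_etm_pending;

void demo_mux_etm_callback(const unsigned char *payload, size_t len)
{
    (void)payload;
    (void)len;
    demo_mux_etm_pending++;
}

/* ETM-like task body: handle exactly one queued command per call.
 * Returns 1 if a command was handled, 0 if the queue was empty. */
int demo_mux_etm_task_step(void)
{
    if (demo_mux_etm_pending == 0)
        return 0;
    demo_mux_etm_pending--;     /* execute the command, send the response */
    return 1;
}
```

Keeping the callback down to "enqueue and return" matters because it runs in
the receiving context; anything slow belongs in the task.
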
| 662 | |
| 663 Because all ETM commands funnel through ETM's queue and task, and that task | |
| 664 won't start looking at a new command until it has finished handling the previous | |
| 665 one, all ETM commands and responses are in strict lock-step: it is not possible | |
| 666 to send two commands and have their responses come in out of order, and it makes | |
| 667 no sense to send another ETM command prior to receiving the response to the | |
| 668 previous one. (But there can still be debug traces or other traffic intermixed | |
| 669 on RVTMUX in between an ETM command and the corresponding response!) | |
| 670 | |
| 671 The other component that can receive external commands is GPF. GPF's test | |
| 672 interface can receive so-called "system primitives", which are ASCII string | |
| 673 commands parsed and acted upon by GPF, and also binary protocol stack | |
| 674 primitives. Remember how all entities in the G23M stack communicate by sending | |
| 675 messages to each other? Well, GPF's test interface allows such messages to be | |
| 676 injected externally as well, directed to any entity in the running fw. System | |
| 677 primitive commands can also be used to cause entities to send their outgoing | |
| 678 primitives to the test interface, either instead of or in addition to the | |
| 679 originally intended recipient. | |
| 680 | |
| 681 Firmware subsetting | |
| 682 =================== | |
| 683 | |
| 684 We have built our firmware up incrementally, piece by piece, starting from a | |
| 685 very small skeleton. As we added pieces working toward full GSM MS | |
| 686 functionality, the ability to build less functional fw images corresponding to | |
| 687 our earlier stages of development has been retained. Each piece we added is | |
| 688 "optional" from the viewpoint of our build system, even if it is absolutely | |
| 689 required for normal usage, and is enabled by the appropriate feature line in | |
| 690 build.conf. | |
| 691 | |
| 692 Our minimal baseline with absolutely no "features" enabled consists of: | |
| 693 | |
| 694 * Nucleus | |
| 695 * Riviera | |
| 696 * TI's basic drivers for GPIO, ABB etc | |
| 697 * RVTMUX on the UART port chosen by the user (RVTMUX_UART_port Bourne shell | |
| 698 variable in build.conf) and the UART driver for it | |
| 699 * FFS code operating on a fake FFS image in RAM | |
| 700 | |
| 701 If one runs this minimal "firmware" on a Calypso device, one will see some | |
| 702 startup messages in RV trace format followed by a System Time trace every 20 s. | |
| 703 This "firmware" can't do anything more; there is not even a way to command it | |
| 704 to power off or reboot. | |
| 705 | |
| 706 Working toward full GSM MS functionality, pieces can be added to this skeleton | |
| 707 in this order: | |
| 708 | |
| 709 * GPF | |
| 710 * L1 | |
| 711 * G23M | |
| 712 | |
| 713 feature gsm enables all of the above for normal usage; feature l1stand can be | |
| 714 used alternatively to build an L1 standalone image without G23M - we expect | |
| 715 that we may end up using a ramImage form of the latter for RF calibration on | |
| 716 our own Calypso hardware. | |
| 717 | |
| 718 ETM and various FFS configurations are orthogonal features to the choice of | |
| 719 core functionality level. | |
| 720 | |
| 721 Further reading | |
| 722 =============== | |
| 723 | |
| 724 Believe it or not, some of the documentation that was written by the original | |
| 725 vendors of the software in question and which we've been able to locate turns | |
| 726 out to be fairly relevant and helpful, such that I recommend reading it. | |
| 727 | |
| 728 Documentation for Nucleus PLUS RTOS: | |
| 729 | |
| 730 ftp://ftp.ifctf.org/pub/embedded/Nucleus/nucleus_manuals.tar.bz2 | |
| 731 | |
| 732 Quite informative, and fits our version of Nucleus just fine. | |
| 733 | |
| 734 Riviera environment: | |
| 735 | |
| 736 ftp://ftp.ifctf.org/pub/GSM/Calypso/riviera_preso.pdf | |
| 737 | |
| 738 It's in slide presentation form, not a detailed technical document, but | |
| 739 it covers a lot of points, and all that Riviera stuff described in the | |
| 740 preso *is* present in our fw for real, hence it should be considered | |
| 741 relevant. | |
| 742 | |
| 743 GPF documentation: | |
| 744 | |
| 745 http://scottn.us/downloads/peek/SW%20doc/frame_users_guide.pdf | |
| 746 http://scottn.us/downloads/peek/SW%20doc/vsipei_api.pdf | |
| 747 | |
| 748 Very good reading, helped me understand GPF when I first reached this | |
| 749 part of firmware reintegration. | |
| 750 | |
| 751 TCS3.x/LoCosto fw architecture: | |
| 752 | |
| 753 http://scottn.us/downloads/peek/SW%20doc/TCS2_1_to_3_2_Migration_v0_8.pdf | |
| 754 ftp://ftp.ifctf.org/pub/GSM/LoCosto/LoCosto_Software_Architecture_Specification_Document.pdf | |
| 755 | |
| 756 These TI docs focus mostly on how they changed the fw architecture from | |
| 757 their TCS2.x program (Calypso) to their newer TCS3.x (LoCosto), but one | |
| 758 can still get a little insight into the "old" TCS211 architecture they | |
| 759 were moving away from, which is the architecture I've adopted for | |
| 760 FreeCalypso. |
