Hack-o-Rocket software architecture

The overall software architecture is dictated by the modest specs of the
hardware.  When two 29F010 chips are used, the total flash bank is 256 KB
and it consists of 8 erase blocks of 32 KB each.  The MC68LC302 reset vector
is at the start.  The Hack-o-Rocket monitor goes into erase block 0 of the
flash, fits entirely in that one block and executes directly from flash.
Other programs can go into other erase blocks of the flash bank and run from
flash or from RAM as space permits.
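
The erase block arithmetic implied by these figures can be sketched in C as
follows.  This is only an illustration: the macro and function names are
made up for this example and do not come from the HoR sources, and the
offsets are relative to the start of the flash bank rather than absolute
addresses.

```c
#include <stdint.h>

/* Figures from the text above; all names here are hypothetical. */
#define FLASH_BANK_SIZE   0x40000u  /* 256 KB: two 29F010 chips */
#define FLASH_BLOCK_SIZE  0x8000u   /* 32 KB per erase block */
#define FLASH_NUM_BLOCKS  (FLASH_BANK_SIZE / FLASH_BLOCK_SIZE)

/* Erase block containing a byte offset into the flash bank, or -1 if the
 * offset lies outside the bank. */
int flash_block_of(uint32_t offset)
{
    if (offset >= FLASH_BANK_SIZE)
        return -1;
    return (int)(offset / FLASH_BLOCK_SIZE);
}

/* Offset of the first byte of an erase block (caller keeps block < 8). */
uint32_t flash_block_start(unsigned block)
{
    return block * FLASH_BLOCK_SIZE;
}
```

Block 0 is the one holding the monitor; flash_block_start(1) gives the offset
of the first block available to other programs.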

There is only one Hack-o-Rocket monitor and it works on both CR201i and CR201s
hardware.  It deliberately avoids touching any of the DSL hardware which is
different between the two versions; the monitor thus performs no useful
function of its own as far as DSL goes and only serves to load and run other
programs which do use the DSL hardware.  The latter programs are obviously
specific to either the IDSL or the SDSL version.

Memory map

MC68x302 processors have a software-determined memory map.  The following
memory map is used on the Hack-o-Rocket, set up by the monitor:

Start		End		Resource
00000000	0000FFFF	RAM
00200000	0023FFFF	Flash memory
00600000	0061FFFF	CS2 block, further divided in hardware:
  608300	  60830F	  CS8900 I/O space
  610000	  610000	  LED register
  618000	  6181FF	  SDSL bitpump registers
00700000	00700FFF	MC68LC302 internal 4 KB DPRAM and register block
00800000	00800FFF	CS8900 memory space

This is the same memory map as used by CM's original firmware.
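
For illustration, the top-level entries of the table above can be written
down as a C lookup table.  The structure and function names are hypothetical
(they are not HoR identifiers), and the sub-windows inside the CS2 block are
left out for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* One top-level entry of the memory map; names are illustrative only. */
struct mem_region {
    uint32_t start;
    uint32_t end;           /* inclusive, as in the table */
    const char *name;
};

static const struct mem_region hor_map[] = {
    { 0x000000, 0x00FFFF, "RAM" },
    { 0x200000, 0x23FFFF, "Flash memory" },
    { 0x600000, 0x61FFFF, "CS2 block" },
    { 0x700000, 0x700FFF, "MC68LC302 internal block" },
    { 0x800000, 0x800FFF, "CS8900 memory space" },
};

/* Name of the region containing addr, or NULL for an unmapped address. */
const char *mem_region_name(uint32_t addr)
{
    size_t i;

    for (i = 0; i < sizeof hor_map / sizeof hor_map[0]; i++)
        if (addr >= hor_map[i].start && addr <= hor_map[i].end)
            return hor_map[i].name;
    return NULL;
}
```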

Initialization provided by the monitor

The monitor performs some basic system initialization before any other code
may be run from it, and other programs in the suite assume that this
initialization is done before they run.  This initialization consists of:

* Memory map established
* GPIO pins initialized to a sensible initial state
* MC68LC302 clock PLL set to the right frequency
* Interrupt and exception vector table set up in RAM
* Ethernet basic initialization (see below)
* Software timer set up (see below)
* SCC2 set up for console operation (see below)

Ethernet initialization and usage

As part of system initialization the HoR monitor does the necessary magic to
make the CS8900 Ethernet chip's registers accessible in 16-bit mode via both
memory and I/O space windows, reads the MAC address out of EEPROM and writes it
into the right location in the CS8900 memory page, but then stops there.  In
particular, the Ethernet interface is NOT brought up.  The Ethernet IRQ line
is properly routed and un-tristated, but not enabled in the BusCTL register.

The only time the monitor makes real use of the Ethernet interface is the TFTP
command (load from a TFTP server into memory).  This command brings the
interface up, does its thing and shuts it back down.  Long-term bringup of the
Ethernet interface is left to operational code.

Software timer

The monitor programs MC68LC302's Timer 1 to interrupt every 50 ms.  It also sets
up a handler for this interrupt which decrements a 16-bit variable in the user
area of LC302's internal RAM (see below).  This variable is called sw_decr_timer
in the code.

sw_decr_timer is intended primarily for "local" delays and timeouts (set it to
something and wait for it to decrement past zero), not as a "global" long term
time base.  This strange design choice has been made for historical reasons:
much of the monitor code has been leveraged from the author's earlier work on
PowerPC processors, and that code made heavy use of the PowerPC decrementer
register in exactly the same manner.
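
A local delay built on this variable might look like the sketch below.  It is
a host-side simulation under stated assumptions: the variable is treated as a
signed 16-bit quantity so that "past zero" means "negative", timer1_tick()
stands in for the real Timer 1 interrupt handler, and delay_ms() and
ms_to_ticks() are hypothetical helpers rather than HoR libc functions.

```c
#include <stdint.h>

#define TICK_MS 50u     /* Timer 1 period given in the text */

/* Host stand-in for the variable in LC302 internal RAM.  Treating it as
 * signed is an assumption made for this sketch. */
static volatile int16_t sw_decr_timer;

/* Stand-in for the Timer 1 interrupt handler. */
void timer1_tick(void)
{
    sw_decr_timer--;
}

/* Convert milliseconds to timer ticks, rounding up. */
uint16_t ms_to_ticks(uint32_t ms)
{
    return (uint16_t)((ms + TICK_MS - 1) / TICK_MS);
}

/* Wait at least ms milliseconds (ms must stay below 32767 * 50); returns
 * the number of ticks that elapsed.  On real hardware the loop body would
 * be empty and the interrupt would do the decrementing. */
unsigned delay_ms(uint32_t ms)
{
    unsigned ticks_seen = 0;

    sw_decr_timer = (int16_t)ms_to_ticks(ms);
    while (sw_decr_timer >= 0) {
        timer1_tick();          /* host simulation only */
        ticks_seen++;
    }
    return ticks_seen;
}
```

Waiting for the value to go negative rather than merely reach zero costs one
extra tick, but guarantees that the full requested time has passed even when
the first tick arrives immediately after the variable is set.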

The router code prefers a "global" long term time base (it's written as a very
ad hoc miniature OS kernel), so it installs its own timer interrupt handler
which increments a 32-bit variable that is initialized to zero only once.
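
A global time base of this kind is typically used as sketched below.  The
variable and function names are hypothetical (the router's actual identifiers
are not given here); the point of the sketch is that unsigned subtraction
keeps interval measurements correct even when the 32-bit counter wraps.

```c
#include <stdint.h>

/* Hypothetical name for the router's 32-bit tick counter. */
static volatile uint32_t uptime_ticks;

/* Called once at startup, never again. */
void time_base_init(void)
{
    uptime_ticks = 0;
}

/* Stand-in for the router's own timer interrupt handler. */
void router_timer_isr(void)
{
    uptime_ticks++;
}

/* Ticks elapsed since 'start'; modulo-2^32 arithmetic handles wraparound. */
uint32_t ticks_since(uint32_t start)
{
    return uptime_ticks - start;
}

/* Nonzero once at least 'limit' ticks have passed since 'start'. */
int timeout_expired(uint32_t start, uint32_t limit)
{
    return ticks_since(start) >= limit;
}
```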

Serial console implementation

In hardware terms the debug serial port is wired to MC68LC302's internal SCC2.
Configuring it in software for asynchronous operation at 9600 baud, 8-N-1
(settings matching CM's original debug output port and the general industry
standard) is quite straightforward, but given Motorola's buffer descriptor (BD)
architecture (as contrasted with register-based serial ports such as the 16550)
a set of software conventions needs to be established in order to use it
effectively, especially across multiple programs (the monitor and various
applications).

The Hack-o-Rocket software suite implements just such a set of conventions for
the console port on SCC2.  The monitor sets up 8 1-byte Rx buffers and 4
128-byte Tx buffers in the user area of LC302's internal RAM.  As explained in
more detail below, they are part of a global data structure shared between all
programs in the suite.  There is a 1-to-1 correspondence between these buffers
and Rx/Tx BDs, i.e., each BD is set up once to point to its respective buffer
and never changed thereafter.

Motorola's BD architecture requires that the software side maintain pointers
to the "current position" in the RxBD and TxBD queues, i.e., knows which Rx
buffer to read next and which Tx buffer to fill next.  In the HoR software
architecture these pointers are part of the global data structure shared across
all programs in the suite.

The printf() family of functions in the HoR libc (see below) relies on the
above infrastructure being established by the monitor.  These functions
implement the following algorithm:

* Read the _printf_curbd variable (pointer into the TxBD queue) to figure out
  which buffer to fill now.

* Check the ready bit in the TxBD to see if it's free. If it's owned by the SCC,
  wait for it to become free.

* Format output text into the buffer and set the length field in the TxBD
  accordingly.

* Set the ready bit in the TxBD thereby handing it over to the SCC.

* Advance the _printf_curbd pointer with the proper wraparound.

This implementation has two important implications:

* Don't output more than 128 bytes in a single printf() call. putchar() is not
  a lower level primitive in this implementation, but equivalent to a printf()
  of one byte. (It acquires a Tx buffer just like printf() but sets the TxBD
  length field to 1.)

* If the serial console output isn't too voluminous (specifically, when there
  is no upfront wait caused by all 4 Tx buffers being full), the printf() call
  doesn't have to wait for the output to go out at the serial baud rate.
  Instead the output is formatted at CPU speed and goes out serially after
  printf() returns.
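
The five-step algorithm above can be sketched as a host-side simulation.
Everything here is an approximation under stated assumptions: the BD layout
(status word, length word, buffer pointer) and the position of the ready bit
follow the usual MC68302 conventions, printf_curbd is an index rather than a
pointer, console_printf() and scc_drain_one() are hypothetical names, and the
truncation of overlong output is a safety measure of this sketch, not
necessarily what the real code does.

```c
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>

#define NTXBUF      4
#define TXBUF_SIZE  128
#define BD_READY    0x8000u     /* assumed position of the TxBD ready bit */

struct txbd {
    volatile uint16_t status;
    uint16_t length;
    char *buf;
};

static char tx_buffers[NTXBUF][TXBUF_SIZE];
static struct txbd txbd_ring[NTXBUF];
static unsigned printf_curbd;   /* index stand-in for _printf_curbd */

/* One-time setup: each BD points at its own buffer and never changes. */
void console_init(void)
{
    unsigned i;

    for (i = 0; i < NTXBUF; i++) {
        txbd_ring[i].status = 0;
        txbd_ring[i].length = 0;
        txbd_ring[i].buf = tx_buffers[i];
    }
    printf_curbd = 0;
}

/* Host stand-in for the SCC finishing transmission of one buffer. */
void scc_drain_one(unsigned i)
{
    txbd_ring[i].status &= ~BD_READY;
}

int console_printf(const char *fmt, ...)
{
    struct txbd *bd = &txbd_ring[printf_curbd];     /* step 1 */
    va_list ap;
    int len;

    while (bd->status & BD_READY)                   /* step 2 */
        scc_drain_one(printf_curbd);    /* target code would just spin */

    va_start(ap, fmt);                              /* step 3 */
    len = vsnprintf(bd->buf, TXBUF_SIZE, fmt, ap);
    va_end(ap);
    if (len < 0)
        len = 0;
    if (len >= TXBUF_SIZE)
        len = TXBUF_SIZE - 1;   /* sketch truncates; NUL takes one byte */
    bd->length = (uint16_t)len;

    bd->status |= BD_READY;                         /* step 4 */
    printf_curbd = (printf_curbd + 1) % NTXBUF;     /* step 5 */
    return len;
}
```

Four short messages in a row return immediately because each lands in its own
buffer; only the fifth has to wait for the first buffer to drain.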

The Rx direction is more primitive.  The 8 1-byte Rx buffers and the associated
head pointer serve as an 8-character receive FIFO functionally equivalent to
those implemented in register-based UARTs like 16550.  getchar() and
getchar_poll() functions are provided which work in the straightforward manner.
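
The receive side can be sketched the same way.  This is again a host
simulation with assumptions labeled: the empty bit position follows the usual
MC68302 RxBD convention, scc_receive() plays the part of the SCC hardware,
and getchar_poll_sketch() with its -1 "nothing available" return value is a
guess at the interface, not the actual HoR libc getchar_poll().

```c
#include <stdint.h>

#define NRXBUF    8
#define BD_EMPTY  0x8000u       /* assumed position of the RxBD empty bit */

struct rxbd {
    volatile uint16_t status;
    uint16_t length;
    uint8_t *buf;
};

static uint8_t rx_buffers[NRXBUF];      /* 8 one-byte buffers */
static struct rxbd rxbd_ring[NRXBUF];
static unsigned rx_head;        /* software side: next BD to read */
static unsigned scc_rx_next;    /* simulation only: next BD the SCC fills */

void rx_init(void)
{
    unsigned i;

    for (i = 0; i < NRXBUF; i++) {
        rxbd_ring[i].status = BD_EMPTY;
        rxbd_ring[i].length = 0;
        rxbd_ring[i].buf = &rx_buffers[i];
    }
    rx_head = scc_rx_next = 0;
}

/* Host stand-in for the SCC receiving one character.  Returns 0 when all
 * 8 buffers are still full, i.e. the character would be lost. */
int scc_receive(uint8_t c)
{
    struct rxbd *bd = &rxbd_ring[scc_rx_next];

    if (!(bd->status & BD_EMPTY))
        return 0;
    *bd->buf = c;
    bd->length = 1;
    bd->status &= ~BD_EMPTY;
    scc_rx_next = (scc_rx_next + 1) % NRXBUF;
    return 1;
}

/* Return the next received character, or -1 if none is waiting. */
int getchar_poll_sketch(void)
{
    struct rxbd *bd = &rxbd_ring[rx_head];
    int c;

    if (bd->status & BD_EMPTY)
        return -1;
    c = *bd->buf;
    bd->status |= BD_EMPTY;     /* hand the buffer back to the SCC */
    rx_head = (rx_head + 1) % NRXBUF;
    return c;
}
```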

One additional gotcha is that the Hack-o-Rocket libc printf() lacks the
traditional UNIXism of turning '\n' into "\r\n", so you need to put the latter
explicitly in your messages whenever you need a newline.  This design decision
was controversial, but oh well, I'm the author of the code and can do whatever
I want.  We are an embedded system and don't have to follow other people's
standards.
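
For code ported from elsewhere whose message strings contain bare '\n', a
caller-side helper along the following lines could do the expansion before
the string is handed to printf().  This helper is purely hypothetical and is
not part of the HoR libc; it is shown only to make the consequence of the
above concrete.

```c
#include <stddef.h>

/* Copy src to dst, turning each bare '\n' into "\r\n".  A '\n' already
 * preceded by '\r' is left alone.  Returns the number of bytes written
 * (not counting the terminating NUL), or (size_t)-1 if dst is too small. */
size_t expand_newlines(char *dst, size_t dstsize, const char *src)
{
    size_t n = 0;
    char prev = '\0';

    for (; *src; prev = *src++) {
        if (*src == '\n' && prev != '\r') {
            if (n + 1 >= dstsize)
                return (size_t)-1;
            dst[n++] = '\r';
        }
        if (n + 1 >= dstsize)
            return (size_t)-1;
        dst[n++] = *src;
    }
    if (dstsize == 0)
        return (size_t)-1;
    dst[n] = '\0';
    return n;
}
```

Keep in mind that the expanded string still has to fit within the 128-byte
limit of a single printf() call.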

Interrupt handling

The Hack-o-Rocket software suite makes moderate use of M68K interrupts.  The
CPU core has 8 interrupt priority levels; there are a bunch of LC302 internal
interrupt sources at IPL 4, Ethernet at IPL 6 and the DSL transceiver (SDSL
bitpump or ISDN chip) at IPL 1.

The system initialization performed by the monitor installs *some* handler for
every possible interrupt or exception vector.  With the exception of Timer 1
(described above) all of these handlers print a message on the console and drop
to the monitor prompt.  Dedicated handlers which print distinctive messages are
installed for some vectors, others are handled by a generic handler in which
case there is no direct way of determining which vector invoked the handler.

The handlers installed by the monitor for IRQ1 and IRQ6 are dedicated and print
distinctive messages, but are otherwise no different in that they
unceremoniously drop one to the monitor prompt.  No dedicated handlers with
distinctive messages are installed for most internal interrupts (i.e., the
generic default handler is installed), but they aren't enabled in IMR either.

It is the job of other programs run from the monitor to set up more useful
handlers for the interrupts they use.  If one returns to the monitor in one way
or another after running such a program and wants to restore the monitor's
original handlers for all vectors, the REINIT command will do that as part of
full system re-initialization.

The following routines are provided in HoR libc for managing the IPL:

* current_ipl() returns the current IPL.
* splN() (replace N with 0 thru 7) sets the IPL to N.
* splx(s) sets the IPL to s.
* splup(s) sets the IPL to the higher of s and the current IPL.

All spl* functions return the previous IPL.
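
The semantics of these routines can be illustrated with a host-side model.
The function names match the list above, but the bodies, the int return
types and the simulated status register are assumptions of this sketch; the
real routines manipulate the M68K status register, whose interrupt mask
occupies bits 8 through 10.

```c
#include <stdint.h>

#define SR_IPL_MASK   0x0700u   /* M68K SR interrupt mask, bits 8-10 */
#define SR_IPL_SHIFT  8

/* Simulated status register: supervisor mode, IPL 7, as after reset. */
static uint16_t sim_sr = 0x2700;

int current_ipl(void)
{
    return (sim_sr & SR_IPL_MASK) >> SR_IPL_SHIFT;
}

/* Set the IPL to s and return the previous IPL. */
int splx(int s)
{
    int old = current_ipl();

    sim_sr = (uint16_t)((sim_sr & ~SR_IPL_MASK) |
                        ((unsigned)(s & 7) << SR_IPL_SHIFT));
    return old;
}

/* Raise the IPL to s if it is currently lower; never lowers it. */
int splup(int s)
{
    int old = current_ipl();

    if (s > old)
        splx(s);
    return old;
}

/* Two of the eight splN() variants. */
int spl0(void) { return splx(0); }
int spl7(void) { return splx(7); }
```

The usual pattern for a critical section is "s = splup(4); ...; splx(s);",
which restores whatever IPL was in effect on entry.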

SET IPL and SHOW IPL commands are also provided in the monitor; their action
is straightforward.  At the completion of monitor initialization the IPL is
left at 1.  It is not 0 because the IRQ1 line may be asserted by the bitpump on
the SDSL rocket and the monitor has no way to clear it being agnostic to the
SDSL/IDSL distinction.

Printf IPL kludge

The neat serial printf() mechanism and its implementation described earlier are
tarnished a little by a kludge having to do with interrupts.  Underneath this
kludge is a philosophical dilemma: should printf() be usable from interrupt
handlers or not?  If printf() is called from an interrupt handler, it must
always raise the IPL (with splup()) for the duration of its execution to that
of the highest-IPL interrupt handler using the service, even when called by
lower IPL users, otherwise one gets bitten by re-entrancy problems.  (In other
words, use of the printf buffer queue needs to be serialized.)

If we were to keep printf() clean of all IPL gunk, it would be unsafe to use
from any interrupt handlers.  If we made it do splup(7), it would be safe to
use from any interrupt handler, but having all interrupts disabled for its
entire duration (both the formatting and the wait in the case all 4 buffers are
busy) may be an unacceptable penalty for those applications which don't need to
print from interrupt handlers.

As one size does not fit all in this area, the following kludge has been
implemented as a compromise: each program using printf() must define a const
u_short variable named _printf_ipl; its value tells printf() what IPL it should
splup() to.  Setting it to 0 is equivalent to disabling the entire kludge as
splup(0) is a no-op.
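
In a program that prints from handlers running at IPL 4 at most, the
convention might be used as in the sketch below.  Only the _printf_ipl name
and its role come from the text above; the value 4 is an arbitrary example,
and the simulated IPL, guarded_output() and ipl_seen_inside are stand-ins
invented for this illustration.

```c
/* Simulated IPL and minimal spl routines so the sketch runs on a host. */
static int sim_ipl = 1;

int splup(int s)
{
    int old = sim_ipl;

    if (s > old)
        sim_ipl = s;
    return old;
}

int splx(int s)
{
    int old = sim_ipl;

    sim_ipl = s;
    return old;
}

/* Defined once per program (u_short in the text); example value only. */
const unsigned short _printf_ipl = 4;

static int ipl_seen_inside;     /* records the IPL during the "output" */

/* Shape of what printf() does around its buffer queue manipulation;
 * returns the IPL that was in effect on entry. */
int guarded_output(const char *msg)
{
    int s = splup(_printf_ipl);     /* serialize access to the queue */

    ipl_seen_inside = sim_ipl;      /* stand-in for the real work */
    (void)msg;
    splx(s);                        /* restore the caller's IPL */
    return s;
}
```

A caller already running above _printf_ipl is unaffected, since splup()
never lowers the IPL.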

Operational code autoboot

In the Hack-o-Rocket project the term "operational code" means any code that is
intended to make the HoR usable to an end user somewhat similar to the original
unhacked rocket, rather than a mere hacking toy.  At the present time the only
operational code is the IP router.

One rather essential feature of operational code is that it should run
automatically on power-up without any action from the serial console, i.e., the
rocket should be able to run headless once it's been put into production
operation.

The following convention has been established: if operational code is present,
it must be programmed into flash at 208000 (at the beginning of the next erase
block after the monitor), there must be a magic signature 'OPER' at 208000 and
the entry point must be at 208004.  The monitor checks for this magic signature
on boot and if it is found, the entry point at 208004 is called after the user
is given 5 s to abort with ^C if desired.  If the magic signature is not found
or if the user aborts autoboot with ^C, the debug monitor is entered.

The operational code entry point is invoked via a subroutine call, but it will
typically reset the stack pointer as its first order of business.
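
The signature check can be sketched as follows.  The addresses and the 'OPER'
signature come from the convention above; the function names are
hypothetical, the signature is assumed to be the four ASCII bytes in that
order, and the sketch inspects a caller-supplied copy of the block so that it
can run on a host.

```c
#include <stdint.h>
#include <string.h>

#define OPER_BASE   0x208000u   /* start of erase block 1 */
#define OPER_MAGIC  "OPER"      /* assumed byte order: 'O','P','E','R' */

/* 'block' points at (a copy of) the bytes found at OPER_BASE.  Returns
 * the entry point address, or 0 if no operational code is present. */
uint32_t oper_entry_point(const uint8_t *block)
{
    if (memcmp(block, OPER_MAGIC, 4) != 0)
        return 0;
    return OPER_BASE + 4;
}

/* Autoboot decision: signature present and no ^C during the 5 s wait. */
int should_autoboot(const uint8_t *block, int user_hit_ctrl_c)
{
    return oper_entry_point(block) != 0 && !user_hit_ctrl_c;
}
```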

Main RAM usage

RAM is a precious resource on the Hack-o-Rocket as there is only 64 KB of it.
(To be more precise, there are two kinds of RAM on the rocket: 64 KB of main
RAM and 576 bytes of LC302 internal RAM "user area".  This section discusses
the former; the latter is discussed in the next section.)

The lowest 1024 bytes of RAM are reserved by the M68K architecture for the
interrupt and exception vector table.  We do not use this area for any other
purpose and will thus not consider it any further as it basically does not
qualify as general purpose RAM.

The monitor does not maintain any persistent global variables in the main RAM,
instead it is almost completely stack-based.  As explained in more detail in
mon302.doc, the monitor always executes in an exception context which is
basically a stack frame; all of its internal data storage is then built up on
the stack.  The initial stack is set up at the highest addresses of the main
RAM.

The only main RAM usage by the HoR monitor aside from the stack consists of a
few static variables used by the TFTP code.  These variables are located
starting at address 400 and are used only during the TFTP load operation.  They
do not have any "global" persistent function and the user is free to overwrite
them; the only caveat associated with these memory locations is that they are
overwritten during TFTP load operations and cannot be part of the destination
of such load operations.

The software convention established is that main RAM usage by application
programs starts at address 600.
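
The layout described in this section can be summarized in C, together with a
check for a sane TFTP load destination.  The names are hypothetical, and the
sketch assumes that the TFTP variables end before address 600 (which the
application start convention implies but the text does not state outright).
It does not account for the monitor stack at the top of RAM.

```c
#include <stdint.h>

/* Main RAM layout (hex addresses from the text); names are illustrative. */
#define RAM_SIZE         0x10000u   /* 64 KB */
#define VECTORS_END      0x400u     /* vector table occupies 0-3FF */
#define TFTP_VARS_START  0x400u     /* monitor's TFTP static variables */
#define APP_RAM_START    0x600u     /* applications start here */

/* Nonzero if [dest, dest+len) lies within main RAM and stays clear of the
 * vector table and the TFTP variables. */
int tftp_dest_ok(uint32_t dest, uint32_t len)
{
    if (len == 0 || dest >= RAM_SIZE || len > RAM_SIZE - dest)
        return 0;
    return dest >= APP_RAM_START;
}
```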

LC302 internal RAM usage

Just like the full MC68302 IMP, LC302 has 1152 bytes of internal RAM divided
into 576 bytes of "user RAM" and 576 bytes of parameter RAM for the CP
microcode.

The HoR software suite uses the "user RAM" area for a global data structure
shared between all programs in the suite.  This data structure relates mostly
to the serial console port and consists of the following:

* printf buffers (512 bytes)
* serial receive buffers read by getchar() (8 bytes)
* current pointers into the two queues above
* a few miscellaneous global variables tacked on:

last_memacc_addr:	Last memory location accessed by the monitor
			(EXAMINE and DEPOSIT commands)
last_memacc_size:	Size of that last access (byte/word/longword)
sw_decr_timer:		Decrementing timer described earlier
tftp_local_ip:		Hack-o-Rocket's own IP address used for TFTP load
			operations.
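
As a rough picture, the shared structure could be laid out like the C struct
below.  This is a guess for illustration only: the real definition lives in
libmem/ocr.s, and the field order, the field widths (other than those stated
in this document) and the rx_curbd name are assumptions.  Target pointers are
shown as 32-bit integers so that the sizes come out the same on a host.

```c
#include <stddef.h>
#include <stdint.h>

struct hor_ocr_sketch {
    char     printf_buf[4][128];    /* 4 Tx buffers, 512 bytes total */
    uint8_t  rx_buf[8];             /* 8 one-byte Rx buffers */
    uint32_t printf_curbd;          /* current TxBD (68K pointer) */
    uint32_t rx_curbd;              /* current RxBD; hypothetical name */
    uint32_t last_memacc_addr;
    uint16_t last_memacc_size;
    uint16_t sw_decr_timer;         /* 16 bits, per the timer section */
    uint32_t tftp_local_ip;
};

/* Whatever the real layout is, it has to fit in the 576-byte user area. */
_Static_assert(sizeof(struct hor_ocr_sketch) <= 576,
               "shared structure must fit in LC302 user RAM");
```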

Compilation and linking

C and assembly source modules are first compiled into ELF.  Executable code is
placed into the .text section; strings, tables and other constant data intended
to go into the binary image go into the .rodata section.  These two sections
are merged together by the final linker script (see below).  In the case of most
run-from-flash programs this code and constant data will reside in flash and
execute in place, hence it needs to be treated as unmodifiable.

RAM-based global and static variables are generally treated as uninitialized
data and reside in the .bss section in the ELF.  The .data section (initialized
data) isn't currently used and no bulk zeroing of uninitialized data is done;
the convention is that it's better to treat all these variables as having an
undefined initial state.
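
In practice this means that a module cannot rely on the usual C guarantee of
zeroed statics and has to initialize its variables explicitly, roughly as in
this sketch (the module and its names are invented for the example):

```c
/* Lives in .bss; on the HoR its initial value is whatever was in RAM. */
static unsigned packets_seen;

/* Must be called before anything else in this module is used. */
void stats_init(void)
{
    packets_seen = 0;
}

void stats_note_packet(void)
{
    packets_seen++;
}

unsigned stats_packets(void)
{
    return packets_seen;
}
```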

For each program a final ELF image is produced by running m68k-elf-ld with a
custom linker script; m68k-elf-objcopy then converts this final ELF into the
actual raw binary for the target.  The custom linker scripts define sections
and symbols for resources outside the program itself: the LC302 on-chip block,
the external devices on the board and the M68K exception vector table
(m68000_vectors).  Specifically, an ELF section named onchip_block is created
inside which symbols are defined for all resources in that block; symbols for
the M68K vector table and for board level devices are defined as absolute
(outside all sections).

Three different linker scripts are used currently.  The monitor has its own unique
linker script and two other scripts are provided in the libmem subdirectory for
RAM-based and flash-based application/utility programs.

The global data structure in the LC302 internal user RAM is introduced into
every program by way of the libmem/ocr.o module (assembled from libmem/ocr.s)
linked into every program.  This module defines the data structure as an ELF
section named ocr (for on-chip RAM); the linker scripts then take care of its
correct placement in terms of final absolute addresses.

The result of all these tricks is that for each program all resources outside
of the program itself (whether they are defined by the M68K architecture, by
the MC68LC302 chip, by the CopperRocket hardware or by the Hack-o-Rocket
software architecture conventions) appear as global variables with specific
fixed names to C code.

Everything is linked with absolute addresses.  No position-independent code is
used as it's a pain on the MC68000 CPU and was deemed not to be worth the
effort.

Source tree overview

include subdirectory

The convention used is that all header files in the top level include directory
contain definitions pertaining to external reality, i.e., objective reality
outside arbitrary HoR software constructs and conventions.  This external
objective reality consists of hardware registers and chip-defined data
structures as well as definitions for communication protocols we need to speak.

libc

Our libc is based on the 4.3BSD-Quasijarus one, heavily stripped down, a few
functions rewritten in M68K assembly and a few non-standard functions added.

Header files with definitions which serve to provide a comfortable programming
environment but which are pure software constructs which we are free to change
(i.e., not hardware register definitions or externally spoken communication
protocols) are placed in the libc directory instead of include.

libidsl

This library contains utility functions for working with resources specific to
the IDSL version of the rocket: SCP and the MC145572 ISDN chip.

libmem

This is not an actual library.  The libmem directory contains the ocr.s module
and linker scripts described above.

libtftp

Rather self-explanatory.  This library is only linked by the monitor and the
flash writer utility; programs which need more substantial Ethernet
functionality like the IP router should not link it.

libutil

This library contains various utility functions.

libzw

This is our port of the ZipWire bitpump control library to the 68x302 processor
family and the UNIX development environment.  It is described on this web page:

http://ifctfvax.Harhan.ORG/OpenSDSL/2B1Q/btsw.html#libzw

It is obviously specific to the SDSL rocket.
