Sunday, 23 February 2014

Design for a real Single Board Computer

Though the previous PCB design was a vaguely useable computer, it required devices attached to an expansion port to be even moderately useful. A real Single Board Computer should be useful with just the single board. Combining a serial interface with an IDE interface should make the single PCB a computer in its own right, albeit one that only has a serial terminal.

As my previous post described, I planned to use a single CPLD to perform all the glue-logic and associated functions. As it turned out, this was not possible due to the number of pins required. Roughly speaking I require the following IO pins:
  • 6 address lines for decoding
  • RAM, ROM, DUART and IDE chip selects (4 in total)
  • 6 expansion IO selects (to give 6 additional devices on expansion PCBs)
  • R/W, E, and Q inputs
  • READ and WRITE outputs
  • /RESET in and RESET out (for the DUART)
  • 8 data lines
  • 8 IDE latch lines
  • 5 bank switching address outputs
  • /NMI, /IRQ and /FIRQ outputs
  • DUART and IDE IRQ inputs
  • 4 expansion interrupt inputs
This makes 53 IO lines; too many for a 44 pin PLCC XC9572, which has about 34 useable IO lines. So I will need two CPLDs.
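As a quick sanity check, the pin budget above can be tallied in a few lines of Python. This is only a sketch: the group names are shorthand for the list items, and the figure of 34 usable pins is the approximate number quoted above, not a datasheet value.

```python
# Pin budget for the single-CPLD design, taken from the list above.
PIN_GROUPS = {
    "address lines for decoding": 6,
    "chip selects (RAM, ROM, DUART, IDE)": 4,
    "expansion IO selects": 6,
    "R/W, E and Q inputs": 3,
    "READ and WRITE outputs": 2,
    "/RESET in and RESET out": 2,
    "data lines": 8,
    "IDE latch lines": 8,
    "bank switching address outputs": 5,
    "/NMI, /IRQ and /FIRQ outputs": 3,
    "DUART and IDE IRQ inputs": 2,
    "expansion interrupt inputs": 4,
}

USABLE_IO_PLCC44 = 34  # approximate figure quoted in the text


def total_pins(groups):
    """Sum the IO lines needed across all groups."""
    return sum(groups.values())


def cplds_needed(groups, usable_per_device):
    """Ceiling division; ignores the extra pins needed to link the devices."""
    return -(-total_pins(groups) // usable_per_device)


print(total_pins(PIN_GROUPS), cplds_needed(PIN_GROUPS, USABLE_IO_PLCC44))
```

The device count is a lower bound, since any signal routed between the two CPLDs costs a pin on each.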

The logical thing to do is put the things which require the data bus on their own CPLD, and leave the address decoding and related functions on the other. Also, since I now have about 68 pins I have the luxury of using a few more:
  • ROM write protect inputs (whole ROM and bootcode), so the WRITE line can be filtered and some parts of the ROM can be write protected
  • 8 address line inputs instead of 6
  • 8 expansion selects instead of 6
This also means I will have to route some lines (like the READ/WRITE lines) between the two CPLDs, using more pins.

Ideally I would use a single PLCC84 CPLD. It would certainly make the PCB a bit simpler, and I would not have to feed lines between two CPLDs. But since I want to try the circuit on breadboard before I make up a PCB I will have to use two 44 pin devices.

The circuit might get some small changes, but more or less it will be the following:


The DUART will drive two serial interfaces: one available via a 3 pin PCB header, and the other one (as well as being on a 3 pin header) will also be available via a standard RS232 DB9 connector, which is driven with a MAX232. For fun (and for debugging) the two non-serial outputs on the DUART (OP1 and OP2) are attached to two LEDs.

One thing I have not yet figured out is how I will power the board. Using a USB port for power is a bit odd and not very in keeping with the rest of the computer. So I might use a standard barrel connector instead. On the other hand, USB is very convenient.

I am still waiting for two more PLCC44 to DIP44 adapters. Once I have them I will start making up the computer on breadboard. In the meantime, since I'm pretty confident the computer will work first time on breadboard, I have begun to lay out the PCB. This has proven to be much more of a challenge than the previous iteration of the computer, mostly because of the PLCC44 sockets. The board layout is about 90% done.

Roughly speaking, the jobs to be done, in the order they need to be done, are:
  1. Assemble the computer on breadboard
  2. Write the VHDL code for the CPLDs (the simplest possible at first, without the 16bit IDE latch, interrupt routing etc)
  3. Test out the computer
  4. Finish the PCB
  5. Get the PCB made
I am determined, this time round, to make the computer easily expandable with add-on PCBs. These will be attached on top of the main computer board, and could even be stackable. Using PCB headers on the top side of each board, and female pin sockets on the underside, should mean I can stack the expansion "daughter boards" one on top of another. Being ever the optimist, I have a couple in mind already:
  • Sound, video, and some kind of keyboard interface
  • General expansion, 65SPI, VIA etc
I can dream!

Monday, 3 February 2014

A disassembler and thoughts on a new core computer

First though, the integration of the seven segment display with the 6809 circuit.

This turned out to be easier than expected. I added a bus driver component which replaces the counter to drive the mux, and in turn is driven by (and drives) the databus. It also uses the read/write and chip select control lines.

The code for the bus driver is as follows:

entity busdriver is
    port ( CS : in STD_LOGIC;
           R : in STD_LOGIC;
           W : in STD_LOGIC;
           DATA : inout STD_LOGIC_VECTOR (7 downto 0);
           RESET : in STD_LOGIC;
           X : out STD_LOGIC_VECTOR (7 downto 0);
           OE : out STD_LOGIC);
end busdriver;

architecture Behavioral of busdriver is
    signal WRITING : STD_LOGIC;
    signal READING : STD_LOGIC;
    signal LATCH : STD_LOGIC_VECTOR (7 downto 0);
begin
    -- Active-high strobes: '1' only when CS and W (or R) are both low
    WRITING <= CS nor W;
    READING <= CS nor R;

    -- Capture the databus on the rising edge of WRITING; RESET is active low
    process (RESET, WRITING)
    begin
        if (RESET = '0') then
            LATCH <= x"00";
        elsif (WRITING'Event and WRITING = '1') then
            LATCH <= DATA;
        end if;
    end process;

    X <= LATCH;
    OE <= '1' when (READING = '1') else '0';
end Behavioral;

You can see here that a NOR operation is used with the write and chip select lines. The result (WRITING) is high (active) only when both the select and write lines are low. On its rising edge the databus lines are copied into the "latch", which feeds out and onto the mux. In addition to write handling, the bus driver also outputs an "output enable" line, which is high when the chip select and read lines are both low. This is used by the top level architecture:

entity main is
    port ( CS : in STD_LOGIC;
           R : in STD_LOGIC;
           W : in STD_LOGIC;
           DATABUS : inout STD_LOGIC_VECTOR (7 downto 0);
           RESET : in STD_LOGIC;
           DIGIT_ADDR : in STD_LOGIC; -- Which digit to show (Mux input)
           DIGIT_A : out STD_LOGIC;   -- Show digit A (from Mux)
           DIGIT_B : out STD_LOGIC;   -- Show digit B (from Mux)
           EN : in STD_LOGIC;       -- Decoder enable
           SEGS : out STD_LOGIC_VECTOR (6 downto 0)); -- 7 segment output
end main;

architecture Behavioral of main is
...
    signal LATCH : STD_LOGIC_VECTOR(7 downto 0);
    signal OE : STD_LOGIC;
    signal MUX_OUT : STD_LOGIC_VECTOR(3 downto 0); -- 4 bits from Mux
    signal NOT_SEGS : STD_LOGIC_VECTOR(6 downto 0); -- 7 segment outputs
begin
    bd1: busdriver port map (CS, R, W, DATABUS, RESET, LATCH, OE);
    mux1: twobyfourmux port map (LATCH(3 downto 0), LATCH(7 downto 4), DIGIT_ADDR, MUX_OUT, DIGIT_A, DIGIT_B);
    decoder1: sevensegdecoder port map(MUX_OUT, NOT_SEGS, EN);
    SEGS <= not NOT_SEGS;
    DATABUS <= LATCH when (OE = '1') else (others => 'Z');
end Behavioral;

Note that the databus is "inout", so bi-directional. In some ways it would have been cleaner to put both the read and write databus logic in the bus driver component, but I could not get this to compile. It seemed like inout lines have to be at the top level. More research needed there.
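To make the strobe polarities concrete, here is a small behavioural model of the bus driver in Python. It is a sketch rather than a translation of the VHDL: the class and method names are made up for illustration, each call to `step` stands for one set of input levels, and `None` stands in for a high-impedance databus.

```python
class BusDriver:
    """Behavioural model of the busdriver entity (CS, R, W and RESET active low)."""

    def __init__(self):
        self.latch = 0x00
        self._writing = 0  # previous WRITING level, used for edge detection

    def step(self, cs, r, w, data=None, reset=1):
        """Apply input levels; return the byte driven onto the bus, or None for high-Z."""
        writing = int(cs == 0 and w == 0)  # CS nor W
        reading = int(cs == 0 and r == 0)  # CS nor R
        if reset == 0:
            self.latch = 0x00              # asynchronous reset wins
        elif writing == 1 and self._writing == 0 and data is not None:
            self.latch = data & 0xFF       # rising edge of WRITING captures the bus
        self._writing = writing
        return self.latch if reading else None


bd = BusDriver()
bd.step(cs=0, r=1, w=0, data=0x5A)   # CPU writes 0x5A
bd.step(cs=1, r=1, w=1)              # bus idle
print(hex(bd.step(cs=0, r=0, w=1)))  # CPU reads the latch back
```

Holding the write strobe active does not re-latch the bus; only the transition into the writing state does, which mirrors the edge-triggered process in the VHDL.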

In summary this worked very well, and first time too. Here's a picture of some extremely messy breadboards:



This was very much a quick bodge. Note the RTC IC on the bottom right, which has no SPI controller to drive it.

There's not a great deal more to say about this really. I did play about with a few additional ideas: I made it so the displayed value didn't switch straight to the number put on the bus, but instead "clocked up" (or down) to the number from the current value. This involved using an additional latch to hold the value read from the databus. Then on each change in the "count up or down" line, a comparison was done: if the number on the display was more than the number pushed onto the databus, the display would count down, and likewise for counting up. The method used to implement the comparison is probably the most interesting aspect here. Initially I used the VHDL < and > operators. This turned out to be very expensive in terms of gates used in the CPLD, so in the end I switched to an explicit subtraction and then a check of the sign bit (bit 7) of the subtraction result. The comparison can then be done with a relatively cheap subtraction operation and a bit test. The code for this part is here:

architecture Behavioral of mover is
    signal COUNT : STD_LOGIC_VECTOR (7 downto 0);
    signal DIFF : STD_LOGIC_VECTOR (7 downto 0);
begin
    DIFF <= TARGET - COUNT;
    process (MOVE_CLK, RESET)
    begin
        if (RESET = '0') then
            COUNT <= x"00";
        elsif (MOVE_CLK'Event and MOVE_CLK = '1') then
            if (DIFF /= x"00") then
                if (DIFF(7) = '0') then
                    COUNT <= COUNT + 1;
                else
                    COUNT <= COUNT - 1;
                end if;
            end if;
        end if;
    end process;
    X <= COUNT;
end Behavioral;
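The sign-bit trick can be modelled in Python to see how it behaves across the whole 8 bit range. This is a sketch of the logic only, with made-up function names. One thing it shows is that, because only bit 7 of the 8 bit difference is tested, values more than 127 apart are reached the "short way round", wrapping through 0 and 255, rather than by a strict unsigned comparison.

```python
def move_step(count, target):
    """One MOVE_CLK edge: step COUNT towards TARGET using the sign of the 8 bit difference."""
    diff = (target - count) & 0xFF
    if diff == 0:
        return count               # already there
    if diff & 0x80 == 0:
        return (count + 1) & 0xFF  # bit 7 clear: count up
    return (count - 1) & 0xFF      # bit 7 set: count down


def steps_to_target(count, target):
    """Number of clock edges needed before COUNT equals TARGET."""
    steps = 0
    while count != target:
        count = move_step(count, target)
        steps += 1
    return steps


print(steps_to_target(0, 5), steps_to_target(0, 200))
```

Going from 0 to 200 takes 56 steps downwards through 255 rather than 200 steps upwards, which may or may not be what is wanted on a display.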

It is also possible to read back the digits on the display as they are counting, such that repeatedly reading the value shows it reaching its "target" value. This proves that I can put a CPLD on the databus for both reads and (the simpler to implement) writes.

This leads me on to thinking about a new design for the "core" computer, i.e. one that includes a CPLD for all the glue logic functions.

I have come up with a list of components for the core board and it is as follows:
  • 8MHz MC68B09
  • 16KB EEPROM
  • 512KB SRAM in 16KB banks
  • Glue logic in an XC9572
    • Including bank switching port for switching banks on the 512KB SRAM
    • And an 8 bit latch for driving the IDE port in 16bit mode
  • DUART along with a MAX232 and DB9 socket
    • The second DUART port will be for other "auxiliary" devices and will have a TTL level header
  • IDE interface
  • Expansion interface
The CPLD is the most interesting aspect. I'm hoping I can fit all of the following logic and registers inside an XC9572:
  • Address decoding for all devices on the main board, plus some lines for the expansion interface
  • Read/Write line generation
  • Reset control (possibly)
  • Interrupt routing
  • Bank switching port
  • Latches for the IDE port so it can read and write 16bits at a time
Bank switching will allow my computer to have a whole half megabyte of RAM, albeit accessed in 16KB slices. The idea is half the 32KB of address space set aside for RAM will be always mapped to the same fixed location in the 512KB RAM, whilst the other half will be free to be mapped to one of the 32 (512KB/16KB) banks. The fixed half will likely be the low half, since this is where the stack is. Actual control of the banking will be performed by writing a 5 bit value into a register in the CPLD. I've not quite worked out the logic yet, but the beauty of the CPLD is I can do this after I've laid out the PCB.
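The address translation described above might look something like the following Python sketch. Several things here are assumptions rather than decisions from the design: that the RAM window sits at 0x0000 to 0x7FFF, that the fixed low half always maps to bank 0, and the function and constant names.

```python
BANK_SIZE = 16 * 1024                   # 16KB per bank
NUM_BANKS = (512 * 1024) // BANK_SIZE   # 32 banks, selected by a 5 bit register
FIXED_BANK = 0                          # assumption: the fixed half is bank 0


def physical_address(cpu_addr, bank_reg):
    """Translate a CPU address in the RAM window to an address in the 512KB SRAM."""
    if not 0x0000 <= cpu_addr <= 0x7FFF:
        raise ValueError("address is outside the assumed RAM window")
    offset = cpu_addr & (BANK_SIZE - 1)  # low 14 bits pass straight through
    if cpu_addr < BANK_SIZE:
        bank = FIXED_BANK                # low half: always the same bank
    else:
        bank = bank_reg & 0x1F           # high half: bank from the 5 bit register
    return bank * BANK_SIZE + offset


print(hex(physical_address(0x4000, 31)))
```

With these assumptions, writing 0 to the bank register makes the switchable half an alias of the fixed half, so in practice banks 1 to 31 would be the useful ones.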

Another thing the CPLD will be responsible for is interrupt routing. So far my computer has steered clear of interrupts, but it is high time I tackle this feature of the 6809. Again, using the CPLD I can "worry about this in the code" by having all my device interrupt lines terminate at the CPLD, and have it generate signals for the 3 interrupt lines in the 6809. Most likely the CPLD can simply OR a bunch of device interrupt lines into (say) the main IRQ line. But until I'm ready to write the interrupt handling code the VHDL for that part of the glue logic can just hold the 6809 interrupt lines high.
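The routing idea can be sketched in Python as below. The 6809 interrupt inputs are active low, so "ORing" requests really means pulling /IRQ low when any source is asserting. The polarity of each device line is passed in by the caller because it is not stated here, and the function names are purely illustrative.

```python
def route_irq(sources):
    """Combine device interrupt lines into the level for the 6809 /IRQ pin.

    sources is a list of (level, active_level) pairs, one per device line.
    Returns 0 (asserted) if any source is at its active level, else 1.
    """
    asserted = any(level == active for level, active in sources)
    return 0 if asserted else 1


def idle_interrupts():
    """Placeholder routing: hold all three 6809 interrupt inputs high (inactive)."""
    return {"NMI": 1, "IRQ": 1, "FIRQ": 1}


# One active-low source idle, one active-high source requesting
print(route_irq([(1, 0), (1, 1)]))
```

The `idle_interrupts` function corresponds to the interim plan of holding the lines high until the interrupt handling code exists.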

Finally I'm keen to get my IDE interface working with every type of IDE device, including harddisks. To do this means my IDE port must handle 16bit transfers. But my 6809 only has an 8 bit databus, so to do this requires some additional latches to store the "high byte" of both reads and writes. These latches can then be read after the "low byte" comes directly from the IDE port. The CPLD should be perfect for this, assuming I have enough registers and pins available.
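A Python model of the latch scheme helps show the access order from the CPU's point of view. It is only a sketch under assumptions: the read side follows the description above (low byte direct, high byte held in a latch), while the write side assumes the high byte is written to the latch first and the low byte write triggers the 16 bit transfer. The class and its lists stand in for a real drive.

```python
class Ide16Latch:
    """Model of 8 bit CPU access to a 16 bit IDE data port via high-byte latches."""

    def __init__(self, device_words):
        self._incoming = list(device_words)  # words the (pretend) drive will return
        self.written = []                    # words sent to the (pretend) drive
        self._read_high = 0
        self._write_high = 0

    def read_low(self):
        """16 bit IDE read: low byte goes to the CPU, high byte is latched."""
        word = self._incoming.pop(0)
        self._read_high = (word >> 8) & 0xFF
        return word & 0xFF

    def read_high(self):
        """Fetch the high byte captured by the last read_low()."""
        return self._read_high

    def write_high(self, value):
        """Store the high byte; nothing reaches the drive yet (assumed order)."""
        self._write_high = value & 0xFF

    def write_low(self, value):
        """16 bit IDE write: latched high byte plus this low byte."""
        self.written.append((self._write_high << 8) | (value & 0xFF))


port = Ide16Latch([0x1234])
print(hex(port.read_low()), hex(port.read_high()))
```

Each 16 bit word costs the CPU two bus cycles, but only one of them touches the drive.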

I've started designing the circuit for my new core computer, but it's not quite there yet. Hopefully from the description you can see roughly what I'm trying to achieve.

In other news, I have been busy implementing another monitor feature: a disassembler. All good monitors had these. I also fancied a software challenge after spending several weeks on programmable logic. Mine is fairly basic (though what does a fully featured disassembler for an 8bit CPU ever have to do?); just point it at some memory and tell it how many instructions to disassemble and it will print:
  • The memory address (label)
  • The mnemonic and any parameters
  • The hex for the memory values in the whole instruction
  • The printable ASCII for the above
The disassembler is clever enough to calculate and display absolute addresses when showing relative addresses in branch instructions, and it shows signed and unsigned values at the right point.
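The absolute address calculation for branches can be illustrated in Python. On the 6809 a branch offset is signed and relative to the address of the following instruction; short branches carry an 8 bit offset and long branches a 16 bit one. The function names below are illustrative and not taken from the monitor source.

```python
def as_signed(value, bits=8):
    """Interpret an unsigned field of the given width as two's complement."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def branch_target(instr_addr, instr_len, offset, offset_bits=8):
    """Absolute destination of a relative branch, wrapped to 16 bits."""
    return (instr_addr + instr_len + as_signed(offset, offset_bits)) & 0xFFFF


# "bra" to itself assembles as 20 FE: two bytes long, offset -2
print(hex(branch_target(0x1000, 2, 0xFE)))
```

The same `as_signed` helper covers the "signed or unsigned at the right point" decision: the raw byte is shown as-is for unsigned operands and passed through it for signed ones.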

Coding wise, this is easily the most complex and largest piece of 6809 assembly I've yet written. It uses asxxxx macros to define arrays of "structures" that make up lookup tables. It also uses references (pointers) to the different tables, some of which contain subroutine addresses. The disassembler is mostly data-driven, with nearly all of the instruction decoding being driven by lookup tables. The code needs some further cleanups but is mostly there, if a bit rough in places. At some point I want to go back over the indexed mode decoding, which is by far the most complicated aspect of the 6809 instruction set, and thus this disassembler.

All the work was done with only a listing of opcodes to hand, namely this document: http://public.logica.com/~burgins/emulator/com/m6809.html

As an illustration of its use, here is a fragment from the monitor code. In this case it is the routine from the disassembler itself (so we are disassembling a disassembler, hmm) which deals with the processing of the stacking opcodes (pshs, puls, pshu, pulu) and how an immediate byte is turned into a handy list of registers:

stkimmedhandle: ldb #8                  ; 8 shifts for a byte
                lda opcode              ; get the original opcode
                bita #0x02              ; bit 1 clear means s stacking, else u
                beq sstacking           ; it is pshs or puls
                ldy #ustackingtab       ; so set the table pointer
                bra stackingstart       ; hop hop
sstacking:      ldy #sstackingtab       ; it is pshs or puls
stackingstart:  lda ,u                  ; advance at the end not now
stackingloop:   rora                    ; rotate into carry
                bcc endstacking         ; if 0, then we are not stacking
                ldx ,y                  ; deref to get the reg string
                lbsr outputappend       ; append the reg name
                ldx #commamsg           ; and we need a comma
                lbsr outputappend       ; so add that
endstacking:    leay 2,y                ; eitherway move index along
                decb                    ; and decrement bit counter
                bne stackingloop        ; see if there are more bits
                tst ,u+                 ; now we can advance to next opcode
                beq stkimmedout         ; if there was nothing to stack
                lbsr outputbackone      ; don't move the cursor back one
stkimmedout:    rts                     ; out
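For comparison, the same decoding can be written in a few lines of Python. This is a sketch, not a port of the monitor code: the register tables below follow the standard 6809 postbyte layout (bit 0 is CC, bit 7 is PC, and bit 6 is U for pshs/puls or S for pshu/pulu), but the actual contents of sstackingtab and ustackingtab are not shown above, so the names and case used here are assumptions.

```python
# Register names by postbyte bit, lowest bit first (matching the rotate-right loop)
S_STACK_REGS = ["cc", "a", "b", "dp", "x", "y", "u", "pc"]  # pshs / puls
U_STACK_REGS = ["cc", "a", "b", "dp", "x", "y", "s", "pc"]  # pshu / pulu


def stacked_registers(opcode, postbyte):
    """Turn a stacking opcode and its postbyte into a comma separated register list."""
    table = U_STACK_REGS if opcode & 0x02 else S_STACK_REGS
    names = [name for bit, name in enumerate(table) if postbyte & (1 << bit)]
    return ",".join(names)


# 0x34 is pshs; postbyte 0x06 selects A and B
print(stacked_registers(0x34, 0x06))
```

Joining a list sidesteps the trailing comma that the assembly version has to back up over at the end.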

And here is a screenshot of the disassembler in use:


Obviously the symbolic labels are missing, and all values are shown in hex, but the code is mostly readable, if somewhat hard to understand due mostly to the lack of comments.

Writing the disassembler was useful because I learned a bit more about the 6809 instruction set. For instance LDY requires an extra byte compared to LDX. Previously I was treating the two index registers the same.

The code, as always, is on github.

Next up I will finalise the core computer circuit diagram and then start on the PCB layout.