Saturday, 7 March 2015

FPGA preliminaries

These last few months have been a bit "scrappy" on the projects front. I've not made as much progress as I'd hoped, but I have started down several different fronts and have done a fair amount of reading.

To learn about implementing a DMA controller, I initially started working on a new 6809 computer core on breadboard. In the end I had a better idea: since I had five PCBs made up for my computer main board, I decided to make some board "hacks" to allow me to use my existing board for researching approaches to DMA. The only real problem with using my existing board for DMA experiments is it doesn't have connections between the low half of the address bus and the CPLD. But what it does have is eight CPLD lines that were designed to be used as chip selects. It also has the low half of the address bus on the expansion connector. Therefore if those pins can be joined together the CPLD will have access to the entire address bus. Other points in the circuit can easily be linked through to the CPLD via other pins on the expansion connector:


Whilst there are several ways to interleave access to the busses for the DMA controller and the CPU, the simplest approach is to halt the CPU whilst the DMA transfer is in progress. This is more useful than it sounds, since a byte can be read from one address and then written to another far quicker in hardware than it can in MPU software. Only four clock edges (2µs at a 2MHz clock) are needed for the transfer of a byte, vs about 10µs for the same operation in 6809 code, since index registers have to be incremented, compared etc.
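To put those per-byte figures in context, here is a minimal Python sketch that scales them up to a whole 256 byte page. The two per-byte costs are the approximate figures quoted above, not measurements, and the function name is just for illustration. It also ignores the time the 6809 takes to respond to /HALT.

```python
# Per-byte costs quoted in the post (approximate figures, not measurements).
DMA_US_PER_BYTE = 2.0    # four E clock edges per byte
CPU_US_PER_BYTE = 10.0   # load, store, increment, compare and branch in 6809 code

PAGE_BYTES = 256


def copy_time_us(byte_count, us_per_byte):
    """Total time in microseconds to move byte_count bytes."""
    return byte_count * us_per_byte


dma_page_us = copy_time_us(PAGE_BYTES, DMA_US_PER_BYTE)
cpu_page_us = copy_time_us(PAGE_BYTES, CPU_US_PER_BYTE)
speedup = cpu_page_us / dma_page_us

print(f"DMA: {dma_page_us:.0f} us, CPU: {cpu_page_us:.0f} us, speedup: {speedup:.0f}x")
```

With these figures a page copy drops from roughly 2.5ms in software to about 0.5ms in hardware.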

Actually controlling the transfer is done in several steps:

First, MPU code needs to set up the transfer. In my trivial design only whole 256 byte pages can be copied, so the source and destination are set by writing to two registers. Each holds the top byte of the source or destination address (the page address, if you will).

Next the MPU writes to a "start transfer" register.

The CPLD then asserts /HALT. The 6809 won't stop right away, since it needs to finish the current instruction; instead it signals the halted state later on by setting the BA and BS pins to 1. These are connected to the CPLD using the two white wires in the above picture. At this point a page transfer is started. There are four steps to a byte transfer, each happening on a different E clock edge:
  1. First the source address is put on the address bus.
  2. Then the databus is read.
  3. Then the destination address is put on the address bus.
  4. Finally the databus is written to.
The byte offset within the page is then incremented and the transfer continues until the end of the page, at which point /HALT is released so the MPU can continue.
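The four-step sequence above can be modelled in software. What follows is only a behavioural sketch in Python, not the VHDL from the CPLD: the class, method and attribute names are all invented for illustration, and the wait for BA and BS is collapsed into a single `busy` flag.

```python
from enum import Enum


class Step(Enum):
    SRC_ADDR = 0   # drive the source address onto the address bus
    READ = 1       # latch the byte on the data bus
    DST_ADDR = 2   # drive the destination address onto the address bus
    WRITE = 3      # drive the latched byte onto the data bus


class PageDma:
    """Behavioural model of a whole-page DMA copy (hypothetical names)."""

    def __init__(self, memory):
        self.memory = memory
        self.src_page = 0
        self.dst_page = 0
        self.offset = 0          # the single 8 bit counter
        self.step = Step.SRC_ADDR
        self.address = 0
        self.latch = 0
        self.busy = False        # stands in for /HALT asserted and MPU halted
        self.edges = 0           # clock edges consumed by the transfer

    def start(self, src_page, dst_page):
        self.src_page = src_page & 0xFF
        self.dst_page = dst_page & 0xFF
        self.offset = 0
        self.step = Step.SRC_ADDR
        self.edges = 0
        self.busy = True

    def clock_edge(self):
        """Advance the transfer by one E clock edge."""
        if not self.busy:
            return
        self.edges += 1
        if self.step is Step.SRC_ADDR:
            self.address = (self.src_page << 8) | self.offset
            self.step = Step.READ
        elif self.step is Step.READ:
            self.latch = self.memory[self.address]
            self.step = Step.DST_ADDR
        elif self.step is Step.DST_ADDR:
            self.address = (self.dst_page << 8) | self.offset
            self.step = Step.WRITE
        else:
            self.memory[self.address] = self.latch
            self.step = Step.SRC_ADDR
            self.offset = (self.offset + 1) & 0xFF
            if self.offset == 0:     # counter wrapped: end of page
                self.busy = False    # release /HALT


# Copy page 0x20 to page 0x30 in a 64K address space.
memory = bytearray(65536)
memory[0x2000:0x2100] = bytes(range(256))
dma = PageDma(memory)
dma.start(0x20, 0x30)
while dma.busy:
    dma.clock_edge()
```

After the loop finishes, page 0x30 holds a copy of page 0x20 and the model has consumed 1024 edges, four per byte.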

I implemented this and it seemed to work quite well. I improved the design slightly by having the DMA transfer interleaved with the MPU running such that the page transfer was chunked into four pieces. This meant the MPU could do stuff whilst the transfer was going on. All in all I was quite pleased with my little prototype. I managed to make a screenshot of a waveform capture:



This shows nicely the interleaved DMA and CPU access to the busses. When this screenshot was made the VHDL had an error which resulted in a further byte being transferred at the end of the page. This is the cause of the "blip" at the tail end of the capture.
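The chunking can be sketched as a simple schedule of bus ownership. This is only an illustration in Python: the function name is made up, and how long the MPU runs between chunks is not modelled since it depends on the code being executed.

```python
PAGE_BYTES = 256
CHUNKS = 4   # the page transfer was split into four pieces


def interleaved_schedule(page_bytes=PAGE_BYTES, chunks=CHUNKS):
    """Return a list of (bus_owner, first_offset, last_offset) slots.

    CPU slots carry no offsets; they mark where the MPU is released
    between DMA bursts.
    """
    chunk_len = page_bytes // chunks
    schedule = []
    for i in range(chunks):
        first = i * chunk_len
        last = first + chunk_len - 1
        schedule.append(("DMA", first, last))
        if i < chunks - 1:
            schedule.append(("CPU", None, None))
    return schedule


schedule = interleaved_schedule()
for owner, first, last in schedule:
    if owner == "DMA":
        print(f"DMA copies offsets {first:3d}..{last:3d}")
    else:
        print("MPU runs")
```

The last DMA slot ends at offset 255. An off-by-one in that final bound is the kind of mistake that would produce an extra byte at the end of the page.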

The fact that the transfers were all done in whole pages is of course a massive limitation. 

Unfortunately it wasn't through choice, but rather because of the limitations of the XC95108 CPLD. There wasn't room for a better DMA controller design in the programmable logic since using arbitrary addresses for the source and destination address of the transfer meant three 16 bit counters would be needed (source, destination and length) instead of a single 8 bit counter in the page transfer version. This was and is obviously a big problem. Also an MMU was going to require even more logic since it needs to hold page tables etc.
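A rough way to see the problem is to count register bits. The sketch below is a back-of-envelope estimate only: it assumes one flip-flop per macrocell (the XC95108 has 108 macrocells), counts the two page registers alongside the 8 bit counter, and ignores the product terms the counters need as well as all the other logic already in the CPLD.

```python
MACROCELLS = 108   # XC95108: one flip-flop per macrocell

# Register bits for the whole-page design described earlier.
page_design = {
    "source page register": 8,
    "destination page register": 8,
    "byte offset counter": 8,
}

# Register bits for arbitrary source, destination and length.
arbitrary_design = {
    "source address counter": 16,
    "destination address counter": 16,
    "length counter": 16,
}


def flip_flops(design):
    """Total number of register bits in a design."""
    return sum(design.values())


for name, design in (("page", page_design), ("arbitrary", arbitrary_design)):
    bits = flip_flops(design)
    print(f"{name}: {bits} flip-flops, {100 * bits / MACROCELLS:.0f}% of the device")
```

On this estimate the arbitrary-address version needs twice the flip-flops, approaching half the device before any of the existing glue logic is counted.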

Another problem is the number of IO pins I will need. The XC95108 has barely enough in my current computer, but the DMA controller and MMU will mean I need lots more. This means either using multiple PLCC84 devices or going to TQFP surface mount packaging, something I'm very nervous about due to the tininess of the pins and the PCB design challenges.

After doing some reading over the course of several weeks, I made a list of different options and listed out the pros and cons for each one:

Multiple XC95108s/72s CPLDs

Good:
  • Known technology and software (this is also a bad)
  • Easy mounting with PLCC
  • Got the parts already

Bad:
  • Large board required
  • Even with (say) 3 PLCC parts it wouldn't be possible to do everything, eg a nice MMU. DMA controller would be limited

XC95288XL in TQFP208

Good:
  • No startup problem like FPGA
  • There shouldn't be 5V problems

Bad:
  • Still can't do an MMU, though decent DMA controller is just possible.
  • Probably still requires multiple CPLDs
  • Tricky mounting, if I was to attempt to attach it directly to the PCB instead of using an adapter

Spartan 1 FPGA (datasheet)

Good:
  • Can be prototyped with PLCC parts
  • Just enough logic for an MMU, I think.
  • 5V

Bad:
  • Requires using some very old non-free software which I can't find anywhere
  • FPGA startup problem needs solving
  • Needs a new programmer
  • Very obsolete hardware
  • May require a CPLD for other core logic 

Spartan 2 FPGA (datasheet)

Good:
  • Loads of logic. The smallest one gives just enough in 100 pins, but I need more pins than that
  • Easy to obtain on eBay
  • Supported by ISE 10, which is modern

Bad:
  • No PLCC - 0.5mm pitch pins so harder prototyping
  • Would need 2.5V and 3.3V connections
  • FPGA startup problem needs solving
  • Possible problems mixing 5V and 3.3V though Spartan 2 is 5V tolerant

Atmel AT40K FPGA (datasheet)

Good:
  • 5V FPGA
  • Appears to be a current product
  • Easy prototyping in PLCC

Bad:
  • Very hard to get hold of
  • Requires paid for software
  • Atmel software looks crappy and does no synthesis
  • FPGA startup problem

Altera Flex 6000 FPGA (datasheet)

Good:
  • 5V
  • Bigger ones have enough logic for decent DMAC and MMU, though 0.5mm TQFP
  • Almost current software

Bad:
  • Rather difficult to get hold of
  • Need to learn the Altera software
  • Needs a new programmer
  • No PLCC versions so prototyping requires SMT part
In the end it seemed obvious that the Altera FPGAs were the way to go.

The first thing I did was download the software, Altera Quartus. A slightly old version of the software was needed to be compatible with the older parts, but it appears to work well. After importing the VHDL for my prototype DMA controller and modifying it to support arbitrary addressing, two things were apparent.

Firstly VHDL was not the same everywhere. I had to make some small coding changes to make the design build.

Secondly, the DMA controller fit easily in the Flex 6000 I had in mind for my computer, the TQFP144 version, with lots of room to spare.

The next thing to do was to buy a programmer, since each vendor appears to have their own one and the Xilinx USB Platform I already had was clearly not compatible. I opted for a programmer that came with a small Cyclone FPGA development board. I did this mostly because the dev board only added a few pounds to the cost of the programmer, and also so I would get some experience of working with FPGAs on a dev board I knew would work because someone else made it. Here's a picture of the dev board with the programmer:


To be specific, the dev board I purchased has an EP1C3T144. This is more advanced than the Flex 6000, but not massively so. Of course it isn't a 5V part, so it is not very suitable for use in my micro made of mostly 30 year old parts. The board is fairly simple. It has an oscillator, LEDs, buttons and a DS1305 real time clock, for experiments. It also has a flash to hold the design. Documentation consisted of a schematic which I was eventually able to find on a random forum post.

After receiving the board and programmer in the post I set about playing with it and learning about the main difference between FPGAs and CPLDs, which is where the design is stored.

Though there are now exceptions, CPLDs store the design in flash memory in the IC. This means the logic is available to use the moment the device is powered up. With FPGAs, the design is stored in an external flash memory IC. At power up the design bit stream is transferred to the FPGA, usually via a serial protocol. This takes anything up to a few hundred milliseconds, for the bigger FPGAs, and makes FPGAs less suitable for some applications like glue logic.
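The start-up delay is simply the bit stream length divided by the configuration clock rate. The Python sketch below shows the arithmetic; both figures in it are placeholder assumptions chosen for round numbers, not values from any datasheet, so substitute the real ones for a particular part.

```python
# Placeholder assumptions for illustration only, NOT datasheet values.
CONFIG_BITS = 1_000_000        # hypothetical bit stream length in bits
CONFIG_CLOCK_HZ = 10_000_000   # hypothetical serial configuration clock


def config_time_ms(bits, clock_hz):
    """Time in milliseconds to shift a bit stream in, one bit per clock."""
    return bits * 1000 / clock_hz


config_ms = config_time_ms(CONFIG_BITS, CONFIG_CLOCK_HZ)
print(f"Configuration takes about {config_ms:.0f} ms")
```

During that time the FPGA's pins are not yet doing their designed job, which is the "startup problem" listed against the FPGA options earlier.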

With the Altera software and my new dev board it is possible to either program the separate flash, in which case the design is persistent and will be loaded into the FPGA at start-up, or the FPGA directly. If the FPGA is programmed then the design uploaded will be active in the device only until power is removed.

To verify I understood what was happening on the board I wrote a simple frequency divider, driven by the on-board oscillator, and had the LEDs count up. This was trivial to do. One thing I would like to do sometime is "port" my up and down counter, described in a previous blog post. It should work quite well. As an exercise I might even implement the real time clock with buttons for setting the time etc.
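For anyone unfamiliar with the idea, a divider of this kind is just a wide counter clocked by the oscillator, with its top bits wired to the LEDs. Here is a small behavioural model in Python rather than VHDL. The class name and the counter and LED widths are assumptions for illustration; the real board's oscillator frequency and LED count are not given here.

```python
class LedDivider:
    """Free-running counter whose top bits drive the LEDs (hypothetical model)."""

    def __init__(self, counter_bits=24, led_count=4):
        self.counter_bits = counter_bits
        self.led_count = led_count
        self.counter = 0

    def tick(self):
        """One rising edge of the oscillator: increment and wrap."""
        self.counter = (self.counter + 1) & ((1 << self.counter_bits) - 1)

    @property
    def leds(self):
        """LED pattern: the most significant led_count bits of the counter."""
        return self.counter >> (self.counter_bits - self.led_count)


# Small widths so the behaviour is easy to follow by hand.
divider = LedDivider(counter_bits=6, led_count=2)
patterns = []
for _ in range(64):
    divider.tick()
    patterns.append(divider.leds)
```

With a 6 bit counter and two LEDs the pattern advances once every 16 ticks. Widening the counter slows the LEDs down to something the eye can follow.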

Anyway, now that I had figured out roughly how to use the Altera software and programmer, the next stage was to order a Flex 6000 and some other parts. My plan was to make a PCB which would be a development board for the Flex FPGA. It would consist of:
  • FPGA
  • ATMega8 microcontroller for generating clocks etc
  • LEDs
  • Buttons
The plan was to mount the FPGA to an adapter board, which brings out the 144 pins into DIP headers arranged in a square, and then attach this to my home made dev board PCB. The adapter boards, five of them, were ordered from eBay.

It was only after ordering the parts that I noticed a big problem when reading, in more detail, the datasheets: the Flex 6000 supports only One Time Programmable flashes. This means it would be impossible to change the design after programming it into a flash, unlike the flash in my professionally made Cyclone dev board. This makes the Flex 6000 unsuitable for my 6809 microcomputer.

However, not wanting to give up with the Flex 6000, I decided I could still use it in a dev board. The limitation would be that it wouldn't have a flash IC, and would only be directly programmable. It would forget the design after the power was removed, but it would still be useful for learning about the Flex 6000.

To use a TQFP144 Flex 6000 in a homemade dev board would require the FPGA be attached to an adapter board. This adapter board would then be attached to the single sided PCB dev board. This presented me with the task of soldering a 144 pin TQFP device, with 0.5mm pitch pins, to an adapter board. I'd never done this level of surface mount soldering before, and unfortunately my first attempt ended in disaster. Despite watching a few videos on the topic I used the wrong technique and too much solder, and ended up with a ruined part. I think I know what I did wrong though. To give me more chance of success next time I ordered a few more things from good old eBay: a magnifying loupe and a (cheap) USB microscope:


The second attempt was more successful. I used a lot of flux and, with the technique of applying a mostly "dry" iron to each lead and dragging down to the pad, I seemed to have much better luck than the first time:


Here's a better view of some of the pins from the microscope:


Whilst the amount of solder on the pads is not "great", a continuity check indicates that all appears to be well. When I next try this kind of thing, I will vary my technique and hopefully get more solder onto the pads.

I have also drawn up the schematic for the dev board:


As you can see it is dominated by the FPGA. As well as the push buttons, LEDs and the ATMega8 microcontroller, there is also a 16 pin header attached to the FPGA. I included this so I could, eventually, attach the FPGA to the 6809 board for experimentation purposes.

Whilst I have laid out the board for this schematic so I can make it up myself, and I have done a toner transfer and etch, using the technique I've described in previous posts, I'm not that happy with the result. Therefore I will make up another board before drilling it out, and soldering it up with the adapter board attached via pin headers.

One piece of good news is I've found out about another Altera FPGA, the Flex 10K (datasheet). It was also made in 5V parts, which is nice and convenient. It also supports reprogrammable configuration flashes, namely the EPC2. Whilst it has slightly fewer logic elements, it does have built in memory, which will be useful when it comes to implementing the MMU. Interestingly, it was also made in PLCC (I'm guessing the 10K was produced before the 6000). This might be useful even after all my getting used to TQFP parts, since switching back to PLCC would certainly be easier when it comes to making up the new 6809 computer board. However I would need multiple ICs, not just one, since the number of pins on a PLCC84 device would not be enough to do all I want to do. Multiple big ICs on the main computer board would be a pain, so I may yet end up with a "strange" TQFP part in the middle of a sea of 30 year old DIP ICs.

Hopefully in the coming weeks I'll have the Flex 6000 dev board up and running. Then I'll know I understand a little more of the world of FPGAs...