
A Challenge for serialized processor design and implementation

Started by Antti March 19, 2008
Summary

The discussion centers on a challenge to design an ultra-minimalist, bit-serial 32-bit processor for FPGAs that executes code directly from serial flash or SD cards. The goal is to create a core that uses less than 25% of the logic in the smallest modern FPGAs while maintaining compatibility with high-level C compilers.

Participants debated whether to revive historical bit-serial architectures like the COP800 and 6804 or to implement a serialized version of a modern RISC architecture like MIPS or Transputer to leverage existing toolchains.

  • A 32-bit bit-serial ALU can be implemented with nearly zero additional FPGA fabric cost compared to a 1-bit version.
  • Serial memory (Quad SPI Flash) is proposed as the primary code storage to save FPGA pins and internal Block RAM.
  • Compatibility with an existing C compiler is the most critical constraint for making the processor usable.
  • The Transputer's minimalist hardware stack was highlighted as a strong candidate for a compact serial implementation.
  • Specialized FPGA resources, such as DataFlash buffers in Spartan-3AN, could be leveraged to eliminate BRAM usage entirely.
Hi

I have been thinking about, and working part time towards, the goal of making
a usable and useful serialized processor. The idea is that it should be

1) VERY small when implemented in any modern FPGA (less than 25% of the
smallest device, 1 BRAM)
2) be supported by a high-level compiler (C?)
3) execute code in place from either serial flash (Winbond quad-speed
SPI memory delivers 320 Mbit/s!) or from a file on an SD card

A serial implementation would be smaller and run at higher clock speeds, so
even 128 clocks per machine cycle would already mean 2 MIPS, which would be
acceptable for many applications.
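A rough check of these figures, as a sketch only: 2 MIPS at 128 clocks per machine cycle implies a serial clock of about 256 MHz, and a 320 Mbit/s flash stream bounds sequential fetch at about 10 million words per second. The function names and the 32-bit opcode width are assumptions made for illustration, not part of the proposal.

```python
def required_clock_hz(mips, clocks_per_instruction):
    """Clock frequency needed to reach `mips` million instructions per second."""
    return mips * 1_000_000 * clocks_per_instruction


def fetch_limited_mips(flash_bits_per_s, opcode_bits):
    """Upper bound on instruction rate when every opcode streams from flash.

    Assumes purely sequential fetch with no jump or command overhead.
    """
    return flash_bits_per_s / opcode_bits / 1_000_000


if __name__ == "__main__":
    # 128 clocks per machine cycle at 2 MIPS
    print(required_clock_hz(2, 128))
    # 320 Mbit/s quad SPI, assumed 32-bit opcodes
    print(fetch_limited_mips(320_000_000, 32))
```

Under these assumptions the flash bandwidth is not the bottleneck at 2 MIPS; the serial datapath clock is.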

The Parallax BASIC Stamp I executes only 2 KIPS, so an ultra-light serial
processor in an FPGA with 2 MIPS would be, eh, for me it's something to dream
of :)

I have poked around this idea for some years, but never got the "final
kick" to really go and complete the design and development of this
processor.

So I decided to offer a bounty to maybe motivate others to work
towards this goal and dream. The current list of items available to
developers from my own funding is listed here (I hope to add items and
maybe some $ over time):

http://code.google.com/p/serial-processor/w/list

There is also a very preliminary spec/goal document there as well.

Antti Lukats
Antti wrote:

> Hi
>
> I have been think and part time working towards a goal to make useable
> and useful serialized processor. The idea is that it should be
>
> 1) VERY small when implemented in any modern FPGA (less 25% of
> smallest device, 1 BRAM)
> 2) be supported by high level compiler (C ?)
That means an existing core? Did someone mention an 8080 earlier? How small is that?

To work best with serial memory, a smart set of skip opcodes is needed, e.g.
a conditional skip of 1, 2, 3 or 4 opcodes. Jumps are very slow, though to
accept existing cores, I suppose some (small enough?) extra logic could be
added that 'sniffed' the jump opcodes and diverted the very small forward
ones to a Skip-Instead block?
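The skip-versus-jump trade-off can be sketched with a simple clock-count model. A skip just streams past the unwanted opcodes, while a jump has to restart the flash read with a new command and address. All constants below (16-bit opcodes, 4 bits per clock, an 8-bit command, a 24-bit address, 8 dummy clocks) are illustrative assumptions, not values taken from any datasheet.

```python
# Hypothetical parameters, chosen only to illustrate the trade-off.
OPCODE_BITS = 16      # assumed opcode width
BITS_PER_CLOCK = 4    # quad serial interface
CMD_BITS = 8          # assumed read command length
ADDR_BITS = 24        # assumed address length
DUMMY_CLOCKS = 8      # assumed wait clocks before data returns


def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def skip_clocks(n_opcodes):
    """Clocks spent streaming past n_opcodes without executing them."""
    return ceil_div(n_opcodes * OPCODE_BITS, BITS_PER_CLOCK)


def jump_clocks():
    """Clocks to restart the read at a new address before data flows again."""
    return ceil_div(CMD_BITS + ADDR_BITS, BITS_PER_CLOCK) + DUMMY_CLOCKS


def longest_worthwhile_skip():
    """Largest forward distance (in opcodes) where skipping is no slower than jumping."""
    n = 0
    while skip_clocks(n + 1) <= jump_clocks():
        n += 1
    return n


if __name__ == "__main__":
    print(skip_clocks(1), jump_clocks(), longest_worthwhile_skip())
```

With these assumed numbers, short forward branches of up to a few opcodes cost no more as skips than as jumps, which is the case a Skip-Instead block would target. Real break-even points depend on the actual flash command protocol.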
> 3) execute code in-place from either serial flash (Winbond quad speed
> SPI memory delivers 320 mbit/s!) or from file on sd-card
What about adding this as a DATA memory option? SPI SRAM:
http://www.amis.com/products/ulp_memory/serial_srams/index.html

> serial implementation would be smaller and run at higher speeds, so
> 128 clock per machine cycle would already mean 2 MIPS, what would be
> acceptable for many applications.
I'd target the Winbond Quad parts as a primary target, and look at 1-, 2- and
4-bit serial choices (1 bit may not be the smallest?).

4 bit (nibble serial) would look to have merit, as it matches the memory,
and you can also do 8/16/(24?)/32-bit cores with simply wider busses.

IIRC, NatSemi used 10 clocks on their COPs, which gave 8 clocks to transport
data and 2 clocks for opcode action. 2 Winbond devices would morph this to
4 DT clocks + 2 opcode, for a 6 clock core.

-jg
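To make the 1-, 2- and 4-bit choice concrete, here is a minimal software model of a digit-serial adder: the operands are consumed least significant digit first, one digit per clock, with the carry held between clocks. The function name and parameters are hypothetical, and this models only the arithmetic, not any FPGA resource cost.

```python
def serial_add(a, b, word_bits=32, digit_bits=1):
    """Add two words one digit per clock, LSB first.

    Returns (sum modulo 2**word_bits, number of clocks used).
    Assumes word_bits is a multiple of digit_bits.
    """
    digit_mask = (1 << digit_bits) - 1
    carry = 0
    result = 0
    clocks = 0
    for shift in range(0, word_bits, digit_bits):
        digit_a = (a >> shift) & digit_mask
        digit_b = (b >> shift) & digit_mask
        total = digit_a + digit_b + carry
        result |= (total & digit_mask) << shift   # sum digit for this clock
        carry = total >> digit_bits               # carry held for the next clock
        clocks += 1
    return result & ((1 << word_bits) - 1), clocks


if __name__ == "__main__":
    for width in (1, 2, 4):
        print(width, serial_add(123456789, 987654321, digit_bits=width))
```

In this model a 32-bit add takes 32, 16 or 8 clocks for 1-, 2- or 4-bit digits, and an 8-bit word at 1 bit per clock takes the 8 data clocks mentioned for the COP parts.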
> I have been think and part time working towards a goal to make useable
> and useful serialized processor.
The idea is not new. According to Denyer and Renshaw, "VLSI Signal Processing:
A Bit-Serial Approach", Addison-Wesley, 1985, there were such machines back in
the 1950s and 1960s (Pilot ACE, 1953).

A description of a typical application would be "Speech Codec architecture for
Pan-European Digital Mobile Radio using bit-serial Signal processing"
(Nokia & Tampere University) in: Brodersen, "VLSI Signal Processing III",
IEEE Press, 1988. That's a DSP for GSM, a fixed application.

But serial 8 bit microprocessors like the MC6803 or ST62xx were a slow nuisance.

Regards,
JRD

"Antti" <Antti.Lukats@googlemail.com> wrote in message
news:b2a2d8ca-444d-4508-9f2c-33adb30c8829@p73g2000hsd.googlegroups.com...
> Hi
>
> I have been think and part time working towards a goal to make useable
> and useful serialized processor. The idea is that it should be
>
> 1) VERY small when implemented in any modern FPGA (less 25% of
> smallest device, 1 BRAM)
> 2) be supported by high level compiler (C ?)
> 3) execute code in-place from either serial flash (Winbond quad speed
> SPI memory delivers 320 mbit/s!) or from file on sd-card
>
You mean a COP800 FPGA core :-)

> serial implementation would be smaller and run at higher speeds, so
> 128 clock per machine cycle would already mean 2 MIPS, what would be
> acceptable for many applications.
>
The COP800 uses 10 clocks,
Clock 1 = Precharge
Clock 2-9 process 8 bits of data.
Clock 10 ?

> Parallax basic stamps I executes 2KIPS only, so ultra lite serial
> processor in FPGA with 2 MIPS would be eh, for me its some to dream
> off :)
>
> I have poked around this idea for some years, but never got the "final
> kick" to really go and do-complete the design and development of this
> processor.
>
> So I decided to offer some bounty for others to maybe motivate to work
> for this goal and dream, current list of items available for the
> developers from my own funding is listed here (I hope to add items and
> maybe some $ by the time)
>
> http://code.google.com/p/serial-processor/w/list
>
> there is also very preliminary spec-goal document as well
>
> Antti Lukats
--
Best Regards,
Ulf Samuelsson
This is intended to be my personal opinion, which may
or may not be shared by my employer Atmel Nordic AB
Ulf Samuelsson wrote:
> "Antti" <Antti.Lukats@googlemail.com> wrote in message
> news:b2a2d8ca-444d-4508-9f2c-33adb30c8829@p73g2000hsd.googlegroups.com...
>
>>Hi
>>
>>I have been think and part time working towards a goal to make useable
>>and useful serialized processor. The idea is that it should be
>>
>>1) VERY small when implemented in any modern FPGA (less 25% of
>>smallest device, 1 BRAM)
>>2) be supported by high level compiler (C ?)
>>3) execute code in-place from either serial flash (Winbond quad speed
>>SPI memory delivers 320 mbit/s!) or from file on sd-card
>>
>
> You mean a COP800 FPGA core :-)
With the COP800 well past its commercial life, what is the status of
'abandonware' C compilers for it? How small is a COP800 in an FPGA?

>>serial implementation would be smaller and run at higher speeds, so
>>128 clock per machine cycle would already mean 2 MIPS, what would be
>>acceptable for many applications.
>>
>
> The COP800 uses 10 clocks,
> Clock 1 = Precharge
> Clock 2-9 process 8 bits of data.
> Clock 10 ?
Opcode execute?

With BRAM being cheap, and DATA SRAM being slow and relatively expensive,
the core needs to use many registers well: features like bank swap, or a
register frame pointer as in the Z8/XC166 etc. Also, a Push NN/Pop NN would
avoid costly DATA RAM thrashing.

-jg

Jim Granville wrote:

> >>I have been think and part time working towards a goal to make useable
> >>and useful serialized processor. The idea is that it should be
> >>
> >>1) VERY small when implemented in any modern FPGA (less 25% of
> >>smallest device, 1 BRAM)
> >>2) be supported by high level compiler (C ?)
> >>3) execute code in-place from either serial flash (Winbond quad speed
> >>SPI memory delivers 320 mbit/s!) or from file on sd-card
> >>
> >
> > You mean a COP800 FPGA core :-)
>
> With the COP800 well past it's commercial life, what
> is the status of 'abandonware' C Compilers for it ?
> How small is a COP800 in fpga ?
>
> >>serial implementation would be smaller and run at higher speeds, so
> >>128 clock per machine cycle would already mean 2 MIPS, what would be
> >>acceptable for many applications.
> >>
> >
> > The COP800 uses 10 clocks,
> > Clock 1 = Precharge
> > Clock 2-9 process 8 bits of data.
> > Clock 10 ?
We have a COP8 compiler. The COP8 certainly fits the description
because its original implementation was almost identical.

8051, Z8 and 6804 would also be on my historical short list.

Walter..
Antti,

my impression is that a subset of the 32 bit Transputers with a serial
implementation would be the best tradeoff in terms of complexity and
performance. The 3 element hardware stack and ALU would be very
compact on the FPGAs and the instruction set would give you reasonable
access to a block RAM (you don't want to write too much to a Flash and
it is nice to avoid reading random data from it to keep the
instruction stream flowing).
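As a rough illustration of why a three-element hardware stack is so compact, the sketch below models a Transputer-like evaluation stack with three registers (named a, b, c here). The class, method names and 32-bit masking are assumptions for illustration and do not reproduce the real Transputer instruction encoding.

```python
MASK32 = 0xFFFFFFFF


class Stack3:
    """Three-register evaluation stack: pushes shift a -> b -> c, pops shift back."""

    def __init__(self):
        self.a = self.b = self.c = 0

    def push(self, value):
        # The previous contents of c fall off the bottom and are lost.
        self.c, self.b, self.a = self.b, self.a, value & MASK32

    def pop(self):
        value = self.a
        self.a, self.b = self.b, self.c
        return value

    def add(self):
        """Replace the top two entries with their 32-bit sum."""
        rhs = self.pop()
        lhs = self.pop()
        self.push((lhs + rhs) & MASK32)


if __name__ == "__main__":
    stack = Stack3()
    for constant in (10, 20, 30):
        stack.push(constant)
    stack.add()
    stack.add()
    print(stack.a)
```

The whole machine state for expression evaluation is three words, so anything deeper has to be spilled to memory by the compiler, which is where the block RAM access mentioned above comes in.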

Depending on how compatible it was (I wouldn't include the
multitasking and message sending stuff) you would have these languages
available for it:

http://www.classiccmp.org/transputer/languages.htm

A serial implementation of a classic RISC (like MIPS or DLX) would be
somewhat more awkward, but it would be doable.

A well known serial processor was the Motorola MC14500B, which is
probably as simple as it can get - see VHDL at
http://www.brouhaha.com/~eric/retrocomputing/motorola/mc14500b/

Sadly, it is far too simple and wouldn't do a good job for the
applications you are probably interested in. Since you mentioned Basic
Stamps, a very unorthodox approach would be to implement in hardware
the virtual machine for TinyBasic, put the interpreter in a block RAM
and have your Flash hold the source (or tokenized source) directly.

Sorry about only giving advice instead of actual help, but you know
how hard it is to find time for so many projects.

-- Jecel
Walter Banks wrote:
> Jim Granville wrote:
>>With the COP800 well past it's commercial life, what
>>is the status of 'abandonware' C Compilers for it ?
>
> We have a COP8 compiler. The COP8 certainly fits the description
> because it original implementation was almost identical.
Do you still sell any?

> 8051, Z8 and 6804 would also be on my historical short list.
The target has to be smaller than PicoBlaze/Mico8, otherwise why bother?

So, 8051 cores are too large for this project. I thought of the modern RS08
(as that does have C compilers ;), but I get the feeling that all the
multi-byte opcodes and address variants would not map well onto an FPGA
(so it would fail on size).

Z8: nice Reg-Reg scheme, but likely to be close to the 8051 in FPGA resource
usage.

8048? Maybe, but no compilers for this?

The best-mapped small FPGA CPUs use 18 bit opcodes, so that makes finding some
old chip with a C compiler unlikely!

There is CoolRisc 816, which uses a 22 bit opcode, and does have a GNU C
compiler and some infrastructure? Again, could be too large...

A 24 bit opcode is also possible. It WOULD give faster operation, and more
opcodes/KB, than 32 bit opcodes. Not sure how many C compilers there are for
24 bit opcodes?

-jg
On Mar 19, 8:36 pm, Jim Granville <no.s...@designtools.maps.co.nz>
wrote:
> Walter Banks wrote:
> > Jim Granville wrote:
> >>With the COP800 well past it's commercial life, what
> >>is the status of 'abandonware' C Compilers for it ?
>
> > We have a COP8 compiler. The COP8 certainly fits the description
> > because it original implementation was almost identical.
>
> Do you still sell any ?
>
> > 8051, Z8 and 6804 would also be on my historical short list.
>
> The target has to be smaller than PicoBlaze/Mico8
> - otherwise, why bother ?
>
> So, 8051 are too large for this project, and I thought of the Modern RS08
> (as that does have C compilers ;), but I get the feeling
> all the multi-byte opcodes, and address variants would
> not map well onto a FPGA. (so it would fail size) )
>
> Z8 - Nice Reg-Reg scheme, but likely to be close to 8051
> in FPGA resource usage.
>
> 8048 ? maybe, but no compilers for this ?
>
> The best-mapped FPGA small CPUs use 18 bit opcodes,
> so that makes finding some old chip, with a C-Compiler,
> unlikely!
>
> There is CoolRisc 816, which uses a 22 bit opcode,
> and does have a GNU C compiler, and some infrastructure ?
> Again, could be too large...
>
> A 24 bit opcode is also possible. It WOULD give faster operation,
> and more opcodes/KB, than 32 bit opcodes.
> Not sure how many C compilers, for 24 Bit opcodes ?
Maybe I am missing something, but I have seen CPUs in FPGAs as small as 600
LUTs. I am pretty sure the PicoBlaze is about that size. Isn't there a C
compiler for that?

The OP asked for something that would use no more than 25% of the smallest
FPGA in a given family. That is still nearly 1000 LUTs. So why go with
anything else? A bit serial CPU might be smaller than an 8 bit CPU, but what
is the driving need for something that small? 600 LUTs is not much in a 3000
LUT FPGA!
rickman wrote:

> Maybe I am missing something, but I have seen CPUs in FPGAs as small
> as 600 LUTs. I am pretty sure the picoBlaze is about that size.
I think it is smaller, about 200 LUTs:
http://www.embeddedrelated.com/groups/fpga-cpu/show/2028.php

> A bit serial CPU might be smaller than an 8 bit CPU, but what is the
> driving need for something that small? 600 LUTs is not much in a 3000
> LUT FPGA!
Could be interesting to pack it into a MAX II, where the smallest device has
240 LEs. Sometimes you need some high speed logic plus some more complex
tasks that can run at low speed (keyboard sampling, output to an LCD text
display). If you can get an additional low speed CPU for free, you could save
an external microcontroller.

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de