comp.arch.fpga | 6502 FPGA core | page 2

Reply by emu ●May 28, 2007

On May 27, 11:36 am, Frank Buss <f...@frank-buss.de> wrote:

> I think readability is very good (ok, maybe because I know Forth and Lisp)
> and power-usage should be good, too, because fewer LEs are used. My current
> Forth FPGA implementation needs 319 LEs (about 5% of the small Cyclone
> EP1C6Q240C8). But I expect 10 times slower than e.g. the T65, so the all in
> all cycles per power would be not so good.

> Frank Buss, f...@frank-buss.de http://www.frank-buss.de, http://www.it4-systems.de

Don't worry too much about speed. You will be amazed how easy it is to
optimize uCode as soon as the
processor really works.
And nobody says that you can have only one execution unit in the system.
(I once had something like this with 2.5 execution units, controlling a
processor with a 36-bit data width.)

But, excellent project!

Reply by Jim Granville ●May 28, 2007

Brian Drummond wrote:

> On Mon, 28 May 2007 07:34:24 +1200, Jim Granville
> <no.spam@designtools.maps.co.nz> wrote:
> 
> 
>>Frank Buss wrote:
>><snip>
> 
> 
>>If this runs slower, one of my pet ideas for FPGA cores, is to design 
>>them to run from SerialFLASH memory. Top end ones (winbond) run at
>>150MBd of link speed, so can feed nearly 20MB/s of streaming code.
>>Ideally, the core has a short-skip opcode, as the jump in such memory
>>has a higher cost.
> 
> 
> Or a "four address instruction" like the Pilot Ace, with SerialFlash in
> place of a tube full of mercury?

You've lost me ?
-jg

Reply by Tommy Thorn ●May 28, 2007

On May 28, 12:40 pm, Jim Granville <no.s...@designtools.maps.co.nz>
wrote:
> Brian Drummond wrote:
> > On Mon, 28 May 2007 07:34:24 +1200, Jim Granville
> > <no.s...@designtools.maps.co.nz> wrote:
>
> >>Frank Buss wrote:
> >><snip>
>
> >>If this runs slower, one of my pet ideas for FPGA cores, is to design
> >>them to run from SerialFLASH memory. Top end ones (winbond) run at
> >>150MBd of link speed, so can feed nearly 20MB/s of streaming code.
> >>Ideally, the core has a short-skip opcode, as the jump in such memory
> >>has a higher cost.
>
> > Or a "four address instruction" like the Pilot Ace, with SerialFlash in
> > place of a tube full of mercury?
>
> You've lost me ?
> -jg


Me too, but this looks relevant

  http://research.microsoft.com/~GBell/Computer_Structures__Readings_and_Examples/00000213.htm

Reply by Jim Granville ●May 28, 2007

Tommy Thorn wrote:
> On May 28, 12:40 pm, Jim Granville <no.s...@designtools.maps.co.nz>
> wrote:
> 
>>Brian Drummond wrote:
>>
>>>On Mon, 28 May 2007 07:34:24 +1200, Jim Granville
>>><no.s...@designtools.maps.co.nz> wrote:
>>
>>>>Frank Buss wrote:
>>>><snip>
>>
>>>>If this runs slower, one of my pet ideas for FPGA cores, is to design
>>>>them to run from SerialFLASH memory. Top end ones (winbond) run at
>>>>150MBd of link speed, so can feed nearly 20MB/s of streaming code.
>>>>Ideally, the core has a short-skip opcode, as the jump in such memory
>>>>has a higher cost.
>>
>>>Or a "four address instruction" like the Pilot Ace, with SerialFlash in
>>>place of a tube full of mercury?
>>
>>You've lost me ?
>>-jg
> 
> 
> Me too, but this looks relevant
> 
>   http://research.microsoft.com/~GBell/Computer_Structures__Readings_and_Examples/00000213.htm

Wow, that's quite impressive. A 1MHz clock, back in 1951!

I had not thought of Serial Data, only Serial code access, as those 
speeds are getting tolerable, and the pin/pcb savings are massive.

Most FPGAs have some SRAM, and uC projects commonly need less DATA than 
Code, but it raises a good point: Serial data _could_ also be used, and
the Ramtron FRAM devices would be good candidates - up to 64K bytes of 
Data, in 20MHz SPI. So, you'd set that up on separate pins.

-jg
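
The serial-memory figures in this sub-thread can be sanity-checked with a few lines of arithmetic. This is only a sketch: the split of 150 MBd into a 75 MHz clock with two data bits per clock, and the 40-clock re-addressing overhead (8-bit command, 24-bit address, 8 dummy clocks), are assumptions for illustration, not figures from the posts or from a particular data sheet. The function names are hypothetical.

```python
def stream_rate_mb_per_s(clock_mhz, bits_per_clock=1):
    """Sustained sequential throughput of a serial memory link in MB/s."""
    return clock_mhz * bits_per_clock / 8


def jump_cost_bytes(overhead_clocks, bits_per_clock=1):
    """Bytes that could have been streamed during the clocks spent
    sending a new read command and address (the cost of a jump)."""
    return overhead_clocks * bits_per_clock / 8


def cheaper_to_skip(distance_bytes, overhead_clocks, bits_per_clock=1):
    """True if discarding `distance_bytes` of streamed code is cheaper
    than re-addressing the memory, which is the case for a short-skip opcode."""
    return distance_bytes < jump_cost_bytes(overhead_clocks, bits_per_clock)


# Assumed: 150 MBd realised as 75 MHz x 2 data bits per clock.
flash_rate = stream_rate_mb_per_s(75, bits_per_clock=2)
# 20 MHz single-bit SPI, as quoted for the FRAM parts.
fram_rate = stream_rate_mb_per_s(20)
# Assumed 40-clock overhead to restart a read at a new address.
jump_cost = jump_cost_bytes(40, bits_per_clock=2)

print(flash_rate, fram_rate, jump_cost)
```

Under these assumptions the flash link streams a little under 20 MB/s, the FRAM link 2.5 MB/s, and a forward skip shorter than about ten bytes is cheaper than a real jump, which is the motivation for a short-skip opcode.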

Reply by Brian Drummond ●May 29, 2007

On Tue, 29 May 2007 07:40:21 +1200, Jim Granville
<no.spam@designtools.maps.co.nz> wrote:

>Brian Drummond wrote:
>
>> On Mon, 28 May 2007 07:34:24 +1200, Jim Granville
>> <no.spam@designtools.maps.co.nz> wrote:
>> 
>> 
>>>Frank Buss wrote:
>>><snip>
>> 
>> 
>>>If this runs slower, one of my pet ideas for FPGA cores, is to design 
>>>them to run from SerialFLASH memory. Top end ones (winbond) run at
>>>150MBd of link speed, so can feed nearly 20MB/s of streaming code.
>>>Ideally, the core has a short-skip opcode, as the jump in such memory
>>>has a higher cost.
>> 
>> 
>> Or a "four address instruction" like the Pilot Ace, with SerialFlash in
>> place of a tube full of mercury?
>
>You've lost me ?
>-jg

In some designs of that era, three address instructions were common,
source1, source2 and dest, very like the register addresses in a RISC.
The innovation here was a fourth address; for the next instruction,
coded to appear out of the delay line (or drum memory) just when it was
needed. Important because the next location in program memory would have
flashed past, and you'd have to wait for the memory's cycle time (or a
whole drum revolution) before it came round again. 

Apparently it was a headache to hand-code for maximum performance, or
"offered great scope for programmer ingenuity" :-) but worthwhile for
heavily used code. (I believe it had the first floating point library,
coded this way)

But it could still be useful for streaming instructions from serial
memory.

- Brian
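
The effect of the "next instruction" address can be illustrated with a toy model of a circulating memory. This is a simplified sketch, not the actual Pilot ACE instruction format or timing: the 32-word line, the fixed 3-word-time execution, and all names below are assumptions chosen for illustration. Word `a` is taken to be readable only when the current time is congruent to `a` modulo the line length.

```python
def run_time(program, start, line_length):
    """Total word-times to run `program` from a circulating delay line.

    `program` maps address -> (exec_time, next_address); a next_address
    of None ends the run. Each fetch takes one word-time once the word
    has come round.
    """
    now = start          # word `start` is assumed to be emerging at time `start`
    address = start
    while address is not None:
        exec_time, next_address = program[address]
        wait = (address - now) % line_length   # wait for the word to come round
        now += wait + 1 + exec_time            # wait, fetch, execute
        address = next_address
    return now - start


LINE_LENGTH = 32   # assumed words per delay line
EXEC_TIME = 3      # assumed word-times per instruction

# Instructions stored in consecutive words: each successor has already
# "flashed past" and costs almost a full circulation.
sequential = {0: (EXEC_TIME, 1), 1: (EXEC_TIME, 2),
              2: (EXEC_TIME, 3), 3: (EXEC_TIME, None)}

# Each successor placed where it emerges just as the previous one finishes.
optimum = {0: (EXEC_TIME, 4), 4: (EXEC_TIME, 8),
           8: (EXEC_TIME, 12), 12: (EXEC_TIME, None)}

print(run_time(sequential, 0, LINE_LENGTH))  # 103
print(run_time(optimum, 0, LINE_LENGTH))     # 16
```

In this model the optimally placed program never waits, while the sequential one loses 29 word-times per instruction, which is the same trade-off a core streaming code from serial memory faces on every jump.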

Reply by Brian Drummond ●May 29, 2007

On Tue, 29 May 2007 10:48:51 +1200, Jim Granville
<no.spam@designtools.maps.co.nz> wrote:

>Tommy Thorn wrote:
>> On May 28, 12:40 pm, Jim Granville <no.s...@designtools.maps.co.nz>
>> wrote:
>> 
>>>Brian Drummond wrote:

>>>>Or a "four address instruction" like the Pilot Ace, with SerialFlash in
>>>>place of a tube full of mercury?
>>>
>>>You've lost me ?
>>>-jg
>> 
>> 
>> Me too, but this looks relevant
>> 
>>   http://research.microsoft.com/~GBell/Computer_Structures__Readings_and_Examples/00000213.htm
>
>Wow, that's quite impressive. A 1MHz clock, back in 1951!

 "it is not thought wise to design for higher speeds than this as yet"
http://www.alanturing.net/turing_archive/archive/p/p01/P01-001.html
(from 1945)

May 1950 according to
http://www.npl.co.uk/publications/metromnia/issue8/
which has some details. Apparently both code and data, but the "fourth
address" was specifically to optimise code location.

Surprisingly small, according to
http://www.scienceandsociety.co.uk/results.asp?image=10303412

- Brian (wondering how many tubes you can fit in a CLB)

Reply by PeteS ●May 29, 2007

Frank Buss wrote:
> I've implemented a first version of a 6502 core. It has a very simple
> architecture: First the command is read and then for every command a list
> of microcodes are executed, controlled by a state machine. To avoid the
> redundant VHDL typing, the VHDL code is generated with a Lisp program:
> 
> http://www.frank-buss.de/vhdl/cpu.lisp
> 
> This is the output:
> 
> http://www.frank-buss.de/vhdl/t_rex_test.vhdl
> 
> I've tested some instructions, like LDA, and looks like it works, but I'm
> sure there are many bugs and not all features are implemented (e.g. BCD
> mode or interrupt handling). It uses 2,960 LEs with Quartus 7.1, which is
> too much compared to the 797 LEs of the T65 project. Any ideas how to
> improve it? My idea was, that the synthesizer would be able to merge the
> addressing mode implementations for the commands, but maybe this has to be
> refactored by hand.
> 
> My goal is to beat the T65 project in LE usage. Speed and 100%
> compatibility with the original 6502 (e.g. the strange S0 and V-flag
> feature or the original hardware reset vectors) is not important for me,
> but code compiled with http://www.cc65.org/ must work.
> 
> Most FPGAs have some kbyte memory (>5 kByte, even for inexpensive FPGAs,
> freely configurable as ROM and RAM), so maybe a good idea would be to store
> some microcode in memory? What instruction set is useful to implement the
> 6502 instruction set? Maybe a Forth-like microcode? 
> 
> Any ideas how to improve the Lisp code? I like my idea of using a lambda
> function in addressing-commands, because this looks more clean than a
> macro, which I've tried first, but I don't like the explicit call of
> emit-lines. How can I refactor it to a more DSL like approach?
> 

Somewhere around here I have a (very old) reference manual for the 6502 
- one of my all time favourite processors - that actually listed the 
instruction decode by bit positions. I'll have to dig it out and amuse 
myself by writing some code to actually do the decode using straight 
combinational logic ;)


Cheers

PeteS
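
A decode by bit positions of the kind PeteS describes can be sketched for one part of the opcode map. The documented 6502 opcodes follow an `aaabbbcc` pattern; the sketch below covers only the `cc = 01` group (the accumulator ALU and load/store instructions), where `aaa` selects the operation and `bbb` the addressing mode. The function and table names are hypothetical, and the other groups, which have more exceptions, are left out.

```python
# Operation selected by bits 7..5 (aaa) when bits 1..0 (cc) are 01.
GROUP_ONE_OPS = ["ORA", "AND", "EOR", "ADC", "STA", "LDA", "CMP", "SBC"]

# Addressing mode selected by bits 4..2 (bbb) when cc is 01.
GROUP_ONE_MODES = ["(zp,X)", "zp", "#imm", "abs",
                   "(zp),Y", "zp,X", "abs,Y", "abs,X"]


def decode_group_one(opcode):
    """Return (mnemonic, addressing mode) for a cc=01 opcode, else None."""
    aaa = (opcode >> 5) & 0b111
    bbb = (opcode >> 2) & 0b111
    cc = opcode & 0b11
    if cc != 0b01:
        return None            # not in this group; needs a different table
    op = GROUP_ONE_OPS[aaa]
    mode = GROUP_ONE_MODES[bbb]
    if op == "STA" and mode == "#imm":
        return None            # 0x89: storing to an immediate is not a documented opcode
    return op, mode


print(decode_group_one(0xA9))  # ('LDA', '#imm')
print(decode_group_one(0x8D))  # ('STA', 'abs')
```

In hardware the two table lookups become small multiplexers on fixed opcode bits, which is why this group decodes cheaply in combinational logic.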

Reply by Frank Buss ●May 29, 2007

Brian Drummond wrote:

>  "it is not thought wise to design for higher speeds than this as yet"
> http://www.alanturing.net/turing_archive/archive/p/p01/P01-001.html
> (from 1945)

That's on page 3. Another funny sentence, describing the architecture:

"Erasible memory units of fairly large capacity, to be known as dynamic
storage (DS). Probable consisting of between 50 and 500 mercury tanks with
a capacity of about 1000 digits each."

"digit" here means binary digit, so this will be about 0.04 per mill of my
current PC main memory.

The interesting thing is the instruction set, but it is very difficult to
extract it from the document, because it is a mix of proposals and detailed
descriptions of which registers to use for which arithmetic operations. Is
there any documentation of the actually running system? If possible, as a
modern, pure functional, description, without describing the problems and
architecture of mercury delay lines :-)

-- 
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
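
The "0.04 per mill" comparison above can be checked with a short calculation. The post does not say which end of the 50 to 500 tank range was used, so the sketch computes both; the function names are hypothetical, and the implied PC memory size is derived from the quoted ratio, not stated in the post.

```python
BITS_PER_TANK = 1000  # "a capacity of about 1000 digits each"


def store_bytes(tanks):
    """Capacity of the proposed dynamic storage in bytes."""
    return tanks * BITS_PER_TANK / 8


def implied_pc_memory_bytes(store, per_mille=0.04):
    """PC memory size for which `store` is `per_mille` thousandths."""
    return store / (per_mille / 1000)


for tanks in (50, 500):
    store = store_bytes(tanks)
    implied = implied_pc_memory_bytes(store)
    print(f"{tanks} tanks: {store:.0f} bytes, "
          f"implies about {implied / 1e9:.2f} GB of PC memory")
```

The upper end of the range (500 tanks, 62,500 bytes) corresponds to roughly 1.5 GB of main memory, so the estimate appears to have been made against the 500-tank figure.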
