FPGARelated.com

Tiny CPUs for Slow Logic

Started by Unknown March 18, 2019
On Friday, October 25, 2019 at 6:09:03 PM UTC-4, jim.br...@ieee.org wrote:
> On Friday, October 25, 2019 at 4:50:49 AM UTC-5, HT-Lab wrote:
> > On 16/10/2019 17:14, jim.brakefield@ieee.org wrote:
> > > On Tuesday, October 8, 2019 at 11:04:15 PM UTC-5, Rick C wrote:
> > > > On Tuesday, October 8, 2019 at 11:24:28 PM UTC-4, oldben wrote:
> > ..
> > >
> > > Missed this thread back in March.
> > > My interest was/is in treating EDIF as the machine language.
> > > If you can simulate all the logic in under, say, 50 usec,
> > > that's faster than human reaction times and suitable for controllers.
> > > So at 200 MHz that's 10K instructions, or ~10K logic gates per cycle.
> > >
> > > So an FPGA uP design using one block RAM and under 200 LUTs is sufficient.
> > > Duplicate the uP if you have more logic than this. 200 LUTs is much less than $1.
> > >
> > > Jim Brakefield
> >
> > Are you trying to emulate a small FPGA on a microcontroller? This sounds
> > like an overly complicated approach, especially as an EDIF netlist is
> > normally full of complex (and not always fully documented) primitives
> > which will be a real pain to simulate.
> >
> > Perhaps I am wrong and this is a brilliant idea? I would be interested
> > to hear some more.
> >
> > Regards,
> > Hans
> > www.ht-lab.com
>
> My experience with EDIF was as the output of VHDL/Verilog compilers for FPGAs. The EDIF output was lots of simple gates plus black boxes for block RAM. The FPGA vendors then write a mapper, placer and router to map that onto their silicon. So for applications with low-duty-cycle gates, it's more efficient to emulate the gates via a small CPU and its single block RAM. For a while Xilinx supported a similar approach: HDL to "C", run on their ARM or PPC hard cores.
>
> Now, EDIF is just a bunch of black boxes, simple gates or as complex as desired, wired together.
>
> There are applications, such as industrial control, that run the control logic at kilohertz rates. That is a very low duty cycle for FPGA LUTs, so one is trading speed for density.
>
> As a side note, ASIC logic simulators have some of the same issues; however, one wants to run an ASIC simulation as fast as possible, essentially in the megahertz range.
>
> In summary, there is a range of applications that do logic "simulation" over a wide range of cycle rates, from millisecond human reaction times all the way up to "as fast as possible". I would argue that there need to be tool chains supporting that six-order-of-magnitude range of logic cycle rates. In particular, not much attention has been paid to the low end of cycle rates, which is currently served by real-time embedded tools.
>
> Jim Brakefield
I get what you are saying. But why would anyone first design something in HDL only to have it compiled and then simulated on a CPU? Why would such low-bandwidth processing not be coded in a sequential language conventionally used on CPUs, like C? Skip the hassle of compiling in an HDL tool and then importing to a simulator running on the target CPU. Where is the advantage exactly?

I will say that if you use the language output from the place and route tools, you will get something more like LUTs, which are likely to simulate faster than individual gates. Remember that unless you have some very tiny amount of logic that can be implemented in a single immense look-up table, every connection between gates is a signal that will need to be scheduled to "run" when its inputs change. Fewer entities means less scheduling... maybe.

--
Rick C.

+ Get 1,000 miles of free Supercharging
+ Tesla referral code - https://ts.la/richard11209
On Friday, October 25, 2019 at 6:32:28 PM UTC-5, Rick C wrote:
> [earlier quoting snipped]
|> But why would anyone first design something in HDL only to have it compiled and then simulated in a CPU?

Was thinking of EDIF as a universal assembly language.
Parallel processing via multiple interconnected processors.
Hard real-time: easy to determine worst-case delay = number of instructions executed per cycle.
Very few processors are under $0.10.

|> every connection between gates is a signal that will need to be scheduled to "run" when the inputs change

Was thinking in terms of synchronous simulation, where each "gate" is evaluated only once per clock cycle.
On Friday, October 25, 2019 at 9:05:12 PM UTC-4, jim.br...@ieee.org wrote:
> [earlier quoting snipped]
>
> Was thinking in terms of synchronous simulation where each "gate" is evaluated only once per clock cycle.
I'm pretty sure that does not exist. Race conditions exist in simulations if you interconnect gates as you are describing. VHDL handles this by introducing delta delays, which are treated like small delays, but no time ticks off the clock, just deltas. Each gate can then have a delta delay associated with it. This in turn requires that each signal (gate output) be evaluated each time any of its inputs changes; because of the delta delays there can be multiple changes at different delta times. Otherwise the input to a FF must be written as an expression with defined rules for order of evaluation.

Or do you have a method of assuring the consistency of evaluation of signals through gates?

--
Rick C.

-- Get 1,000 miles of free Supercharging
-- Tesla referral code - https://ts.la/richard11209