FPGARelated.com
Forums

Combination loops and false paths

Started by Rob Doyle January 15, 2013
I am creating an FPGA implementation of an old DEC PDP-10 (KS-10,
specifically) Mainframe Computer.  Why?  Because I always wanted one...

The KS-10 was microcoded and used 10x am2901 4-bit slices in the ALU.
At this stage, most of the instruction set diagnostics simulate correctly.

When I synthesize this design using Xilinx ISE I get warnings about
combinatorial loops involving the ALU - and an associated "Minimum
period: 656.595ns (Maximum Frequency: 1.523MHz)" message...

My understanding is that if combinatorial loops really existed then the
simulation wouldn't stabilize. I can't really add pipelining or
registers to the design without affecting the microcode - and I don't
want to do that.

Most of the information that I've read about "false paths" assumes two
clocked processes, not a combinatorial loop.

Anyway.  I'm not sure how to resolve this.   I can mark the path as a
false path but I think that it will ignore /all/ the timing (even the
desired timing) through that path.

What should I do?

Rob.
Rob Doyle wrote:
> I creating an FPGA implementation of a old DEC PDP-10 (KS-10,
> specifically) Mainframe Computer. Why? Because I always wanted one...
>
> The KS-10 was microcoded and used 10x am2901 4-bit slices in the ALU.
> At this stage, most of the instruction set diagnostics simulate correctly.
>
> When I synthesize this design using Xilinx ISE I get warnings about
> combinatorial loops involving the ALU - and an associated "Minimum
> period: 656.595ns (Maximum Frequency: 1.523MHz)" message...
>
> My understanding is that if combination loops really existed then the
> simulation wouldn't stabilize. I can't really add pipelining or
> registers to the design without affecting the microcode - and I don't
> want to do that.

Combinatorial loops _with delay_ will simulate correctly. Otherwise you couldn't simulate a ring oscillator.

> Most of the information that I've read about "false paths" assume two
> clocked processes not a combinatorial loop.
>
> Anyway. I'm not sure how to resolve this. I can mark the path as a
> false path but I think that it will ignore /all/ the timing (even the
> desired timing) through that path.

Not necessarily true. False paths have a FROM and a TO specification, and would not affect other paths that don't start at the FROM or don't end at the TO timing group. This allows you for example to say that you don't care how long a control register bit takes to get through some logic, but you want the streaming data to get through in the standard PERIOD time.
> What should I do?
You could always run your machine at 1.5 MHz. After all, how fast was the PDP-10? Other than that, we'd probably need to analyze this path to give any useful advice.
> Rob.
On 1/15/2013 12:54 AM, Rob Doyle wrote:
> I creating an FPGA implementation of a old DEC PDP-10 (KS-10,
> specifically) Mainframe Computer. Why? Because I always wanted one...
>
> The KS-10 was microcoded and used 10x am2901 4-bit slices in the ALU.
> At this stage, most of the instruction set diagnostics simulate correctly.
>
> When I synthesize this design using Xilinx ISE I get warnings about
> combinatorial loops involving the ALU - and an associated "Minimum
> period: 656.595ns (Maximum Frequency: 1.523MHz)" message...
>
> My understanding is that if combination loops really existed then the
> simulation wouldn't stabilize. I can't really add pipelining or
> registers to the design without affecting the microcode - and I don't
> want to do that.
>
> Most of the information that I've read about "false paths" assume two
> clocked processes not a combinatorial loop.
>
> Anyway. I'm not sure how to resolve this. I can mark the path as a
> false path but I think that it will ignore /all/ the timing (even the
> desired timing) through that path.
>
> What should I do?

Do you know why the tool is complaining? Did you write the code describing the ALU? Off the top of my head, I can't think of why an ALU would have a combinatorial loop. It should have two input data busses, a number of control inputs, an output data bus and some output status signals.

I don't recall the details of the 2901 bit slice and my data books are not handy. That's the problem with paper books, you can't just shove them on your hard drive... Does this part have an internal register file? Even so, that means the part would have a clock and not a combinatorial loop.

Maybe this is because of some internal bus that is shared in a way that looks like a loop even though it would never be used that way?

I may have to find my old AMD data book. That could be an archeological dig!

Rick

On Mon, 14 Jan 2013 22:54:53 -0700, Rob Doyle wrote:

> I creating an FPGA implementation of a old DEC PDP-10 (KS-10,
> specifically) Mainframe Computer. Why? Because I always wanted one...
>
> The KS-10 was microcoded and used 10x am2901 4-bit slices in the ALU.
> At this stage, most of the instruction set diagnostics simulate
> correctly.
>
> When I synthesize this design using Xilinx ISE I get warnings about
> combinatorial loops involving the ALU - and an associated "Minimum
> period: 656.595ns (Maximum Frequency: 1.523MHz)" message...

> What should I do?

Look at the critical path reported by synthesis. Sounds like a VHDL coding error; that delay would equate to a chain of 300-400 LUTs between FFs, which strongly suggests a mistake somewhere.

- Brian

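The 300-400 LUT estimate can be sanity-checked with a few lines of arithmetic. This is only a rough sketch: the per-level delays below (LUT plus routing) are assumed round numbers chosen for illustration, not figures taken from any datasheet or from the ISE report.

```python
# Rough check of the "chain of 300-400 LUTs" estimate.
# REPORTED_PERIOD_NS comes from the ISE message quoted in the thread.
# The per-level delays are ASSUMED values (LUT + routing), for illustration only.

REPORTED_PERIOD_NS = 656.595


def levels_of_logic(period_ns, delay_per_level_ns):
    """Approximate number of logic levels that would add up to period_ns."""
    return period_ns / delay_per_level_ns


def max_frequency_mhz(period_ns):
    """Convert a minimum period in ns to a maximum frequency in MHz."""
    return 1000.0 / period_ns


if __name__ == "__main__":
    for assumed_delay_ns in (1.6, 2.0, 2.2):  # hypothetical per-level delays
        levels = levels_of_logic(REPORTED_PERIOD_NS, assumed_delay_ns)
        print(f"{assumed_delay_ns} ns per level -> about {levels:.0f} levels")
    print(f"max frequency: {max_frequency_mhz(REPORTED_PERIOD_NS):.3f} MHz")
```

With any plausible per-level delay, the reported period corresponds to hundreds of logic levels, far more than a deliberately designed datapath would have between registers, which is consistent with the tool tracing around a loop.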
On 1/15/2013 12:54 AM, Rob Doyle wrote:
> I creating an FPGA implementation of a old DEC PDP-10 (KS-10,
> specifically) Mainframe Computer. Why? Because I always wanted one...
>
> The KS-10 was microcoded and used 10x am2901 4-bit slices in the ALU.
> At this stage, most of the instruction set diagnostics simulate correctly.
>
> When I synthesize this design using Xilinx ISE I get warnings about
> combinatorial loops involving the ALU - and an associated "Minimum
> period: 656.595ns (Maximum Frequency: 1.523MHz)" message...
>
> My understanding is that if combination loops really existed then the
> simulation wouldn't stabilize. I can't really add pipelining or
> registers to the design without affecting the microcode - and I don't
> want to do that.
>
> Most of the information that I've read about "false paths" assume two
> clocked processes not a combinatorial loop.
>
> Anyway. I'm not sure how to resolve this. I can mark the path as a
> false path but I think that it will ignore /all/ the timing (even the
> desired timing) through that path.
>
> What should I do?
>
> Rob.

I took a look at the block diagram and I don't see any combinatorial loops. However, they use latches for the RAM outputs. These are combinatorial if implemented that way. Typically latches are used because they can provide speed advantages since the data will flow through before being held while D type registers don't change outputs until the clock edge.

Are the RAM and output latches in the path being reported as too long? If so, I would recommend changing the latches to rising edge registers. This should cut these loops. The RAM is level sensitive which may be a problem in an FPGA. I think all of the sequential elements are edge sensitive these days. I suppose you could make it out of latches; it's not that many elements.

The clock runs down the center of the block diagram cutting the data paths and preventing any internal loops I can see. Of course, there could be combinatorial loops created by the way it is used. The only one I see is created if you loop the ALU output Y back to the ALU input D. This could happen if you try to put these busses on a tri-state bus. Since tri-states aren't used in an FPGA, you are better off using multiple separate busses to drive all the various inputs hanging on the tri-state bus.

I think it would be very interesting to implement this in a low power FPGA and see just how efficient it can become. What target were you thinking of? I am currently working with the iCE40 family from Lattice and it has very impressive power consumption. You likely could run a design at some 10's of MHz while drawing the power level of an LED... a small LED.

Any interest in making this a group project? Or are you keeping all the fun to yourself?

BTW, where did you get documentation on the PDP-10 sufficient to design an equivalent? Just from the instruction manual? Or do you have more details?

Rick

rickman <gnuarm@gmail.com> wrote:
> On 1/15/2013 12:54 AM, Rob Doyle wrote:
>> I creating an FPGA implementation of a old DEC PDP-10 (KS-10,
>> specifically) Mainframe Computer. Why? Because I always wanted one...

>> The KS-10 was microcoded and used 10x am2901 4-bit slices in the ALU.
>> At this stage, most of the instruction set diagnostics simulate correctly.

>> When I synthesize this design using Xilinx ISE I get warnings about
>> combinatorial loops involving the ALU - and an associated "Minimum
>> period: 656.595ns (Maximum Frequency: 1.523MHz)" message...

>> My understanding is that if combination loops really existed then the
>> simulation wouldn't stabilize. I can't really add pipelining or
>> registers to the design without affecting the microcode - and I don't
>> want to do that.

Loops with an odd number of inversions won't stabilize, but with an even number they should be fine. (snip)
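
The odd/even point can be illustrated with a toy unit-delay model. This is a deliberately simplified sketch: the whole loop is collapsed into a single node that is fed back through a given number of inverters once per time step. It is not how an HDL simulator schedules events, and the function names are made up for this example.

```python
def simulate_loop(inversions, steps=12, initial=0):
    """Unit-delay model of one feedback loop.

    Each step, the loop value is passed through `inversions` inverters
    and fed back to itself. Returns the list of values over time.
    """
    value = initial
    trace = [value]
    for _ in range(steps):
        next_value = value
        for _ in range(inversions):
            next_value ^= 1  # one inverter flips the bit
        value = next_value
        trace.append(value)
    return trace


def settles(trace):
    """True if the last two samples agree, i.e. the loop stopped toggling."""
    return trace[-1] == trace[-2]


if __name__ == "__main__":
    for n in (2, 3):
        trace = simulate_loop(n)
        print(f"{n} inversions: settles={settles(trace)} trace={trace}")
```

In this model a loop with an even number of inversions simply holds its value (it behaves like a latch), while an odd number toggles on every step (it behaves like a ring oscillator).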

> I took a look at the block diagram and I don't see any combinatorial
> loops. However, they use latches for the RAM outputs. These are
> combinatorial if implemented that way. Typically latches are used
> because they can provide speed advantages since the data will flow
> through before being held while D type registers don't change outputs
> until the clock edge.

The tools should be good enough to figure out latches. As I wrote above, though, be sure that there is no (odd number) of inverters in the loop.

> Are the RAM and output latches in the path being reported as too long?
> If so, I would recommend changing the latches to rising edge registers.
> This should cut these loops. The RAM is level sensitive which may be
> a problem in an FPGA. I think all of the sequential elements are edge
> sensitive these days. I supposed you could make it out of latches; it's
> not that many elements.

The BRAM on most FPGAs is synchronous (clocked). That might not match what you need for some older designs. If it isn't too big, and you really need asynchronous RAM, you have to make it out of CLB logic.

> The clock runs down the center of the block diagram cutting the data
> paths and preventing any internal loops I can see. Of course, there
> could be combinatorial loops created by the way it is used. The only
> one I see is created if you loop the ALU output Y back to the ALU input
> D. This could happen if you try to put these busses on a tri-state bus.
> Since tri-states aren't used in an FPGA, you are better off using
> multiple separate busses to drive all the various inputs hanging on the
> tri-state bus.

As I understand it, the Xilinx tools, at least, know how to convert tristate logic to MUX logic. I suppose in some cases that might generate unexpected, or even false, loops.

I believe that the KA-10 was done in asynchronous (non-clocked) logic. That might make an interesting FPGA project.

-- glen

You might be able to get around the async/sync ram issues by using the
other edge of your clock (and if necessary, a dual-port bram could run
off different edges for read & write).

There are also ways to build a DDR register out of two registers and 3
XOR (or XNOR) gates, without gating the clock. Google "flancter
circuit". It is STA-friendly too. That might be another trick you
could use.

If the loops were not stable, it would show up even in RTL sim
(assuming the conditions needed to make it unstable were met). Since
it works with the loops in there (in simulation), I assume it is at
least not always unstable.

Andy

On 1/17/2013 3:49 PM, glen herrmannsfeldt wrote:
> rickman<gnuarm@gmail.com> wrote:
>> On 1/15/2013 12:54 AM, Rob Doyle wrote:
>
>>> I creating an FPGA implementation of a old DEC PDP-10 (KS-10,
>>> specifically) Mainframe Computer. Why? Because I always wanted one...
>
>>> The KS-10 was microcoded and used 10x am2901 4-bit slices in the ALU.
>>> At this stage, most of the instruction set diagnostics simulate correctly.
>
>>> When I synthesize this design using Xilinx ISE I get warnings about
>>> combinatorial loops involving the ALU - and an associated "Minimum
>>> period: 656.595ns (Maximum Frequency: 1.523MHz)" message...
>
>>> My understanding is that if combination loops really existed then the
>>> simulation wouldn't stabilize. I can't really add pipelining or
>>> registers to the design without affecting the microcode - and I don't
>>> want to do that.
>
> Loops with an odd number of inversions won't stabilize, but with an
> even number they should be fine.
>
> (snip)
>
>> I took a look at the block diagram and I don't see any combinatorial
>> loops. However, they use latches for the RAM outputs. These are
>> combinatorial if implemented that way. Typically latches are used
>> because they can provide speed advantages since the data will flow
>> through before being held while D type registers don't change outputs
>> until the clock edge.
>
> The tools should be good enough to figure out latches.

Figure out in what context? The tool can't know when the latch is enabled or disabled. When enabled it is transparent and so is logic. I'm not sure what your point is. For timing a latch is combinational logic and has to be figured into the timing paths. In fact, that is usually why latches are used, because they improve timing.

> As I wrote above, though, be sure that there is no (odd number)
> of inverters in the loop.
>
>> Are the RAM and output latches in the path being reported as too long?
>> If so, I would recommend changing the latches to rising edge registers.
>> This should cut these loops. The RAM is level sensitive which may be
>> a problem in an FPGA. I think all of the sequential elements are edge
>> sensitive these days. I supposed you could make it out of latches; it's
>> not that many elements.
>
> The BRAM on most FPGAs are synchronous (clocked). That might not
> match what you need for some older designs. If it isn't too big,
> and you really need asynchronous RAM, you have to make it out
> of CLB logic.

Yes, not only are the block RAMs synchronous, the LUT RAMs (distributed) are also synchronous. That is why I say you have to make async RAM out of latches.

>> The clock runs down the center of the block diagram cutting the data
>> paths and preventing any internal loops I can see. Of course, there
>> could be combinatorial loops created by the way it is used. The only
>> one I see is created if you loop the ALU output Y back to the ALU input
>> D. This could happen if you try to put these busses on a tri-state bus.
>> Since tri-states aren't used in an FPGA, you are better off using
>> multiple separate busses to drive all the various inputs hanging on the
>> tri-state bus.
>
> As I understand it, the Xilinx tools, at least, know how to convert
> tristate logic to MUX logic. I suppose in some cases that might
> generate unexpected, or even false, loops.

That should not create loops unless the loop is already there in the connected logic. Translating tristate busses creates multiple sets of multiplexors which are all distinct, preventing loops... unless the rest of the logic connects input to output.

> I believe that the KA-10 was done in asynchronous (non-clocked) logic.
> That might make an interesting FPGA project.

Doing async logic in an FPGA is not so easy. You need timing info that is hard to get.

Rick

On 1/15/2013 7:21 PM, rickman wrote:
 > On 1/15/2013 12:54 AM, Rob Doyle wrote:
 >>
 >> I creating an FPGA implementation of a old DEC PDP-10 (KS-10,
 >> specifically) Mainframe Computer. Why? Because I always wanted
 >> one...
 >>
 >> The KS-10 was microcoded and used 10x am2901 4-bit slices in the
 >> ALU. At this stage, most of the instruction set diagnostics
 >> simulate correctly.
 >>
 >> When I synthesize this design using Xilinx ISE I get warnings
 >> about combinatorial loops involving the ALU - and an associated
 >> "Minimum period: 656.595ns (Maximum Frequency: 1.523MHz)"
 >> message...
 >>
 >> My understanding is that if combination loops really existed then
 >> the simulation wouldn't stabilize. I can't really add pipelining
 >> or registers to the design without affecting the microcode - and I
 >>  don't want to do that.
 >>
 >> Most of the information that I've read about "false paths" assume
 >> two clocked processes not a combinatorial loop.
 >>
 >> Anyway. I'm not sure how to resolve this. I can mark the path as a
 >> false path but I think that it will ignore /all/ the timing (even
 >> the desired timing) through that path.
 >>
 >> What should I do?
 >
 > Do you know why the tool is complaining?  Did you write the code
 > describing the ALU?  Off the top of my head, I can't think of why an
 >  ALU would have a combinatorial loop.  It should have two input data
 >  busses, a number of control inputs, an output data bus and some
 > output status signals.

Oops.  Sorry, I guess I replied to rickman instead of following up with
the group.   I'm resending...

I guess I'm using the terms ALU and am2901 interchangeably.  I'll be more
specific.

There is nothing wrong with the am2901 proper.  It is what it is.

 > I don't recall the details of the 2901 bit slice and my data books
 > are not handy.  That's the problem with paper books, you can't just
 > shove them on your hard drive...  Does this part have an internal
 > register file?  Even so, that means the part would have a clock and
 > not a combinatorial loop.

 > Maybe this is because of some internal bus that is shared in a way
 > that looks like a loop even though it would never be used that way?

Exactly.

The problem is that the am2901 output goes to a bus that eventually routes
back to the am2901 input for some unused (as best I can tell)
configuration of the microcode.  This all happens with no registers in
the loop.
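
A loop of this kind can be shown with a small graph search. The sketch below is a toy model, not what ISE does internally: the node names are simplified from the KS10 block-diagram description given later in this thread (ALU output, DBM mux, DBUS mux), `microcode_reg` and `regfile` are stand-in names for registered elements, and any edge that touches a registered node is treated as cut.

```python
def find_combinational_loop(edges, registered):
    """Return one cycle through purely combinational nodes, or None.

    edges      -- iterable of (source, destination) node-name pairs
    registered -- set of node names that are clocked (they cut any path)
    """
    graph = {}
    for src, dst in edges:
        if src in registered or dst in registered:
            continue  # a register on either end breaks the combinational path
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    unvisited, in_progress, done = 0, 1, 2
    state = {node: unvisited for node in graph}
    stack = []

    def visit(node):
        state[node] = in_progress
        stack.append(node)
        for nxt in graph[node]:
            if state[nxt] == in_progress:
                # Reached a node already on the current path: that is a loop.
                return stack[stack.index(nxt):] + [nxt]
            if state[nxt] == unvisited:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        state[node] = done
        return None

    for node in graph:
        if state[node] == unvisited:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical, heavily simplified connectivity for illustration only.
EDGES = [
    ("microcode_reg", "alu"),
    ("alu", "dbm_mux"),
    ("dbm_mux", "dbus_mux"),
    ("dbus_mux", "alu"),
    ("alu", "regfile"),
]

if __name__ == "__main__":
    print(find_combinational_loop(EDGES, {"microcode_reg", "regfile"}))
    # Treating the DBUS mux as registered removes the loop in this model.
    print(find_combinational_loop(EDGES, {"microcode_reg", "regfile", "dbus_mux"}))
```

Note that the search is purely structural, just like the synthesis warning: it cannot tell whether any microcode word ever selects the mux inputs that close the loop.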

The am2901 does have an internal dual-ported register file.  Register
file writes from the ALU output are clocked.  Register file reads to the
ALU input are latched only. The am2901 control inputs and register file
addresses all originate from the microcode which is registered.

The am2901 has a single input bus which is combinatorial through the ALU
to the output bus.  Therefore all am2901 ops require at least one register
(or the constant zero) as an ALU source.

I think I know what to do.  It looks like ISE supports a
FROM-THRU-THRU-THRU-THRU-TO timing constraint - with an indefinite
number of THRUs.  I think I just want to very specifically exclude the
paths that the tool is whining about and leave everything else.
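
The idea of excluding only the paths that pass through specific points can be sketched as a path filter. This is a toy model under stated assumptions: it is not ISE constraint syntax, the graph is a made-up acyclic simplification with two routes between a source and a destination register, and all node names are hypothetical.

```python
def enumerate_paths(graph, start, end, path=None):
    """List every simple path from start to end in a directed graph."""
    path = (path or []) + [start]
    if start == end:
        return [path]
    paths = []
    for nxt in graph.get(start, []):
        if nxt not in path:  # never revisit a node
            paths.extend(enumerate_paths(graph, nxt, end, path))
    return paths


def matches_thru(path, thru_points):
    """True if the path visits every THRU point, in the given order."""
    position = 0
    for node in path:
        if position < len(thru_points) and node == thru_points[position]:
            position += 1
    return position == len(thru_points)


def split_paths(graph, start, end, thru_points):
    """Separate paths into (still timed, ignored) for a FROM-THRU-TO exclusion."""
    timed, ignored = [], []
    for p in enumerate_paths(graph, start, end):
        (ignored if matches_thru(p, thru_points) else timed).append(p)
    return timed, ignored


# Hypothetical connectivity: one direct route and one route via the muxes.
GRAPH = {
    "src_reg": ["alu"],
    "alu": ["dst_reg", "dbm_mux"],
    "dbm_mux": ["dbus_mux"],
    "dbus_mux": ["dst_reg"],
}

if __name__ == "__main__":
    timed, ignored = split_paths(GRAPH, "src_reg", "dst_reg", ["dbm_mux", "dbus_mux"])
    print("still timed:", timed)
    print("ignored:    ", ignored)
```

In this model the direct register-to-register path through the ALU is still timed; only the route that goes through both mux points is dropped. That is the behavior being relied on above: the more THRU points a constraint names, the narrower the set of paths it excludes.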

 > I may have to find my old AMD data book.  That could be an
 > archeological dig!

I guess that it is just a design from another day - a whole lot less
synchronous than anything I've done in an FPGA before.

I have enjoyed going back through all of that.   I even found my "Mick and
Brick" book.  I'll probably do a VAX 11/780 next which also used
bit-sliced parts.

Bob.

On 1/17/2013 1:03 PM, rickman wrote:

> BTW, where did you get documentation on the PDP-10 sufficient to
> design an equivalent? Just from the instruction manual? Or do you
> have more details?

Folks have done a remarkable job at archiving design information, software, and hardware for these historic machines. Notably Paul Allen (of Microsoft Fame) sponsors a museum that maintains a few of these machines in working order. See http://www.livingcomputermuseum.org/

The folks at bitsavers.org are frantically scanning documents/books and imaging magnetic media before it becomes unreadable.

AMD Info is at: http://bitsavers.org/pdf/amd/

All the KS10/PDP10 information is at: http://bitsavers.org/pdf/dec/pdp10/KS10/

Microcode source/listings and processor diagnostics are available from: http://pdp-10.trailing-edge.com/klad_sources/index.html

A webpage that describes my/our project is at: http://www.techtravels.org/KS10FPGA/

I put a block diagram of the KS10 CPU on the techtravels website. Referring to that block diagram, the false paths are from the ALU output through the DBM mux, through the DBUS Mux, and back into the ALU. There is another false path through the SCAD. (The SCAD is a 12-bit mini-ALU built from 3x 74S181s that is used for managing floating-point exponents and loop constructs in the microcode).

> Any interest in making this a group project? Or are you keeping all
> the fun to yourself?

This is definitely a group project. Right now, I'm doing all the FPGA work by myself - If you are interested in participating, contact me off-list.

You used the term 'archeology'. It sure feels like that...

Rob.