Superscalar Out-of-Order Processor on an FPGA

Started by Luke May 9, 2006
"alpha" <zhg.liu@gmail.com> wrote in message
news:1148669021.747497.47730@g10g2000cwb.googlegroups.com...
> Uncle Noah wrote:
>
>> Xilinx block RAM is synchronous read. Is this the source of your
>> problem?
>
> [YES, Xilinx's sync read give me trouble.]

You may already know about it, but there's a useful trick you might be able to use to get "asynchronous-looking" BRAM in Xilinx parts. It should work with anyone else's parts as well.

If you disable the output registers of the BRAM, the read latency is a single cycle. More precisely: you set up the read address, apply a clock edge, and your data comes out a little while later. The key thing is, you only need a single edge.

Now, if you clock the BRAM on the opposite edge from the surrounding logic (i.e. usually the falling edge instead of the rising edge), you can get the read done within a single cycle of the surrounding logic (instead of having to wait until the next rising edge for the data).

In the timing diagram below, point 'B' is the edge used for the BRAM, where the address is sampled. As you can see, the data addressed by the address value presented at 'A' is ready by the next edge 'C'. The region marked 'x' is the clock-to-out delay from the BRAM (2ns or more depending on the device family).

            A      B      C
-----+      +------+      +------+      +----
     |      |      |      |      |      |
     +------+      +------+      +------+
             _____________
            /             \
------------+    Addr     +------------------
            \_____________/
                       __________
                    /x/          \
--------------------+xx|  Data   +----------
                    \x\__________/

The only problem with this approach is that it limits the speed of your circuit: the address has only half a clock period to settle between 'A' and 'B', and the data has only the other half (less the clock-to-out delay 'x') to reach its destination by 'C'. Since you were talking about 33MHz, this won't hurt you at all, whatever family you're targeting.

Cheers,

        -Ben-
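
For anyone wanting to check the half-cycle cost against their own clock rate, here is a rough budget calculation in Python. It is only a sketch under stated assumptions: a 50% duty cycle, the 2ns BRAM clock-to-out figure mentioned in the post, and made-up placeholder setup times (bram_addr_setup_ns and dest_setup_ns are hypothetical, not datasheet values). The function name half_cycle_budgets is likewise just for illustration. Real numbers should come from the device datasheet and the timing report.

```python
def half_cycle_budgets(f_clk_mhz,
                       bram_clk_to_out_ns=2.0,   # figure quoted in the post
                       bram_addr_setup_ns=0.5,   # hypothetical placeholder
                       dest_setup_ns=0.5):       # hypothetical placeholder
    """Estimate the time left for logic and routing on each half cycle
    when the BRAM is clocked on the opposite edge (50% duty cycle assumed).

    Returns (address_budget_ns, data_budget_ns).
    """
    period_ns = 1000.0 / f_clk_mhz
    half_ns = period_ns / 2.0

    # A -> B: the address launched at rising edge A (including the launching
    # register's own clock-to-out) must be stable at the BRAM before edge B.
    address_budget_ns = half_ns - bram_addr_setup_ns

    # B -> C: the data appears clock-to-out after edge B and must meet
    # setup at the destination register before rising edge C.
    data_budget_ns = half_ns - bram_clk_to_out_ns - dest_setup_ns

    return address_budget_ns, data_budget_ns


if __name__ == "__main__":
    addr_ns, data_ns = half_cycle_budgets(33.0)
    print(f"address path budget: {addr_ns:.2f} ns")
    print(f"data path budget:    {data_ns:.2f} ns")
```

With these assumed numbers, a 33MHz clock still leaves more than 10ns on each half cycle, which fits the remark above that the trick costs nothing at that speed. Raising f_clk_mhz shows how quickly the data-path budget shrinks.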