Reply by sjulhes January 19, 2006
Hello,

Thanks for your answer and advice.
What you describe is pretty much what I had planned to do, not to
say exactly what I wanted to do!

Stéphane.

Reply by Peter Alfke January 18, 2006
This kind of design has too many variables and trade-offs that
different users will never agree upon.
I started thinking SRAM, which is the right approach up to a certain
size. Cost is the question.
Then there is speed and acceptable through-delay or latency or block
length.
I still believe in using a dual-ported BlockRAM as the staging area,
because it offers so much timing flexibility.
Peter Alfke

Reply by Gabor January 18, 2006
Stéphane,

I do similar designs to this (framegrabber cards) and haven't found a
ready-made SDRAM FIFO, although I haven't looked in a long time
since rolling my own many designs ago.

I think that getting good performance from SDRAM in this situation
involves tuning the burst accesses to allow continuous use of the
data bus when not changing direction between read and write.

Peter was on the right track with the read / write ping-pong machine,
but for SDRAM you need to read multiple words / write multiple words
to get any sort of performance.

A design I re-use frequently has a constant 21-cycle loop in which
16 of the cycles are used for either read or write (not both) during
one loop.  Un-needed loops are replaced by a 21-cycle NOP and
auto refresh.  Start-up code counts refresh loops until 8 refreshes
occur and then sets the mode register.  For simplicity, the address
pins are reset to the state required for the mode register set (MRS)
and not clock enabled until the MRS has completed.  This reduces
the gate depth of the address mux.
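
As a rough illustration of the start-up sequencing described above, here is a small behavioural model in Python (not HDL, and not the actual design). The class and method names are hypothetical; only the count of 8 refresh loops before the mode register set comes from the description.

```python
class StartupSequencer:
    """Behavioural model: count refresh loops, then issue one MRS."""

    REFRESHES_BEFORE_MRS = 8  # figure taken from the post

    def __init__(self):
        self.refresh_count = 0
        self.mrs_done = False

    def next_loop(self):
        """Return the command issued in the next fixed-length loop."""
        if self.mrs_done:
            return "NORMAL"      # hand over to the regular arbiter
        if self.refresh_count < self.REFRESHES_BEFORE_MRS:
            self.refresh_count += 1
            return "REFRESH"
        self.mrs_done = True     # mode register is set exactly once
        return "MRS"


seq = StartupSequencer()
commands = [seq.next_loop() for _ in range(10)]
```

With this model the first eight loops are refreshes, the ninth is the MRS, and everything after that is normal operation.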

I have a simple "arbiter" that decides what to do with the next 21 clock
cycles (read, write, or refresh).  To reduce power on some designs
I only refresh if a refresh timer has expired, then unused 21-clock
loops are just completely NOP'd.
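
The arbiter decision can be sketched as a single function that is evaluated once per loop. This is only a Python model under assumptions: the priority order (refresh, then write, then read), the parameter names, and the use of a 16-word threshold as the condition for starting a transfer are guesses for illustration, not details given in the post.

```python
def choose_loop(refresh_due, write_words_ready, read_space_free,
                stored_bursts, burst_words=16):
    """Pick the operation for the next fixed-length loop.

    refresh_due       -- True when the refresh timer has expired
    write_words_ready -- words waiting in the input-side FIFO
    read_space_free   -- free words in the output-side FIFO
    stored_bursts     -- bursts currently held in the SDRAM
    The priority order below is an assumption.
    """
    if refresh_due:
        return "REFRESH"
    if write_words_ready >= burst_words:
        return "WRITE"
    if stored_bursts > 0 and read_space_free >= burst_words:
        return "READ"
    return "NOP"  # nothing to do: the whole loop is idle
```

Giving writes priority over reads suits a source that cannot be stalled (such as video), but a different order may fit other systems better.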

Memories are set for burst length of 4 and one burst of 4 words is
sent to each bank (thus overlapping data and control cycles).  Think
of the bank address as the least significant address bits.  For better
throughput you can increase the loop length and each additional
16 cycles would add 16 more memory accesses (e.g. 32 accesses
in a 37-cycle loop).
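
To make the numbers concrete, here is a small Python sketch of the address split and the loop efficiency. It assumes a 4-bank SDRAM (so 4 banks x 4 words = 16 words per loop) and reads "bank address as the least significant bits" as meaning the bits just above the burst offset; both are assumptions. The function names are hypothetical.

```python
BURST_LEN = 4        # words per burst (from the post)
NUM_BANKS = 4        # assumption: 4-bank device
OVERHEAD_CYCLES = 5  # 21 - 16 (and 37 - 32), from the post


def split_address(word_addr):
    """Map a linear word address to (upper, bank, offset).

    offset -- position inside the 4-word burst
    bank   -- bits just above the offset, so consecutive bursts
              go to consecutive banks
    upper  -- remaining bits, to be split into row and column
    """
    offset = word_addr % BURST_LEN
    bank = (word_addr // BURST_LEN) % NUM_BANKS
    upper = word_addr // (BURST_LEN * NUM_BANKS)
    return upper, bank, offset


def loop_efficiency(data_cycles):
    """Fraction of a loop spent moving data."""
    return data_cycles / (data_cycles + OVERHEAD_CYCLES)
```

Under these assumptions a 21-cycle loop uses the data bus about 76% of the time (16/21) and a 37-cycle loop about 86% (32/37), at the cost of a coarser transfer unit.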

The basic element going in or out of this "FIFO" is then a burst of
16 words (think of this as your FIFO width if you will).  Generally I use
deal with asynchronous data rates and differing byte widths on
input and output.
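
A minimal Python sketch of what the input-side buffering has to do: convert widths and hand complete 16-word bursts to the SDRAM side. The 16-bit to 32-bit conversion, the choice to put the first sample in the low half, and the function names are all assumptions for illustration; a real design would use hardware FIFOs here.

```python
def pack16_to_32(samples):
    """Pack pairs of 16-bit samples into 32-bit words.

    Assumption: the first sample of each pair goes in the low half.
    """
    if len(samples) % 2:
        raise ValueError("need an even number of 16-bit samples")
    return [(samples[i] & 0xFFFF) | ((samples[i + 1] & 0xFFFF) << 16)
            for i in range(0, len(samples), 2)]


def take_bursts(words, burst_words=16):
    """Split a word list into full bursts; return (bursts, leftover)."""
    full = len(words) // burst_words * burst_words
    bursts = [words[i:i + burst_words] for i in range(0, full, burst_words)]
    return bursts, words[full:]
```

Words that do not yet fill a burst stay in the buffer until enough data has arrived, which is why the burst size acts like the FIFO "width".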

The whole design is remarkably small, but as I said I haven't seen such
a design made generally available.

Good luck,
Gabor

sjulhes wrote:
> Hello,
>
> The goal is not to create delay but to handle the fact that the 32-bit / 33 MHz
> bus is not always available.
>
> For the SDRAM controller I already have my solution in mind.
>
> I was only wondering whether this function (a huge FIFO for an FPGA using an external
> SDRAM), which I'm sure a lot of people would need, already exists and
> is available.
>
> Stéphane.
>
> "Peter Alfke" <peter@xilinx.com> wrote in message news:
> 1137514850.769502.197970@f14g2000cwb.googlegroups.com...
> > Give more details:
> > depth, width, clock rate for write and for read. Asynchronous clocks?
> > Peter Alfke, Xilinx

Reply by sjulhes January 18, 2006
Hello,

The goal is not to create delay but to handle the fact that the 32-bit / 33 MHz
bus is not always available.

For the SDRAM controller I already have my solution in mind.

I was only wondering whether this function (a huge FIFO for an FPGA using an external
SDRAM), which I'm sure a lot of people would need, already exists and
is available.

Stéphane.

"Peter Alfke" <peter@xilinx.com> wrote in message news:
1137514850.769502.197970@f14g2000cwb.googlegroups.com...
> Give more details:
> depth, width, clock rate for write and for read. Asynchronous clocks?
> Peter Alfke, Xilinx

Reply by Fred January 17, 2006
"Peter Alfke" <peter@xilinx.com> wrote in message
news:1137520894.534736.226480@g44g2000cwa.googlegroups.com...
> Fred, I agree. I was thinking in terms of external SRAM, where timing
> is so much easier and faster.
> If the external RAM has to be SDRAM (does it really, is the required
> depth so large?), it might make sense to convert the transfer to blocks
> of data, assembled in a BlockRAM.
> Then there is the question of allowed latency.
> Peter Alfke

I don't think I would use the block memories, since they're another interface to mess around with. Also, if written in plain Verilog or VHDL it would be a more portable piece of code. It all depends on resources, and on whether the design is already close to its limits or still has some spare block RAMs. The fact that there's a FIFO at all creates more latency than any FPGA design can ever add!

Reply by Peter Alfke January 17, 2006
Fred, I agree. I was thinking in terms of external SRAM, where timing
is so much easier and faster.
If the external RAM has to be SDRAM (does it really, is the required
depth so large?), it might make sense to convert the transfer to blocks
of data, assembled in a BlockRAM.
Then there is the question of allowed latency.
Peter Alfke

Reply by Fred January 17, 2006
"Peter Alfke" <peter@xilinx.com> wrote in message
news:1137519790.491285.321000@g47g2000cwa.googlegroups.com...
> Stephane,
> you did not say how big, but let me believe you that you need an
> external SDRAM.
> The speed you mention is slow by today's standards. I suggest you use a
> single read/write SDRAM interface and you inter-digitate (time-share)
> the read and write operations on that common SDRAM interface. The FPGA
> is then just the controller and arbiter between read and write
> operations. For simplicity, you could run the memory as a synchronous
> 2-stroke operation, always a read followed by a write.
> These two cycles would fit into your 15 ns period. Try to store only one
> write word and one read word in the FPGA. But you can also communicate
> through one 32-bit wide BlockRAM, where you can use one half of the
> address space on one port as transmit buffer, the other half of the
> address space on the other port as receive buffer.
> The possibilities are endless...
> Peter Alfke, Xilinx

So in 15 ns he has to send a RAS, wait 2 clocks, then send a CAS and wait a further 2 clocks to get the data. I know the write may be a bit quicker, but you still need a couple of clocks for precharge after the write. Are you sure each operation can be done in 15 ns? That's a mighty fast SDRAM. I think I'd use a burst of 4 words to reduce the "setting up" overhead of a random read or write. Otherwise, yes, I'd run at a multiple of 33 MHz using an on-board PLL and alternate reads and writes.
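
A rough cycle count makes the point. The Python sketch below uses a deliberately simple, non-overlapped model: RAS-to-CAS delay of 2 clocks, CAS latency of 2 clocks, and 2 clocks of precharge ("a couple of clocks" taken as 2). Real devices allow some overlap, so treat these as pessimistic, illustrative figures; the function names are hypothetical.

```python
T_RCD = 2        # clocks from RAS (activate) to CAS, per the post
CAS_LATENCY = 2  # clocks from CAS to first data, per the post
T_RP = 2         # assumption: "a couple of clocks" of precharge


def cycles_per_access(burst_len):
    """Clocks for one isolated access with no overlap between phases."""
    return T_RCD + CAS_LATENCY + burst_len + T_RP


def required_clock_mhz(burst_len, window_ns=15.0):
    """SDRAM clock needed to fit one such access into window_ns."""
    return cycles_per_access(burst_len) / window_ns * 1000.0


single = cycles_per_access(1)         # clocks for one word
burst4 = cycles_per_access(4)         # clocks for a burst of four words
per_word_burst4 = burst4 / 4          # amortised clocks per word
```

In this model a single-word access costs 7 clocks, while a burst of 4 costs 10 clocks, or 2.5 clocks per word. Fitting one single-word access into 15 ns would need a clock of roughly 467 MHz.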
Reply by Peter Alfke January 17, 2006
Stephane,
you did not say how big, but let me believe you that you need an
external SDRAM.
The speed you mention is slow by today's standards. I suggest you use a
single read/write SDRAM interface and you inter-digitate (time-share)
the read and write operations on that common SDRAM interface. The FPGA
is then just the controller and arbiter between read and write
operations. For simplicity, you could run the memory as a synchronous
2-stroke operation, always a read followed by a write.
These two cycles would fit into your 15 ns period. Try to store only one
write word and one read word in the FPGA. But you can also communicate
through one 32-bit wide BlockRAM, where you can use one half of the
address space on one port as transmit buffer, the other half of the
address space on the other port as receive buffer.
The possibilities are endless...
Peter Alfke, Xilinx

Reply by sjulhes January 17, 2006
Well, I have a non-continuous 16-bit @ 66 MHz incoming data flow (video) which
is sent to a 32-bit @ 33 MHz bus which is not always available.
I have to put a big FIFO between these two buses, so I have to implement a
FIFO using an SDRAM.
The data to transmit can be up to 32 Mbits.
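
For reference, the arithmetic behind these figures can be written out as a short Python sketch. It uses the nominal 66 MHz and 33 MHz clocks as stated and assumes 1 Mbit = 2**20 bits; the variable names are only for illustration.

```python
def rate_mbit_s(width_bits, clock_mhz):
    """Peak transfer rate in Mbit/s for a bus of the given width and clock."""
    return width_bits * clock_mhz


in_rate = rate_mbit_s(16, 66)        # peak rate of the incoming video flow
out_rate = rate_mbit_s(32, 33)       # peak rate of the outgoing bus
sdram_min = in_rate + out_rate       # SDRAM must carry writes and reads
depth_bytes = 32 * 2**20 // 8        # 32 Mbit of buffering, in bytes
```

Both sides peak at 1056 Mbit/s (132 MB/s), so an SDRAM interface that serves writes and reads at full rate at the same time needs at least 2112 Mbit/s of usable bandwidth, and the buffer depth works out to 4 MiB.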

Stéphane.


"Peter Alfke" <peter@xilinx.com> wrote in message news:
1137514850.769502.197970@f14g2000cwb.googlegroups.com...
> Give more details:
> depth, width, clock rate for write and for read. Asynchronous clocks?
> Peter Alfke, Xilinx

Reply by Peter Alfke January 17, 2006
Give more details:
depth, width, clock rate for write and for read. Asynchronous clocks?
Peter Alfke, Xilinx