Hello newsreaders,

For a while I have been facing the following task, which I find quite challenging and unfortunately have not managed to solve yet. I want to use 2-4 FPGAs (Xilinx Virtex 2 Pro) together on one printed circuit board (PCB). They are used to process a large amount of incoming serial data (data rates of several GHz). My idea is to have the 2-4 FPGAs handle that data in parallel. The problem is how to split the data adequately and, in particular, how to synchronize the FPGAs with one another.

Is it possible, or even a realistic idea, to synchronize multiple FPGAs in the GHz range? How can this be done without much protocol overhead? I would like to do it without adding an extra transfer protocol between the FPGAs just for that purpose. So far I have not found a proper solution.

Maybe someone can give me a hint? Any ideas how to solve this problem?

Regards, Leroy Tanner
How To Synchronize FPGAs
Started by ●September 22, 2004
Reply by ●September 22, 2004
Maybe I am missing something, but wouldn't you just drive all the chips with one onboard clock then in your code trigger the processes on the rising edge?

Don
Reply by ●September 22, 2004
It gets tricky when you have multiple FPGAs clocked at hundreds of MHz. I don't have any direct experience there, but I think it is worth looking for appnotes on vendor sites that address "Board Level De-skew" (using FPGA clocking resources to account for clock distribution headaches) and, specifically for Xilinx, "Channel bonding" (using multiple RocketIO transceivers to receive data in parallel). The RocketIO transceivers are difficult beasts, at least if you're not using a standard protocol.

I'm not sure if channel bonding can span multiple V2pro devices, but I know it can span multiple transceivers.

I'm not sure about your budget or application requirements, but it may be worthwhile going to a single, larger part that contains the resources you need. That at least partially removes the headache of high-speed PCB design/layout.

--Josh Model
Reply by ●September 22, 2004
...or at least take all the high speed serial stuff into one FPGA and distribute it from that one to the others at a slower parallel rate. Also, it looks like V4 could take care of this with its ChipSync thingy for source synchronous applications.

Cheers, Syms.

"Josh Model" wrote:
> Not sure on your budget, or application requirements, but it may be
> worthwhile going to a single, larger part that contains the resources you
> need. It at least partially removes the headache of high-speed PCB
> design/layout.
>
> --Josh Model
Reply by ●September 22, 2004
Yes, you *are* missing something... ;)

Don Golding wrote:
> Maybe I am missing something, but wouldn't you just drive all the chips with
> one onboard clock then in your code trigger the processes on the rising
> edge?

-- Rick "rickman" Collins
Reply by ●September 22, 2004
Leroy Tanner wrote:
> But now there arises the problem how to adequately split the data and how
> to synchronize the FPGAs among one another, in particular?
> Is it possible or first of all a realistic idea to synchronize multiple
> FPGAs in the GHz range? How can this be done without much protocol
> overhead?

I believe the most important thing is to first latch the signals in the IOB to minimize clock skew problems. Otherwise, use an external shift register to generate bit-parallel signals for input to the FPGA.

-- glen
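As a rough illustration of the shift-register idea above, here is a minimal behavioural sketch in Python (not HDL). The function names, the 8-bit word width, and the 2400 Mbit/s figure are assumptions chosen for the example; none of them come from the thread. It only shows how grouping serial bits into words divides the rate the FPGA inputs have to handle.

```python
def deserialize(bits, width):
    """Group a serial bit stream into parallel words (first bit received = MSB).

    Behavioural model of a serial-in/parallel-out shift register; a trailing
    partial word is discarded.
    """
    mask = (1 << width) - 1
    words = []
    shift_reg = 0
    count = 0
    for bit in bits:
        # Shift left by one and bring the new bit in at the LSB.
        shift_reg = ((shift_reg << 1) | (bit & 1)) & mask
        count += 1
        if count == width:
            words.append(shift_reg)
            count = 0
    return words


def parallel_rate_mhz(serial_rate_mbps, width):
    """Word rate seen by the FPGA for a given serial bit rate and word width."""
    return serial_rate_mbps / width


bits = [1, 0, 1, 1, 0, 0, 1, 0,
        1, 1, 1, 1, 0, 0, 0, 0]
words = deserialize(bits, 8)
word_rate = parallel_rate_mhz(2400, 8)  # hypothetical 2.4 Gbit/s link, 8-bit words
```

With these assumed numbers the parallel bus runs at 300 MHz, which is still fast enough that the IOB latching advice applies.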
Reply by ●September 24, 2004
"Symon" <symon_brewer@hotmail.com>:> ...or at least take all the high speed serial stuff into one FPGA and > distribute it from that one to the others at a slower parallel rate.ok, I agree on that and it might be a good approach to minimize skewing in the first section. but nevertheless I must synchronize the other FPGAs to each other, not at a rate of several GHz but say at ca. 300 MHz. In my opinion a central clock isn't an appropriate solution!?
Reply by ●September 24, 2004
Think about what a central clock entails purely from a routing perspective. Let's assume you're an SI wizard and have no issues there. 300 MHz gives ~3.3 ns per clock cycle. If I remember my rule of thumb, an electrical signal in FR-4 material travels about 6 inches per 1 ns. So the worst-case mismatch between all your data lines and all clock lines for all FPGAs is the skew that eats into your timing budget.

Just as an example (I'm not really a layout person, so it's my posterior speaking), matching all lines to 4 FPGAs to within +/- 3 inches seems relatively tricky, but not completely unreasonable. That is a 6 inch spread, or about 1 ns, so ~1/3 of your entire clock cycle is wasted (more, if you were assuming DDR) before you even get to the FPGA fabric. It makes laying out your design that much more tricky.

Now, in the slightly more real world, you've got to throw in the jitter present on a 300 MHz clock, impedance mismatches causing reflections, and crosstalk on your board with all that data zipping around (because GHz and even 300 MHz lines are really antennae), and you've got a lot to deal with.

Anyhow, synchronizing dataflow at those speeds on a PCB is not nearly as simple as just plopping down a clock. It's a hard design, but you get to choose where to place the burden. If you've got really good PCB people, maybe they can match and terminate the lines really well. If you've got the DCM/DLL (or their Altera, or "insert brand", counterpart) hardware to de-skew the board clock, you could let the FPGA do it (though I don't recall at what frequencies the DCMs top out). If you've got neither, you might want to consider going to a single-chip serial interface, because you're going to get into trouble otherwise.

--Josh
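The back-of-the-envelope budget in the post above can be written out as a short Python sketch. The 6 inches/ns figure and the +/- 3 inch tolerance are the post's own rule-of-thumb numbers; the function names are made up for this example, and treating DDR as simply halving the bit time is a simplifying assumption.

```python
INCHES_PER_NS = 6.0  # rule-of-thumb propagation speed in FR-4 quoted in the post


def clock_period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


def skew_window_ns(length_tolerance_in):
    """Worst-case arrival spread when traces are matched to +/- the tolerance.

    One trace may be longer by the tolerance and another shorter by it,
    so the spread is twice the tolerance.
    """
    return 2.0 * length_tolerance_in / INCHES_PER_NS


def fraction_of_bit_time_lost(freq_mhz, length_tolerance_in, ddr=False):
    """Share of each bit time consumed by trace-length skew alone."""
    bit_time = clock_period_ns(freq_mhz)
    if ddr:
        bit_time /= 2.0  # assumption: two bits per clock cycle
    return skew_window_ns(length_tolerance_in) / bit_time


sdr_loss = fraction_of_bit_time_lost(300, 3.0)
ddr_loss = fraction_of_bit_time_lost(300, 3.0, ddr=True)
print(f"period {clock_period_ns(300):.2f} ns, skew {skew_window_ns(3.0):.2f} ns")
print(f"SDR loss {sdr_loss:.0%}, DDR loss {ddr_loss:.0%}")
```

This accounts only for trace-length mismatch; jitter, reflections, and crosstalk mentioned in the post would shrink the usable window further.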
Reply by ●September 24, 2004
Hi Leroy,

Say you've got 4 FPGAs: A, B, C and D. Each gets fed the 300 MHz clock, so on the fabric of each FPGA there is a CLK_A, CLK_B, etc. When you send data from (say) FPGA B to FPGA D, send a clock along with the data, generated by FPGA B from its internal CLK_B and called (say) CLK_B_TO_D. Use this source-synchronous clock with a DCM in FPGA D to write the data into a BRAM FIFO inside FPGA D. Read the data out of this FIFO into the fabric of FPGA D using CLK_D. Repeat for all the other paths.

Any good?

Cheers, Syms.
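To make the FIFO idea above a little more concrete, here is a small behavioural simulation in Python. It is only a sketch under stated assumptions: both clocks have the same frequency but an unknown phase offset, the reader waits a hypothetical `prefill` number of periods before its first read, and the real hardware details (DCM phase alignment, BRAM, pointer synchronization) are not modelled. The function name and parameters are invented for this example.

```python
from collections import deque


def transfer(words, period_ns, write_phase_ns, read_phase_ns, prefill=1):
    """Move words from the CLK_B_TO_D domain to the CLK_D domain through a FIFO.

    Writes happen at write_phase + k*period, reads at
    read_phase + (k + prefill)*period. Returns (received, max_depth, underflows).
    """
    WRITE, READ = 0, 1
    events = []
    for k, word in enumerate(words):
        events.append((write_phase_ns + k * period_ns, WRITE, word))
        events.append((read_phase_ns + (k + prefill) * period_ns, READ, None))
    # Process in time order; on a tie the write is handled first.
    events.sort(key=lambda e: (e[0], e[1]))

    fifo = deque()
    received = []
    max_depth = 0
    underflows = 0
    for _time, kind, word in events:
        if kind == WRITE:
            fifo.append(word)
            max_depth = max(max_depth, len(fifo))
        elif fifo:
            received.append(fifo.popleft())
        else:
            underflows += 1  # reader found the FIFO empty
    return received, max_depth, underflows


PERIOD_NS = 1000.0 / 300.0          # 300 MHz, as in the thread
data = list(range(16))

# Reader phase unrelated to writer phase, one period of prefill.
ok_rx, ok_depth, ok_under = transfer(data, PERIOD_NS, 0.7, 2.9, prefill=1)

# No prefill and a reader that leads the writer: the first read underflows.
bad_rx, bad_depth, bad_under = transfer(data, PERIOD_NS, 2.9, 0.7, prefill=0)
```

In this model the FIFO depth stays small and constant-bounded because both sides run at the same rate; the FIFO only has to absorb the unknown phase difference between CLK_B_TO_D and CLK_D.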
Reply by ●September 25, 2004
On Wed, 22 Sep 2004 11:14:39 +0200, Leroy Tanner wrote:
> What I want to do is to use 2-4 FPGAs (Xilinx Virtex 2 Pro) together on one
> printed circuit board (PCB). They are used to process a large amount of
> incoming serial data (data rates of several GHz). My idea is to handle
> that data parallel by the 2-4 FPGAs. But now there arises the problem how to
> adequately split the data and how to synchronize the FPGAs among one
> another, in particular?

There are two ways to approach this problem:

(1) have each FPGA perform a part of the process on the entire data stream, or
(2) have each FPGA perform the entire process on part of the data stream.

We once implemented (2) for a bandwidth expander where each chip did the complete process (one-clock-cycle Huffman decoding, translation of the code to a value, then arithmetic processing) for a portion of the incoming data stream. Each chip was given a chunk of the incoming data (e.g., in a two-chip system, chip one processed chunks 1,3,5,... of the data and chip two processed chunks 2,4,6,...). We actually used two chips on the board because of I/O bandwidth limitations, but the chip was designed to allow for 1-, 2-, 4-, or 8-chip operation.

-=Dave=-
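Approach (2) above, where alternating chunks go to alternating chips, can be sketched in Python as a round-robin split and merge. This is only an illustration of the chunk ordering: the fixed chunk size, the function names, and the list-based stream are assumptions for the example, and the post does not say how chunk boundaries were chosen in the actual design.

```python
def distribute(stream, num_chips, chunk_size):
    """Deal fixed-size chunks of the stream to the chips in round-robin order.

    Returns one list of chunks per chip; chip 0 gets chunks 1, 1+N, 1+2N, ...
    (counting from 1 as in the post), chip 1 gets chunks 2, 2+N, and so on.
    """
    lanes = [[] for _ in range(num_chips)]
    for start in range(0, len(stream), chunk_size):
        chip = (start // chunk_size) % num_chips
        lanes[chip].append(stream[start:start + chunk_size])
    return lanes


def reassemble(lanes):
    """Interleave the per-chip chunk lists back into the original order."""
    out = []
    longest = max(len(lane) for lane in lanes)
    for k in range(longest):
        for lane in lanes:
            if k < len(lane):
                out.extend(lane[k])
    return out


stream = list(range(10))
lanes = distribute(stream, num_chips=2, chunk_size=2)
restored = reassemble(lanes)
```

Because each chip sees only every Nth chunk, its input bandwidth is roughly 1/N of the full stream, which matches the I/O-bandwidth motivation given in the post.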






