Hi,

We want to use an XC3S1500 to talk to a single 16-wide 1 gbit DDR2 dram chip. The coregen thing seems to successfully build a dram interface that's claimed to work up to 133 MHz. The DRAM is spec'd to work down to 128 MHz, so there's a small overlap window. We'd run at 128.

Our Xilinx FAE seems to be discouraging us from doing this, without saying precisely why, suggesting some other parts. Spartan 6 would be ideal (hard dram controller as I understand it) but are unavailable for some vague time. We'd rather not use a new part for a single project, since we will cut over to the s6's when they are available.

Has anyone done DDR2 from a Spartan 3? Success/horror stories?

John
Spartan 3 and DDR2
Started by ●July 21, 2009
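As a quick sanity check on the numbers in the opening post, here is a minimal Python sketch of the clock arithmetic. The 128 MHz and 133 MHz limits and the 16-bit bus width come from the post; the function names `ddr_figures` and `overlap_mhz` are made up for illustration, and the bandwidth is the theoretical burst peak only, ignoring refresh and command overhead.

```python
def ddr_figures(clock_mhz, bus_width_bits):
    """Return basic DDR timing and bandwidth figures for one clock rate."""
    period_ns = 1000.0 / clock_mhz      # one full clock cycle
    bit_time_ns = period_ns / 2.0       # DDR moves data on both clock edges
    mtransfers_s = 2.0 * clock_mhz      # mega-transfers per second, per pin
    peak_mbytes_s = mtransfers_s * bus_width_bits / 8.0
    return {
        "period_ns": period_ns,
        "bit_time_ns": bit_time_ns,
        "mtransfers_s": mtransfers_s,
        "peak_mbytes_s": peak_mbytes_s,
    }


def overlap_mhz(dram_min_mhz, controller_max_mhz):
    """Return the usable (low, high) clock range, or None if there is none."""
    if dram_min_mhz > controller_max_mhz:
        return None
    return (dram_min_mhz, controller_max_mhz)


if __name__ == "__main__":
    # Figures taken from the opening post: DRAM minimum 128 MHz,
    # generated controller maximum 133 MHz, 16-bit data bus.
    print(overlap_mhz(128.0, 133.0))
    print(ddr_figures(128.0, 16))
```

At 128 MHz this works out to a bit time of roughly 3.9 ns per pin (256 Mb/s per pin), and the usable clock range is only 5 MHz wide.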
Reply by ●July 21, 2009

"John Larkin" <jjlarkin@highNOTlandTHIStechnologyPART.com> wrote in message news:e4nb65t2drpjnba1453ms7541utu8lojf3@4ax.com...
> Hi,
>
> We want to use an XC3S1500 to talk to a single 16-wide 1 gbit DDR2
> dram chip. The coregen thing seems to successfully build a dram
> interface that's claimed to work up to 133 MHz. The DRAM is spec'd to
> work down to 128 MHz, so there's a small overlap window. We'd run at
> 128.
>
> Our Xilinx FAE seems to be discouraging us from doing this, without
> saying precisely why, suggesting some other parts. Spartan 6 would be
> ideal (hard dram controller as I understand it) but are unavailable
> for some vague time. We'd rather not use a new part for a single
> project, since we will cut over to the s6's when they are available.
>
> Has anyone done DDR2 from a Spartan 3? Success/horror stories?
>
> John
>

Spartan 3 doesn't have any of the Virtex 4 (and later) type of regional IOB group clocking, so doing DDR and/or DDR2 memory interfaces is tougher and certainly will limit your max data rate.

Can you get Spartan 3 to work at a 4 ns data rate? Probably, but if you have the option you should use a later family.

Even with the families that have regional clocking that can be strobed by DQS, it's still a challenge. You have to be super careful about which pins you use and be aware of the "reach" of the local clocking resources in terms of the data bits associated with each DQS, and also the reach of the divided local clock that feeds the FIFO that is the interface into the fabric (proper).

It sure would have been nice if the DRAM manufacturers had provided a CONTINUOUS read clock rather than a freeking dual purpose read/write strobe (DQS). My hair would be darker.

Bob
--
== All google group posts are automatically deleted due to spam ==
Reply by ●July 21, 2009
On Tue, 21 Jul 2009 08:31:47 -0700, John Larkin <jjlarkin@highNOTlandTHIStechnologyPART.com> wrote:
>Hi,
>
>We want to use an XC3S1500 to talk to a single 16-wide 1 gbit DDR2
>dram chip. The coregen thing seems to successfully build a dram
>interface that's claimed to work up to 133 MHz. The DRAM is spec'd to
>work down to 128 MHz, so there's a small overlap window. We'd run at
>128.
>
>Our Xilinx FAE seems to be discouraging us from doing this, without
>saying precisely why, suggesting some other parts. Spartan 6 would be
>ideal (hard dram controller as I understand it) but are unavailable
>for some vague time. We'd rather not use a new part for a single
>project, since we will cut over to the s6's when they are available.
>
>Has anyone done DDR2 from a Spartan 3? Success/horror stories?
>
>John
>

We're using the XC3S1400A with DDR2 using a pair of Gb memories. We're also using a Virtex 5 with DDR2.

The Spartan-3A interface to DDR2 was done using the coregen thing, since the consultants that did the code for this part of the project only had the skills to do drop-in cores. We used the pinout from the demo board since that was a known working configuration and the wiring produced a nearly optimal layout. If you let the coregen do its own pinout selection, you will end up with a horrible wiring mess.

The consultants got it working, but it was a fight. If you read the documentation, you soon find out that using the coregen requires a lot of fiddling with the config files, making sure that pin functions are grouped in certain ways. The consultants assured us that the coregen was an easy way to do a memory interface, but obviously they didn't heed the warnings - Xilinx peppers their notes with "may" and "might". The memory coregen is a horrible piece of crap. Plus, the documentation is extremely confusing on how you deal with trace length between memory and FPGA and a delay line required for the Spartan-3. We finally found the correct answer in their on-line answer bank.

We did the Virtex 5 DDR2 interface in-house. We produced a much more elegant interface than the coregen.

Oh, if you place the memory right next to the FPGA and optimize the pinout, you can leave all the termination resistors out. We ran a bunch of simulations which show that this is possible. In practice, we got beautiful looking waveforms using a GHz scope and a differential probe on single ended lines. It saves a huge amount of space and power. Putting in terminations ends up making terminations necessary, because of all the excess wiring they need! This was done on both the Spartan and Virtex. All traces were buried.

I think you'll find that Xilinx cores are generally poorly done except for a few things. Lots of bugs and poor functionality. They generally require huge amounts of space to do things not quite right.

Don't get me started on their bloody buggy tools!!!!

--
Mark
Reply by ●July 22, 2009
John Larkin <jjlarkin@highNOTlandTHIStechnologyPART.com> writes:
> Hi,
>
> We want to use an XC3S1500 to talk to a single 16-wide 1 gbit DDR2
> dram chip. The coregen thing seems to successfully build a dram
> interface that's claimed to work up to 133 MHz. The DRAM is spec'd to
> work down to 128 MHz, so there's a small overlap window. We'd run at
> 128.
>

If it's any help - I've done DDR (not DDR2) with S3ADSP3400-4. I used the MIG tool which, contrary to others, I found to work well, as long as the documented limits (esp. on pin placement) are followed. Given that the ultimate target was to use XPS's memory controller (MPMC), which uses a MIG physical layer, I didn't think rolling my own controller was worth the hassle. And indeed, the board worked first time (4 independent x16 DDR chips, 4 MIGs/MPMCs @ 125MHz).

On the bench, I pushed the clock up to 140MHz, and used the chipscope VIO core (which MIG builds in) to push the timing over to one "side", but it just sat there and took it - one channel would fail at the most extreme sampling point offset I could set.

I used series terminators on the signal lines, but to be honest that was maybe paranoia. The sims did show rather more overshoot than I wanted with fast-edge models, so I put them in. I also turned down the drive strength on the data lines in the MIG/MPMC config.

> Our Xilinx FAE seems to be discouraging us from doing this, without
> saying precisely why, suggesting some other parts. Spartan 6 would be
> ideal (hard dram controller as I understand it) but are unavailable
> for some vague time. We'd rather not use a new part for a single
> project, since we will cut over to the s6's when they are available.
>
> Has anyone done DDR2 from a Spartan 3? Success/horror stories?
>

I don't know how much of DDR reads over to DDR2, but hopefully the above is of interest!

Cheers,
Martin

--
martin.j.thompson@trw.com
TRW Conekt - Consultancy in Engineering, Knowledge and Technology
http://www.conekt.net/electronics.html
Reply by ●July 24, 2009
John Larkin <jjlarkin@highNOTlandTHIStechnologyPART.com> wrote:
>Hi,
>
>We want to use an XC3S1500 to talk to a single 16-wide 1 gbit DDR2
>dram chip. The coregen thing seems to successfully build a dram
>interface that's claimed to work up to 133 MHz. The DRAM is spec'd to
>work down to 128 MHz, so there's a small overlap window. We'd run at
>128.
>
>Our Xilinx FAE seems to be discouraging us from doing this, without
>saying precisely why, suggesting some other parts. Spartan 6 would be
>ideal (hard dram controller as I understand it) but are unavailable
>for some vague time. We'd rather not use a new part for a single
>project, since we will cut over to the s6's when they are available.
>
>Has anyone done DDR2 from a Spartan 3? Success/horror stories?

I did a DDR design at 100MHz which shares a standard PC memory module (64 bit wide) between two Spartan 3 FPGAs (800MB/s per FPGA). I didn't like the MIG tool (way too big, ugly and too limited) so I rolled my own DDR controller. The trick is to get the sampling point for the incoming data right. I used a 90 degrees phase shifted capture clock that hit the sweet spot perfectly.

I'm planning on upgrading this design to DDR2 using the speed grade 5 devices. I still have to do the math whether the phase shifted clock will work. There has to be a window in which the data is stable for the FPGA to sample it. If there is no such window a calibration scheme is required.

I looked at the Spartan 6 FPGA but I doubt it will offer much improvement. The memory controller is still very limited when it comes to the amount of memory (width and address space) it can control.

--
Failure does not prove something is impossible, failure simply
indicates you are not using the right tools...
"If it doesn't fit, use a bigger hammer!"
--------------------------------------------------------------
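The "math" referred to in this post can be sketched as a simple timing budget: start from the bit time, subtract everything that moves the data relative to the capture clock, and see whether what remains still covers the input flip-flop's setup and hold. This is only a rough sketch. The function names are hypothetical, the uncertainties are lumped into a plain list, and every number in the example is a placeholder, not a datasheet value for any FPGA or DRAM.

```python
def capture_margin_ns(clock_mhz, uncertainties_ns, setup_ns, hold_ns):
    """Return (stable_window_ns, margin_ns) for a fixed-phase capture clock.

    uncertainties_ns: each source of data-vs-clock movement in ns
    (for example DRAM output skew, board skew, clock jitter, and
    FPGA pad delay variation). All values are supplied by the caller.
    """
    bit_time_ns = 1000.0 / clock_mhz / 2.0      # DDR: two bits per clock
    window_ns = bit_time_ns - sum(uncertainties_ns)
    margin_ns = window_ns - (setup_ns + hold_ns)
    return window_ns, margin_ns


def needs_calibration(clock_mhz, uncertainties_ns, setup_ns, hold_ns):
    """True if no fixed capture phase can satisfy setup and hold."""
    _, margin_ns = capture_margin_ns(
        clock_mhz, uncertainties_ns, setup_ns, hold_ns
    )
    return margin_ns <= 0.0


if __name__ == "__main__":
    # Placeholder numbers only, chosen to show both outcomes.
    print(capture_margin_ns(100.0, [1.0, 0.5], 0.5, 0.3))
    print(needs_calibration(128.0, [2.0, 1.5], 0.5, 0.3))
```

A positive margin means a fixed phase-shifted clock can work if it is centred in the window; a zero or negative margin corresponds to the "no such window" case, where some calibration scheme is needed.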
Reply by ●July 25, 2009
On Fri, 24 Jul 2009 10:44:40 GMT, nico@puntnl.niks (Nico Coesel) wrote:
>I did a DDR design at 100MHz which shares a standard PC memory module
>(64 bit wide) between two Spartan 3 FPGAs (800MB/s per FPGA). I didn't
>like the MIG tool (way too big, ugly and too limited) so I rolled my
>own DDR controller. The trick is to get the sampling point for the
>incoming data right. I used a 90 degrees phase shifted capture clock
>that hit the sweet spot perfectly. I'm planning on upgrading this
>design to DDR2 using the speed grade 5 devices. I still have to do the
>math whether the phase shifted clock will work. There has to be a
>window in which the data is stable for the FPGA to sample it. If there
>is no such window a calibration scheme is required.
>
>I looked at the Spartan 6 FPGA but I doubt it will offer much
>improvement. The memory controller is still very limited when it comes
>to the amount of memory (width and address space) it can control.

*Now* our Xilinx guy is saying, oops, the XC3S is fine to work with DDR2. Maybe I'll add an external variable delay just in case we need to tweak the read clock edge.

John
Reply by ●July 25, 2009

"John Larkin" <jjlarkin@highNOTlandTHIStechnologyPART.com> wrote in message news:ia1l6511mrt05k4vhifkcu1j9s1kc5i4nu@4ax.com...
> *Now* our Xilinx guy is saying, oops, the XC3S is fine to work with
> DDR2. Maybe I'll add an external variable delay just in case we need
> to tweak the read clock edge.
>
> John
>

The issue with not clocking the read data from each byte's DQS is that you're losing some of your data valid window. However, if you're set on using Spartan 3 then it's your only choice.

The trick is to make sure that your internal fabric clock is phase stable with the incoming data (over voltage, temperature, and unit-to-unit variations) in order to maximize the read data valid window at each IOB.

There are also techniques for stabilizing the write data and FPGA-generated RAM clock with respect to voltage, temperature, and unit-to-unit variations. This involves routing an output pin back to a clock input (IBUFG) and putting it in the DCM->BUFG feedback loop. I would assume that the Xilinx appnote does this (I think it does, iirc).

You'll need to use DCMs anyway, so adding an external variable delay doesn't buy you anything. I would recommend putting hooks in your code so that you can, at run time, adjust the read DCM phase shift and find the middle of the data valid window while testing over supply and temperature variation.

If you use an external clock (a copy of the one feeding the FPGA) to clock your RAM then you lose the ability to do RAM checking via JTAG at production test time. If you do FPGA JTAG-controlled RAM testing then you'll need to have the FPGA forward the clock to the RAM.

With all that, you should be able to get it to work reliably at your 256 Mb/s (per pin) data rate, but it takes a lot of up-front planning.

Bob
--
== All google group posts are automatically deleted due to spam ==
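The run-time phase sweep suggested in this post boils down to a small search: step the read-capture phase across its range, run a memory test at each step, and park the phase in the middle of the widest run of passing steps. The Python sketch below shows only that search. It assumes the pass/fail list has already been collected by some hypothetical test routine, it uses no Xilinx API, and it does not handle a passing window that wraps around the ends of the phase range.

```python
def widest_passing_window(results):
    """Find the longest run of passing phase steps.

    results: sequence of booleans, indexed by phase step, where True
    means the memory test passed at that step.
    Returns (first_step, last_step, centre_step), or None if no step passed.
    """
    best = None
    run_start = None
    # A trailing False acts as a sentinel that closes a run ending at the
    # last real step.
    for step, passed in enumerate(list(results) + [False]):
        if passed and run_start is None:
            run_start = step
        elif not passed and run_start is not None:
            run_len = step - run_start
            if best is None or run_len > (best[1] - best[0] + 1):
                best = (run_start, step - 1)
            run_start = None
    if best is None:
        return None
    first, last = best
    return first, last, (first + last) // 2


if __name__ == "__main__":
    # Made-up sweep: steps 2..6 pass, step 8 passes in isolation.
    sweep = [False, False, True, True, True, True, True, False, True, False]
    print(widest_passing_window(sweep))
```

Repeating the sweep at the supply and temperature corners and intersecting the passing ranges gives a more conservative centre than a single room-temperature sweep.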
Reply by ●July 25, 2009

"BobW" <nimby_GIMME_SOME_SPAM@roadrunner.com> wrote:
>There are also techniques for stabilizing the write data and FPGA-generated
>RAM clock with respect to voltage, temperature, and unit-to-unit variations.
>This involves routing an output pin back to a clock input (IBUFG) and
>putting it in the DCM->BUFG feedback loop. I would assume that the Xilinx
>appnote does this (I think it does, iirc).

IMHO this trick has no effect if you source the memory clock from the FPGA: the delay variation in the IOB is cancelled when writing. Writing data becomes a walk in the park.

--
Failure does not prove something is impossible, failure simply
indicates you are not using the right tools...
"If it doesn't fit, use a bigger hammer!"
--------------------------------------------------------------
Reply by ●July 25, 2009

"Nico Coesel" <nico@puntnl.niks> wrote in message news:4a6acdbc.71573125@news.planet.nl...
> "BobW" <nimby_GIMME_SOME_SPAM@roadrunner.com> wrote:
>
>>There are also techniques for stabilizing the write data and
>>FPGA-generated
>>RAM clock with respect to voltage, temperature, and unit-to-unit
>>variations.
>>This involves routing an output pin back to a clock input (IBUFG) and
>>putting it in the DCM->BUFG feedback loop. I would assume that the Xilinx
>>appnote does this (I think it does, iirc).
>
> IMHO this trick has no effect if you source the memory clock from the
> FPGA, the delay variation in the IOB is cancelled when writing.
> Writing data becomes a walk in the park.
>

That's right. Writing is always the easy part. When you generate the clock from the FPGA then that clock and data will always remain stable in relative phase.

However, when you do this, the phase of the read data will now "drift" with respect to the FPGA's internal clock. That's the problem.

So, if you phase lock the output clock (and associated write data) by using the clock loop-around technique then the read data will be locked, too.

Of course, this all assumes that John needs to read the RAM. Maybe he's using it in WOM mode. ;-D

Bob
--
== All google group posts are automatically deleted due to spam ==
Reply by ●July 26, 2009

"BobW" <nimby_GIMME_SOME_SPAM@roadrunner.com> wrote:
>That's right. Writing is always the easy part. When you generate the clock
>from the FPGA then that clock and data will always remain stable in relative
>phase.
>
>However, when you do this, the phase of the read data will now "drift" with
>respect to FPGA's internal clock. That's the problem.
>
>So, if you phase lock the output clock (and associated write data) by using
>the clock loop around technique then the read data will be locked, too.

More precisely: you'll get a clock that has the same phase shift as the delay in the input path + output path + some clock routing delays (IBUFG, OBUF). I'm not sure if all this results in a clock with less uncertainty than just taking the worst case variations in the input and output pads into account. Besides, you'll need to re-clock the data into other flipflops downstream. So a properly phase shifted clock which doesn't violate setup and hold times is still required. The problem more or less remains.

The Spartan 3 has regional clocks. On the top and bottom banks it is possible to use DQS to clock the data into the IOB flipflops. There are some severe routing constraints though (not well documented either) which prohibit using the banks on the sides. You'll have to route this part of the design first and get the pin assignment right before making the PCB. This method also requires thorough analysis of where the DQS is at, to be able to clock the data into the downstream flipflops.

It has been several years since I worked on the design, but I clearly remember that putting more 'helper' elements in series with signals made things worse. Keeping things simple (clocking the data into a known clock domain inside the IOB) resulted in a design which had the best timing margins.

--
Failure does not prove something is impossible, failure simply
indicates you are not using the right tools...
"If it doesn't fit, use a bigger hammer!"
--------------------------------------------------------------






