On Sep 29, 1:11 pm, n...@puntnl.niks (Nico Coesel) wrote:
> "KJ" <kkjenni...@sbcglobal.net> wrote:
>
> >"Nico Coesel" <n...@puntnl.niks> wrote in message
> >news:48dfe4ee.168426254@news.planet.nl...
> >> jhal...@TheWorld.com (Joseph H Allen) wrote:
>
> >> An easier way without the extra jitter is to use an
> >> output flipflop (aka DDR flipflop) which can be clocked using 2
> >> clocks. The first clock sets the output, the second clock (inverted
> >> first clock) resets the output. And presto, you'll have a clock output
> >> which is (within pin-to-pin skew) perfectly synchronous to the other
> >> outputs.
>
> >Hold time requirements (like the .4ns in the OP) will be impossible to
> >guarantee with this method though.
>
> A slightly longer PCB track will do that for you.
Slightly? The OP eyeballed the existing PCB traces at ~1 inch. At
roughly 6 inches per ns, adding 0.4 ns of delay (the hold time
requirement of the SRAM) would require ~2.5 inches of extra trace.
To do what you suggest would therefore mean adding about 2.5x the
existing trace length to each and every address/control signal on
something that is running at ~200 MHz...not the sort of thing one would
design into a board...at least not intentionally.
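A quick sanity check of that figure, as a sketch only. It assumes the rule of thumb of ~6 inches of trace per ns used elsewhere in this thread; the real number depends on the board stackup, and the function name is made up for illustration.

```python
# Back-of-envelope: how much extra trace buys a given delay?
# ASSUMPTION: ~6 inches of trace per ns (rule of thumb quoted in this
# thread); real boards vary with stackup and dielectric.
INCHES_PER_NS = 6.0


def extra_trace_inches(delay_ns, inches_per_ns=INCHES_PER_NS):
    """Trace length (inches) whose propagation delay equals delay_ns."""
    return delay_ns * inches_per_ns


hold_ns = 0.4        # SSRAM input hold time from the original post
existing_in = 1.0    # trace length as eyeballed by the OP
extra_in = extra_trace_inches(hold_ns)
print(f"extra trace: {extra_in:.1f} in, "
      f"{extra_in / existing_in:.1f}x the existing trace")
```

With these inputs the extra length comes out a little under 2.5 inches, in line with the estimate above.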
KJ
Reply by Nico Coesel●September 29, 2008
"KJ" <kkjennings@sbcglobal.net> wrote:
>
>"Nico Coesel" <nico@puntnl.niks> wrote in message
>news:48dfe4ee.168426254@news.planet.nl...
>> jhallen@TheWorld.com (Joseph H Allen) wrote:
>>
>> An easier way without the extra jitter is to use an
>> output flipflop (aka DDR flipflop) which can be clocked using 2
>> clocks. The first clock sets the output, the second clock (inverted
>> first clock) resets the output. And presto, you'll have a clock output
>> which is (within pin-to-pin skew) perfectly synchronous to the other
>> outputs.
>>
>
>Hold time requirements (like the .4ns in the OP) will be impossible to
>guarantee with this method though.
A slightly longer PCB track will do that for you. But that won't work
on a pre-made board because you can't alter the PCB. Still, if the OP
uses an off-the-shelf board, the designer should have thought about
this sort of thing... Perhaps the OP could ask them.
--
Programmeren in Almere?
E-mail naar nico@nctdevpuntnl (punt=.)
Reply by KJ●September 28, 2008
"Nico Coesel" <nico@puntnl.niks> wrote in message
news:48dfe4ee.168426254@news.planet.nl...
> jhallen@TheWorld.com (Joseph H Allen) wrote:
>
> An easier way without the extra jitter is to use an
> output flipflop (aka DDR flipflop) which can be clocked using 2
> clocks. The first clock sets the output, the second clock (inverted
> first clock) resets the output. And presto, you'll have a clock output
> which is (within pin-to-pin skew) perfectly synchronous to the other
> outputs.
>
Hold time requirements (like the .4ns in the OP) will be impossible to
guarantee with this method though.
KJ
Reply by Nico Coesel●September 28, 2008
jhallen@TheWorld.com (Joseph H Allen) wrote:
>There are a number of ways to do this. Here is one way:
>
>What you usually want is for the clock at the pin of the SRAM chip to rise
>at the same time as the FPGA internal global clock driving the output pads.
>
>A way to do this is to route the clock from the second PLL to two output
>pins (right next to each other). One pin goes to the SSRAM. The other pin
>goes to the feedback input of the PLL. The trace length for this feedback
>line should be the same as the one which goes to the RAM. You have to use a
>dedicated PLL feedback input pin for this to work. This way is nice because
>you can usually leave the PLL phase shift setting at 0.
This is quite cumbersome and the PLL will add extra jitter (clock
uncertainty). An easier way without the extra jitter is to use an
output flipflop (aka DDR flipflop) which can be clocked using 2
clocks. The first clock sets the output, the second clock (inverted
first clock) resets the output. And presto, you'll have a clock output
which is (within pin-to-pin skew) perfectly synchronous to the other
outputs.
--
Programmeren in Almere?
E-mail naar nico@nctdevpuntnl (punt=.)
Reply by KJ●September 28, 2008
On Sep 28, 1:52 am, Tommy Thorn <tommy.th...@gmail.com> wrote:
> KJ, thanks for a detailed reply. It makes perfect sense, unfortunately
> as detailed below I'm not sure it's all feasible for me to do this
> analytically, but I feel more comfortable playing with phase shifting
> of the clock.
>
Well, the bottom line of the calculations will still be a phase shift
of the clock, so you're already poking around at what the correct
solution will be.
>
> > Keep in mind though that there can easily be different delays from
> > these two PLL outputs to the different destinations.
>
> Oops. I had assumed they were phase locked and that the skew between
> them would be small enough to ignore. Using just a single PLL doesn't
> seem like it would help (?) as the on-die clock might be offset from
> the clock at the pin.
>
Some pins (actually one) will be better choices than others since they
are intended to be used as a PLL output pin. Using other pins is not
necessarily a problem, just that you'll get more skew and have
somewhat less control of it. But as you've discovered, there must be
some skew between the two otherwise your adjusting of the phase
wouldn't have made any difference (and it did).
> > You'll need to know...
> > - What is an achievable skew between the clock at the internal flops
> > and the clock as it is leaving your controller.
>
> Ok, but how do I find that? Browsing the Cyclone II data sheet didn't
> reveal anything useful (unless I missed it).
>
Peruse the timing reports of your design. There should be something
saying what the delay to the output pin that goes to the SRAM is.
Here it can get a bit muddy depending on whether you're using the
'Classic' timing analyzer or 'TimeQuest' but what you're trying to
determine should be in the timing reports.
What you do with those min/max delay numbers is use them to set a timing
constraint that the design will now always have to meet, and also use
that constraint to figure out what the phase shift needs to be on the
internal clock so that everything hangs together. It may sound kind
of backwards using the result of a run to figure out what the
constraints need to be, but it's not really. Ideally you would like
there to be 0 ns skew at the clock output pin, but if the software
can't deliver that, then what? At the end of the day, the absolute
value of the delay doesn't matter since that gets accommodated by
changing the PLL phase delay; what hurts the most is the difference
between the min and the max of that delay (again, available from the
timing reports). The spread between min and max is something that
can't be designed around and is something that will be tighter if the
'pll_out' pin is used. I'm not sure if the board you're using did
this, but it's easy enough to check...nothing you can do about it, but
it might be worth knowing.
>
> > - Net lengths and capacitive loading of the signals on the PCBA that
> > go between the two devices.
>
> Ough. While that makes perfect sense, I simply don't think I have
> enough information (or experience) to do that. Remember, this is an
> off-the-shelf development kit and the provided examples do not even
> use constraints. I do have the SSRAM datasheet which lists the Cin as
> 6 pF and Cout as 8 pF. Eyeballing the traces, they look roughly an
> inch long (the SSRAM sits very close to the FPGA).
>
That's good. Trace delay is roughly 1 ns per 6 inches, so the round
trip delay from FPGA to SRAM and back (as you would have during an SRAM
read) would be < 1/3 ns, which is pretty small (~3% of a 10 ns clock cycle).
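To see how that round-trip figure scales with clock rate, here is a small sketch. It assumes the same ~6 inches per ns rule of thumb and a 1 inch trace (the OP's eyeballed length); the helper names are hypothetical.

```python
# Round-trip trace delay as a fraction of the clock period.
# ASSUMPTION: ~6 inches of trace per ns and a 1 inch trace each way.
INCHES_PER_NS = 6.0


def round_trip_ns(trace_in, inches_per_ns=INCHES_PER_NS):
    """Delay (ns) for a signal to travel the trace and come back."""
    return 2.0 * trace_in / inches_per_ns


def fraction_of_period(delay_ns, freq_mhz):
    """Share of one clock period consumed by delay_ns."""
    period_ns = 1000.0 / freq_mhz
    return delay_ns / period_ns


delay = round_trip_ns(1.0)
for freq_mhz in (100, 170, 200):
    share = fraction_of_period(delay, freq_mhz)
    print(f"{freq_mhz} MHz: {delay:.2f} ns round trip = {share:.1%} of the period")
```

The ~3% figure corresponds to a 10 ns period (100 MHz); at the 200 MHz target the same delay is closer to 7% of the cycle.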
> > What's important here is really
> > differences between the various SRAM inputs and the clock. From that
> > you calculate an additional delay.
>
> Again, eyeballing it, the clock trace looks near identical to most of
> the data.
>
That's good too.
>
> > From all of that you should come out with a sketch that shows where
> > things need to be valid in order for the system to read and write
> > properly. Adjust the nominal phase of the clock leaving the device
> > (or equivalently the FPGA internal clock) so that the clock occurs
> > (both min and max time) at a point where everything is stable. Keep
> > in mind that as you shove the clock one way to improve setup time at
> > the SRAM, you're most likely stealing that from the setup time at the
> > FPGA when it is reading data back.
>
> Exactly. Which is why I was wary of the advice I found multiple
> places: phase shift the SSRAM clock by 180 degree.
>
180 degrees means you're inverting the clock. While that's an easy
technique, giving up half of a 10 ns clock period will likely end up
in still failing timing. It's best to figure out based on the Tco of
the outputs, the skew of the clocks and the setup and hold times just
where the clock can be placed. If it happens to be that an inverted
clock would work, OK. If not, your analysis will show just where it
can be placed. Again, ideally you would like the FPGA and the SRAM to
both receive the clock at exactly the same time, that will give you
the most margin.
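For reference, converting between a PLL phase setting in degrees and a time shift is simple arithmetic. A minimal sketch, with function names invented for illustration:

```python
# Convert between PLL phase shift (degrees) and time shift (ns).
# One full period corresponds to 360 degrees.

def phase_to_ns(phase_deg, freq_mhz):
    """Time shift (ns) produced by phase_deg at a clock of freq_mhz."""
    period_ns = 1000.0 / freq_mhz
    return phase_deg / 360.0 * period_ns


def ns_to_phase(shift_ns, freq_mhz):
    """Phase (degrees) needed to shift the clock by shift_ns."""
    period_ns = 1000.0 / freq_mhz
    return shift_ns / period_ns * 360.0


print(phase_to_ns(180, 100))   # half of a 10 ns period
print(phase_to_ns(180, 200))   # half of a 5 ns period
print(ns_to_phase(0.4, 200))   # degrees for a 0.4 ns shift at 200 MHz
```

So an inverted clock costs 5 ns at 100 MHz and 2.5 ns at 200 MHz, while a shift on the order of the 0.4 ns hold time is only a few tens of degrees at 200 MHz.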
>
> Thankfully this board appears to be well designed. I can already hit
> 170 MHz with my simple solution, but I'd like to push it to the limit
> of the SRAM (200 MHz).
>
Pushing to the limit usually just means that some extra analysis work
is needed in order to make the design solid. That's all that is going
on here.
> > Since it appears from your constraint that you're using Quartus, you
> > might want to put in numbers that are representative of the correct
> > capacitive loading as well as checking that the I/O drive strengths
> > are appropriate and not just the defaults (unless you've already done
> > this).
>
> Ah, yes, I should do that. The pin capacitive loading I gave above.
> I'm not sure how much I should estimate for the short trace. The DC
> ELECTRICAL CHARACTERISTICS states this:
>
> Output HIGH Voltage min 2.4 V (test cond I_OH = -4 mA)
> Output LOW Voltage max 0.4 V (test cond I_OL = -8 mA)
> Input HIGH Voltage min 2.0 V
> Input LOW Voltage max 0.8 V
> Input Leakage [-5 uA; 5 uA]
> Output Leakage [-5 uA; 5 uA]
>
> but doesn't explicitly mention an IO standard. I assume that LVTTL
> (3.3 V) is a fine choice (the default?). Or is LVCMOS a better choice.
>
> I guess I have no idea of how to pick a suitable drive strength.
>
LVTTL is the voltage standard (LVCMOS will be essentially the same).
There is another setting for drive strength, look for something
measured in mA. Since it sounds like the PCBA design has no obvious
problems, set the drive strength to the max (likely 24mA).
Kevin Jennings
Reply by Tommy Thorn●September 28, 2008
KJ, thanks for a detailed reply. It makes perfect sense, unfortunately
as detailed below I'm not sure it's all feasible for me to do this
analytically, but I feel more comfortable playing with phase shifting
of the clock.
On Sep 26, 12:14 pm, KJ <kkjenni...@sbcglobal.net> wrote:
> On Sep 26, 2:34 pm, Tommy Thorn <tommy.th...@gmail.com> wrote:
>
> > First, all outputs are fully registered (and constrained to guarantee
> > they stay registered). The main logic is clocked by a PLL. A second
> > but identically configured output on this PLL drives the SSRAM.
>
> Keep in mind though that there can easily be different delays from
> these two PLL outputs to the different destinations.
Oops. I had assumed they were phase locked and that the skew between
them would be small enough to ignore. Using just a single PLL doesn't
seem like it would help (?) as the on-die clock might be offset from
the clock at the pin.
> You'll need to know...
> - What is an achievable skew between the clock at the internal flops
> and the clock as it is leaving your controller.
Ok, but how do I find that? Browsing the Cyclone II data sheet didn't
reveal anything useful (unless I missed it).
> In some sense it
> doesn't matter too much what that actual skew is, but you need to know
> what it is so that you can then add a timing constraint so that this
> delay is always met, or flagged as a timing error for you. For the
> sake of an example, let's just say that there is a skew of 1 ns
> between the internal clock and the clock at the output of the FPGA.
> Keep in mind that this skew will have both a minimum and a maximum so
> the skew is really a range between those two extremes.
>
> - Net lengths and capacitive loading of the signals on the PCBA that
> go between the two devices.
Ough. While that makes perfect sense, I simply don't think I have
enough information (or experience) to do that. Remember, this is an
off-the-shelf development kit and the provided examples do not even
use constraints. I do have the SSRAM datasheet which lists the Cin as
6 pF and Cout as 8 pF. Eyeballing the traces, they look roughly an
inch long (the SSRAM sits very close to the FPGA).
> What's important here is really
> differences between the various SRAM inputs and the clock. From that
> you calculate an additional delay.
Again, eyeballing it, the clock trace looks near identical to most of
the data.
> Practically speaking, you most
> likely have roughly equal net lengths and loading on all of the
> signals and this is not going to be a concern, but you should at least
> be aware of this as well. Different parts may have different
> capacitive loading so if you want to get nit picky this delay will
> also be a range with a min and a max but that range will typically be
> much smaller than the uncertainty with the FPGA.
>
> From the FPGA clock skew min/max add on the additional delay for
> length/loading differences and now you have a known window of clock
> uncertainty. Now get a piece of paper and sketch out some waveforms
> showing the min/max switching times of the control signals (i.e.
> address, oe, write, data_in, data_out) as well as the setup/hold time
> of both the SRAM and the FPGA (for the data coming back in). Somebody
> was advertising a free timing waveform tool out here a few months back
> (I don't remember the name), that may help but it's not that difficult
> to paper sketch it either.
>
> From all of that you should come out with a sketch that shows where
> things need to be valid in order for the system to read and write
> properly. Adjust the nominal phase of the clock leaving the device
> (or equivalently the FPGA internal clock) so that the clock occurs
> (both min and max time) at a point where everything is stable. Keep
> in mind that as you shove the clock one way to improve setup time at
> the SRAM, you're most likely stealing that from the setup time at the
> FPGA when it is reading data back.
Exactly. Which is why I was wary of the advice I found multiple
places: phase shift the SSRAM clock by 180 degree.
> There can also be other concerns: if the nets are long you'll get
> ringing which distorts the waveforms, which basically means that you'll
> need to wait longer for things to stabilize, which cuts into the
> allowable timing. At 100 MHz, just 1 ns is 10% of the clock cycle
> budget. Whether or not that's an issue you'll need to
> determine with a scope.
Thankfully this board appears to be well designed. I can already hit
170 MHz with my simple solution, but I'd like to push it to the limit
of the SRAM (200 MHz).
> > Phase-shift the SSRAM clock?
>
> Yes
>
> > Specify it as a timing constraint?
>
> Yes
>
> > Do timing constraints influence the output buffer or are they purely for
> > checking?
>
> For the most part it's just checking, although it can affect place and
> route as well. I haven't seen a case where it affected the output
> buffer itself (i.e. kicked up or down the drive strength) in order to
> meet a constraint. This is most likely because drive strength
> considerations have a much larger impact than just timing. It does go
> the other way though, as you fiddle with drive strength the software
> should take this into account when it does the timing analysis.
>
> Since it appears from your constraint that you're using Quartus, you
> might want to put in numbers that are representative of the correct
> capacitive loading as well as checking that the I/O drive strengths
> are appropriate and not just the defaults (unless you've already done
> this).
Ah, yes, I should do that. The pin capacitive loading I gave above.
I'm not sure how much I should estimate for the short trace. The DC
ELECTRICAL CHARACTERISTICS states this:
Output HIGH Voltage min 2.4 V (test cond I_OH = -4 mA)
Output LOW Voltage max 0.4 V (test cond I_OL = -8 mA)
Input HIGH Voltage min 2.0 V
Input LOW Voltage max 0.8 V
Input Leakage [-5 uA; 5 uA]
Output Leakage [-5 uA; 5 uA]
but doesn't explicitly mention an IO standard. I assume that LVTTL
(3.3 V) is a fine choice (the default?). Or is LVCMOS a better choice.
I guess I have no idea of how to pick a suitable drive strength.
Thanks again
Tommy
Reply by Tommy Thorn●September 27, 2008
On Sep 26, 12:37 pm, jhal...@TheWorld.com (Joseph H Allen) wrote:
> You have to use a
> dedicated PLL feedback input pin for this to work. This way is nice
> because you can usually leave the PLL phase shift setting at 0.
Thanks. This is a nice solution and I've used it on other projects.
Unfortunately, I didn't design Terasic's DE2-70 dev kit and it doesn't
have a feedback clock trace, so this isn't an option here.
Tommy
Reply by Joseph H Allen●September 26, 2008
There are a number of ways to do this. Here is one way:
What you usually want is for the clock at the pin of the SRAM chip to rise
at the same time as the FPGA internal global clock driving the output pads.
A way to do this is to route the clock from the second PLL to two output
pins (right next to each other). One pin goes to the SSRAM. The other pin
goes to the feedback input of the PLL. The trace length for this feedback
line should be the same as the one which goes to the RAM. You have to use a
dedicated PLL feedback input pin for this to work. This way is nice because
you can usually leave the PLL phase shift setting at 0.
You have to use an "enhanced" PLL with external feedback pin and set it to
use external feedback pin mode.
In article <11ffca10-bec7-4cb7-bed4-458a3a8743dd@v13g2000pro.googlegroups.com>,
Tommy Thorn <tommy.thorn@gmail.com> wrote:
>I wrote a little controller + tester app for the SSRAM on Terasic's
>DE2-70 which is rated for 200 MHz. I have gotten it working @ 170 MHz,
>but I'm a little uneasy about the SSRAM clock. Being a non-EE I
>suspect I'm missing something fundamental here.
>
>First, all outputs are fully registered (and constrained to guarantee
>they stay registered). The main logic is clocked by a PLL. A second
>but identically configured output on this PLL drives the SSRAM. The
>SSRAM datasheet lists the setup and hold times for the inputs as 1.4
>ns / 0.4 ns respectively. What is the correct way to achieve this?
>Phase-shift the SSRAM clock? Specify it as a timing constraint? Do
>timing constraints influence the output buffer or are they purely for
>checking?
>
>Keeping the clock in phase with the main clock led to errors showing
>up (once beyond 100 MHz), but shifting it a few degrees made it
>perfectly stable again. However, what is the appropriate engineering
>approach to this source synchronous problem?
>
>Any help would be much appreciated.
>
>Thanks,
>Tommy
>
>
>FWIW, these are the constraints I'm currently using:
>
># timing constraints for SSRAM
>set_instance_assignment -name FAST_OUTPUT_REGISTER ON -to oSRAM*
>set_instance_assignment -name FAST_OUTPUT_REGISTER ON -to SRAM_*
>set_instance_assignment -name FAST_OUTPUT_ENABLE_REGISTER ON -to SRAM_*
>set_instance_assignment -name TCO_REQUIREMENT "3 ns" -to oSRAM*
>set_instance_assignment -name TCO_REQUIREMENT "3 ns" -to SRAM*
>set_instance_assignment -name TSU_REQUIREMENT "2.2 ns" -to SRAM*
>
># other default timings
>set_global_assignment -name TSU_REQUIREMENT "5 ns"
>set_global_assignment -name TCO_REQUIREMENT "10 ns"
--
/* jhallen@world.std.com AB1GO */ /* Joseph H. Allen */
int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0)
+r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79:0:p>158?-79:0,q?!a[p+q*2
]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}
Reply by KJ●September 26, 2008
On Sep 26, 2:34 pm, Tommy Thorn <tommy.th...@gmail.com> wrote:
> I wrote a little controller + tester app for the SSRAM on Terasic's
> DE2-70 which is rated for 200 MHz. I have gotten it working @ 170 MHz,
> but I'm a little uneasy about the SSRAM clock. Being a non-EE I
> suspect I'm missing something fundamental here.
>
> First, all outputs are fully registered (and constrained to guarantee
> they stay registered). The main logic is clocked by a PLL. A second
> but identically configured output on this PLL drives the SSRAM.
Keep in mind though that there can easily be different delays from
these two PLL outputs to the different destinations.
> The
> SSRAM datasheet lists the setup and hold times for the inputs as 1.4
> ns / 0.4 ns respectively. What is the correct way to achieve this?
You'll need to know...
- What is an achievable skew between the clock at the internal flops
and the clock as it is leaving your controller. In some sense it
doesn't matter too much what that actual skew is, but you need to know
what it is so that you can then add a timing constraint so that this
delay is always met, or flagged as a timing error for you. For the
sake of an example, let's just say that there is a skew of 1 ns
between the internal clock and the clock at the output of the FPGA.
Keep in mind that this skew will have both a minimum and a maximum so
the skew is really a range between those two extremes.
- Net lengths and capacitive loading of the signals on the PCBA that
go between the two devices. What's important here is really
differences between the various SRAM inputs and the clock. From that
you calculate an additional delay. Practically speaking, you most
likely have roughly equal net lengths and loading on all of the
signals and this is not going to be a concern, but you should at least
be aware of this as well. Different parts may have different
capacitive loading so if you want to get nit picky this delay will
also be a range with a min and a max but that range will typically be
much smaller than the uncertainty with the FPGA.
From the FPGA clock skew min/max add on the additional delay for
length/loading differences and now you have a known window of clock
uncertainty. Now get a piece of paper and sketch out some waveforms
showing the min/max switching times of the control signals (i.e.
address, oe, write, data_in, data_out) as well as the setup/hold time
of both the SRAM and the FPGA (for the data coming back in). Somebody
was advertising a free timing waveform tool out here a few months back
(I don't remember the name), that may help but it's not that difficult
to paper sketch it either.
From all of that you should come out with a sketch that shows where
things need to be valid in order for the system to read and write
properly. Adjust the nominal phase of the clock leaving the device
(or equivalently the FPGA internal clock) so that the clock occurs
(both min and max time) at a point where everything is stable. Keep
in mind that as you shove the clock one way to improve setup time at
the SRAM, you're most likely stealing that from the setup time at the
FPGA when it is reading data back.
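The paper sketch can also be written down as arithmetic. The following is a minimal sketch for the write direction only (FPGA outputs into the SRAM), assuming matched trace lengths for clock and data. The setup/hold numbers (1.4 ns / 0.4 ns) come from the original post; the Tco min/max and clock skew values are hypothetical placeholders that would have to be replaced with numbers from the timing reports, and the function name is invented for illustration.

```python
# Allowed window for the nominal SRAM clock shift, write direction only.
# All times in ns. "shift" is the delay of the clock edge at the SRAM pin
# relative to the FPGA-internal edge that launches the outputs.

def sram_clock_window(period_ns, tco_min, tco_max, tsu, th,
                      skew_min=0.0, skew_max=0.0):
    """Return (lo, hi) bounds on the nominal clock shift.

    skew_min/skew_max bound the uncertainty added on top of the
    nominal shift. If lo > hi there is no valid shift.
    """
    # Setup: the latest data (tco_max) must be stable tsu before the NEXT
    # SRAM clock edge, which arrives no earlier than period + shift + skew_min.
    lo = tco_max + tsu - period_ns - skew_min
    # Hold: the earliest new data (tco_min) must not arrive until th after
    # the CURRENT SRAM clock edge, which arrives no later than shift + skew_max.
    hi = tco_min - th - skew_max
    return lo, hi


# 200 MHz target; tsu/th from the SSRAM datasheet figures in the post.
# HYPOTHETICAL: tco_min, tco_max, skew_min, skew_max are made-up examples.
lo, hi = sram_clock_window(period_ns=5.0, tco_min=1.0, tco_max=3.0,
                           tsu=1.4, th=0.4, skew_min=-0.2, skew_max=0.2)
if lo <= hi:
    print(f"nominal shift may lie between {lo:.2f} and {hi:.2f} ns")
else:
    print("no valid clock position for these numbers")
```

The read direction adds its own bounds (SRAM clock-to-output plus FPGA input setup/hold), which narrow the window further; that is the setup-time trade-off described above.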
There can also be other concerns like if the nets are long you'll get
ringing which distorts the waveforms which basically means that you'll
need to wait longer for things to stabilize which cuts into the
allowable timing. At 100 MHz, just 1 ns is 10% of the clock cycle
budget. Whether or not that's an issue you'll need to
determine with a scope.
> Phase-shift the SSRAM clock?
Yes
> Specify it as a timing constraint?
Yes
> Do timing constraints influence the output buffer or are they purely for
> checking?
For the most part it's just checking, although it can affect place and
route as well. I haven't seen a case where it affected the output
buffer itself (i.e. kicked up or down the drive strength) in order to
meet a constraint. This is most likely because drive strength
considerations have a much larger impact than just timing. It does go
the other way though, as you fiddle with drive strength the software
should take this into account when it does the timing analysis.
Since it appears from your constraint that you're using Quartus, you
might want to put in numbers that are representative of the correct
capacitive loading as well as checking that the I/O drive strengths
are appropriate and not just the defaults (unless you've already done
this).
KJ
Reply by Tommy Thorn●September 26, 2008
I wrote a little controller + tester app for the SSRAM on Terasic's
DE2-70 which is rated for 200 MHz. I have gotten it working @ 170 MHz,
but I'm a little uneasy about the SSRAM clock. Being a non-EE I
suspect I'm missing something fundamental here.
First, all outputs are fully registered (and constrained to guarantee
they stay registered). The main logic is clocked by a PLL. A second
but identically configured output on this PLL drives the SSRAM. The
SSRAM datasheet lists the setup and hold times for the inputs as 1.4
ns / 0.4 ns respectively. What is the correct way to achieve this?
Phase-shift the SSRAM clock? Specify it as a timing constraint? Do
timing constraints influence the output buffer or are they purely for
checking?
Keeping the clock in phase with the main clock led to errors showing
up (once beyond 100 MHz), but shifting it a few degrees made it
perfectly stable again. However, what is the appropriate engineering
approach to this source synchronous problem?
Any help would be much appreciated.
Thanks,
Tommy
FWIW, these are the constraints I'm currently using:
# timing constraints for SSRAM
set_instance_assignment -name FAST_OUTPUT_REGISTER ON -to oSRAM*
set_instance_assignment -name FAST_OUTPUT_REGISTER ON -to SRAM_*
set_instance_assignment -name FAST_OUTPUT_ENABLE_REGISTER ON -to SRAM_*
set_instance_assignment -name TCO_REQUIREMENT "3 ns" -to oSRAM*
set_instance_assignment -name TCO_REQUIREMENT "3 ns" -to SRAM*
set_instance_assignment -name TSU_REQUIREMENT "2.2 ns" -to SRAM*
# other default timings
set_global_assignment -name TSU_REQUIREMENT "5 ns"
set_global_assignment -name TCO_REQUIREMENT "10 ns"