Hello--

What is the best standard practice to have a data bus cross a clock domain by implementing a data freeze?

There is an extremely brief description of the data freeze given here: http://www.fpga4fun.com/CrossClockDomain4.html

What is the best way to "freeze" the data in the source clock domain?

I have a 108-bit bus which needs to cross between a high-speed clock domain (280MHz) and a clock domain operated at a lower speed (70MHz).

I am using Verilog as the HDL and my FPGA is a Cyclone II.

Nicholas
______________________________
On Jan 6, 3:11 pm, Nicholas Kinar <n.ki...@usask.ca> wrote:
> Hello--
>
> What is the best standard practice to have a data bus cross a clock
> domain by implementing a data freeze?
>
> There is an extremely brief description of the data freeze given here:
> http://www.fpga4fun.com/CrossClockDomain4.html
>
> What is the best way to "freeze" the data in the source clock domain?
>
> I have a 108-bit bus which needs to cross between a high-speed clock
> domain (280MHz) and a clock domain operated at a lower speed (70MHz).
>
> I am using Verilog as the HDL and my FPGA is a Cyclone II.
>
> Nicholas

The "flag" is the important item in that description. If you never have more than one data value to transfer within any 8 (or so) high-speed clock cycles, you can get by with transferring one value at a time. If you have bursts of data, you need a FIFO, but the average rate cannot be greater than one in four high-speed clocks. The FIFO would need to be sized such that the longest burst could always be drained.

When you have new data in the fast domain, toggle a single bit. Read (register) that single bit in the slow domain. If the bit has changed, load the data on the next cycle. You keep track of whether the bit has changed with a simple clock delay of that bit in the slow domain.

Why not just load the data on the same clock the bit changes, using an XOR of the fast and slow flag bits for an enable? If the clocks aren't synchronous with guaranteed setup and hold, the enable may get to some bits on one side of the clock transition and to other bits on the opposite side, resulting in "half" transferred data.

I mentioned earlier to "toggle" a single bit in the fast domain. This eliminates the need to have a reset handshake back from the slow domain; it's only when the bit changes that a write occurs. This points out that you can't have the bit toggle twice within one slow-domain clock cycle, or the change won't be seen and data will be lost. Also, since there's a full slow clock cycle between registering the bit and performing the data load, the data has to remain static for that duration.

If you need more help than descriptions, write again. I love to see people think through the issue and understand why they write the code they do.

- John
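
For readers who want to see the toggle-flag description above in code, here is a minimal sketch. It is only one possible reading of the description, not John's own code: the module name, port names, the new_data strobe and the hold register are all hypothetical, and it assumes new_data is pulsed no more often than once every two slow clocks (8 fast clocks at 280MHz/70MHz) so that hold stays static until it has been read.

```verilog
// Sketch of a toggle-flag transfer from the fast domain to the slow domain.
// All module, port, and signal names are hypothetical illustrations.
module toggle_flag_cdc #(
    parameter WIDTH = 108
) (
    input  wire             fast_clock,
    input  wire             slow_clock,
    input  wire             new_data,    // one-cycle strobe in the fast domain
    input  wire [WIDTH-1:0] fast_data,
    output reg  [WIDTH-1:0] slow_data,
    output reg              slow_valid   // one-cycle strobe in the slow domain
);
    reg             flag    = 1'b0;      // toggles once per transferred word
    reg [WIDTH-1:0] hold;                // kept static while the flag crosses
    reg             flag_q  = 1'b0;      // flag registered in the slow domain
    reg             flag_qq = 1'b0;      // flag_q delayed by one slow clock

    // Fast domain: capture the word and toggle the flag together.
    always @(posedge fast_clock)
        if (new_data) begin
            hold <= fast_data;
            flag <= ~flag;
        end

    // Slow domain: register the flag, compare it with its delayed copy,
    // and load the data on the cycle after the change was registered.
    always @(posedge slow_clock) begin
        flag_q     <= flag;
        flag_qq    <= flag_q;
        slow_valid <= flag_q ^ flag_qq;
        if (flag_q ^ flag_qq)
            slow_data <= hold;
    end
endmodule
```

The enable is derived only from slow-domain registers (flag_q and flag_qq), never directly from flag, which is the point of the "half transferred data" warning above. A second synchronizing register on the flag could be added for more metastability margin, at the cost of holding the data static for one more slow clock.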
On Wed, 06 Jan 2010 14:11:50 -0600, Nicholas Kinar wrote:
>What is the best way to "freeze" the data in the source clock domain?
>
>I have a 108-bit bus which needs to cross between a high-speed clock
>domain (280MHz) and a clock domain operated at a lower speed (70MHz).
If you are certain that the source clock is more than 2x faster
than your target clock, I think it's rather straightforward.
Create a divide-by-2 signal in the target domain. No
reset is required, because the phase of the divide-by-2
is irrelevant; only its changes are of interest. So
we use a Verilog model that doesn't need a reset in
simulation either:

  always @(posedge slow_clock)
    if (slow_flag == 1'b1)
      slow_flag <= 0;
    else
      slow_flag <= 1;

In the source domain, put the new data in your freeze
register as soon as you detect a change on slow_flag,
taking care to resynchronize slow_flag to avoid the risk
of input hazards. Again no reset is required; it'll
sort itself out within three clock cycles.

  always @(posedge fast_clock) begin
    resync_slow_flag <= slow_flag;
    old_slow_flag <= resync_slow_flag;
    if (old_slow_flag != resync_slow_flag) begin
      freeze_register <= source_data;
      // Do whatever it takes to indicate that
      // source_data has been consumed, and make
      // the next source_data available no more
      // than 2 fast clocks later
    end
  end

And finally, capture freeze_register on every slow_clock:

  always @(posedge slow_clock)
    useful_data <= freeze_register;

In this way you can get a new data value on every slow_clock.
You can carry "data valid" information along with the data
itself, if you don't have a new data item soon enough for
every slow clock.
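
As an illustration of carrying "data valid" along with the data, here is a minimal sketch that puts the three fragments above into one module and adds a valid bit. The module name and the source_valid, source_taken, freeze_valid and useful_valid signals are hypothetical additions, not part of the code above, and the sketch assumes the source holds source_data and source_valid steady until source_taken pulses.

```verilog
// Sketch: the data freeze with a valid bit carried beside the data.
// source_valid, source_taken, freeze_valid and useful_valid are
// hypothetical names introduced for this illustration.
module freeze_with_valid (
    input  wire         fast_clock,
    input  wire         slow_clock,
    input  wire [107:0] source_data,
    input  wire         source_valid,  // high when source_data is a new item
    output wire         source_taken,  // high for the fast clock that captures
    output reg  [107:0] useful_data,
    output reg          useful_valid
);
    reg         slow_flag;
    reg         resync_slow_flag;
    reg         old_slow_flag;
    reg [107:0] freeze_register;
    reg         freeze_valid;

    // Divide-by-2 in the target domain; only its changes matter.
    always @(posedge slow_clock)
        if (slow_flag == 1'b1)
            slow_flag <= 1'b0;
        else
            slow_flag <= 1'b1;

    // A change on the resynchronized flag marks the capture cycle.
    assign source_taken = (old_slow_flag != resync_slow_flag);

    always @(posedge fast_clock) begin
        resync_slow_flag <= slow_flag;
        old_slow_flag    <= resync_slow_flag;
        if (source_taken) begin
            freeze_register <= source_data;
            freeze_valid    <= source_valid;  // travels with the data
        end
    end

    // Capture data and its valid bit together on every slow clock.
    always @(posedge slow_clock) begin
        useful_data  <= freeze_register;
        useful_valid <= freeze_valid;
    end
endmodule
```

Because freeze_valid is written on the same fast clock as freeze_register and read on the same slow clock as useful_data, it is subject to the same timing relationship as the data bits and needs no separate synchronizer.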
Draw lots of timing diagrams, and do lots of worst-case
analysis, to convince yourself whether this very simple
approach is robust in your situation. I believe that
it works reliably provided the clock periods obey the
following relationship:
slow_period >= (2*fast_period) + Tss + Tpf + Tsf + Tps
where Tss is the setup time of the useful_data register,
Tsf is the setup time of the resync_slow_flag register,
Tpf is the propagation delay (including routing) from
fast clock to the freeze_register data becoming available
at the useful_data register's input, and Tps is the
propagation delay from slow clock to slow_flag becoming
available at the input to resync_slow_flag.
Note that Tss+Tpf and Tsf+Tps are both pretty much equal
to the shortest clock period that the FPGA can usefully
use, since they are both simple register-to-register
paths with no intervening logic. So, as a first
estimate, you could say
slow_period >= (2*fast_period) + (2/Fmax)
where Fmax is the FPGA's fastest useful clock speed.
But you'll need to apply timing constraints and check
the static timing results to be sure that you are safe.
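
As a rough worked example of that first estimate, using only the 280MHz and 70MHz figures from the original question (the result is an illustration of the inequality, not a measured property of any device):

```latex
\begin{align*}
T_{\text{fast}} &= \frac{1}{280\ \text{MHz}} \approx 3.57\ \text{ns},
\qquad
T_{\text{slow}} = \frac{1}{70\ \text{MHz}} = 4\,T_{\text{fast}} \approx 14.29\ \text{ns} \\
T_{\text{slow}} - 2\,T_{\text{fast}} &= 2\,T_{\text{fast}} \approx 7.14\ \text{ns} \\
\frac{2}{F_{\max}} \le 7.14\ \text{ns}
&\;\Longrightarrow\;
F_{\max} \ge 280\ \text{MHz}
\end{align*}
```

In other words, with an exact 4:1 ratio the estimate is satisfied only if each of the two crossing paths (slow_flag to resync_slow_flag, and freeze_register to useful_data) fits within about one fast clock period, so the margin depends entirely on what static timing reports for those paths.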
Whatever you do, please double-check my assumptions
for yourself before doing anything upon which your
life, fortune or good name depends. Clock domain
crossings have been the undoing of many.
See also the "Flancter", and standard asynchronous FIFOs
in the FPGA macrocell library (although they will be much
more resource-hungry than the simple freeze, because they
must work for all combinations of source and target
clock frequency).
--
Jonathan Bromley
Hello John--

Thank you for your response!

> The "flag" is the important item in that description. If you never
> have more than one data value to transfer within any 8 (or so) high-
> speed clock cycles you can get by with transferring one value at a
> time. If you have bursts of data, you need a FIFO but the average
> speed cannot be greater than one in four high-speed clocks. The FIFO
> would need to be sized such that the longest burst could always be
> drained.

Essentially what I have is a 108-bit register which holds samples from six 18-bit ADCs. Once the register is full of data, I bring high an "offload_flag" signal which is read in the lower-speed clock domain.

Once the "offload_flag" signal goes high, the 108-bit register is copied into another register in the slow clock domain. Then logic in the lower-speed clock domain brings high an "rs_offload_flag" signal, which is read in the high speed clock domain. When the "rs_offload_flag" signal goes high, logic in the high speed clock domain then brings low the "offload_flag" signal.

This code fails timing analysis. There is no more than one data value to transfer within 8 high speed clock cycles.

Perhaps my problem is that I need to use a synchronizer to bring the "offload_flag" signal and the "rs_offload_flag" signal between clock domains?

>
> When you have new data in the fast domain, toggle a single bit. Read
> (register) that single bit in the slow domain. If the bit has
> changed, load the data on the next cycle. You keep track of whether
> the bit has changed with a simple clock delay of that bit in the slow
> domain.

So I would have to keep track of the state of the single bit in the slow domain? Would this involve having a register that holds the previous value of the bit? Every clock cycle, the register would be monitored for a change. If there is a transition in the bit, then the register is read.

Then my register would have 109 bits = 108 bits data + 1 bit for transfer.

>
> Why not just load the data on the same clock the bit changes, using an
> XOR of the fast and slow flag bits for an enable? If the clocks
> aren't synchronous with guaranteed setup and hold, the enable may get
> to some bits on one side of the clock transition, other bits on the
> opposite side resulting in "half" transferred data.

What is the difference between the "fast" and "slow" flag bits? Do you mean that there are two flag bits?

>
> I mentioned earlier to "toggle" a single bit in the fast domain. This
> eliminates the need to have a reset handshake back from the slow
> domain; it's only when the bit changes that a write occurs. This
> points out that you can't have the bit toggle twice within one slow-
> domain clock cycle or the change won't be seen and data lost. Also,
> since there's a full slow clock cycle between registering the bit and
> performing the data load, the data has to remain static for that
> duration.

I think that I understand how to do this. The state of the single bit (say data[0] in a 108-bit data word) is examined for a change in the slow clock domain. If the bit changes, then it is time to read the data. The 108-bit register is simply copied into another register that is in the slow clock domain.

>
> If you need more help than descriptions, write again. I love to see
> people think through the issue and understand why they write the code
> they do.
>

Yes - it's far easier to write the code yourself than struggle through understanding lines and lines of code that has been written by someone else.

Nicholas
______________________________
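
For comparison with the toggle approach, the offload_flag / rs_offload_flag handshake described in this message might be sketched as below, with a two-register synchronizer on each flag. This is not the poster's actual code: the module name, register_full, adc_register, slow_register and the two sync registers are hypothetical, and the sketch assumes adc_register is held static while offload_flag is high and that register_full does not pulse again until the handshake has completed.

```verilog
// Sketch of the request/acknowledge handshake described above, with a
// two-register synchronizer on each flag. Names other than offload_flag
// and rs_offload_flag are hypothetical.
module offload_handshake (
    input  wire         fast_clock,
    input  wire         slow_clock,
    input  wire         register_full,   // strobe: all six samples are in
    input  wire [107:0] adc_register,    // must stay static during offload
    output reg  [107:0] slow_register
);
    reg       offload_flag    = 1'b0;    // request, driven by the fast domain
    reg       rs_offload_flag = 1'b0;    // acknowledge, driven by the slow domain
    reg [1:0] offload_sync    = 2'b00;   // offload_flag as seen by slow_clock
    reg [1:0] rs_sync         = 2'b00;   // rs_offload_flag as seen by fast_clock

    // Fast domain: raise the request, drop it once the acknowledge is seen.
    always @(posedge fast_clock) begin
        rs_sync <= {rs_sync[0], rs_offload_flag};
        if (rs_sync[1])
            offload_flag <= 1'b0;
        else if (register_full)
            offload_flag <= 1'b1;
    end

    // Slow domain: copy the data once per request, then acknowledge.
    always @(posedge slow_clock) begin
        offload_sync    <= {offload_sync[0], offload_flag};
        rs_offload_flag <= offload_sync[1];
        if (offload_sync[1] && !rs_offload_flag)
            slow_register <= adc_register;
    end
endmodule
```

Each transfer here needs the request to cross one way and the acknowledge to cross back, twice over, so a full handshake takes several slow clock periods. That round trip is what the toggle scheme avoids.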
Hello Jonathan--

>> In the source domain, put the new data in your freeze
>> register as soon as you detect a change on slow_flag,
>> taking care to resynchronize slow_flag to avoid the risk
>> of input hazards. Again no reset is required; it'll
>> sort itself out within three clock cycles.
>>
>> always @(posedge fast_clock) begin
>> resync_slow_flag <= slow_flag;
>> old_slow_flag <= resync_slow_flag;
>> if (old_slow_flag != resync_slow_flag) begin
>> freeze_register <= source_data;
>> // Do whatever it takes to indicate that
>> // source_data has been consumed, and make
>> // the next source_data available no more
>> // than 2 fast clocks later
>> end
>> end

What I don't understand is the assignment logic at the top of this always block. Isn't "old_slow_flag" equal to "resync_slow_flag"? What is the best way to detect a change on "slow_flag"?

Once again, thank you so much for your help.

Nicholas

(and I am sorry for misspelling your name in previous posts...)
______________________________
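
On the question above: both assignments are nonblocking (<=), so their right-hand sides are sampled before either register updates. old_slow_flag therefore receives the value resync_slow_flag had before the clock edge, and the two registers differ for exactly one fast clock after each change on slow_flag. A small simulation-only sketch of this behaviour follows; the testbench name, the clock period and the stimulus times are arbitrary assumptions for illustration.

```verilog
`timescale 1ns/1ps
// Simulation-only sketch: shows that old_slow_flag lags resync_slow_flag
// by one fast clock, so the comparison is true once per slow_flag change.
// The clock period and stimulus times are arbitrary.
module edge_detect_demo;
    reg fast_clock       = 1'b0;
    reg slow_flag        = 1'b0;
    reg resync_slow_flag = 1'b0;
    reg old_slow_flag    = 1'b0;

    always #2 fast_clock = ~fast_clock;

    always @(posedge fast_clock) begin
        // Nonblocking: both right-hand sides use the pre-edge values.
        resync_slow_flag <= slow_flag;
        old_slow_flag    <= resync_slow_flag;
        if (old_slow_flag != resync_slow_flag)
            $display("t=%0t: change seen (old=%b, resync=%b)",
                     $time, old_slow_flag, resync_slow_flag);
    end

    initial begin
        #9  slow_flag = 1'b1;   // first change
        #16 slow_flag = 1'b0;   // second change
        #16 $finish;
    end
endmodule
```

Running this in a Verilog simulator should report one message for each change on slow_flag, a couple of fast clocks after the change itself.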
>
> What I don't understand is the assignment logic at the top of this
> always block. Isn't "old_slow_flag" equal to "resync_slow_flag"?
>
To help me better understand this, I've re-written the code:

  reg old_slow_flag;
  reg slow_flag;

  always @(posedge fast_clock) begin
    if (old_slow_flag != slow_flag) begin
      old_slow_flag <= slow_flag;
      freeze_register <= source_data;
      // Do whatever it takes to indicate that
      // source_data has been consumed, and make
      // the next source_data available no more
      // than 2 fast clocks later
    end
  end

______________________________
>
> If you are certain that the source clock is more than 2x faster
> than your target clock, I think it's rather straightforward.
>

The examples are very helpful. Thank you for posting terse snippets of code (rather than a large entire example program with lines and lines of code).

Nicholas
______________________________
>
> Draw lots of timing diagrams, and do lots of worst-case
> analysis, to convince yourself whether this very simple
> approach is robust in your situation. I believe that
> it works reliably provided the clock periods obey the
> following relationship:
>
> slow_period >= (2*fast_period) + Tss + Tpf + Tsf + Tps
>

Yes - I just tried this using the Quartus II synthesis tools. Many thanks for posting this procedure! I've verified that the solution you propose passes timing analysis in Quartus II for my particular design.

>
> slow_period >= (2*fast_period) + (2/Fmax)
>
> where Fmax is the FPGA's fastest useful clock speed.
> But you'll need to apply timing constraints and check
> the static timing results to be sure that you are safe.
>
> Whatever you do, please double-check my assumptions
> for yourself before doing anything upon which your
> life, fortune or good name depends. Clock domain
> crossings have been the undoing of many.
>

To me, this equation seems to be very reasonable. The procedure also works well.

> See also the "Flancter", and standard asynchronous FIFOs
> in the FPGA macrocell library (although they will be much
> more resource-hungry than the simple freeze, because they
> must work for all combinations of source and target
> clock frequency).

Agreed. Thanks again for your help, Jonathan!
______________________________
>
> Perhaps my problem is that I need to use a synchronizer to bring the
> "offload_flag" signal and the "rs_offload_flag" signal between clock
> domains?
>

No, this is not the problem. Although a synchronizer is indeed useful for these signals, it is not the cause of the failing timing analysis.

>>
>> When you have new data in the fast domain, toggle a single bit. Read
>> (register) that single bit in the slow domain. If the bit has
>> changed, load the data on the next cycle. You keep track of whether
>> the bit has changed with a simple clock delay of that bit in the slow
>> domain.
>

Toggling a single bit is indeed a solution. Thank you for suggesting this.
______________________________