FPGARelated.com
Forums

Up-counter with async load/clear and overflow detection (Verilog)

Started by Philip Pemberton October 1, 2009
Summary

A developer is designing a data acquisition system to measure intervals between pulses from a rotating magnetic disc and store them in external SRAM. The discussion focuses on resolving data corruption issues that occur when a microcontroller attempts to read the SRAM while acquisition is in progress.

The consensus is that the design suffers from asynchronous logic errors across multiple clock domains. Experts recommend transitioning to a strictly synchronous design and implementing a FIFO-style memory controller to manage the high-capacity external SRAM effectively.

  • Adopt a strictly synchronous design by using a single master clock and treating external pulses as clock enables rather than independent clocks.
  • Utilize a FIFO architecture for SRAM management, potentially cascading small internal BlockRAM FIFOs with larger external memory.
  • Synchronize all external signals and asynchronous inputs into the FPGA's primary clock domain to avoid race conditions and metastability.
  • Account for the large storage requirements of high-frequency pulse timing by correctly calculating SRAM depth based on minimum interval times.
Tags: Verilog, Synchronous Design, FPGA Memory Interface, FIFO Design

Hi guys,
This is most likely a problem with a really simple solution, but I've 
spent the past two hours hacking away at it and gotten nowhere...

I'm designing some data acquisition hardware that (basically) measures 
the time between a bunch of pulses, stores them in an SRAM chip, then a 
microcontroller reads the acquired data back later on. If the MCU tries 
to read from memory while an acquisition is in progress, it reads 
garbage. The MCU communicates through a series of registers in the FPGA.

My plan was something along these lines:
  - Memory addresses generated by an 18-bit binary counter
  - MCU data bus is 8 bits wide
  - A L->H edge on "LOAD_L" sets ADDRESS[7:0] to the value on the data 
bus and resets the FULL flag.
  - A L->H edge on "LOAD_H" sets ADDRESS[15:8] to the value on the data 
bus and resets the FULL flag.
  - A L->H edge on "LOAD_U" sets ADDRESS[17:16] to the value on the data 
bus and resets the FULL flag.
  - A L->H edge on "RESET" clears the counter to 0 and resets the FULL 
flag.
  - A L->H edge on "INCREMENT" increments the address. If the address 
counter wraps around, the FULL flag is set.

This is the code I've got now:
~~~~
module AddressCounter(ADDR,INCREMENT,EMPTY,FULL,RESET,DATA,LOAD_U, 
LOAD_H, LOAD_L);

	// Current address
	output reg [17:0] ADDR;

	/// Empty/Full status
	// EMPTY is 1 whenever ADDR == 0.
	// FULL is 1 if the last INCREMENT caused the address counter to roll over.
	output		EMPTY;
	output reg	FULL;

	/// Control inputs
	// INCREMENT: L->H edge causes address to increment
	input		INCREMENT;
	// RESET: L->H edge causes ADDR and the FULL flag  to be cleared.
	input		RESET;
	// LOAD_[UHL]: L->H edge loads contents of DATA into the upper, high or
	//				low byte of the counter register.
	input		LOAD_U, LOAD_H, LOAD_L;
	// DATA: Data that is loaded in by LOAD_[UHL].
	input [7:0]	DATA;
	
	/// EMPTY output logic
	assign EMPTY = (ADDR == 0);
	
	/// Counting logic
	always @(posedge INCREMENT or posedge RESET or posedge LOAD_U or 
posedge LOAD_H or posedge LOAD_L) begin
		if (RESET) begin
			// Reset -- clear ADDR to 0 and clear FULL flag
			ADDR <= 0;
			FULL <= 0;
		end else begin
			if (LOAD_L) begin
				// Load Low Byte
				ADDR[7:0] <= DATA;
				FULL <= 0;
			end else if (LOAD_H) begin
				// Load High Byte
				ADDR[15:8] <= DATA;
				FULL <= 0;
			end else if (LOAD_U) begin
				// Load Upper Byte
				ADDR[17:16] <= DATA[1:0];
				FULL <= 0;
			end else begin
				// Not a load, must be an increment.
				{FULL, ADDR} <= ADDR + 1'b1;
			end
		end
	end
endmodule
~~~~

The problem is, this seems to upset Quartus:

Warning: Presettable and clearable registers converted to equivalent 
circuits with latches. Registers power up to an undefined state, and 
DEVCLRn places the registers in an undefined state.

If I expand this, there are warnings for "ADDR[0]~reg0" through 
"ADDR[17]~reg0":

Warning (13310): Register "ADDR[0]~reg0" is converted into an equivalent 
circuit using register "ADDR[0]~reg0_emulated" and latch 
"ADDR[0]~reg0latch"


What I'd like to know is, is there a better way to do what I want to do? 
By that I mean, one that doesn't infer latches, or perhaps a cleaner way 
to achieve what I want (I'd certainly be interested in alternative 
implementations of the wrap-around detection)?

Is it really that bad to have a latch-inference in this situation?

Thanks,
-- 
Phil.
usenet09@philpem.me.uk
http://www.philpem.me.uk/
If mail bounces, replace "09" with the last two digits of the current 
year.
Philip Pemberton <usenet09@philpem.me.uk> wrote:

< This is most likely a problem with a really simple solution, but I've 
< spent the past two hours hacking away at it and gotten nowhere...
 
< I'm designing some data acquisition hardware that (basically) measures 
< the time between a bunch of pulses, stores them in an SRAM chip, then a 
< microcontroller reads the acquired data back later on. If the MCU tries 
< to read from memory while an acquisition is in progress, it reads 
< garbage. The MCU communicates through a series of registers in the FPGA.

It sounds like you need a FIFO, and are attempting to make
one out of the SRAM.  The FPGA tools usually know how to make a FIFO,
at least when using the FPGA BRAM (block RAM).  Also, the BRAM
are dual port, which makes FIFO design easier.  

-- glen
Read up on synchronous design. You need to do everything on a clock
edge (i.e. posedge clk), and maybe reset (if it is an asynchronous
reset), but not on edges of other signals.

Andy
On Thu, 01 Oct 2009 19:31:22 +0000, glen herrmannsfeldt wrote:

> It sounds like you need a FIFO, and are attempting to make one out of
> the SRAM. The FPGA tools usually know how to make a FIFO, at least when
> using the FPGA BRAM (block RAM). Also, the BRAM are dual port, which
> makes FIFO design easier.

While a FIFO would be easier, it probably wouldn't have the storage 
capacity required.

Say you have a stream of pulses like this:
    ___      ____      ___
___|   |____|    |____|   |____   etc.
   :        :         :
   : t1     : t2      :

I need to record the "t" intervals, i.e. the time between rising edges. 
The issue is, these intervals are around 3us (min 2us, max 4us or 
thereabouts). They come from a rotating magnetic disc, which spins at 
about 300RPM. 300RPM is 5 revolutions per second, or 200ms per 
revolution. Assuming the intervals are all at the minimum point (2us), 
that's 200,000 timing values.

The reference clock is 40MHz, meaning each 2us interval would cause a 
count of 80 to be stored; a 4us interval would store 160. So for nominal 
conditions, a 256 kilobyte SRAM is required.

I was under the impression that BlockRAM on Altera parts topped out at 
about 64K on the lower-end Cyclones...

At this point, measuring the intervals isn't an issue -- I have more-or-
less working Verilog code for that. What I need to get working is the RAM 
interface stuff...

Thanks,
-- 
Phil.
usenet09@philpem.me.uk
http://www.philpem.me.uk/
If mail bounces, replace "09" with the last two digits of the current 
year.
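The sizing arithmetic in the post can be checked with a small simulation-only sketch. The module name `SramSizing` and the parameter names are hypothetical; the numeric inputs are the figures quoted above, and storing one byte per sample is an assumption (the largest count, 160, fits in 8 bits). With those inputs, 200 ms divided by 2 us gives at most 100,000 values per revolution, so the 200,000 figure appears to correspond to two revolutions' worth of data, and an 18-bit address space has 262,144 locations.

~~~~
module SramSizing;
	// Figures quoted in the post
	localparam integer REV_PERIOD_US   = 200000; // 300 RPM = 5 rev/s = 200 ms per revolution
	localparam integer MIN_INTERVAL_US = 2;
	localparam integer MAX_INTERVAL_US = 4;
	localparam integer REF_CLK_MHZ     = 40;
	localparam integer ADDR_BITS       = 18;     // width of the address counter

	// Derived figures
	localparam integer MAX_SAMPLES_PER_REV = REV_PERIOD_US / MIN_INTERVAL_US; // 100000
	localparam integer MIN_COUNT           = MIN_INTERVAL_US * REF_CLK_MHZ;   // 80
	localparam integer MAX_COUNT           = MAX_INTERVAL_US * REF_CLK_MHZ;   // 160
	localparam integer SRAM_LOCATIONS      = 1 << ADDR_BITS;                  // 262144

	initial begin
		$display("worst-case samples per revolution: %0d", MAX_SAMPLES_PER_REV);
		$display("count range per sample:            %0d..%0d", MIN_COUNT, MAX_COUNT);
		$display("locations in an 18-bit SRAM:       %0d", SRAM_LOCATIONS);
	end
endmodule
~~~~

Either way the conclusion in the post holds: the data set is larger than the roughly 64K of BlockRAM mentioned, so external SRAM is still needed.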
On Oct 1, 1:22 pm, Philip Pemberton <usene...@philpem.me.uk> wrote:
> On Thu, 01 Oct 2009 19:31:22 +0000, glen herrmannsfeldt wrote:
>
> > It sounds like you need a FIFO, and are attempting to make one out of
> > the SRAM. The FPGA tools usually know how to make a FIFO, at least when
> > using the FPGA BRAM (block RAM). Also, the BRAM are dual port, which
> > makes FIFO design easier.
>
> While a FIFO would be easier, it probably wouldn't have the storage
> capacity required.
>
> Say you have a stream of pulses like this:
>     ___      ____      ___
> ___|   |____|    |____|   |____   etc.
>    :        :         :
>    : t1     : t2      :
>
> I need to record the "t" intervals, i.e. the time between rising edges.
> The issue is, these intervals are around 3us (min 2us, max 4us or
> thereabouts). They come from a rotating magnetic disc, which spins at
> about 300RPM. 300RPM is 5 revolutions per second, or 200ms per
> revolution. Assuming the intervals are all at the minimum point (2us),
> that's 200,000 timing values.
>
> The reference clock is 40MHz, meaning each 2us interval would cause a
> count of 80 to be stored; a 4us interval would store 160. So for nominal
> conditions, a 256 kilobyte SRAM is required.
>
> I was under the impression that BlockRAM on Altera parts topped out at
> about 64K on the lower-end Cyclones...
>
> At this point, measuring the intervals isn't an issue -- I have more-or-
> less working Verilog code for that. What I need to get working is the RAM
> interface stuff...
>
> Thanks,
> --
> Phil.
> usene...@philpem.me.uk
> http://www.philpem.me.uk/
> If mail bounces, replace "09" with the last two digits of the current
> year.

As Andy pointed out, you need to think about how a flip-flop works. It 
has one clock input, yet you're trying to feed multiple control edges 
into it.

Is your time base ("INCREMENT") a free-running clock? If so, make your 
load signals operate in that clock domain.

Maybe have a higher-speed clock than your increment signal that is the 
master clock, and have everything in that domain:

    always @(posedge fast_clk)
        if (load_u)
            ....
        else if (load_l)
            ...
        else if (increment)
            ....

Make sure all the load/increment signals are pulses in the fast_clk 
domain.

John Providenza
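A minimal sketch of the single-clock structure described above, applied to the original AddressCounter. It assumes a free-running master clock (called `fast_clk`, as in the post) and that RESET, LOAD_U/H/L and INCREMENT have already been synchronised to that clock and reduced to one-clock-wide pulses. The module name `AddressCounterSync` and the priority order among the control inputs are assumptions, not something stated in the thread.

~~~~
module AddressCounterSync(
	input             fast_clk,               // assumed free-running master clock
	input             RESET,                  // one-clock pulse, synchronous to fast_clk
	input             LOAD_U, LOAD_H, LOAD_L, // one-clock pulses
	input             INCREMENT,              // one-clock pulse
	input      [7:0]  DATA,
	output reg [17:0] ADDR,
	output            EMPTY,
	output reg        FULL
);
	// EMPTY is 1 whenever ADDR == 0, as in the original module
	assign EMPTY = (ADDR == 18'd0);

	// The only edge in the sensitivity list is the clock; every
	// control input is sampled as an ordinary enable.
	always @(posedge fast_clk) begin
		if (RESET) begin
			ADDR <= 18'd0;
			FULL <= 1'b0;
		end else if (LOAD_L) begin
			ADDR[7:0] <= DATA;
			FULL <= 1'b0;
		end else if (LOAD_H) begin
			ADDR[15:8] <= DATA;
			FULL <= 1'b0;
		end else if (LOAD_U) begin
			ADDR[17:16] <= DATA[1:0];
			FULL <= 1'b0;
		end else if (INCREMENT) begin
			// 19-bit sum: the carry out of bit 17 lands in FULL, so FULL
			// is 1 only if this increment wrapped the counter.
			{FULL, ADDR} <= {1'b0, ADDR} + 19'd1;
		end
	end
endmodule
~~~~

Because every register now has a single clock and no asynchronous load, the synthesiser should have no reason to emulate presettable and clearable registers with latches.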
On Thu, 01 Oct 2009 14:17:34 -0700, johnp wrote:

> As Andy pointed out, you need to think about how a flip-flop works. It
> has one clock input, yet you're trying to feed multiple control edges
> into them.

OK.

> Is your time base ("INCREMENT") a free-running clock? If so, make your
> load signals operate in that clock domain.

The increment signal is generated when an edge appears on the input data 
line -- one of the pulses I mentioned in my previous posting. Basically:

    module top(CLK40MHZ, DATA, RAM_ADDRESS, ...);
        (...)
        wire INCREMENT = DATA;
        (...)
    endmodule

Like I said, this thing talks to a disc drive. The 40MHz clock is used to 
derive other clocks (e.g. 20MHz, 10MHz acq clocks, the 500Hz head 
stepping pulses), and also as the reference for measuring the time 
between the data pulses.

I think you and Andy are both right here -- I'm going to do some reading 
up. My "working" read code only worked on testbench because it wasn't 
simulating the effect of DATA_IN being on a separate clock domain... It 
actually *doesn't* work on physical hardware.

Although I'm surprised I never thought of pulling INCREMENT in and using 
it (effectively) as a clock enable... Maybe I've been taking too many 
examples from old Xilinx manuals...

Thanks,
-- 
Phil.
usenet09@philpem.me.uk
http://www.philpem.me.uk/
If mail bounces, replace "09" with the last two digits of the current 
year.
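Using INCREMENT as a clock enable means first bringing the asynchronous DATA line into the 40MHz domain. A common way to do that is a two-flop synchroniser followed by a rising-edge detector, sketched below. The module name `EdgeToPulse` and its port names are hypothetical, and the sketch assumes each input pulse is high for longer than one clock period (25ns at 40MHz); a narrower pulse could be missed.

~~~~
module EdgeToPulse(
	input  clk,        // master clock, e.g. the 40MHz reference
	input  async_in,   // asynchronous input, e.g. the drive's DATA line
	output rise_pulse  // high for one clk cycle per rising edge of async_in
);
	// sync[0] and sync[1] form the two-flop synchroniser;
	// sync[2] holds the synchronised value from one clock earlier.
	reg [2:0] sync = 3'b000;

	always @(posedge clk)
		sync <= {sync[1:0], async_in};

	// Rising edge: synchronised input is high now and was low last cycle
	assign rise_pulse = sync[1] & ~sync[2];
endmodule
~~~~

In the top level this would replace `wire INCREMENT = DATA;` with an instance such as `EdgeToPulse u_inc(.clk(CLK40MHZ), .async_in(DATA), .rise_pulse(INCREMENT));`. The same block could also turn the MCU's LOAD and RESET strobes into one-clock pulses.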
On Oct 1, 3:45 pm, Philip Pemberton <usene...@philpem.me.uk> wrote:
>
> Although I'm surprised I never thought of pulling INCREMENT in and using
> it (effectively) as a clock enable... Maybe I've been taking too many
> examples from old Xilinx manuals...
>
> Thanks,
> --
> Phil.

Phil, please do not blame your bad habits on Xilinx, always a strong 
advocate of synchronous design methods.

Peter Alfke, formerly associated with Xilinx.
Peter Alfke <alfke@sbcglobal.net> wrote:
< On Oct 1, 3:45 pm, Philip Pemberton <usene...@philpem.me.uk> wrote:

<> Although I'm surprised I never thought of pulling INCREMENT in and using
<> it (effectively) as a clock enable... Maybe I've been taking too many
<> examples from old Xilinx manuals...
 
< Phil, please do not blame your bad habits on Xilinx, 
< always a strong advocate of synchronous design methods.
< Peter Alfke, formerly associated with Xilinx.

I am not so sure how to read it, but I didn't read it as Xilinx
being against synchronous design.  It seems that he has multiple
clock domains, which always complicates synchronous logic.

Are there any Xilinx examples for generating a FIFO using 
external RAM?  That would seem to be one solution to the problem.

More usual would be a big enough FIFO in the FPGA, and then
external logic to extract data fast enough that it doesn't
overflow.

-- glen
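A FIFO built on external RAM mostly comes down to keeping a write pointer and a read pointer in the FPGA and arbitrating which one drives the SRAM address bus. The sketch below shows only that pointer logic, under several assumptions: a single clock domain, a single-port SRAM that can complete one access per clock, and requests that are held high until granted. The module name `SramFifoPointers` and all port names are hypothetical, and the SRAM's own control timing (chip enable, write enable, data bus turnaround) is left out.

~~~~
module SramFifoPointers #(
	parameter AW = 18                 // address width of the external SRAM (assumed)
)(
	input               clk,
	input               rst,          // synchronous reset
	input               wr_req,       // held high until wr_grant: store a sample
	input               rd_req,       // held high until rd_grant: fetch a sample
	output              wr_grant,     // this cycle's SRAM access is a write
	output              rd_grant,     // this cycle's SRAM access is a read
	output     [AW-1:0] sram_addr,
	output              empty,
	output              full
);
	// One extra pointer bit distinguishes "full" from "empty"
	reg [AW:0] wr_ptr, rd_ptr;

	assign empty = (wr_ptr == rd_ptr);
	assign full  = (wr_ptr[AW-1:0] == rd_ptr[AW-1:0]) &&
	               (wr_ptr[AW]     != rd_ptr[AW]);

	// Writes take priority so an incoming sample is never dropped
	// in favour of a read; a blocked read simply waits a cycle.
	assign wr_grant  = wr_req & ~full;
	assign rd_grant  = rd_req & ~empty & ~wr_grant;
	assign sram_addr = wr_grant ? wr_ptr[AW-1:0] : rd_ptr[AW-1:0];

	always @(posedge clk) begin
		if (rst) begin
			wr_ptr <= 0;
			rd_ptr <= 0;
		end else begin
			if (wr_grant) wr_ptr <= wr_ptr + 1'b1;
			if (rd_grant) rd_ptr <= rd_ptr + 1'b1;
		end
	end
endmodule
~~~~

With samples arriving no faster than every 2us and a 40MHz clock, there would be many idle cycles between writes for the MCU side to read in, which is what lets reads proceed while acquisition is still running.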
On Thu, 01 Oct 2009 16:57:00 -0700, Peter Alfke wrote:

> Phil, please do not blame your bad habits on Xilinx, always a strong
> advocate of synchronous design methods. Peter Alfke, formerly associated
> with Xilinx.

I wasn't blaming my bad habits on Xilinx -- just commenting on how some 
of the ISE3 manual examples cause flip-flop inference, and one or two 
that I've copy-pasted just plain don't work as advertised.

But I will say one thing: providing a free Linux version of the IDE gets 
them significant points over Altera. AIUI the Linux version of Quartus 
Web Edition has been canned, while ISE Webpack was still available in a 
Linux version last time I checked.

It's just a shame the distributors I use don't carry Xilinx FPGA parts. I 
can get small Altera CPLDs, Altera FPGAs or Xilinx CPLDs, but not any 
Xilinx FPGAs or (large) Altera CPLDs. Very annoying.