Dear All,

I'm not an expert in VHDL, I'm just a curious person trying to solve a research problem with an FPGA.

I'm using a 32-bit accumulator in an IP, as part of a SoC project with a MicroBlaze, implemented on a Digilent Spartan-3 SKB (the FPGA is a Xilinx XC3S200). The code is included at the end of this message. The input is a 32-bit signed integer coded in two's complement and the output is also a 32-bit signed integer. What I would like the accumulator to do is to accumulate synchronously with the rising edge of clk when enb=1 and hold the result stable at the output when enb=0 (enb is an asynchronous signal generated elsewhere in the system).

But it does not work in this way, it behaves in a strange manner...

Sometimes I get the expected results but often I get strange values (large when they should be small, often negative instead of positive, etc.). If I look at the binary representation of the output, it looks as if the output didn't have time to sum and propagate to the output again. In fact, the post-place-and-route simulation shows that when the enb signal goes to 0, the output stays in an undetermined condition (you know, red line with XXXX).

I'm guessing I'm making a very basic mistake that has something to do with the timing of the enb signal, but after 3 days banging my head against the wall, all I have is a monumental headache.

Can some kind soul help me with this?

jmariano

================

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity int_accum is
  port (clk: in  std_logic;
        clr: in  std_logic;
        enb: in  std_logic;
        d:   in  std_logic_vector(31 downto 0);
        ovf: out std_logic;      -- overflow
        q:   out std_logic_vector(31 downto 0));
end int_accum;

architecture archi of int_accum is

  signal tmp : signed(32 downto 0);

begin

  process(clk, clr)
  begin
    if (clr = '1') then
      tmp <= (others => '0');
    elsif (rising_edge (clk)) then
      if (enb = '1') then
        -- The result of the adder will be on 33 bits to keep the carry
        tmp <= tmp + signed ('0'& d);
      end if;
    end if;
  end process;

  -- The carry is extracted from the most significant bit of the result
  ovf <= tmp(32);

  -- The q output is the 32 least significant bits of sum
  q <= std_logic_vector (tmp(31 downto 0));

end archi;
accumulator (again)
Started by ●July 2, 2012
Reply by ●July 2, 2012
On Jul 2, 4:20 pm, jmariano <jmarian...@gmail.com> wrote:
> enb is an asynchronous signal generated elsewhere in the system

This is the key to your problem.

You can't expect to take an asynchronous signal into multiple (32 in this case) registers in a synchronous domain and expect that it will work reliably. You need to first synchronize the asynchronous input to the synchronous clock domain before you can use it.

Ed McGettigan
--
Xilinx Inc.
Reply by ●July 3, 2012
On Mon, 02 Jul 2012 17:19:59 -0700, Ed McGettigan wrote:
> You can't expect to take an asynchronous signal into multiple (32 in
> this case) registers in a synchronous domain and expect that it will
> work reliably. You need to first synchronize the asynchronous input to
> the synchronous clock domain before you can use it.

Which means that you should latch enb in a register, with the same clock that you're using to twiddle your accumulator, and use the output of that register as your enable signal.

Paranoid logic designers will have a string of two or three registers to avoid metastability, but I've been told that's not necessary. (I'm not much of a logic designer).

--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com
Reply by ●July 3, 2012
On Mon, 02 Jul 2012 16:20:52 -0700, jmariano wrote:
> I'm using a 32 bit accumulator in a IP,... The
> input is a 32 bit signed integer coded in two's complement and the
> output also a 32 bit signed integer.
>
> But it does not work in this way, it behaves in a strange manner...

You have one likely answer from Ed and Tim: unless you KNOW that the input signals "enb" and "d" are already synchronous with "clk" you MUST synchronise them.

But there is another problem:

    tmp <= tmp + signed ('0'& d);

This is NOT how to add a leading bit to d. It will convert a small negative d to a very large positive value! Instead you must replicate d's sign bit (MSB) into the leading bit.

    tmp <= tmp + signed (d(d'high) & d);

(Or look for "resize" functions in numeric_std to do this for you).

This is far more likely to be the problem, especially if you are detecting these errors at behavioural simulation (as you should be).

Incidentally, unless this is the top level of your design, I would consider making the D and Q ports signed. Apart from keeping the type conversions to a minimum, this means the external view of the design (the entity specification) better reflects (or documents) what the design does, preventing surprises when someone re-uses it with unsigned data...

- Brian
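
For illustration, Brian's two suggestions (sign extension through resize, and signed ports) might be combined roughly as follows. This is a minimal sketch, not the poster's final code: the entity name int_accum_signed is hypothetical, enb is assumed to be already synchronous to clk, and the overflow flag is redefined here (an assumption) to mean "the sum no longer fits in 32 bits". As a concrete case of the original problem, d = -1 is x"FFFFFFFF", and '0' & d turns it into the 33-bit value +4294967295. The low 32 bits of the sum come out the same either way, so the zero extension mainly corrupts bit 32 and therefore ovf.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Sketch only: int_accum_signed is a hypothetical name, not the poster's entity.
entity int_accum_signed is
  port (clk : in  std_logic;
        clr : in  std_logic;             -- asynchronous clear, as in the original
        enb : in  std_logic;             -- assumed already synchronous to clk
        d   : in  signed(31 downto 0);
        ovf : out std_logic;
        q   : out signed(31 downto 0));
end int_accum_signed;

architecture rtl of int_accum_signed is
  signal tmp : signed(32 downto 0);
begin
  process (clk, clr)
  begin
    if clr = '1' then
      tmp <= (others => '0');
    elsif rising_edge(clk) then
      if enb = '1' then
        -- resize() on a signed operand replicates the sign bit into bit 32
        tmp <= tmp + resize(d, tmp'length);
      end if;
    end if;
  end process;

  -- Assumption: the 33-bit sum fits in 32 bits only while bit 32 equals bit 31
  ovf <= tmp(32) xor tmp(31);
  q   <= tmp(31 downto 0);
end rtl;
```

The flag in this sketch is not sticky: it only reflects the current value of tmp, so a design that must remember a past overflow would need to latch it separately.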
Reply by ●July 3, 2012
On Monday, July 2, 2012 10:24:02 PM UTC-7, Tim Wescott wrote:
> Which means that you should latch enb in a register, with the same clock
> that you're using to twiddle your accumulator, and use the output of that
> register as your enable signal.
>
> Paranoid logic designers will have a string of two or three registers to
> avoid metastability, but I've been told that's not necessary. (I'm not
> much of a logic designer).

It isn't just the paranoid logic designer, it should be every logic designer.

A single register only partially solves the problem of an asynchronous input with multiple register destinations, but it does not solve the very real metastability problem. At least two registers should be used to ensure that the metastability condition has resolved, and with increasing clock frequency and finer process nodes using three or more stages may be necessary.

Ed McGettigan
--
Xilinx Inc.
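
The register chain Ed describes could be sketched as below. This is only an illustration under stated assumptions: the entity name sync_bit, the generic STAGES and the port names are hypothetical, and the default of two stages simply follows the "at least two registers" advice above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Sketch only: entity, generic and port names are hypothetical.
entity sync_bit is
  generic (STAGES : integer range 2 to 8 := 2);  -- flip-flops in the chain
  port (clk      : in  std_logic;   -- destination clock domain
        async_in : in  std_logic;   -- e.g. the asynchronous enb
        sync_out : out std_logic);  -- the only copy that may fan out to other registers
end sync_bit;

architecture rtl of sync_bit is
  signal chain : std_logic_vector(STAGES - 1 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- shift the input through the chain, one stage per clock edge
      chain <= chain(STAGES - 2 downto 0) & async_in;
    end if;
  end process;

  -- only the last stage is used by downstream logic
  sync_out <= chain(STAGES - 1);
end rtl;
```

In the accumulator discussed here, async_in would be driven by enb and sync_out would replace enb inside the clocked process, at the cost of STAGES clock cycles of latency on the enable.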
Reply by ●July 4, 2012
On Jul 3, 5:45 pm, Ed McGettigan <ed.mcgetti...@xilinx.com> wrote:
> It isn't just the paranoid logic designer, it should be every logic designer.
>
> A single register only partially solves the problem of an asynchronous
> input with multiple register destinations, but it does not solve the very
> real metastability problem. At least two registers should be used to ensure
> that the metastability condition has resolved, and with increasing clock
> frequency and finer process nodes using three or more stages may be necessary.

Hi Ed. The way it was explained to me, I believe by Peter Alfke, is that what really resolves metastability is the slack time in a register-to-register path. Over the years FPGA processes have resulted in FFs which only need a couple of ns to resolve metastability to 1 in a million operation years or something like that (I don't remember the metric, but it was good enough for anything I do). It doesn't matter that you have logic in that path, you just need those few ns in every part of the path. In theory, even if you use multiple registers with no logic, what really matters is the slack time in the path and that is not guaranteed even with no logic. So the design protocol should be to ensure that every path from the input register to the subsequent registers has sufficient slack time.

Do you remember how much time that needs to be? I want to say 2 ns, but it might be more like 5 ns, I just can't recall. Of course it depends on your clock rates, but I believe Peter picked some more aggressive speeds like 100 MHz for his example.

Rick
Reply by ●July 5, 2012
Dear All,

Thank you very much for your input and sorry for the late reply. It is really great to be able to get the opinion of such experts, especially since, at my current location and in a radius of some 200 km, I must be the only person working with FPGAs and VHDL! I'm also glad that the discussion has evolved to levels of complexity far beyond my knowledge.

I was hoping that by now I would be able to say that the thing was working as expected but, unfortunately, no.

I've synchronized the enable signal, as suggested by Ed and Tim, using 3 FFs (I'm not paranoid, I just have room). Also, following Brian's suggestions, I've cleaned up the code regarding type conversions. All this has allowed me to isolate the remaining source of error, thank you very much.

Here's the full story: I'm implementing a gated integrator, as a part of a boxcar averager. This is the standard noise reduction technique used in nuclear magnetic resonance (NMR). This is research, not a commercial product! The module gets its data from four 8-bit ADCs at 5 MHz (adc0, adc90, adc180, adc270) and accumulates while enb=1. enb is generated in a different module. The module does this:

1 - generates the acquisition clock (adc_clk) by division by 10 of the S3-SKB 50 MHz main clock
2 - generates the accumulation clock (acc_clk) by inverting adc_clk. In this way, there is a delay of 100 ns from the moment the ADCs receive the rising edge of the clock to the moment when the data gets registered at the output.
3 - converts the data from the ADCs to excess 128 (bipolar ADC) and extends to 32-bit signed
4 - calculates u = adc0-adc180 and v = adc90-adc270. u and v go through a switch and emerge as r and i, to be delivered to two identical accumulators.

Of course, 3 and 4 must occur in less than 100 ns.

The switch unit is very simple: it has a control signal, s[1:0], that comes from a different module, and the following table:

  00 -> r =  u, i =  v
  01 -> r =  v, i = -u
  10 -> r = -v, i =  u
  11 -> r = -v, i = -u

The s signal is generated in a different clock domain and is stable 500 us before the enb. enb has a typical duration of 10 us. The code is at the end of this message.

I continue to get errors, especially when the input values are close to zero, which means that the result is changing from, say, FFFFFFFF to 00000001, so lots of bits change.

I have (I think!) traced the source of error to the switch unit because, if I tie the s signal to a fixed value, 11 for example, the unit works well, but if I connect it to a real s signal, I get errors. So I thought, this must be because the real s is noisy and r and i change during the acquisition period (1mm ns), so I have synchronized s with acc_clk, but the problem persists. What is more strange is that, if I do s <= "01" inside the synchronization process, I also get the same type of errors.

Really, I don't know what to do next.

jmariano

=================
architecture archi of int_su is
begin
    process(u, v, s)
    begin
        case s is
        when "00" =>
            r <=  u;
            i <=  v;
        when "01" =>
            r <=  v;
            i <= -u;
        when "10" =>
            r <= -u;
            i <= -v;
        when "11" =>
            r <= -v;
            i <=  u;
        when others =>
            r <= (others => 'X');
            i <= (others => 'X');
        end case;
    end process;
end archi;
============
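
Steps 3 and 4 of the description above (offset-binary conversion, sign extension and subtraction) are not shown in the post. A minimal sketch of what they might look like follows, assuming the ADCs deliver 8-bit offset binary (excess 128), in which case inverting the MSB yields two's complement. The entity name adc_diff and its port names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Sketch only: names are hypothetical; the ADC is assumed to output
-- offset binary (excess 128).
entity adc_diff is
  port (adc_a : in  std_logic_vector(7 downto 0);  -- e.g. adc0
        adc_b : in  std_logic_vector(7 downto 0);  -- e.g. adc180
        diff  : out signed(31 downto 0));          -- e.g. u = adc0 - adc180
end adc_diff;

architecture rtl of adc_diff is
  signal a_s, b_s : signed(7 downto 0);
begin
  -- offset binary to two's complement: invert the most significant bit
  a_s <= signed((not adc_a(7)) & adc_a(6 downto 0));
  b_s <= signed((not adc_b(7)) & adc_b(6 downto 0));

  -- resize() sign-extends each operand to 32 bits before subtracting
  diff <= resize(a_s, 32) - resize(b_s, 32);
end rtl;
```

This version is purely combinational, like the switch unit above, so its outputs are only valid once the ADC data has settled. Registering diff on the system clock would be one way to give the timing tools a well-defined path.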
Reply by ●July 5, 2012
On Jul 5, 7:44 am, jmariano <jmarian...@gmail.com> wrote:
> 1 - generates the acquisition clock (adc_clk) by division by 10 of the
> S3-SKB 50 MHz main clock
> 2 - generates the accumulation clock (acc_clk) by inverting adc_clk.

I'm not real clear on your description of your design, but if you are really generating clocks from the 50 MHz, I recommend that inside the FPGA you instead use a single clock and generate clock enables for the various functions. When you use multiple clocks in a circuit you have to do extra work for every signal that crosses a clock domain. Could that be your problem?

I don't see anything in your original post about simulation. Do you simulate your modules? I highly recommend that you write a test bench for each and every module you code. You may think this takes too much time, but I believe it pays off in the end with shorter integration time.

Rick
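
Rick's single-clock suggestion could look something like the sketch below: everything runs from the 50 MHz clock, and a one-cycle enable pulse replaces the divided clock. The divide-by-10 ratio comes from the post. The entity name ce_gen, the generic DIVIDE and the rst and ce ports are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Sketch only: names are hypothetical; the ratio of 10 is taken from the post.
entity ce_gen is
  generic (DIVIDE : positive := 10);
  port (clk : in  std_logic;          -- the single system clock (50 MHz here)
        rst : in  std_logic;          -- synchronous reset
        ce  : out std_logic := '0');  -- high for one clk period every DIVIDE cycles
end ce_gen;

architecture rtl of ce_gen is
  signal count : integer range 0 to DIVIDE - 1 := 0;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        count <= 0;
        ce    <= '0';
      elsif count = DIVIDE - 1 then
        count <= 0;
        ce    <= '1';   -- one pulse per DIVIDE clock cycles
      else
        count <= count + 1;
        ce    <= '0';
      end if;
    end if;
  end process;
end rtl;
```

A clocked process in the accumulator would then test ce (together with the synchronized enable) inside its rising_edge(clk) branch instead of being clocked by acc_clk. A second enable decoded from a different count value could presumably provide the phase offset that the inverted clock was meant to give.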
Reply by ●July 5, 2012
On Wednesday, July 4, 2012 12:49:07 PM UTC-7, rickman wrote:> On Jul 3, 5:45=A0pm, Ed McGettigan <ed.mcgetti...@xilinx.com> wrote: > > On Monday, July 2, 2012 10:24:02 PM UTC-7, Tim Wescott wrote: > > > On Mon, 02 Jul 2012 17:19:59 -0700, Ed McGettigan wrote: > > > > > > On Jul 2, 4:20=A0pm, jmariano <jmarian...@gmail.com> wrote: > > > >> Dear All, > > > > > >> I'm not an expert in VHDL, i'm just a curious trying to solve a > > > >> research problem with an FPGA. > > > > > >> I'm using a 32 bit accumulator in a IP, as part of a SoC project w=ith a> > > >> microblaze, implemented in a Digilent Spartan-3 SKB ( the FPGA is =a> > > >> Xilinx XC3S200). The code is included at the end of this message. ==A0The> > > >> input is a 32 bit signed integer coded in two's complement and the > > > >> output also a 32 bit signed integer. What I would like the accumul=ator> > > >> to do is to accumulate synchronously with the rising edge of clk w=hen> > > >> enb=3D1 and maintain the result stable at the output when enb=3D0 =( enb is> > > >> a asynchronous signal generated elsewhere in the system) > > > > > >> But it does not work in this way, it behaves in a strange manner..=.> > > > > >> Some times I get the expected results but often I get strange valu=es> > > >> (large when they should be small, often negative instead of positi=ve,> > > >> etc.). If I look at the binary representation of the output, it lo=oks> > > >> like if the output din't had time to sum and propagate to the outp=ut> > > >> again. In fact, the post place and route simulation shows that whe=n the> > > >> enb signal goes to 0, the output stays in a undetermined condition=(you> > > >> know, red line with XXXX). > > > > > >> I'm guessing I'm doing a very basic mistake that as something to d=o> > > >> with the timing of the enb signal, but after 3 days banging my had=to> > > >> the wall, all I have is a a monumental headache. > > > > > >> Can some kind soul help me with this? 
> > > > > >> jmariano > > > > > >> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > > > > > >> library ieee; > > > >> use ieee.std_logic_1164.all; > > > >> use ieee.numeric_std.all; > > > > > >> entity int_accum is > > > >> =A0 port =A0(clk:in =A0std_logic; > > > >> =A0 =A0 =A0 =A0 =A0clr:in =A0std_logic; > > > >> =A0 =A0 =A0 =A0 =A0enb:in =A0std_logic; > > > >> =A0 =A0 =A0 =A0 =A0d: =A0in =A0std_logic_vector(31 downto 0); > > > >> =A0 =A0 =A0 =A0 =A0ovf:out std_logic; =A0 =A0 =A0-- overflow q: ==A0out> > > >> =A0 =A0 =A0 =A0 =A0std_logic_vector(31 downto 0)); > > > >> end int_accum; > > > > > >> architecture archi of int_accum is > > > > > >> =A0 signal tmp : signed(32 downto 0); > > > > > >> =A0 begin > > > > > >> =A0 process(clk, clr) > > > >> =A0 begin > > > >> =A0 =A0 =A0 =A0 if (clr =3D '1') then > > > >> =A0 =A0 =A0 =A0 =A0 =A0tmp <=3D (others =3D> '0'); > > > >> =A0 =A0elsif (rising_edge (clk)) then > > > >> =A0 =A0 =A0 =A0 if (enb =3D '1') then > > > >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 -- The result of the adder will be=on 33 bits> > > >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 to keep the carry tmp <=3D tmp + s=igned ('0'& d);> > > >> =A0 =A0 end if; > > > >> =A0 =A0end if; > > > >> =A0 end process; > > > > > >> =A0 -- The carry is extracted from the most significant bit of the > > > >> result > > > >> =A0 ovf <=3D tmp(32); > > > > > >> =A0 -- The q output is the 32 least significant bits of sum q <=3D > > > >> =A0 std_logic_vector (tmp(31 downto 0)); > > > > > >> end archi; > > > > > > This is the key to your problem: > > > > > >> =A0enb is a asynchronous signal generated elsewhere in the system > > > > > > You can't expect to take an asynchronous signal into multiple (32 i=n> > > > this case) registers in a synchronous domain and expect that it wil=l> > > > work reliably. =A0You need to first synchronize the asynchronous in=put to> > > > the synchronous clock domain before you can use it. 
> > > Which means that you should latch enb in a register, with the same
> > > clock that you're using to twiddle your accumulator, and use the
> > > output of that register as your enable signal.
> > >
> > > Paranoid logic designers will have a string of two or three
> > > registers to avoid metastability, but I've been told that's not
> > > necessary. (I'm not much of a logic designer).
> > >
> > > --
> > > Tim Wescott
> > > Control system and signal processing consulting
> > > www.wescottdesign.com
> >
> > It isn't just the paranoid logic designer, it should be every logic
> > designer.
> >
> > A single register only partially solves the problem of an
> > asynchronous input with multiple register destinations, but it does
> > not solve the very real metastability problem. At least two
> > registers should be used to ensure that the metastability condition
> > has resolved and with increasing clock frequency and finer process
> > nodes using three or more stages may be necessary.
> >
> > Ed McGettigan
> > --
> > Xilinx Inc.
>
> Hi Ed. They way it was explained to me, I believe from Peter Alfke,
> is that what really resolves metastability is the slack time in a
> register to register path. Over the years FPGA process has resulted
> in FFs which only need a couple of ns to resolve metastability to 1 in
> a million operation years or something like that (I don't remember the
> metric, but it was good enough for anything I do). It doesn't matter
> that you have logic in that path, you just need those few ns in every
> part of the path. In theory, even if you use multiple registers with
> no logic, what really matters is the slack time in the path and that
> is not guaranteed even with no logic. So the design protocol should
> be to assure the slack time from the input register to all subsequent
> registers have sufficient slack time.
>
> Do you remember how much time that needs to be? I want to say 2 ns,
> but it might be more like 5 ns, I just can't recall. Of course it
> depends on your clock rates, but I believe Peter picked some more
> aggressive speeds like 100 MHz for his example.
>
> Rick

I'm glad to see that one of my 5-6 attempts to post was finally accepted by Google. I have got to switch to something else.

Peter Alfke's publications on metastability definitely fall into the seminal category, but you must be careful when extrapolating the original data to the latest technology nodes, circuits and design requirements. There are two major factors that impact the metastability equations: tau, the metastability decay rate, and the settling time.

The tau value is an inherent characteristic of the circuit and technology node. For a long time the expectation was that it would decrease with each generation, but this has stopped being true.

The settling time, Ts, is dependent on the design and is under the user's control. Ts is a function of the destination clock frequency and the timing slack between registers. If you have a 100 MHz clock frequency (a 10 ns period) but you use up 9.5 ns to get to the destination, your slack is only 500 ps. Adding register stages allows maximum use of the clock period, which increases the settling time, and each additional stage increases it again.

Ed McGettigan
--
Xilinx Inc.
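
For readers following along, the synchronizer advice above could be sketched in VHDL roughly as follows. This is only a minimal sketch, not code from the thread: the entity name sync_ff, its port names and the STAGES generic are made up for illustration. It assumes enb is a single-bit level that stays stable for at least a couple of clk periods; a multi-bit bus crossing domains needs a different technique.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Illustrative names only: sync_ff, async_in, sync_out and STAGES are
-- assumptions for this sketch, not part of the original design.
entity sync_ff is
  generic (STAGES : integer range 2 to 8 := 2);  -- at least two registers
  port (clk      : in  std_logic;   -- destination (accumulator) clock
        async_in : in  std_logic;   -- asynchronous input, e.g. enb
        sync_out : out std_logic);  -- safe to use in the clk domain
end sync_ff;

architecture rtl of sync_ff is
  -- chain(0) is the only register that samples the raw input and so the
  -- only one that can go metastable; later stages give it time to settle.
  signal chain : std_logic_vector(STAGES - 1 downto 0) := (others => '0');
begin

  process(clk)
  begin
    if rising_edge(clk) then
      -- Shift the sampled value one stage further on every clock edge.
      -- No logic sits between stages, so each stage gets almost a full
      -- clock period of settling time.
      chain <= chain(STAGES - 2 downto 0) & async_in;
    end if;
  end process;

  sync_out <= chain(STAGES - 1);

end rtl;
```

In the accumulator from the original post, the raw enb input would go to async_in and the `if (enb = '1')` test would use sync_out instead, so that all 33 accumulator registers see the same, already settled, enable value. The cost is a delay of STAGES clock cycles on the enable.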
Reply by ●July 5, 2012
Hi Rick, thanks for your help.

> I'm not real clear on your description of your design, but if you are
> really generating clocks from the 50 MHz, I recommend that inside the
> FPGA you instead use a single clock and generate clock enables for the
> various functions.

Yes, I generate a 5 MHz clock inside the module from the main 50 MHz clock by simple division by 10, because I need a 5 MHz ADC clock. I can't use a clock enable because the AD9058 ADC does not have an enable input, just a clock.

> When you use multiple clocks in a circuit you have
> to do extra work for every signal that crosses a clock domain. Could
> that be your problem?

What is the extra work? I have no idea! Synchronization?

> I don't see anything in your original post about simulation. Do you
> simulate your modules? I highly recommend that you write a test
> benche for each and every module you code. You may think this takes
> too much time, but I believe it pays off in the end with shorter
> integration time.

Sorry about that. I did, in fact, simulate each module and the top entity. The behavioral simulation gives the expected results; the post-place-and-route simulation gives some errors that I could not understand, but I'll run the simulations again and post the results here.

jmariano
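
The two points above (the ADC needs a real 5 MHz clock, and Rick suggests one internal clock plus clock enables) are not necessarily in conflict. A possible arrangement, sketched below under assumptions, is to send the divided clock only to the ADC pin and run all internal logic on the 50 MHz clock, gated by a one-cycle enable pulse. The entity name adc_clk_gen and all port names are hypothetical, and the correct moment to capture the AD9058 data relative to the pulse depends on the datasheet timing and board delays, which this sketch does not address.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Illustrative names only: adc_clk_gen and its ports are assumptions.
entity adc_clk_gen is
  port (clk50   : in  std_logic;   -- 50 MHz system clock
        rst     : in  std_logic;   -- synchronous reset, active high
        adc_clk : out std_logic;   -- 5 MHz, 50% duty, for the ADC pin only
        ce_5mhz : out std_logic);  -- one clk50 cycle wide, once per period
end adc_clk_gen;

architecture rtl of adc_clk_gen is
  signal count   : integer range 0 to 9 := 0;
  signal adc_reg : std_logic := '0';
  signal ce_reg  : std_logic := '0';
begin

  process(clk50)
  begin
    if rising_edge(clk50) then
      if rst = '1' then
        count   <= 0;
        adc_reg <= '0';
        ce_reg  <= '0';
      else
        -- Divide-by-10 counter: 0, 1, ..., 9, 0, ...
        if count = 9 then
          count <= 0;
        else
          count <= count + 1;
        end if;

        -- Registered output: high for five clk50 cycles, low for five.
        if count < 5 then
          adc_reg <= '1';
        else
          adc_reg <= '0';
        end if;

        -- Single-cycle enable for internal logic clocked by clk50.
        if count = 9 then
          ce_reg <= '1';
        else
          ce_reg <= '0';
        end if;
      end if;
    end if;
  end process;

  adc_clk <= adc_reg;
  ce_5mhz <= ce_reg;

end rtl;
```

Internal processes would then be written as `if rising_edge(clk50) then if ce_5mhz = '1' then ... end if; end if;`, so nothing inside the FPGA is clocked by the divided clock and no internal signal has to cross from a 5 MHz domain to the 50 MHz domain.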