Hi,

I am learning FPGA design from a DSP book (1st edition) written by Uwe Meyer-Baese. It has an example of a 16-bit adder with detailed timing parameters: tco = 0.2 ns, tcgen = 1.5 ns, tcico = 0.3 ns, tsamerow = 2.9 ns, tLUT = 1.9 ns and tsu = 2.7 ns. The fmax of that 16-bit adder is 74.6 MHz. From the context, I think the device is an EPF10K20RC240-4.

In my practice project, I can only see the IC (interconnect) and CELL delay timing parameters in the timing closure floorplan. Because the fmax I get (only 59.52 MHz from the timing analyzer) is lower than the one in the book, I want to understand the difference. My goal is to make sure that I have the fastest 16-bit adder. BTW, I post my code below. Because the timing parameters include tco etc., I use registers at the input and output ports.

Could you help me? Thanks in advance.

LIBRARY lpm;
USE lpm.lpm_components.ALL;

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_arith.ALL;

ENTITY add16 IS
  GENERIC (WIDTH : INTEGER := 16;  -- Total bit width
           ONE   : INTEGER := 1);  -- 1 bit for carry reg. (unused here)
  PORT (x, y : IN  STD_LOGIC_VECTOR(WIDTH-1 DOWNTO 0);  -- Inputs
        sum  : OUT STD_LOGIC_VECTOR(WIDTH-1 DOWNTO 0);
        clk  : IN  STD_LOGIC);
END add16;

ARCHITECTURE flex OF add16 IS
  SIGNAL sum0, q2, q1  -- Adder result and registered inputs
         : STD_LOGIC_VECTOR(WIDTH-1 DOWNTO 0);
BEGIN
  reg_1: lpm_ff  -- Register input x
    GENERIC MAP ( LPM_WIDTH => WIDTH )
    PORT MAP ( data => x, q => q1, clock => clk );
  reg_2: lpm_ff  -- Register input y
    GENERIC MAP ( LPM_WIDTH => WIDTH )
    PORT MAP ( data => y, q => q2, clock => clk );
  -------------- Adder between the registers ------------------
  add_1: lpm_add_sub  -- Add the registered x and y
    GENERIC MAP ( LPM_WIDTH => WIDTH,
                  LPM_REPRESENTATION => "UNSIGNED",
                  LPM_DIRECTION => "ADD")
    PORT MAP ( dataa => q1, datab => q2,
               result => sum0);
  reg_3: lpm_ff  -- Register the sum
    GENERIC MAP ( LPM_WIDTH => WIDTH )
    PORT MAP ( data => sum0, q => sum, clock => clk );

END flex;
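For reference, the book's 74.6 MHz figure can be reproduced from the parameters quoted in the post. The path breakdown below is an assumption, not something stated in the post: a clock-to-output delay, one same-row routing hop, one carry generation, 14 carry-in to carry-out hops for the remaining bits, the final LUT, and the setup time.

```latex
\[
T = t_{co} + t_{samerow} + t_{cgen} + 14\,t_{cico} + t_{LUT} + t_{su}
  = 0.2 + 2.9 + 1.5 + 14(0.3) + 1.9 + 2.7 = 13.4\ \text{ns}
\]
\[
f_{max} = \frac{1}{T} = \frac{1}{13.4\ \text{ns}} \approx 74.6\ \text{MHz}
\]
```

The reported 59.52 MHz corresponds to a period of about 16.8 ns, so under this assumption the gap to explain is roughly 3.4 ns.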
Can I see the detailed timing parameters with the Quartus II tools?
Started by ●November 30, 2006
Reply by ●November 30, 2006
After doing a full compile, open the "Timing Analysis" report folder. Under that report, look for the "Clock Setup: <name>" panel, where <name> is the name of your clock signal. In that report, select any row of interest, then right-click (button-2) and select "List Path". This command reports the full path detail as a message in your message window. You can do this on any of the other timing report panels.

Also note that when you did the full compile, the Classic Timing Analyzer automatically did a List Path operation on the worst-case Fmax, Tsu, Th and Tco paths.

BTW, make sure you have a timing constraint if you want the fitter to do its best. Go to the "Timing Analysis Settings" panel and create a clock requirement for your clock. For example, it can be the 75 MHz the book gives.

For more, see: http://www.altera.com/support/software/quartus2/timing/sof-qts-timing.html

-David
Reply by ●November 30, 2006
fl,

Meyer-Baese was my teacher at FSU, and he taught my class using the DSP w/ FPGAs book he wrote. It's a small world...

If you want to run an adder really fast, you can instantiate an adder component from the Altera megafunction tool, within the arithmetic components subsection. From inside the megafunction wizard GUI you can select the level of pipelining within the adder, anywhere from 1 to N clock cycles of delay. The more delay, the faster you can clock the adder.

As far as timing parameters go, you can see all the timing (slack, etc.) from one critical path to another within the timing subsection of the compilation report after you compile your design in Quartus II.

--geoff
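As a rough illustration of the pipelining trade-off described in this reply, here is a minimal hand-written sketch of a 16-bit adder split into two 8-bit stages. It is not the output of the megafunction wizard and not the book's code; the entity name `add16_pipe` and the 8/8 split are assumptions chosen for the example.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

ENTITY add16_pipe IS                      -- hypothetical name
  PORT (clk  : IN  STD_LOGIC;
        x, y : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);
        sum  : OUT STD_LOGIC_VECTOR(15 DOWNTO 0));
END add16_pipe;

ARCHITECTURE rtl OF add16_pipe IS
  SIGNAL lo_sum : UNSIGNED(8 DOWNTO 0);   -- low-byte sum, bit 8 is the carry out
  SIGNAL xh, yh : UNSIGNED(7 DOWNTO 0);   -- high bytes delayed by one clock
BEGIN
  PROCESS (clk)
  BEGIN
    IF rising_edge(clk) THEN
      -- Stage 1: add the low bytes, keep the carry, delay the high bytes.
      lo_sum <= resize(UNSIGNED(x(7 DOWNTO 0)), 9)
              + resize(UNSIGNED(y(7 DOWNTO 0)), 9);
      xh <= UNSIGNED(x(15 DOWNTO 8));
      yh <= UNSIGNED(y(15 DOWNTO 8));
      -- Stage 2: add the high bytes plus the registered carry.
      sum(7 DOWNTO 0)  <= STD_LOGIC_VECTOR(lo_sum(7 DOWNTO 0));
      sum(15 DOWNTO 8) <= STD_LOGIC_VECTOR(xh + yh + lo_sum(8 DOWNTO 8));
    END IF;
  END PROCESS;
END rtl;
```

Each clock period now only has to cover an 8-bit carry chain instead of a 16-bit one, at the cost of two clock cycles of latency. Whether this actually raises fmax on a given device depends on how the fitter places and routes it.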
Reply by ●November 30, 2006
Thank you very much. The problem is that I cannot get the performance the book claims for its examples. For add_1p.vhd, 15-bit (which the book says can reach 97.08 MHz), I can only get an fmax of 69.87 MHz. I did not modify the code of that example at all. Even when I add the timing constraint, the result is the same. It is really bizarre.

The code I posted was an exercise with given results. I don't know where it goes wrong. Could anyone test the best performance of the posted code?
Reply by ●November 30, 2006
Send Meyer-Baese an email saying you bought his book, and ask him your question. You can find his email on the faculty page of the FSU College of Engineering site:

http://www.eng.fsu.edu/index_noright.php?page=faculty_staff_directory
Reply by ●November 30, 2006
fl wrote:
> Because the fmax I get is lower (only 59.52 MHz from timing analyzer)
> than that in the book, I want to know the difference.

I don't have the book, but what do you get if you do it like this?

http://home.comcast.net/~mike_treseler/add16.vhd

-- Mike Treseler
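The linked file is not reproduced in this thread. As a stand-in, the sketch below shows what a plain behavioral version of the same registered adder could look like when the adder is inferred from `+` instead of being instantiated from LPM components. It is an assumption for illustration, not the contents of add16.vhd; the entity name `add16_rtl` and the signal names are hypothetical.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

ENTITY add16_rtl IS                       -- hypothetical name
  GENERIC (WIDTH : POSITIVE := 16);
  PORT (clk  : IN  STD_LOGIC;
        x, y : IN  STD_LOGIC_VECTOR(WIDTH-1 DOWNTO 0);
        sum  : OUT STD_LOGIC_VECTOR(WIDTH-1 DOWNTO 0));
END add16_rtl;

ARCHITECTURE rtl OF add16_rtl IS
  SIGNAL x_q, y_q : UNSIGNED(WIDTH-1 DOWNTO 0);  -- registered inputs
BEGIN
  PROCESS (clk)
  BEGIN
    IF rising_edge(clk) THEN
      x_q <= UNSIGNED(x);                      -- input registers
      y_q <= UNSIGNED(y);
      sum <= STD_LOGIC_VECTOR(x_q + y_q);      -- inferred adder, registered output
    END IF;
  END PROCESS;
END rtl;
```

This has the same register-adder-register structure and the same two-cycle latency as the LPM version in the original post, so it would be expected to map to similar logic.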
Reply by ●November 30, 2006
Thank you, Mike. You are so kind. I get the same fmax, i.e. 59.52 MHz (period = 16.8 ns). The fmax stays the same even though I constrain it to 75 MHz in the Clock Settings dialog box (Default required fmax: 75 MHz). The slack is -3.467 ns. The device is FLEX10K: EPF10K20RC240-4.

Because the resulting fmax is the same, there may be something wrong in how I am using Quartus II 6.0 Web Edition on Windows XP. Why?

Thank you
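The reported slack is consistent with the 75 MHz requirement and the 16.8 ns period, assuming the analyzer defines slack as the required period minus the achieved period:

```latex
\[
T_{req} = \frac{1}{75\ \text{MHz}} \approx 13.333\ \text{ns}, \qquad
\text{slack} = T_{req} - T = 13.333 - 16.8 = -3.467\ \text{ns}
\]
```

So the constraint is being applied; the fitter simply does not meet it on this device.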
Reply by ●November 30, 2006
fl wrote:
> Thank you, Mike. You are so kind. I get the same fmax, i.e. 59.52 MHz
> (period=16.8 ns). fmax is the same even though I constraint fmax to 75
> MHz in the dialog box Clock Settings: Default required fmax: 75 MHz.
> The slack is -3.467 ns. The device is FLEX10K: EPF10K20RC240-4.
> Because the result of fmax is the same, there may be something wrong in
> the utilization of Quartus II 6.0 webpack, Windows XP.

Or maybe the author was using a faster speed grade or a different device. With an epm240f100c4 I got 197.51 MHz (period = 5.063 ns).

In any case, I think he was doing it the hard way. Good luck.

-- Mike Treseler
Reply by ●December 1, 2006
Hi, I still cannot get better performance, even with your code. The following is part of the List Path output. Why is there so much IC and CELL delay? Can you guess? I suspect there are some settings I must make besides the fmax setting. Thank you very much.

Info: Clock "clk" has Internal fmax of 59.52 MHz between source register "\only:x_v[0]" and destination register "\only:sum_v[14]" (period= 16.8 ns)
Info: + Longest register to register delay is 13.200 ns
Info: 1: + IC(0.000 ns) + CELL(0.000 ns) = 0.000 ns; Loc. = LC6_F16; Fanout = 2; REG Node = '\only:x_v[0]'
Info: 2: + IC(2.200 ns) + CELL(1.200 ns) = 3.400 ns; Loc. = LC1_F13; Fanout = 2; COMB Node = 'lpm_add_sub:Add0|addcore:adder|a_csnbuffer:result_node|cout[0]'
Info: 3: + IC(0.000 ns) + CELL(0.300 ns) = 3.700 ns; Loc. = LC2_F13; Fanout = 2; COMB Node = 'lpm_add_sub:Add0|addcore:adder|a_csnbuffer:result_node|cout[1]'
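The numbers in this report can be split up as follows. The interpretation of the remainder as register clock-to-output time, setup time and any clock skew is an assumption about how the analyzer builds the period; the book values are the ones quoted in the first post.

```latex
\[
T - t_{reg \to reg} = 16.8 - 13.2 = 3.6\ \text{ns}
\quad \text{(assumed: } t_{co} + t_{su} \text{ and any clock skew)}
\]
\[
\text{Book: } t_{co} + t_{su} = 0.2 + 2.7 = 2.9\ \text{ns}, \qquad
13.4 - 2.9 = 10.5\ \text{ns for the data path}
\]
```

On these figures the register-to-register path is about 2.7 ns longer than the book's model and the remaining overhead about 0.7 ns longer. The 0.300 ns CELL delay of step 3 equals the book's tcico, so the per-bit carry delay itself appears to match.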
Reply by ●December 1, 2006
As Mike suggested, I would check whether I am using the same device and speed grade as in the book.

Typically, the timing parameters that you are interested in are constant for a particular device and speed grade of that device, so they are published in the datasheet. In Quartus you can try the property editor. Nevertheless, the timing analyzer already includes these values in its cell delay component, so whatever you find in the datasheet will not change the timing.

Also, the timing analyzer report gives you an idea of where most of your delay comes from. If the IC delay is greater than 75%, you probably have something else going on that causes the logic to be placed too far apart. It could also be that the clock is not routed on a global route.

-sanjay
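As an example of the IC-share check suggested in this reply, the three steps of the path that were posted earlier in the thread give the following. This covers only the first 3.7 ns of the 13.2 ns path, so it is a partial figure and not the share for the whole path.

```latex
\[
\frac{\sum t_{IC}}{\sum (t_{IC} + t_{CELL})}
= \frac{0 + 2.2 + 0}{3.7} \approx 0.59
\]
```

The full List Path output would be needed to apply the 75% rule of thumb to the complete path.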




