FPGARelated.com
Forums

Can I see the detailed timing parameters with the Quartus II tools?

Started by fl November 30, 2006
Hi,
I am learning FPGA design from a DSP book (1st edition) written by Uwe
Meyer-Baese. There is an example of a 16-bit adder with detailed
timing parameters, such as: tco=0.2 ns, tcgen=1.5 ns, tcico=0.3 ns,
tsamerow=2.9 ns, tLUT=1.9 ns and tsu=2.7 ns. The fmax of that 16-bit
adder is 74.6 MHz. From the context, I think the device is an
EPF10K20RC240-4.

In my practice project, I can only see the IC (interconnect) and CELL
delay timing parameters from the timing closure floorplan. Because the
fmax I get is lower than that in the book (only 59.52 MHz from the
timing analyzer), I want to know where the difference comes from. I
want to make sure that I am getting the fast 16-bit adder. BTW, I post
my code below. Because the timing parameters include tco etc., I use
registers at the input and output ports.

Could you help me? Thanks in advance.



LIBRARY lpm;
USE lpm.lpm_components.ALL;

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_arith.ALL;

ENTITY add16 IS
  GENERIC (WIDTH  : INTEGER := 16; -- Total bit width
           ONE    : INTEGER := 1); -- 1 bit for carry reg. (unused here)
  PORT (x,y : IN  STD_LOGIC_VECTOR(WIDTH-1 DOWNTO 0); -- Inputs
        sum : OUT STD_LOGIC_VECTOR(WIDTH-1 DOWNTO 0); -- Registered sum
        clk : IN  STD_LOGIC);
END add16;

ARCHITECTURE flex OF add16 IS
  SIGNAL sum0, q2, q1     -- Adder result and registered inputs
                     : STD_LOGIC_VECTOR(WIDTH-1 DOWNTO 0);
BEGIN
  reg_1: lpm_ff           -- Input register for x
         GENERIC MAP ( LPM_WIDTH => WIDTH )
         PORT MAP ( data => x, q => q1, clock => clk );
  reg_2: lpm_ff           -- Input register for y
         GENERIC MAP ( LPM_WIDTH => WIDTH )
         PORT MAP ( data => y, q => q2, clock => clk );
-------------- Combinational adder ------------------------
  add_1: lpm_add_sub      -- Add the registered x and y
         GENERIC MAP ( LPM_WIDTH => WIDTH,
                       LPM_REPRESENTATION => "UNSIGNED",
                       LPM_DIRECTION => "ADD")
         PORT MAP ( dataa => q1, datab => q2,
                    result => sum0);
  reg_3: lpm_ff           -- Output register for the sum
         GENERIC MAP ( LPM_WIDTH => WIDTH )
         PORT MAP ( data => sum0, q => sum, clock => clk );

END flex;

After doing a full compile, open the "Timing Analysis" report folder.
Under that report, look for the "Clock Setup: <name>" panel, where
<name> is the name of your clock signal. In that panel, select any
row of interest, right-click, and select "List Path". This command
reports the full path detail as a message in your message window. You
can do this operation on any of the other timing report panels.

Also note that when you ran the full compile, the Classic Timing
Analyzer automatically did a List Path operation on the worst-case
Fmax, Tsu, Th and Tco paths.

BTW, make sure you have a timing constraint if you want the fitter to
do its best. Go to the "Timing Analysis Settings" panel and create a
clock requirement for your clock. For example, it can be the 75 MHz the
book gives.

For more, see:

http://www.altera.com/support/software/quartus2/timing/sof-qts-timing.html

-David

fl,

Meyer-Baese was my teacher at FSU, and he taught my class using the
DSP w/ FPGAs book he wrote. It's a small world...

If you want to run an adder really fast, you can instantiate an adder
component from the Altera megafunction tool, within the arithmetic
components subsection. From inside the megafunction wizard GUI you can
select the level of pipelining within the adder, anywhere from 1 to N
clock cycles of delay. The more delay, the faster you can clock the
adder.

As far as timing parameters, you can see all the timing (slack, etc.)
from one critical path to another within the timing subsection of the
compilation report after you compile your design in Quartus II.

--geoff
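
The same pipelining can also be requested directly in VHDL rather than through the wizard. The following is only a sketch, assuming the LPM_PIPELINE generic and the clock port of lpm_add_sub from the lpm_components package; the entity name add16_pipe and the pipeline depth of 2 are made up for illustration. The input registers from the original post are left out, so they would have to be added back for a like-for-like register-to-register fmax comparison.

```vhdl
LIBRARY lpm;
USE lpm.lpm_components.ALL;

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

-- Hypothetical entity name; not taken from the book or the thread.
ENTITY add16_pipe IS
  GENERIC (WIDTH : INTEGER := 16);  -- Total bit width
  PORT (x, y : IN  STD_LOGIC_VECTOR(WIDTH-1 DOWNTO 0);
        sum  : OUT STD_LOGIC_VECTOR(WIDTH-1 DOWNTO 0);
        clk  : IN  STD_LOGIC);
END add16_pipe;

ARCHITECTURE flex OF add16_pipe IS
BEGIN
  add_1: lpm_add_sub
         GENERIC MAP ( LPM_WIDTH => WIDTH,
                       LPM_REPRESENTATION => "UNSIGNED",
                       LPM_DIRECTION => "ADD",
                       LPM_PIPELINE => 2 )  -- latency in clock cycles (assumed value)
         PORT MAP ( dataa => x, datab => y,
                    clock => clk,           -- clock is required once the adder is pipelined
                    result => sum );
END flex;
```

Raising LPM_PIPELINE trades latency for clock rate, which is the trade-off described above.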
Thank you very much. The problem is that I cannot get the performance
claimed for the examples in that book. For add_1p.vhd, 15-bit (which
the book says can reach 97.08 MHz), I only get 69.87 MHz fmax. I did
not modify the code of that example at all. Even when I add the timing
constraint, the result is the same. It is really bizarre. The code I
posted was an exercise with given results. I don't know what is wrong.
Could anyone test the best performance of the posted code?


Send Meyer-Baese an email saying you bought his book, and ask him
your question. You can find his email on the faculty page of the FSU
College of Engineering site:

http://www.eng.fsu.edu/index_noright.php?page=faculty_staff_directory


fl wrote:

> Because the fmax I get is lower (only 59.52 MHz from timing analyzer)
> than that in the book, I want to know the difference.

I don't have the book, but what do you get
if you do it like this?

http://home.comcast.net/~mike_treseler/add16.vhd

-- Mike Treseler
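
The linked file is not reproduced in this thread. As a rough idea of what a purely behavioral version could look like, here is a minimal sketch assuming ieee.numeric_std and one clocked process. The process label "only" is borrowed from the node names in the timing report posted later in the thread; the signal names x_r and y_r and everything else are guesses for illustration, not Mike's actual code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity add16 is
  generic (WIDTH : positive := 16);  -- total bit width
  port (
    clk  : in  std_logic;
    x, y : in  std_logic_vector(WIDTH-1 downto 0);
    sum  : out std_logic_vector(WIDTH-1 downto 0)
  );
end entity add16;

architecture rtl of add16 is
  -- Hypothetical names for the registered inputs.
  signal x_r, y_r : unsigned(WIDTH-1 downto 0);
begin
  only : process (clk)
  begin
    if rising_edge(clk) then
      x_r <= unsigned(x);                  -- input register for x
      y_r <= unsigned(y);                  -- input register for y
      sum <= std_logic_vector(x_r + y_r);  -- adder feeding the output register
    end if;
  end process only;
end architecture rtl;
```

This describes the same three register banks around one adder as the LPM version, and leaves the choice of carry chain to the synthesis tool.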
Thank you, Mike. You are so kind. I get the same fmax, i.e. 59.52 MHz
(period = 16.8 ns). fmax is the same even though I constrain fmax to 75
MHz in the dialog box Clock Settings: Default required fmax: 75 MHz.
The slack is -3.467 ns. The device is FLEX10K: EPF10K20RC240-4.
Because the resulting fmax is the same, there may be something wrong in
the way I use Quartus II 6.0 webpack on Windows XP. Why?

Thank you
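
These numbers are at least consistent with each other. Assuming the analyzer reports slack as the required clock period minus the achieved period (an assumption about the tool's definition, not something stated in the report), the arithmetic works out as:

$$
T_{req} = \frac{1}{75\ \text{MHz}} \approx 13.333\ \text{ns}, \qquad
\text{slack} = T_{req} - T_{actual} \approx 13.333\ \text{ns} - 16.8\ \text{ns} = -3.467\ \text{ns}
$$

$$
f_{max} = \frac{1}{T_{actual}} = \frac{1}{16.8\ \text{ns}} \approx 59.52\ \text{MHz}
$$

This suggests the 75 MHz requirement is being applied and is simply not met.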





fl wrote:
> Thank you, Mike. You are so kind. I get the same fmax, i.e. 59.52 MHz
> (period=16.8 ns). fmax is the same even though I constrain fmax to 75
> MHz in the dialog box Clock Settings: Default required fmax: 75 MHz.
> The slack is -3.467 ns. The device is FLEX10K: EPF10K20RC240-4.
> Because the result of fmax is the same, there may be something wrong in
> the utilization of Quartus II 6.0 webpack, Windows XP.

Or maybe the author was using a faster speed grade
or a different device.
With an epm240f100c4 I got 197.51 MHz ( period = 5.063 ns )

In any case, I think he was doing it the hard way.
Good luck.

-- Mike Treseler
Hi, I still cannot get better performance, even with your code. The
following is part of the info from List Path. Why is there so much IC
and CELL delay? Can you guess? I suspect there are some settings I
must set besides the fmax setting. Thank you very much.


Info: Clock "clk" has Internal fmax of 59.52 MHz between source register "\only:x_v[0]" and destination register "\only:sum_v[14]" (period = 16.8 ns)
    Info: + Longest register to register delay is 13.200 ns
        Info: 1: + IC(0.000 ns) + CELL(0.000 ns) = 0.000 ns; Loc. = LC6_F16; Fanout = 2; REG Node = '\only:x_v[0]'
        Info: 2: + IC(2.200 ns) + CELL(1.200 ns) = 3.400 ns; Loc. = LC1_F13; Fanout = 2; COMB Node = 'lpm_add_sub:Add0|addcore:adder|a_csnbuffer:result_node|cout[0]'
        Info: 3: + IC(0.000 ns) + CELL(0.300 ns) = 3.700 ns; Loc. = LC2_F13; Fanout = 2; COMB Node = 'lpm_add_sub:Add0|addcore:adder|a_csnbuffer:result_node|cout[1]'



As Mike suggested, I would check whether I am using the same device
and speed grade as in the book. Typically, the timing parameters that
you are interested in are constant for a particular device and speed
grade, so they would be published in the datasheet. In Quartus you can
try the property editor. Nevertheless, the timing analyzer already
includes these values in its cell delay component, so whatever you find
in the datasheet would not change the timing.
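
For what it's worth, the parameters quoted in the first post do add up to the book's figure if the critical path of an N-bit ripple-carry adder is taken as: clock-to-output, one same-row routing hop, carry generation in the first logic element, N-2 carry-in to carry-out hops, the final LUT, and the setup time of the destination register. That decomposition is an assumption, since the book's formula is not quoted in this thread, but with N = 16 it gives:

$$
T = t_{co} + t_{samerow} + t_{cgen} + (N-2)\,t_{cico} + t_{LUT} + t_{su}
  = 0.2 + 2.9 + 1.5 + 14 \times 0.3 + 1.9 + 2.7 = 13.4\ \text{ns}
$$

$$
f_{max} = \frac{1}{13.4\ \text{ns}} \approx 74.6\ \text{MHz}
$$

Comparing each term against the matching IC and CELL entries in the List Path output is one way to see which part of the path is slower than the book assumes.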

Also, in the timing analyzer report you can get an idea of where most
of your delay is coming from. If the IC delay is greater than 75%, you
probably have something else going on that causes the logic to be
placed too far apart. It could also be that the clock is not routed on
a global route.

-sanjay
