FPGARelated.com
Forums

nicer code => slower code??

Started by Unknown October 5, 2006
I was playing with this old design the other day and decided to
"clean it up" a little bit (it was kinda messy). I moved some
logic into separate modules for better readability. I also grouped
some signals into VHDL records (I don't know what the style is called,
but if you have ever browsed the LEON code, you know what I mean).
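
For readers who have not seen the record style mentioned above, here is a minimal sketch of the idea. The package name `bus_types`, the record `req_t`, its fields, and the entity `req_reg` are all hypothetical and are not taken from the poster's design or from LEON.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package bus_types is
  -- Hypothetical record bundling the signals of a simple request interface.
  type req_t is record
    valid : std_logic;
    we    : std_logic;
    addr  : std_logic_vector(15 downto 0);
    data  : std_logic_vector(31 downto 0);
  end record;
end package bus_types;

library ieee;
use ieee.std_logic_1164.all;
use work.bus_types.all;

entity req_reg is
  port (
    clk   : in  std_logic;
    req_i : in  req_t;   -- one record port instead of four separate ports
    req_o : out req_t
  );
end entity req_reg;

architecture rtl of req_reg is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      req_o <= req_i;    -- registers every field of the record
    end if;
  end process;
end architecture rtl;
```

A record like this only changes how the signals are named and passed around, so by itself it should describe the same hardware as the separate signals did.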


I didn't change the functionality of the design (I did run a
large number of tests to be sure, of course). Furthermore, the changes
were very isolated (only two files affected in a relatively large
design).

I was kind of surprised to see that after synthesis and PAR,
I got a design that was:

1. 10% slower
2. marginally larger (few hundred LUTs)

(yes, with same tool, same speed grade and so on)


Has anyone seen this kind of behaviour before?
Would this go away if I somehow "flattened" my design?




burns

burn.sir@gmail.com wrote:

> Would this go away if I somehow "flattened" my design?

I would back out the changes one at a time
to find the culprit. Maybe adding the entities
created a pipeline stage somehow.
Also check the rtl viewer.

-- Mike Treseler

burn.sir@gmail.com wrote:

> Would this go away if I somehow "flattened" my design?

The Xilinx tools let you flatten the design and optimize across module boundaries. In my experience, this does speed things up.

--
Reply to nico@nctdevpuntnl (punt=.)
Companies and shops can be found at www.adresboekje.nl

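
As a rough sketch of what the flattening suggested above can look like in source, XST (the ISE synthesizer) documents a `keep_hierarchy` attribute that can be attached to an architecture. The entity `parity8` is hypothetical, and the exact values accepted and the default setting should be checked against the XST User Guide for your ISE version.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sub-module; the name and ports are made up for illustration.
entity parity8 is
  port (
    d : in  std_logic_vector(7 downto 0);
    p : out std_logic
  );
end entity parity8;

architecture rtl of parity8 is
  -- Assumption: the tool is XST. "no" asks it not to preserve this
  -- block's boundary, so the logic may be optimized with the parent.
  attribute keep_hierarchy : string;
  attribute keep_hierarchy of rtl : architecture is "no";
begin
  p <= d(0) xor d(1) xor d(2) xor d(3) xor d(4) xor d(5) xor d(6) xor d(7);
end architecture rtl;
```

The same choice can, as far as I know, also be made for the whole project through the XST `-keep_hierarchy` option instead of per architecture.
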
Mike Treseler wrote:

> I would back out the changes one at a time
> to find the culprit. Maybe adding the entities
> created a pipeline stage somehow.
> Also check the rtl viewer.

Actually, both designs are _identical_. I know that because we have some really tough tests for them. In addition, I spent 3-4 hours manually inspecting the code, the tests and all the simulation logs.

What happened is that the synthesis tool stopped optimising after a certain depth. At least, that is my theory...

burns

burn.sir@gmail.com wrote:

> Actually, both designs are _identical_. I know that because we have
> some really tough tests for them. In addition, I spent 3-4 hours
> manually inspecting the code, the tests and all the simulation logs.
>
> What happened is that the synthesis tool stopped optimising after
> a certain depth. At least, that is my theory...

Just out of curiosity, have you been able to re-route the old design and get the same result as before?

> Just out of curiosity, have you been able to re-route the old design
> and get the same result as before?

Yes, I ran the same tool on both designs, with the same configuration and the same options.

I ran some isolated tests, and it seems that the synthesis tool (Synplify Pro) has no problems with the VHDL records. So I guess the problem is the deeper level of hierarchy that the new modules introduce.

Any ideas on how to make Synplify flatten the design (at least partially)?

burns

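
One possible answer to the Synplify question above, offered as a sketch and not as a confirmed fix for this design: Synplify documents a `syn_hier` attribute that controls how hierarchy is treated during optimization. The entities `and2` and `and3` below are hypothetical. My understanding is that `"flatten"` on an architecture flattens the levels below it, while other values such as `"remove"` act on the level itself; please check the Synplify reference manual for your release before relying on this.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical leaf module.
entity and2 is
  port (a, b : in std_logic; y : out std_logic);
end entity and2;

architecture rtl of and2 is
begin
  y <= a and b;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical parent that instantiates the leaf twice.
entity and3 is
  port (a, b, c : in std_logic; y : out std_logic);
end entity and3;

architecture rtl of and3 is
  -- Assumption: the tool is Synplify. "flatten" asks it to dissolve the
  -- hierarchy below this architecture before optimizing.
  attribute syn_hier : string;
  attribute syn_hier of rtl : architecture is "flatten";
  signal ab : std_logic;
begin
  u0 : entity work.and2 port map (a => a,  b => b, y => ab);
  u1 : entity work.and2 port map (a => ab, b => c, y => y);
end architecture rtl;
```

Putting the attribute only on the parent of the newly created modules would be one way to flatten "partially", leaving the rest of the design hierarchy untouched.
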
<burn.sir@gmail.com> wrote in message 
news:1160073562.629841.275340@c28g2000cwb.googlegroups.com...

> Actually, both designs are _identical_. I know that because we have
> some really tough tests for them. In addition, I spent 3-4 hours
> manually inspecting the code, the tests and all the simulation logs.

That is not very strong proof that they are 'identical'. Unless your testbench covered darn near every possible logic path and checked that the outputs were identical on each and every clock cycle, there may be something subtle about the change that is being overlooked.

What you should do is look at the critical timing path in the slower design and compare it to the same path in the original design, in both cases looking at the synthesis output code (not your source). That may give you a clue as to where the original and the 'cleaned up' design start to differ in the synthesis implementation.

> What happened is that the synthesis tool stopped optimising after
> a certain depth. At least, that is my theory...

And like any theory, you need to test it. Since you now have two representations of a design that you believe to be functionally identical, but one has more 'depth' to it, you need to analyze those two designs and figure out why the performance of one is worse than the other (or hope that someone else has seen this behaviour too and can say what they found). If the theory holds up, open a bug report with the folks at Synplify telling them what you've found. Posting what you found here would be interesting too.

I haven't seen this particular problem with Synplify (or other tools), but I haven't specifically looked for it either. I have been knee deep in analyzing and improving timing paths and have never run across hierarchy depth as the culprit.

KJ

I had the same thing happen to me using ISE to synthesize an arithmetic unit.
I can guarantee that no real changes were made; I just merged two modules
into one and it was faster (I didn't look at the size).


---Matthew Hicks
Most synthesis tools will not share resources across module boundaries,
other than really trivial identical registers, etc.

Splitting it up into more modules may have defeated some sharing going
on.

Andy
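
To illustrate the kind of sharing described above, here is a minimal sketch with a hypothetical entity `shared_add`. The two additions are mutually exclusive and sit in one process, so a synthesis tool is free (though not obliged) to build a single adder with multiplexed operands.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example; names and widths are made up for illustration.
entity shared_add is
  port (
    sel        : in  std_logic;
    a, b, c, d : in  unsigned(7 downto 0);
    y          : out unsigned(7 downto 0)
  );
end entity shared_add;

architecture rtl of shared_add is
begin
  -- Only one of the two sums is needed at any time, and both are
  -- visible to the tool in the same process.
  process (sel, a, b, c, d)
  begin
    if sel = '1' then
      y <= a + b;
    else
      y <= c + d;
    end if;
  end process;
end architecture rtl;
```

If `a + b` and `c + d` were instead moved into two separate entities, with the selection done in the parent, a tool that optimizes each module on its own would most likely build two adders plus an output multiplexer, which fits the "slightly larger and slower" result reported at the start of the thread.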


Thanks for the insight.  I thought that the synthesis engine would make a 
huge netlist/graph out of all of my modules and optimize from there ... but 
now that I think about it, ISE does report optimization of only single 
modules.


---Matthew Hicks