I have tried to classify a few different ways to optimize a design
for an FPGA. I have tried to keep the categories fairly general
without going into too much detail. Comments (both positive and
negative) would be appreciated.
1. Pipelining
* A must in almost any FPGA design, and relatively cheap to do since
flip-flops are usually abundant. Pipelining is not FPGA-specific,
but an ASIC design usually doesn't have to be pipelined as heavily
as an FPGA design.
2. Utilizing FPGA resources efficiently
2.1 Change the design to get as much as possible out of each FPGA LUT.
* For example, a 32-bit adder/subtracter takes up the same amount
of space as a plain 32-bit adder. Sometimes you might have to
instantiate LUTs, flip-flops, carry chains, etc. manually to
do this.
2.2 Utilizing memories efficiently
* For example, if your design becomes more efficient by using
both ports of a block RAM, you should probably do so. The same goes
for distributed memories, shift registers, etc.
2.3 Utilizing DSP blocks efficiently
* Change the architecture of your design to take maximum advantage
of the DSP blocks. For example, a DSP processor with 4 accumulation
registers will not map very well to a Virtex-4 DSP48 block, which
has only one accumulation register. (Although this can be fixed by
using result forwarding and a register file outside the DSP48 block.)
2.4 Utilizing other embedded FPGA resources
* Embedded processors, serializers/deserializers, DLLs/PLLs, etc.
3. Manual floorplanning
* Either through RLOCs or graphical tools
4. Manual routing
* Not very common but can be a powerful tool to meet timing in extreme
situations.
5. Partial reconfiguration
* Not very common yet but has a potential to save a lot of area if
certain parts of a design are not needed all the time.
6. <Insert your comment here> :)
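
A note on item 2.1: the adder/subtracter example rests on the identity
A - B = A + (not B) + 1. The per-bit XOR of B with a "subtract" control
signal can usually share the LUT that already sits in front of the carry
chain, so no extra LUTs are needed. The Python model below is only a
sketch of that arithmetic; the function name add_sub and its parameters
are made up for illustration and say nothing about how a particular
vendor's tools map the logic.

```python
def add_sub(a: int, b: int, sub: bool, width: int = 32) -> int:
    """Behavioural model of a combined adder/subtracter.

    Computes a + b when sub is False and a - b when sub is True,
    both truncated to `width` bits (two's complement wrap-around).
    """
    mask = (1 << width) - 1
    # Per-bit XOR of b with the 'sub' control: this is the extra
    # function that is assumed to fit in the existing per-bit LUT.
    b_eff = (b ^ mask) if sub else (b & mask)
    # The carry-in of the chain doubles as the "+1" of the negation.
    carry_in = 1 if sub else 0
    return ((a & mask) + b_eff + carry_in) & mask


if __name__ == "__main__":
    print(add_sub(5, 3, sub=False))       # addition
    print(add_sub(5, 3, sub=True))        # subtraction
    print(hex(add_sub(3, 5, sub=True)))   # negative result wraps around
```

The same carry chain is used in both modes; only the XOR and the
carry-in change, which is why the combined unit can match the plain
adder in area on LUT-plus-carry-chain architectures.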
Have I forgotten something very important here?
And by the way, is there any definitive book one should read to learn
more about how to optimize a design for an FPGA? I have looked at many
FPGA books but most books only seem to cover fairly introductory
material.
I have looked at, for example, "Digital Signal Processing with Field
Programmable Gate Arrays (Signals and Communication Technology)", but
this book doesn't really have that much FPGA material. (Although it has
a lot of nice DSP material.)
Another book, "Advanced FPGA Design" by Steve Kilts, was a decent text for
intermediate designers, but I'm not sure I would have called it "Advanced".
So far I have probably gotten more out of some of the postings on this
newsgroup than out of most of the FPGA-related books I have looked at :)
/Andreas
Classifying different kinds of FPGA optimizations
Started by ●January 1, 2009
Reply by ●January 1, 2009
On 2009-01-01, valwn@silvtrc.org <valwn@silvtrc.org> wrote:
> Andreas Ehliar <ehliar-nospam@isy.liu.se> wrote:
>>I have tried to classify a few different ways to optimize a design
>>for an FPGA. I have tried to keep the categories fairly general
>>without going into too much detail. Comments (both positive and
>>negative) would be appreciated.
>
>>1 Pipelining
>
>>6. <Insert your comment here> :)
>
> Parallelisation.

True, although my intent (which I realize I did not describe in as
much detail as I should have) was to list optimizations that are
mostly FPGA-specific, in contrast to ASIC-specific. Parallelism can
be used to great effect in both FPGAs and ASICs.

Basically, I am trying to come up with a list of what to do to
optimize an FPGA design in contrast to an ASIC design. (Although I
realize that both floorplanning and manual routing can be done in an
ASIC as well. In fact, I would guess that more ASIC designs than FPGA
designs have some sort of floorplanning, although I don't have any
numbers to support this claim.)

/Andreas
Reply by ●January 1, 2009
Andreas Ehliar <ehliar-nospam@isy.liu.se> wrote:
>I have tried to classify a few different ways to optimize a design
>for an FPGA. I have tried to keep the categories fairly general
>without going into too much detail. Comments (both positive and
>negative) would be appreciated.

>1 Pipelining

>6. <Insert your comment here> :)

Parallelisation.
Reply by ●January 1, 2009
On Jan 1, 7:31 am, Andreas Ehliar <ehliar-nos...@isy.liu.se> wrote:
> I have tried to classify a few different ways to optimize a design
> for an FPGA. I have tried to keep the categories fairly general
> without going into too much detail. Comments (both positive and
> negative) would be appreciated.
> [...]
> Have I forgotten something very important here?

One thing you could add is adjusting your design to fully utilize the
FFs. Often more LUTs are needed than FFs, and some designs can be
changed to use more FFs. The only example I can think of is the
tradeoff between the fanout of a signal from a single FF and reducing
that fanout by using multiple copies of the FF, putting some of the
fanout before the FFs. This reduces delay because the total fanout is
the product of the individual fanouts, and the two smaller stages have
less total delay than the full fanout done all at once.

As long as you are listing resources to be fully utilized, you should
list clock distribution. Clock lines can be used for high-fanout
signals, and the tools often do this automatically, but sometimes you
need to instantiate the clock driver.

Those are my suggestions, but I don't actually get your post. The
entire section on using resources efficiently is just a list of the
resources saying "use efficiently". That is not really a technique, it
is a goal. Manual floorplanning and routing, to me, are things to be
avoided since they are brand specific (or even family specific), very
labor intensive, and make design rework very difficult. Partial
reconfiguration is not really a useful tool and is only supported by
one vendor, if you can call what they are doing with it "support". I
have tried to use modular reconfiguration (a subset of partial
reconfiguration) for a long time and never found it to be ready for
prime time, mainly because it is not supported in the affordable FPGA
families.

In fact, I read a paper once that talked about optimization as being
"evil". Optimization is something you do only when necessary, because
it takes a lot more time, the added complexity can create bugs, it can
make debugging more difficult, and it makes design changes and updates
more difficult. All in all, optimization increases the cost of the
product life cycle significantly.

Listing optimizations is of some value, but without details of how
they can be useful, it is still just a list. When, where and how do
you see dual-port memories being more efficient than a single-port
memory? How can DSP blocks be used in place of other logic? When does
an adder/subtracter improve on a simple adder? When do you take the
leap of using manual routing and floorplanning? What type of
application would efficiently utilize partial reconfiguration (or even
modular reconfiguration)?

My answer to that last one is a design I did that used an FPGA to
interface to daughter cards (DCs). There were a number of daughter
cards with different hardware and different interfaces. The design in
the FPGA needed to load one module for each DC. The initial module
would read a serial I/O pin to determine the DC type. Then the
appropriate interface module could be loaded for each DC as required.
Why was this important? Because around a dozen DC types were planned
and the main board had four DC slots. Do the math and you will see
that the permutations are ***HUGE***. In theory, modular configuration
would have reduced the design to a single module for the processor
interface and a module for each DC type. Since this was not really a
viable approach, instead of being able to supply a main board with
variable combinations of DCs along with a generic set of software and
FPGA modules, there were unique FPGA loads for each customer's DC
combination, which were no fun to maintain.

Maybe I am just sour on the promise of partial reconfiguration, but I
see very little use for it, not because it can't work, but because it
has never been treated like a real development tool by the vendors.

In general, I think that in volume production there is little
technical reason for using FPGAs versus other logic. The real reason
is just economic. FPGAs have gotten so cheap that many devices cost
more than the FPGAs that are replacing them, especially when one FPGA
replaces several of them in a design. Even the issue of firmware
updates (a common justification for using FPGAs vs ASICs) is mostly
useful for initial production and is often never used in a final
product, yet FPGAs are often used in place of ASICs. Technical issues
are just not that important in selecting FPGAs except for a minority
of designs.

In the old days I would be putting on my asbestos suit to get ready
for the vendor response. But that doesn't happen anymore. In some
ways, I miss the old days... at least it was never boring. :^)

Rick
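
The "do the math" step for the daughter-card design above can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions: "around a dozen" is taken as exactly 12 DC types,
all four slots are assumed to be populated, and the function names are
invented for this example.

```python
from math import comb

DC_TYPES = 12  # assumption: "around a dozen" taken as exactly 12
SLOTS = 4      # four daughter card slots on the main board


def monolithic_loads(types: int, slots: int, slot_order_matters: bool = True) -> int:
    """Number of distinct full-FPGA loads when every DC combination
    needs its own bitstream."""
    if slot_order_matters:
        # Each slot independently holds any one of the types.
        return types ** slots
    # Unordered selection with repetition (multiset coefficient).
    return comb(types + slots - 1, slots)


def modular_loads(types: int) -> int:
    """Modules needed with modular reconfiguration: one processor
    interface module plus one interface module per DC type."""
    return 1 + types


if __name__ == "__main__":
    print("monolithic, slot order matters:", monolithic_loads(DC_TYPES, SLOTS))
    print("monolithic, slot order ignored:", monolithic_loads(DC_TYPES, SLOTS, False))
    print("modular:", modular_loads(DC_TYPES))
```

Under these assumptions the count is 20736 distinct loads if slot
position matters and 1365 if it does not, against 13 modules for the
modular approach.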
Reply by ●January 2, 2009
"Andreas Ehliar" <ehliar-nospam@isy.liu.se> wrote in message news:slrnglpqn8.pk4.ehliar-nospam@sabor.isy.liu.se...
> 1 Pipelining
> * A must in almost any FPGA design. Relatively cheap to do since
> flip-flops are usually abundant. This is not very FPGA specific
> though, but usually an ASIC design doesn't have to be pipelined
> as much as an FPGA design is.

1.1 Use multicycle paths if results are not required every clock
cycle; this might save you some power as well.

> 2 Utilizing FPGA resources efficiently
> [...]
> 3. Manual floorplanning
> * Either through RLOCs or graphical tools

3.1 Use a physical synthesis tool to squeeze out that last ns.

> 4. Manual routing
> * Not very common but can be a powerful tool to meet timing in extreme
> situations.
> 5. Partial reconfiguration
> * Not very common yet but has a potential to save a lot of area if
> certain parts of a design are not needed all the time.
> 6. <Insert your comment here> :)

6. Investigate whether your critical path is in fact a false path.
7. Use a different synthesis tool :-)

Hans
www.ht-lab.com
Reply by ●January 5, 2009
Andreas Ehliar wrote:
> 2 Utilizing FPGA resources efficiently
> [...]
> 2.4 Utilizing other embedded FPGA resources
> * Embedded processors, serializers/deserializers, DLLs/PLLs, etc

Avoid multiplexing signals. Multiplexers incur considerable delay and
resource use.
Reply by ●January 5, 2009
Are you writing a paper on this or what?

Whether or not various optimizations are also effective in ASICs is
moot. The value in identifying them is in identifying the situations
in which these optimizations may help. Unless this is a very
introductory paper targeted at experienced ASIC designers who want to
design FPGAs (and haven't already done so for ASIC prototyping),
limiting the scope to optimizations that are unique to FPGAs is
pointless.

That said, I would add register re-timing, or redistribution of logic
between registers (often in pipeline stages) to equalize
register-to-register delays, thereby reducing the maximum delay and
increasing the potential clock rate. This is not an FPGA-only
optimization, but the performance limitations of FPGAs may make it
more applicable to them than to ASICs. The technology was developed
first for ASIC synthesis.

Another issue with porting ASIC design(er)s to FPGA is that the use of
creative clock tree generation to fix small timing problems is usually
not an option in FPGAs.

Small memories are often easier to use in FPGAs because most FPGA
synthesis tools will infer memory elements from array descriptions, as
long as certain behavioral aspects are satisfied (e.g. avoiding
accessing more than one or two array locations in the same clock
cycle). Most ASIC synthesis tools do not support memory inference,
which means you have to separately build and instantiate memories,
which often makes it harder to integrate the memory's functionality
into the design.

Andy
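
The effect of re-timing described above can be illustrated with a
small, idealized Python model. It assumes that the clock period is set
only by the slowest register-to-register stage and that logic delay can
be split perfectly evenly between stages; real tools can only move
discrete logic elements and must also account for routing, setup and
clock-to-output times. The function names and the delay figures are
hypothetical.

```python
def fmax_mhz(stage_delays_ns):
    """Maximum clock rate in MHz, limited by the slowest stage."""
    return 1000.0 / max(stage_delays_ns)


def retime_ideal(stage_delays_ns):
    """Idealized re-timing: spread the total logic delay evenly over
    the same number of pipeline stages."""
    total = sum(stage_delays_ns)
    stages = len(stage_delays_ns)
    return [total / stages] * stages


if __name__ == "__main__":
    before = [2.0, 6.0, 4.0]          # hypothetical stage delays in ns
    after = retime_ideal(before)
    print("before:", before, "->", round(fmax_mhz(before), 1), "MHz")
    print("after: ", after, "->", round(fmax_mhz(after), 1), "MHz")
```

The total logic delay and the pipeline depth are unchanged; only the
worst stage shrinks, which is where the clock rate improvement comes
from.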
Reply by ●January 6, 2009
On 2009-01-05, Andy <jonesandy@comcast.net> wrote:
> Are you writing a paper on this or what?

Good guess. In fact, quite a few of the papers I've been involved with
have mentioned "FPGA optimizations" somewhere without truly defining
the term. Many other papers I have read also say that a design is (or
sometimes isn't) optimized for FPGAs. Besides, we talk about
optimizing designs for FPGAs here all the time.

I know what I'm doing to optimize my designs for FPGAs (as opposed to
optimizing a design for an ASIC or just plainly optimizing a design
for a generic architecture), and I have seen many interesting ideas
on this group, for example. Even so, I have never really seen a good
definition of "FPGA optimized".

> Small memories are often easier to use in FPGAs because most FPGA

How well I know... :) On the other hand, relatively small memories
such as register files (and I'm not thinking about the Cell
processor's 128x128-bit register file here :)) can be implemented
using standard cells without losing that much area. (And unless your
volumes are huge, you might even gain by not having to worry about the
extra verification cost of a design with many custom blocks in it.)

Anyway, thanks for the comments.

/Andreas
Reply by ●January 7, 2009
On Jan 6, 3:34 pm, Andreas Ehliar <ehliar-nos...@isy.liu.se> wrote:
> [...]
> Even so, I have never really seen a good
> definition of "FPGA optimized".
> [...]
> Anyway, thanks for the comments.
>
> /Andreas

Hey,
I have one more point:
7. Writing the constraint file in a logical and intelligent way.
I believe constraints play a major role in design optimization
and ultimately improve the quality of results.
Reply by ●January 7, 2009
On Jan 7, 12:51 am, nav_tiw...@rediffmail.com wrote:
> Hey
> I have one more point
> 7. Writing the Constraint File in a very logical and intelligent way.
> I believe Constraints play a major role in the design Optimizations
> and finally improves the Quality of result.

I agree, and I also feel that vendors have largely ignored this aspect
of FPGA design. They provide software to meet your timing constraints,
but they do nothing to help you set them up. If you have a design with
complex timing constraints, there is no way to verify that your
constraints are interpreted the way you *think* they are.

So a corollary to this is to choose a vendor who helps you verify that
your timing constraints are correct, and not just that the software
met constraints that may not be correct.

Rick






