comp.arch.fpga | Lattice Announces EOL for XP and EC/P Product Lines| page 6

Reply by ●August 25, 2013

On Sunday, August 25, 2013 1:51:57 AM UTC+3, rickman wrote:
> On 8/24/2013 1:14 PM, Gabor wrote:
> 
> > On 8/24/2013 12:37 PM, rickman wrote:
> 
> >>
> >> One of my potential solutions is to use a similar, but smaller part from
> >> the XO2 line (assuming I stay with Lattice after this) and roll my own
> >> processor to handle the slow logic. It will be a lot more work than
> >> just porting the HDL to a new chip though. I think rolling my own will
> >> likely produce a smaller and more efficient design. The 32 bit designs
> >> wouldn't even fit on the chip. I'm not sure how large the 8 bit designs
> >> are in LUTs. I'm thinking a 16 or 18 bit CPU would be a good data size.
> >> Or if the LUTs are really tight, a 4 or 5 bit processor might save on
> >> real estate. Maybe 8 or 9 bits is a good compromise depending on the
> >> LUT count.
> >>
> 
> > I would think the Lattice mico8 would be pretty small, and it should be
> > easy enough to try that out. Rolling your own would be more fun,
> > though...
> 
> 
> 
> I've already done the roll your own thing.  Stack processors can be 
> pretty small and they are what I prefer to use, but a register machine 
> is ok too.  I don't know how big the Mico8 is in terms of LUTs.  My 
> last design was 16 bits and about 600 LUTs and that included some 
> flotsam that isn't required in a final design.  By contrast the 
> microBlaze I know is multiple kLUTs, but that is a 32 bit design.  The 
> Xilinx picoBlaze 8 bit design is rather small (I don't recall the 
> number) but it instantiates LUTs and FFs and so is not portable.  There 
> may be a home grown version of the picoBlaze.
> -- 
> 
> Rick

I just measured Altera Nios2e on Stratix3 - 379 ALMs + 2 M9K blocks (out of 18K memory bits only 2K bits used). It's hard to translate exactly into old-fashioned LUTs, but I'd say - around 700.
Per clock Nios2e is pretty slow, but it clocks rather high and it is a 32-bit CPU - very easy to program in C.

Reimplementing Nios2 in a minimal number of LUTs, e.g. trading memory for fabric, could be an interesting exercise, well suited to a coding competition. But, probably, illegal :(

Reply by glen herrmannsfeldt ●August 25, 2013

rickman <gnuarm@gmail.com> wrote:

(snip)
> I used to rail against the FPGA vendor's decisions in packaging.  I find 
> it very inconvenient at a minimum.  But these days I have learned to do 
> the Zen thing and reabsorb my dissatisfaction so as to turn it to an 
> advantage.  I'm not sure what that means exactly, but I've given up 
> trying to think of FPGAs as an MCU substitute.
 
> I suppose the markets are different enough that FPGAs just can't be 
> produced economically in as wide a range of packaging.  I think that is 
> what Austin Lesea used to say, that it just costs too much to provide 
> the parts in a lot of packages.  Their real market is at the high end. 
> Like the big CPU makers, it is all about raising the ASP, Average 
> Selling Price.  Which means they don't spend a lot of time wooing the 
> small part users like us.

In many cases they would put one in a package with fewer pins
than pads, a waste of good I/O drivers, but maybe useful for
some people. 

I don't know how much it costs just to support an additional
package, though.

Also, a big problem with FPGAs, and ICs in general, is a low enough
lead inductance. Many packages that would otherwise be useful have
too much inductance.

-- glen

Reply by ●August 26, 2013

On Saturday, August 24, 2013 11:46:59 AM UTC-5, rickman wrote:
> Two comments. He is describing two packages with the same body
> size and so the caps would be the same distance from the chip.
> But also, when you use power and ground planes with effective
> coupling, the distance of the cap from the chip is nearly moot.
> The power planes act as a transmission line providing the
> current until the wave reaches the capacitor. Transmission
> lines are your friend. -- Rick

Two more comments...

The problem with leaded packages, especially compared to flip-chip packages, is the electrical distance (and characteristic impedance) from the lead/board joint to the die pad, particularly for power/ground connections. The substrate for a CSP looks like a mini-circuit board with its own power/ground planes.

Sure you can put the cap on the board close to the power/ground lead, but you cannot get it electrically as close to the die pad as you can with a flip-chip package.

Transmission lines for power connections are not your friend, unless they are of very low characteristic impedance at the high frequencies of interest (e.g. transition times on your fast outputs, etc.) Until the wave traverses the length of the transmission line, you are effectively supplying current through a resistor with the same value as the transmission line impedance.

What power planes do is provide "very low impedance transmission lines" for the power/ground connections, and the ability to connect an appropriately packaged capacitor to the end of that line with very low inductance.

If your design is slow (output edge rates, not clock rates) or has few simultaneously switching outputs, it won't matter which package you use.

Andy
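
A rough way to put numbers on the "current through a resistor" point above is sketched below in Python. It treats the plane pair as a one-dimensional line of impedance Z0, which is a simplification (real planes spread radially), and every value used (step current, impedances, capacitor distance, dielectric constant) is a hypothetical assumption rather than a figure from this thread.

```python
# Sketch only: all numeric inputs are illustrative assumptions.

C_MM_PER_NS = 299.792458  # speed of light in vacuum, in mm per ns


def droop_volts(delta_i_amps, z0_ohms):
    """Voltage drop at the load while only the line itself supplies the step.

    Until the wave has reached the capacitor and returned, the load sees
    the line as a resistor equal to its characteristic impedance.
    """
    return delta_i_amps * z0_ohms


def round_trip_ns(distance_mm, rel_permittivity=4.0):
    """Time for the disturbance to reach the capacitor and come back."""
    velocity = C_MM_PER_NS / rel_permittivity ** 0.5
    return 2.0 * distance_mm / velocity


if __name__ == "__main__":
    step = 0.5  # amps, assumed current step from switching outputs
    for label, z0 in (("narrow trace", 50.0), ("tight plane pair", 0.1)):
        print(f"{label}: {droop_volts(step, z0):.3f} V droop")
    print(f"cap 15 mm away responds after ~{round_trip_ns(15.0):.2f} ns")
```

With these assumed values the same current step costs volts through a 50 ohm trace and only tens of millivolts through a sub-ohm plane pair, which is the distinction the post draws between a transmission line in general and a very low impedance one.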

Reply by jg ●August 26, 2013

On Saturday, August 24, 2013 1:06:37 PM UTC+12, rickman wrote:
> 
> 
> I suppose the markets are different enough that FPGAs just can't be 
> produced economically in as wide a range of packaging.  I think that is 
> what Austin Lesea used to say, that it just costs too much to provide 
> the parts in a lot of packages.  Their real market is at the high end. 
> Like the big CPU makers, it is all about raising the ASP, Average 
> Selling Price.  

I was meaning more at the lower end, e.g. where Lattice can offer parts in QFN32, then take a large jump to QFP100 0.5mm.
QFN32 has only 21 I/O, so you easily exceed that, but there is a very large gap between the QFN32 and TQFP100.
 They claim to be chasing the lower cost markets with these parts, but seem rather blinkered in doing so.
 Altera has well-priced parts in gull wing (MAX V), but only in a 0.4mm pitch.

> As long as we are wishing for stuff, I'd really love to see a smallish 
> MCU mated to a smallish FPGA.  

If you push that 'smallish', the Cypress PSoC series have uC+logic.

The newest PSoC4 seems to have solved some of the sticker shock, but I think they crippled the logic to achieve that.
 Seems there is no free lunch.

Cypress do however, grasp the package issue, and offer QFN40(0.5mm), as well as SSOP28(0.65mm)  and TQFP44(0.8mm).

Reply by jg ●August 26, 2013

On Saturday, August 24, 2013 10:02:23 PM UTC+12, Uwe Bonnes wrote:
> jg  wrote:
> > This allows higher yield PCB design rules.
>
> But that gets the decoupling caps farther away from the chip and so worsens
> simultaneous switching.

As Rick has already pointed out, the package is the same size, so the die-cap pathway via the leadframe is no different. 

In fact, because the leads are wider, the inductance is actually less on the coarser pitch package. (not by much, but it is lower)

Also, someone wanting a lower pin count logic package is less likely to be pushing 'simultaneous switching'.

-jg

Reply by jg ●August 26, 2013

On Sunday, August 25, 2013 12:35:56 PM UTC+12, rickman wrote:
> Actually, I'm fine 
> with the assembly language as long as it's a stack machine.  Very simple 
> instructions and very simple to implement.

 This tolerance of a subset opens an interesting design approach, whereby you make opcodes granular, and somehow list the ones used, to allow the tools to remove the unused ones.

 I think I saw an NXP design some time ago, along these lines of removing unused opcode logic.

 That means you can start with a more powerful core (hopefully more proven)
and then limit your SW to an opcode-subset.

-jg
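
As a minimal sketch of the "list the opcodes used" step, the Python below scans an assembly listing and reports which opcodes a build could keep and which it could drop. The opcode set, the listing syntax (`;` comments, labels on their own line) and the function names are all hypothetical, invented for illustration; they do not describe any real core or vendor tool.

```python
# Hypothetical opcode set for a small stack machine (an assumption,
# not taken from any real core).
ALL_OPCODES = {
    "PUSH", "POP", "ADD", "SUB", "AND", "OR", "XOR", "DUP",
    "SWAP", "JMP", "JZ", "CALL", "RET", "LOAD", "STORE",
}


def used_opcodes(listing):
    """Return the set of mnemonics that appear in an assembly listing."""
    used = set()
    for line in listing.splitlines():
        code = line.split(";", 1)[0].strip()  # strip trailing comment
        if not code or code.endswith(":"):    # skip blanks and labels
            continue
        mnemonic = code.split()[0].upper()
        if mnemonic not in ALL_OPCODES:
            raise ValueError(f"unknown opcode: {mnemonic}")
        used.add(mnemonic)
    return used


def prune_report(listing):
    """Return (opcodes to keep, opcodes whose logic could be removed)."""
    used = used_opcodes(listing)
    return sorted(used), sorted(ALL_OPCODES - used)


PROGRAM = """
start:
    PUSH 1      ; counter
    DUP
    ADD
    JZ start
    JMP start
"""

if __name__ == "__main__":
    keep, drop = prune_report(PROGRAM)
    print("keep:", keep)
    print("drop:", drop)
```

The "drop" list could then drive per-opcode enable parameters in the HDL, so that synthesis removes the decode and datapath logic for instructions the software never issues.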

Reply by rickman ●August 28, 2013

On 8/25/2013 12:44 PM, already5chosen@yahoo.com wrote:
>
> I just measured Altera Nios2e on Stratix3 - 379 ALMs + 2 M9K blocks (out of 18K memory bits only 2K bits used). It's hard to translate exactly into old-fashioned LUTs, but I'd say - around 700.
> Per clock Nios2e is pretty slow, but it clocks rather high and it is a 32-bit CPU - very easy to program in C.

I can't say I fully understand the ALM, but I think it functions as a 
lot more than just a pair of 4 input LUTs.  It will do that without any 
issue.  But it will do a lot more and I expect this is used to great 
advantage in a CPU.  I'd say the ALM is equivalent to between 3 and 4 
LUT4s depending on the design.  I guess it is hard to compare between 
different device types.

> Reimplementing Nios2 in a minimal number of LUTs, e.g. trading memory for fabric, could be an interesting exercise, well suited to a coding competition. But, probably, illegal :(

Yes, there are always lots of  tradeoffs to be considered.

-- 

Rick

Reply by rickman ●August 28, 2013

On 8/25/2013 9:35 PM, glen herrmannsfeldt wrote:
> rickman<gnuarm@gmail.com>  wrote:
>
> (snip)
>> I used to rail against the FPGA vendor's decisions in packaging.  I find
>> it very inconvenient at a minimum.  But these days I have learned to do
>> the Zen thing and reabsorb my dissatisfaction so as to turn it to an
>> advantage.  I'm not sure what that means exactly, but I've given up
>> trying to think of FPGAs as an MCU substitute.
>
>> I suppose the markets are different enough that FPGAs just can't be
>> produced economically in as wide a range of packaging.  I think that is
>> what Austin Lesea used to say, that it just costs too much to provide
>> the parts in a lot of packages.  Their real market is at the high end.
>> Like the big CPU makers, it is all about raising the ASP, Average
>> Selling Price.  Which means they don't spend a lot of time wooing the
>> small part users like us.
>
> In many cases they would put one in a package with fewer pins
> than pads, a waste of good I/O drivers, but maybe useful for
> some people.

They often put parts in packages with a lot fewer pins than pads.  Check 
out nearly any data sheet and you'll see what I mean.  The Cyclone V 
book shows parts with I/O pin counts varying between 240 and 480 
depending on the package.

> I don't know how much it costs just to support an additional
> package, though.

Austin Lesea from Xilinx claimed it was prohibitive, at least for the 
low end devices.

> Also, a big problem with FPGAs, and ICs in general, is a low enough
> lead inductance. Many packages that would otherwise be useful have
> too much inductance.

That depends entirely on your design.  All the devices I have used have 
two or more levels of I/O pin drive current, which will help 
prevent issues from lead inductance.  Even so, there are plenty of small 
packages with no leads, which greatly reduces the inductance.

-- 

Rick

Reply by rickman ●August 28, 2013

On 8/26/2013 6:58 PM, jonesandy@comcast.net wrote:
> On Saturday, August 24, 2013 11:46:59 AM UTC-5, rickman wrote:
>> Two comments. He is describing two packages with the same body
>> size and so the caps would be the same distance from the chip.
>> But also, when you use power and ground planes with effective
>> coupling, the distance of the cap from the chip is nearly moot.
>> The power planes act as a transmission line providing the
>> current until the wave reaches the capacitor. Transmission
>> lines are your friend. -- Rick
>
> Two more comments...
>
> The problem with leaded packages, especially compared to flip-chip packages, is the electrical distance (and characteristic impedance) from the lead/board joint to the die pad, particularly for power/ground connections. The substrate for a CSP looks like a mini-circuit board with its own power/ground planes.
>
> Sure you can put the cap on the board close to the power/ground lead, but you cannot get it electrically as close to the die pad as you can with a flip-chip package.
>
> Transmission lines for power connections are not your friend, unless they are of very low characteristic impedance at the high frequencies of interest (e.g. transition times on your fast outputs, etc.) Until the wave traverses the length of the transmission line, you are effectively supplying current through a resistor with the same value as the transmission line impedance.
>
> What power planes do is provide "very low impedance transmission lines" for the power/ground connections, and the ability to connect an appropriately packaged capacitor to the end of that line with very low inductance.
>
> If your design is slow (output edge rates, not clock rates) or has few simultaneously switching outputs, it won't matter which package you use.

My experience has been that for 95% of the designs done in FPGAs, 
especially at the low end, this is just not an issue.  FPGAs typically 
have selectable drive strength which limits the issues of ground bounce 
in less than electrically desirable packages.  If you have fast edges 
and lots of them, then use one of the more suitable packages.  But if 
you don't, then use the package that otherwise suits the design.

-- 

Rick
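
The drive-strength argument above can be made concrete with the usual first-order estimate V = N * L * di/dt for N outputs switching together through one shared ground lead. The Python sketch below is only that estimate; the lead inductance, drive currents and edge times are hypothetical values chosen for illustration and are not from any data sheet.

```python
# First-order ground bounce estimate. All inputs are assumed values.

def ground_bounce_volts(n_outputs, lead_inductance_nh, delta_i_ma, edge_time_ns):
    """V = N * L * di/dt for outputs sharing a single ground lead."""
    di_dt = (delta_i_ma * 1e-3) / (edge_time_ns * 1e-9)  # amps per second
    return n_outputs * (lead_inductance_nh * 1e-9) * di_dt


if __name__ == "__main__":
    # Same package and output count, two assumed drive settings.
    strong = ground_bounce_volts(8, 5.0, delta_i_ma=8.0, edge_time_ns=1.0)
    weak = ground_bounce_volts(8, 5.0, delta_i_ma=4.0, edge_time_ns=2.0)
    print(f"high drive: {strong:.2f} V")
    print(f"low drive:  {weak:.2f} V")
```

Halving the drive current and doubling the edge time cuts the estimate by a factor of four in this model, which is why a lower drive setting can make a leaded package acceptable when the design does not need fast edges.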

Reply by ●August 28, 2013

On Wednesday, August 28, 2013 11:51:36 AM UTC+3, rickman wrote:
> On 8/25/2013 12:44 PM, already5chosen@yahoo.com wrote:
> 
> >
> 
> > I just measured Altera Nios2e on Stratix3 - 379 ALMs + 2 M9K blocks (out of 18K memory bits only 2K bits used). It's hard to translate exactly into old-fashioned LUTs, but I'd say - around 700.
> 
> > Per clock Nios2e is pretty slow, but it clocks rather high and it is a 32-bit CPU - very easy to program in C.
> 
> 
> 
> I can't say I fully understand the ALM, but I think it functions as a 
> lot more than just a pair of 4 input LUTs.  It will do that without any 
> issue.  But it will do a lot more and I expect this is used to great  
> advantage in a CPU.  I'd say the ALM is equivalent to between 3 and 4 
> LUT4s depending on the design.  I guess it is hard to compare between 
> different device types.
> 

No, an ALM is close to two 4-input LUTs. Maybe a bit more when implementing complex, tightly coupled logic with a high ratio of internal complexity to fanout. Maybe a bit less when implementing simple things with lots of registers and high fanout.

For the sake of argument, I compiled Nios2e for Cyclone4, which has a more old-fashioned architecture: 676 LCs + 2 M9Ks.
I also dug out my real-world design from many years ago that embeds Nios2e into Cyclone2. It is even smaller at 565 LCs + 2 M4Ks. 
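
For readers who want the implied conversion factor, the short Python sketch below divides the LC counts quoted above by the 379 ALMs measured on Stratix3. The three resource counts come from this thread; treating them as directly comparable is an assumption, since the builds differ in device family and tool version, and an LC count also includes cells used only for their register.

```python
# Resource counts quoted in the thread for the Nios2e core.
ALMS_STRATIX3 = 379
LCS_BY_FAMILY = {
    "Cyclone4": 676,
    "Cyclone2": 565,
}


def lcs_per_alm(lc_count, alm_count=ALMS_STRATIX3):
    """Implied number of LUT4-based logic cells per ALM for the same core."""
    return lc_count / alm_count


if __name__ == "__main__":
    for family, lcs in LCS_BY_FAMILY.items():
        print(f"{family}: {lcs_per_alm(lcs):.2f} LCs per ALM")
    # The earlier estimate in the thread of 3 to 4 LUT4s per ALM would
    # instead predict this many LUT4s for the same 379 ALMs:
    print("3 to 4 LUT4s per ALM:", 3 * ALMS_STRATIX3, "to", 4 * ALMS_STRATIX3)
```

Both ratios fall below 2, which fits the "close to two 4-input LUTs" estimate better than the 3 to 4 figure.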
 

> 
> 
> 
> 
> > Reimplementing Nios2 in a minimal number of LUTs, e.g. trading memory for fabric, could be an interesting exercise, well suited to a coding competition. But, probably, illegal :(
> 
> Yes, there are always lots of  tradeoffs to be considered.
> 

My point is: if you don't need performance and can use embedded memories, then you can design a useful 32-bit RISC CPU that would be non-trivially smaller than 600 LCs.
The Nios2e core that I took as an example is small, but hardly minimalistic. It implements the full Nios2 architecture, including several parts that you probably don't need. In particular:
- everything related to interrupts and exceptions
- support for a big program address space
- the ability to execute programs from any memory other than on-chip SRAM
