FPGARelated.com
Forums

ARM + FPGA CPU Module running Yocto Linux?

Started by A.P.Richelieu January 30, 2019
A.P.Richelieu <aprichelieu@gmail.com> wrote:
> You are trying to convince me to look at Zynq and SoC.
> That is what I explicitly said I was not going to do.
No, I'm pointing out that your argument on costs doesn't necessarily stack up.

The reason why the options are so constrained, and why this doesn't exist as a popular product, is that not many Linux-capable CPUs have an external bus interface or a high-bandwidth GPIO interface. Basically you're stuck with PCIe (which ups the FPGA cost a lot) or things with SPI to try and squeeze enough bandwidth out.

Things like the OMAP PRUs might do it, but I'm not sure what useful bandwidth you can get at the end of the day (since there's no help with the wire protocol, you have to do it all in software).

That leaves the options as roughly:

- Zynq/Intel SoC parts (on-chip FPGA)
- some Microsemi parts with a hard Cortex-M (not Linux capable)
- OMAP PRU
- I think I saw a single i.MX part with an external bus interface, but it was slow
- an FPGA with a soft core running Linux (MicroBlaze, NIOS-II, RISC-V of some kind). These have a myriad of sharp edges, as the core/kernel/drivers/compiler/distro is often not very polished
- PCIe
- a few parts (e.g. Cavium ThunderX) which expose the cache coherency protocol externally. You'd be very much on your own here.

Another horrible idea: write a NAND flash interface for the FPGA and use that to emulate an external bus interface. You'd have to disentangle whatever cleverness the CPU's NAND controller tries to do, but in principle the bandwidth is there.

Basically you've boxed yourself into a corner here, so all these options are not very appealing.

Theo
On Wednesday, January 30, 2019 at 1:21:30 PM UTC-5, A.P.Richelieu wrote:
> On 2019-01-30 at 18:44, lasselangwadtchristensen@gmail.com wrote:
> > On Wednesday, January 30, 2019 at 18:13:34 UTC+1, A.P.Richelieu wrote:
> >> Is there any ARM + FPGA CPU Module running Linux using any of:
> >>
> >> * NXP i.MX6/7/...
> >> * Texas Instruments Sitara AM335x or better
> >> * Microchip SAMA5
> >> * Renesas RZ/xxx
> >>
> >> It needs to be connected to a low-price FPGA, Intel or Xilinx.
> >>
> >> * Zynq or Intel SoC solutions need not apply.
> >>
> >> Other vendors will be difficult to accept.
> >>
> >> =====================
> >> The CPU Module needs at least
> >> * 128 MB RAM
> >> * 128 MB Flash.
> >> The connector will have
> >> * 100 Mbps Ethernet
> >> * 12 x 10 Mbps SPI channels (most will be implemented in the FPGA)
> >> * 5 x 921,600 baud serial ports (some in FPGA perhaps)
> >> * SD-Card
> >> * A few custom-protocol LVDS channels
> >> =====================
> >> The processor has to be connected to the FPGA via a suitable
> >> interface providing a 5-10 MB/second transfer rate.
> >> The FPGA needs to have 80-100 free I/O, not including the
> >> interface to the CPU, to implement SPIs, UARTs and other custom signals.
> >> =====================
> >> The CPU should be able to load the FPGA after reset,
> >> preferably right after loading U-Boot (during the BOOTDELAY timer).
> >> =====================
> >> Preferably, the processor should be able to access the internals
> >> of the FPGA as if it were on the memory bus.
> >>
> >> Putting the FPGA on a 16-bit memory interface will work.
> >>
> >> Some chips support a transparent mode where you do a memory read/write
> >> which gets translated to a Quad SPI access, or a NAND flash controller
> >> access.
> >>
> >> I.e. you can write to a register over SPI by:
> >>
> >>     FPGA_REGISTER = value;
> >>
> >> instead of
> >>
> >>     spi_packet = {
> >>         .cmd  = SPI_WRITE,
> >>         .addr = FPGA_REGISTER,
> >>         .size = sizeof(value),
> >>         .data = &value
> >>     };
> >>     spi_transfer(&spi_packet);
> >>
> >> We plan to use Yocto for developing Linux, so any Yocto solution
> >> would be appreciated.
> >>
> >> Looking forward to ideas.
> >>
> >> AP
> >
> > why not Zynq? it has everything you ask for and the same ARM-9 as the NXP
>
> Because it is way too expensive.
> You can get a better ARM chip for $6-7 in 1k qty.
> A Cyclone 10 FPGA is $8-9.
> Can you get a Zynq for $14-16 in 1k volume?
> Digikey shows one-off pricing for the cheapest Zynq to be $46.
> If they can give a 40% discount at 1k, it is still $30 = 2x the price.
>
> Another thing is that the onboard peripherals generally suck.
> At least when I looked at them the last time.
> I do not care to waste my time on why.
>
> This means that we have to spend time doing peripherals in the FPGA.
> They need to be supported by Linux drivers.
> We do not want to add that development effort.
>
> AP
If you're purchasing in 1000+ qty annually, you should NOT be using Digikey for pricing. That should be a negotiation with your Avnet rep. In higher volumes, I've seen pretty significant prices negotiated for our customers. You mentioned in a more recent response that you use Zynqs in other products, so you might consider trying to design in common parts to increase your total corporate purchase qty of the same part and help negotiate better prices.
On Thursday, January 31, 2019 at 6:25:32 AM UTC-5, Theo wrote:
> A.P.Richelieu <aprichelieu@gmail.com> wrote:
> > You are trying to convince me to look at Zynq and SoC.
> > That is what I explicitly said I was not going to do.
>
> No, I'm pointing out that your argument on costs doesn't necessarily stack
> up.
>
> [snip]
>
> Basically you've boxed yourself into a corner here, so all these options are
> not very appealing.
I was thinking about his bandwidth requirement. While you say there aren't many ARMs running Linux with external memory interfaces (which makes me wonder how they build all those Beagle Bones, etc.), wouldn't an Ethernet interface at 100 Mbps do the job? Ok, I guess you'd need two since the OP wants one for other use. Are there any ARM CPUs with TWO Ethernet interfaces?

Rick C.

--
Get 6 months of free supercharging -- Tesla referral code - https://ts.la/richard11209
gnuarm.deletethisbit@gmail.com wrote:
> I was thinking about his bandwidth requirement. While you say there
> aren't many ARMs running Linux with external memory interfaces (which
> makes me wonder how they build all those Beagle Bones, etc.)
There are external DDR3 / NAND flash / (e)MMC / QSPI interfaces, but nothing that looks like a regular bus. You can pretend to be a flash chip, but it isn't very pleasant.

An SoC with a NOR flash interface would be easiest, but I haven't seen one of those for a while. Not that I've been looking for one, I admit.
> wouldn't an Ethernet interface at 100 Mbps do the job? Ok, I guess you'd
> need two since the OP wants one for other use. Are there any ARM CPUs
> with TWO Ethernet interfaces?
Yes, this is something we do - use point-to-point (switchless) ethernet as essentially a bit-pipe, with the MAC at each end doing minimal framing.

It would, I suppose, be plausible to implement a minimal ethernet switch in the FPGA: the FPGA has one ethernet MAC/PHY hardwired to the ARM SoC, another on the output, and a piece of FPGA logic pulls off packets to/from a magic MAC address that are going to the internal logic.

An existing board that does this is Microsoft's Catapult - FPGA logic that sits on PCIe and interposes between the in-box 40G NIC and the rack switch. Of a completely different league, of course, and you can't buy the boards.

I don't have latency numbers for our 10G ethernet approach, but it might be OK if you have enough space for buffering.

I can't think of an off-the-shelf board wired in this configuration, though.

Theo
On Thursday, January 31, 2019 at 17:47:23 UTC+1, gnuarm.del...@gmail.com wrote:
> On Thursday, January 31, 2019 at 6:25:32 AM UTC-5, Theo wrote:
>
> [snip]
>
> I was thinking about his bandwidth requirement. While you say there
> aren't many ARMs running Linux with external memory interfaces (which
> makes me wonder how they build all those Beagle Bones, etc.)
They "all" have an external memory interface, but it is dedicated to DDR RAM. Some of them, like the BeagleBone's AM335x, also have a general-purpose memory interface controller for things like sync and async RAM and flash; that's the one you'd want to use for an FPGA, and some boards do: https://elinux.org/BeagleBoard/BeagleWire
Theo <theom+news@chiark.greenend.org.uk> writes:

> gnuarm.deletethisbit@gmail.com wrote:
>> I was thinking about his bandwidth requirement. While you say there
>> aren't many ARMs running Linux with external memory interfaces (which
>> makes me wonder how they build all those Beagle Bones, etc.)
>
> There are external DDR3 / NAND flash / (E)MMC / QSPI interfaces, but nothing
> that looks like a regular bus. You can pretend to be a flash chip, but it
> isn't very pleasant.
>
> An SoC with a NOR flash interface would be easiest, but I haven't seen one
> of those for a while. Not that I've been looking for one, I admit.
I did find one, though I only checked the parts AP listed. It turns out Atmel (now Microchip) has ARM processors with SRAM-like external memory interfaces in the SAMA5 family. I don't know if it's fast or easy to use from a software point of view.

A Belgian company has apparently designed a router based on one of these,
http://dab-embedded.com/en/cases/openwrt-atmel-sama5d3-and-max10-fpga-board/

It has an FPGA (Intel's Max 10) connected to the ARM via this external memory interface.
On 2019-01-31 at 12:25, Theo wrote:
> A.P.Richelieu <aprichelieu@gmail.com> wrote:
>> You are trying to convince me to look at Zynq and SoC.
>> That is what I explicitly said I was not going to do.
>
> No, I'm pointing out that your argument on costs doesn't necessarily stack
> up.
>
> [snip]
>
> Another horrible idea: write a NAND flash interface for the FPGA and use
> that to emulate an external bus interface. You'd have to disentangle
> whatever cleverness the CPU's NAND controller tries to do, but in principle
> the bandwidth is there.
I need 5-10 MByte per second, which I do not see as a problem if I design a module myself. I have found three acceptable ways of interfacing the FPGA:

1. A separate 8/16-bit memory bus
2. A NAND flash interface
3. A memory-mapped QSPI interface: you write to the address range, and the hardware generates an SPI read or write access automatically.

There are several parts which have both a memory bus and a secondary bus, and that is useful. That includes the AM335x, AM437x, AM65x, SAMA5, Renesas RZ/xxx and some NXP i.MX parts. The AM parts before the AM65xx have deficiencies we do not like, and the AM65xx is only just sampling, so it might be too late.

I am pretty sure, however, that I can do the job with a simple ARM9 and a NAND flash interface to the FPGA. So there are plenty of options.
> Basically you've boxed yourself into a corner here, so all these options are
> not very appealing.
There are no corners. Any CPU from the list above combined with an FPGA is probably usable (we would prefer something other than Renesas, though).

AP
On 2019-01-31 at 22:47, Theo wrote:
> gnuarm.deletethisbit@gmail.com wrote:
>> I was thinking about his bandwidth requirement. While you say there
>> aren't many ARMs running Linux with external memory interfaces (which
>> makes me wonder how they build all those Beagle Bones, etc.)
>
> There are external DDR3 / NAND flash / (E)MMC / QSPI interfaces, but nothing
> that looks like a regular bus. You can pretend to be a flash chip, but it
> isn't very pleasant.
>
> An SoC with a NOR flash interface would be easiest, but I haven't seen one
> of those for a while. Not that I've been looking for one, I admit.
The AM335x on the BeagleBone has two buses: one for DDR3 memory, and a second bus which can be either
* a normal non-multiplexed bus, A[2x..0], D[15..0]
* a multiplexed bus, A[2x..16], AD[15..0]
* a double-multiplexed bus, AAD[15..0], where the address is sent over two cycles.

Unfortunately, one of the write strobes conflicts with a peripheral I need, so I can only do word access, not byte access, to the FPGA.
>> wouldn't an Ethernet interface at 100 Mbps do the job? Ok, I guess you'd
>> need two since the OP wants one for other use. Are there any ARM CPUs
>> with TWO Ethernet interfaces?
Plenty of them, but we are not going to run Ethernet to the FPGA... and I doubt an off-the-shelf CPU module would be wired that way, either.
> Yes, this is something we do - use point-to-point (switchless) ethernet as
> essentially a bit-pipe, with the MAC at each end doing minimal framing.
>
> [snip]
>
> Theo
AP
On 2019-02-01 at 14:18, Anssi Saari wrote:
> Theo <theom+news@chiark.greenend.org.uk> writes:
>
> [snip]
>
> I did find one, only checked what AP listed. Turns out Atmel nee
> Microchip has ARM processors with SRAM-like external memory interfaces
> in the SAMA5 family. I don't know if it's fast or easy to use from a
> software point of view.
>
> A Belgian company has apparently designed a router based on one of
> these,
> http://dab-embedded.com/en/cases/openwrt-atmel-sama5d3-and-max10-fpga-board/
>
> It has an FPGA (Intel's Max 10) connected to the ARM via this external
> memory interface.
Thank you. That is the type of answer I am looking for.

The Max 10 is most likely too small; Cyclone 10, Spartan-6 or better in terms of logic. It needs internal SRAM for buffers as well, but no gigabit transceivers.

AP.
On 2019-01-31 at 22:57, lasselangwadtchristensen@gmail.com wrote:
> On Thursday, January 31, 2019 at 17:47:23 UTC+1, gnuarm.del...@gmail.com wrote:
>
> [snip]
>
>> While you say there aren't many ARMs running Linux with external memory
>> interfaces (which makes me wonder how they build all those Beagle Bones,
>> etc.)
>
> they "all" have external memory but it is dedicated for DDR RAM, some of
> them, like the beagle bone also have a general purpose memory interface
> controller for things like sync and async RAM and FLASH, that's the one
> you'd want to use for an FPGA, and some do:
> https://elinux.org/BeagleBoard/BeagleWire
That is more like it. Unfortunately, it is not a single module, and the Lattice FPGA is not on the accepted list. I might look at it for some homebrew stuff, though.

AP.