comp.arch.fpga | high bandwidth ethernet communication | page 3

Reply by eliben ●September 7, 2007

> We run the EMAC 8 bits wide at 125 MHz, and the PicoBlaze at 62.5 MHz
> using a divided by two version of the EMAC clock.  The PicoBlaze takes
> two cycles per instruction, and the packets we are offloading are a
> bit over 1KB, so we have about 512 instructions to deal with an offloaded
> packet and other overhead.  Dealing with a non-offloaded packet takes
> the shortest path through the code to keep the number of packets per
> second we can handle up.  The network the data is on is tightly
> controlled, so there is very little on it that is not the protocol we
> are offloading, mostly just IGMP packets for dealing with the
> multicast groups, and they are at a very low rate.
>
> You are correct in that we do not look at the entire packet with the
> PicoBlaze, just the header.  Once it has determined that it wants to
> offload that packet, it then has a little bit more work to do to
> calculate addresses and load them into the DMA engine.  To make sure
> that we do not drop packets, we just need to make sure that the
> longest path through the code takes less time than about how long it
> takes to receive a packet.  We have a FIFO between the EMAC and the
> DMA engine, so we can smooth things out a bit.
>
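
The budget quoted above can be checked with a little integer arithmetic.
This is only a sketch: the function name and the exact 1024-byte packet
size are assumptions, and preamble, inter-frame gap and FIFO slack are
ignored.

```python
def instruction_budget(packet_bytes, byte_clock_hz=125_000_000,
                       cpu_clock_hz=62_500_000, cycles_per_instr=2):
    """Instructions a soft CPU can execute while one packet is received.

    Assumes the MAC delivers one byte per byte_clock_hz cycle (8 bits wide)
    and that the CPU clock is derived from the same clock.
    """
    # CPU clock cycles that elapse while packet_bytes arrive.
    cpu_cycles = packet_bytes * cpu_clock_hz // byte_clock_hz
    return cpu_cycles // cycles_per_instr


if __name__ == "__main__":
    # 1024 bytes at 125 MHz -> 512 CPU clocks at 62.5 MHz -> 256 instructions
    print(instruction_budget(1024))
```

Taken literally, the quoted clocks give 512 PicoBlaze clock cycles, or
about 256 two-cycle instructions, per 1 KB received, so the 512 figure
may be counting clocks rather than instructions.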

Thanks for the information. I must say I'm impressed with your design
- a great interoperability of logic, cores, small and large CPUs and
software.

Eli

Reply by glen herrmannsfeldt ●September 7, 2007

Paul Keinanen wrote:

(snip)

> If the OP required only something dedicated point to point
> connectivity, why bother with the IP wrapper, just send raw Ethernet
> frames with MAC addressing ?

You could, but sending UDP isn't that much harder.  If you really want
to simplify it, put the destination MAC address in as a constant
(saves doing ARP, but ARP could also be done in external software
and the result written to the FPGA).  The next complication is
generating the CRC for UDP, but that is optional.  The ethernet
CRC has to be generated in either case.

http://www.networksorcery.com/enp/protocol/udp.htm

(snip)

> The hard thing is to get the transmit data into the transmit buffers
> fast enough, but for direct port to port copying, there should not be
> much need to move the actual data in the memory.

If the FPGA isn't fast enough, write it out in 8 bit parallel and
use an external shift register.

-- glen

Reply by David Brown ●September 7, 2007

eliben wrote:
> Hello,
> 
> In our application we have to receive and merge several proprietary
> serial channels (200 MHz) over fibers, and send all the data over
> Gigabit Ethernet. The bandwidth is ~60 MByte/s, sustained.
> 
> While generally sending this amount of data is possible over Gbit
> Ethernet, doing so in an embedded system isn't easy. That's because we
> need to send it by UDP or TCP, for which a TCP/UDP/IP stack is
> required (software).
> 
> Since the translation of the proprietary format is certainly done in
> an FPGA, I tried to calculate how to implement the whole process in an
> FPGA. For example, I can take an Altera Stratix II GX (with a built in
> Gbit Ethernet PHY), add Altera's MAC and use a TCP/IP stack running on
> the Nios II soft-core processor. Unfortunately, as Altera's appnote
> 440 shows, the maximal bandwidth attainable this way is only 15-17
> MByte/s. For the sake of comparison, benchmarks of Gbit Ethernet
> adapters on PCs show a maximal bandwidth of 80-90 MByte/s.
> 
> However, I wouldn't like to build in a Pentium into the embedded
> system. Any suggestions / recommendations on how to solve the
> problem ?
> 
> Thanks in advance
> 

There are a number of things that can be used to speed up the Ethernet 
communication (I've read about these, but not tried them - but they 
might give you a clue).

On the software side, there are a number of different tcp/ip stacks 
available, and the particular implementation can make a lot of difference.

In the FPGA, you can make sure you are using DMA for memory transfers 
rather than cpu memory accesses.  You can also use the FPGA to 
accelerate things like CRC calculations enormously - perhaps you can get 
these ready-written, or make one yourself, and modify the stack to use it.

There are also several different Ethernet MAC's available, with widely 
different throughputs.  Have a look at the OpenCores lists and try some 
out (I gather the prices are not insignificant, but they may be worth 
the money).

mvh.,

David

Reply by eliben ●September 7, 2007

On Sep 7, 8:14 am, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> Paul Keinanen wrote:
>
> (snip)
>
> > If the OP required only something dedicated point to point
> > connectivity, why bother with the IP wrapper, just send raw Ethernet
> > frames with MAC addressing ?
>
> You could, but sending UDP isn't that much harder.  If you really want
> to simplify it, put the destination MAC address in as a constant

This is exactly why direct MAC communication is undesirable here.
Tying the sender to the receiver's MAC address as a constant isn't
good engineering.

> (saves doing ARP, but ARP could also be done in external software
> and the result written to the FPGA).

Indeed. ARP needs to be done only once in a while anyway, so it can be
done by a slow software program.

> The next complication is
> generating the CRC for UDP, but that is optional.  The ethernet
> CRC has to be generated in either case.

CRC generation in FPGAs is blazing fast, so I don't really see a
problem here.
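
For reference, here is a bit-serial software model of the Ethernet
CRC-32 (the FCS), the sort of thing one might use to check an FPGA
implementation against. It is a minimal sketch: the function name is
made up, and real hardware would normally consume 8 bits per clock in
parallel logic rather than loop over bits.

```python
import zlib


def crc32_ethernet(data: bytes) -> int:
    """Bitwise CRC-32 as used for the Ethernet FCS (reflected form).

    Polynomial 0x04C11DB7 (0xEDB88320 bit-reversed), initial value
    0xFFFFFFFF, final XOR 0xFFFFFFFF.
    """
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):          # one iteration per data bit, LSB first
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    frame = bytes(range(60))        # dummy frame contents, no FCS yet
    # The model should agree with zlib's CRC-32, which uses the same parameters.
    assert crc32_ethernet(frame) == zlib.crc32(frame)
    # With this reflected form the FCS is appended least significant byte first.
    fcs = crc32_ethernet(frame).to_bytes(4, "little")
    print(fcs.hex())
```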

Eli

Reply by John McCaskill ●September 7, 2007

On Sep 6, 11:55 pm, eliben <eli...@gmail.com> wrote:
> > We run the EMAC 8 bits wide at 125 MHz, and the PicoBlaze at 62.5 MHz
> > using a divided by two version of the EMAC clock.  The PicoBlaze takes
> > two cycles per instruction, and the packets we are offloading are a
> > bit over 1KB, so we have about 512 instructions to deal with an offloaded
> > packet and other overhead.  Dealing with a non-offloaded packet takes
> > the shortest path through the code to keep the number of packets per
> > second we can handle up.  The network the data is on is tightly
> > controlled, so there is very little on it that is not the protocol we
> > are offloading, mostly just IGMP packets for dealing with the
> > multicast groups, and they are at a very low rate.
>
> > You are correct in that we do not look at the entire packet with the
> > PicoBlaze, just the header.  Once it has determined that it wants to
> > offload that packet, it then has a little bit more work to do to
> > calculate addresses and load them into the DMA engine.  To make sure
> > that we do not drop packets, we just need to make sure that the
> > longest path through the code takes less time than about how long it
> > takes to receive a packet.  We have a FIFO between the EMAC and the
> > DMA engine, so we can smooth things out a bit.
>
> Thanks for the information. I must say I'm impressed with your design
> - a great interoperability of logic, cores, small and large CPUs and
> software.
>
> Eli


Thanks for the compliment!

I really like the PicoBlaze.  It makes a great complement to the
PowerPC, and is very small. It is very good in this application of
being an IO processor.

Regards,

John McCaskill
www.fastertechnology.com

Reply by Hal Murray ●September 7, 2007

>This is exactly why direct MAC communication is undesirable here.
>Tying the sender to the receiver's MAC address as a constant isn't
>good engineering.
>
>> (saves doing ARP, but ARP could also be done in external software
>> and the result written to the FPGA).
>
>Indeed. ARP needs to be done only once in a while anyway, so it can be
>done by a slow software program.

I think you are missing the big picture.

Before you can do ARP, you have to have an IP Address.  Where
did that come from?  Why not provide the MAC address through
the same path?

If you have a send-only application, you don't need any protocol
stack.  Just fill in a few constants and blast away.
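
To illustrate the "fill in a few constants" approach: for a fixed-size
UDP payload, every byte of the Ethernet, IPv4 and UDP headers can be
computed once and kept as a 42-byte template. A rough sketch follows.
The addresses, ports, payload size and function names are placeholders,
the IP ID of 0 with DF set is an assumed choice, and the MAC is assumed
to append the FCS.

```python
import struct


def ip_header_checksum(header: bytes) -> int:
    """One's-complement checksum over a 20-byte IPv4 header."""
    total = sum(struct.unpack("!10H", header))
    while total >> 16:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header_template(dst_mac, src_mac, src_ip, dst_ip,
                          src_port, dst_port, payload_len):
    """Return the constant Ethernet + IPv4 + UDP header for fixed-size payloads."""
    udp_len = 8 + payload_len
    eth = dst_mac + src_mac + struct.pack("!H", 0x0800)        # EtherType IPv4
    ip = struct.pack("!BBHHHBBH4s4s",
                     0x45, 0,               # version 4, IHL 5, TOS 0
                     20 + udp_len,          # total length
                     0, 0x4000,             # ID 0, Don't Fragment (assumption)
                     64, 17, 0,             # TTL, protocol UDP, checksum placeholder
                     src_ip, dst_ip)
    ip = ip[:10] + struct.pack("!H", ip_header_checksum(ip)) + ip[12:]
    udp = struct.pack("!HHHH", src_port, dst_port, udp_len, 0)  # checksum 0 = none
    return eth + ip + udp


if __name__ == "__main__":
    template = build_header_template(
        dst_mac=bytes.fromhex("001122334455"),   # placeholder MAC addresses
        src_mac=bytes.fromhex("020000000001"),
        src_ip=bytes([192, 168, 0, 1]),          # placeholder IP addresses
        dst_ip=bytes([192, 168, 0, 2]),
        src_port=5000, dst_port=5001, payload_len=1024)
    print(len(template), template.hex())
```

With the template stored in a small ROM or register file, the logic only
has to stream it out ahead of each payload.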

Yes, it might be convenient to have some software around.

-- 
These are my opinions, not necessarily my employer's.  I hate spam.

Reply by Hal Murray ●September 7, 2007

>The next complication is
>generating the CRC for UDP, but that is optional.  The ethernet
>CRC has to be generated in either case.

UDP doesn't have a CRC.  There is a software checksum, but it's
designed to be easy to compute with typical CPUs.  There is
a clean way to say "none".
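
For anyone who wants to see it spelled out, the UDP checksum is a 16-bit
one's-complement sum over a pseudo-header, the UDP header and the data.
A minimal sketch for IPv4 follows; the function names and the example
addresses and ports are just illustrations. A checksum field of zero
means "none", so a computed value of zero is sent as 0xFFFF.

```python
import struct


def ones_complement_sum16(data: bytes) -> int:
    """16-bit one's-complement sum, padding odd-length data with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_checksum(src_ip: bytes, dst_ip: bytes,
                 src_port: int, dst_port: int, payload: bytes) -> int:
    """UDP checksum over IPv4 pseudo-header + UDP header + payload."""
    length = 8 + len(payload)
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 17, length)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    csum = ~ones_complement_sum16(pseudo + header + payload) & 0xFFFF
    return csum or 0xFFFF                   # 0 is reserved for "no checksum"


if __name__ == "__main__":
    c = udp_checksum(bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),
                     5000, 5001, b"hello")
    print(hex(c))
```

The additions are plain 16-bit adds with end-around carry, which is why
it is cheap on a CPU and also easy to pipeline in logic.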


Just for the record...

The Ethernet CRC is not really necessary.  It's getting kludgy
to avoid it, but it can be done.  The trick is to get the
adapter on the receive side to give you the bad packet too.
That may be hard/impossible on some of them.  I haven't tried
to do it in ages.  It used to be a common hack for debugging.


-- 
These are my opinions, not necessarily my employer's.  I hate spam.

Reply by eliben ●September 8, 2007

On Sep 7, 9:31 pm, hal-use...@ip-64-139-1-69.sjc.megapath.net (Hal
Murray) wrote:
> >This is exactly why direct MAC communication is undesirable here.
> >Tying the sender to the receiver's MAC address as a constant isn't
> >good engineering.
>
> >> (saves doing ARP, but ARP could also be done in external software
> >> and the result written to the FPGA).
>
> >Indeed. ARP needs to be done only once in a while anyway, so it can be
> >done by a slow software program.
>
> I think you are missing the big picture.
>
> Before you can do ARP, you have to have an IP Address.  Where
> did that come from?  Why not provide the MAC address through
> the same path?

The IP address can be statically allocated. If my recipient changes
the computer during the runtime of the system (years), his MAC
changes, but not his IP.

Eli

Reply by glen herrmannsfeldt ●September 8, 2007

eliben wrote:
> On Sep 7, 9:31 pm, hal-use...@ip-64-139-1-69.sjc.megapath.net (Hal
> Murray) wrote:

(snip)
>>Before you can do ARP, you have to have an IP Address.  Where
>>did that come from?  Why not provide the MAC address through
>>the same path?

> The IP address can be statically allocated. If my recipient changes
> the computer during the runtime of the system (years), his MAC
> changes, but not his IP.

The IP address can be static to the FPGA, in a programmable
register written into by an external processor doing ARP.

You can also use BOOTP to load the host's own IP address,
with the destination IP address as one of the BOOTP options.
That can all be done in an external processor (or processor
internal to the FPGA).

-- glen

Reply by Paul Keinanen ●September 8, 2007

On Sat, 08 Sep 2007 06:10:08 -0000, eliben <eliben@gmail.com> wrote:

>On Sep 7, 9:31 pm, hal-use...@ip-64-139-1-69.sjc.megapath.net (Hal
>Murray) wrote:
>> >This is exactly why direct MAC communication is undesirable here.
>> >Tying the sender to the receiver's MAC address as a constant isn't
>> >good engineering.
>>
>> >> (saves doing ARP, but ARP could also be done in external software
>> >> and the result written to the FPGA).
>>
>> >Indeed. ARP needs to be done only once in a while anyway, so it can be
>> >done by a slow software program.
>>
>> I think you are missing the big picture.
>>
>> Before you can do ARP, you have to have an IP Address.  Where
>> did that come from?  Why not provide the MAC address through
>> the same path?
>
>The IP address can be statically allocated. If my recipient changes
>the computer during the runtime of the system (years), his MAC
>changes, but not his IP.

You could quite safely use some random private IP address, such as
192.168.123.45 as the target address and require the receiver to
specify this IP address on one of their Ethernet adaptors. You
just need to do the ARP translation to get the MAC address.

If the target PC is swapped, you just have to redo the ARP translation
at startup. This is quite sufficient in any non-redundant system. 
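
A sketch of what that one-off ARP translation involves, for the slow
software side: build a broadcast ARP request for the target IP, then
pull the sender MAC out of the reply. The function names and the
addresses in the demo are hypothetical, and actually sending and
receiving the frames is left to whatever raw interface is available.

```python
import struct

BROADCAST = b"\xff" * 6
ARP_FMT = "!HHBBH6s4s6s4s"      # htype, ptype, hlen, plen, oper, sha, spa, tha, tpa


def build_arp_request(src_mac: bytes, src_ip: bytes, target_ip: bytes) -> bytes:
    """Ethernet frame (without padding or FCS) asking who has target_ip."""
    eth = BROADCAST + src_mac + struct.pack("!H", 0x0806)      # EtherType ARP
    arp = struct.pack(ARP_FMT, 1, 0x0800, 6, 4, 1,             # oper 1 = request
                      src_mac, src_ip, b"\x00" * 6, target_ip)
    return eth + arp


def parse_arp_reply(frame: bytes, expected_ip: bytes):
    """Return the MAC that answered for expected_ip, or None if not a match."""
    if len(frame) < 42 or frame[12:14] != b"\x08\x06":
        return None
    htype, ptype, hlen, plen, oper, sha, spa, _tha, _tpa = struct.unpack(
        ARP_FMT, frame[14:42])
    if (htype, ptype, hlen, plen, oper) != (1, 0x0800, 6, 4, 2):  # oper 2 = reply
        return None
    return sha if spa == expected_ip else None


if __name__ == "__main__":
    my_mac = bytes.fromhex("020000000001")      # hypothetical sender MAC
    my_ip = bytes([192, 168, 123, 1])           # hypothetical sender IP
    target = bytes([192, 168, 123, 45])
    print(build_arp_request(my_mac, my_ip, target).hex())
```

The MAC returned by parse_arp_reply would then be written into the
destination-address register of the FPGA.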

Paul
