Reply by Tom Gardner February 12, 2019
On 10/02/19 03:04, Les Cargill wrote:
> Tom Gardner wrote: >> On 07/02/19 10:23, already5chosen@yahoo.com wrote: >>> On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote: >>>> >>>> Back in the late 80s there was the perception that TCP was slow, and hence >>>> new transport protocols were developed to mitigate that, e.g. XTP. >>>> >>>> In reality, it wasn't TCP per se that was slow. Rather the implementation, >>>> particularly multiple copies of data as the packet went up the stack, and >>>> between network processor / main processor and between kernel and user space. >>> >>> TCP per se *is* slow when frame error rate of underlying layers is not near >>> zero. >> >> That's a problem with any transport protocol. >> >> The solution to underlying frame errors is FEC, but that >> reduces the bandwidth when there are no errors. Choose >> what you optimise for! >> >> >>> Also, there exist cases of "interesting" interactions between Nagle algorithm >>> at transmitter and ACK saving algorithm at receiver that can lead to slowness >>> of certain styles of TCP conversions (Send mid-size block of data, wait for >>> application-level acknowledge, send next mid-size block) that is typically >>> resolved by not following the language of RFCs too literally. >> >> That sounds like a "corner case". I'd be surprised >> if you couldn't find corner cases in all transport >> protocols. > > But if you need absolute maximum throughput, it's often advantageous to move the > retransmission mechanism up the software stack. You can > take advantage of local specialized knowledge rather than pay the "TCP tax".
The devil is indeed in trading off generality for performance. There's the old aphorism: if you know how to optimise, then optimise; if you don't know how to optimise, then randomise.
Reply by February 10, 2019
On Monday, February 4, 2019 at 6:29:45 AM UTC, Swapnil Patil wrote:
> Hello folks, > > Let's say I have Spartan 6 board only and i wanted to implement Ethernet communication.So how can it be done? > > I don't want to connect any Hard or Soft core processor. > also I have looked into WIZnet W5300 Ethernet controller interfacing to spartan 6, but I don't want to connect any such controller just spartan 6. > So how can it be done? > > It is not necessary to use spartan 6 board only.If it possible to workout with any another boards I would really like to know. Thanks
An indirect solution would be to offload Ethernet to a hard-wired UDP/TCP ASIC. Wiznet has developed such devices, and I have somewhat more than 10k designs in the field that use them. Unless you require a cheap high-volume solution (with the attendant development, verification and validation time and money), Wiznet may well be a zero-time, minimal-cost solution. I have used both the W5300 https://www.wiznet.io/product-item/w5300/ and the W3150A+ https://www.wiznet.io/product-item/w3150a+/ devices. TCP has the drawbacks of latency and automatic resends. In real-time applications my preference is UDP: lost packets are okay, but packets resent by TCP are a waste of bandwidth because they are out of date. My applications have been in heavy-industry machine vision, where fibre is too fragile and RF is unsuitable because of line-of-sight and interference issues.
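As an illustration of that real-time UDP policy (lost packets are acceptable, late packets are worthless), here is a minimal receive-side sketch in C with BSD sockets. The four-byte sequence-number header and the port number are invented for the example; they are not part of any Wiznet interface.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

int main(void)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family      = AF_INET;
    addr.sin_port        = htons(5000);           /* example port */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (s < 0 || bind(s, (struct sockaddr *)&addr, sizeof addr) < 0)
        return 1;

    uint32_t newest = 0;                          /* highest frame number seen so far */
    unsigned char buf[2048];

    for (;;) {
        ssize_t n = recv(s, buf, sizeof buf, 0);
        if (n < 4)
            continue;                             /* too short to carry a header */

        uint32_t seq;                             /* first 4 bytes: sequence number */
        memcpy(&seq, buf, 4);
        seq = ntohl(seq);

        /* Real-time policy: a late or duplicated frame is worthless, so drop it.
           No NAK, no retransmission - the next frame supersedes it anyway.
           (A production version would also handle sequence-number wrap-around.) */
        if (newest != 0 && seq <= newest)
            continue;

        newest = seq;
        printf("frame %u, %zd payload bytes\n", seq, n - 4);
    }
}
```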
Reply by Les Cargill February 9, 2019
Tom Gardner wrote:
> On 07/02/19 10:23, already5chosen@yahoo.com wrote: >> On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote: >>> >>> Back in the late 80s there was the perception that TCP was slow, and >>> hence >>> new transport protocols were developed to mitigate that, e.g. XTP. >>> >>> In reality, it wasn't TCP per se that was slow. Rather the >>> implementation, >>> particularly multiple copies of data as the packet went up the stack, >>> and >>> between network processor / main processor and between kernel and >>> user space. >> >> TCP per se *is* slow when frame error rate of underlying layers is not >> near >> zero. > > That's a problem with any transport protocol. > > The solution to underlying frame errors is FEC, but that > reduces the bandwidth when there are no errors. Choose > what you optimise for! > > >> Also, there exist cases of "interesting" interactions between Nagle >> algorithm >> at transmitter and ACK saving algorithm at receiver that can lead to >> slowness >> of certain styles of TCP conversions (Send mid-size block of data, >> wait for >> application-level acknowledge, send next mid-size block) that is >> typically >> resolved by not following the language of RFCs too literally. > > That sounds like a "corner case". I'd be surprised > if you couldn't find corner cases in all transport > protocols.
But if you need absolute maximum throughput, it's often advantageous to move the retransmission mechanism up the software stack. You can take advantage of local specialized knowledge rather than pay the "TCP tax". -- Les Cargill
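For concreteness, one common shape of "retransmission moved up the stack" over UDP: the sender keeps recent datagrams in a ring and resends only what the receiver explicitly asks for (a NAK), exploiting the application's knowledge of what is still worth resending. This is a sketch of the general idea, not Les Cargill's actual protocol; all names and sizes are invented.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING        64          /* how many recent messages we can still resend   */
#define MAX_PAYLOAD 1400        /* keep each datagram under a typical Ethernet MTU */

struct slot {
    uint32_t seq;               /* application sequence number */
    uint16_t len;               /* payload length              */
    uint8_t  data[MAX_PAYLOAD];
};

static struct slot ring[RING];
static uint32_t next_seq = 1;

/* udp_send() stands in for whatever sits underneath: sendto(), a DMA
   descriptor handed to a MAC, etc.  A real protocol would also serialise
   the header fields into network byte order rather than sending the
   struct as-is. */
extern void udp_send(const void *buf, size_t len);

void app_send(const void *payload, uint16_t len)
{
    struct slot *s = &ring[next_seq % RING];
    s->seq = next_seq++;
    s->len = len;
    memcpy(s->data, payload, len);
    udp_send(s, offsetof(struct slot, data) + len);
}

/* Called when the receiver reports a gap: resend exactly one message,
   but only if it is recent enough to still be in the ring. */
void on_nak(uint32_t missing_seq)
{
    const struct slot *s = &ring[missing_seq % RING];
    if (s->seq == missing_seq)
        udp_send(s, offsetof(struct slot, data) + s->len);
    /* else: overwritten already - the application decides whether that matters */
}
```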
Reply by February 8, 2019
On Friday, February 8, 2019 at 3:33:01 PM UTC+2, Michael Kellett wrote:
> On 05/02/2019 04:47, gnuarm.deletethisbit@gmail.com wrote: > > On Monday, February 4, 2019 at 11:30:33 PM UTC-5, A.P.Richelieu wrote: > >> Den 2019-02-04 kl. 07:29, skrev Swapnil Patil: > >>> Hello folks, > >>> > >>> Let's say I have Spartan 6 board only and i wanted to implement Ethernet communication.So how can it be done? > >>> > >>> I don't want to connect any Hard or Soft core processor. > >>> also I have looked into WIZnet W5300 Ethernet controller interfacing to spartan 6, but I don't want to connect any such controller just spartan 6. > >>> So how can it be done? > >>> > >>> It is not necessary to use spartan 6 board only.If it possible to workout with any another boards I would really like to know. Thanks > >>> > >> Netnod has an open source implementation for a 10GB Ethernet MAC > >> and connects that to an NTP server, all in FPGA. > >> It was not a generic UDP/IP stack, so they had some problems > >> with not beeing able to handle ICMP messages when I last > >> looked at the stuff 2 years ago. > >> > >> They split up incoming packets outside so that all UDP packet > >> to port 123 went to the FPGA. > > > > So it's not a stand alone solution. Still, 10 Gbits is impressive. I've designed comms stuff at lower rates but still fast enough that things couldn't be done in single width, rather they had to be done in parallel. That gets complicated and big real fast as the speeds increase. But then "big" is a relative term. Yesterday's "big" is today's "fits down in the corner of this chip". > > > > Chips don't get faster so much these days, but they are still getting bigger! > > > > > > Rick C. > > > > ---- Tesla referral code - https://ts.la/richard11209 > > > > I've done it, not a full every single RFC implemented job, but a limited > UDP support.
At that level, who hasn't done it? Personally, I've lost count of how many times I've done it in the last 15 years, but only transmitters. It's not that UDP reception on a pre-configured port would be much harder; I just never had a need for it. But TCP is a *completely* different story, and so are the standard application protocols that run on top of TCP.
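To make the "UDP transmit is easy" point concrete: a UDP frame is fixed header fields, a couple of lengths and one 16-bit checksum, which is exactly why it maps so comfortably onto FPGA logic. A rough C model of the frame assembly follows; the MAC/IP addresses and ports are placeholders, the UDP checksum is left at zero (which IPv4 permits), and the Ethernet FCS is assumed to be appended by the MAC.

```c
#include <stdint.h>
#include <string.h>

/* 16-bit one's-complement sum used by the IPv4 header checksum. */
static uint16_t csum16(const uint8_t *p, int len)
{
    uint32_t sum = 0;
    for (int i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)(p[i] << 8 | p[i + 1]);
    if (len & 1)
        sum += (uint32_t)(p[len - 1] << 8);
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Build Ethernet + IPv4 + UDP headers in front of 'payload' in buffer 'f'.
   Returns the total frame length (excluding the FCS added by the MAC). */
int build_udp_frame(uint8_t *f, const uint8_t *payload, uint16_t plen)
{
    static const uint8_t dst_mac[6] = {0x00,0x11,0x22,0x33,0x44,0x55}; /* example */
    static const uint8_t src_mac[6] = {0x02,0x00,0x00,0x00,0x00,0x01}; /* example */
    uint16_t iplen  = 20 + 8 + plen;              /* IPv4 hdr + UDP hdr + data */
    uint16_t udplen = 8 + plen;

    /* Ethernet header: destination, source, EtherType 0x0800 (IPv4). */
    memcpy(f, dst_mac, 6);
    memcpy(f + 6, src_mac, 6);
    f[12] = 0x08; f[13] = 0x00;

    /* IPv4 header, 20 bytes, no options. */
    uint8_t *ip = f + 14;
    memset(ip, 0, 20);
    ip[0] = 0x45;                                 /* version 4, IHL 5     */
    ip[2] = iplen >> 8;  ip[3] = iplen & 0xff;
    ip[8] = 64;                                   /* TTL                  */
    ip[9] = 17;                                   /* protocol = UDP       */
    const uint8_t src_ip[4] = {192,168,1,10}, dst_ip[4] = {192,168,1,20};
    memcpy(ip + 12, src_ip, 4);
    memcpy(ip + 16, dst_ip, 4);
    uint16_t c = csum16(ip, 20);
    ip[10] = c >> 8;  ip[11] = c & 0xff;

    /* UDP header: ports, length, checksum 0 (optional on IPv4). */
    uint8_t *udp = ip + 20;
    udp[0] = 0x13; udp[1] = 0x88;                 /* src port 5000        */
    udp[2] = 0x13; udp[3] = 0x88;                 /* dst port 5000        */
    udp[4] = udplen >> 8;  udp[5] = udplen & 0xff;
    udp[6] = 0;  udp[7] = 0;
    memcpy(udp + 8, payload, plen);

    return 14 + iplen;
}
```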
> The way it worked (initially) was to use Lattice's > tri-speed Ethernet MAC with Marvell Gigabit Phy (and later on a switch). > The FPGA handled UDPs in and out in real time and offloaded any traffic > it didn't understand (like tcp stuff) to an Arm Cortex M4. It needed 32 > bit wide SDRAM to keep up with the potential peak data transfer rate. > We did it because the FPGA was acquiring the data and sending it to a PC > (and sometimes getting data from a PC and streaming it out), the FPGA > did some data processing and buffering - to get the data to the PC it > had to use Ethernet, it could have been done (at the time, several years > ago) with a PCI interface to a PC class processor running a full OS, but > this would have used far too much power. The Lattice XP3 FPGA did all > the grunt work and used a couple of watts (might have been as much as > three watts). > The UDP system supported multi fragment messages and used a protocol > which would allow for messages to be sent again if needed. > > If any one wants to pay for tcp-ip and all the trimmings I'd be happy to > consider it. > > > MK
Reply by Michael Kellett February 8, 2019
On 05/02/2019 04:47, gnuarm.deletethisbit@gmail.com wrote:
> On Monday, February 4, 2019 at 11:30:33 PM UTC-5, A.P.Richelieu wrote: >> Den 2019-02-04 kl. 07:29, skrev Swapnil Patil: >>> Hello folks, >>> >>> Let's say I have Spartan 6 board only and i wanted to implement Ethernet communication.So how can it be done? >>> >>> I don't want to connect any Hard or Soft core processor. >>> also I have looked into WIZnet W5300 Ethernet controller interfacing to spartan 6, but I don't want to connect any such controller just spartan 6. >>> So how can it be done? >>> >>> It is not necessary to use spartan 6 board only.If it possible to workout with any another boards I would really like to know. Thanks >>> >> Netnod has an open source implementation for a 10GB Ethernet MAC >> and connects that to an NTP server, all in FPGA. >> It was not a generic UDP/IP stack, so they had some problems >> with not beeing able to handle ICMP messages when I last >> looked at the stuff 2 years ago. >> >> They split up incoming packets outside so that all UDP packet >> to port 123 went to the FPGA. > > So it's not a stand alone solution. Still, 10 Gbits is impressive. I've designed comms stuff at lower rates but still fast enough that things couldn't be done in single width, rather they had to be done in parallel. That gets complicated and big real fast as the speeds increase. But then "big" is a relative term. Yesterday's "big" is today's "fits down in the corner of this chip". > > Chips don't get faster so much these days, but they are still getting bigger! > > > Rick C. > > ---- Tesla referral code - https://ts.la/richard11209 >
I've done it: not a full every-RFC-implemented job, but limited UDP support. The way it worked (initially) was to use Lattice's tri-speed Ethernet MAC with a Marvell gigabit PHY (and later on a switch). The FPGA handled UDP in and out in real time and offloaded any traffic it didn't understand (like TCP stuff) to an Arm Cortex-M4. It needed 32-bit-wide SDRAM to keep up with the potential peak data transfer rate. We did it because the FPGA was acquiring the data and sending it to a PC (and sometimes getting data from a PC and streaming it out); the FPGA did some data processing and buffering. To get the data to the PC it had to use Ethernet. It could have been done (at the time, several years ago) with a PCI interface to a PC-class processor running a full OS, but that would have used far too much power. The Lattice XP3 FPGA did all the grunt work and used a couple of watts (might have been as much as three watts). The UDP system supported multi-fragment messages and used a protocol which allowed messages to be sent again if needed. If anyone wants to pay for TCP/IP and all the trimmings, I'd be happy to consider it. MK
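Michael doesn't spell out the multi-fragment protocol, but the usual shape is a small per-datagram header plus a completeness check on the receive side, which is also what tells you whether a resend request is still needed. A hypothetical sketch, with invented names and field widths (the sketch caps a message at 64 fragments):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-datagram header for messages larger than one UDP payload. */
struct frag_hdr {
    uint32_t msg_id;      /* identifies the message being reassembled */
    uint16_t frag_index;  /* 0 .. frag_count-1                        */
    uint16_t frag_count;  /* total fragments in this message          */
};

#define MAX_FRAGS 64

struct reassembly {
    uint32_t msg_id;
    uint16_t frag_count;
    uint64_t seen;        /* bit i set when fragment i has arrived    */
};

/* Record one arrived fragment; returns true when the message is complete,
   i.e. when no further resend needs to be requested. */
bool frag_arrived(struct reassembly *r, const struct frag_hdr *h)
{
    if (h->msg_id != r->msg_id) {                 /* new message: start over */
        r->msg_id = h->msg_id;
        r->frag_count = h->frag_count;
        r->seen = 0;
    }
    if (h->frag_index < MAX_FRAGS)
        r->seen |= (uint64_t)1 << h->frag_index;

    uint64_t all = (r->frag_count >= 64) ? ~(uint64_t)0
                                         : ((uint64_t)1 << r->frag_count) - 1;
    return (r->seen & all) == all;
}
```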
Reply by Tom Gardner February 8, 2019
On 08/02/19 10:35, Allan Herriman wrote:
> On Thu, 07 Feb 2019 20:04:04 +0000, Tom Gardner wrote: > >> On 07/02/19 10:23, already5chosen@yahoo.com wrote: >>> On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote: >>>> >>>> Back in the late 80s there was the perception that TCP was slow, and >>>> hence new transport protocols were developed to mitigate that, e.g. >>>> XTP. >>>> >>>> In reality, it wasn't TCP per se that was slow. Rather the >>>> implementation, >>>> particularly multiple copies of data as the packet went up the stack, >>>> and between network processor / main processor and between kernel and >>>> user space. >>> >>> TCP per se *is* slow when frame error rate of underlying layers is not >>> near zero. >> >> That's a problem with any transport protocol. >> >> The solution to underlying frame errors is FEC, but that reduces the >> bandwidth when there are no errors. Choose what you optimise for! > > FEC does reduce bandwidth in some sense, but in all of the Ethernet FEC > implementations I've done, the 64B66B signal is recoded into something > more efficient to make room for the FEC overhead. IOW, the raw bit rate > on the fibre is the same whether FEC is on or off. > > Perhaps a more important issue is latency. In my experience these are > block codes, and the entire block must be received before it can be > corrected. The last one I did added about 240ns when FEC was enabled. > > Optics modules (e.g. QSFP) that have sufficient margin to work without > FEC are sometimes marketed as "low latency" even though they have the > same latency as the ones that require FEC.
Accepted. My background with FECs is in radio systems, where the overhead is worse and the block lengths are much longer!
Reply by Allan Herriman February 8, 2019
On Thu, 07 Feb 2019 20:04:04 +0000, Tom Gardner wrote:

> On 07/02/19 10:23, already5chosen@yahoo.com wrote: >> On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote: >>> >>> Back in the late 80s there was the perception that TCP was slow, and >>> hence new transport protocols were developed to mitigate that, e.g. >>> XTP. >>> >>> In reality, it wasn't TCP per se that was slow. Rather the >>> implementation, >>> particularly multiple copies of data as the packet went up the stack, >>> and between network processor / main processor and between kernel and >>> user space. >> >> TCP per se *is* slow when frame error rate of underlying layers is not >> near zero. > > That's a problem with any transport protocol. > > The solution to underlying frame errors is FEC, but that reduces the > bandwidth when there are no errors. Choose what you optimise for!
FEC does reduce bandwidth in some sense, but in all of the Ethernet FEC implementations I've done, the 64B66B signal is recoded into something more efficient to make room for the FEC overhead. IOW, the raw bit rate on the fibre is the same whether FEC is on or off. Perhaps a more important issue is latency. In my experience these are block codes, and the entire block must be received before it can be corrected. The last one I did added about 240ns when FEC was enabled. Optics modules (e.g. QSFP) that have sufficient margin to work without FEC are sometimes marketed as "low latency" even though they have the same latency as the ones that require FEC. Regards, Allan
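A quick plausibility check on that 240 ns figure, assuming (this is an assumption, Allan doesn't name the code) the RS(528,514) code of IEEE 802.3 Clause 91/108 with 10-bit symbols on a single 25.78125 Gbit/s lane: just clocking one codeword in takes about 205 ns before the decoder can start correcting it.

```c
#include <stdio.h>

int main(void)
{
    /* Assumed parameters, not taken from Allan's post:
       RS(528,514) over 10-bit symbols, single 25.78125 Gbit/s serial lane. */
    const double bits_per_codeword = 528.0 * 10.0;   /* 5280 bits */
    const double line_rate_bps     = 25.78125e9;

    double fill_ns = bits_per_codeword / line_rate_bps * 1e9;
    printf("time to clock in one FEC codeword: %.1f ns\n", fill_ns);  /* ~204.8 */

    /* Add a pipelined decoder, gearboxes and the 256B/257B transcoding, and a
       total around 240 ns looks entirely consistent with a block code. */
    return 0;
}
```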
Reply by February 7, 2019
On Thursday, February 7, 2019 at 10:04:09 PM UTC+2, Tom Gardner wrote:
> On 07/02/19 10:23, already5chosen@yahoo.com wrote: > > On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote: > >> > >> Back in the late 80s there was the perception that TCP was slow, and hence > >> new transport protocols were developed to mitigate that, e.g. XTP. > >> > >> In reality, it wasn't TCP per se that was slow. Rather the implementation, > >> particularly multiple copies of data as the packet went up the stack, and > >> between network processor / main processor and between kernel and user > >> space. > > > > TCP per se *is* slow when frame error rate of underlying layers is not near > > zero. > > That's a problem with any transport protocol. >
TCP is worse than most. Partly that's because it's a jack of all trades in terms of latency and bandwidth. Partly it's because TCP is stream oriented (rather than datagram oriented), which makes recovery based on selective retransmission far more complicated, and so less practical.
> The solution to underlying frame errors is FEC, but that > reduces the bandwidth when there are no errors. Choose > what you optimise for! > > > > Also, there exist cases of "interesting" interactions between Nagle algorithm > > at transmitter and ACK saving algorithm at receiver that can lead to slowness > > of certain styles of TCP conversions (Send mid-size block of data, wait for > > application-level acknowledge, send next mid-size block) that is typically > > resolved by not following the language of RFCs too literally. > > That sounds like a "corner case". I'd be surprised > if you couldn't find corner cases in all transport > protocols.
Sure. But it's not a rare corner case. And again, it's far less likely to happen with datagram-oriented reliable transports.
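For reference, the slowdown being discussed is the classic interaction between Nagle's algorithm on the sender and delayed ACKs on the receiver. When you can't change the stack itself, the usual application-level escape hatch is a single socket option on the sending side:

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Turn Nagle off so a mid-size block is transmitted immediately instead of
   being held back waiting for the ACK of the previous one.  (Linux also
   offers TCP_QUICKACK on the receiving side, but it has to be re-armed
   after reads, so TCP_NODELAY at the sender is the common fix.) */
int disable_nagle(int sock)
{
    int one = 1;
    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
}
```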
Reply by Tom Gardner February 7, 2019
On 07/02/19 10:23, already5chosen@yahoo.com wrote:
> On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote: >> >> Back in the late 80s there was the perception that TCP was slow, and hence >> new transport protocols were developed to mitigate that, e.g. XTP. >> >> In reality, it wasn't TCP per se that was slow. Rather the implementation, >> particularly multiple copies of data as the packet went up the stack, and >> between network processor / main processor and between kernel and user >> space. > > TCP per se *is* slow when frame error rate of underlying layers is not near > zero.
That's a problem with any transport protocol. The solution to underlying frame errors is FEC, but that reduces the bandwidth when there are no errors. Choose what you optimise for!
> Also, there exist cases of "interesting" interactions between Nagle algorithm > at transmitter and ACK saving algorithm at receiver that can lead to slowness > of certain styles of TCP conversions (Send mid-size block of data, wait for > application-level acknowledge, send next mid-size block) that is typically > resolved by not following the language of RFCs too literally.
That sounds like a "corner case". I'd be surprised if you couldn't find corner cases in all transport protocols.
Reply by David Brown February 7, 2019
On 07/02/2019 11:07, already5chosen@yahoo.com wrote:
> On Tuesday, February 5, 2019 at 12:12:47 PM UTC+2, David Brown wrote: >> On 04/02/2019 21:55, gnuarm.deletethisbit@gmail.com wrote: >> >>> I don't know a lot about TCP/IP, but I've been told you can implement it to many different degrees depending on your requirements. I think it had to do with the fact that some aspects are specified rather vaguely, timeouts and who manages the retries, etc. I assume this was not as full an implementation as you might have on a PC. So I wonder if this is an apples to oranges comparison. >>> >> >> That is correct - there are lots of things in IP networking in general, >> and TCP/IP on top of that, which can be simplified, limited, or handled >> statically. For example, TCP/IP has window size control so that each >> end can automatically adjust if there is a part of the network that has >> a small MTU (packet size) - that way there will be less fragmentation, >> and greater throughput. That is an issue if you have dial-up modems and >> similar links - if you have a more modern network, you could simply >> assume a larger window size and leave it fixed. There are a good many >> such parts of the stack that can be simplified. >> >> >> >>> Are there any companies selling TCP/IP that they actually list on their web site? >>> > > TCP window size and MTU are orthogonal concepts. > Judged by this post, I'd suspect that you know more about TCP that Rick C, but less than Rick H which sounds like the only one of 3 of you that had his own hands dirty in attempt to implement it. >
They are different concepts, yes. The window size can be reduced to below the MTU size on small systems to ensure that you don't get fragmentation and that you never need to resend more than one low-level packet. But it is not a level of detail that I have needed to work at, so I have no personal experience of it.
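In socket terms (a small embedded stack would simply hard-code the equivalent numbers), the two knobs being discussed map onto the receive buffer, which bounds the advertised window, and the maximum segment size. A hedged sketch with purely illustrative values; TCP_MAXSEG generally has to be set before the connection is established.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Shrink a TCP socket's footprint: a small receive buffer keeps the
   advertised window small, and capping the MSS keeps every segment well
   inside the link MTU, so a loss never costs more than one small packet.
   Values are examples only. */
int shrink_tcp_footprint(int sock)
{
    int rcvbuf = 2048;   /* bounds the advertised receive window          */
    int mss    = 536;    /* classic small MSS, far below a 1500-byte MTU  */

    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof rcvbuf) < 0)
        return -1;
    return setsockopt(sock, IPPROTO_TCP, TCP_MAXSEG, &mss, sizeof mss);
}
```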