In general TCP just isn't great for high performance. In the film industry we used to use a commercial product, Aspera (now owned by IBM), which emulated ftp or scp but used UDP with forward error correction (instead of TCP retransmission). You could configure it to use a specific amount of bandwidth and it would just push everything else off the network to achieve it.
I get 40 Gbit/s over a single localhost TCP stream on my 10-year-old laptop with iperf3.
So TCP itself does not seem to be the bottleneck, if 40 Gbit/s is "high" enough, which it probably is currently for most people.
I have also seen plenty of situations in which TCP is faster than UDP in datacenters.
For example, on Hetzner Cloud VMs, iperf3 gets me 7 Gbit/s over TCP but only 1.5 Gbit/s over UDP. On Hetzner dedicated servers with 10 Gbit links, I get 10 Gbit/s over TCP but only 4.5 Gbit/s over UDP. But this could also be due to my use of iperf3 or its implementation.
I also suspect that TCP being a protocol whose state is inspectable by the network equipment between the endpoints allows that equipment to implement higher-performance handling, but I have not validated whether that is actually done.
Aspera was/is designed for high-latency links, i.e. sending multiple terabytes from London to New Zealand, or to LA.
For that use case, Aspera was the best tool for the job. It's designed to be fast over links that a single TCP stream couldn't fill.
You could, if you were so bold, stack up multiple TCP links and send data down those. You got the same speed, but possibly not the same efficiency. It was a fucktonne cheaper to do, though.
> I get 40 Gbit/s over a single localhost TCP stream on my 10-year-old laptop with iperf3.
Do you mean literally just streaming data from one process to another on the same machine, without that data ever actually transiting a real network link? There are so many caveats to that test that it's basically worthless for evaluating what could happen on a real network.
To measure the claimed overhead (TCP the protocol itself being slow), one should exclude, as far as possible, factors that necessarily affect alternative protocols as well (e.g. latency), which is what this test does.
It sounds like your reasoning starts from the assumption that any claimed slowness of TCP would be something like a fixed per-packet overhead or delay that could be isolated and added back onto the result of your local testing to get a useful prediction. And it sounds like you think alternative protocols must be equally affected by latency.
But it's much more complicated than that; TCP interacts with latency and congestion and packet loss as both cause and effect. If you're testing TCP without sending traffic over real networks that have their own buffering and congestion control and packet reordering and loss, you're going to miss all of the most important dynamics affecting real-world performance. For example, you're not going to measure how multiplexing multiple data streams onto one TCP connection allows head of line blocking to drastically inflate the impact of a lost or reordered packet, because none of that happens when all you're testing is the speed at which your kernel can context-switch packets between local processes.
And all of that is without even beginning to touch on what happens to wireless networks.
Somebody made a claim that TCP isn't high performance without specifying what that means; I gave a counterexample of just how high-performance TCP is, picking some arbitrary notion of "high performance".
Almost like it makes the point that arguing about "high performance" is useless without saying what that means.
That said:
> you're not going to measure how multiplexing multiple data streams onto one TCP connection
Of course not: When I want to argue against "TCP is not a high performance protocol", why would I want to measure some other protocol that multiplexes connections over TCP? That is not measuring the performance of TCP.
I could conjure up a protocol that requires an acknowledgement from the other side for each emitted packet before sending the next, and then claim "UDP is not high performance" when running that over UDP - that doesn't make sense.
No, the example exercises the full TCP protocol including all its logic.
There have been plenty of examples of protocols that are far from infinitely fast and do not scale even on localhost, e.g. OpenVPN, or any protocol that requires full acknowledgements from the other side before sending more.
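To make that concrete, here is a minimal stop-and-wait sketch over UDP (purely illustrative; the payload size, packet count, and loopback setup are made up): the sender refuses to send the next datagram until the previous one is acknowledged, so throughput is capped at roughly one payload per round trip, however fast the underlying link is.

    # Minimal stop-and-wait over UDP (illustration only, not iperf3's design):
    # send one datagram, block for its ack, repeat. No timeout/retransmit
    # handling here; this only shows the ack-per-packet pattern.
    import socket
    import threading
    import time

    PAYLOAD = b"x" * 1400          # one MTU-ish datagram
    COUNT = 5000

    def acker(sock):
        # Trivial receiver: ack every datagram it gets.
        for _ in range(COUNT):
            data, addr = sock.recvfrom(2048)
            sock.sendto(b"ack", addr)

    recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    recv_sock.bind(("127.0.0.1", 0))
    threading.Thread(target=acker, args=(recv_sock,), daemon=True).start()

    send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    dest = recv_sock.getsockname()

    start = time.time()
    for _ in range(COUNT):
        send_sock.sendto(PAYLOAD, dest)
        send_sock.recvfrom(64)      # wait for the ack before sending more
    elapsed = time.time() - start

    print(f"{COUNT * len(PAYLOAD) * 8 / elapsed / 1e6:.1f} Mbit/s")

On localhost the round trip is tiny so this still looks quick, but add 200 ms of latency and the same loop tops out at one 1400-byte payload per RTT, i.e. well under 100 kbit/s.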
High performance means transferring files from NZ to a director's yacht in the Mediterranean over a 40 Mbps satellite link and getting 40 Mbps, to the point that the link is unusable for anyone else.
UDP by itself cannot be used to transfer files or any other kind of data with a size bigger than an IP packet.
So it is impossible to compare the performance of TCP and UDP.
UDP is used to implement various other protocols, whose performance can be compared with TCP. Any protocol implemented over UDP must perform better than TCP in at least some specific scenarios; otherwise there would be no reason for it to exist.
I do not know how UDP is used by iperf3, but perhaps it uses some protocol akin to TFTP, i.e. it sends a new UDP packet when the other side acknowledges the previous UDP packet. In that case the speed of iperf3 over UDP will always be inferior to that of TCP.
Sending UDP packets without acknowledgment will always be faster than any usable transfer protocol, but the speed in that case does not provide any information about the network, only about the speed of executing a send loop on the sending computer and its network-interface card.
You can transfer data without using any transfer protocol, by just sending UDP packets at maximum rate, if you accept that a fraction of the data will be lost. The fraction that is lost can be minimized, but not eliminated, by using an error-correcting code.
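A minimal sketch of that blast-at-maximum-rate idea (the destination address, payload size, and count below are placeholders; real measurement tools also pace the sender and have the receiver report loss):

    # Fire-and-forget UDP blast (sketch): sequence-numbered datagrams sent as
    # fast as the send loop allows. Whatever the receiver misses is simply
    # lost; the rate printed here mostly reflects the local loop and NIC,
    # not the network in between.
    import socket, struct, time

    DEST = ("192.0.2.10", 9000)     # placeholder address, adjust as needed
    PAYLOAD = b"x" * 1400
    COUNT = 100_000

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    start = time.time()
    for seq in range(COUNT):
        # A 4-byte sequence number lets a receiver count gaps and compute loss.
        sock.sendto(struct.pack("!I", seq) + PAYLOAD, DEST)
    elapsed = time.time() - start
    print(f"offered load: {COUNT * (len(PAYLOAD) + 4) * 8 / elapsed / 1e6:.0f} Mbit/s")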
> perhaps it [..] sends a new UDP packet when the other side acknowledges the previous UDP packet. In that case the speed of iperf3 over UDP will always be inferior to that of TCP
It does not: per the bandwidth-delay calculation, such a scheme would fall short of the measured 4.5 Gbit/s by a factor of ~100x (the ping is around the usual 0.2 ms).
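Rough numbers for that sanity check, assuming MTU-sized datagrams of about 1500 bytes:

    # Back-of-the-envelope ceiling for a one-datagram-per-RTT (stop-and-wait) scheme.
    rtt_s = 0.2e-3                     # quoted ping of ~0.2 ms
    datagram_bytes = 1500              # assumed MTU-sized packets
    max_bps = datagram_bytes * 8 / rtt_s
    print(max_bps / 1e6)               # -> 60.0 Mbit/s ceiling
    print(4.5e9 / max_bps)             # -> 75.0, i.e. roughly 100x short of 4.5 Gbit/s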
With iperf3, as with many other UDP measurement tools, you set a sending rate and the other side reports how many bytes arrived.
It has been a long time since I last used iperf3, but now that you have mentioned it, I remember this as well.
So the previous poster has misinterpreted the iperf3 results by believing that UDP was slower. iperf3 cannot demonstrate a speed difference between TCP and UDP: for the former the speed is determined by the network, while for the latter it is determined by the "--bandwidth" command-line option. The poster has probably just seen some default UDP rate.
Aspera's FASP [0] is very neat. One drawback is that the work TCP would normally do in the traditional way has to be done on the CPU: say one packet is missing or packets arrive out of order, the Aspera client fixes that itself instead of it all being handled as TCP.
As I understand it, this is also the approach of WEKA.io [1]. Another approach is RDMA [2], used by storage systems like Vast, which pushes those ordering and resend tasks to NICs that support RDMA so that applications can read and write directly to the network instead of to system buffers.
FASP uses forward error correction instead of retransmission. So instead of waiting for something not to show up on the other end and sending it again, it calculates parity and transmits slightly more data up front, with enough redundancy that the receiving end is capable of reconstructing any missing bits.
This is basically how all storage systems work, not just Weka. You calculate enough parity bits to be able to reconstruct the missing data when a drive fails. The more disks you have, the smaller the parity overhead is. Object storage like S3 does this on a massive scale. With a network transfer you typically only need a few percent, unless it's really lossy like Wifi, in which case standards like 802.11n are doing FEC for you to reduce retransmissions at the TCP layer.
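As a toy sketch of the parity idea at the packet level (not Aspera's or Weka's actual coding; just one XOR parity packet per group, which can repair a single loss):

    # Toy forward error correction: one XOR parity packet per group of k data
    # packets lets the receiver reconstruct any single missing packet in the
    # group without asking the sender to retransmit.
    from functools import reduce

    def make_parity(packets):
        # XOR all packets together (equal-length packets assumed; pad in practice).
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), packets)

    def recover(received, parity):
        # 'received' is the group with exactly one packet missing (None);
        # XORing the parity with the survivors reproduces the missing packet.
        missing_index = received.index(None)
        survivors = [p for p in received if p is not None]
        rebuilt = make_parity(survivors + [parity])
        return missing_index, rebuilt

    group = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]     # k = 4 data packets
    parity = make_parity(group)                       # 25% overhead for this group

    damaged = [b"AAAA", None, b"CCCC", b"DDDD"]       # packet 1 lost in transit
    idx, rebuilt = recover(damaged, parity)
    assert (idx, rebuilt) == (1, b"BBBB")

The overhead is one extra packet per group of k, i.e. 1/k, which is why a few percent of redundancy goes a long way on a mildly lossy link (at the cost of this naive scheme only repairing one loss per group).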
Usually RDMA is over a network that is supposed to be lossless, but it does have checksums to detect corruption and recovers with retransmission. Infiniband NICs handle all of that.
Can we throw a bunch of AI agents at it? This sounds like a pretty tightly defined problem, much better than wasting tokens on re-inventing web browsers.
If you strip out the swarm logic (i.e. downloading from multiple peers), you're just left with a protocol that transfers big files in chunks, so there's no reason it'd be faster than any other sort of download manager that supports multi-threaded downloads.
Aspera did the chunking and encryption for you, and it looked and acted like SFTP.
The cost of leaking data was/is catastrophic (as in company-ending), so paying a bit of money to guarantee that your data was being sent to the right place (point to point) and couldn't leak was a worthwhile tradeoff.
For point-to-point transfer, torrenting has a lot more overhead than you want. Plus, most clients have an anti-leeching setting, so you'd need not only a custom client but a custom protocol as well.
The idea is sound though: have an index file and then a list of chunks to pull over multiple TCP connections.
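A minimal sketch of that, using plain HTTP range requests to stand in for the chunk list (the URL, chunk size, and worker count are made up; a real tool would add per-chunk checksums and retries):

    # Sketch: pull one big file as fixed-size chunks over several parallel TCP
    # connections (HTTP range requests play the role of the "list of chunks").
    import concurrent.futures
    import urllib.request

    URL = "https://example.com/big-file.bin"   # placeholder, must support Range requests
    CHUNK = 8 * 1024 * 1024                    # 8 MiB per chunk

    def total_size(url):
        req = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(req) as resp:
            return int(resp.headers["Content-Length"])

    def fetch_chunk(url, start, end):
        # Each worker opens its own TCP connection and asks for one byte range.
        req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
        with urllib.request.urlopen(req) as resp:
            return start, resp.read()

    size = total_size(URL)
    ranges = [(off, min(off + CHUNK, size) - 1) for off in range(0, size, CHUNK)]

    with open("big-file.bin", "wb") as out, \
         concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(fetch_chunk, URL, s, e) for s, e in ranges]
        for fut in concurrent.futures.as_completed(futures):
            start, data = fut.result()
            out.seek(start)
            out.write(data)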
I'm in a tiny part of the film industry. Bigger clients lend us licenses to Aspera and FileCatalyst when receiving files from them, but for our own trans-oceanic transfers I dug up an ancient program called Tsunami UDP and fixed it up just enough.