In general TCP just isn't great for high performance. In the film industry we used to use a commercial product, Aspera (now owned by IBM), which emulated ftp or scp but used UDP with forward error correction (instead of TCP retransmission). You could configure it to use a specific amount of bandwidth and it would just push everything else off the network to achieve it.
I get 40 Gbit/s over a single localhost TCP stream on my 10-year-old laptop with iperf3.
So TCP itself does not seem to be the bottleneck, if 40 Gbit/s is "high" enough, which it probably is currently for most people.
I have also seen plenty of situations in which TCP is faster than UDP in datacenters.
For example, on Hetzner Cloud VMs, iperf3 gets me 7 Gbit/s over TCP but only 1.5 Gbit/s over UDP. On Hetzner dedicated servers with 10 Gbit links, I get 10 Gbit/s over TCP but only 4.5 Gbit/s over UDP. But this could also be due to my use of iperf3 or its implementation.
I also suspect that TCP being a protocol whose state is inspectable by the network equipment between the endpoints allows that equipment to implement higher-performance handling, but I have not validated whether that is actually done.
Aspera was/is designed for high-latency links, i.e. sending multiple terabytes from London to New Zealand, or to LA.
For that use case, Aspera was the best tool for the job. It's designed to be fast over links that a single TCP stream couldn't fill.
You could, if you were so bold, stack up multiple TCP links and send data down those. You got the same speed, but possibly not the same efficiency. It was a fucktonne cheaper to do, though.
> I get 40 Gbit/s over a single localhost TCP stream on my 10-year-old laptop with iperf3.
Do you mean literally just streaming data from one process to another on the same machine, without that data ever actually transiting a real network link? There are so many caveats to that test that it's basically worthless for evaluating what could happen on a real network.
To measure the claimed overhead (TCP the protocol itself being slow), one should exclude, as far as possible, factors that necessarily affect alternative protocols as well (e.g. latency), which is what this test does.
It sounds like your reasoning starts from the assumption that any claimed slowness of TCP would be something like a fixed per-packet overhead or delay that could be isolated and added back onto the result of your local testing to get a useful prediction. And it sounds like you think alternative protocols must be equally affected by latency.
But it's much more complicated than that; TCP interacts with latency and congestion and packet loss as both cause and effect. If you're testing TCP without sending traffic over real networks that have their own buffering and congestion control and packet reordering and loss, you're going to miss all of the most important dynamics affecting real-world performance. For example, you're not going to measure how multiplexing multiple data streams onto one TCP connection allows head of line blocking to drastically inflate the impact of a lost or reordered packet, because none of that happens when all you're testing is the speed at which your kernel can context-switch packets between local processes.
And all of that is without even beginning to touch on what happens to wireless networks.
Somebody made a claim that TCP isn't high performance without specifying what that means; I gave a counterexample of just how high-performance TCP is, picking some arbitrary notion of "high performance".
Almost like it makes the point that arguing about "high performance" is useless without saying what that means.
That said:
> you're not going to measure how multiplexing multiple data streams onto one TCP connection
Of course not: When I want to argue against "TCP is not a high performance protocol", why would I want to measure some other protocol that multiplexes connections over TCP? That is not measuring the performance of TCP.
I could conjure up a protocol that requires an acknowledgement from the other side for each emitted packet before sending the next, and then claim "UDP is not high performance" when running that over UDP - that doesn't make sense.
No, the example exercises the full TCP protocol including all its logic.
There have been plenty of examples of protocols that are far from infinitely fast and do not scale even on localhost, e.g. OpenVPN, or any protocol that requires full acknowledgements from the other side before sending more.
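To make that concrete, here is a minimal stop-and-wait sketch over UDP (purely illustrative; the payload size, packet count, and loopback setup are made up): the sender refuses to send the next datagram until the previous one is acknowledged, so throughput is capped at roughly one payload per round trip, however fast the underlying link is.

    # Minimal stop-and-wait over UDP (illustration only, not iperf3's design):
    # send one datagram, block for its ack, repeat. No timeout/retransmit
    # handling here; this only shows the ack-per-packet pattern.
    import socket
    import threading
    import time

    PAYLOAD = b"x" * 1400          # one MTU-ish datagram
    COUNT = 5000

    def acker(sock):
        # Trivial receiver: ack every datagram it gets.
        for _ in range(COUNT):
            data, addr = sock.recvfrom(2048)
            sock.sendto(b"ack", addr)

    recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    recv_sock.bind(("127.0.0.1", 0))
    threading.Thread(target=acker, args=(recv_sock,), daemon=True).start()

    send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    dest = recv_sock.getsockname()

    start = time.time()
    for _ in range(COUNT):
        send_sock.sendto(PAYLOAD, dest)
        send_sock.recvfrom(64)      # wait for the ack before sending more
    elapsed = time.time() - start

    print(f"{COUNT * len(PAYLOAD) * 8 / elapsed / 1e6:.1f} Mbit/s")

On localhost the round trip is tiny so this still looks quick, but add 200 ms of latency and the same loop tops out at one 1400-byte payload per RTT, i.e. well under 100 kbit/s.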
High performance means transferring files from NZ to a director's yacht in the Mediterranean over a 40 Mbps satellite link and getting 40 Mbps, to the point that the link is unusable for anyone else.
UDP by itself cannot be used to transfer files or any other kind of data with a size bigger than an IP packet.
So it is impossible to compare the performance of TCP and UDP.
UDP is used to implement various other protocols, whose performance can be compared with TCP. Any protocol implemented over UDP must perform better than TCP in at least some specific scenarios; otherwise there would be no reason for it to exist.
I do not know how UDP is used by iperf3, but perhaps it uses some protocol akin to TFTP, i.e. it sends a new UDP packet when the other side acknowledges the previous UDP packet. In that case the speed of iperf3 over UDP will always be inferior to that of TCP.
Sending UDP packets without acknowledgment will always be faster than any usable transfer protocol, but the speed in that case does not provide any information about the network, only about the speed of executing a send loop on the sending computer and its network-interface card.
You can transfer data without using any transfer protocol, by just sending UDP packets at maximum rate, if you accept that a fraction of the data will be lost. The fraction that is lost can be minimized, but not eliminated, by using an error-correcting code.
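A minimal sketch of that blast-at-maximum-rate idea (the destination address, payload size, and count below are placeholders; real measurement tools also pace the sender and have the receiver report loss):

    # Fire-and-forget UDP blast (sketch): sequence-numbered datagrams sent as
    # fast as the send loop allows. Whatever the receiver misses is simply
    # lost; the rate printed here mostly reflects the local loop and NIC,
    # not the network in between.
    import socket, struct, time

    DEST = ("192.0.2.10", 9000)     # placeholder address, adjust as needed
    PAYLOAD = b"x" * 1400
    COUNT = 100_000

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    start = time.time()
    for seq in range(COUNT):
        # A 4-byte sequence number lets a receiver count gaps and compute loss.
        sock.sendto(struct.pack("!I", seq) + PAYLOAD, DEST)
    elapsed = time.time() - start
    print(f"offered load: {COUNT * (len(PAYLOAD) + 4) * 8 / elapsed / 1e6:.0f} Mbit/s")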
> perhaps it [..] sends a new UDP packet when the other side acknowledges the previous UDP packet. In that case the speed of iperf3 over UDP will always be inferior to that of TCP
It does not: per the bandwidth-delay calculation, such a scheme would fall short of the measured 4.5 Gbit/s by a factor of ~100x (the ping is around the usual 0.2 ms).
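Rough numbers for that sanity check, assuming MTU-sized datagrams of about 1500 bytes:

    # Back-of-the-envelope ceiling for a one-datagram-per-RTT (stop-and-wait) scheme.
    rtt_s = 0.2e-3                     # quoted ping of ~0.2 ms
    datagram_bytes = 1500              # assumed MTU-sized packets
    max_bps = datagram_bytes * 8 / rtt_s
    print(max_bps / 1e6)               # -> 60.0 Mbit/s ceiling
    print(4.5e9 / max_bps)             # -> 75.0, i.e. roughly 100x short of 4.5 Gbit/s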
With iperf3, as with many other UDP measurement tools, you set a sending rate and the other side reports how many bytes arrived.
It has been a long time since I last used iperf3, but now that you have mentioned it, I remember this as well.
So the previous poster has misinterpreted the iperf3 results by believing that UDP was slower. iperf3 cannot demonstrate a speed difference between TCP and UDP: for the former the speed is determined by the network, while for the latter it is determined by the "--bandwidth" command-line option. The poster has probably just seen some default UDP rate.
Aspera's FASP [0] is very neat. One drawback is that the work TCP would normally do in the traditional way has to be done on the CPU: say one packet is missing or packets arrive out of order, the Aspera client fixes that itself instead of it all being handled as TCP.
As I understand it, this is also the approach of WEKA.io [1]. Another approach is RDMA [2], used by storage systems like Vast, which pushes those ordering and resend tasks to NICs that support RDMA so that applications can read and write directly to the network instead of to system buffers.
FASP uses forward error correction instead of retransmission. So instead of waiting for something not to show up on the other end and sending it again, it calculates parity and transmits slightly more data up front, with enough redundancy that the receiving end is capable of reconstructing any missing bits.
This is basically how all storage systems work, not just Weka. You calculate enough parity bits to be able to reconstruct the missing data when a drive fails. The more disks you have, the smaller the parity overhead is. Object storage like S3 does this on a massive scale. With a network transfer you typically only need a few percent, unless it's really lossy like Wifi, in which case standards like 802.11n are doing FEC for you to reduce retransmissions at the TCP layer.
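As a toy sketch of the parity idea at the packet level (not Aspera's or Weka's actual coding; just one XOR parity packet per group, which can repair a single loss):

    # Toy forward error correction: one XOR parity packet per group of k data
    # packets lets the receiver reconstruct any single missing packet in the
    # group without asking the sender to retransmit.
    from functools import reduce

    def make_parity(packets):
        # XOR all packets together (equal-length packets assumed; pad in practice).
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), packets)

    def recover(received, parity):
        # 'received' is the group with exactly one packet missing (None);
        # XORing the parity with the survivors reproduces the missing packet.
        missing_index = received.index(None)
        survivors = [p for p in received if p is not None]
        rebuilt = make_parity(survivors + [parity])
        return missing_index, rebuilt

    group = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]     # k = 4 data packets
    parity = make_parity(group)                       # 25% overhead for this group

    damaged = [b"AAAA", None, b"CCCC", b"DDDD"]       # packet 1 lost in transit
    idx, rebuilt = recover(damaged, parity)
    assert (idx, rebuilt) == (1, b"BBBB")

The overhead is one extra packet per group of k, i.e. 1/k, which is why a few percent of redundancy goes a long way on a mildly lossy link (at the cost of this naive scheme only repairing one loss per group).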
Usually RDMA is over a network that is supposed to be lossless, but it does have checksums to detect corruption and recovers with retransmission. Infiniband NICs handle all of that.
Can we throw a bunch of AI agents at it? This sounds like a pretty tightly defined problem, much better than wasting tokens on re-inventing web browsers.
If you strip out the swarm logic (i.e. downloading from multiple peers), you're just left with a protocol that transfers big files in chunks, so there's no reason it'd be faster than any other sort of download manager that supports multi-threaded downloads.
Aspera did the chunking and encryption for you, and it looked and acted like SFTP.
The cost of leaking data was/is catastrophic (as in company-ending), so paying a bit of money to guarantee that your data was being sent to the right place (point to point) and couldn't leak was a worthwhile tradeoff.
For point-to-point transfer, torrenting has a lot more overhead than you want. Plus, most clients have an anti-leeching setting, so you'd need not only a custom client but a custom protocol as well.
The idea is sound though: have an index file and then a list of chunks to pull over multiple TCP connections.
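A minimal sketch of that, using plain HTTP range requests to stand in for the chunk list (the URL, chunk size, and worker count are made up; a real tool would add per-chunk checksums and retries):

    # Sketch: pull one big file as fixed-size chunks over several parallel TCP
    # connections (HTTP range requests play the role of the "list of chunks").
    import concurrent.futures
    import urllib.request

    URL = "https://example.com/big-file.bin"   # placeholder, must support Range requests
    CHUNK = 8 * 1024 * 1024                    # 8 MiB per chunk

    def total_size(url):
        req = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(req) as resp:
            return int(resp.headers["Content-Length"])

    def fetch_chunk(url, start, end):
        # Each worker opens its own TCP connection and asks for one byte range.
        req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
        with urllib.request.urlopen(req) as resp:
            return start, resp.read()

    size = total_size(URL)
    ranges = [(off, min(off + CHUNK, size) - 1) for off in range(0, size, CHUNK)]

    with open("big-file.bin", "wb") as out, \
         concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(fetch_chunk, URL, s, e) for s, e in ranges]
        for fut in concurrent.futures.as_completed(futures):
            start, data = fut.result()
            out.seek(start)
            out.write(data)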
I'm in a tiny part of the film industry. Bigger clients lend us licenses to Aspera and FileCatalyst when receiving files from them, but for our own trans-oceanic transfers I dug up an ancient program called Tsunami UDP and fixed it up just enough.