Good news for curl users: https://github.com/mandatoryprogrammer/thermoptic

benatkin · 2025-10-03T03:47:53 1759463273

> NOTE: Due to many WAFs employing JavaScript-level fingerprinting of web browsers, thermoptic also exposes hooks to utilize the browser for key steps of the scraping process. See this section for more information on this.

This reminds me of how Stripe does user tracking for fraud detection (https://mtlynch.io/stripe-update/). I wonder if thermoptic could handle that.

mips_avatar · 2025-10-03T05:06:59 1759468019

Cool project!

mandatory · 2025-10-03T05:41:49 1759470109

Thanks!

joshmn · 2025-10-03T03:32:08 1759462328

Work like this is incredible. I did not know this existed. Thank you.

mandatory · 2025-10-03T05:41:24 1759470084

Thanks :) if you have any issues with it let me know.

snowe2010 · 2025-10-03T14:50:11 1759503011

People like you are why independent sites can’t afford to run on the internet anymore.

1gn15 · 2025-10-06T05:59:55 1759730395

I block all humans (only robots are allowed) and I'm still able to run independent websites.

mandatory · 2025-10-03T16:47:34 1759510054

They can't? I've run many free independent sites for years, so that's news to me.

timbowhite · 2025-10-03T18:21:35 1759515695

I run independent websites and I'm not broke yet.

Symbiote · 2025-10-03T11:26:37 1759490797

Oh great /s

In a month or two, I can be annoyed when I see some vibe-coded AI startup's script making five million requests a day to work's website with this.

They'll have been ignoring the error responses:

  {"All data is public and available for free download": "https://example.edu/very-large-001.zip"}

— a message we also write in the first line of every HTML page source.

Then I will spend more time fighting this shit, and less time improving the public data system.

mandatory · 2025-10-03T18:42:55 1759516975

Feel free to read the README. Before thermoptic, startups could already pay for this capability through private premium proxy services.

Having an open source version allows regular people to do scraping and not just those rich in capital.

Many of the best data services on the internet started with scraping; the README lists a number of them.