I mean, it's a nice idea, but short of FTL or wormholes there is never going to be a practical need to do this. If the sun were to go supernova, then perhaps it would happen, but even then we would spend centuries in space terraforming planets to make them liveable.
Differences:
Sharded SQLite, BigQuery export, build script open on GitHub, interactive “archived website” view of HN, updated weekly (each build costs a couple of dollars on a custom GitHub runner).
@keepamovin thanks, your project was a big inspiration for this.
I built my own pipeline with a slightly different setup. I use Go to download and process the data, and update it every 5 minutes using the HN API, trying to stay within fair use. It is also easy to tweak if someone wants faster or slower updates.
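The polling loop described above can be sketched roughly like this (a minimal Python sketch against the public HN Firebase API; the actual pipeline is in Go, and the storage step is elided):

```python
import json
import urllib.request

HN_API = "https://hacker-news.firebaseio.com/v0"


def new_item_ids(last_seen_max: int, current_max: int) -> list[int]:
    """Pure helper: item IDs created since the last sync pass."""
    return list(range(last_seen_max + 1, current_max + 1))


def fetch_json(path: str):
    """Fetch one endpoint of the HN Firebase API as JSON."""
    with urllib.request.urlopen(f"{HN_API}/{path}.json", timeout=10) as resp:
        return json.load(resp)


def sync_once(last_seen_max: int) -> int:
    """One polling pass: fetch every item added since last_seen_max."""
    current_max = fetch_json("maxitem")
    for item_id in new_item_ids(last_seen_max, current_max):
        item = fetch_json(f"item/{item_id}")
        # ... append `item` to local storage here ...
    return current_max
```

Running `sync_once` every 5 minutes (cron or a ticker) and persisting the returned max ID between runs gives the basic sync; the `/v0/updates.json` endpoint additionally lists recently changed items, which helps pick up edits and deletions.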
One part I really like is the "dynamic" README on Hugging Face. It is generated automatically by the code and keeps updating as new commits come in, so you can just open it and quickly see the current state.
The code is still a bit messy right now (I open-sourced it together with around 3.6M lines across 100+ other tools, hidden in a corner of GitHub; anyone interested can play Sherlock Holmes and find it :) ), but I will clean it up, open-source it as a clearer standalone repository, and write a proper blog post explaining how it works.
Connecting directly with the author of the project that inspired me is awesome.
Let's collaborate and see how we can make our two projects work together.
DuckDB has a feature that can write to SQLite: https://duckdb.org/docs/stable/core_extensions/sqlite. Starting from Parquet files, we could use DuckDB to write into SQLite databases. This could cut the ingest time from about a week to around five minutes.
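Assuming the export is plain Parquet, a minimal sketch of that path in DuckDB SQL (the file and table names here are made up):

```sql
-- Load the SQLite extension and attach a target database file.
INSTALL sqlite;
LOAD sqlite;
ATTACH 'hn.sqlite' AS hn (TYPE sqlite);

-- Bulk-copy the Parquet files straight into SQLite.
CREATE TABLE hn.items AS
SELECT * FROM read_parquet('items/*.parquet');
```

The bulk copy happens entirely inside DuckDB, which is where the speedup over a row-by-row insert loop would come from; sharding would just mean attaching several target databases and partitioning the `SELECT`.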
If I have some free time this weekend, I would definitely like to contribute to your project. Would you be interested?
As for my background, I focus on data engineering and data architecture. I help clients build very large-scale data pipelines, ranging from near real-time systems (under 10 ms) to large batch processing systems (handling up to 1 billion business transactions per day across thousands of partners). Some of these systems use mathematical models I developed, particularly in graph theory.
One of the things I got interested in from the comments on my Show HN was Parquet; everyone was raving about it. Happy to see a project using it today.
The way I bundle modules into a SEA when they need to be imported from disk (ones that can't be bundled because they are native Node or WASM modules) is to just include them in the assets and do a "write to tmp, import, delete" flow. It works.
Not saying a VFS is bad, just that setting this up takes only a few lines of code. My idea for a simple version of a VFS in Node is to use a RAM disk/RAMfs; would that work?
It's keyed fairly simply: mine all-time HN data for "related:"-type comments on stories that link to other stories, then do a title-vs-title bag-of-words cosine similarity and rank ascending by similarity (least similar first), with an overall sort on some other metrics across all pairings.
The point is to surface the "high signal" related-comment edges. I started out with a few iterations of this idea surfacing the obvious "high overlap" cosine-similarity titles, but that wasn't popular. When I considered why, the most rational reason is that it's the most expected, least interesting, lowest entropy. So I inverted it; hence the name. Enjoy.
Stripe Identity is good, especially if you already use Stripe.
The main difference is that Stripe built identity mostly for their payments ecosystem, while Didit is a standalone identity infrastructure that works across any platform and any identity flow.
We also optimized heavily for fraud detection and speed, and we offer much better pricing.
The fragmentation and friction! Comparing prices usually requires 10 open browser tabs and a spreadsheet, which is what keeps people locked into their default cloud. I built a tool to solve this called BlueDot (i.e., Earth, where all the clouds are)[0]. It's a TUI that aggregates 58,000+ server configurations across 6 clouds (including Hetzner), lets you view side-by-side price comparisons, and deploys instantly from the terminal. It makes grabbing a cheap Hetzner box just as easy as spinning up something on AWS/GCP.
I use serververify, which was created by jbiloh from the LowEndTalk forum, and which uses YABS (Yet-Another-Bench-Script) to give details about a lot more things than usually meet the eye.
That being said, I have found getdeploying.com to be a decent starting point as well if you aren't well versed in the low-end providers, who are quite diverse, and that diversity comes with both costs and benefits.
Btw, the legendary https://vpspricetracker.com (one of the first websites I personally opened to find VPS prices when I was starting out or just curious) was also created by jbiloh.
So these few websites, plus casually scrolling LET, are enough for me nowadays to find the winner with infinitely more customizability. I understand the point of a TUI, but the hosting industry has always revolved around websites from the start, so providers are generally less interested in making TUIs for such projects; at least that's my opinion.
Thanks! A planned next iteration is to include more non-mainstream cheap providers in the TUI. But that's not as simple as the current model, which wraps official CLIs; these alt providers typically don't have CLIs, and their listing and control surfaces are diverse.
- Grok: idea: how much did it cost? Input a GitHub repo, and we analyze LoC and commits and estimate what it would have cost to build it on different AI providers (API, subscription).
- Gemini Pro 3.1: Hey bud, I like this, but is this really getting the LoC accurate? And the commit count? Ideally I'd like to cost it on the number of lines changed per commit, you know? But that could be fucking too many requests. So... I think we can use a mathematical model that assumes lines grow over time, and that getting to a commit might take so many iterations. <snip ... grok's attempt>
Earth is a jewel, but we have to expand and explore. It's our destiny.
Ultimately you need to live underground on Mars to avoid radiation.