
Wouldn't you have to re-invent things like range searches, which are 'BETWEEN ... AND ...' queries in SQL? The same goes for finding events that belong to users 1, 2 and 3, etc.

In a real application you'd probably already have user accounts of some sort stored in a relational database, and then you'd suddenly have to scan a directory for events and connect them to those records in the database.

So there might be a specific set of applications where you are right, but there are things a database is really good at, which would make it a really good choice. With the proper indices you'd probably get the same or even better throughput, unless you come up with some clever directory structure for your events. That structure would in effect be an index, and only on one dimension, whereas in a database you can create indices on many dimensions and combinations of dimensions.
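To make that concrete, here is a rough sketch using Python's built-in sqlite3 module. The `events` table, its columns, the index name and the sample rows are all made up for illustration; the point is only that the range search and the "users 1, 2 and 3" lookup fit in one query, and a single composite index can serve both.

```python
import sqlite3

# Hypothetical schema: table, column and index names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, starts_at TEXT, title TEXT)"
)
# One composite index covers lookups by user and by user + date range.
conn.execute("CREATE INDEX idx_events_user_start ON events (user_id, starts_at)")

conn.executemany(
    "INSERT INTO events (user_id, starts_at, title) VALUES (?, ?, ?)",
    [
        (1, "2024-01-05", "dentist"),
        (2, "2024-01-10", "standup"),
        (3, "2024-02-01", "launch"),
        (4, "2024-01-15", "offsite"),
    ],
)


def events_between(conn, user_ids, start, end):
    """Return titles of events for the given users within [start, end]."""
    placeholders = ",".join("?" for _ in user_ids)
    rows = conn.execute(
        "SELECT title FROM events "
        f"WHERE user_id IN ({placeholders}) AND starts_at BETWEEN ? AND ? "
        "ORDER BY starts_at",
        [*user_ids, start, end],
    )
    return [title for (title,) in rows]


# ISO date strings sort lexicographically, so BETWEEN works on TEXT here.
january_titles = events_between(conn, [1, 2, 3], "2024-01-01", "2024-01-31")
```

With files on disk, each of those two filters would need its own directory layout or a full scan.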

So you are right: trade-offs.



I didn't mean to imply avoiding a database entirely. Almost any DB system tasked with copying a few long strings around in a simple query won't perform much worse than a literal raw disk file.

Even just something like: CREATE TABLE calendar(id, user_id, blob)
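As a sketch of what that could look like in practice, using Python's sqlite3: the column types, the JSON encoding of the blob, and the helper names `save_calendar` and `load_calendars` are my assumptions, since only the three column names are given.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are an assumption; only the column names come from the comment.
conn.execute(
    "CREATE TABLE calendar (id INTEGER PRIMARY KEY, user_id INTEGER, blob BLOB)"
)


def save_calendar(conn, user_id, events):
    """Serialize a whole list of events into one opaque blob (JSON here)."""
    payload = json.dumps(events).encode("utf-8")
    conn.execute(
        "INSERT INTO calendar (user_id, blob) VALUES (?, ?)", (user_id, payload)
    )


def load_calendars(conn, user_id):
    """Fetch and decode every blob stored for one user, oldest first."""
    rows = conn.execute(
        "SELECT blob FROM calendar WHERE user_id = ? ORDER BY id", (user_id,)
    )
    return [json.loads(blob.decode("utf-8")) for (blob,) in rows]


save_calendar(conn, 1, [{"title": "dentist", "day": "2024-01-05"}])
save_calendar(conn, 2, [{"title": "standup", "day": "2024-01-10"}])
user_one = load_calendars(conn, 1)
```

The database only does the lookup by user_id; anything inside the blob still has to be filtered in application code.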


Yes, sure. I can imagine, though, that you'd normally also want to query on the details of an event, in which case having most things in columns makes sense because you can combine them with JOIN queries, etc.

Also, in the context of web applications, you probably already have a database and probably don't have persistent disks on your application servers, which adds complexity to the file-based scenario. In that case using blobs is indeed a perfectly fine solution.

Still, you are right that in many cases, let's say a desktop application, you are probably better off reading directly from tens of files on disk rather than dealing with the complexity of a database.

The same applies to vector databases. I read an article a few months ago about just storing vectors in files and looping through them instead of setting up a vector database, and the performance was pretty much the same for the author's use case.
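I don't have the article's code, so this is only my own guess at what "vectors in a file plus a loop" might look like, in plain Python. The fixed dimension, the little-endian float32 file layout and the cosine scoring are all assumptions for the sketch.

```python
import math
import os
import struct
import tempfile

DIM = 3  # vector dimensionality; an assumption for this sketch
RECORD = f"<{DIM}f"  # one vector = DIM little-endian float32 values


def write_vectors(path, vectors):
    """Append-free writer: dump all vectors back to back into one file."""
    with open(path, "wb") as f:
        for vec in vectors:
            f.write(struct.pack(RECORD, *vec))


def read_vectors(path):
    """Yield vectors one record at a time, without loading the whole file."""
    size = struct.calcsize(RECORD)
    with open(path, "rb") as f:
        while chunk := f.read(size):
            yield struct.unpack(RECORD, chunk)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(path, query, k):
    """Brute force: score every stored vector, return the k best positions."""
    scored = [(cosine(query, vec), i) for i, vec in enumerate(read_vectors(path))]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]


# Usage: three tiny vectors written to a temporary file.
path = os.path.join(tempfile.mkdtemp(), "vectors.bin")
write_vectors(path, [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.9, 0.1, 0.0)])
nearest = top_k(path, (1.0, 0.0, 0.0), k=2)
```

Every query is a full scan, so this only stays comparable to a real vector index while the file is small enough to loop over quickly.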


And then you quickly get to the not-author's-use cases where you have to start reinventing wheels, poorly.


You mean with the vectors or the other parts?


If I understood him correctly, I think this is where some languages' collection libraries ought to shine: PHP/Laravel collections or C# LINQ, for example. Tell it how to load a record into the collection, add a one-liner for each criterion you want to define, and in a few dozen lines you're good to go.
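A rough Python analogue of that idea, to show the shape of it. This is not Laravel's or LINQ's API: the `Collection` class, its `where`/`sort_by`/`pluck` methods, the `Event` record and the comma-separated line format are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    user_id: int
    day: date
    title: str


class Collection:
    """Tiny hypothetical collection wrapper, not a real library API."""

    def __init__(self, items):
        self._items = list(items)

    def where(self, predicate):
        return Collection(item for item in self._items if predicate(item))

    def sort_by(self, key):
        return Collection(sorted(self._items, key=key))

    def pluck(self, attr):
        return [getattr(item, attr) for item in self._items]


def load_record(line):
    """How to load one record: 'user_id,YYYY-MM-DD,title' (assumed format)."""
    user_id, day, title = line.strip().split(",", 2)
    return Event(int(user_id), date.fromisoformat(day), title)


lines = [
    "2,2024-01-10,standup",
    "1,2024-01-05,dentist",
    "3,2024-02-01,launch",
    "4,2024-01-15,offsite",
]
events = Collection(load_record(line) for line in lines)

# One line per criterion: a set of users, then a date range.
titles = (
    events.where(lambda e: e.user_id in {1, 2, 3})
    .where(lambda e: date(2024, 1, 1) <= e.day <= date(2024, 1, 31))
    .sort_by(lambda e: e.day)
    .pluck("title")
)
```

Each `where` is still a linear pass over everything loaded, which is the trade-off against a database index.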



