noobiemcfoob's comments

The only feature keeping me on Chrome is web bluetooth.


>use Elasticsearch

Found your problem.


“Why didn’t we use an RDBMS in the first place?”

Because initial application specifications are sparse and definitely wrong. If your application is still up and running 5 years later and your data definition hasn't changed much in the past 3, then maybe refactor around an RDBMS. Designing around rigid structures during your first pass is costly. This is why there's been a rise in NoSQL and dynamic languages.


It seems the other way around to me. RDBMSes have well-defined ways of managing change in schema. Through a combination of modifying the schema itself and judicious use of views and stored procedures, it's often relatively easy to evolve the data model in ways that produce minimal disruption for the application's consumers.
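
To make that concrete, here is a rough sketch of the kind of evolution I mean (sqlite3 in Python just to keep it self-contained; the table, column, and view names are made up):

  import sqlite3

  con = sqlite3.connect(":memory:")
  con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT)")
  # Requirements change: we now want first/last names. Add columns rather than break consumers.
  con.execute("ALTER TABLE users ADD COLUMN first_name TEXT")
  con.execute("ALTER TABLE users ADD COLUMN last_name TEXT")
  # A view preserves the old shape, so existing consumers keep working while they migrate.
  con.execute("CREATE VIEW users_v1 AS SELECT id, full_name FROM users")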

Contrast that with a Cassandra workshop I attended a few weeks ago: the speaker, when asked directly, conceded that in Cassandra you really do need to nail your schema on the first try, because there are no great retroactive schema migration mechanisms, and any evolution is going to result in all consuming applications needing to know about all possible versions of the data model. Which ends up being a huge source of technical debt. And heaven help you if you didn't get the indexing strategy right on the first try.

I think that this might point to the classic tension between easy and simple: RDBMSes are focused (admittedly to varying degrees of success) on trying to keep things simple, but there might be some work involved. NoSQL solutions are often sold as being easy to work with, and I don't deny that at all, but my experience is that, in the long run, they can become a huge source of complexity.

This isn't a tension that's unique to software. In my contractor days, we'd also do things one way when it was just a quick-and-dirty job, and a whole different way if we were looking to build something to last. E.g., you'll never catch me using a sprayer to paint my own house, no matter how much faster it is.


I've been trying to deal with migrating data in a Firebase database, and it's come down to exporting the data to JSON and searching through it with ripgrep and jq. Not ideal! And I've had to deal with entries missing "required" props, and entries that have keys with typos, etc. Luckily our dataset is still small enough that this is practical, but none of this work would be required if our product had been built on a SQL database in the first place (I wasn't around when that choice was originally made).


Completely disagree with this. SQL is more flexible than NoSQL. NoSQL databases generally have poor query support, which means if you want to do complex queries you need to precompute the results. This requires you to know your requirements up front. If you're using SQL, then you can do these queries on the fly.

Sure, SQL databases make you define a schema. But it's very easy to change this when requirements change.
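
To illustrate the "complex queries on the fly" point: here's a hypothetical ad-hoc aggregation (sqlite3; the table and column names are made up) that needs no precomputation, whereas a document store without joins would typically need a precomputed rollup for it:

  import sqlite3

  con = sqlite3.connect("app.db")
  # Nobody planned this report up front; it's still one query.
  rows = con.execute("""
      SELECT c.region, COUNT(*) AS orders, SUM(o.total) AS revenue
      FROM orders o JOIN customers c ON c.id = o.customer_id
      WHERE o.created_at >= date('now', '-30 days')
      GROUP BY c.region
      ORDER BY revenue DESC
  """).fetchall()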


It's really not. That's a myth, and you get all of the benefits you want just by designing with a dynamic language.

It's less costly to create a relation table when you realize there may be multiple instances of a piece of data associated with a record than it is to just stick those pieces in an array.

Because when you don't go relational, you get the extra work of modifying all of your existing NoSQL records to adopt the new array structure. And as a bonus, the relation table makes it easy to do queries in both directions.

NoSQL offers virtually no efficiency benefit unless you're actually consuming unstructured and variable data.
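
A concrete version of the relation-table point above, sketched with sqlite3 (the names are made up): when you discover a user can have several emails, you add a table instead of retrofitting an array onto every existing record, and queries work in either direction.

  import sqlite3

  con = sqlite3.connect(":memory:")
  con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
  con.execute("CREATE TABLE user_emails (user_id INTEGER REFERENCES users(id), email TEXT)")
  # Emails for a given user...
  con.execute("SELECT email FROM user_emails WHERE user_id = ?", (1,))
  # ...and the reverse lookup comes for free.
  con.execute("SELECT u.name FROM users u JOIN user_emails e ON e.user_id = u.id WHERE e.email = ?", ("a@example.com",))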


With NoSQL solutions you're typically pushing data migrations into code. Yes, this is technical debt, but not having to deal with SQL-based data migrations is a pretty big time saver early on.


Not really. Writing a SQL migration takes what, 10 minutes max? Or you add the columns as you go, and it all just merges into the normal dev time of the feature anyway.

You'll easily make up this lost time just in not having to immediately clean up crappy data that you've written to the database while you're developing the feature. That's been my experience anyway.


I mean, we're all programmers here, and we've all dealt with the growing pains of changing requirements.

But has an RDBMS ever been a major source of that pain? I can't say I've ever encountered a time when it has. If you need to change the structure, just write a script.

I'm not saying it's completely painless, but nothing is.


Before I started using migration scripts (and before there were off-the-shelf libraries for this), it was a bit of a pain. But these days, nope. MySQL is slightly more annoying than Postgres because it doesn't let you wrap DDL queries in transactions, so you can get left in an inconsistent state if your migration has a bug. Postgres is seamless.
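
For what it's worth, here's a minimal sketch of what that seamlessness looks like (assuming psycopg2 and a made-up table); if any statement fails, the rollback leaves the schema untouched:

  import psycopg2

  conn = psycopg2.connect("dbname=app")
  try:
      with conn.cursor() as cur:
          # DDL and the backfill run in one transaction on Postgres.
          cur.execute("ALTER TABLE orders ADD COLUMN shipped_at timestamptz")
          cur.execute("UPDATE orders SET shipped_at = created_at WHERE status = 'shipped'")
      conn.commit()
  except Exception:
      conn.rollback()  # nothing half-applied
      raise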


IMO, if your data model is still so nebulous, and you're changing it so often, that SQL migrations are a serious impediment to progress, you probably don't need any real datastore yet. You can usually figure out WTF you're going to do, broadly speaking, before persisting anything to a remote database. And if you can, you very much should.

Yes, yes, there are sometimes exceptions, one must repeat explicitly despite having already said it (see: "probably") because this is HN.


Can someone explain to me any circumstances where having no well defined data model is better than coming up with a clear relational model? Honest question. It seems like it would just be a huge hassle trying to deal with your dissimilar data.

When I'm designing an app from scratch I often think about the SQL tables first and how I'm going to build them, and it really sharpens my idea of what my program will be.

I don't see how skipping that process would make things easier.


I've built IoT platforms where data from any device must be accepted. This is largely where my preference for NoSQL comes from. A device created tomorrow will not have a schema I can predict or control. NoSQL allows the easy integration of that device while a traditional database will at worst require a migration for each new device you want to support.


Please correct me if I'm wrong, but that sounds like a very narrow use case, and also something that could be solved by simply stuffing JSON into an RDBMS.
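
By "stuffing JSON into an RDBMS" I mean something like this sketch (Postgres jsonb via psycopg2; the table and field names are hypothetical): structured columns where you can predict them, a jsonb column for the per-device payload you can't.

  import psycopg2

  conn = psycopg2.connect("dbname=iot")
  with conn.cursor() as cur:
      cur.execute("""CREATE TABLE readings (
          device_id text,
          recorded  timestamptz,
          payload   jsonb
      )""")
      # Still queryable later, e.g. devices that have ever reported humidity:
      cur.execute("SELECT DISTINCT device_id FROM readings WHERE payload ->> 'humidity' IS NOT NULL")
  conn.commit()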

However, perhaps there are tools that NoSQL provides that are handy.


What's wrong is that you asked for any use case and then critiqued one because it's not broad enough for you.


Sorry, I'm not trying to be argumentative, I just argue in order to understand better.


Then argue honestly and work to steelman others' arguments. To address your point, I'd hardly consider IoT platforms to be a "narrow" use case. Smaller than the whole of computing, surely, but it's a growing field. The reality is that more and more devices will become available that generate all sorts of hard-to-predict data. Being able to handle those easily will be a large strength for platforms going forward. Dropping this hard-to-predict data as JSON into an RDBMS will certainly come back to haunt you in 5 years.


How will it come back to haunt? Again, just curious.


You'll have dumped it into a strict database, giving yourself a false sense of order and organization. But later, when you need to query that amorphous data, you might be able to use OPENJSON or something similar, whereas a NoSQL solution will have been built to handle exactly this type of query, with utilities like key-exists checks and better handling for keys that are missing or only sometimes present.
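
Roughly the kind of query I have in mind, sketched with pymongo (the collection and field names are made up):

  from pymongo import MongoClient

  readings = MongoClient()["iot"]["readings"]
  # Devices whose payload ever included a humidity key at all:
  has_humidity = readings.find({"payload.humidity": {"$exists": True}})
  # And a missing key degrades to a default instead of breaking the row:
  for doc in readings.find({}):
      humidity = doc.get("payload", {}).get("humidity")  # None if the device never sent it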


You can't really design your tables well enough without knowing your UI; it's a back-and-forth process.


And it's something you can do entirely on paper, too, before you go to code.

However I've never greenfielded an actually large application.


How does UI inform table design?


It will give you lots of insight when planning your data models/tables.


I've found that MongoDB is really the Visual Basic of databases: It lets you rapidly get something running where your persistent data looks very similar to your data structures.

But, more importantly, this quote really is right:

> The more I work with existing NoSQL deployments however, the more I believe that their schemaless nature has become an excuse for sloppiness and unwillingness to dwell on a project’s data model beforehand.

Far too many software engineers just don't understand databases and don't understand the long-term implications of their decisions. Good SQL is rather easy, but it does take a few more minutes than the sloppy approaches.

Or, to put it differently, some NoSQL databases are great at rapid prototyping. But they can't scale, and they hold little value for someone who will put in the extra hour or two up front to use SQL.


If you embrace both structure and a defined path to change that structure, you get the best of all worlds. Database migrations and schema changes should be the norm, not a feared last resort.

Otherwise you end up with equivalent logic littered throughout multiple application codebases sharing the same database.


I feel like this is where ActiveRecord shines: the initial development is thought of in terms of objects and their relations to each other, and it's very easy to alter data definitions. In the end you get a reasonably normalized database underneath everything. The reality is that most of your models aren't going to undergo wild schema changes, and if they are, or if you need traversable unstructured data, you can defer to JSON columns for those scenarios.


The lack of (or too-weak) safety & consistency guarantees will waste more time than it saved way before 5 years are up. More like within 3-6mo of going to production, if not before you hit production. Even more true if you're dynamic the whole way up (JS-to-JSON-to-[Ruby/Python/JS]-to-NoSQL and back again).

With the usual caveat that yes, some datasets & workloads probably do benefit from certain NoSQLs, but that's a pretty small minority.


I like the concept of encoding properties, like a boolean of non-empty, in an object.

Beyond that, this article further convinced me type systems are for the pedantic. A given function signature is impossible? Seems like just another strength of a dynamic language.


What does your dynamic language do when you take the first element of an empty list? There is no obvious "correct" thing to do. Furthermore, whatever you do return is unlikely to be the same sort of thing that is returned for a non-empty list.

A dynamic language will have a behavior that corresponds to some sort of type signature, and it's not possible to write behavior that corresponds with the type signatures given as examples of "impossible" in the article.

A type signature is merely a statement about behavior, so it's nonsensical to make a false statement about the behavior, and Haskell catches this.


It throws an error. Errors are a type of behavior. Some of those error behaviors you can cope with. You catch those. Others you can't. You raise those and either the system can cope or it can't and you crash. What part is nonsensical?


Throwing an error should be part of the function signature. Otherwise the statement "This function takes a list of objects of type A, and returns an object of type A" is false; sometimes it will return an object of type A, other times it will signal an error.
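
In Python terms (type hints standing in for a real type system; the helper name is mine), the difference is whether the possibility of "no result" shows up in the signature at all:

  from typing import Optional, Sequence, TypeVar

  T = TypeVar("T")

  def head(xs: Sequence[T]) -> Optional[T]:
      # The "no first element" case is part of the return type, not a surprise exception.
      return xs[0] if xs else None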


It's better to assume all code could throw an error. No statement is safe. This becomes particularly important in distributed computing as resources may be offline at any given moment. It's from this perspective that type systems afford a false sense of security and appear to be scratching the itches of overactive Type As.


I'm not particularly an advocate for typed languages (my daily driver for personal projects is Common Lisp, which is mostly untyped), but I'm having trouble even understanding the point of view that "No statement is safe". If I had a system where the actual type of the expression "1+2" were "3 | Error" I would find a new system.

Code that doesn't read from unsafe memory locations, allocate memory, or perform I/O cannot throw an error. Code that does can throw an error. Some systems treat out-of-memory as fatal, removing one of those 3.

In some domains you don't want to know if the computation is happening locally or remotely, but IMO most of the time you do, because pure computation cannot signal an error locally while remote computation can. As you mention, distributed computing often runs in this mode, but distributed computing is overkill for most problems; my phone is powerful enough to locally compute most daily tasks (even when they're done server-side for various reasons); my workstation is powerful enough to perform most daily tasks for hundreds of people simultaneously.


It's hyperbole. A lot like systems that require hard type declarations for each and every function. The advantage of type systems is performance. Any correctness guarantees that come along with it are nice sugar but hardly the point and often misleading.


Nonsense. A distributed scenario is precisely where an advanced type system becomes really valuable, because you can draw a distinction between local and remote calls while still treating both in a uniform way. Treating every single call as though it were remote is impractical and wasteful.


Nice in theory but if you want to see how bad it is in practice, look no further than Java.


Java did a shitty job of checked exceptions. C++ did worse.


What would be the type signature of, say, Python's `pickle.load()`?


In the style of TFA, you should wrap pickle.load() with a function that will unpickle the specific type you are expecting. So you should write an unpickleArrayOfInts() or whatever.
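
A quick sketch of that wrapper in Python itself (the function name is made up): validate the unpickled value, so callers either get the type they asked for or an explicit error.

  import pickle
  from typing import List

  def unpickle_list_of_ints(data: bytes) -> List[int]:
      obj = pickle.loads(data)
      if not (isinstance(obj, list) and all(isinstance(x, int) for x in obj)):
          raise TypeError("expected a pickled list of ints")
      return obj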

The actual type would be a rather large union type, which would be unwieldy to use (but in an untyped system like python you still would need to deal with all of those corner cases to have a program that is correct in the face of arbitrary input).

Really the biggest annoyance of type systems is that they make you deal with corner cases that you don't think are practically possible. If you are right, then they are wasting your time. If you are wrong, then they are saving you from having bugs.



I think that's different. `read` requires you to know what you're deserializing up-front, while `pickle` decodes the type dynamically from the data.

Dynamic languages really can have functions whose behavior cannot be expressed as some sort of type signature.


I'm pretty confident that you could write something that was equivalent to all the useful `pickle` calls. By that I mean you'll need to know which operations you'll want to do on your unpickled object:

  readAny :: forall c r. [TypeWithReadAnd c] -> String -> (forall a. c a => a -> r) -> Maybe r
  readAny types string handle = ...  -- elided here; the full definition is at the repl.it link below
I think it's fair to say "hey, pickle doesn't require me to list all my types explicitly", but on the other hand, it's not like pickle can conjure those types out of thin air--it considers only the types that are defined in your program.

Here's an example that uses Read as the serialization format and only deals with Int, Char and String; but hopefully you can imagine that I could replace the use of read with a per-type function that deserializes from a byte string or whatever.

https://repl.it/@mrgriffin/unpickle


IIRC the pickle format can define new classes, but I haven't looked at it in over a decade so I might be misremembering.


You might be right, it's been a long time for me too. But if it can define new classes, then I'd expect that the code for those classes' methods would also be in the pickled format, at which point there's no particular reason you couldn't deserialize it in Haskell too... with some caveats about either needing to know what types those functions should have, or needing to build a run-time representation of those types so that they can be checked when called, or (hopefully!) crashing if you're okay with just unsafeCoercing things around.


read :: String -> Maybe PlainPythonObject

read :: String -> Maybe Json

goes a long way.


But as soon as you're using "Maybe", you're giving up much of a type system's strengths, as the article outlined.


Only if there is actually something more you can say about the String that would let you avoid the Maybe. If this is raw data and you don't even know that the JSON is well formed (otherwise, why could a parse straight to a generic JSON object fail?), then there isn't any better way to type it.

You might want to return an error message on failure, or provide a mechanism for recovery, but those are somewhat specific to the use case and don't really address the key point of the article.
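
And the Maybe only has to live at the parse boundary anyway; a sketch in Python terms (the json module, with Optional standing in for Maybe):

  import json
  from typing import Any, Optional

  def parse_json(raw: str) -> Optional[Any]:
      try:
          return json.loads(raw)
      except ValueError:  # json.JSONDecodeError is a subclass
          return None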


It's really a matter of what you're trying to accomplish.

My point is that Python objects are a certain data structure, and if you bother to create a type for them, then you can use it in a function's type.


Why would you want a language with extra support for bugs?

That's like saying Python is for the pedantic because it doesn't have a --crash_randomly flag, or won't let me write 0/0 and rely on the runtime to pick an arbitrary value and keep going, even though I might want that.


A Data Analyst is a Data Scientist who can't prove Bayes Theorem.


I learned SQL in high school as my first "programming" class, though that's more of a quirk of how I moved through the curriculum. This was prior to learning C++ and then transitioning to Python some years later.


That's really fascinating... did the class focus on SQL language and syntax and working with data? Or did you get into database theory as well?


It was both language and database theory, all focused on Oracle. IIRC, it was titled "Database Administration", but I remember having to draw out schema diagrams and learn all the particular arrow types (one-to-one, one-to-many, etc.) and what the different block shapes meant in the diagram. All that, only to get into industry 5 years later and find no one goes beyond basic squares and maybe double-ended arrows in practice >.>


Perception is important. A lot of these decisions aren't made by people who sat down and wrote out the numbers to arrive at the rational choice. But if I think I often have a use for a pickup, because I like to think I'm that rugged, DIY type of person, I'll think it's rational to own one and assume the relative prices get massaged away.

/I drove a pick-up for a number of years for this reason until hybrid crossovers became viable.


I believe some research out of Duke had already done similar things with chimps, but bringing it to humans is major. Each of these studies becomes a proof of concept for what "should" be true of the brain, but theories fall flat all the time. Each lays a brick in the foundation for building a true BCI.

I'm most excited that this was done with only an 8-channel OpenBCI headset, 16-bit, for the "Receiver". The "Senders" had a much stronger/better headset, but seeing open hardware show up in this is awesomely inspiring!


Revenue Analytics | Software Engineer | Full-time | ONSITE Atlanta, GA

Revenue Analytics is a SaaS company that helps big companies make big revenue decisions in pricing, products and promotions. Our analytics solutions drive millions in revenue uplift and eliminate wasted time.

We're hiring software engineers of all levels to build out a catalog of analytics products. Our software stack is primarily Python and orchestrated containers in AWS.

Unfortunately, we DO NOT offer visa sponsorship.

Apply at https://www.revenueanalytics.com/careers or email me with a cover letter and resume at lblackwood[at]revenueanalytics[dot]com


An update: Revenue Analytics will support immigration sponsorship for qualified candidates.


The problem this is addressing is a code being impenetrable to a human. If your solution is adding a second (human-readable) code beneath the machine-readable code... you haven't addressed the problem. The user must still trust their reader to parse the code.

QR codes' reconstructability is a major strength that this lacks, but I'd bet there's a way to expand this to include ECC around it, much as QR codes can.

BUT... OCR is quickly advancing, so the need for a specialized code that a specialized machine can read will diminish over time anyway.


As I suggested at the end, you could still employ OCR and have a barcode checksum. The checksum would ensure a misread of the human readable text would fail. As long as the checksum was not error-correcting (so could not be engineered to augment a correct OCR read), it doesn't matter that it is incomprehensible to humans, because the OCR is authoritative.
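
A sketch of what I mean (Python; a truncated hash stands in for whatever the barcode would actually encode): the checksum can reject a misread of the human-readable text, but it can't be used to steer a read toward some other string.

  import hashlib

  def checksum(text: str) -> bytes:
      # Deliberately not error-correcting: a short digest can only confirm or reject a read.
      return hashlib.sha256(text.encode("utf-8")).digest()[:4]

  def accept(ocr_text: str, barcode_checksum: bytes) -> bool:
      return checksum(ocr_text) == barcode_checksum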

You could also implement the checksum as OCR-able text, although it wouldn't be as dense, and probably wouldn't help human readability.

I think ultimately it should not be trusted that a machine will read what a code appears to be. That should be enforced on the device: "Are you sure you want to visit malware.site?". It's also easy to manipulate computer vision; you can engineer patterns which will read as one thing to humans, but another to machines. In some ways it's better for these codes to not be human readable, such that trust is not misplaced, and the machine is used as the best source of truth.


I can imagine that it would not take long until someone comes up with patterns which are innocent in one orientation and malicious with a ±90 deg. rotation...


Not with QR codes; they read the same from any side.

