The Anti-Human Consequences of Static Typing (jeapostrophe.github.io)
43 points by nickmain on Aug 12, 2013 | 63 comments


> In the final analysis, we may appeal to the Gödel Incompleteness Theorems and conclude that in every logic there are statements which are true, but not provable, or provable, but not true. And this is the ultimate problem with type systems: in their quest to reject "bad" programs, they must reject "good" programs as well because they cannot prove their "goodness".

This really is a gross misuse of Gödel's theorem. Taken to its logical end, his argument is that any field with logical foundations should value human judgement over rigor and proof because of incompleteness. Should we just throw out all of mathematics as well because there are true but unprovable theorems?


We should not throw out mathematics, but we should recognize that most mathematical proofs are based not on logic but on human intuition in the first place. Human intuition and wisdom are valuable and lay the foundation for all that we do, and there's no reason to ignore them when we're programming and insist that all valid programs have to follow some particular set of rules.


> mathematical proofs are not based on logic

Mathematical proofs are explicitly logical statements; if they aren't based in logic, they aren't proofs by definition. Mathematical intuition is what leads us to ideas, but it's only within the framework of formal reasoning that we "know" anything within mathematics. Poincaré said it nicely:

"It is by logic we prove, it is by intuition that we invent."


If you really believe this, you should try to translate any contemporary (published) mathematical proof into a real logic, like Coq. It is hard and people earn PhDs doing this because most proofs are so intuition heavy in the first place.


I have spent several months doing proofs in Coq, as well as in Isabelle. I contest conflating Coq with "real logic"; it is perfectly possible to do rigorous proofs without the aid of a computer-aided proof assistant. If that weren't the case, there would be no reason to trust the authors of the proof assistant in the first place.


Certainly, but most math proofs aren't "rigorous proofs" by any stretch.


Static typing is pro-human; I would call it even more humane than dynamic typing.

Vastly more type errors (i.e., conditions which cause the program not to hold the property checked by the type system) are caused by programmer error than by deliberate design; a strong, static type system, checked at compile time, helps ensure that a whole class of relatively weak but trivially easy to write bugs do not find their way into the finished program. The programmer hours saved in not having to worry that some little-exercised code path will attempt to add an integer to a string, nor to write exhaustive tests covering every possible code path to make sure no integers get added to strings, adds up significantly over the program's lifecycle, from initial development through maintenance and support.
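For instance, here's a sketch in Haskell of the kind of trivially easy mistake the checker refuses before the program ever runs (GHC's exact wording will vary):

    -- Rejected at compile time; in a dynamic language this could
    -- hide in a little-exercised code path until runtime.
    oops :: Int
    oops = 1 + "two"   -- error: no Num instance for a string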

Be kind to your programmers. Use a statically-typed language.


I mostly agree, but having the ability to use a dynamic type occasionally can be very useful. I like how Objective-C handles it in practice.


The first example of a "good" program contains an if statement that should only ever branch in one direction.

I question whether that is a "good" program.


It's only "good" under the two dichotomies he has defined (what the static type checker accepts or rejects, and what will function or not): it functions, but may not compile.


But he's presenting it as a bad thing that the program doesn't compile. Static typing stopped the compile because the program does something foolish. That's a win.


Static typing stopped the compile because the program does something foolish.

But static typing would also stop the compile in situations where mixing types was not foolish. Static typing says you must give a definite type to every variable and every expression; but that's not always the best way to write your program. (To take a simple example from Python, there are plenty of cases where you want a function to return either a string, say, or None, depending on whether it is able to find a string meeting some set of criteria, and you can't use the empty string as your "not found" value because empty strings are semantically significant for your application.) So static typing unnecessarily restricts the kinds of code you can write. That seems to me to be the article's basic point.


Except most type systems can deal with that -- if you truly believe that mixing types isn't foolish in a particular case, just tell the compiler that you're doing it intentionally. The OP's example is trivially solved by using the Either type within the if statement to denote that the expression intentionally returns two different types. The compiler will then check for you that you actually deal with both cases. Better yet, one of the type constructors for Either is traditionally reserved as an error case, so it again correctly flags the code as always branching in the same direction barring an exceptional case.
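Roughly, in Haskell (a sketch; fahrenheitToKelvin and the literals stand in for the article's definitions):

    fahrenheitToKelvin :: Double -> Double
    fahrenheitToKelvin f = (f - 32) / 1.8 + 273.15

    -- The if may now intentionally yield two types, with the
    -- conventional error case tagged as Left.
    branch :: Double -> Either String Int
    branch someNumber
        | fahrenheitToKelvin (abs someNumber) < 0 = Left "2"
        | otherwise                               = Right 2

    -- The compiler then makes callers handle both cases:
    useIt :: Double -> Int
    useIt x = case branch x of
        Left s  -> 1 + length s
        Right n -> 1 + n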

Your Python example is a canonical use case for the Maybe/Option type. You either provide "Just" a value, or you provide "Nothing".
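In code (a sketch, with a made-up search criterion):

    import Data.List (find)

    -- Just a hit, or Nothing; no sentinel value needed, so the
    -- empty string stays free to be semantically significant.
    firstLong :: [String] -> Maybe String
    firstLong = find (\s -> length s > 3)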

The point is, a type system of sufficient power can express your program just fine, so long as you declare to the type checker that you're not just making a silly mistake. And let's be honest, we all make tons of those mistakes. Who wants them hiding for some unknown duration, only to crop up at the most unfortunate time? I'd rather know ASAP that my program has a problem.


a type system of sufficient power can express your program just fine so long as you declare to the type checker that you're not just making a silly mistake

Sure, if I know in advance what types I want to mix. But suppose I don't? Or suppose I start out with one type mix, and then discover I need to change it? With static typing, I have to do a custom type declaration for each mix, and keep it up to date.


Your intentions changed, so of course you need to communicate that. In real programs (like the Haskell compiler), the programmers simply make the change they'd like to make at one point. The compiler then spits out a list of issues the change caused and provides a possible fix for each one. They continue the dialogue with the compiler by making the suggested fix, or by making one of their own devising. The end result is a program that's free of entire classes of stupid bugs.

Also, if you're just changing type mixtures, you probably just need to make the type more polymorphic. You rarely need to be very specific, and good type inference systems will automatically infer the most general type possible.
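For example, with no annotation at all:

    -- GHC infers the most general type, swap :: (a, b) -> (b, a),
    -- so later changes to the "type mixture" need no edits here.
    swap (x, y) = (y, x)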


Your intentions changed, so of course you need to communicate that.

But in a dynamically typed language, I don't have to communicate anything; it all happens automatically.

The end result is a program that's free of entire classes of stupid bugs.

A good test suite achieves the same objective with a dynamically typed language. There are tradeoffs both ways; I'm not saying dynamic typing is always better. But I don't think static typing is always better either.

good type inference systems will automatically infer the most general type possible.

With dynamic typing, no inference is necessary at all.


> But in a dynamically typed language, I don't have to communicate anything; it all happens automatically.

Nothing happens automatically, the compiler/interpreter just doesn't check. Eventually you hit the bit of nonsense code your change caused and then the program dies.

> A good test suite achieves the same objective with a dynamically typed language.

With considerably more effort and with no guarantees that you've come even close to correct. A test suite on top of a powerful static type system at least guarantees that you've killed classes of dumb bugs, and you've thought through and tested more complicated issues.

> With dynamic typing, no inference is necessary at all.

So...? You also don't get any guarantees.


Nothing happens automatically

If I have a function in Python that used to only return strings, and I re-code it so it returns None in some cases, I don't have to re-declare anything. The code just runs and returns None automatically, without me having to change anything about the function except the actual working code that determines what gets returned. With static typing, I need to re-declare the function's return type; it won't automatically return None if the type system thinks it can only return strings.

With considerably more effort

I don't understand this; you're comparing a test suite on top of dynamic typing, to a "test suite on top of a powerful static type system", so you're expending the effort on the test suite either way.

and with no guarantees that you've come even close to correct.

Static typing doesn't guarantee program correctness either; it just guarantees type correctness, so to speak. You can have a program that gets all its type declarations right but still has faulty logic. So you're still going to need a test suite either way, as I said above.

Eventually you hit the bit of nonsense code your change caused and then the program dies.

Why do you assume the code change produced nonsense code? If it doesn't--if I got the change right--then the program runs fine.

I assume what you really meant is, if I made a mistake, a dynamically typed language won't catch it until runtime, whereas a statically typed language will catch it at compile time. That's true; that's one of the tradeoffs I mentioned between static and dynamic typing. With static typing, I catch a certain class of errors earlier, but I pay for that by having to explicitly declare things. With dynamic typing, I don't catch those errors until later, but I can program faster (at least for certain kinds of applications) because I don't get bogged down in having to keep all the type declarations straight. As I said in another comment upthread, I'm not saying this tradeoff always works out in favor of dynamic typing; but it doesn't always work out in favor of static typing either. I think it depends on the type of application you're writing.


> To take a simple example from Python, there are plenty of cases where you want a function to return either a string, say, or None, depending on whether it is able to find a string meeting some set of criteria, and you can't use the empty string as your "not found" value because empty strings are semantically significant for your application.

Static typing is not a problem here. Static typing with an inadequate type system is the problem here.


And that's the whole point of his post. Don't let strict adherence to static typing prevent interesting new techniques from being developed when we don't use static typing. We can then go back and update our type models to allow the new usage. The first comment on the article (at the bottom of the list of comments) clarifies his purpose.


Statically typed mainstream languages would not allow you to do that, but languages with algebraic data types, like Haskell, Scala and Rust, do. This is also how they get around not having null pointers. You just define a type that can be either a string, or the symbol "None". Or any other symbol that might be more fitting for the situation.
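A sketch of such a definition in Haskell:

    -- Either a string, or a more fitting domain-specific symbol:
    data Found = Match String | NoMatch
        deriving Show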


You just define a type that can be either a string, or the symbol "None"

And as long as I know in advance that that's the type mix I want, and as long as it never changes, that works. But one doesn't always know in advance, and the type spec you need often changes.

Also, the example I gave was easy because the mix was between built-in types. What about if I want to mix custom-defined type A from package P1, custom-defined type B from package P2, and custom-defined type C from package P3? And suppose I have a number of different such mixes needed for different places in my program?

If all of the types I'm mixing are duck-type compatible, Python lets me just mix them, no further action required. With static typing, I need to declare a new mixed type for each mix, and keep each declaration up to date. That looks like a programming and maintenance nightmare to me for a non-trivial program. But of course it depends on what kind of programming you are doing; there are probably plenty of applications where this kind of issue wouldn't arise.


> With static typing, I need to declare a new mixed type for each mix, and keep each declaration up to date. That looks like a programming and maintenance nightmare to me for a non-trivial program.

Sounds like you're thinking of Java/C# level type systems here. Type inference is a powerful feature that allows you to just not specify types (except where the program is ambiguous) in most cases.


Type inference is a powerful feature that allows you to just not specify types (except where the program is ambiguous) in most cases.

But type inference only works if there is some non-trivial base type shared by all the types I want to mix. Type inference can't figure out that None and a string are both valid return values for some function; I still have to declare a custom type that mixes them (like a Maybe in Haskell). Similarly for types declared in completely different packages that don't share any base types; they might be perfectly duck-type compatible, but how does the type inference system know that?


If String and None are both valid return types for a function, then you tag that so the inference system knows it's a valid solution -- the inferred type becomes Maybe String. The compiler will even help you get there.
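Concretely (a sketch):

    -- Tagging the outcomes with Just/Nothing is enough for GHC to
    -- infer findIt :: (a -> Bool) -> [a] -> Maybe a, with no
    -- annotation; at a string-searching call site that's Maybe String.
    findIt p (x:xs) | p x       = Just x
                    | otherwise = findIt p xs
    findIt _ []                 = Nothing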

> Similarly for types declared in completely different packages that don't share any base types;

If you know they share functionality, you can declare that they share functionality (any language with type classes). Once you've done that, the type inferencer will generalize the type of the function to the entire class. As a bonus, you automatically factor out shared code and make the program as a whole more modular.
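For instance (a made-up class; A and B stand in for types from unrelated packages):

    class HasName a where
        name :: a -> String

    newtype A = A String
    newtype B = B String

    instance HasName A where name (A s) = s
    instance HasName B where name (B s) = s

    -- Inference generalizes over the whole class:
    --   greet :: HasName a => a -> String
    greet x = "hello, " ++ name x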

The answer to "how does the type inference system know this?" is usually that you've made those bits explicit. This sounds tedious right up until the first time you write a complicated program, get it to compile, and it just runs correctly the very first time. It gets better when you learn to work with the compiler and start to make judicious use of code holes (undefined/_ in Haskell); they allow you to provide an expression of any type so you can test unfinished programs. The compiler can even tell you what type the unwritten code needs to have, and in most cases that actually tells you what the code needs to be.
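A sketch of a hole in action (the compiler's exact message will vary):

    total :: [Int] -> Int
    total = foldr step 0
      where step = _   -- GHC: found hole of type Int -> Int -> Int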

Lastly, in a language like Haskell, type declarations are almost always optional. They're used as a kind of documentation and an enforced contract for any other programmer that happens to read the code. Most types are inferred -- if there's some ambiguity, the compiler will let you know, and you can fix the code. For example, if your function tried to return String or Nothing, the compiler would complain that String doesn't match Maybe a, try Just <string value> instead. So you replace <string value> with Just <string value>, and now the function compiles. If the requirements later change, then you make the necessary changes the same way you would in a dynamic language, and then follow the chain of type errors to fix issues that have cropped up. So if suddenly you need String or Integer instead, you'd just replace Nothing with the <Integer value> you now need to use. The compiler will complain that Maybe String doesn't match Integer, and so you fix your code to Right <string value> and Left <integer value>. Now the compiler agrees that everything makes sense again, and you continue on.
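The change described above, sketched end to end:

    -- Before: a string, or nothing.
    f :: Int -> Maybe String
    f n = if n > 0 then Just "positive" else Nothing

    -- After the requirements change to "String or Integer",
    -- following the chain of type errors lands here:
    f' :: Int -> Either Integer String
    f' n = if n > 0 then Right "positive" else Left (toInteger n)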


If String and None are both valid return types for a function, then you tag that so the inference system knows it's a valid solution

In other words, I need to do extra work to tell the type system about the function's return type.

If you know they share functionality, you can declare that they share functionality (any language with type classes)

Same comment here; the language is making me do extra work that I don't have to do in a dynamically typed language. So there's a tradeoff.

This sounds tedious right up until the first time you write a complicated program, get it to compile, and it just runs correctly the very first time.

See, here's the thing: I often write non-trivial programs in Python that run correctly the very first time. So I've had this same experience in a dynamically typed language.

Perhaps a language like Haskell would raise the bar, so to speak, for how complex a program can get and still allow this to happen; but even then, it would only matter if you were writing the particular kind of program that can benefit from the difference.

Lastly, in a language like Haskell, type declarations are almost always optional.

It's the "almost always" that's key here; basically, what the rest of your discussion here shows is that the type declarations end up not being optional in precisely the cases where, in a dynamically typed language, you would just be making changes and running code, not spending extra work on adding type declarations to resolve ambiguities for the compiler. So there's a tradeoff.

As I've said in several comments in this thread, I'm not saying dynamic typing is always better; I'm saying it depends on the kind of application you're writing. There are tradeoffs involved, and they don't always end up favoring static typing.


But most statically typed languages let you represent that kind of code. C++ and Java have null. Haskell has Maybe. It doesn't prevent you from writing the code, but rather requires you to be explicit about what values a variable can take, which is the purpose of having a type system at all.


It doesn't prevent you from writing the code, but rather requires you to be explicit about what values a variable can take, which is the purpose of having a type system at all.

It's the purpose of having a static type system, yes. But the drawback is that you have to be explicit; you don't have a choice. There are situations where that is indeed a drawback; see my other posts in this thread.


There are only four options for removing a block of code you no longer want to use:

1. Add a comment around the whole block. This may not always work, as comments within the block may need to be removed.

2. Add string delimiters before and after the block, making the block an in-place string literal. This can be broken by strings within the block.

3. Use a goto statement to skip the block. This is a pretty good method but you have to worry about the label getting moved, and fellow developers laughing at you for using goto.

4. Wrap the block with a big if (false) { }. This is the most bulletproof method, but it violates your goodness criteria.


Delete it and refer to your version control system's history if you need to bring it back?

Edited addition to reduce the snarkiness a bit: if I see a block of dead code like that left in a file, regardless of which way it's been killed, I'm going to consider that bad unless there's a very clear comment explaining why this isn't needed anymore but was worth leaving in the code instead of fully deleting it. But if that comment's there, and the reason makes sense, it's still breaking the "rule" but becomes an example of knowing when to break the rule.


Honestly, the code's probably changed enough since it was commented out to not be worth considering anymore (short of maybe leaving the comment, reworded, to explain what used to happen and why).

Large amounts of commented-out code (I regularly see 500+ lines in a single file) are a huge smell, and it would take an outstanding justification to get me not to delete it.


5) delete it from your code. You do have source control, so you will be able to recover the code later. If you want to leave evidence, add a comment:

  -- Removed Foo handling in revision ... because of ...

That should be rare, though.

For temporary disabling, I tend to do:

   const bool disableFooForDebugging = true;
   if (!disableFooForDebugging) {
       // ... Foo handling, skipped while the flag is set ...
   }
That makes it easier to recognize and correct accidental checkins.


Sorry, I was attempting to be humorous. Obviously #5 is the right way to go. I can't remember ever pulling something useful out of a block of commented-out code.


Dumb. Mangles Gödel.

The premise is that for real-world inputs some code may never be used, so it's an advantage of late-checked types that you can have wrong code that will never cause any observed bug. I'd rather "waste" my time deleting nonsense code after it fails an early type check. Less code = better reading/maintenance/extension.

Better to focus on the real advantages of single-type languages (usually, terser programs, although type inference can make some Haskell/ML programs appear type-less).


I agree, we need far more human compilers. Perhaps, when a compiler runs across a type violation, it could just break out laughing at you. "HA HA HA! You idiot! You defined this variable as a string, and then you tried to do arithmetic on it, WHAT WERE YOU THINKING?"

Maybe the compiler can just complain about the user. "Seriously, this guy first gets me to make this integer, and that's cool, but then he starts to use it like it was a float, so you know, I gotta go and rearrange ALL the bits. Then you know what he does, he goes and takes another integer and tries to math them together. Screw that guy. But you want to know the kicker? After all of this, he casts it back to an int and then tries to append a string. Seriously. Make up your mind. I need a beer. What do you think I am, Javascript?"

Maybe the compiler just gets angry. "You know what? First you claim it's this structure, then you treat it like that structure. Whatever, I say it's crap. I'm deleting it. Write it again, and make sense this time."

I prefer my computers less human.


"I don't know, you asked me to divide that integer by this string. What am I supposed to do?"


a typedef name was a complete surprise to me at this point in your program


>The logic goes that if you reject "bad" programs then only "good" programs remain, and who would want to run "bad" programs anyways?

I thought the logic was: if you reject all bad programs and some good programs, only good programs remain, is it not?


I think the logic actually goes "if you reject upfront all programs with a certain type of badness as well as some programs that falsely appear to have that type of badness, you will reduce the costs associated with bad programs by catching many of them sooner, at the cost of having to rewrite some already-good programs in less-simple ways to avoid the appearance of badness."


Let's put that on a bumper sticker :)


This article appears to me to fail at being entirely cogent; his example of the if that returns a number or a string seems nonsensical, and therefore "bad" by some definitions, which also points out that "bad" is subjective.

That being said, it might have been a better article if the author had presented some examples of "bad" programs that pass the type system, as type systems are no silver bullet and do not guarantee rejection of all bad programs.


> A static type system is a mechanism whereby an algorithm determines if a program exhibits a property, P, and if the property is not found to hold, then the program is rejected.

Doesn't this definition encompass dynamically typed languages as well? e.g. P is syntactic validity?


Syntactic validity is not a property of a program, but a property of a piece of text that makes it a program. In any case, the point is that type systems purport to identify some amount of "goodness" in a program. I agree that "The compiler will produce something" is a property, although I doubt it is a property any static type proponent would defend as particularly important.


I don't think I've ever seen someone use such gleefully religious rhetoric to advocate dynamic types before.


When was this a non-religious issue? But I believe the problem is the following: either the type system is incomplete (there are programs out there that you might want to run but won't be able to), or it's unsound (some programs will pass the type checker but fail at run time), or the type system will be so complex as to be undecidable. Since people generally aren't happy with number one, you either have to rely on the programmer (number two) or force everything through a potentially loopy process every time you compile something (number three).

Or, one might go the Qi/Shen route...


Me too. This and all those Go fanboys make me really embarrassed for my profession.


I'm at a loss when it comes to considering something with dead code or a potentially unexpected return type to be good code, especially in something statically typed...


Funny coincidence: just today I wondered if there's such a thing as a statically typed Lisp, and stumbled upon Racket with types [1].

I think another good example of an optionally statically typed language would be Julia, where you can add types, and if you add them, the compiler will try to leverage that information to optimize your code [2].

I've used dynamic and static languages in a mix for most of my life (JavaScript, Python, PHP back in the early 2000s vs. C and Objective-C). Recently I started working with Scala, which has much stricter typing than, say, Objective-C, and I have to admit that after a bit of helplessness in the beginning, it feels strangely liberating that once the code compiles, the program is solid. Successful compilation brings with it a sort of satisfaction.

[1] http://docs.racket-lang.org/ts-guide/ [2] http://docs.julialang.org/en/release-0.1-0/manual/types/


> Other more quaint languages like Java may have type systems that check properties such as "The program does not add numbers and strings"

Poor example. In Java, number + string means "convert the number to a decimal string then concatenate", and it can be invoked unintentionally.


Does it? I thought string + number does that, but number + string does not compile.


That's not a counter-example. I'm not saying that Java doesn't allow syntax that might look like "number + string", just that it does not actually add them, like C does, where 3 + "4" means something very surprising (pointer arithmetic).


> In the final analysis, we may appeal to the Gödel Incompleteness Theorems and conclude that in every logic there are statements which are true, but not provable, or provable, but not true.

Is this wrong?

"true but not provable" is easy.

What is an example of a statement that is "provable but not true"?


"Provable but not true" statements only arise in logical systems which are inconsistent, so it's fairly normal to not be able to come up with one.

So you need to assume two contradictory things are true, somewhere deep down in your logical system. As a trivial example, assume that both the following are true:

1. I am a robot.

2. I am not a robot.

And our false statement which we are going to prove is:

Pizza is made of fear.

We know that the following proposition is true:

I am a robot OR pizza is made of fear

Because we know that I am not a robot, the first term in the above must be false, and so the second term must be true. Therefore, pizza is made of fear.
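Written out as a derivation, with P = "I am a robot" and Q = "pizza is made of fear":

    1. P          assumption
    2. ¬P         assumption
    3. P ∨ Q      from 1, by disjunction introduction
    4. Q          from 2 and 3, by disjunctive syllogism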

That's really more an example of the principle of explosion, but the above is a tiny logical system which contains statements that are provable, but not true.

In practice, we tend to use logical systems which can't prove all true statements, but don't let you prove false ones. Much more useful, really.


It's true. The theorems are a statement about the logics. Most of the time when you design a logic, you get to "choose" which case you'll get. Most people choose "true but not provable" because that's at least a little useful. It would be very bad to choose "provable but not true", because then you could never trust any conclusions of the proof system. - the author


Poorly worded, and a bit disingenuous to bring in Gödel's theorem in this context. Taken to its logical conclusion, by his argument we shouldn't bother analyzing the properties of programs at all.


Are there any languages that have a type mismatch warning rather than a type mismatch error? Something like how gcc warns about incompatible pointer types when compiling C programs but still lets you use them if you insist, except applied to all types?


I suggest reading about progressive types:

http://blog.brownplt.org/2012/09/01/progressive-types.html


Wow, thank you, that is exactly what I had in mind



A few implementations of Common Lisp (like SBCL) do this already (not surprisingly), and I find it very useful for compiling misbehaving programs (without getting segfaulted). Once you verify function behavior with a bunch of test cases, you declare types everywhere in the code and set the compiler optimization to full.


    (+ 1
       (if (negative? (fahrenheit->kelvin (abs some-number)))
           "2"
           2))

This is inconsistent code to begin with, though. You can't guarantee it "never" reaches the first branch, any more than you can guarantee there aren't bugs in the conversion function, or in the underlying interpreter, or that machine details won't make a big enough number wrap around and come out negative. All in all, an inconsistent code branch hanging around is a bug even if it compiles.

This is more an evidence that "if" is evil rather than an argument about typing, really.


I like the ending: "Please enable JavaScript to view the comments powered by Disqus." ;)

I know that the strengths of Racket lie elsewhere, but I wish the performance were better (at least comparable with SBCL) ;)

As for dynamic vs static typing - I don't care much about human rights, so I prefer static typing :D

Dynamic typing (and laziness) is good when you don't have to care about the memory footprint or performance of the code you write, e.g. when you're prototyping something.


This is science. It must be true.



