
The best kind of documentation is that which is checked by the compiler and guaranteed to be correct - the code itself. Code that is well written, in good style using sensible variable names, can be as descriptive as good comments. Using a good static type system allows you to encode properties into your code that are guaranteed to be valid.
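As a minimal sketch of that idea (hypothetical names, Python chosen only for illustration): distinct wrapper types let a static checker such as mypy reject code that swaps arguments, a property a comment could only describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meters:
    value: float


@dataclass(frozen=True)
class Seconds:
    value: float


def speed(distance: Meters, time: Seconds) -> float:
    """Meters per second; the types make swapped arguments a checker error."""
    return distance.value / time.value


# speed(Seconds(9.58), Meters(100.0)) would be flagged by a static type checker.
print(speed(Meters(100.0), Seconds(9.58)))
```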

I certainly think some comments have their uses - but these are generally at the level of how systems and modules work, and the concepts used therein, as discussed elsewhere in these comments. I agree with the article that only 5% (or less) of functions need individual comments attached.




I've been maintaining some code written with this philosophy, and I find it lacking.

Code is very good at answering "How" but often the reader needs to know "Why" or "Why not".

In fact, the maintainer of your code will rarely be reading it to figure out how it is working - almost always the next person looking at your code will want to know why it is not working, or will be attempting to change the behavior.

Comments can guide as to pitfalls that you've avoided in your implementation and can answer the all-too-frequent question, "What were they thinking?!"


Another problem: It seems like developers in the "clean code needs no comments" crowd are also the ones least likely to write clean code in the first place.


Clean code needs no internal comments, but that's very different from whether software contracts (which are inherently outside the code operation and therefore need comments) need comments.

I strongly dislike extraneous comments. But even some clear code needs some external communication alongside it.


Yes. At this point, I really only write "why" or "why not" comments.


I used to believe in "self-documenting code," but it takes far longer to read through code than to read a comment on it, and by reading a comment you're far more likely to come away with an intuitive and correct understanding, with far less effort. When reading the actual code, your understanding of how a piece of code works might be incorrect, which could domino into bigger errors down the line. It's also easy to overlook some critical line in a function that significantly affects its behavior.

Perhaps most importantly, most of the time you don't need to know how something works, only what it does. Unless you have a reason to doubt that the function does, in fact, do what it says it does, it's much nicer to have a few lines of comments -- written in human language -- to describe the inputs and outputs of some function, or the reason we're invoking some function here, than to have to actually go and read through the code of a function. In fact, reading through the actual code can sometimes impede understanding, because of the reasons I stated above. Now with some functions, this purpose can be entirely expressed in the function signature, but with more complicated functions, that's unlikely.
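The kind of input/output comment described above might look like this (a hypothetical helper, sketched for illustration): a caller can trust the contract in the docstring without reading the body.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Return value limited to the inclusive range [low, high].

    Inputs:  low must be <= high (raises ValueError otherwise).
    Output:  low if value < low, high if value > high, else value.
    """
    if low > high:
        raise ValueError("low must be <= high")
    return max(low, min(value, high))
```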


Exactly. This "code is the documentation" mentality is simply laziness or an excuse for poor engineering.

How about "the bridge is the documentation" for civil engineers? Or "the house is the documentation"? Who needs building plans?

Documentation also has to cover larger scale interactions, that is how objects interact with each other and how they fit into the design.

All that said, in a large software project you need to pick your battles. Maintaining the same level of documentation across the board and throughout the life of the software is very difficult. Make sure though that your core is well documented and you keep that documentation up to date. Libraries and APIs used externally also need to be well documented.


The equivalent of source code in civil engineering (and engineering of physical things in general) is the drawings/plans, not the actual bridge. In practice it turns out that drawings for physical objects are often even more poorly commented than software, in part because leaving comments on CAD models is so much less convenient than in code.


Not exactly. The drawings I've worked with had lots of annotations on them (e.g. dimensions and manufacturing instructions). I agree it's not a perfect analogy, but perhaps uncommented/undocumented source code lies somewhere between the physical thing and the drawings (or CAD file). My point, though, is that while in theory every bit of information can be observed from the physical object, it is a poor way of storing that information.


This is only valid in narrow situations, for example when the person reading the documentation is reading it in the code, and when the code is part of a monolithic (or other single-tech) application.

The code you look at to find the solution might be one 20 or 30 line chunk of Ruby that performs a service for a chunk of 10 year old VB or 20 year old Perl or 30 year old C, or some chain of several languages. A support guy, or apps-level documenter, or maintenance programmer adding a feature, or architect integrating with another system, or business integration consultant helping to decide where the business needs to invest, or some other decision maker really, really doesn't have time to read through 40,000 lines of code in several languages to find out how a feature works.

For example: one of my first jobs was with an established big brand with many years of legacy data and organic "enterprise" systems, integrating data produced by an AS400 green-screen application into VB (on a Windows box) by copying (via FTP on a SCO box) a fixed-width text file produced by a shell script on the AS400, and parsing it so we could put it into Oracle for processing by a C++ application with API hooks into a Nortel Meridian coms system. When an outbound call goes to the wrong number, where's the bug?

Even if I'm reading _my own_ code 6 or 36 months later, I'm much happier if I've logged the checkins correctly so that it narrows it down to which dozen or so of many thousands of commits touched a feature. Whenever I've had to track down someone else's bug, or tried to explain the technical justification for some business decision the system makes, or tried to write high-level progress documentation (think changelog for senior managers), the commit messages make the difference between it taking two weeks and taking two years (i.e. never happening).

It's easy to think, in the post-codial glow when you're fresh from the zone, that there's no way this code isn't absolutely obvious. I've been that guy. I've also been the guy that cursed that guy for making it hard to find the needle in the haystack. I've even been both guys separated by 18 months. Commit messages can make the difference between getting it done in 20 minutes, and looking at "code that is well written, in good style using sensible variable names" for a solution for two days.


What happens when someone makes the assumption that the comments are up to date, correct, and unambiguous? They get fired.


Who is assuming that? The comments let you write your code quickly, and then of course you test to make sure it works. Comments do not obviate the need for testing, they just mean you don't have to (if up to date) read tons of source code.


Who are you writing the comments for? If for yourself, as pseudocode, fine, but get rid of them when you are finished. If you are writing them for another developer down the road, that developer will either ignore them, because he really doesn't know for sure the comments can be trusted, or will blindly trust the comments and now and then get burned.


> The best kind of documentation is that which is checked by the compiler and guaranteed to be correct - the code itself.

Which leaves your documentation hard to read for a certain percentage of developers, varying depending on your project.

This strategy, while easy for developers experienced in the target language and familiar with the code, can have real negative consequences if the people involved in the project are of a different fluency level with the language or even CS in general, or new to the project.

Q: What I just coded is obviously quicksort, so why should I label it?

A1: Because not everyone is used to seeing quicksort implemented manually in C, assembly, python, etc.

A2: Because without knowing at a high level what you are trying to accomplish, it's much harder to ascertain whether that bug you just found is really a bug or an interesting feature which will come into play a page later.

A3: Because knowing immediately that this chunk of code does NOT pertain to some specialized code to select items after sorting saves time and cognitive load.


You can't go around writing comments for the lowest common denominator. Insanity ensues.


I don't think I'm advocating that in any way, just at a minimum sanely peppering the code with markers. To keep with the quicksort example, a simple single-line comment at the top would suffice:

/* standard quicksort on foo before we make our choice below */

That hardly seems like insanity to me.


Maybe not insanity, but I would find it a little noisy reading that code, like having to code at a desk near the receptionist.

I wouldn't mind if the comment said "using quicksort because n is expected to be large". But only if the choice of quicksort over some other algorithm was deemed significant.


My point is really to just define the block as containing quicksort when the function it's in does other things as well. Refactoring it into its own function with a useful name would be just as (or probably more) useful. It's really the lack of either that I see as insufficient.


Code is not self documenting. I worked with a bunch of Rubyists that thought this (I love ruby, btw) and I wanted to strangle every single one of them.

The parent comment nails this. These guys who thought "my code is clear, therefore self-documenting" were some of the worst system designers and most myopic thinkers in the company. What's worse, this attitude usually extends to "my code is simple and therefore doesn't need to be tested."


Ruby is dynamically typed, so it doesn't have the power of a type system to encode invariants. It's very hard writing self-documenting code in a dynamically typed language.


That kind of code doesn't explain all the different ways of accomplishing the piece's goal you tried and why they were lacking.

Some random developer comes at a later time, thinks this code is more complicated than required, refactors it and only then sees why you didn't do it that way. This happened to me many times (both in the "original dev" role and the "random future dev" one).

Comments can easily clarify why this particular piece of work is implemented in this manner and not the others you tried, saving a lot of time to the future devs.


I don't think you disagree with the article here. Most code should be straightforward and no comment of the sort you describe is necessary.


No. The best kind of documentation is the kind that explains the non-obvious decisions taken in the code. Or the kind that outlines the interactions between function x and states 1-3. Or the kind left behind by some poor bastard archaeologist who comes in after the fact to fix the appallingly opaque ball of hair I pooped out under terrible deadline pressure.

Documentation has multiple purposes, and multiple audiences, and a good static type checker can't do anything for most all of them.


I think you are neglecting the "API" case. If I want to call some code that you wrote that I think might solve my problem, I have to have access to your source, read all the source code, implement the machine state in my head, and figure out what you are actually doing? No thanks.

Note I don't necessarily mean an externally facing API, which I would assume you'd agree needs good documentation. Even if we are on the same team, I don't want to have to read your code just to figure out which of the frob_XXX functions to call. Maybe that is what you meant by your second paragraph?


A simple example comes from the article itself. Parameter 'strength', the strength of the frognication. What is the range of that? Is it 0 to max int? min int to max int? 1-100?

(A real-life example would be a function that outputs an image in JPEG or PNG and takes a quality parameter. I notice that within the same library often one type of image wants 1-100 where another type wants a different scale, like 0.0 - 1.0. The parameters have the same name.)

This is not something that can be checked by most type systems. Therefore it needs to be in the documentation.
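A hedged sketch of the point (function names invented, modeled on the JPEG/PNG example above): the valid range lives only in the docstring and a runtime check, because the type `int` or `float` alone can't express it.

```python
def check_jpeg_quality(quality: int) -> int:
    """Validate a JPEG quality setting: an integer in the inclusive range 1-100."""
    if not 1 <= quality <= 100:
        raise ValueError(f"JPEG quality must be 1-100, got {quality}")
    return quality


def check_png_quality(quality: float) -> float:
    """Validate a PNG quality setting: a float in the inclusive range 0.0-1.0."""
    if not 0.0 <= quality <= 1.0:
        raise ValueError(f"PNG quality must be 0.0-1.0, got {quality}")
    return quality
```

Same parameter name, two incompatible scales: exactly the trap that only documentation (or a runtime check) can flag.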


> A simple example comes from the article itself. Parameter 'strength', the strength of the frognication. What is the range of that? Is it 0 to max int? min int to max int? 1-100?

[...]

> This is not something that can be checked by most type systems. Therefore it needs to be in the documentation.

Or, alternatively, we need better type systems. (Of course, it can still be in "documentation", just documentation that can be automatically generated from code that is also given real effect by the compiler, and thus documentation that can't get out of sync with the implementation.)
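One way to sketch that direction (hypothetical names; in languages with range or refinement types, e.g. Ada's `range`, the compiler does this statically, while in Python the constructor can only enforce it at runtime): make the range part of a type, so every function receiving the value inherits the contract without repeating the documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quality:
    """A quality value guaranteed to be in the inclusive range 1-100."""

    value: int

    def __post_init__(self) -> None:
        if not 1 <= self.value <= 100:
            raise ValueError(f"quality must be 1-100, got {self.value}")


def encode(quality: Quality) -> str:
    # No range documentation needed here: the type carries the contract.
    return f"encoding at quality {quality.value}"


print(encode(Quality(85)))
```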


This depends entirely on your language and environment. Working on embedded systems, where memory is at a premium, the way an algorithm or data structure is coded up may be entirely non-intuitive, merely to get under the memory limits or to eke out a bit more performance. Good documentation + good code is far better than just good code.


Even good code rarely describes by function and variable names alone what intent and business purpose is being provided. Code is the 'how', documentation of some other form usually provides the 'why' (comments, specs, tests, external documentation, whatever).


Well-documented code is a combination of comments and good variable/method names. Comments should explain any assumptions made, exceptions, or sample input formats. Variable and method details, including the logic, should be taken care of by better naming and clean code.



