I usually find that the wider the scope of a variable or function, the longer and more descriptive the name I want to use. For names that are only used locally, there's more context available, and the name can be shorter to keep the code more concise and easily scannable.
For example, I might call a function in a geometry library gradient_of_line, but within a line module in that library, I'd probably just call it gradient.
In mathematical code where there is a common notation that happens to be very concise, I'm quite happy writing code that uses it, like
y = m * x + c,
as long as the terminology/notation is standardised or well understood. I don't think writing it out with longer names is an improvement in this sort of case.
By a similar argument, I have no problem with giving abstract loop control variables like counters a short name like i or n, since they are only used within a small area of the code. However, if the variable represents some more meaningful concept, then I would probably name it accordingly.
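To make the scope rule concrete, here is a minimal Python sketch. The module layout and the helper y_at are assumptions made up for illustration; only the names gradient, gradient_of_line and the formula y = m * x + c come from the comment above.

```python
# Hypothetical contents of a `line` module inside a geometry library.
# Inside this module the short name `gradient` is unambiguous.

def gradient(x1, y1, x2, y2):
    """Slope of the line through (x1, y1) and (x2, y2)."""
    return (y2 - y1) / (x2 - x1)


def y_at(x, m, c):
    """Evaluate y = m * x + c using the conventional notation."""
    return m * x + c


# Wider scope, longer name: an alias the rest of the library could export.
gradient_of_line = gradient

m = gradient(0, 1, 2, 5)
print(y_at(3, m, c=1))
```

The short names m and c stay readable here only because the slope-intercept notation is widely known; a less standard formula would probably deserve longer names.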
I first read this principle in some Java coding guidelines published by Sun, and I think it's great advice. Anything public probably deserves a longer name, while keeping local variables short promotes readability.
Of course, naming is one of the "Two Hard Things".
It's absolutely an art, not a science. But it shouldn't go to either extreme -- names that are too short (var xi_a) are confusing, and names that are too long create too much visual noise to be clear.
In my opinion, "make_person_an_outside_subscriber_if_all_accesses_revoked" just takes too long to read. Much better would be something like this:
# If all accesses have been revoked, then make the person an outside subscriber
def updateOutsideSubscriber
  ...
end
The function name is short and gets straight to the point of what it does. Someone coming across the code for the first time can instantly figure out roughly what the function is, and then decide if it's worth their time to read further, or if they should just skip over it.
Then, a comment explains in further detail what it is for, and why it is used. You only need to read it if it seems important to your task at hand.
Good variable naming strategy goes hand-in-hand with good commenting strategy.
The end goal should always be maximum clarity, but remember that too much information, or too much information presented upfront, can reduce clarity.
37signals says "clarity over brevity". I'd say "clarity over extremes".
And then 3 months later, when someone comes and reads code that uses your module, it sure would be nice if they didn't have to dig through the source of your module and read the comments to figure out that it really means "make person an outside subscriber if all accesses revoked".
It's a good point. (And BTW I edited my comment to be "updateOutsideSubscriber" when I realized that was what it was really doing.)
It seems to me that "make_person_an_outside_subscriber_if_all_accesses_revoked" just sounds like an internal module function. If something outside the module is calling it, it just smells like the whole thing is organized wrong.
But sure, if the function is being used in a "global" way like that, then the full, long version is probably better...
It doesn't bother me that make_person_an_outside_subscriber_if_all_accesses_revoked is long, it bothers me that it seems too specific to be a method. Part of the condition is in the method name (if_all..), that's something that really seems odd to me.
What if new requirements dictate that the same thing should happen given a different condition (if all access is revoked or if the user is set to be disabled)? Would the method name then become make_person_an_outside_subscriber_if_all_access_revoked_or_user_is_set_to_disabled?
The function as described means that the only time this method is to ever be used is if all access is revoked... if that's the case why even make it into a function? Or to put it another way:
//make_person_an_outside_subscriber_if_all_accesses_revoked
function revokeAccess(user) {
    user.clearSubscriptions();
    user.addSubscription(new OutsideSubscription());
    //other revoky stuff goes here...
}
//shift_records_upward_starting_at
function shiftRecords(startPosition, amount) {
    //up or down. to go down supply a negative amount
}
I'm not sure updateOutsideSubscriber is a good substitute for the original. It reads like you're updating an existing outside subscriber, ignoring the creation/conversion aspect.
The original function appears to have two responsibilities: it performs a test on the accesses, and then it also converts the person in some cases. The long name follows from that complexity. Unless there was a good reason not to, I would prefer to separate those responsibilities into their own functions with one job each, and then the naming problem takes care of itself:
if all_accesses_revoked(john):
    convert_to_outside_subscriber(john)
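For anyone who wants to run that split, here is a minimal sketch. The Person class and its fields are hypothetical stand-ins; the thread never shows the real data model, so treat everything except the two function names as an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    # Hypothetical model: a name, a list of active access grants,
    # and a flag marking outside subscribers.
    name: str
    accesses: list = field(default_factory=list)
    outside_subscriber: bool = False


def all_accesses_revoked(person):
    """The test: true when the person has no active accesses left."""
    return not person.accesses


def convert_to_outside_subscriber(person):
    """The action: mark the person as an outside subscriber."""
    person.outside_subscriber = True


john = Person("john")
if all_accesses_revoked(john):
    convert_to_outside_subscriber(john)
```

Each function now has one job and a name that fits it, and the condition lives at the call site rather than in a method name.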
I disagree. In my opinion, "make_person_an_outside_subscriber_if_all_accesses_revoked" is an ugly name for an ugly function. The label matches the content, and that is the primary goal of naming functions. Of course, it would be better if things were refactored into
var allAccessesRevoked = items.All(areRevoked);
if (allAccessesRevoked)
    person.makeOutsideSubscriber();
but sometimes, the programmer hasn't discovered a good refactoring.
The problem with this is that although your solution might be better while reading the file where `updateOutsideSubscriber` is defined, once you see that method call somewhere else you'll have to go to that other file to read the comment. With the longer method name you don't need to read the method definition to see what's going on.
I once wrote a program that stored complex rules in a database (what was required for different types of legal documents, in thousands of different jurisdictions across the country). The table names I used made the relations obvious, but they were long.
A few years after I left that job, I briefly ended up back at that company on a contract assignment - for a different project, under a new development manager. At one point, I had to pull some data from the database, and had no problem finding the information I needed, or how to connect it together.
The dev manager joked about the long table names (not knowing I was the source of them). But when I asked him if any of his current developers had trouble finding the information they needed, or understanding the relations, he said, "No, they just get tired of typing them."
That's just one anecdote, but I'm going to stick with long, descriptive names.
Agreed. Short names seem like they're optimizing for typing speed, rather than ease of readability. If it's true that code will be read many times more than it is written, then descriptive variable names are the answer.
DescriptiveVariableNameUser thinks DecriptiveVariablName is great. DescriptiveVariableName constantly reminds DescriptiveVariableNameUser the DescriptiveVariableNameMeaning of DescriptiveVariableName, so DescriptiveVariableNameUser never forgets descriptiveVariableNameMeaning.
The only way DescriptiveVariableName could me made for meaningful is if DescriptiveVariableNameUser declare s DescriltiveVariableName's type (DescriptiveVariable) every time DesriptiveVariableNameUser accesses (DescriptiveVariable) DescrptiveVariableName.
Of course, knowing when to use an extremely descriptive name is half of the battle. Having a good aesthetic sense as a programmer is crucial. As a general rule, if I can see the entire lifetime of a variable in my field of view, I'll go for short/terse variable names (loop control, temp values, etc). Or if the method name is descriptive, and the code in the method is fairly short, I won't need to repeat myself in the variable names. Good sense is key.
Also, the names used in the article show a very poor aesthetic sense. Names like that are screaming out as needing a refactor. Having a condition in your method name is a code smell. Your methods (read: api) should be abstracted as discrete actions, and the code that calls them should perform control flow.
Very good point: long variable names make it harder to spot when they've been mistyped. Even worse, some long variables become other long variables when mistyped in just the right characters.
On the other hand, in long variable names, the excess of characters provides redundancy, which makes it easier to error-correct. The parent post is riddled with errors that would halt a compiler, but many humans would not even notice them.
This is a strawman. When have you ever heard someone say "Let's choose short variable names! Screw readability!" ?
I don't buy that short variable names are necessarily less clear. I think the ideal is to find variable names that are terse and clear. And I'm strongly in the camp that believes that brevity aids clarity.
It's not so much "screw readability" as "long names are ugly and annoying to type, and if you can't figure out what 'i' and 'x' and 'dx' and 'j' mean in this context, then you lack intelligence and that's your problem, because it's obvious, duh. Real programmers are efficient, not verbose."
I've worked with quite a number of programmers who seem to have a very hard time putting themselves in the shoes of someone who will have to read/maintain their code later. Good variable names are all about communication, and there are certainly programmers out there who don't have communication as one of their strong suits.
> I've worked with quite a number of programmers who seem to have a very hard time putting themselves in the shoes of someone who will have to read/maintain their code later.
We all do that. How many times have you looked at something and gone "What asshole wrote this?", only to find out from git that it was you?
It sounds like what you're saying is that the strawmanning can go both directions. People who want longer variable names can be regarded as lacking intelligence and being inefficient. People who want shorter variable names can be regarded as not caring about readability or communication. The truth, as usual, is somewhere in the middle.
there are really smart people in the camp of long variable names, and there are really smart people in the camp of terse variable names. both sets are optimizing towards whatever subjective definition of "good code" they have, where, again, their definition is informed and intelligent even if it differs.
if someone really smart says they use terse variable names, you don't tell them they're wrong, you try to understand them. There exist styles of writing high quality code that don't need naming style like `someone_else_just_finished_writing?`. there are styles that use it. if you're only accustomed to the long way, you'd be well served to go join a successful team that uses the short way, instead of blindly saying that they're wrong. some of the clearest code i've ever seen was Haskell, for which i think most people find you don't need long names.
me, when i find myself needing long names, i try to refactor until i don't need them any more. it's not always possible given time constraints, backwards compat, trying not to change fragile code etc, but i mostly chalk that up to a personal or team failing.
When you use it right, brevity is clarity. I'm all for descriptive names, but the elaborate wording in those method names is like writing "utilize" when you could have said "use".
Utilize, upward, starting at, finished writing
vs.
Use, up, from, wrote
So:
def shift_records_upward_starting_at(position)
could be:
def shift_records_up_from(position)
and:
def someone_else_just_finished_writing?(document)
could be:
def other_user_just_wrote?(document)
I don't think the shorter versions are any less clear.
I really like your revisions. While I agree with the spirit of the 37s post, I've found that megalong method names like the ones they listed are hard to reason about simply because they take up more human memory, thus making it harder to hold other objects in memory at the same time in order to compare.
I agree with David's sentiment and I have been doing this for years... except that I have been consistently told by other developers that I am doing the wrong thing. One of the issues in our industry is that there are no right answers, everyone seems to be correct and the biggest, loudest person in the room wins the debate. Obviously, moderation is key for anything, but it'd be nice to have a good, clear set of rules to live by (that won't change in 10 years).
For me, clear and concise method names have always helped me understand code that I'm reading, as well as understand a stack trace.
Heh. Yeah I wanted to give a long enough timespan, but you are absolutely correct, even 6 months would be nice.
<rant class='mini'>
When I talk with friends in other industries (e.g.: civil engineering), they have a very specific way of doing things. These fundamental activities do not change very often because they're based on decades of experience (and, I assume, because bugs in their process could kill people). That doesn't mean things don't change: concrete mixtures and such are always improving and sometimes it sounds like the IT industry with all the new tech coming out, but that's just materials technology, not how specs and processes are written.
</rant>
The preference probably depends a lot on your background. If you're spending a lot of time stitching together APIs, you probably prefer longer names for clarity.
If you spend a lot of time writing math, the opposite is true: shorter variable and function names are better for clarity, to the point that arbitrary operator overloading is essential -- as anyone who's suffered through matrix.transpose().multiply(matrix2.inverse()).multiply(vector.multiply(scalar)) . . . would tell you!
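A small Python sketch of that contrast follows. The Matrix class is a toy written only for this illustration (2x2, no error handling) and is not any real library's API; it supports both the chained method style quoted above and an operator style via `@` and a `.T` property.

```python
class Matrix:
    """Toy 2x2 matrix, just enough to compare the two styles."""

    def __init__(self, a, b, c, d):
        self.rows = ((a, b), (c, d))

    def transpose(self):
        (a, b), (c, d) = self.rows
        return Matrix(a, c, b, d)

    def inverse(self):
        (a, b), (c, d) = self.rows
        det = a * d - b * c
        return Matrix(d / det, -b / det, -c / det, a / det)

    def multiply(self, other):
        (a, b), (c, d) = self.rows
        if isinstance(other, Matrix):
            (e, f), (g, h) = other.rows
            return Matrix(a * e + b * g, a * f + b * h,
                          c * e + d * g, c * f + d * h)
        x, y = other  # anything else is treated as a 2-vector
        return (a * x + b * y, c * x + d * y)

    # Operator style: `@` delegates to multiply, `.T` to transpose.
    __matmul__ = multiply
    T = property(transpose)


A = Matrix(1, 2, 3, 4)
B = Matrix(2, 0, 0, 2)
v, s = (1, 1), 2
sv = tuple(s * x for x in v)  # the scaled vector

verbose = A.transpose().multiply(B.inverse()).multiply(sv)
concise = A.T @ B.inverse() @ sv
print(verbose == concise)
```

Both lines compute the same product; the operator form simply reads closer to the notation a mathematician would write on paper.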
This is flame war territory so I'll try and avoid fanning any flames. One advantage to sticking to short lines (80 chars) is that you can place several source files side by side and read them.
You can also print them. My co-workers sometimes print code. Please don't laugh, some people do not have good vision (and one day, you will be that person too).
I always feel articles such as these should be prefixed with "within my humble experience and within my application area".
"Comments are generally only needed when you failed to be clear enough in naming. Treat them as a code smell."
...but only in simpler applications, such as self-describing CRUD type web applications?
I think I agree with the quote for some web apps; however, remember that some code has complexities that take weeks or months to appreciate. For example, a TCP stack is a complicated thing, born and refined through much research over several decades. Notes on various design choices and optimisations need to be documented, and long comments are sometimes the most convenient way to do so.
Agreed. I recently had to write a lot of digital signal processing including some things like unscented Kalman filters and other things you need a bit of stats background to digest. You don't just dump a load of linear algebra down and hope the poor maintainer can keep up! It's beneficial to explain what is going on. Just as a mathematics professor will talk while writing on the blackboard in a lecture, rather than just writing it in silence and arrogantly declaring it is complete and self documenting and walking out the door.
I've read far too many comments above functions that simply restate exactly what the code is doing or explain what the stupid abbreviated variables mean.
Comments are a tool that should be used sparingly. They definitely have their place and can be used helpfully. I've found that they're easy to screw up though and I try to avoid them.
I find that short variable/function/class names are often developers optimising for themselves right at that point in time, but two years down the line, when the developer has disappeared into the sunset, you're stuck with them.
I think since I started programming professionally I've stuck to expressive names (not necessarily long names), mostly for code self-documentation. It's nice, most editors/IDEs tend to have name completion, even if it's only document bound, so you only need to type it in full once.
With the risk of RSI, I don't need to type any more than I already do. In a perfect world tab completion would exist everywhere and you could make arbitrarily long variable names. But do names like "make_person_an_outside_subscriber_if_all_accesses_revoked" really make your code more readable? There's only so much info you can put in a variable name, and I'd rather be able to see more code at one time on the screen than read a massively long name every few lines.
Especially when using underscores, it's too easy to miss a period or a subtraction.
And when you have several lines using such long names, it becomes a wall of text and you have to read everything out loud to understand what's happening.
I subscribe to the Church of Less Than 40 Character Lines, for no other reason than it's easier to read. Doesn't matter if it's code, poetry or prose.
My eyes never need to drift to the right side of the page/screen. Everything is justified at the left. Keep indentation to a minimum. Alas, this is not a popular idea when people write code.
Books, newspapers and magazines often adopt multi-column formats. Why? Does anyone know?
Columns also allow room for comments. Columned class notes are a great example. You can even leave the whole right side of the page clear for adding comments later.
Unconventional perhaps. But very useful.
Anyway, names are arbitrary. They are inherently ambiguous. The truth is that computers work via numbers, not names. As such, naming will always be a subjective affair, to some extent.
Here's an ignorant technical question (I'm not a developer): Can variable and method names still be discerned from publicly distributed executable code by reverse-compiling it? It used to be that way years ago, but I have no idea if it's still true. If so, verbose names would make it that much easier for a competitor to reverse-engineer your distributed executable code. From that perspective, comments might be preferable to verbosity, because comments would get stripped out in compilation.
(Of course, this assumes that you distribute your executable instead of keeping it inaccessible to others, for example by running it on a Web server.)
Not generally, but sometimes, depending on the language. That said: it is very easy to reverse engineer most compiled code regardless of the language; it is only a constant factor harder than understanding the source code in its original language.
Generally, virtual machine or interpreted languages keep such symbol information available, while native compilers tend not to include it in the binary.
It is not that big of a deal though. On platforms like Java or .NET, you can run an obfuscator to strip out names and leave confusing looking strings.
But it only increases reverse engineering complication very slightly. If you're trying to protect a small secret (like a key validation or special algorithm), you're out of luck, even if you strip symbol names. If you're trying to prevent wholesale ripoff of your app, legal action is more effective.
Making your code "worse" is a terrible tradeoff for approximately zero gain.
In some programming languages/platforms (e.g. Java), the product that you distribute can indeed be reverse engineered in a way that most original names are kept. But there are tools to obfuscate the code before turning it into something executable.
In some others (e.g. C), the executable that's distributed pretty much talks the language of the underlying machine, so the original code structure and naming are not preserved.
If you have to resort to method names that long, it means you're doing something very specific (which means the method body will be very small); it's more flexible and clearer to just paste the method body (case in point: the examples given).
By inlining that logic at all call-sites, you're abandoning encapsulating the logic for "updating user records to mark them as outside subscribers if all accesses are revoked" in a Single Point of Truth.
This creates at least two problems:
1) You need to remember to update every inlined use of that logic at bug fixing time. Experience says you'll inevitably miss at least one, the first time around.
2) You've created a Multiple Points of "Truth" maintenance burden. When you hand the code off to somebody else for maintenance, and there's some bug around "updating user records to mark them as outside subscribers if all accesses are revoked", they have to figure out whether or not the fact it's being done different ways in different locations is intentional or accidental, and if accidental, which way is in fact correct. This is always an enormous pain in the ass.
I don't think that's true: if it's a small simple method, and you replace it with inlined code, you have a real mess when you later discover a subtle bug and need to replace its guts.
Doing one small and very specific thing is practically the definition of what a function should be. I don't completely agree with names given in the original post but I know that I cannot stand working with developers who cannot structure code into meaningful functions.
This is one of the reasons I like functional programming. Each function has one explicit task, so variable names can be short while also being very descriptive.