Code duplication is sometimes a good thing, but that doesn't prove clean code is a bad thing.
I think you overcommitted a bit on the refactor; if you had replaced those "10 repetitive lines of math" with a function, it would have been cleaner.
Let's be crystal clear about it, your teammate did a terrible job not creating a function for those 10 repetitive lines. I would reject this PR, but I would never rewrite it.
Rewriting a PR is a quick way to insult someone, because you're not open to debate, and many times you don't have the full picture of why the code was written this way. PRs have to be reviewed and rejected when they don't follow coding standards. The last thing I want is people committing code and saying "clean code is not important"; this mindset only leads to a codebase that no one wants to work in within a matter of months.
Communicating better and having some tact is the lesson you should be taking from your experience; clean code has nothing to do with it.
> Rewriting a PR is a quick way to insult someone ...
Although I agree with this, there is also another, more subtle thing going on. Rewriting code that works and that someone else is maintaining is a waste of the rewriter's time and is unprofessional. It also denies the original person an opportunity to learn, because they don't see how lower-quality code wastes time in practice (the time wasted on the refactor doesn't count, because Abramov did it on his own initiative).
The clean code approach is better, but the issue here isn't code. It is that he was wasting his time while not providing useful feedback to the unnamed coder. He was making terrible choices. He wasn't maximising the business or team outcomes. The best outcome is the original coder raising their standards and him going off to work on something that needs work. He should have angled for that. I.e., the correct option was to do a code review.
> Although I agree with this, there is also another, more subtle thing going on. Rewriting code that works and that someone else is maintaining is a waste of the rewriter's time and is unprofessional.
Rewriting existing, functioning code is not a waste of time in general. We don't know the full picture, so one cannot make such a generalized statement. I can write perfectly functioning code in the most terrible way, one that makes maintaining the code a herculean task, and very easily so: making things worse is almost always easy. Rewriting that code might make maintenance much easier, even if the code was previously working. It might make onboarding easier, since others can better understand what is going on. It is not unprofessional in general; to judge that, you need a far more complete picture. It can even be unprofessional to leave horrible code as is instead of fixing it, leaving future people to plaster over it with more leaky abstractions.
But doing it without talking to the other person is the problem.
This made it sound like the code was checked in, and this guy jumped in and rewrote it the very same week without talking to the other person. That is probably the biggest problem here.
So, there was a project happening, one person did a section, and someone else jumped in and redid it (double work) without talking about it (risking yet another rewrite in the future).
Not specifically for this blog post, but this is more common than it should be, and it's a symptom of organizational failure most of the time.
Either the code was merged without review by the people who would be directly impacted by it, or two teams were touching the same code path almost simultaneously without talking to each other, or there was some other lack of communication.
Sometimes the person rewriting the code is simply cutting through the bullshit because they know the original committer simply sucks at their job and you can't do anything about it (yes, I've seen that before).
Ultimately it could just be the guy being a jerk, but more often than not, it's not as simple as it seems.
I disagree slightly; I think there’s an important thing here which was a little bit glossed over in the article.
The original code established a pattern and left space for where and how to add code to handle edge cases, when they arose, whereas the clean code solution was implemented as though it was the complete and final solution.
A first draft is often incomplete, and in an oversimplified first draft, it’s easy to pull out patterns and remove “duplication” which is a result of the simplification, not an inherent feature of the problem being solved.
Absolutely. The lesson is "goodbye to over-applying a rule without considering context-specific trade-offs", which is a lesson that takes time to learn.
> The lesson is "goodbye to over-applying a rule without considering context-specific trade-offs",
I think the problem is the rule itself. A rule of thumb is fundamentally broken if it does more harm than good. The heuristics driving a decision need to lead the developer to a good outcome when applied without overthinking things or holding meetings, because that's what rules of thumb are for.
Time and again, developers tie codebases into contrived knots by mindlessly repeating the mantra "duplicate code is bad code", because they fail to realize that code that looks the same is not the same code at all, and sometimes should not be the same code at all, for a number of reasons. Developers need to think hard about whether a refactoring helps simplify a project, and shoving shared code into multiple unrelated code paths is surely a way to do the exact opposite. More so when a mastermind developer opts to add conditionals to force similar code blocks to fit other code paths.
> Let's be crystal clear about it, your teammate did a terrible job not creating a function for those 10 repetitive lines.
I've found over a long time (20+ years) that this is usually a sentiment held by people who are focusing on the wrong things in code bases (and quite often aren't actually solving real problems but spend their time solving non-problems with the additional side effect of creating more for the future). The first implementation of this should most definitely not abstract away anything like those 10 lines (which is minuscule). It's trivial to take something that does exactly (and only) the thing and modify it later and it's pointless to abstract away something as small as 10 lines for a gain you haven't yet proven or tested.
"Clean Code" is absolutely not important and most rules/"principles" of the same character as those that make up Clean Code are absolutely not important either, but are things that people with nothing better to do hold on to in order to validate the hornets' nests they accumulate in code bases over time. It leads to over-abstracted, hard-to-change code that runs badly and is much harder to understand, generally speaking.
The only thing you should use as a guiding principle, if anything, is to express your data transformations in a clear and easily changed way, since that's the only real thing a program actually does. If something doesn't have to do with improving the expression of data transformation it's very likely it's bullshit that's been stacked on top of what you're doing for no real reason.
Most of the SOLID principles have nothing to do with or in fact make it harder to see what data transformations take place and how, which is why they are largely useless for understanding what is actually going on in a program.
If you implement one single function, then it's easy to refactor later on.
If over 10 years, dozens of people add feature after feature without thinking carefully about the structure of the code, you get huge messes that are very hard to clean up. Then suddenly an old, unsupported dependency needs to be replaced (e.g. because of a vulnerability) and it takes weeks to remove it because there is zero separation of concerns anywhere in your codebase.
You say that overabstracted code is hard to change. I agree, but so is underabstracted code - or simply code with bad abstractions. I don't care particularly about SOLID, since I'm not too fond of OOP anymore, but encapsulation (which is a concept independent of paradigms) is definitely crucial when you want to make sure a code base is maintainable in the future.
> You say that overabstracted code is hard to change. I agree, but so is underabstracted code - or simply code with bad abstractions.
I am not against abstraction, but the attitude that the person's colleague should've *definitely* abstracted away this minuscule number of lines immediately, even in their first implementation, is not just abstraction: it's premature abstraction and the kind of dogmatic thinking that leads to the problems I outlined. The blog post scenario is not unique; the correct solution is almost always to leave the code less abstracted and only abstract it when there are actual proven use cases.
Exactly : "premature abstraction and the kind of dogmatic thinking".
This thread is filled with nit-picking this code, when the biggest problem was actually team communications and overall discussions about future direction. Many young programmers will spin their wheels on these types of things, but then sales/marketing might be like 'dude, this has a 1-2 year life span at most'. Or, this is being replaced by product-x next year.
If I had a nickel for every piece of "will be turned off in 6 months, tops" software that my workplace still maintains a decade later, I'd probably double my salary.
> If I had a nickel for every piece of "will be turned off in 6 months, tops" software that my workplace still maintains a decade later, I'd probably double my salary.
Definitely this. I've seen it too, and I'd wager the vast majority of IT professionals have as well.
At one particular company where I spent a decade, I was brought in specifically to help with the decommissioning of a particular set of legacy databases. One particular database was still on life support because a particular trading desk wouldn't move off of it, and since they made a shit-ton of money for the firm, they got a free pass on it, while IT was constantly dinged every year for the legacy database still being alive. In competing mandates between a profit center and a cost center, the cost center always loses. Which is fine, but then don't blame the cost center for not being able to make their goals because the profit center is dragging their heels. (Which was _not_ done at this firm.)
Why didn't the IT folks step in and provide the labor for the conversion? This helps them accomplish their goals. Trading desks love free labor. This shows a novel way in which cost centers can attribute their costs to lines of business. Finally, this demonstrates leadership and initiative by the IT department and its heads, especially if they have to fight an uphill battle about IP and succeed in convincing the desk.
There are no absolutes in software development, but if a formula is repeated 10 times, you probably have a good name for it, and at that point it probably should be in a function.
> There are no absolutes in software development, but if a formula is repeated 10 times, you probably have a good name for it, and at that point it probably should be in a function.
I don't think your take makes sense. The example in the blog post states that multiple methods have 10 lines of math, but even the author mentions they were similar, not the same. The use of the weasel word "similar" already tells you that it wasn't the math that was shaved off. In fact, the supposedly brilliant refactoring the blogger did was to change the whole interface without any good reason, with bad object-oriented inheritance chains and mixins to boot. What a mess.
Still, the blogger tries to claim this is clean?
There's a good reason why the blogger was pulled into a meeting and gently forced to revert that mess back to its old state, and why the blogger decided to depict a hero journey in a blog post where any sort of counterpoint is left out.
We don't really know what the actual implementation looks like, so I find it hard to make a definite judgement. I think the refactoring introduced by the author probably wasn't good, because it looks like a very brittle abstraction. But I find it hard to believe that among those 10 lines of maths each, there aren't any functions one can extract that form some natural abstraction (e.g. "translate vector" or whatever), so that every handle function ends up being maybe 1 or 2 lines instead of 10.
> But I find it hard to believe that among those 10 lines of maths each, there aren't any functions one can extract that form some natural abstraction (e.g. "translate vector" or whatever), so that every handle function ends up being maybe 1 or 2 lines instead of 10.
That's beside the point, isn't it? Just because you can in theory extract functions doesn't mean you should, or that your codebase will be in better shape if you do. There's the issue of coupling, the issue of introducing dependencies, and the issue of whether the apparent duplication holds the same semantics and behavior across the project. I'm talking about things like: does it make sense to change the behavior on all 10 code paths if you need to change it in only two or three? A surface-level similarity between code blocks doesn't make them the same behavior, does it?
If you're translating vectors 10 times, or calculating cosines, or whatever, then yes, you should abstract this into separate functions. It's not like "translateVector" or "calculateCosine" are brittle abstractions or anything.
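To make the disagreement concrete, here is a minimal Rust sketch of the kind of extraction being discussed. The translation math and all names are invented for illustration, since the blog post's actual code isn't shown in this thread.

```rust
// Hypothetical example: the same 2-D translation math, previously
// repeated inline across several handlers, pulled into one named helper.

/// Translate a point by (dx, dy): the kind of small, well-named function
/// a repeated formula usually suggests.
fn translate(point: (f64, f64), dx: f64, dy: f64) -> (f64, f64) {
    (point.0 + dx, point.1 + dy)
}

fn main() {
    // Each handler that previously inlined `x + dx, y + dy` now calls
    // the helper, so a bug fix lands in exactly one place.
    let moved = translate((1.0, 2.0), 3.0, -1.0);
    assert_eq!(moved, (4.0, 1.0));
}
```

Whether this helps depends on the counterpoint raised above: the helper only pays off if all call sites really do share one meaning, not just one shape.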
The code should be flexible. The first iteration wasn't. Simple; this is the code juniors in high school write. It always leads to a mess later on. Now, if you are a beginner, like you seem to be, then it's good code. Sure.
I just want to point out that this following line could come across as condescending, even if not intended that way.
> Now, if you are a beginner, like you seem to be, then it's good code.
You don't know anything about the person you are replying to except what they posted, and assuming someone with a different view (which I happen to agree with from my experiences) automatically has less experience (instead of acknowledging that others may have good reasons for their views which one hasn't considered or run into) doesn't reflect well.
The first iteration is plenty flexible where it needs to be. I've been programming since 2001 and honestly, the reason I'm holding this view and you're not is very likely the exact opposite situation of what you think: I think it's hard to find experienced and knowledgeable programmers who think the best time to abstract something is the first iteration of something.
So much this. It's hard not to lose the forest for the trees (we are craftspeople, after all), but at the end of the day the overall structure of a program is so much more important than whether there's a bit of duplication here or there. Better to let the structure emerge and then reduce duplication than to try to guess the right structure up front. And yeah, I'm still not convinced SOLID is real (but I'm also not convinced classes are useful much of the time, for that matter).
Letting the structure emerge requires people thinking in depth about the underlying principles of what the code does or should do.
As for classes: they are merely a construct that many languages have come up with for organizing code, and in my opinion a very debatable one. Some newer languages don't even deal in classes at all (Rust for example) and with good reason. If one says we need classes for having objects—No we don't. And objects are a concept to manage state over the lifetime of what the object represents, so that might be a worthy concept, but a class? I mostly find classes being used as a kind of module, not actually doing anything but grouping functionality that could simply be expressed by writing ... functions ... in a module, a construct made for grouping functionality.
I think what many people actually want is modularity, which, in contrast to classes, is a concept that truly seems to be well accepted; almost every language tries to offer it in some way or another. It is just that many people don't realize that this is what they are chasing when they write class after class in some mainstream language that possibly does not even provide a good module system.
> Some newer languages don't even deal in classes at all (Rust for example)
When you first write a struct, and then below it write an impl block for that struct containing functions that can only be applied to that struct, it really looks like a class to me; it just has a syntax that differs from what we're used to in other languages. Why wouldn't you call that a class?
The idea is that it decouples state from behavior, while a class tries to group them together. Other people can implement traits later for your struct. (I think Clojure has something similar.) They do not need to subclass your struct, and in fact cannot. This design encourages, and to a degree forces, composition over inheritance.
I would not name it a class, because I don't want people to fall back into thinking: "Ah I know! Inheritance!".
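A minimal Rust sketch of the separation being described, with invented names: state lives in the struct, inherent methods in an impl block, and extra behavior arrives later via a trait implementation rather than through subclassing.

```rust
// State only: no behavior is attached to the definition itself.
struct Point {
    x: i32,
    y: i32,
}

impl Point {
    // Inherent method: looks class-like, but is just a function
    // namespaced under the type.
    fn magnitude_squared(&self) -> i32 {
        self.x * self.x + self.y * self.y
    }
}

trait Describe {
    fn describe(&self) -> String;
}

// Behavior attached after the fact, without touching `Point` itself.
// Other code could do this for its own traits too.
impl Describe for Point {
    fn describe(&self) -> String {
        format!("({}, {})", self.x, self.y)
    }
}

fn main() {
    let p = Point { x: 3, y: 4 };
    assert_eq!(p.magnitude_squared(), 25);
    assert_eq!(p.describe(), "(3, 4)");
}
```

Whether one calls this a "class" is exactly the naming dispute in this subthread; mechanically, there is no inheritance anywhere in the picture.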
Well, having classes doesn't mean that you will necessarily use inheritance. There are programmers (ab)using it a lot, but for me, as for many others, classes are primarily a way to organize code: they provide a convenient way of representing something and of specifying functions that only make sense in the context of that something, while ensuring that you don't accidentally reuse those functions for something else or lose track of which functions are meant for which component. They also make collaboration easier, as you can agree on classes' interfaces and then each person goes on to implement their own classes without having to worry about the implementation details of other classes. It is true that you usually also get inheritance with classes, but I'm unsure whether having it is a requirement for calling something a class. IIRC, from a theory perspective, classes are just an abstraction of some concepts, and the fact that a class's instance is called an object reflects this.
I think the person you're replying to tried to address exactly that point: classes are primarily a way to organise code, when other, possibly equally good or better options exist, like modules. An F# module might, for example, look like this:
module Hello =
    let say() = "hi" // returns the string "hi"
There are mechanisms to encapsulate implementation details (private functions), to have multiple modules for different "domains" and specifying public contracts.
A class seems to imply more than that: each class specifies how to create an object with a constructor (where an object is something with the class's methods except modifying some state only owned by and accessible by the object itself).
But in Rust you've got constructors as well. The only thing that really seems to be missing is inheritance. But I understand that one might say that classes without the possibility of using inheritance don't look fully right.
> a constructor is a special type of function called to create an object.
Now, I don't think Wikipedia is always the authority on everything, but this is how I view them as well. A constructor is kind of like a callback that the language runtime invokes when something is created. For example, in C++:
class Foo {
public:
    Foo() {
        // your code goes here
    }
};

int main() {
    Foo a; // constructor invoked here
}
(please excuse formatting, there is no unified C++ style so I just picked one)
or in Ruby:
class Foo
  def initialize
    # your code goes here
  end
end

Foo.new # constructor invoked here
These "constructors" in Rust don't work like that at all. They are not special member functions that are invoked when something is created; they are the syntax to create something.
I think this is where things get muddy. On one hand, I think that there is still a function call going on under the hood, even if the syntax hides that. On the other, if you need some preprocessing before assigning the values to your struct's fields, you can still define a special member function that does whatever you need to figure out what the correct values are. In my view, what Rust is actually missing is the default constructor, because it will not initialize variables to some default value for you.
If you define a function that does some preprocessing before creating an instance in Rust, it is not a special member function that the language understands. It’s just a regular old function that happens to return an instance.
I do agree that it can be useful when discussing code to refer to these style functions with a more specific term, like a constructor, but that’s a colloquialism, and when discussing language features, being precise is important.
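A small sketch of that point, using a hypothetical `Foo`: in Rust, the struct literal is the construction syntax, and `Foo::new` is only a naming convention, an ordinary associated function to which the language attaches no special meaning.

```rust
struct Foo {
    value: i32,
}

impl Foo {
    // Nothing special to the language here: just a function that
    // happens to be named `new` and to return an instance.
    fn new(value: i32) -> Foo {
        Foo { value }
    }
}

fn main() {
    // Both lines construct a Foo; neither triggers any hidden callback.
    let a = Foo { value: 1 }; // struct literal: the built-in construction syntax
    let b = Foo::new(2);      // ordinary function call, by convention only
    assert_eq!(a.value + b.value, 3);
}
```

This is the contrast with the C++ and Ruby snippets above, where the runtime itself invokes the constructor whenever an object comes into existence.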
Classes are about data abstraction and encapsulation, which have nothing to do with implementation inheritance. They're about providing an interface that preserves any required invariants and does not depend directly on how the data is represented. A "structure" that either preserves useful invariants or is intended to admit of multiple possible representations that nonetheless expose the same outward behavior is effectively a class, whether you call it one or not.
I think the discussion of what to call that is a bit pointless. For some people through their Java/C++/whatever-tinted glasses, these things will be classes. To others they might be called something else. You personally call them "classes". I personally do not. Rust the language does not. Lots of people behind the development of the language thought that structs and traits are better names.
I appreciate the programming language design behind them and hope that Rust will not devolve into an ecosystem where everyone thinks they must put everything into classes or something similar, needlessly maintaining state therein, requiring users to mutate that state through accessors and whatnot, when simply a couple of functions (and I mean strictly functions, not procedures) would have done the job.
I never stated that I personally think classes necessarily mean inheritance. But guess who thinks so: lots and lots of developers working with mainstream languages, where inheritance is frequently used in combination with what those languages call a "class". That is why I am saying that I don't want those people to fall back into their previous thinking, and why I would not want to call them classes. It gives many, many people the wrong ideas.
What other people call it is their own choice. I am merely stating my own preference here, and probably the preference of the language designers, whom I mostly deem quite a lot more competent at drawing the distinction than myself.
> requiring users to mutate that state through accessors
There are plenty of cases where this makes sense, such as when working with sub-word data (bitfields), which is common in the embedded domain and often found as part of efficient code more generally. In fact, it may be more rare to have actual structs where one definitely wants to provide actual access (e.g. via pointers/refs) to the underlying data, and thus cannot just rely on getters/setters.
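As an illustration of the bitfield case, here is a Rust sketch with an invented layout (not any real hardware register): two sub-word fields packed into one byte, where accessors are the only sensible interface because no reference to an individual bit range can exist.

```rust
// Hypothetical packed byte: bits 0..=2 hold a 3-bit "mode",
// bit 3 holds a boolean flag. You cannot hand out a &u8 to "bits 0..=2",
// so getters and setters are the natural interface here.
struct Packed(u8);

impl Packed {
    fn mode(&self) -> u8 {
        self.0 & 0b0000_0111 // low 3 bits
    }
    fn set_mode(&mut self, m: u8) {
        self.0 = (self.0 & !0b0000_0111) | (m & 0b0000_0111);
    }
    fn flag(&self) -> bool {
        self.0 & 0b0000_1000 != 0 // bit 3
    }
    fn set_flag(&mut self, on: bool) {
        if on {
            self.0 |= 0b0000_1000;
        } else {
            self.0 &= !0b0000_1000;
        }
    }
}

fn main() {
    let mut p = Packed(0);
    p.set_mode(5);
    p.set_flag(true);
    assert_eq!(p.mode(), 5);
    assert!(p.flag());
    assert_eq!(p.0, 0b0000_1101);
}
```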
> Letting the structure emerge requires people thinking in depth about the underlying principles of what the code does or should do.
Right, yes, but those principles are often still very much in flux in the early days of a feature. Once a feature is more mature, it’s easier to confidently say what the code should do, and so that becomes a good time to refactor. Early on in the development lifecycle I think it’s rarely a good idea to worry about code duplication, underabstraction, etc.
And yes I agree with you that classes are an organizational concept with parallels in functional languages. Modularity is very important, but as you say there’s no reason that modularity implies classes. Sometimes I find classes to be ergonomic, and when they are using them makes sense, but plenty of other times a struct will do, as long as there’s some type of module system to keep different things different.
> Some newer languages don't even deal in classes at all (Rust for example) and with good reason. If one says we need classes for having objects—No we don't.
That's specious reasoning at best.
A class is basically a way to specify types that track specific state and behavior. This then materializes in related features like interfaces and information hiding.
Don't tell me Rust does not support those.
Also, C++ precedes Rust and it supports free functions and objects without member functions from day one. Rust is hardly relevant or unique in this regard. What Rust supports or not is not progress, nor is the definition of progress whatever Rust supports.
> And objects are a concept to manage state over the lifetime of what the object represents, so that might be a worthy concept, but a class?
You're showing some ignorance here. Classes and objects are entirely different concepts. An object is an instance of a class. A class is a blueprint of an object. This is programming 101.
I'm not on the train of applying "Clean" Code always and everywhere. However:
> I've found over a long time (20+ years) that this is usually a sentiment held by people who are focusing on the wrong things in code bases (and quite often aren't actually solving real problems but spend their time solving non-problems with the additional side effect of creating more for the future). The first implementation of this should most definitely not abstract away anything like those 10 lines (which is minuscule). It's trivial to take something that does exactly (and only) the thing and modify it later and it's pointless to abstract away something as small as 10 lines for a gain you haven't yet proven or tested.
I have to disagree here a little.
How are you going to "prove" lower maintenance cost? How are you going to prove that fewer bugs happened? How are you going to prove that ahead of making the change? How would you prove it even after making the change? You cannot go back in time and live the same period again with a different approach. So the kind of proof you are demanding is never going to be possible, and as such you are never going to have the refactoring. This argument could be had about any kind of refactoring, not only the one described in the blog post.
If one plasters the same code in many places, I have to doubt their basic understanding of computer programming. Have fun bugfixing N places instead of 1 once a bug is discovered. Hopefully you won't forget any of the places.
This is not to say that no duplication should ever exist. Let's not be extremists.
> "Clean Code" is absolutely not important and most rules/"principles" of the same character as those that make up Clean Code are absolutely not important either, but are things that people with nothing better to do hold on to in order to validate the hornets' nests they accumulate in code bases over time. It leads to over-abstracted, hard-to-change code that runs badly and is much harder to understand, generally speaking.
That is a very broad over-generalization. Maybe some or even many people do as you say, but not everyone. There are indeed good ideas in Clean Code. We just need to not be dogmatic about them.
> How are you going to "prove" lower maintenance cost? How are you going to prove that fewer bugs happened? How are you going to prove that ahead of making the change? How would you prove it even after making the change? You cannot go back in time and live the same period again with a different approach. So the kind of proof you are demanding is never going to be possible, and as such you are never going to have the refactoring. This argument could be had about any kind of refactoring, not only the one described in the blog post.
The existence of a situation where the code has to be changed in several places will prove the connection between all of these pieces of code. This is best proven in actual practice without guesswork, which any premature abstraction is.
> If one plasters the same code in many places, I have to doubt their basic understanding of computer programming. Have fun bugfixing N places instead of 1 once a bug is discovered. Hopefully you won't forget any of the places.
The first time this happens the programmer will undoubtedly be much more well informed about the connections between the pieces of code and will have an actual case of them being related, which is why it's much better to delay any abstraction until that point. Future-proofing and playing oracle about the future needs of code is generally a dead end but the programmers least capable (i.e. junior programmers and/or Clean Code enthusiasts) of playing that game are also the most likely to do so.
> The first time this happens the programmer will undoubtedly be much more well informed about the connections between the pieces of code and will have an actual case of them being related, which is why it's much better to delay any abstraction until that point.
In a perfect world, maybe. And even then it would be better not to accumulate issues until they start hurting. But in the real world there are often people who never had these issues on their calendar and will not be willing to allow for sudden fix-it time.
> Future-proofing and playing oracle about the future needs of code is generally a dead end but the programmers least capable (i.e. junior programmers and/or Clean Code enthusiasts) of playing that game are also the most likely to do so.
I don't need to be an oracle to tell that the same code in 10 places will incur the need to change 10 places instead of 1. I don't need to be an oracle to write code that, from the beginning, can accommodate some change in business needs.
I am fed up with people thinking that developers are like little babies who never learn anything from their past experiences. All in good measure, of course. If I tell someone that I see a problem for the maintainability of a project's code, because it cannot easily be extended in ways that are likely to be needed due to business needs, I expect them to take me seriously and not belittle my experience. Engineering is my discipline, and often not theirs.
I have built systems that anticipated changing requirements, and they did change just as I thought they would. How did I do it? By making parts of the code work in "additive" ways. I cannot state exact details, but there was a project about checking the values of variables and returning a result indicating whether the values were correct. The values were of various types, so they required various kinds of checks, for example string edit distance. In such a situation it is already foreseeable that at some point you might want to add checks that evaluate the values in different ways: for example numeric equivalence, and later on equivalence with some epsilon, some allowed deviation from the correct value. So I implemented it in a way that lets me register checks. That way one can add a check later and register it without changing existing code. These things can be anticipated, and a good design will allow for change later easily. No need to be an oracle, just some experience.
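The registration design described above might be sketched roughly like this in Rust. The names, the registry shape, and the epsilon check are all invented for illustration; they are not the actual project code.

```rust
// Each check is a boxed closure stored in a registry, so adding a new
// kind of check is purely additive: no existing code changes.
struct CheckRegistry {
    checks: Vec<(String, Box<dyn Fn(&str, &str) -> bool>)>,
}

impl CheckRegistry {
    fn new() -> Self {
        CheckRegistry { checks: Vec::new() }
    }

    // Registering a new check never touches the checks already present.
    fn register(&mut self, name: &str, check: Box<dyn Fn(&str, &str) -> bool>) {
        self.checks.push((name.to_string(), check));
    }

    // Run the named check on (expected, actual); None if no such check.
    fn run(&self, name: &str, expected: &str, actual: &str) -> Option<bool> {
        self.checks
            .iter()
            .find(|(n, _)| n.as_str() == name)
            .map(|(_, check)| check(expected, actual))
    }
}

fn main() {
    let mut registry = CheckRegistry::new();
    registry.register("exact", Box::new(|e, a| e == a));
    // Later, a numeric check with an epsilon is registered without
    // modifying the exact-match check at all.
    registry.register(
        "approx",
        Box::new(|e, a| {
            let (e, a): (f64, f64) = (e.parse().unwrap(), a.parse().unwrap());
            (e - a).abs() < 0.01
        }),
    );

    assert_eq!(registry.run("exact", "abc", "abc"), Some(true));
    assert_eq!(registry.run("approx", "1.0", "1.004"), Some(true));
    assert_eq!(registry.run("missing", "x", "y"), None);
}
```

The point of the design is the open extension seam: the foreseeable axis of change (new kinds of checks) was made additive, while everything else stayed concrete.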
I'm not sure you and the OP are even in significant disagreement. When exactly did the OP say developers are like babies who don't learn from experience? From my perspective, the OP seems to be critiquing the culture of indoctrinating inexperienced developers into the cult of clean code. It is the clean code camp that is more likely to dismiss experienced engineers as unenlightened dinosaurs.
The advantage of using an abstraction, even if the abstraction is wrong, is that you can use tooling to look up all the uses of the abstraction and replace them with a new one when needed, without having to memorize the codebase.
Hodgepodge copy-pasted template code with minor context-sensitive structural differences is relatively difficult to find all instances of in a codebase without tracing execution paths, unless you were the original author and have the uses memorized. In contrast, most IDEs have a find-all-references feature, and even grep can find all the common names of something in a codebase.
The first time this happens the programmer will spend hours looking for the first few places that need to be changed. Then he will spend days debugging, and hopefully will find the last few places.
Then, even though he is now informed about the connections, he will NOT refactor this code because he (a) has no time to refactor as he already spent a week on what seemed like a one-line fix, and (b) is now terrified to touch this code any more than absolutely necessary.
> It leads to over-abstracted, hard-to-change code that runs badly and is much harder to understand, generally speaking.
Abstractions, even bad abstractions, are far easier to deal with than unabstracted code. Abstractions are cheap and disposable; you're not married to them. They only seem hard to change if you're stuck in the mindset that implementing a change to code that uses an abstraction requires changing the abstraction itself, which is incorrect. If an abstraction no longer fits, you should more often than not stop using it in that part of the code rather than alter the abstraction to fit the new use case. Abstractions are also easier to understand, by virtue of communicating their intent, unlike unabstracted code. It takes a lot of effort, on the other hand, to read through and understand a chunk of unabstracted code, even more to change it or write a slightly different variant of it for a new use case, and still more to extract any kind of abstraction from a bloated unabstracted mess. That being the case, it always makes sense to eagerly abstract if you can, even if your abstractions are bad, because the result will still be more maintainable than unabstracted code.
I think the original blog post is badly worded. We can see from the proposed refactor that those 10 lines were in fact not identical, only similar. The formulas for resizing the top-left corner of a rectangle are different from the formulas for modifying the bottom-right of an oval. They look quite similar, but one will have a plus instead of a minus here, and another one there, etc.
The complexity is built into the geometry itself in this case, and "abstracting" it away only moves the mess around; it fundamentally can't be removed. (Perhaps there is some clever mathematical abstraction that could, using some special number group, but unless the language has it built in, implementing that special arithmetic on top of built-in ints will probably cost more than you gain.)
Assuming this story is real, there are several smells:
* Colleague writes quite a bit of code, and merges it without anyone reviewing it or providing feedback.
* OP rewrites the code from a colleague without communicating, and again merges it without a review.
* The manager calls a meeting with OP and asks them to revert their changes.
The initial code might have been messy or not, and the refactor might have been a bad or good idea. Nevertheless I think OP is taking away the wrong lesson.
Wow. Just review the rewrite and, if it's good, merge it, or resubmit it to a different reviewer, or whatever your process is. The reviewer/re-writer is helping get work done, and being offended is counter-productive.
What the GP explained is commonly true when it comes to interacting with other humans, with only a few exceptions. You can complain about human nature being counter productive all you want, but refusing to adapt to this reality and foregoing fundamental soft skills is, ironically, even more counter productive.
> What the GP explained is commonly true when it comes to interacting with other humans,
Not really. It's company culture. It does not have to be that way at all, and is honestly a symptom of poor leadership and a misunderstanding of roles on a team.
> foregoing fundamental soft skills is, ironically, even more counter productive.
I disagree with having a culture that makes people afraid to submit a pull request because someone might help out and fix a problem with it, or make it better. Being angry comes from misunderstanding and fear that a developer's standing will be diminished because of the rewrite. A developer being angered by a correct or better rewrite is a symptom of bad leadership and a toxic culture.
"Clean code" needs a re-brand. It seems like constantly, over my 25+ year career, I'm seeing never-ending push back against design patterns, abstractions, de-duplication etc.
There are always the same reasons:
- Abstractions make code more complex
- We don't have time to write clean code
- It's all just your opinions and preferences
The purpose of clean code is to make code SIMPLER, and easier to maintain as requirements change. The value of software is precisely its ability to change over time. Otherwise we could stick to fixed circuits which are FAR easier and cheaper to implement and maintain.
Therefore, if you did not achieve these goals with your refactor, and your boss can make a persuasive argument that your refactor made it HARDER to maintain as requirements change into the future ... then your refactored code wasn't "clean", now was it?
As for not discussing the refactor with the original developer, that's outside the scope of clean code. Has nothing to do with whether or not clean code is good or bad, or whether your changes were "cleaner" or not. That's a process and etiquette discussion. Just because you did a dick move and went cowboy doesn't say anything about whether or not clean code is valuable.
I think the pushback is against certain recipes that seem too absolutist. "Clean Code" is the title of a famous book that defines clean (among other things) as:
"No Duplication
Duplication is the primary enemy of a well-designed system. It represents additional work, additional risk, and additional unnecessary complexity."
So according to this definition, removing duplication is synonymous with simplification, which is simply incorrect.
Removing duplication is the introduction of a dependency. If this dependency is a good model of the problem then this deduplication is a good abstraction and may also be a simplification. Otherwise it's just compression in the guise of abstraction.
[Edit] Actually, this quote is a reference to Kent Beck’s Simple Design that appears in Clean Code.
Actually, when writing highly optimized code, cut-and-paste duplication is not unusual. We may also eschew methods, for static FP-like functions (with big argument lists), in order to do things like keep an executable and its working space, inside a lower-level cache.
But also, we could optimize code that doesn't need to be optimized. For example, we may spend a bunch of time refactoring a Swift GUI controller struct, that saves 0.025 seconds of UI delay, avoiding doing the same for the C++ engine code, that might save 30 seconds.
I find "hard and fast rules" to be problematic, but they are kind of necessary, when most of the staff is fairly inexperienced. Being able to scale the solution is something that really needs experience. Not just "experience," but the right kind of experience. If we have ten years' experience with hammers, then all our problems are nails, and we will hit them, just right.
I tend to write code to be maintained by myself. It is frequently far-from-simple code, and I sometimes have to spend time, head-scratching, to figure out what I was thinking, when I wrote the code, but good formatting and documentation[0] help, there. Most folks would probably call my code "over-engineered," which I have come to realize is a euphemism for "code I don't understand." I find that I usually am grateful for the design decisions that I made, early on, as they often afford comprehensive fixes, pivots, and extension, down the road.
TBH, I don't know if I will ever reach the transcendence stage, but my two cents are (after 20 years of professional programming) that breaking the rules is sometimes desired but not when the person has not even reached the first stage. The example from OP, as explained in the post, seems a clear case of never reaching stage one, following the simple rule of no duplication.
The conclusion in the article is kind of good and bad: it seems the OP has reached stage two, but for all the wrong reasons, and there is no telling whether they now actually possess the knowledge or whether they just discarded one rule to follow another, either of which can be wrong or right depending on the situation.
That's not just a view or opinion, it's how learning works. Thanks for the link!
When starting to learn a domain, you lack deep understanding of it, so you cannot make sound decisions without rules.
But rules are always generalizations, they never fully encompass the complexity they hide.
Through experience, you should become able to see the underlying reason behind the rules you've been given. Once full understanding behind a rule is ingrained, the rule can be discarded and you can make your own decisions based on its underlying principles.
Then you become an expert through deep understanding of several aspects of a domain. You are able to craft solutions based off intricate relationships between said aspects.
Let's take `goto` as an example. The rule you will commonly see is "don't use it". But if you're experienced enough you know that's unnecessarily constraining, e.g. there's nothing wrong with a `goto` in C for the purpose of simplifying error handling within a function. It only becomes an issue when used liberally to jump across large swathes of code. But your advice to juniors should still just be "don't use it", otherwise your advice will have to be "use `goto` only when appropriate", which is meaningless.
Also, that "unnecessary optimization" thing, can be a form of "bikeshedding."
Maybe the app is too slow, and the junior engineer knows Swift, so they spend a bunch of time, doing fairly worthless UI optimization, when they should have just gone to their manager, and said "The profiler shore do spend a mite of time in the engine. Maybe you should ask C++ Bob, if he can figure out how to speed it up."
I love this part: "It represents additional work, additional risk, and additional unnecessary complexity", because it could be "refactored" into "additional work, risk, and complexity". I assume it hasn't been, because (in the author's opinion) it communicates the intended meaning better - which might be the case with code, too. "Well-designed" is subjective.
Good design has objective and subjective elements. Or... it might be more accurate to say that it is entirely objective, but some/many elements are context-sensitive.
For example, a style of writing that is difficult to follow but rewarding to parse for the dedicated and skilled reader may be considered good. It is good at being an enjoyable reading puzzle. But from an accessibility standpoint, it's not a clear presentation of information, so it's not good.
Mostly we call things that are increasingly accessible well designed. But we're using a specific criterion of accessibility. It's a great criterion and it's one we should generally prioritize. But it's not the only facet of design.
In code, we generally could categorize high quality design as accessibility. Most engineers probably think of themselves as not really needing accessibility features (although how many are undiagnosed neurodivergent?), but writing code that is easy to read and parse and follow is an accessibility feature and an aspect of good design.
I'm not really sure I know what you mean by "compression in the guise of abstraction". Re-usable code is a great way to isolate a discrete piece of logic / functionality to test and use in a repeatable manner. A sharable module is the simplest and often smallest form of abstraction.
Reusing a function C in functions A and B makes A and B dependent on C. If the definition of C changes, the definitions of A and B also change.
So this is more than reusing some lines of code. It's a statement that you want A and B to change automatically whenever C changes.
If this dependency is introduced purely out of a desire to reuse the lines of code that make up C, then I'm calling it compression. In my view, this is a bad and misleading form of abstraction if you can call it that at all.
> Reusing a function C in functions A and B makes A and B dependent on C. If the definition of C changes, the definitions of A and B also change.
To pile onto this example, in some cases the mastermind behind these blind deduplication changes doesn't notice that the somewhat-similar code blocks reside in entirely different modules. Refactoring those code blocks into a shared function introduces a build-time dependency where previously there was none; as a result, at best your project takes longer to build because independent modules are now directly dependent, and at worst you just introduced cyclic dependencies.
Unrelated modules often superficially look like each other at a point in time, I think that’s what the parent is referring to. An inexperienced developer will see this and think “I need to remove this duplication”. But then as the modules diverge from each other over time you end up with a giant complex interface that would be better off as 2 separate modules.
So deduplicating unnecessarily compresses the code without adding anything to code quality.
> Removing duplication is the introduction of a dependency. If this dependency is a good model of the problem then this deduplication is a good abstraction and may also be a simplification. Otherwise it's just compression in the guise of abstraction.
I think you're referring to coupling. Deduplicating code ends up coupling together code paths that are entirely unrelated, which ends up increasing the complexity of an implementation and increase the cognitive load required to interpret it.
This problem is further compounded when duplicate code is extracted to abstract and concrete classes instantiated by some factory, because some mastermind had to add a conditional to deduplicate code and they read somewhere that conditionals are for chumps and strategy patterns are cleaner.
Everyone parrots the "Don't Repeat Yourself" (DRY) rule of thumb and mindlessly claims duplicate code is bad, but those who endured the problems introduced by the DRY principle ended up coining the Write Everything Twice (WET) rule of thumb to mitigate those problems, for good reasons. I've lost count of all the shit-tier technical debt I had to endure because some mastermind saw two code blocks resembling the same shape and decided to extract a factory with a state pattern, turning two code blocks into 5 classes. Brilliant work not repeating yourself. It just required 3 times the code and 5 times the unit tests. Brilliant tradeoff.
Yeah, this is the crux of it. What exactly is duplicated code? Humans are pattern matching machines, we see rabbits in the clouds. Squint at any 4 lines of code and something might look duplicated.
On the other hand, code bases that do have true duplication (100s of lines that are duplicated, large blocks of code that are exactly duplicated 16 different times), multiple places & ways to interact with database at differing layers - that's all not fun either.
It is a balance & trade-off, it goes bad at either extreme. Further, there is a level of experience and knowledge that needs to be had to know what exactly is a "duplicate block of code" (say something that pulls the same data out of database and does the same transform on it, is 20 lines long and is in 2, 3 or more places) vs things that just look similar (rabbits in the clouds, they are all rabbits).
> Deduplicating code ends up coupling together code paths that are entirely unrelated, which ends up increasing the complexity
Code paths that may be unrelated. If they are related, then deduplicating is most definitely a good idea. If they're trying to do the same thing, it makes sense that they call the same function to do it. If they do completely different things that currently happen to involve some of the same lines of code, but they could become different in the future, then deduplication makes no sense.
"Clean code" can mean two things: (1) someone's subjective feeling that a certain piece of code is nice, and (2) following the (majority of) the guidelines in the book Clean Code and related.
What people have started feeling, myself included, is that (2) and (1) are fundamentally different in many places. That is, that following some of the guidelines of Clean Code the book produces code that is very much not clean.
So sure, everyone agrees that clean code is better, almost by definition. There are indeed valid (usually business/time) reasons to go for code you know is messy, but those are all external - in an ideal world, everyone wants to write clean code. But, increasingly, people don't feel that Clean Code is a good way to achieve that. In fact, many of the patterns in Clean Code have ended up being considered sources of messy code by many people - over-abstraction and de-duplication perhaps chief amongst them.
One thing I learned in industry is that the best cleanly written code does not survive contact with product. Sometimes the technical debt is encoded in the product requirements themselves, and no amount of refactoring or best practices can fix it, only mitigate the pain.
Sure. We see that all the time. Error handling and security are often afterthoughts, for example. We engineers tend to get "commissioned" to build quick proof of concepts and once the business sees their vision come to reality they don't want it to be thrown out and redone with architectural consideration, they want to rush it to market as the "MVP" as quick as humanly possible. This is a reality of our industry that we need to manage and find ways to cope with.
When it comes to certain architectural decisions, there are things that are very hard to change later on, and sometimes you don't know how the product will need to scale, or what challenges the engineering team will be faced with a year, two years, 5 years down the line. This is where we need to make our best guesses and we can't always get it right.
When it comes to the rest, making the code as simple and easy to change is the best we've got.
I was recently reminded by someone I look up to and admire that not everyone has the "gift" of being able to think "top down." This came up because I am often frustrated by the claim that "we don't have time to write this cleanly."
It's a frustrating claim to me, because I don't find that writing short, single responsibility code that is well labelled takes any time at all. I also spent time learning how to work TDD so I can write unit tests as part of the development and design process rather than having to go back and write tests after the fact.
But a lot of developers build "bottom up", where they write really long functions or methods, or dump everything into a single file, and only later think about separation of concerns, layers, and isolated "components", and thus need to spend a lot of time refactoring after the fact.
That initial "bottom up" process is the "getting it to work" part of the process, which is fun for most devs. But if they then need to refactor after it is tedious and laborious and not fun at all. And I think this is probably where the vast majority of the push-back and friction amongst devs really comes from if we look at it from a human point of view.
If you enjoy the process of DESIGNING software, and thinking top down, you won't find refactoring to be a chore or time consuming. You will understand how a well chosen abstraction can SIMPLIFY rather than complicate. But if you like to solve problems primarily, and to build things you haven't built before, and by "building" it means getting something that "works" up and running quickly ... then you don't want to spend your time reading books on design patterns, or refactoring your PR because the Principal told you it's not up to standards.
Top down and bottom up have little to do with using long functions or not separating code into multiple files. In top down, you start with a high-level overview of the system and gradually break it down into smaller and more detailed components, which provides a clear and structured view of the system. In bottom up, you start with the individual components or modules and gradually combine them into larger subsystems or the complete system. Each approach is suitable for a type of problem. I personally prefer to start a POC bottom up; then, once the general shape of the system emerges, a top-down rewrite is a good idea. I've seen lots of systems that started top down whose architecture was atrocious because the final shape wasn't known at the beginning; it was great on paper, though.
There's no silver bullet; there are multiple ways to solve problems, and they all have their pros and cons. The takeaway is that applying a methodology dogmatically is not a good idea.
> I don't find that writing short, single responsibility code that is well labelled takes any time at all.
Writing it takes no time; designing it does.
> That initial "bottom up" process is the "getting it to work" part of the process, which is fun for most devs.
It's not about having fun: it's about getting the functionality to the first users, and then, finally, you can get some slack and have fun and refactor your code for future changes. But functionality is what sells: clean code is invisible to the users and thus to the business.
Clean code is for us developers, and only for us. We have the right to be egoists and work for our own sake from time to time, and working functionality is what will let the business close an eye to it for a while.
> It's not about having fun: it's about getting the functionality to the first users, and then, finally, you can get some slack and have fun and refactor your code for future changes.
You're speaking to business motivation, which I intentionally did not touch upon. Everything you said is valid.
What I'm speaking about is developer push-back against "clean code." And when a developer is advocating for the business, then it is business push-back.
Business push-back is valid because a) their business needs are valid and b) it is very difficult to correlate code quality with business value. There is a correlation, because developer velocity and ability to grow the feature-set and fix defects quickly and efficiently benefits the business as it reduces cost over time and benefits users. But that benefit is long-term and not immediately visible by end users. The business doesn't always want to pay for this up front which is their prerogative, and then we circle back to what you said about developers doing it in order to make our own lives easier because we know the day is going to come when the business gets frustrated that velocity was extremely high at the beginning of the project life-cycle but has ground to a halt.
In any event, developer push-back is real, and it is tied not to business values but to personal developer values. I see code comments (which IMO are smells and point to smells) that say things like "// no point in extracting this anonymous function as it is only a few lines long" ... meanwhile doing so would have taken no extra time at all, and it is buried in the middle of a 300-line Promise chain or RxJS stream, etc.
To see how clean code affects users you need to look beyond its direct, first order effects. Clean code is less likely to be buggy, is more malleable and performs better. That's how clean code benefits users, they get to enjoy faster software with fewer bugs and more features.
> you need to look beyond its direct, first order effects.
Sure, I agree! But too many developers, in order to look beyond the first order effects, have a tragic tendency to forget about looking at the first order effects. And that is a big problem!
The purpose of clean code and the results of practicing it, it turns out, are vastly different. The reason for that is because people are generally incapable of predicting any future state of their code even if they have the requirements in front of them.
Clean code necessitates that dreadful question of “what if”, which is a horrific Pandora’s box of never getting your shit out the door.
It is nearly always better to code for what you know, and to have a person experienced in the domain to otherwise guide the architecture.
I never made the argument of "no true developer." I pointed to specific reasons cited when push-back is given, and then explained why I disagree with those points.
> Clean code necessitates that dreadful question of “what if”,
Nope. Not even a little. "Clean code" can be as simple as keeping your moving parts isolated and loosely coupled so that they are easier to debug, to reason about and to reuse in new contexts if needed.
What you're talking about, the "what if" game ... is what Martin Fowler termed "Speculative Generality" in his book "Refactoring" when cataloguing code smells and sources of complexity. In other words, the exact opposite of what I'm talking about.
If you don't mind, I'd like to propose an alternative list:
- Most of our abstractions are bad, even those we think are good. The good ones are largely already in our libraries (although even that isn't always the case unfortunately).
- Bad abstractions are a time sink.
- Writing code is also about communicating with other humans (on the team), so what they think about communication matters a lot more than what I think
Yes, but contrary to the general perception, I find them to be far less of a time sink than duplicated code. It takes almost no time at all to bypass or rework a bad abstraction, but code duplication necessitates either a literal multiplication of your efforts wherever you need to add features to duplicated code, or refactoring the duplication into a non-duplicated form, which also takes a lot longer than dealing with a bad abstraction. Abstractions are cheap and disposable, even if they're bad. Duplication is expensive and hard to get rid of.
Abstractions are not cheap and disposable. If 5 places use one abstraction, it takes at least 5x the effort (1) to dispose of it compared to changing one of those places.
(1) At least 5x, because sometimes it's worse. Example: let's say you are using a bad ORM which, in addition, doesn't expose its connection pool in any reusable way (most don't). You want to introduce a good ORM, but you can't, because doubling the size of the connection pool is too expensive. Additionally, because the ORM triggers events when objects are created and modified, and database state relies on that, you cannot use the other ORM for object creation and modification unless you replace them all at once. So you would have to refactor your code to introduce your own data access layer that has implementations for both ORMs, then switch over all at once.
In contrast, introducing deduplication later is somewhat easier. You can build an abstraction and test it with 1 or 2 of the existing parts of the code, then migrate the rest incrementally instead of all-at-once. It will also prevent some bad (expensive) aspects of the design due to the requirement that it must be incrementally adoptable (must be able to have a reusable connection pool, demand a more reliable change events system etc) and as a corollary it will also be incrementally removable.
The bad situations arising from the costs of duplication are often far worse. If those code paths had slightly different needs, they would diverge. It often feels like 90% of my career has been spent on massive undertakings to learn an entire system, only to pick these duplicates out after the fact and replace them with something reasonable.
I don't have data to support any absolute claims of what is better vs not. However, every project, team, company, industry that I have worked on has been filling my time with the above scenario. Not sure what universal truth could be pulled from that (if any)
> Otherwise we could stick to fixed circuits which are FAR easier and cheaper to implement and maintain.
While this is not the key point of your post, this in particular is also not the case: with even a little complexity, it quickly becomes much easier and cheaper to implement functionality in software on a generic CPU than in fixed circuits.
For example, we often integrate a simple CPU into products that are otherwise fixed circuits, just to perform single functions that would be too resource-intensive and hard to verify in hardware. The CPU then becomes a slave to the surrounding circuit.
People never think about complexity as a tool, but it's as much of a tool as the lowly assumption. If anything, they're two sides of the same coin.
You can make things simple by making assumptions, you can enable cool things by making things complex.
The thing about assumptions is that they're dangerous. Easy example: assuming the timezone of a DB column vastly simplifies things, but if you're ever wrong, things can go terribly wrong.
Which is why some of the most dangerous developers on your team are the ones who either don't _respect_ assumptions or don't even realize they're building in assumptions.
The flip side is true for complexity. You can absolutely harness complexity to great effect, but if you have developers who don't realize they're building in complexity or don't respect the cost of that complexity, you have dangerous developers.
It's a good idea to remember that $NEXT_PERSON could quite possibly be you a year from now, when you've probably forgotten what you were on about previously. There are selfish reasons to write clean code as well!
I think the broader issue is that there is little valuable empirical evidence that the stuff people call clean code actually makes things simpler. Likewise, I find it unlikely that there is exactly one way to make things simpler and that it's the way associated with the book. It's all vibes.
I like to keep in mind a companion rule along with DRY (don't repeat yourself): Don't make the reader repeat themself, i.e. re-read your code multiple times after you've cleverly repackaged it.
Philosophy of Software Design is a much better book discussing these ideas than clean code. Part of the problem is uncle Bob's book is now tied up with the idea of clean code, but his book sucks.
Your colleague wrote lots of code using copy and paste. You refactored it post commit. He complained to your boss, who had a go at you, and now you're going to leave mess in the codebase next time.
There's a lesson to be taken from this experience but it isn't that copy&paste is better than writing functions.
Your refactor should have gone up as a change to be reviewed, tagging the author. However the escalation to boss who tells you to revert is a really bad sign, as opposed to your colleague talking to you directly.
I guess everybody has a different way of doing things. Personally I'd ask my colleague on Slack (privately), "Hey, you repeat 10 lines of math in each of these functions, wouldn't it make sense to make it a separate function?".
The two probable answers are: "No, because... (reason)" or, less likely, "You're right, I'll do it tomorrow". No fuss, no problems for anyone, and someone can learn a thing or two.
If it is copy-paste out of laziness, then certainly "writing functions" (abstractions) would be the better solutions for repetition (DRY).
But the author adds crucial context:
> My code traded the ability to change requirements for reduced duplication, and it was not a good trade. For example, we later needed many special cases and behaviors for different handles on different shapes.
It shows the difference between *accidental DRY* and *inherent DRY*. The context makes it clear that what initially looked *inherent* was actually *accidental*. Quite often, code is the same because the underlying libs A and B are consistent with each other. Or business needs state that something has to be 15 items large, and another thing also has to be 15 items large. And so on. But when we look closely, the libs' consistency is really just chance: our wrapper around lib A and our wrapper around lib B may contain the same code, but they are wrapping something entirely different, and abstracting that makes the result worse.
Same with the "15 items". When the business states "every list must always show 15 items", then sure, we should abstract the 15 somehow. But if it says "the top list on the homepage shows 15 songs" and "the comments under a song show 15 items by default", then abstracting that 15 is the worse option.
Without knowing too much about the context, it probably would have been possible to keep the structure where every shape is structurally independent from other shapes, but where calculations are abstracted away in pure functions that are just being called by the shapes. That keeps the maths in one place (and you can always add more calculation functions later if needed), but it doesn't entangle different shapes with each other needlessly.
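As a rough illustration of that structure (hypothetical names, Python standing in for the article's pseudocode): each shape keeps its own resize logic, while the shared maths lives in small pure helpers.

```python
# Hypothetical sketch, not the article's actual code: shapes stay
# structurally independent, but shared maths lives in pure helpers.

def scale_length(length, factor):
    """Pure helper: scale a single dimension."""
    return length * factor

def resize_circle(circle, factor):
    # Circle-specific logic, free to diverge from other shapes later.
    return {**circle, "radius": scale_length(circle["radius"], factor)}

def resize_rectangle(rect, factor):
    # Rectangle-specific logic, also free to diverge.
    return {
        **rect,
        "width": scale_length(rect["width"], factor),
        "height": scale_length(rect["height"], factor),
    }

print(resize_circle({"kind": "circle", "radius": 2.0}, 1.5))
# {'kind': 'circle', 'radius': 3.0}
```

When a shape later needs special-case behaviour, only that shape's function changes; the helpers stay untouched.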
For me the biggest red flag in all of this is changing a coworker's submission without any kind of communication.
Second red flag: the refactor is not a good refactor. It lacks modularity and is still coupled; yes, you reduced some duplication, but that's mostly it.
“Hey I rewrote what you wrote because it sucked. Got problems with that?”
I suppose the best solution to this problem would be a “late review” asking for an explanation, offering a solution, and hoping it doesn’t just get ignored, which is the easiest thing to do for the author.
> Hey I rewrote what you wrote because it sucked. Got problems with that?
Reading it like that is just insecurity speaking. If anyone wants to rewrite my code to be more readable, I'd love for them to tag me in the review and show me how they did it, so I can learn. Assuming good intentions and being open to discussion goes a long way :)
I'll always tag the author for review if I'm rewriting something recent. It's more "Hey, I was too late to mention this in the review, but I think this improvement was worth the time. Let me know what you think in case I missed something important about the original."
That’s generally what I like too, but often in teams my suggestions are either approved with zero comments, ignored or followed by wasteful reviews by people who don’t know the language they use every day, and don’t bother to google their answers.
If someone had just summarily rewritten my code, I would personally go talk to them and find out "what's up", as it shows a communication breakdown. Second, the person who did it should have approached the other dev and said "hey, I see you have blah, why? Wouldn't this way be better?".
This happens many times in codebases. People begin to 'own' sections of it and they become very protective of what is going on in their area. They take it very personally if something changes and they were not in on it.
It's a context thing. The point of adding the author as reviewer is to keep them in the loop, show an alternative solution and discuss which way to go. Also to learn why they took the copy/paste route in the first place - it might be carefully reasoned, it might be a rush job, they might not know any alternatives.
Also to put the conversation on record somewhere so that if someone comes later with the same complaint they can see why the code was written like that.
I wouldn't necessarily tag the author for a review. I might ping them to notify that I was doing some refactoring in the area and leave it at that. Of course it all depends on the team size, diversity/range of work, typical interactions between members, etc.
I don't think a minor deduplication justifies a separate discussion even before a single line is changed. That would make a huge deal out of it and waste everyone's time.
I probably wouldn't make a separate PR for that - just refactor the code the next time I need to touch it - and tag the author for review, of course.
There was a significant bit at the end of the article where the author discussed future changes to the codebase that required special-casing handles for different shapes, and how that would have been difficult with the refactor but simple with the repetitive code.
I think that’s the critical bit is that you have to realize that the code will change in the future. Requirements will change, related code will change. If you make abstractions that restrict that ability to change, you’re hurting future you’s ability to adapt the codebase. More so because we often hold on, emotionally, to these abstractions we’ve invested so much time and effort into that we waste more time trying to elegantly fit the changes into our elegant solution.
I'm going to go out on a limb and guess that their "reason 1" was the sole source of the issue. This smells like an instance of the "nobody likes a know-it-all" problem that often arises with junior devs who haven't yet learned that their job isn't just between them and the computer, but between them, the computer, and other people. The cleaner version likely would have replaced the dirty version had the refactorer actually had a dialog with the author of the dirty code, and had they mutually decided that there was a better way.
The article likely would be better titled as "I learned a lesson about working on a team", not "goodbye, clean code".
> instance of the "nobody likes a know-it-all" problem
Excellent insight. I think this exactly captures the dev environment. The more sophisticated code was viewed as a threat, a rival worth taking down a peg or two via the management chain.
For juniors reading this who have the same experience. "Nobody likes a know-it-all" is context dependent. It's absolutely true of lots of professional settings. Simplicity over flexibility, avoid confusing language constructs, preference for lots of time at the keyboard and languages that encourage this.
The lifelong learning, software craftsman, reflect on the process philosophy advocated in various places is completely at odds with that setup. Trying to get better is unpopular with those who are not improving.
Both are valid life choices. Software has been a comfortable place to put in your seven hours at the office and go home. It's also a world to explore bounded only by your imagination and intellect. Beware spending decades working with those who fear the unknown if you'd like to be one who discovers novelty.
Or more concisely, if your incentives are misaligned with those of your colleagues, consider getting out of there.
I think the title is fine. It's a classic for good reasons and has been an eye-opener to a lot of devs falling into this exact trap, which you can do even if you're working alone. You overengineer a clean abstraction just to get bitten by it in the future.
> This smells like an instance of the "nobody likes a know-it-all" problem that often arises with junior devs who haven't yet learned that their job isn't just between them and the computer, but them, the computer, and other people
Meh, it is usually seniors on a power trip who are guilty of this.
The problem here is that changes are committed directly to the main branch without review. There needs to be some sort of peer review (I don't care if via pairing or with a classic PR) for every code change, exactly to prevent an escalation like the one the author experienced.
The boss was a bad boss, instead of reprimanding you, they should have realised that they were missing a good process. In a team with a code review culture, an overeager newcomer might attempt something like this (and tbh, whether it was a good attempt or not is impossible to judge without knowing the code and its context) and a peer would have said "yeah, thanks but no" or "good idea, but let's try it like this instead", etc.
Independently from that, I do think that one shouldn't just refactor code without good reason because even if it's bad code, it might be code that rarely needs modification. The best time for refactoring is when you need to change behaviour and you realise that the old code structure is making things too hard.
Unfortunately, I think the wrong lesson was learned here.
Could you elaborate on how code review after merge is fine or better?
Some major downsides to review after merge:
- cost of context switching (author has moved on to something else new, which now remains paused to "go back", so there's no agility benefit to just merging if it works)
- increases risk of unnecessary conflicts (what happens when someone merges something, you have feedback, and then someone else merges on top? A PR helps distinguish code that's done from code that could be improved, because it forces a communication moment between the authors)
- tooling (a PR or diff is well supported. How are you discussing feedback when everyone can just merge on top without review? I am assuming there's no point to a PR if everyone can just merge)
- decreases shared learning and understanding (I might think the code follows our standards but there still may be feedback from my team that could help improve. Why put that in the main branch before such feedback? It seems like it would be hard to keep track.)
I can't imagine my team performing well under those circumstances, and I think we have a very healthy code review / quality culture. If I'm not giving or receiving feedback, that sounds more like code slinging than thinking and humility; even the most experienced architects I've worked with welcome feedback, so it's not a matter of trust.
I haven't used this workflow but I imagine your concerns could be addressed.
> addressing merge conflicts
I actually think, from the pov of the change author, this workflow is better at this. Other code changes have to resolve conflicts with yours, not the other way around. The followup changes from review feedback can begin with conflicts addressed for you.
> tooling
I don't know how other platforms handle this, but on GitHub at least there is nothing stopping you from reviewing a merged PR. You can prevent pushing straight to the trunk while still allowing authors to merge their own PRs at will.
I do think your other points are clear drawbacks but on its face the practice doesn't seem without merit. Seems like the "show" point on the ship/show/ask spectrum.
- Breaking the framing that a review should only address code that has changed in a PR, and encouraging a more holistic view of the codebase. You don't have to review every merge every time, just review code at a granularity that makes sense when it makes sense to do it.
- Removing a delaying step between code being functionally complete and being delivered, where value invested is sitting on the shelf waiting for a reviewer.
- Forcing a high test pipeline confidence. If you're relying on code reviews to catch functional problems rather than stylistic or structural ones, yeah, you won't have the maturity to do this. Saying "we'll review after merge" is saying "we have enough confidence in our automated quality gates not to rely on a human before then."
To address your worries:
- Cost of context switching is removed, not made worse. If you insist on a review before merge, the author has to wait for a review, and unless you have a culture of reviewers context switching to do reviews immediately when they are requested, they will probably pick up something else. Now when the review comes back, what do they do? Do they leave the review waiting, increasing the likelihood of a merge conflict as it gathers dust? Or do they context switch to deal with it and realize the value invested? Contrast this to review-after-merge: the value is delivered, and there's no merge conflict potential; the reviewer can get to it when convenient; and the result of the review can just be another set of tickets on the board to be picked up in the normal sequence. No context switch required.
- Disconnect the review from the PR and the problem of conflicts goes away. Instead review a module at a time (where "module" can be whatever scope makes sense: file, import, header, whatever), so that you're always looking at the whole context rather than the few lines that happen to have changed. That also minimises the human tendency to find as many issues in a 50 line diff as in a 2000 line diff.
- The tooling issue is to some extent an availability bias. Just because tooling for a specific process is well-known does not make that process good. It can entrench harmful practices and make them hard to change, or even hard to see how harmful they are. And yes, if you're doing something close to trunk-based development then either PRs get very small or they go away entirely - branch management is orthogonal to when reviews happen, but you can't usefully move in that direction without addressing the delays inherent in review-before-merge.
- Learning and understanding would only decrease if reviews stopped entirely, rather than moving when they happen. The reviewer always has `git blame` to direct feedback to the right place, and by expanding the scope of a review from just-the-diff to all the code around it, the reviewer has more of a learning opportunity, not less.
It's possible you do have a healthy review culture, but the question I would have is this: how long do PRs sit on the shelf waiting for a review, or for review feedback to be addressed? Do you track that number? And are you relying on humans to catch problems that should be embedded in automated quality gates? Moving the review out of the commit delivery cycle removes a potential process bottleneck, and it's a very easy one to be completely blind to.
No amount of automated tooling can account for subtle security issues, wrong understanding of the spec (something that happens easily especially to people new to the project, but also every time you work with an area of the code you're not familiar with), gotchas previously encountered, etc. There are things that humans (especially ones with a lot of experience) are good at catching that machines aren't.
If you do "review after merge" and you deploy after every merge, I think that's highly irresponsible. If you don't, then your first point still applies - there is a delay between the function being "complete" and being delivered.
That's just blame diffusion though. No review is guaranteed to catch issues like that, all it does is say "well we followed The Process, guess it sucks that one got through" when something bad inevitably makes it to production because the reviewer didn't have the specific knowledge to catch the specific problem. That's more likely to happen when the scope of a review is limited to a diff - either implicitly because that's all the tooling shows you, or explicitly because of "why should I fix code I didn't write on this ticket" reactions to review feedback.
That's not to say blame diffusion is without value! You might need it to avoid toxicity around the team. But that's a different problem.
It's not "blame diffusion", it's risk management. Two sets of eyes are more likely to catch issues than one, and if I specifically know that some part of the code is tricky and person X is familiar with that part of the code, I might even ask that specific person to review.
Honestly, it sounds like you have a very cavalier attitude towards breaking production. That might work in some settings, but definitely not all of them.
That may work in some cases, definitely a no go in a workflow where deployment is fully automated via GitOps, i.e. every push to the default branch triggers a deployment to prod.
So what do you do if there are significant changes to be made and you have to revert the commit? All your collaborators will have wrong commits in their history and will have to reset their branches.
Why is everyone stating what he should have done instead? This is a perfectly legitimate lesson.
In finance, we are constantly dealing with products which are somewhat alike but not quite. Equity options vs Fx options ? Sure, they're options, they have a strike, we price them using some version of Black Scholes, etc. But oh boy they are so different.
The temptation for new joiners to factorize a lot of behaviors is huge, and many times we are in the shoes of the OP's boss, telling the younger devs to remain calm and keep things separate.
It's not about saying goodbye to clean code, it's about actually maintaining a clean code by not overabstracting.
> Why is everyone stating what he should have done instead? ... In finance, we are constantly dealing with products which are somewhat alike but not quite.
Yes, but we are talking about geometry. What the (usual) handles do is apply affine transformations (translation, rotation, scaling and mirroring); if the scaling is uniform, these are similarity transformations. And they are guaranteed to work for any shape.
Which leads to the real problem: you can't (well, shouldn't) abstract something, if you don't have enough knowledge about that - especially what new possibilities may show up in the future. For example if somebody thinks that the only possible ovals are circles and ellipses, they will be quite surprised when the first "real" oval shows up.
It's funny that you phrase it that way, because this exact kind of simple geometry is my go-to example for why this stuff is not as simple as it looks. Let's look at a square. Everyone knows a square is a rectangle, right? So of course a square should extend from a rectangle.
But math doesn't deal with mutability all that much. It's interested in the visible properties and constraints. A square is a rectangle because it meets all the constraints and has all the properties of a rectangle. But that is no longer true if the shapes are mutable. A rectangle might appropriately have `setWidth` and `setHeight` operations. A square cannot implement these operations and still obey Liskov substitution, without the ability to downgrade its type to a rectangle. In OOP you might correct this by making the square immutable and the `setWidth` and `setHeight` operations would return a rectangle instead of a square.
To bring this specifically to this use case: Yes, it's a defined mathematical operation to scale a shape such as a square on the x-axis. But, if you do, that square is now a rectangle. This may be important to the implementation, it may not. But it's certainly a relevant technical concern.
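A small Python sketch (hypothetical classes, not from the article) of the Liskov problem described above: a mutable `Square` cannot honour `Rectangle`'s contract that `set_width` leaves the height alone.

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def set_width(self, w):
        self.width = w

    def set_height(self, h):
        self.height = h

    def area(self):
        return self.width * self.height


class Square(Rectangle):
    """Mutable Square 'is-a' Rectangle in name only: to stay square,
    it must change both sides, breaking Rectangle's contract."""
    def __init__(self, side):
        super().__init__(side, side)

    def set_width(self, w):
        self.width = self.height = w  # keeps it square, violates LSP

    def set_height(self, h):
        self.width = self.height = h


def stretch(rect):
    # Written against Rectangle's contract: width and height independent.
    rect.set_width(4)
    rect.set_height(2)
    return rect.area()


print(stretch(Rectangle(1, 1)))  # 8, as the caller expects
print(stretch(Square(1)))        # 4: the Square silently changed both sides
```

The immutable fix mentioned above would have `set_width` on a square return a new `Rectangle` instead of mutating in place.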
No, a square is a rhombus and a kite, same as a rectangle is a kite, so both should be kites. And both are also trapezoids, so they should be trapezoids too. And parallelograms. And of course all of these are quadrilaterals. A square is also a regular polygon, and so on.
The idea of geometric objects in a hierarchy leads to either incorrectness or problems in the implementation (most of the time both ;). Just don't do that.
Which goes back to finance, because rewriting a Black-Scholes predictor every time one instrument needs it would be insane. And yet, that's what this code seems to do.
> It's not about saying goodbye to clean code, it's about actually maintaining a clean code by not overabstracting.
But the title says so! :p
I get the whole point of this post, but it can easily be read as an excuse for actually messy (as opposed to "non-clean") code. The author doesn't actively distinguish the two, so others do.
Agreed, it's quite telling that everyone in here feels like they know better than his team mates and him (at the time of writing the article), with no other context than some pseudocode.
DRY has value but abstractions come at a cost. It's always a tradeoff.
Part of the reason I enjoy languages like Haskell is because of the abstraction maximalism enshrined at the language level, rather than the coding level.
We all agree on what an Applicative is supposed to do. The challenge shifts right in the idea-to-code pipeline to finding the right Applicative for the problem. Most OO design patterns have natural, well defined equivalents in such an environment, but the difference is this way helps minimize the amount of project specific abstractions you have to deal with.
The downside as a beginner is of course a greater upfront cost to learning the language, and quite frequently opening up another Haskeller's work and realizing there are like 12 compiler extensions you've never heard of that you now need to grok. But even there I would argue that's a net win: you can carry those compiler extensions with you to all projects in the future. (The downside as an expert is realizing you have to get a PhD to get a sense for how much RAM your cats will gobble up!)
We know what Applicative and Monad (forgetting the burrito) do because they have mathematical LAWS. If a piece of code uses Applicative or Monad, you know its behaviour due to the Applicative/Monad mathematical laws.
No OO design pattern has a law; they are loosely defined concepts/patterns, which means they are open to personal interpretation/customisation, with no concrete law to say exactly what a pattern is and what its expected behaviour/output should be.
> If a piece of code uses Applicative or Monad, you know its behaviour due to the Applicative/Monad mathematical laws.
Well, we actually only know what it is supposed to do. The laws are not enforced, so it is perfectly possible to implement a type class that violates the laws of the type class.
Several test frameworks implement Law Testing which will test your instances for lawfulness as long as you can generate instances of your data structures. E.g. https://hackage.haskell.org/package/hedgehog-classes . So you only need to be able to generate instances and you get comprehensive tests for free, so that's nice too.
It is also the case that these very abstract type classes typically have only a few implementations that make any kind of sense type-wise, and the 'obvious' one is usually the correct one.
(Just to preempt: Yes, tests aren't proof, but such is life without a full proof system in the language. In practice, I've found property-based testing sufficient.)
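A toy Python analogue of that law-testing idea, assuming nothing beyond the standard library: a hand-rolled property check over sampled values catches an "instance" that type-checks fine but breaks the functor identity law.

```python
import random

# A lawful "functor": mapping a function over a boxed value.
def fmap_ok(f, box):
    return ("Box", f(box[1]))

# An unlawful "functor": same shape, but it mangles one value,
# violating the identity law fmap(id, x) == x.
def fmap_bad(f, box):
    return ("Box", f(box[1]) if box[1] != 0 else 42)

def satisfies_identity_law(fmap, samples):
    """Property check: fmap(identity, x) == x for all sampled x."""
    identity = lambda v: v
    return all(fmap(identity, ("Box", v)) == ("Box", v) for v in samples)

# Random samples, plus 0 to make the counterexample deterministic here;
# a real property-based framework would shrink to it automatically.
samples = [random.randint(-100, 100) for _ in range(500)] + [0]
print(satisfies_identity_law(fmap_ok, samples))   # True
print(satisfies_identity_law(fmap_bad, samples))  # False: caught by the property
```

This is the essence of what hedgehog-classes automates in Haskell: generate instances, assert the laws.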
Oh, I know and I'm not arguing against anything you've said. I just wanted to express - for people who don't know Haskell - that while the laws exist, they are not (as of now ;) enforced or checked by the type checker.
Your observation reminds me of Philip Wadler's Strange Loop talk [1], where he states that some languages are discovered, whereas most languages are merely constructed.
I think he would have agreed that Haskell (or at least, languages based around lambda calculus) is more in the first category, whereas Java, C++, Smalltalk, JavaScript (i.e. OO languages) are in the second. Computational fundamentals versus engineering problems.
But Monads are what you compose; the equivalent in OO would be the semicolon. You still need business logic that follows messy life / human rules, no matter what paradigm you use.
I think these are two wildly opposed worlds. What the industry calls OO is just a way to try to avoid large business failures. Other paradigms were more about computation; the logical basis is way, way deeper. These people like to own all computational layers very precisely, from parsing, to compiling, to algorithmic analysis. Enterprise/application coding is not about that; it's about bridging the customer/IT gap with something not too horrendous and still adaptable enough to handle market chaos.
As a fellow haskeller, I agree on that the language has very good abstraction capabilities.
On the specific example in the blog post though I can only say that there are so many times I've been down the clean code path and it's more often not another person, but the future you, into whose foot you are shooting. Abstraction is not beneficial when you need to study it later to understand it again.
> Abstraction is not beneficial when you need to study it later to understand it again.
This is where a good study routine comes in. I study Haskell like I study all things of any long-term importance, using an Anki deck. It takes me longer to start using a new abstraction, but once I have it it's pretty much there for life, due to my daily commitment of half an hour or so to my reviews. It makes far less sense, of course, to do this for a one-off abstraction I had to apply to a specific project, so this naturally tilted me over the years to work more and more in languages like Haskell.
(N.B., I do not state this as a prescription. I think this is well above and beyond what most people expect of themselves professionally, and that is okay. Everything in life comes with tradeoffs.)
This is really interesting to me. What exactly do you write on the Anki cards? I cannot really imagine learning a Haskell abstraction by mere memorization. How can you learn it without actually applying it to a real problem? Often the hard part is knowing when to use which abstraction. How do you learn this using flashcards?
What you detail is pretty much how you would recognize it. You write out a brief summary of the high level problem, and then ask yourself "which abstraction would you reach for first, absent any other information?"
The latter part is often implied, but it's best to make it explicit. You're trying to build up expert intuition, which means allowing for the fractal nature of software development to mean you might be in the 20% of times where this won't work for you.
The OP would have done better to use more OOP principles, unless eschewed by the team in general--then I'd probably work elsewhere. The place where OOP has the most natural fit is implementation of GUI widgets and their editors. I've written a few myself.
The 'rule' that unifies OOP's other rules is tell-don't-ask. So for the example at hand we have the handles, which are shared in a known way, and I presume shape-specific control points (e.g. a circle may have a center and radius). Then a change to a box handle could call an updateControlPoints(changedHandle). There would also be an updateBoxHandles(changedControlPoint). There may be other clean splits, but that's how I'd do it.
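A minimal Python sketch of that tell-don't-ask split, with hypothetical names (updateControlPoints/updateBoxHandles rendered in snake_case): the caller tells the box a handle moved; the box recomputes its own state.

```python
class Box:
    """Hypothetical sketch: shared handles on one side, shape-specific
    control points (x, y, width, height) on the other; each change
    tells the box to update the other side."""

    def __init__(self, x, y, width, height):
        self.x, self.y = x, y
        self.width, self.height = width, height
        self.handles = {}
        self.update_box_handles()

    def update_box_handles(self):
        # Recompute the shared handles from the control points.
        self.handles = {
            "top_left": (self.x, self.y),
            "bottom_right": (self.x + self.width, self.y + self.height),
        }

    def update_control_points(self, handle_name, pos):
        # A handle moved: the box updates its own control points.
        if handle_name == "bottom_right":
            self.width = pos[0] - self.x
            self.height = pos[1] - self.y
        elif handle_name == "top_left":
            self.width += self.x - pos[0]
            self.height += self.y - pos[1]
            self.x, self.y = pos
        self.update_box_handles()


box = Box(0, 0, 10, 10)
box.update_control_points("bottom_right", (20, 5))
print(box.handles["bottom_right"])  # (20, 5)
```

A circle would implement the same two methods against its own control points (center, radius), so shapes never reach into each other.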
> We all agree on what an Applicative is supposed to do.
Maybe. But not on which parsing library to use (parsec? megaparsec? attoparsec? regex-applicative?), on whether to use Data.Text.Strict or Data.Text.Lazy, on how point-free the code should be, on whether to use "." or ">>>" or any other combinators, on which language extensions to use...
I like Haskell, but it suffers from similar problems as any other language: people can't really agree on the best way to do things
Rust has GAT's which have the same expressivity as HKT's. But many Haskell patterns turn out to be unidiomatic in Rust due to extra overhead. (For instance Rust has multiple function-like traits that access their environment differently. There's no equivalent to this in Haskell of course; GC and default boxing choices for Haskell types obviate the issue, but this is not zero cost.)
I know, I know. It's very silly of me not to pick up Rust when I've already got all of its forebears in my bones. Someday, once I've finished studying everything else I need.
oh, oh my. The golang opposite where you don't have language extensions and should be able to grok everything about the language in your head all at once... yeah I'm going with "simpler language".
You could have just moved the repetitive math into separate functions and called them from the resize functions.
The resize functions themselves only looked repetitive because they were forming an interface, which is why refactoring them away was wrong. But the math they performed is a perfectly valid target for cleaning the code.
Especially since it was a refactoring, it's generally better to handle those in stages if you're working with a team.
There are a lot of ways/patterns for solving the same problem... but tearing apart an existing interface being used by others isn't a good start.
The best part about abstraction is that you can modify the underpinnings to be more flexible without hurting the rest. A good step would be to create more flexible math functions underneath, like you're saying, and leave the rest alone.
Afterwards, there can be deeper conversations with the team about how they want the interfaces to communicate to provide an obvious facade but an abstract implementation.
I feel like what you’re describing is bad abstraction. This is the real issue, people designing things off two or three examples, not making sure that what they have is general enough to accommodate future usage.
In particular, a good approach to this is to abstract the common parts to all those into just some very simple functions. Abstract bottom to top, not top to bottom. There is no need to create a single point where everything happens and deals with the 2^n combinations.
You still reduce repetition, you can still change stuff in a single point, but you retain the ability to make custom changes if needed.
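A sketch of that bottom-to-top style, with made-up helpers: tiny shared functions, composed per call site, rather than one configurable entry point handling every combination.

```python
# Bottom-up abstraction: small shared helpers, composed differently
# per case, instead of one top-level function with 2^n option flags.

def clamp(value, low, high):
    """Pure helper: restrict a value to [low, high]."""
    return max(low, min(high, value))

def snap_to_grid(value, grid=10):
    """Pure helper: snap a value to the nearest grid line."""
    return round(value / grid) * grid

# Each caller composes only the pieces it needs and can diverge freely.
def move_free_handle(pos):
    return clamp(pos, 0, 100)

def move_snapping_handle(pos):
    return snap_to_grid(clamp(pos, 0, 100))

print(move_free_handle(137))     # 100
print(move_snapping_handle(43))  # 40
```

If snapping handles later need special behaviour, only `move_snapping_handle` changes; there is no shared code path to untangle.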
Yes, the problem in the real world is that no one KNOWS they're writing a bad abstraction at the time they're writing it, otherwise they wouldn't write it.
Therefore it's insufficient to say things like "avoid bad abstractions" or "make sure you accommodate future usage". No one can predict the future, generally speaking.
Abstraction should be used with extreme caution, even when you feel that it's probably right. Be an extreme skeptic.
I completely agree. The issue with the "clean" version of the code is that it's completely coupled together. If you touch one thing in the math code, it affects everything. At some point, it becomes easier to keep some duplicated code, point out the similarity using some code structure cues or comments, and over time, highlight any key differences using more comments.
A "clean code" approach to this kind of code doesn't handle evolving requirements well. It always ends up as a mess of helper functions with multiple code paths within depending on where the function is called from. This can sometimes be abstracted: some people might have noticed already the previous sentence is describing OOP polymorphism. But how would anyone know the exact requirements ahead of time? If we're in such an early phase, any attempt to "clean code" will only result in a bad abstraction, or in overengineering. The right approach is to do nothing, note the possible issue, and wait until more is known about the problem domain.
We could call it The Premature Abstraction Anti-pattern :)
Been a victim of this a few times myself, so I normally do exactly what you've suggested: wait until a clear, still-performant abstraction has emerged from the code and slowly refactor it in. But it's always better to do this on a nearly completed codebase, at the end, where you should naturally be looking to remove lines, not add them.
This is why I've grown to like discovering patterns based on changes - new features and bugs being fixed.
I wouldn't worry too much about somewhat duplicated, but easily understood code, like the original example.
I would however worry if I have to fix the same issue in 3 places, or add code for a new feature in 3 places. That's when I start wondering if I can push the code I need to change or add into a common place somehow.
This is completely right. People read about design patterns or DRY, usually apply them wrong since they're still learning... and then act like it's the source material's fault?
Yup this is the worst part of Clean Code. I've seen real-life bugs caused by this kind of refactoring.
The best part is that the same commit added unit tests so our coverage grew over the magic 80% bar, but of course unit tests don't usually test for race conditions.
The problem is that the source material (the Clean Code book in this case) has almost no caveats, and it rarely acknowledges other styles or points of view. For example, when the book discusses possible objections to having "lots of small functions", it simply doubles down, suggesting also having tons of small classes, and rationalizes it. Experienced developers know that reality is a bit more nuanced than that.
This is actually a big reason for the popularity of the book, showing a "one true way".
And this carried on to followers of the style. Some linters even enforce the style without much regard for the practicality of it.
Design patterns, for example, really should only be studied once you have quite a bit of experience with complex code bases under your belt. You need to have done battle with some nasty code problems in bigger code bases before you really understand the problems design patterns are trying to solve. Inexperienced developers can't be trusted to apply them because they don't have that judgement. It just looks like a shiny thing.
True for so much of compsci stuff. You won't understand regular languages, automata, grammars etc until you have done a fair bit of pattern grokking yourself.
I don't see how that's a fair characterization of a book that has this quote in the first chapter:
> Many of the recommendations in this book are controversial. You will probably not agree with all of them. You might violently disagree with some of them. That’s fine. We can’t claim final authority. On the other hand, the recommendations in this book are things that we have thought long and hard about. We have learned them through decades of experience and repeated trial and error. So whether you agree or disagree, it would be a shame if you did not see, and respect, our point of view.
It's been a while since I've read Clean Code, but I seem to recall it stated many times that blindly applying the rules of Clean Code without good justification would lead to bad code. The author even provides examples of this. People in this thread are criticising Clean Code principles as if they are meant to be a rigidly enforced dogma. They aren't, and the author never intended them to be so.
I think that quote is a good example of why it is a fair characterization. It uses the authors’ seniority to argue from authority, even explicitly requesting respect.
To a beginner, it reads like ”these are subjective matters so experience is king, and we have more experience than you do”.
"Respecting my experience" does not translate to me as "do everything exactly according to these strict rules". To me it says "consider my opinions before doing something different". Consider. Not follow blindly. I can see how one might interpret it the first way if they read the quote in isolation, but certainly not in the context of the book. Which, as mentioned before, goes out of its way to state these rules are more like guidelines, and gives examples of where strict adherence causes worse code.
Is it? The book requires being quite familiar with programming already. Perhaps this is me projecting my own experience of when I read it, but I feel like the book is targeted at someone in the late-phase of being a Junior Dev, and/or the early-phase of being an Intermediate Dev.
I personally derived a lot of value from reading the book, and I feel like my skills noticeably improved from before to after. So perhaps my own biases are showing, but I believe discounting the book's contents wholesale is a mistake. There is a lot of value to be garnered from reading it. The book doesn't have to be the infallible word of the Software Gods for it to be useful.
That disclaimer filters out anybody who isn't at least transitioning to being a senior dev (with real seniority, not just an inflated title). It takes quite a lot of experience to agree or disagree with a rule, and to respect a point of view without automatically applying it.
In fact, since the rules in the book have a much deeper impact than they appear to, being able to read that book and not get damaged by it is a good test of seniority.
But, funny thing: if you are mature enough to fit the disclaimer, you've necessarily already seen everything the book talks about and don't need to read it.
Not bad advice when you remember that advice is only ever for people who know absolutely nothing and need encouragement to start. "Only solve for X" gives some constraint so that one can start to focus on what really matters and not be bombarded with so many choices as to derail the entire effort.
With practice, one soon comes to understand why you might also want to solve for Y. Those with experience are going to ignore what everyone else says anyway, so it doesn't matter if it is not true for them.
Indeed, we in this industry are bad at practicing, and that's a problem. Imagine being Taylor Swift and having your guitarist for the night's performance having picked up a guitar for the first time yesterday. That would never fly, yet that's often how we treat software development – and that's how we get to these kinds of places.
There is no need to accommodate for future usage. Most of the issues resulting from abstractions actually come from people treating abstractions as sacred, trying to adapt abstractions used by some piece of code to fit new use cases, when they should be treating abstractions as disposable instead, swapping them out and creating new ones instead of altering existing ones. If the abstraction that you come up with based on the first few examples is not suitable for new use cases, that's completely fine.
As far as generality is concerned, it's not the number of examples that matters. Once you learn to distinguish whether your duplicated code actually duplicates some piece of knowledge or just happens to resemble another piece of code by happenstance, you stop having to rely on an unreliable magic number as your limit on how many instances of duplication to allow before abstracting something. And that skill probably correlates strongly with the understanding that abstractions are not just macros that make it more convenient to repeat some chunk of code, they actually represent knowledge, and thus their primary use is not to eliminate duplication, but to encode an understanding of a problem domain into the code.
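A minimal illustration of that distinction, with made-up numbers: two values that happen to coincide today but encode different pieces of knowledge should stay separate, because merging them would couple unrelated decisions.

```python
# Incidental duplication: these two values happen to be equal today,
# but they encode different pieces of domain knowledge. Merging them
# into one SHARED_RATE constant would couple a tax change to a
# discount change. (Illustrative numbers only.)

SALES_TAX_RATE = 0.10      # set by the tax authority
LOYALTY_DISCOUNT = 0.10    # set by the marketing team

def price_with_tax(price: float) -> float:
    return price * (1 + SALES_TAX_RATE)

def price_with_discount(price: float) -> float:
    return price * (1 - LOYALTY_DISCOUNT)
```

If the tax authority raises the rate tomorrow, only one constant changes; a deduplicated version would have no safe way to express that.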
Rob Pike once said "A little copying is better than a little dependency" as part of the reasons why Go is the way it is, and the more I program, the more I understand what he meant. Repeating yourself is not a problem in itself. What matters is grasping the unit you're currently reading and possibly modifying. DRY often means adding another abstraction, but if the abstraction is not orthogonal enough you now have to consider 2 units instead of one, and you've made it worse.
I don't agree with that being clean code in the first place. It's just adhering to the DRY maxim, which is one perspective on clean code, and imho the worst one. DRY causes you to mix use cases, gluing stuff together just because it's the same code.
DRY has caused so many problems in my professional life. I'm now in the "DDD, ports & adapters" faction, and I encourage people to duplicate logic as long as the implemented use cases are isolated. That produces code that's easy to read, maintain, and test, and that is intertwined with the business logic. How often couldn't you explain business logic to stakeholders/customers because the code you write is worlds apart from how the app operates?
That's what the author also means with "My code traded the ability to change requirements for reduced duplication, and it was not a good trade." I guess.
+1 for this. Code should be laid out to express the business logic; abstractions that hinder the readability of the business logic are not worth it, they are a distraction. Yeah, it takes a while for juniors to understand when/how to use DRY.
Repetition is not a terrible evil; it has its place.
- actually write a story relevant to the lesson learned, and not make up some fable that takes 1/3 of the article before even starting to make a point
- not have any sort of popups asking me to subscribe or login
- not have any sort of annoying javascript animation that gets in the way of reading the article
I think the point of DRY is to avoid side effects. For example, if you repeat the same code in 10 places and in one instance you decide to change the logic, you now have 9 places with the old logic and 1 place with different logic.
Removing duplicate code for readability and making code clean is a second-order effect in my opinion.
Times have changed since The Pragmatic Programmer [1] was written. A lot of the code we write today does not have the longevity, nor does it need the resilience, that was required in past decades. It is purely because things move very fast; engineers iterate very fast these days. A lot of the principles still hold true, but we have to pick and choose what's right for a given problem or situation.
That pragmatic book is very keen on code generators: distill the information into some variant of ASCII tables and then derive all the redundant forms from it as a compilation step.
I think that's the right approach. I'm totally up for generating C++ from JSON if that's the available data representation.
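A toy sketch of that table-driven style in Python (the field names and the accessor shape are invented): the table is the single source of truth, and the repetitive code is derived from it.

```python
# Minimal code-generation sketch: the "source of truth" is a small
# data table, and the redundant boilerplate (here, accessor
# functions) is derived from it instead of being hand-maintained.
# Field names are made up for illustration.

FIELDS = [
    ("name", "str"),
    ("age", "int"),
    ("email", "str"),
]

def generate_accessors(fields):
    lines = []
    for name, typ in fields:
        lines.append(f"def get_{name}(record) -> {typ}:")
        lines.append(f"    return record['{name}']")
        lines.append("")
    return "\n".join(lines)

source = generate_accessors(FIELDS)
namespace = {}
exec(source, namespace)  # in a real build this would be written to a file

print(namespace["get_age"]({"age": 42}))  # prints 42
```

Adding a field means editing one row of the table; every derived form picks it up at the next build.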
So far, essentially everyone I've worked with has hated code generation in all forms. In-language mechanisms may or may not be acceptable (C++ templates are more likely to be OK than C macros). Outside-language mechanisms - Python writing source code during the build, and similar - are very hard to get past review.
I'm curious whether that aligns with other people's experience.
We have a lot of customer integrations, and often the source system is one we've integrated with before. So we usually copy the whole integration and make some tweaks. Invariably our customers want some data massaging and custom logic, and by duplicating we can't screw up another customer's integration when we make changes.
Of course we do have some code in shared libraries, like parsing dates and numbers, cleaning mixed utf-8/latin1 input and such.
Of course, that does mean we might have to change many integrations if something fundamental changes. But overall it has worked very well for us.
But surely this should already be inherently known? I have never met anyone who stuck dogmatically to "rules" from things like Code Complete, Clean Code, etc., no matter how much they espouse them. Very simple examples:
- 1 assertion per test
- Exceptions instead of return codes
- CQRS
etc
All of these are good guidelines in some contexts. I cannot fathom why anyone would think they apply to every situation...
Guy managed to bloat a simple mobile app to 180kloc. Every change we wanted to make after inheriting the codebase required changing abstractions in >10 places and all their corresponding unit and integration tests. I'm talking changes like adding an additional value to a dataset. Everything was abstracted as far as humanly possible.
He was a very proud clean code aficionado and would not let anyone from his old team dissuade him. He would also regularly scold his somewhat more junior co-worker when he did not follow his 7 layer abstractions for services, providers, surfaces, use cases and bindings.
But alas, I don't want to say you're wrong but there are people that would do well with this simple piece of advice.
>Clean code is not a goal. It’s an attempt to make some sense out of the immense complexity of systems we’re dealing with. It’s a defense mechanism when you’re not yet sure how a change would affect the codebase but you need guidance in a sea of unknowns.
Vomit. Guy picks bad abstraction, goes rogue and refactors a coworker's change, gets told off by the boss, and then writes a blog post with lofty garbage like the above as though he's an authority on clean code. "Brilliant" jerk vibes. Pass.
I'm sorry? I also disagree with the author's conclusion, but the article is definitely giving me less "jerk vibes" than this reply.
The lesson learned in the article should be "talk to your co-worker on why they solved a problem a different way". It should be about communication, which this reply does not shine in either.
Then you don't lose the ability to quickly override things when there are special cases or requirements change, but you can keep the overall code repetition down a bit.
DRY might be the bane of my existence as an engineer. The vast majority of developers I meet who are self-proclaimed "seniors" are extremely dogmatic. They've had roles where it's been drilled into them that {{engineeringPractice}} is god-tier engineering and should be applied indiscriminately to everything you write.
Being a great engineer is about being able to be pragmatic, learn the rules, then learn where you can break the rules.
DRY is like the entry-level dogmatic practice everyone adopts, and it leads to some god-awful abstractions. It's really hard to explain why having un-DRY code is OK, because it's so unintuitive to the "initiated".
The problem with these kinds of articles is that they disprove an idea but leave you without any concrete guidance on what you should do instead. Yes, abstractions could become obsolete because we didn't foresee the future (rightfully so), but what would you do instead? At any point in time, you have a limited amount of information and you should come up with the least stupid code that you can. Overdoing abstractions almost always leads to this terrible result, but you can't just call it "goodbye clean code", this is more of a marketing post than an actual useful informative one.
The long answer is a bit complicated, but the short answer is you should duplicate code.
It is far less work to dedupe WET code and abstract later, once you truly understand the correct abstraction, than it is to try to untangle the wrong abstraction later.
Or as Sandi Metz would say, "Duplication is far cheaper than the wrong abstraction."
Does every post need to propose a new One True Way?
Letting go of the Old Way is just as valuable without having a preferred replacement.
It's similar to the bell curve meme about any niche/skill. A beginner only focuses on the fundamentals, a "mid-level" focuses on all the best tools & methodologies (to the point of overcomplicating things), and the master only focuses on the fundamentals.
This experience has zero to do with clean code. The title is very much a misnomer and incredibly misleading.
This is an elaboration on the experience of abhorrent engineering culture at this company. From people who can't professionally speak with one another to a manager who cannot seem to effectively solve the core issues underlying the problems at the company. Instead of a moment of reflection on HOW to improve the root cause of the problem, it feels like they're kicking that can down the road.
Two things I'd like to point out:
1. It's nice to see some form of humbling commentary from the author; albeit, I don't think it targets the right solution to the problem.
2. I still think there's a lesson here for any org to digest, and that is that communication is such a vital and integral part of "a rising tide lifts all boats", so to speak.
I don't care about clean code. I care about readable code, easy to change code, efficient code. Of course, I applaud the virtues of OOP, SOLID, design patterns if I sense an interviewer is an Uncle Bob disciple. But I try to steer clear of Uncle Bob disciples, as I don't like living in the kingdom of nouns. [0]
It went from one extreme to the other. It's not about the cleanest code or its opposite. You have a gradient of options in how you code.
Here is another way to present the issue discussed by the author:
Every level of indirection you add will have a cost for the next developer that needs to extend or maintain it.
If, for whatever reason (making code "cleaner", for example), you didn't do it gracefully/welcoming/inviting-to-change/self-documented enough, then you increased costs, which is the same as being aggressive toward business margins.
So what's a more interesting conclusion?
Get over childish reactions like saying "goodbye clean code" and instead embrace improved discernment about when and how to introduce it.
> My code traded the ability to change requirements for reduced duplication, and it was not a good trade.
This is the one that really clinches it for me.
I think it's important to be aware that "clean code" standards came out of the early-mid 2000s contract development community. "Maintainability" means something different in that context, for two killer reasons. First, you're very likely to walk away from the project after a fairly short time period. Maybe a couple months, maybe a couple years. In 5 years, though? It's almost certainly someone else's problem. Second, you probably bill by the hour, or something like it. So decisions that make the code resistant to future change aren't really a bad thing, because, while they do make that hypothetical future maintenance work more expensive, you also get to charge more money to cover those costs.
I'm not trying to cast shade on contract development shops here. Business is business. But I do think that those of us who are developing software for ourselves might be operating under a somewhat different incentive structure.
The changing part is not what's time consuming; it's having to fix the same bug over and over because you didn't realize the same function was copy-pasted in 7 different files. It really gets worse when you realize this a year after the first bugfix, still dealing with a stupid bug you fixed a year prior, all because of copy-pasted code.
I'm also against DRYing up unnecessarily, but if the code is a 100% match in intention and/or behavior, then the duplicate is going to be deleted.
just to challenge that, why would you not realise the function is copy pasted in this scenario?
I mean, you'd see that the function is inlined in the place that you discover the bug, so you already know that changing that function's code isn't going to fix similar bugs anywhere else right?
How do you know where the function has been copy pasted? How do you know whether it has been copy pasted in the first place?
The bug may be latent, hard to find, or express itself very differently in different places. It is in fact, a different bug if it originates from code in a different place, even if the code is largely identical.
The example in the linked blog is easy to detect, but often duplicate code exists across files, often unrelated ones. Just today I deleted several React components that literally only varied by name and were intentionally copy-pasted. I only realized because the one I edited was imported by the same file as the others.
I definitely remember having to debug an old colleague's code that did things like turn and scale 2D shapes. The code was written like the original version in the article, except even dirtier. Four entirely different code paths for turning things depending on the angle. Several tens of thousands of lines of code where almost all of the logic was repeated two or four times with minor (and difficult-to-find) differences. Additionally, different functions which needed to do the same intermediate calculations didn't use intermediate functions; the same code was just copied.
Listening to some people you'd think this code was super easy to understand and modify. It wasn't. Because it was missing abstractions.
Turns out, code has bugs. Not only finding bugs was excruciatingly painful due to every function having 10 mutating variables on average. Actually fixing bugs was an exercise in futility because it was impossible to tell whether the same buggy logic wasn't copy pasted somewhere else in the code. And when you found one, you had to understand all of the specifics of that other place to make sure you didn't break anything.
Abstractions make things easier to understand by allowing the reader to care only about small parts of the logic at a time. The best abstraction is when you read the name of something and decide not to look into it because the information you need is already clear from the name and usage. What repetitive, imperative golang-style "simple" code lacks is the ability for the reader to not read it. My understanding of code is at a high level of abstraction, preferably as high as is relevant to my task. I'd much rather read code that says sum filter predicate than decipher a for loop and then have to figure out that it does indeed do a sum over a filter. Going this low in abstraction is only useful for intro-level programming students.
Then again, golang is clearly aimed at being not a low level programming language, but a language for programmers at a low level of programming.
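The contrast the parent is drawing, sketched in Python for brevity:

```python
# Both versions compute the same total; the declarative one tells the
# reader "sum of the even numbers" without making them simulate the loop.

nums = [1, 2, 3, 4, 5, 6]

# Imperative: the reader must step through the loop to see what it does.
total = 0
for n in nums:
    if n % 2 == 0:
        total += n

# Declarative: a sum over a filter, readable at a glance.
total2 = sum(n for n in nums if n % 2 == 0)

assert total == total2 == 12
```

The imperative form is not wrong, but it forces every reader to re-derive the intent that the declarative form states outright.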
The code from the article is at some point going to need to be abstracted probably. Unless it is 'done'. As they add much more to it and they will be in the exact situation you are talking about. That is a tough call sometimes when to just bash it out and move on or DRY it up. I personally would toss a comment in there for saying exactly as much. That is going to be a future mess (but not yet).
That whole article, though, I see as a total communication breakdown between the two devs and the manager. That the other dev just pressed the "management chain button" is not a good sign of a healthy working environment. Also, the new version did not seem much better than the previous one. The trust between the developers is broken, the manager did not help, and the dev who wrote the blog post will now be shy about doing things.
> Rewriting your teammate’s code without a discussion is a huge blow to your ability to effectively collaborate on a codebase together.
It's a mistake to believe that the code belongs to any one teammate. Code in a shared project is the project's code, not any individual's code.
Folks stuck on the idea that any given part of a project is "their" code would do well to look up the term "egoless programming" and start applying it to their work.
The point is: how should such code grow? If it's meant to grow on its own, separately under different requirements, then it's not duplication. It's just code that looks the same, for now.
Everything comes with a price. The Clean Code book basically argues one side: what you can gain from such practices. It's not gospel; it's something that argues the benefits of certain practices. As a developer, you are probably better placed to judge the trade-offs for your project. Do what makes sense.
You may default to writing "messy" code, or to writing "clean" code, before you find the middle ground that makes sense, and that's okay. It's a journey, not a destination. It takes experience to make good trade-offs in your daily coding.
Not only is the resulting abstraction harder to adapt, but the abstraction itself is badly chosen.
The first and foremost thing you want to refactor is the repetitive set of arguments. As it stands, all those arguments will come from interacting with handles, so you can reasonably expect that they are deeply coupled with handle changes (for example, it may allow for fixed 1:1 aspect ratio in the future). So they should go to its own object or even class, so that some duplicates can go into its methods while allowing for individual changes. And then you can decide whether you want to refactor more or stop there. Some refactoring changes are clearly beneficial in virtually all counts, others less so, and mixing them is not a good strategy.
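A sketch of that first refactoring step (names like `Box` and the handle functions are illustrative, not the article's actual code): the repeated argument list becomes one object, genuinely shared math becomes a method on it, and each handle keeps its own logic free to diverge.

```python
from dataclasses import dataclass

# The repetitive set of geometry arguments, grouped into one object.
@dataclass
class Box:
    left: float
    top: float
    width: float
    height: float

    def scale(self, fx: float, fy: float) -> "Box":
        # Shared math lives here; individual handles can still
        # override behavior without touching the others.
        return Box(self.left, self.top, self.width * fx, self.height * fy)

# Each handle stays its own function, so per-handle special cases
# (e.g. a future fixed aspect ratio) don't ripple everywhere.
def resize_right_handle(box: Box, dx: float) -> Box:
    return Box(box.left, box.top, box.width + dx, box.height)

def resize_bottom_handle(box: Box, dy: float) -> Box:
    return Box(box.left, box.top, box.width, box.height + dy)
```

This is the "clearly beneficial" kind of refactor the comment describes: it removes argument-list repetition without forcing all handles through one shared code path.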
Yes and no. You need the right balance and the ability to make the right choices. As in for everything.
If you are implementing something trivial, duplication is often the right choice.
I've worked on complex frontend applications (including graphic editors, social networks) and never got to a point where duplication hindered my understanding of the codebase.
The problem I've seen (compounded by React zealots, incidentally!) is abstraction bloat. Think flux and redux - and boy, that complicates the codebase massively.
Once we started using React + Redux, our productivity dropped massively, but we blamed it on the new tech and we were "cargo culting" hard. This has improved nowadays as we don't use Redux anymore, but we're still producing a very large and complex application for the set of features we have.
On the other side, I worked on complex backend business logic where people didn't even know what clean code is (think massive payment system) and it was a complete nightmare to try to understand what was going on. When everything you read is incredibly detailed, you can't keep everything in your head, you need to group it conceptually. You can do it in code, or in your head.
It's the same reason Go was criticised for not having Generics: without abstraction you'll turn a 100 lines app into 500, making it way harder to follow and fit into your memory (and I'm not talking about RAM).
The other issue in the clean code debate is that what's understandable changes from person to person, depending on coding and reasoning capability.
I've written this pattern 10s of times now. There's a lot of ways to get it wrong according to my normal[0] but going and doing something weird in the "boring" interaction handler code usually burns me. I'd argue for "cleaner" code here.
Every code base has its tradeoffs, though. But I don't think the conclusion should be "goodbye, clean code". Clean code is a goal for maintainable code: code bases you expect to last >1 year. If your goal is to iterate fast, then you shouldn't be looking for maintainable code. And that's perfectly fine! Just don't be surprised when your <1 month code lives >1 year and you're sad.
[0]: My "normal" is to abstract as much of the math as possible into math-only objects (working with "Box", "Rect", etc. type math classes), which makes rigorous unit testing easy. Write all the math _once_ in an update handler and call it from start/move/end/whatever depending on the context and how much info you have. It's then easy to draw all the graphics based on your math objects, which are now up to date, however that needs to be done.
I even think this conflates 'clean' with 'duplication'.
The first code is clearly 'clean' conceptually, although there's plenty of duplication, it's very clear what's going on, it's 'flat'.
The re-worked code with an abstraction is more complex, even though it's 'tighter' in some way, it's suddenly multi-leveled, and so less clean conceptually.
I think we should stop stretching the word "clean" to encompass everything we approve and disapprove of in code. Your discussion of the underlying issues is perfect - the author's coding dilemma is a classic tradeoff between the complexity of repetition versus the complexity of abstraction. "Clean", if it means anything at all, is not the dimension on which these options differ.
Clean code explicitly addresses situations like this: the code "seems" identical for the time being, but it's not in the long term, because it applies to different objects, that will evolve independently. This is in the book, so I am not sure why they're saying "bye Clean Code", it should be "I should read Clean Code again".
I think I have fallen into the same trap quite often as well. And then I ended up with exactly the problems described, i.e. adding further special cases became really complicated and convoluted within my abstraction.
But I really find this difficult. How do you decide which abstraction is OK and which is not? Since I made this mistake a few times, I now sometimes tend to think a lot about this, which slows me down a lot, and in the end I'm still not really sure whether I made the right decisions.
I think it's so specific to a given codebase that it's hard to give advice that generalizes well.
I have the same feelings as you here, I often feel a little lost trying to decide how much duplication is appropriate in a given case.
I try not to overthink it in the difficult cases. There are many other places in code where it's more obvious, because you can see the history of how people worked with the code, the frequency of changes, the edge cases added to it, etc.
Often it's also just about the readability of the code; I tend to go by the "how confusing will it be for the next person if I deduplicate this code in a smart way" metric.
Related to this: the worst part of this whole “Clean code” mindset is the idea that having comments in the code is somehow harmful, because “if there is a comment, the code is not simple enough”.
Currently in my second workplace that belongs to that “school of thought”.
If you have a block of code and feel that the block warrants an explanation, that block of code should receive a name instead. A block of code with a name. Behold: a function. Abstraction! Scary!
I do think things like types and functions should be documented, and on rare occasions there are tricks which can only reasonably be explained using code comments, but those are extremely rare.
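A tiny before/after of the "a block with a name is a function" idea from the parent comment (the example itself is made up):

```python
# Before: a comment explains the block.
def total_due_v1(items):
    # sum the prices of the taxable items and apply 10% tax
    total = 0.0
    for item in items:
        if item["taxable"]:
            total += item["price"]
    return total * 1.10

# After: the explanation becomes a name.
def taxable_subtotal(items):
    return sum(item["price"] for item in items if item["taxable"])

def total_due_v2(items):
    return taxable_subtotal(items) * 1.10
```

Whether the extraction is worth it depends on the block; the point of contention in the thread is treating it as a rule rather than an option.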
Yes you’re right, but sometimes in interpreted languages it’s difficult to understand what the function does. Python resolves this problem via type-hinting, Ruby relies on comments (although there are some approaches like sorbet).
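For example, a Python type hint lets the signature carry information that a Ruby version would typically put in a comment (the function itself is invented for illustration):

```python
# The annotations document the shapes a comment would otherwise describe:
# takes a list of "name:price" strings, returns a name -> price mapping.
def parse_prices(raw: list[str]) -> dict[str, float]:
    """Map 'name:price' strings to {name: price}."""
    result: dict[str, float] = {}
    for entry in raw:
        name, price = entry.split(":")
        result[name] = float(price)
    return result

print(parse_prices(["apple:1.5", "pear:2.0"]))
# prints {'apple': 1.5, 'pear': 2.0}
```

(The builtin-generic syntax `list[str]` requires Python 3.9+.)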
As a freelancer and consultant, I've seen this drive or urge to simplify coding taken to the extreme, becoming pointless or even very damaging at a larger scale.
An idea that sometimes pops up among ambitious people is usually centered around making programming easier or more foolproof by developing a platform that would supposedly make it easier and safer to build software systems within an enterprise. What these people seldom realize is all the effort needed to build a good platform from scratch versus finding and contributing to a suitable OSS project.
This is what has happened at a client I'm working for now. An architect thought it would be a great idea to develop such a platform with multiple components: a runtime, a visual designer to create flow diagrams that are finally transformed into code, integrations with databases, and lots of other features based on code generation from configuration files.
After some years of development, this platform has only managed to create hurdles for the only project that has tried to adopt it. The documentation is outdated and lacking, the extensibility is lacking, the performance is not great, there is one consultant who is still developing it and charges a fortune, and other teams avoid it like the plague. While this has been going on, other great open source alternatives have been developed or matured.
DevOps/Agile always has a serious set of sustainability trade-offs, but in the beginning tends to show relatively quick progress. A great model for service companies billing by the hour for a limited scope product-iteration lifecycle, but a poor choice for building reusable infrastructure assets. =)
If you are good... then you will automatically leave a trail of commented stubs, design-pattern templates, and working examples... that set a project's trajectory implicitly, thus avoiding having to explain choices to people who can't or won't understand (good frameworks can help in extreme cases of stupid).
People often entrench themselves in bad ideas/habits. Training peers can be challenging, as sometimes one must subtly repeat the same evidenced concept everyday in a different way for a month till a core simple concept registers (Neurodivergent managers can communicate in this manner even over trivial choices.) Being a good communicator is important when a problem domain shifts into the theoretical (i.e. 0 search results for the Jr.), and learning not to sound disrespectful/irritating takes practice.
At some point, a team needs to cut their losses and confine the weak/obstinate members to an area that matches their skill/will level. This is often the first step in getting laggards to quit on their own, or making do with HR biases.
If someone is interested in further contemplation around this, I highly recommend the book "A Philosophy of Software Design" by John Ousterhout. In my opinion, it gives good reasoning for why clean code is sometimes praised and sometimes criticised as it is here. I highly agree with Ousterhout that software complexity is mostly caused by obscurity and (cognitive) dependencies within abstractions. With that in mind, it's easier to see why Dan failed to improve the code here.
Had a coworker and I collaborate on code during the day to have me come back the next day to it all having been rewritten. I pointed this out to my manager who did nothing. I asked to be moved to another project as I figured
1. This person was very passionate about the project but wasn’t good with feedback all the time. I stayed available for sometimes pointless code reviews. The project succeeded anyway. And the code is actually still pretty good today to be honest. Sometimes a difference of opinion is just a mismatch of objectives
2. I felt if management didn’t want to help with this situation I had to protect myself and my own self esteem, and sense of contribution to the team.
It’s weird how much “feelings” can and honestly should come into play in a technical role.
Was this the best outcome? I don’t think so. I think each of us could have stuck it out on the project working together longer. But it did not feel like we had management support to do so, and everyone took “the easy way out”. The good news is this didn’t create a ton of technical debt, and I still have a high opinion of the other developer to this day.
The problem here is not understanding which are the ends and which the means (I assume from other things in the essay that the author has since figured this out).
I see this problem a lot in people who memorize but don't understand the GoF book (Patterns) so you see huge stacks of singleton envelope factories.
The point of not repeating yourself is so you can fix a bug in a single place, and understand common behavior. As it happens it can also reduce code size, not that that's a big deal for most people these days.
In this case of seemingly duplicated code the different implementations were not really common, at that early stage only coincidentally so, and it was known that they would eventually diverge. So even if DRY is desirable (generally it is) this case was one of premature optimization.
Patterns and rules of thumb like DRY should not become fetishes; they are conventions to be picked up when they make life easier and communicate something useful.
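The "fix a bug in a single place" payoff can be shown in miniature (a hypothetical sketch; the function names and the VAT rate are made up): with copy-pasted math every copy must be hunted down, while a shared definition needs exactly one correction.

```javascript
// Duplicated: the same rate is baked into every copy, so a correction
// (or a rate change) has to be applied in each place separately.
function priceWithVatA(net) { return net * 1.19; }
function priceWithVatB(net) { return net * 1.19; }

// DRY: one definition, one place to fix the rate or the rounding.
const VAT_RATE = 0.19;
function priceWithVat(net) { return net * (1 + VAT_RATE); }
```

The point isn't the few lines saved; it's that the duplicated versions can silently diverge.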
The epitome of "no duplication's" failure is CMake. It substitutes two duplicated build systems with a crudely unified version of both. What were previously duplicated lists (tedious to maintain) become an algorithm (difficult to maintain). Add to it the plethora of other build systems and you really have something that has failed its unifying mission.
From the article: "When we don’t feel confident in our code, it is tempting to attach our sense of self-worth and professional pride to something that can be measured. A set of strict lint rules, a naming schema, a file structure, a lack of duplication."
I feel a little triggered. Maybe I'm not feeling confident? No. I've dealt with lots of other people's code, and I'm really not bothered by repetition, or even by overuse of abstractions. But. Strict lint rules, a naming scheme, and a file structure (in other words, basic coding standards) are, in my opinion, the minimal foundation of any professional software development project. Looking at it this way, clean code is code that consistently adheres to a given coding standard.
Abstracting code is always a dangerous gamble because abstractions rely on invariants. In this case: “the shapes all resize the same way”.
When the invariants break, the abstraction collapses and the code can become much more convoluted than it was originally. I’ve seen it many times.
In my career I’ve seen many many bad code abstractions and very few good ones. As measured by how long before they break.
I’ve asked the engineers that came up with those if there was a trick to it, and the answer has always been “dude it’s the 10th time in my career I’m writing that stuff”.
Good abstractions come from domain experience. If you’re writing something new, don’t abstract it. If you feel smart about it, that’s a bad sign. You don’t feel smart when you’re writing something for the 10th time.
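A minimal sketch of that collapse (all names hypothetical): a shared helper built on the invariant "all shapes resize the same way" works until one shape doesn't, at which point the abstraction starts sprouting special cases.

```javascript
// Version 1: the invariant "every shape scales on both axes" holds.
function resize(shape, factor) {
  return { ...shape, width: shape.width * factor, height: shape.height * factor };
}

// Version 2: a header bar only stretches horizontally. The single
// abstraction now carries a special case, and each new exception adds one.
function resizeV2(shape, factor) {
  if (shape.kind === "header") {
    return { ...shape, width: shape.width * factor }; // height is fixed
  }
  return { ...shape, width: shape.width * factor, height: shape.height * factor };
}
```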
When I first heard Dan Abramov talk about this I also recoiled a bit and was a bit offended. But the takeaway isn’t that abstraction is bad… it’s that premature abstraction is bad. In fact, just because something can be simplified does not mean the abstraction or simplification fits. Just because you can draw a circle around a common set of things does not mean that set is semantically significant. For example, all humans have arms and legs, and you can make an abstraction encompassing those features, but it’s an incomplete abstraction if you miss the head or other features characteristic of an actual human. Before a domain is concrete, a premature abstraction actually adds to technical debt.
This resonates strongly with me because I just had to nuke an abstraction invented by my colleague, which solved a small DRY issue but introduced a big change-difficulty issue.
All we had to do was some repetitive work on the values of a dictionary (stringify and lowercase). We ended up with an abstraction of a dictionary with smart value-conversion behaviours, which brought pain every time the business wanted some added custom behaviour (e.g. don't lowercase this property, make that property human-readable, etc.). Younger me would keep piling complexity onto this abstraction. Modern me just duplicated some `str(..).lower()` calls, removed the whole thing, and went home happy.
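Rendered in JavaScript (the commenter's example is Python-ish; all names here are hypothetical), the trade-off looks roughly like this: the "smart" converter grows a flag for every business exception, while plain duplication lets each call site say exactly what it means.

```javascript
// The abstraction: a converter that accumulates an option per exception.
function convertValues(obj, opts = {}) {
  const out = {};
  for (const [key, value] of Object.entries(obj)) {
    let v = String(value);
    if (!(opts.skipLowercase || []).includes(key)) v = v.toLowerCase();
    out[key] = v; // ...and tomorrow: humanReadable, keepDashes, ...
  }
  return out;
}

// The duplication: a couple of repeated calls, trivially special-cased.
const record = { Name: "Alice", Country: "DE" };
const plain = {
  name: String(record.Name).toLowerCase(),
  country: String(record.Country), // stays uppercase; no flag needed
};
```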
In this particular problem, I see that there's a certain threshold where replacing duplication with abstraction really pays off. But everyone draws that line differently, and no one really has a definitive answer. So we have two choices:
1. Proactively refactor the code. This makes it a future prediction problem, which is never easy.
2. Keep it as is and refactor when it really becomes a pain. At that point you have to do much more work than if you had made the right prediction up front.
Many hardcore engineers (including me) prefer option 1 since it's more fun, but in my experience, we're usually not great at making that bet.
This article always reminds me of this excerpt from re-frame’s docs [0]:
> Now, you think and design abstractly for a living, and that repetition will feel uncomfortable. It will call to you like a Siren: "refaaaaactoooor meeeee". "Maaaake it DRYYYY". So here's my tip: tie yourself to the mast and sail on. That repetition is good. It is serving a purpose. Just sail on.
The term 'clean' is ambiguous. I aim for 'simple' code. That means as few lines of code as necessary, as few abstractions as necessary, as few files as necessary. I have a rule of thumb that every abstraction I 'invent' should be relatively easy to explain to a non-technical person. So far in my 12 year career, I don't think I've ever had to invent an abstraction that was so complex that I couldn't explain it to a moderately intelligent business person and I've worked on some very complex systems.
The developer cycle: Small feature-> Coder -> Architect -> Maintenance.
Every line is a potential bug. Fewer lines, fewer bugs.
Not all bugs are created equal.
If one studies how information is carried in living organisms, whose maxim is to keep information surviving through time, one sees duplication. Lots of it.
Aging is the accumulation of duplicated, incomplete, or incorrect functionality. It is normal, and it decreases the level of effort going forward.
All projects age. Some have a productive life and gracefully die, leaving a beautiful garden behind; some can't die and turn into tumors.
How is this positive?
Respect the limitations imposed by a framework or structure instead of thinking about how it could be improved?
Conciseness is a value if it helps the communication and understanding, but it’s not an absolute value.
And that is true for code, as the post explains quite well.
A hard lesson learned the hard way ;) Abstracting and "clean coding" around seemingly similar, abstractable fresh code (or even existing code!) without knowing the details that led to it (agreed, often not the best craftsmanship either, but anyway) or the near future is often the root of the big tangled mess to come.
Also really, not just refactoring someones code without talking, but just pushing? Not even a PR? Imagine a workplace where everybody does this the whole time, lol. The boss asked for a revert for many good reasons ;)
I think most deduplication issues come from (1) premature optimization and (2) doing "technical" instead of "semantic" deduplication.
(1) When you pull things out too early, you might miss the big picture and constrain yourself in the long run.
(2) Sometimes things look similar, but aren't. By deduplicating without taking semantics into account, you can end up with a nice-looking function at the definition site that looks too abstract at the call site.
Both of these issues give you something like "createHandle".
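A hedged sketch of what such "technical" deduplication tends to look like (names hypothetical, echoing the article's createHandle): the definition is tidy, but the call site no longer says which handle does what.

```javascript
// Technically deduplicated: one factory, parameterized on sign flips.
function createHandle(dx, dy) {
  return (box, delta) => ({
    width: box.width + dx * delta.x,
    height: box.height + dy * delta.y,
  });
}

// At the call site the semantics are invisible: which corner is (-1, 1)?
const handles = [createHandle(1, 1), createHandle(-1, 1), createHandle(1, -1)];

// "Semantic" version: duplicated math, but the name carries the meaning.
function resizeFromBottomRight(box, delta) {
  return { width: box.width + delta.x, height: box.height + delta.y };
}
```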
Clean code is bullshit. Don't get me wrong: I like to have beautiful and clean code very much. But code has to be maintainable, performant, reliable, communicative, adaptive and all kind of other things before it has to be clean.
If your clean code hurts those, you made the code worse.
Sometimes you will just have to accept that things take the number of lines they do, and having them duplicated in different places is a feature, as you (or someone else) might have the reasonable desire to change the individual parts.
Whenever your sense of aesthetics overrides practical concerns you should pause and take a step back.
> Sometimes you will just have to accept that things take the number of lines they do, and having them duplicated in different places is a feature, as you (or someone else) might have the reasonable desire to change the individual parts.
The trick is always to figure out whether you are in that case or whether you are in the case where someone will want to change the shared logic everywhere at once. In my experience, the latter is vastly more common, and this is what DRY optimizes for.
That is true, but this is not the case in the example.
DRY is often a good choice, but you can also use it to make logic shared that very likely never should have been shared. Untying that knot later on can become very hard compared to code where someone just had repeated themselves.
So no DRY is better than wrong DRY, but well-applied DRY is of course better still (depending on what we are optimizing for).
I think the main thing you’ve got to ask yourself when abstracting duplicated code is: are these things very similar because they are intrinsically linked, or is it slightly coincidental? I.e., if we ever have to change B, will we always want to change A in the same way?
If the answer is that you’re not sure, it’s probably better to duplicate, and see what the case is when it needs to change again - and then decide whether to abstract or not.
The main thing you have to ask yourself is: will these duplicates ever need to be special-cased, and how hard will that be? Will someone have to write monolithic handling for that case from scratch, or are there ways of hooking or overriding individual behavioral choices?
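One hedged way to keep shared logic hookable rather than monolithic (all names hypothetical): accept per-call overrides, so a future special case replaces a single step instead of forking the whole function.

```javascript
// Shared resize pipeline with individually overridable steps.
const defaultSteps = {
  scale: (shape, f) => ({ ...shape, width: shape.width * f, height: shape.height * f }),
  clamp: (shape) => ({ ...shape, width: Math.max(1, shape.width) }),
};

function resizeShape(shape, factor, overrides = {}) {
  const steps = { ...defaultSteps, ...overrides }; // per-call hook points
  return steps.clamp(steps.scale(shape, factor));
}

// A special case overrides one step instead of forking the whole handler:
const lockedHeight = {
  scale: (shape, f) => ({ ...shape, width: shape.width * f }),
};
```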
Some people are incapable of understanding context and trade-offs. Every rule must be either “always” or “never”.
Always use inheritance! Oh, it caused fragile code in this instance. Lesson learned: never ever use inheritance! And so forth.
In this case, the author is confusing de-duplication with creating a particular abstraction, and when the abstraction turns out to be a bad fit for the problem, they decide code duplication is actually fine. This is the wrong lesson.
I've been doing this a long time and one of the things I've learned is the more abstract the code, the more likely it is a changing business need will completely break the abstraction in unforeseeable ways.
Good code is easy to change.
Frankly, after years of falling heavily down the OO hole, I've come around to see procedural with as pure of functions as reasonable as probably the easiest to maintain long term.
I found that the best use case for classes is dependency injection. Have a constructor, inject your dependencies, then use them in one or more functions. However, you can do the same with functions:
function new_foo(dependency) {
  return function(bar) {
    // The injected dependency is available here via the closure,
    // e.g. (assuming it is callable): return dependency(bar);
  };
}
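For instance (names and the formatting dependency are hypothetical), the closure flavour of dependency injection reads:

```javascript
// Inject a formatting dependency once; get back a function that uses it.
function makeGreeter(format) {
  return function (name) {
    return format(`Hello, ${name}`);
  };
}

// Swapping the dependency swaps the behaviour, no class required.
const shout = makeGreeter((s) => s.toUpperCase());
```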
> Each shape (such as a rectangle or an oval) had a different set of handles,
Yet, that's not what the code says. The objects have multiple methods with similar signatures representing handling.
The right thing here is to make the handles objects. The objects know what their parent is and send it messages: "I have moved". There is one method in the shape which is responsible for handle-initiated movement. The object looks at which handle has moved (it's an argument), and at other state like the Shift constraint, and adjusts its representation accordingly.
The behavior of how to respond to which handles should still be the responsibility of the shape. Commonality there could be factored out into some common functions such that if you need to special-case something, the in that shape you stop using the common functions entirely, or use them as a fallback for whatever behavior remains that is common.
Multiple dispatch is made for this; the movement behavior is a combination which shape and which handle. With a precise specialization of a multiply-dispatched method, you can code behavior for an exact shape and handle, overriding any less specific definition.
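JavaScript has no built-in multiple dispatch, but the idea can be sketched (all names hypothetical) with a table keyed on (shape kind, handle kind), falling back to less specific entries when no exact specialization exists:

```javascript
// Dispatch table keyed on "shapeKind:handleKind".
const moveBehaviors = new Map([
  // Generic fallback: applies when nothing more specific matches.
  ["*:*", (shape, dx, dy) => ({ ...shape, w: shape.w + dx, h: shape.h + dy })],
  // Exact specialization: oval corner handles keep the aspect ratio.
  ["oval:corner", (shape, dx) => {
    const f = (shape.w + dx) / shape.w;
    return { ...shape, w: shape.w * f, h: shape.h * f };
  }],
]);

function onHandleMoved(shape, handle, dx, dy) {
  const fn =
    moveBehaviors.get(`${shape.kind}:${handle}`) ?? // most specific first
    moveBehaviors.get(`${shape.kind}:*`) ??
    moveBehaviors.get(`*:${handle}`) ??
    moveBehaviors.get("*:*");
  return fn(shape, dx, dy);
}
```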
Read literally, the word "clean" in a coding context is most naturally interpreted as "observing a consistent style at a basic syntactic level" (e.g. camel case vs snake case), or "having a minimum of irrelevant clutter" (e.g. variables and functions that are defined but never used). It's about the sort of things that a linter looks for.
Unfortunately, "clean" has been stretched and concept-creeped to the point of meaninglessness in software engineering, especially by the book of the same name. Now, "clean" has become an invitation to enforce ideology on complex questions we don't understand. Once something is declared not "clean", action must be taken to mitigate the primal fear of contamination, even though we don't really know what "clean" and "dirty" mean in a coding context.
We should stop using this vague crutch word for everything we approve and disapprove of in code, and find more accurate words to express our feelings. If we think code is repetitive, we can call it repetitive, and have a conversation about whether repetition is good or bad in this context.
> Firstly, I didn’t talk to the person who wrote it.
This might be an issue in itself sometimes. People identifying with the code they wrote. When you are part of the team, the code belongs to everyone. Everyone should be able to change it if it makes sense and all quality gates are passing.
> When you are part of the team, the code belongs to everyone. Everyone should be able to change it if it makes sense and all quality gates are passing.
Yes, but actually no. Context is important here. If that code had been written six months ago or even a couple of weeks ago? Have at it. But, the OP states "My colleague has just checked in the code that they’ve been writing all week." then "It was already late at night (I got carried away)."
If I'd spent all week on something, working out all the details, checked it in at 5pm, and then come in the next morning to discover that someone had stayed up rewriting it rather than waiting to talk to me about it first, I'd be absolutely livid.
Sure, everyone should be able to change the code, but somebody immediately refactoring new code to align with their own opinions and making it worse in the process is not OK. I would immediately revert the commit if somebody did that to me (and then tell them why, of course).
Well, obviously both, but it just isn't a very good point. By making changes to the shared code in that specific way, you are undermining and insulting the developer who is supposed to be part of your team. You can't just pretend that the team aren't humans and that social conventions and etiquette don't apply.
Context is very important. If this happens overnight and out of nowhere/excess pro-activity, it's definitely an issue.
But the reality is usually more nuanced. And saying that clean code is bad because of this particular situation is an exaggeration.
I don't know where the author learned about DRY, but I'd be shocked if it wasn't accompanied by a warning that DRY is only a general guideline, and in practice it's beneficial to allow some duplication to enhance code readability.
Working in enterprise consulting, usually with plenty of external partners, got me used to "it works" and "fulfils the expected behaviour" being good enough in most cases, even if the code is not clean.
Young engineers can be childlike sometimes in that they hear a rule and feel like the rule MUST BE ENFORCED at all times, regardless of nuance. Coding best practices are tools. The wisdom is in deploying those tools properly.
The real problem here was the overall software architecture and development workflow that apparently wasn't geared towards building 2.0-alpha versions of a component while still keeping the 1.0 version untouched.
That refactor was good. The way the colleague was handled wasn't. Repetitive menial code like the one originally submitted should not be accepted, but you shouldn't unilaterally rewrite it either.
Article is dumb. He says don't write readable code for two reasons: 1) refactoring someone else's code might hurt their feelings. 2) you might need more functionality later.
The first version was cleaner, and the new code is, obviously, not clean at all, because it requires much more energy to understand than the first one. Interestingly, the author conflates DRY code with clean code.
> My boss invited me for a one-on-one chat where they politely asked me to revert my change. I was aghast. The old code was a mess, and mine was clean!
Sounds like a weird, if not toxic, thing for the boss to do. Is the boss a developer? Then why are they the boss and not an equal? Are they in a managing position? Then why do they get involved with the developer's work? Don't they trust their own employee?
I could imagine it, maybe, if they were a business of 10 people. Or perhaps the title "boss" is wrong and they meant a senior, with the author being a junior dev or so. Otherwise it is really weird, and the boss should get out of their face and let them do their job.
Even if they can it doesn't mean they should. People managers should manage people and let engineers engineer. Even if they are former engineers themselves
The problem with the example has been discussed elsewhere: the design is wrong. Of course refactoring but keeping the design more or less the same won't work.
If the author had written thorough unit tests first, they could have then engaged in fearless refactoring, they would have been orders of magnitude less likely to have broken any functionality, and what's more, nobody else would have cared about or minded the change in code.
Write whatever dirty abomination you want. Then unit test the crap out of it. Then refactor it into the pristine shining epitome of every pattern fanboy. As long as it works, before and after, nobody will berate you for it.
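A minimal sketch of that workflow (shape kinds and names hypothetical): pin the current behaviour with assertions first, then refactor under their protection.

```javascript
// Step 1: the "dirty" version that demonstrably works.
function area(shape) {
  if (shape.kind === "rect") return shape.w * shape.h;
  if (shape.kind === "circle") return Math.PI * shape.r * shape.r;
  throw new Error("unknown shape: " + shape.kind);
}

// Step 2: pin the current behaviour before touching anything.
const pinned = [
  [{ kind: "rect", w: 3, h: 4 }, 12],
  [{ kind: "circle", r: 1 }, Math.PI],
];
for (const [input, expected] of pinned) {
  console.assert(Math.abs(area(input) - expected) < 1e-9, "regression!");
}

// Step 3: any refactor that keeps these assertions green is fair game.
```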
> As long as it works, before and after, nobody will berate you for it.
This is not the reality in most places. People care about aesthetics, and about practical issues like how maintainable the code is, even if the code is thoroughly tested.
You're wrong, your initial corrections were correct. Your boss will now have to spend 4x as much money to modify every class that ever relied on Ovals and Rectangles if/when he realizes there's another way he might want to modify them.
You didn't catch the end where they explicitly say
> Secondly, nothing is free. My code traded the ability to change requirements for reduced duplication, and it was not a good trade. For example, we later needed many special cases and behaviors for different handles on different shapes. My abstraction would have to become several times more convoluted to afford that, whereas with the original “messy” version such changes stayed easy as cake.
Clean code is easy to understand and to change. Ideally, any given file or method only deals with one concept.
Messy code usually results from someone merging concepts to avoid duplication. After that happens, differences in the concepts cause the code to become much harder to reason about.
I've seen this particular problem happen a million times. The worst code I ever stepped into was completely DRY. When, where, and why something happened wasn't something the original authors could answer. That's because every time they found code duplication, they'd abstract. And every time that didn't work, they ended up adding a little "if (this instanceof Foo) special case;" to get over the most recent bug.
I've written code for profit for about 25 years, although I started learning to code in Perl and BASIC when I was 8 years old, so, 35 years total. I have a high school degree. I dropped out of art school. I'm self-taught. I've worked on some larger teams but generally in a role where I was responsible for a final module or piece of code from start to finish. Most of my career has been working alone, planning and building full stack custom software for midsized businesses and startups that need particular functionality in their workflows that they can't find off the shelf. Much of it is designing the business logic. Most of my codebase now is TS, NodeJS and PHP, (and SQL) although up until 2016 or so I did a huge amount of work, including games, in Actionscript 3.
I feel that Clean Code and Uncle Bob are like modern TikTok influencer hustlers: feel-good, wishy-washy crap that has some amount of truth but in the end is very unproductive.
Not to mention UB was all butthurt the other day when people found out a conference he was invited to had fake speakers in its lineup. But hey, I guess it takes one to defend one.
Mmm... this is an interesting take. I have been a zealot of sorts in the past (intern/junior time), but am now coming to the same conclusions about these cargo cults.
> UB was all butthurt the other day when people found out a conference he was invited to had fake speakers in its lineup. But hey, I guess it takes one to defend one.
Firstly, what does this have to do with clean code?
Secondly, “butthurt” is not a word I feel correctly expresses his comments. From what I can tell, he advocated for a skeptical and cautious approach, and wanted due process to take its course instead of mob justice to run with the first impression, as it always does.
Thanks for the link; scroll down and you see the rebuttal by Gergely:
> Confirmed speakers at DevTernity 2021, 2022 and 2023 and JDKon 2024 do not exist. There were & are no such engineers at Coinbase or Meta.
> I shared this fact and said no conference should list fake speakers.
There is no "crime" there per se. Having a fake speaker in your lineup is a doozy, but it's different from, let's say, announcing that Bill Gates (with his picture) will be at your conference with no intention of making that happen.
Now I do think the reasons listed by the conference organizer are feeble at best, but (moderately) plausible. Still, I'd never put up a placeholder saying so-and-so is an "Engineer at Coinbase" and give them a fake name/picture.
A reasonable rebuttal. But how are his original comments therefore “butthurt”? This was what I was arguing against, remember?
You seem to be making the same misinterpretation of my comment as people in that thread; simply because some comment is not 100% supportive of one side, you think it must therefore be defending the other side. He expressed concern that mob justice was being perpetrated rashly, and people are interpreting that as support for the other side. Similarly, I was arguing against your categorization of his argument as being “butthurt”, and you immediately think that I am supporting the other side of the argument, and therefore present me with rebuttals to that argument, instead of what I actually wrote.
Everybody has different taste and opinions what "clean code" is supposed to look like (best example of this is "for-loops" vs .map()" - IMHO nested loops are usually more readable than a function chain which does the same, but other people have the opposite opinion).
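The two styles in question, side by side (data hypothetical); they produce the same result, and which reads better is largely taste:

```javascript
const shapes = [{ w: 2, h: 3 }, { w: 4, h: 5 }];

// Imperative / loop style:
const areasLoop = [];
for (const s of shapes) {
  areasLoop.push(s.w * s.h);
}

// Chained style:
const areasMap = shapes.map((s) => s.w * s.h);
```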
IMHO the core problem in the situation described in the article is that the guy simply rewrote a piece of code without first talking to the author/maintainer/owner of that piece of code.
> I was expecting this to be a rant against Uncle Bob for his weird comments around the fake women speakers at the tech conference.
It would have been very unfortunate if it were. If an article, entitled "Goodbye, Clean Code", had been written in response to Uncle Bob's twitter life, what it would have demonstrated would be that the author thinks that people with bad opinions have nothing of value to teach about their area of expertise. Too common a position these days.