I've had two experiences with unit testing recently that have made me a believer.
One of them was that I was working on a team where a programmer quit and I had to get a very complex codebase ready for production. The previous programmer was terrible, the kind of guy who had trouble making primary keys unique, where any code that could possibly have a race condition did, and so forth. The code had unit tests, however, and that made it salvageable; eventually I got the system to a place where it worked correctly and the customers loved it.
In nine months of effort on this, I ran into one refactoring where it felt the unit tests were a burden, and that involved about a day of work rewriting the tests. Unit tests are more likely to be a problem, however, when they add to the time of the build process. For instance, I wrote something in JUnit that hammered part of the system for race conditions, and this was key to fixing races in that part of the system. It fired off a thousand threads and took two minutes to run, and adding two minutes to your build is a BIG PROBLEM, particularly if anybody who wants to add two minutes to your build can do so and if anybody who wants to remove two minutes from the build is called "a complainer" and "not a team player". Overall the CPU time it takes to run is more likely to be a problem than the developer time it takes to maintain them.
As for Mockito I have found it is a great help for writing Map/Reduce jobs. As I don't own a big cluster and as I sometimes like to code on the run with my laptop, an integration test typically takes ten minutes with Amazon Elastic Map/Reduce. It takes some time to code up tests, but I get it all back with dividends because often I get jobs running with two or three integration test cycles instead of ten or twenty. When I find problems in the integration tests, usually I can reproduce them in the unit tests and solve them there.
Now, it did take considerable investment to get to the point where unit testing worked so well for me. I used to have problems where "the tests worked" but the real application didn't because Hadoop reuses Writable objects so if you just pass a List of objects to the reducer, you might get different results in a test than you do in real life. Creating an Iterable object that behaves more like Hadoop does solved that problem.
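To make that concrete, here's a minimal sketch (the class names are my own, not Hadoop's API) of an Iterable that hands back one reused mutable holder, the way Hadoop reuses Writable instances in the reducer's value iterator. A test reducer that naively stores references will fail against this, just as it would on a real cluster:

```java
import java.util.Iterator;
import java.util.List;

// Stand-in for a Hadoop Writable: a holder that is mutated in place.
class IntHolder {
    private int value;
    void set(int v) { value = v; }
    int get() { return value; }
}

// Returns the SAME holder from every next() call, mimicking Hadoop's
// object reuse in the reducer's value iterator.
class ReusingIterable implements Iterable<IntHolder> {
    private final List<Integer> values;

    ReusingIterable(List<Integer> values) {
        this.values = values;
    }

    @Override
    public Iterator<IntHolder> iterator() {
        final Iterator<Integer> inner = values.iterator();
        final IntHolder shared = new IntHolder(); // one instance, reused
        return new Iterator<IntHolder>() {
            @Override
            public boolean hasNext() { return inner.hasNext(); }

            @Override
            public IntHolder next() {
                shared.set(inner.next());
                return shared;
            }

            @Override
            public void remove() { throw new UnsupportedOperationException(); }
        };
    }
}
```

Feeding a reducer with this instead of a plain List exposes reference-holding bugs in the test, where they're cheap to fix.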
Generally if you are feeling that "unit testing sucks" or "mockito sucks" it's often the case that you're not doing it the right way.
My sense is that one should make unit tests as resilient as possible to refactoring and change. This means that as long as the public behavior of the class does not change, one should not need to make many, if any, updates to the tests.
Thus any test written in such a way that it would present an issue when refactoring should be avoided if at all possible. A simple example is directly constructing the class under test in the test method:
@Test
public void tryToAvoidDoingThis() {
MyClass myClass = new MyClass(param1, param2);
// do stuff with myClass
}
If this is done in each test method, then when the constructor parameters change, e.g. a new one is added, every constructor call in the test methods will have to be updated.
Instead, have a level of indirection and have a single method that can create a sample MyClass. Now when the parameters change, only one construction site has to be updated.
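A sketch of that indirection (MyClass here is a trivial stand-in, since the real class isn't shown): every test calls one factory method instead of the constructor directly.

```java
// Stand-in for the class under test.
class MyClass {
    private final String name;
    private final int size;

    MyClass(String name, int size) {
        this.name = name;
        this.size = size;
    }

    String describe() {
        return name + ":" + size;
    }
}

class MyClassTestSupport {
    // The single construction site. When the constructor gains a
    // parameter, only this method changes, not every test method.
    static MyClass sampleInstance() {
        return new MyClass("sample", 3);
    }
}
```

Each @Test method then starts with `MyClass myClass = MyClassTestSupport.sampleInstance();` and never mentions the constructor.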
In general, unit tests should not be testing specific / internal implementation details of the class. Rather, the tests should verify the documented public behavior of the class.
There's an inconsistency here: unit tests should depend only on the public behavior of a class; the constructor is public; constructor calls should nevertheless be avoided where possible.
Factoring out a common constructor in tests is an example of making the tests resilient against changes in the underlying code. If the constructor changes, the tests need to be fixed in one place, not in 50.
Other examples may be around a `setup` method. If the method is private, don't test it. Then you can refactor freely. If it's public, test the pre/post conditions around the method in as few places as possible (hopefully one). Even if other tests rely on the object having been "setup", just trust that it works. If the specification of `setup` changes, you just have the `setup` tests to update, not the entire object.
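A sketch of that pattern (the Account class and its contract are invented for illustration): the postconditions of `setup` are asserted in exactly one test, and every other test just calls `setup` and trusts it.

```java
// Invented example class with a public setup method.
class Account {
    private boolean initialized;
    private int balance;

    public void setup() {
        initialized = true;
        balance = 0;
    }

    public boolean isInitialized() { return initialized; }
    public int getBalance() { return balance; }
}

class AccountSetupTest {
    // The ONE place where setup's contract is verified. If the
    // specification of setup changes, only this method needs rewriting.
    static void setupEstablishesPostconditions() {
        Account account = new Account();
        account.setup();
        if (!account.isInitialized()) throw new AssertionError("setup must initialize the account");
        if (account.getBalance() != 0) throw new AssertionError("setup must zero the balance");
    }
}
```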
Like all process you have to do what works for you. First, to steal from the recent airbnb article, the bar for testing has to be so low you trip over it. The testing framework should make it easy to get down to writing tests.
Second, start by writing tests that verify bug reports, and then fix the bug. In large systems I find this critical for homing in on the exact problem. The mental exercise of crafting the test to trigger the bug helps me really understand the problem.
Finally, start new features by first writing a test that simply drives your new code's golden path. When working in a large system I find that writing a test to run my new code gives a much faster development turnaround than rebooting the entire system. This is compounded when there are many systems or moving parts, which is common today.
> Well explain further. I hate these smart arse sounding comments - "you are doing it wrong" without any indication why, or how to do it better.
With unit tests, there are certain things that must be tested, such as very high-value code contracts, and the like. There are many things that people test (like "correct output" for the input) which may not be so valuable, particularly if several possible values may be correct.
So test contracts, not internals, and not representation. And please, for heaven's sake, don't test the behavior of your dependencies.
Unfortunately I also believe those comments are generally true, and I also believe when the posters answer "why", they will give you an answer that is also doing it the wrong way.
I have no idea how to do unit testing. I only believe there is a right way.
By reading the unit tests and code comments, and comparing them against bits of the actual codebase, you can gain a better understanding of what the previous programmer was thinking and what he was trying to accomplish.
That's the most obvious benefit of having tests for me. Documentation can be outdated and even if it isn't it very rarely contains examples of use. Passing tests are for me exactly that. An up to date example of how the thing should be used and what can I expect from it.
For better or worse, this system had a number of data-centric objects; these objects passed many correct tests, but they failed to be deserializable from XML (because of the way collections were handled) as well as having other deficiencies.
The tests meant I could fix those deficiencies quickly and have faith that I fixed them correctly. Of course, I added new tests to test that the system did the things it had to do.
So... unit tests are good because, if you take over a codebase written by someone who is incompetent, and if that person wrote terrible unit tests which should have been failing (but are passing, due to said incompetence), you have slightly more information to work with when fixing the original code? Seems like a weak argument to me.
Sure, but it's hardly a reason to write unit tests.
I write unit tests for every one of my core algorithms. I have also seen a massive amount of dumb tests written by TDD people who strive for 100% code coverage. Ridiculous. It takes as much time to maintain the (often useless) tests as it does the code. Where they make sense they're indispensable.
Agreed. It's not a reason to dismiss tests though.
I think the fundamental question is why 100% code coverage is important. The fact is that it isn't. The problem with TDD is that a lot of people who do it get the idea of why you are testing entirely wrong. The goal should never be to "ensure your code functions properly." The goal should be to "ensure the code contracts are adhered to." If you test with that in mind, you will write lots of unit tests and almost never have to rewrite or delete them when fixing bugs.
Again this comes back to what I said in another comment that tests should never be written to the code. Once you get that, then code coverage ends up being meaningless and not something you want to worry about.
This is probably not best practice, but I often disable tests that take a really long time. An alternative would be to have a 'full suite' and a 'fast suite'. The fast subset could be used locally for most development, but then you run the full superset when you are ready to release. A 5 minute release is no big deal, but if it takes 5 minutes to run a standard dev test, then people are not going to test as much.
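One low-tech way to get that split without framework support (a sketch; JUnit's @Category or Assume mechanisms are the more idiomatic route) is to gate slow tests behind a system property that is off by default:

```java
class SlowTestGate {
    // Slow tests check this and skip themselves unless the build was
    // started with -Dtests.runSlow=true (e.g. on the release machine).
    static boolean runSlowTests() {
        return Boolean.getBoolean("tests.runSlow");
    }
}
```

The fast suite is what everyone gets by default; the full suite is one JVM flag away at release time.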
I also disable tests by default that require a working installation to run. This allows me to have a test suite that can be run prior to installation and a larger support test suite that can pinpoint problems on production systems.
It is OK to have one fast suite of tests that runs all the time and one slower but more detailed suite that runs only at certain checkpoints (overnight, weekly, before release).
Projects over a certain size normally do it that way.
If you've got a slow test suite, that means you've got a bad test suite. Taking tests away from it to make it faster takes away from its purpose, which is to aid you in refactoring.
Or it means you are testing something that's computationally expensive. Not everything is just web model input validation--some people are doing real work. :P
Not necessarily. I've worked on math-heavy programs where single calculations could take seconds to run. For the frequency they came up in actual use, this was not a problem, but our tests needed to run these calculations more times than any single execution of the program would likely need to.
More specifically, consider an SSE2-based function 'float32 floor(float32)'. There's only about 4 billion inputs, so why not test them all? That only takes a minute or so.
How is testing 100 inputs a unit test and testing 4 billion inputs, through exactly the same API, an integration test?
As the author points out, many people wrote libraries which are supposed to handle the entire range, but they made errors under various conditions, and some even gave wrong answers for over 20% of the possible input range.
Is 90 seconds to test a function "slow"? What about 4.5 minutes to test three functions?
If you say it's slow then either it's a bad test suite, and/or it includes integration tests. I believe that is the logic, yes?
There is no lower unit to test, so this must be a unit test.
The linked-to page shows that testing all possibilities identified flaws that normal manual test construction did not find, with several examples of poorly tested implementations. Therefore, it must be a better test suite than one built from manually selected test cases.
(Note: writing an exact test against the equivalent libc results is easier to write than selecting test cases manually, and it's easier for someone else to verify that the code is testing all possibilities than to verify that a selected set of corner cases is complete.)
Therefore, logic says that it is not a bad test suite.
Since it contains unit tests and it is not a bad test suite, therefore it must not be slow.
Therefore, 4.5 minutes to unit test these three functions is not "slow".
Therefore, acceptable unit tests may take several minutes to run.
That is what the logic says. Do you agree? If not, where is the flaw in my logic?
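The exhaustive sweep itself is simple to sketch: every float is reachable through Float.intBitsToFloat, so a plain loop over bit patterns covers any sub-range of the input space. In this sketch, myFloor is a placeholder for the function under test; it delegates to Math.floor so the example is self-checking.

```java
class ExhaustiveFloorTest {
    // Placeholder for the SSE2-style function being verified.
    static float myFloor(float x) {
        return (float) Math.floor(x);
    }

    // Compare myFloor against the reference over [fromBits, toBits).
    // Sweeping the full int range covers all ~4 billion floats.
    static long countMismatches(int fromBits, int toBits) {
        long mismatches = 0;
        for (int bits = fromBits; bits != toBits; bits++) {
            float x = Float.intBitsToFloat(bits);
            float expected = (float) Math.floor(x); // reference result
            float actual = myFloor(x);
            // NaN != NaN and 0.0f == -0.0f, so compare bit patterns instead
            // of values to catch exactly the NaN and -0.0 bugs mentioned.
            if (Float.floatToIntBits(expected) != Float.floatToIntBits(actual)) {
                mismatches++;
            }
        }
        return mismatches;
    }
}
```

Note the bitwise comparison: a value comparison would silently pass broken NaN and negative-zero handling, which is where the surveyed libraries failed.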
How can you have a good test suite without integration tests? That's not a full test suite. That's a cop-out.
A good test suite has two qualities: how comprehensive it is and how fast it runs. If either is lacking, then it is no longer a good test suite.
It's quite easy to have slow tests that aren't integration tests. For instance, there are tests in Sympy that are only a few lines of code but run very slowly because the calculation is difficult. Sometimes (but not always), it's trying to compute a very difficult integral (which is a test of integration, but not an integration test).
Or it just means you have tests which could be better optimized for speed but in fact are optimized for something else.
We had a series of tests (more towards integration tests I guess) at one point in LedgerSMB that did things like check database permissions for some semblance of sanity. These took about 10 min to run on a decent system. The reason was we stuck with functionality we could guarantee wouldn't change (information schema) which did not perform well in this case. Eventually we got tired of this and rewrote the tests against the system tables cutting it down to something quite manageable.
We had this test mixed in with db logic unit tests because it provided more information we could use to track other failures of tests (i.e. "the database is sanely set up" is a prerequisite for db unit tests).
Heavy computation algorithms. My main focus is on geospatial analysis, and to test certain things, you are going to end up with some 1000ms+ tests. Get 10 or 20 of those, and you have a problem.
> Generally if you are feeling that "unit testing sucks" or "mockito sucks" it's often the case that you're not doing it the right way.
Either that, or the person just hasn't been sufficiently burned by someone changing something they weren't aware of, then spending days tracking down a runtime error that could have been caught and fixed by a unit test in minutes.
I read the article, and much of what he speaks of is tautological unit tests - tests of something that could never have been written any other way. I've seen people unit test the behavior of filling up a collection and then test to make sure the collection has all of those elements, for instance. And there, he has a point.
But there's a dangerous line there. And while the article makes a good point that too much of that type of testing can be detrimental, I've generally found it better to err on the safer side of more tests.
On your own project, or even on a small team, you can probably get away without them much of the time. But, on larger projects, where many different developers sometimes go back and make changes to code they didn't write, it's very easy for a new guy/gal to make changes with unanticipated consequences. When that occurs without sufficient test coverage, the project will wind up spending 10-20x more man hours to fix the issue.
There's a difference between unit tests and integration tests. Individual unit tests should be on the order of a millisecond or less so you can rip through them very quickly. If you use Surefire and Maven, name integration tests with the suffix ITCase (you can override this), and then you can run either the unit tests or the integration tests with mvn test or mvn integration-test.
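For reference, a sketch of the Maven wiring described above (plugin version omitted; check the Failsafe plugin docs for your setup). Surefire runs the ordinary *Test classes during `mvn test`, while the Failsafe plugin picks up the *ITCase classes during `mvn integration-test` / `mvn verify`:

```xml
<!-- Sketch: bind integration tests (named *ITCase) to the
     integration-test phase via the Failsafe plugin. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-failsafe-plugin</artifactId>
  <configuration>
    <includes>
      <include>**/*ITCase.java</include>
    </includes>
  </configuration>
  <executions>
    <execution>
      <goals>
        <goal>integration-test</goal>
        <goal>verify</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```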
http://randomascii.wordpress.com/2014/01/27/theres-only-four... tested all 4+ billion inputs to ceil(), floor(), and round(), and pointed out that many libraries actually had high error rates, because of incorrect support for rounding, small values, NaN, and -0.0.
Each test is extremely fast. Testing all 12+ billion cases takes 4.5 minutes. Are these unit tests or integration tests?
If they are integration tests, roughly where is the boundary between the two?
Is it meaningful to distinguish between "unit" and "integration" testing based only on the amount of time they take? That is, if the unit tests take 0.1 second too long, do they suddenly become integration tests?
I think the problem is that definitions that should reflect the granularity of what is being tested have become too interconnected with assumptions about frequency of testing / time to run and it has created completely mangled definitions.
The problem is so bad that some developers I've come across have the firm opinion that mstest should only be used for "unit" tests. It is frustrating.
If you have slow unit tests, don't run them on every compile. If you have fast integration tests, run them as much as you'd like. I've personally never defined unit tests as "things that run fast"; despite that being a valuable property, it does not seem essential to the definition. But perhaps I have the wrong understanding of what a unit test is.
12B test permutations is not your typical scenario, though 4 minutes is pretty damn quick for all that. I'm asserting that for a given project module, it is beneficial to be able to run the test suite in a short time, say 10-15s. If you've got to wait minutes, then it's more of an integration suite.
The longer your unit tests take, the less likely people will be to use and run them often, which is the whole point. Let the nightly build on the CI machine exercise the long running tests when everyone is asleep.
True. Only tests that are individually very fast can cover 12B cases within a few minutes.
Really, I think the problem is that unit test frameworks are currently incapable of doing the right testing.
Unit test suites take quadratic time over the life of a project. That is, each new test is run alongside all the previous tests to get to green. At some point, a project will have enough tests that the suite can't finish in 10-15s.
One option is to mark "fast" and "slow" tests. Another is to recategorize them as "unit" vs. "functional" tests.
These are poorly-defined labels. In this case, the 4.5 minutes of testing is "slow", yes, but it only needs to be run when a specific, small part of the code changes. The problem is, there's no way to determine that automatically. The test runner can't look at the previous test execution path and see that nothing has changed, and there's no way to mark that a test should only be run if code in functions X, Y, or Z of module ABC has changed.
Humans are able to figure this out. Well, sometimes. And with lots of mistakes. Get the unit test framework to talk with a coverage analysis tool, plus some static analysis and perhaps a few annotations, and this discussion of how to distinguish one set of tests from another disappears.
Blue-sky dreams. I know. :)
In real life we toss those functions into their own library, note that the code is static, and do the full test suite only occasionally; mostly when the compiler changes.
In other words, bypass CI the same way one does any other third party library. (How often do you run the gcc test suite?)
Various test runners do just this. Maven on TeamCity ranks the tests by their volatility (recently failed first), then by run duration. The point is to run the most likely to fail and historically most brittle tests first and the slow stuff last so you can fail fast.
That still means to run all the tests each time, with re-prioritization to enrich the likelihood of faster feedback.
But if none of the code paths used for a test have changed, and the compiler hasn't changed, and there's nothing which depends on random input or timing effects, then why run those tests at all?
The reason is we don't have a good way to do that dependency analysis, which is why we run all of the tests all of the time. Or we manually partition them into "slow" and "fast" tests.
Code instrumented for coverage tells you which tests executed which portions of code. As I remember, Google's C++ build/test system was using this by late 2009 to efficiently run all tests on all checkins to HEAD.
> Individual unit tests should be on the order of a millisecond or less so you can rip through them very quickly.
So now we have to write tests for all our code, and tests that run fast. We have to refactor our code around the tests. Something seems a bit back to front here. Or is the assumption that if we have 100% test coverage that runs fast then it means that we have written the best code possible?
I think I am siding with the author of the article on this one.
I find when I have to re-factor code around the tests it means the code wasn't very good in the first place.
The author complains about re-factoring code into smaller testable functions. I completely disagree. Code structured as small easily understood functions which do one thing and have obvious inputs and outputs is good code which is much easier to extend and modify.
Yeah, that's one of the article's weaker spots. But as a rule, I'd tend to interpret imprecise statements like that charitably. He's not saying that small, clear functions are bad, but that splitting functions for the purposes of testing is counterproductive. He's not saying anything about splitting for clarity and focus.
Note that the original article doesn't disavow all unit tests; nor does it disavow all testing.
Sounds to me like the kind of tests you're describing aren't necessarily unit tests (system-level race conditions aren't typically discoverable with a simple unit test); where the tests were truly unit-level, the proposed alternative (assertions) may have been even more informative. Finally - a few real unit tests for known correct behavior are advised where in essence the algorithm can be described independently from its outcome.