How SQLite is tested

chris_wot · on Oct 5, 2012

One of my proudest moments was finding a bug in SQLite where a corrupted index caused a select statement to segfault Firefox.

I jumped through a lot of hoops to get to the point where I got a backtrace that showed me the SQL statement of a corrupted places.sqlite. I then loaded SQLite on the data file, ran the statement and reproduced the segfault. One of their lead devs then got in contact with me, grabbed the data file and fixed the issue.

I suspect that not only did my diagnosis lead to a fix for a LOT of Firefox crashes, but it stopped a lot of frustrating crashes on things like iPhones, etc :-)

I may not have done the fix, but I took the time to reproduce the problem. It felt damn good :-)

P.S. in case anyone is interested, the bug is https://bugzilla.mozilla.org/show_bug.cgi?id=581946 on Mozilla, and at SQLite it's at http://www.sqlite.org/src/ci/83395a3d24

awj · on Oct 5, 2012

Thank you for taking the time to help out with a project that nearly everyone uses, even if they don't know it.

dicroce · on Oct 5, 2012

Good Job. That is a good one. I'd be proud of that too.

joevandyk · on Oct 5, 2012

Curious: why was there no test checked in at the same time as the bug fix?

sbochins · on Oct 5, 2012

congrats on catching that bug. I use sqlite for a ton of stuff. I just used it at work a few days ago. It has so many use cases where you need a lite query language and not an rdbms.

timClicks · on Oct 7, 2012

I agree with your sentiment, but SQLite is a RDBMS. It's just very lightweight.

tedunangst · on Oct 5, 2012

Unfortunately, 100% testing is only effective if you can detect 100% of the errors generated. The only bug I found in sqlite was an off by one in the btree code that was mostly harmless, unless your memory allocator was particularly fussy.

I only found the bug, never quite understood it, and after seeing how disturbing the fix was decided some things were best left unlearned. http://www2.sqlite.org/cgi/src/fdiff?v1=fa113d624d38bcb36700...

That said, sqlite is one of the most reliable and better designed libraries I've used. Software is hard.

dbaupp · on Oct 5, 2012

100% testing is effective, just not 100% effective. And 100% testing is strictly better than 99% testing is strictly better than 50% testing (when considering only its bug detecting capabilities).

jemfinch · on Oct 5, 2012

I'm glad you appended "when considering only its bug detecting capabilities". Something like SQLite, whose functionality is largely static, definitely benefits from the additional test coverage, but the majority of code out there--code that's constantly evolving, code to which functionality is frequently added and removed--is too often prematurely cast in stone by overzealous testers, making maintenance and evolution significantly more difficult than it ought to be. In a lot of cases, too much test coverage actually reduces the value of the code.

josephlord · on Oct 5, 2012

There is a strong case that more tests allow you to make changes to the code with some confidence that there won't be unintended side effects. Now maybe SQLite's level isn't appropriate in most cases but that doesn't mean high levels of code coverage are bad. It can also give confidence to upgrade underlying frameworks or libraries

josephlord · on Oct 5, 2012

Is there any more widely deployed software in the world than SQLite? Multiple copies (browsers, language runtimes, embedded in other software) on many computers and built into most smartphones and its probably in quite a few TVs and other devices too.

Very successful invisible (to non-developers) software.

tsahyt · on Oct 5, 2012

Not just successful, it's really good too! The API is rather easy to use, it's lightweight and (for small to medium sized databases) it performs really well. Not really suprising it is so widely used.

mhd · on Oct 5, 2012

> Is there any more widely deployed software in the world than SQLite?

zlib or libjepg, maybe.

mrich · on Oct 5, 2012

libc (both GNU and BSD)

lrem · on Oct 5, 2012

A funny thing here... If you count the number of active copies, this may not be true. Usually you don't see ten apps in one system coming with their own version of libc. This may pretty well be true for sqlite.

wickedchicken · on Oct 5, 2012

The BSD networking stack.

josephlord · on Oct 6, 2012

I suspect it is less deployed than SQLite. SQLite gets into things even without network capabilities (e.g. Digital TVs even before they were internet connected for the channel data and the EPG information).

I don't know to what extent the code of the BSD stack remains in Windows but you can bet even if a raw build of Windows doesn't have SQLite embedded somewhere within it (which it might well do) that there are probably multiple software installations with it embedded somewhere.

Would you actually describe the BSD networking code as being deployed in Linux even? I know it is derived from the BSD stack but is it really still the same software.

Zlib and libjpeg as someone else suggests are good suggestions and at least in the races with SQLite.

jemfinch · on Oct 5, 2012

What I'm more interested in is how SQLite reached such substantial test coverage. What techniques or tools were used to generate the tests? Did someone (drh?) actually sit down, read through the code, and construct a test case for every branch, or did they use tools to facilitate test case construction? How can I apply the same effort to my own software (once it's ready) to improve its reliability?

antonios · on Oct 5, 2012

Probably a combination of the above. They are also writing tests for each successfully crushed bug:

Whenever a bug is reported against SQLite, that bug is not considered fixed until new test cases have been added to the TCL test suite which would exhibit the bug in an unpatched version of SQLite. Over the years, this has resulted in thousands and thousands of new tests being added to the TCL test suite. These regression tests ensure that bugs that have been fixed in the past are not reintroduced into future versions of SQLite.

npsimons · on Oct 5, 2012

This is precisely how regression tests are supposed to be built; to wit:

1. Find/identify/isolate the bug;

2. Create a test that fails if the bug is not fixed;

3. Run the test to make sure it detects the bug;

4. Fix the bug;

5. Run the test to make sure it passes now that the bug is fixed;

6. Add the test to the test suite and checkin/push the bugfix.

I don't always get to do this on my projects, but it's a good habit to get into. Having a good/easy to use test framework already setup can help a lot with this (if tests are hard to write/take too much time, they won't be written).

furyofantares · on Oct 5, 2012

I do it differently.

1. Find/identify/isolate the bug;

2. Fix the bug;

3. Create a test that fails if the bug is not fixed;

4. Run the test to make sure it passes now that the bug is fixed;

5. Revert the fix and run the test to make sure it detects the bug;

6. Add the test to the test suite and checkin/push the bugfix.

I prefer this order because if I make a mistake in step 1 I usually realize it in step 2, where I feel like you might not realize it until step 4. I'd be curious to know if there are advantages you know of to the order in which you do it.

tspike · on Oct 5, 2012

I work in the same order as npsimons. It's a similar philosophy to the red-green-refactor strategy (http://bit.ly/OazqR8).

The advantage, to me, is that you focus on the behavior of the application rather than the code. I have a tendency to get off-track when I'm coding and I'll start on refactors that, in hindsight, were a terrible idea.

By forcing myself to be sure that the feature/bug is something I really want, I stay on-track because that damned test keeps failing and I just want to make it go green! By writing the test beforehand, I can be sure that it's what I really want and not just what's easy to code.

That said, both ways work and I sometimes switch to an approach like yours.

npsimons · on Oct 5, 2012

The advantage, to me, is that you focus on the behavior of the application rather than the code. I have a tendency to get off-track when I'm coding and I'll start on refactors that, in hindsight, were a terrible idea.

Very much the same for me; I'm much like Lenny from "Memento" at times ("now what was I doing?"); add to this that reproducing the bug is essentially what you are doing by writing a regression test for it. Also it falls in line with TDD as applied to maintenance (keep coding until all the tests pass). One last thing: it's kind of a wash with VC these days (and reverting, as the GP said), but if you fix the bug first, are you certain your test is catching it? I like to have a piece of code that I can say "yes, when I do this, my code fails; now to fix it."

shaggyfrog · on Oct 5, 2012

I would say that step 3 and 5 are basically identical. I'd call 5 the redundant one and scratch it.

joevandyk · on Oct 5, 2012

No, step 3 doesn't involve running the test.

ExpiredLink · on Oct 6, 2012

> What I'm more interested in is how SQLite reached such substantial test coverage.

Because SQLite was designed for testability. No "techniques or tools" will help when the application isn't created with testability in mind from the outset.

lrem · on Oct 5, 2012

You should definitely see drh present it in person. He's giving a talk about it in universities around the world, seen it a couple years back, in Poland. It's incredible what effort goes into reliability of what seems such a small thing.

Btw, the very same day he persuaded me to move to Fossil.

Edit: just to be clear, you should see it for all the good ideas he's explaining. Not for some marketing of a piece o of software.

RDeckard · on Oct 5, 2012

link?

lrem · on Oct 5, 2012

Hmmm, despite what uni authorities said, there is no recording of that particular talk available. So I have no specific advice, apart from bland "search YouTube" or "ask by email".

cfn · on Oct 5, 2012

Here is an old one (2006) at Google: http://www.youtube.com/watch?v=jN_YdMdjVpU

dblock · on Oct 5, 2012

Can anyone please comment on the humans involved in running some of this stuff? What are their roles, occupations? Infrastructure? Who owns that?

zandorg · on Oct 5, 2012

Well, I told the author of SQLite that a double inner join took forever to complete. He just told me I was wrong, but this inner join worked fine on PostGres, MySql and so on.

It annoys me, because neither PostGres not MySql are a binary file you can just run with a query, unlike SQLite which is very convenient for embedding in a desktop app.

fdr · on Oct 5, 2012

SQLite does not have very many execution methods or a very intelligent optimizer, and every database has different access methods available to it to some extent. For example, Postgres lacks skip scan, which Oracle has. MySQL lacks bitmap index scans to combine indexes, which Postgres and Oracle have. Postgres (until the recently released 9.2) lacked index-only scan, which MySQL and Oracle (and DB2, and Informix, and SQL server...that was a feature long in coming) have. MySQL of many versions lacked hash joins (which blows my mind), it may have it now (depends how you count the forks).

So, all in all...somewhat expected. I have a number of queries that simply cannot run in their most naive rendering on Postgres due to lack of skip scan.

plq · on Oct 6, 2012

In case anyone wants to read more about skip scans, here's the link: http://docs.oracle.com/cd/B10501_01/server.920/a96533/optimo...

stonemetal · on Oct 5, 2012

Mysql has an embedded version. It may not boil down to a single file on the HD, but I don't see that as a big drawback.

http://dev.mysql.com/doc/refman/5.1/en/libmysqld.html

kderbe · on Oct 5, 2012

My takeaway from this is high-quality software needs several robust test suites, each written with a different testing methodology.

A presentation from the Opus audio codec developers ( http://www.ietf.org/proceedings/82/slides/codec-4.pdf ) opened my eyes to just how many overlapping approaches you can use. In addition to listing a dizzying array of software tests (similar to the SQLite article), the Opus presentation summarizes the strengths and weaknesses of each approach, which is great for understanding when each approach is appropriate to use.

YZF · on Oct 6, 2012

This Google Talk starting about 32:35 the author talks a little about SQLite testing: http://www.youtube.com/watch?v=f428dSRkTs4

dgregd · on Oct 6, 2012

Does anyone know the real reason SQLite was removed form HTML5 specs?

I known official Mozilla arguments. Which are quite week IMHO.

Is this was MS job? They where afraid that browser apps will make desktop apps obsolete?

mariuz · on Oct 6, 2012

Yes was a Microsoft Hand , they killend it and went to more complex indexed db that no one loves

http://h30499.www3.hp.com/t5/Following-the-Wh1t3-Rabbit-Down...

anonymousDan · on Oct 6, 2012

Anyone have any thoughts on how they might be doing 'automatic' memory/resource leak detection (Section 6.0)? Seems to me it would be hard to do without generating lots of false positive failures.