Friday 28 August 2009

Testing TextTest

Geoff has just put up a couple of new pages on the texttest website, with coverage statistics for his self tests. He uses coverage.py to produce this report, which lists all the Python modules in texttest and marks covered statements in green. I think it's pretty impressive: he's claiming over 98% statement coverage for the more than 17 000 lines of Python code in texttest.
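If you want to produce a similar report for your own project, here's a minimal sketch of driving coverage.py programmatically. The test entry point is hypothetical, and the class name and calls follow coverage.py's documented start/stop/html_report API (the exact spelling may differ between versions):

    import coverage

    cov = coverage.Coverage()   # older releases spell this coverage.coverage()
    cov.start()

    import my_test_suite        # hypothetical test entry point
    my_test_suite.run_all()

    cov.stop()
    cov.save()
    cov.html_report(directory="htmlcov")   # one page per module, covered lines highlighted

The HTML report gives you one page per module, which is essentially what Geoff's published pages show.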

I had a poke around looking for some numbers to compare this to, and found this page, where someone claims Fitnesse has 94% statement coverage from its unit tests, and the Java Spring framework 75%. It's probably unwise to compare figures for different programming languages directly, but it gives you an idea.

Geoff also publishes the results of his nightly run of self tests here. It looks a bit complicated, but Geoff explained it to me. :-) He's got nearly 2000 tests exercising texttest on Unix, and about 900 on Windows. As you can see, the tests don't always pass: some failures are annoying sporadic ones, others are actual bugs, which then get fixed. So even though he rarely has a totally green build, the project looks healthy overall, with new tests and fixes being added all the time.

Out of those 3000-odd tests that get run every night, Geoff has a core of about 1000 that he runs before every significant check-in. Since they run in parallel on a grid, they usually take about 2 minutes to execute. (When he has to run them serially at home on our fairly low-spec Linux laptop, they take about half an hour - roughly fifteen times longer.)

Note that we aren't talking about unit tests here: these are high-level acceptance tests, running the whole texttest system. About half of them use PyUseCase to simulate user actions in the texttest GUI; the rest interact with the command-line interface. Many of the tests use automatically generated test doubles to simulate interaction with third-party systems like version control, grid engines, diff programs and so on, along the lines sketched below.
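To give a flavour of the test-double idea (purely an illustrative sketch, not texttest's actual mechanism): a fake "diff" program placed first on the PATH can record how it was invoked and replay canned output captured from an earlier real run, so the tests never depend on the real tool being installed:

    #!/usr/bin/env python
    # Fake "diff" stand-in: log the call, replay a recorded reply.
    # The file names here are hypothetical.
    import sys

    with open("diff_calls.log", "a") as log:
        log.write(" ".join(sys.argv[1:]) + "\n")   # record the interaction

    with open("diff_reply.txt") as reply:          # replay captured output
        sys.stdout.write(reply.read())

A test can then assert both on the visible behaviour of the system under test and on the calls logged by the double.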

Pretty impressive, don't you think? Well I'm impressed. But then I am married to him so I'm not entirely unbiased :-)

3 comments:

Jeremiah said...

You may want to look at Perl's TAP (Test Anything Protocol), which is on its way to becoming an IETF protocol. Perl has a long history of being test driven and has a strong testing culture - look at CPANTS for an example. There is a TAP library written in Python if you want to try that, and a wiki for TAP as well - google knows where it is.
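A TAP stream is just plain text - a plan line followed by numbered ok/not ok lines - so a minimal hand-rolled producer, assuming no particular library, looks something like this:

    results = [("addition works", True), ("overflow handled", False)]

    print("1..%d" % len(results))              # the plan: how many tests follow
    for num, (name, passed) in enumerate(results, 1):
        print("%s %d - %s" % ("ok" if passed else "not ok", num, name))

This prints "1..2", then "ok 1 - addition works" and "not ok 2 - overflow handled".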

Andrew Dalke said...

Not sure if I pointed out the sqlite testing summary page to you before: http://www.sqlite.org/testing.html .

Unknown said...

Hi Andrew,

100% branch coverage is nice, but 45 million lines of test code in C doesn't exactly sound like it comes for free :)
Not volunteering to maintain that...

Just counted my lines of test code and there are 639 of them (all mock-code rather than test code really).

Would be nice to test branch coverage, but coverage.py doesn't do that currently.