Wednesday, 27 January 2010

Domain Specific Languages for Selenium Tests

I gave this talk at JFokus this week in Stockholm. I thought people might be interested in a summary of my main points.

I've been using Selenium for web application testing for over a year now, on a couple of different projects. This talk is based on my experiences.

What is Selenium?

Selenium is actually an umbrella term for a suite of tools. The two tools in the suite I have been using are Selenium Remote Control (RC), which allows you to test your webapp from Java code as it runs in a browser, and Selenium IDE, which is a Firefox plugin that allows you to record tests.

Selenium's killer feature is that it allows you to test your app while it is running in a browser, with a real browser javascript engine. This is particularly important for web applications with a lot of javascript and ajax. Selenium is also actively maintained, with version 2.0 having recently arrived at alpha release status.

I showed a demo of how to write a test by using Selenium IDE to record browser interaction and then pasting the result into Java code. The test doesn't pass the first time you execute it: you have to tweak the code to replace xpaths with ids, and add a "wait" for the page to load at one point in the test. I find this is typical - Selenium IDE helps you to get started with your test, but there is still a lot of manual work to get it to pass.
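The "wait" tweak usually amounts to polling for a condition instead of sleeping for a fixed time. Here is a minimal sketch in plain Java. It assumes nothing about Selenium itself: the Wait class and its parameters are hypothetical names of my own. In a real test the condition would be something like a check that an element with a known id is present on the page.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: polls a condition rather than sleeping for a fixed time. */
final class Wait {
    private Wait() {}

    /**
     * Checks the condition every pollMillis until it holds or timeoutMillis has passed.
     * Returns true if the condition became true, false on timeout or interruption.
     */
    static boolean until(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

A test would call something like Wait.until(() -> pageIsReady(), 30000, 100), where pageIsReady is whatever check suits the page, and fail if the result is false.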

The code you end up with is rather widget-oriented and low level. This is what I ended up with in my example:

public void createNewPublisher() {
    selenium.click("menuItemPNewPublisher");
    selenium.type("name", "Disney");
    selenium.type("address", "24 Stars Ave");
    selenium.type("zipCode", "91210");
    selenium.type("city", "Hollywood");
    selenium.type("country", "USA");
    selenium.type("contactPersonName", "Emily");
    selenium.type("contactPersonEmail", "");
    selenium.type("contactPersonPassword", "123456");
    selenium.type("contactPersonRetypePassword", "123456");
    selenium.click("save");

    verifyEquals(selenium.getTable("Users.1.1"), "Emily");
}

This is all very well: you can basically read what it is doing (open the "New publisher" page, fill in a form, then check the user table displays correctly). But I think we can do better.

Agile Testing Theory

Brian Marick came up with the idea of "Testing Quadrants", which helps us to reason about what kinds of tests we need in an agile project. Selenium tests generally fall into the second quadrant - business-facing tests that support the team. This implies they should be written in the language of the business, to help developers to understand what they are building. This is in contrast to the low-level, widget-oriented script that Selenium IDE produces.

It is better to discover a bug as soon after insertion as possible, and the faster the tests run, the more often you will run them. The usual rule of thumb says your tests must take less than a minute for you to run them as part of a TDD cycle, and less than 10 minutes for you to run them in the CI server. For a suite of Selenium tests, even 10 minutes can be ambitious.

Selenium tests are fundamentally slow. They are slow because they run inside the browser using its JavaScript engine.


WebDriver (new in Selenium 2.0) uses a different model of interacting with your application under test. It uses the browser's API instead of its JavaScript engine. This means tests execute faster and more reliably, at the cost of supporting fewer browsers. In addition, WebDriver has a well thought out, clean API, and built-in support for the PageObject pattern. Having said that, WebDriver is a very recent addition to Selenium, and currently has poor or no integration with Selenium IDE and Selenium Grid.

The Page Object pattern

The PageObject pattern is a way of organizing your test code. It tells you to model your test code after the page structure of your application. The test I listed earlier could be rewritten using this pattern, and might look something like this:

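public void createNewPublisher() {
    PublisherData newPublisher = new PublisherData("Disney")
        .withAddress(new AddressData("24 Stars Ave",
            "91210", "Hollywood", "USA"))
        // withContactPerson and the password argument are assumed
        .withContactPerson(new UserData("Emily", "", "123456"));

    NewPublisherPage page = new NewPublisherPage(selenium);
    // method names save and verifyPublisher are assumed
    ViewPublisherPage viewPage = (ViewPublisherPage) page.save(newPublisher);
    viewPage.verifyPublisher(newPublisher);
}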

Here we are using a data struct to hold the form data, which tells you quite obviously that a PublisherData has a name, address and contact person. Then we have the Page Object for the NewPublisherPage, which uses Selenium to navigate to the relevant page in the constructor, then has methods which allow us to interact with that page. When we ask it to "save", we navigate to a new page, in this case the ViewPublisherPage, which we can ask to validate that the new publisher is displayed correctly.

This is an improvement over the original test in several ways. I think it is much more readable, we have a certain amount of type safety from the data struct, and if the page flow, or form data changes, there are obvious central places in the code to update all the tests. We can easily vary the form data in different tests in order to do data driven testing.
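To make the shape of a Page Object concrete, here is a minimal sketch of what might sit behind such a test. It is not the code from my project. The Browser interface is a hypothetical stand-in for the Selenium client so that the sketch runs without a browser, RecordingBrowser is a fake, and the form is cut down to two fields.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the Selenium client. */
interface Browser {
    void click(String locator);
    void type(String locator, String text);
}

/** Fake browser that just records the commands it receives. */
class RecordingBrowser implements Browser {
    final List<String> commands = new ArrayList<>();
    public void click(String locator) { commands.add("click " + locator); }
    public void type(String locator, String text) { commands.add("type " + locator + "=" + text); }
}

/** Form data, reduced to two fields to keep the sketch short. */
class PublisherData {
    final String name;
    final String city;
    PublisherData(String name, String city) { this.name = name; this.city = city; }
}

/** Page Object for the "New publisher" form: the only place that knows its widget ids. */
class NewPublisherPage {
    private final Browser browser;

    NewPublisherPage(Browser browser) {
        this.browser = browser;
        browser.click("menuItemPNewPublisher"); // the constructor navigates to the page
    }

    /** Fills in the form, saves, and returns the page the application moves to. */
    ViewPublisherPage save(PublisherData data) {
        browser.type("name", data.name);
        browser.type("city", data.city);
        browser.click("save");
        return new ViewPublisherPage(browser);
    }
}

/** Page Object for the page shown after saving; verification methods would live here. */
class ViewPublisherPage {
    private final Browser browser;
    ViewPublisherPage(Browser browser) { this.browser = browser; }
}
```

If a widget id or the page flow changes, only the Page Object classes need to be updated, not every test that creates a publisher.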

Scenario classes

However, in the application I've been working on, we haven't used this pattern in quite this way. The application uses a lot of frames and ajax, and a page-based model didn't really fit. So we've been using the idea of "Scenario" classes instead of Page Objects. They fulfill the same purpose, but instead of modelling pages in your app, these classes group together domain actions, or services that your application provides. The Scenario classes hide all the details of how to use Selenium to interact with the GUI and achieve an outcome a user is interested in, in this case creating a publisher.

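public void createNewPublisher() {
    PublisherData newPublisher = new PublisherData("Disney")
        .withAddress(new AddressData("24 Stars Ave",
            "91210", "Hollywood", "USA"))
        // withContactPerson and the password argument are assumed
        .withContactPerson(new UserData("Emily", "", "123456"));

    PublisherScenarios scenarios =
        new PublisherScenarios(selenium);
    // method name createPublisher is assumed
    scenarios.createPublisher(newPublisher);
}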

Domain Specific Languages

So with the PageObject or Scenario patterns, you could say that my tests are written in a kind of Domain Specific Language, in that they are written in terms close to the domain of the application they are testing. The trouble is, this is still Java code, and business people generally can't read Java. A more interesting DSL would enable communication between developers and business people.

This is where I turned to Fitnesse to help me. I found it pretty straightforward to put some Fitnesse tables over the top of this kind of Java code. Since the Java code is already organized in terms of the domain, it becomes natural to offer the same functionality in tables with meaningful names.
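As an illustration of how thin that layer can be, here is a sketch of a fixture class, assuming Fitnesse's Slim test system and its decision table conventions (input columns map to setters, columns ending in "?" map to methods, and execute is called once per row). The class names are hypothetical, and InMemoryPublisherScenarios is a fake standing in for the Selenium-backed test code.

```java
import java.util.ArrayList;
import java.util.List;

/** Fake domain-level test API; the real one would drive Selenium. */
class InMemoryPublisherScenarios {
    private final List<String> publishers = new ArrayList<>();

    void createPublisher(String name, String contactPerson) {
        publishers.add(name + "/" + contactPerson);
    }

    boolean publisherExists(String name, String contactPerson) {
        return publishers.contains(name + "/" + contactPerson);
    }
}

/**
 * Fixture for a decision table such as:
 *
 *   |create publisher                |
 *   |name  |contact person|created?  |
 *   |Disney|Emily         |true      |
 *
 * A real fixture class would be public, in its own file.
 */
class CreatePublisher {
    private final InMemoryPublisherScenarios scenarios = new InMemoryPublisherScenarios();
    private String name;
    private String contactPerson;

    /** Input column "name". */
    public void setName(String name) { this.name = name; }

    /** Input column "contact person". */
    public void setContactPerson(String contactPerson) { this.contactPerson = contactPerson; }

    /** Runs once per row, after the inputs are set and before the outputs are read. */
    public void execute() { scenarios.createPublisher(name, contactPerson); }

    /** Output column "created?". */
    public boolean created() { return scenarios.publisherExists(name, contactPerson); }
}
```

The fixture contains no widget ids at all. It only translates table columns into calls on code that is already written in domain terms.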

I chose Fitnesse because I wanted to try it out, but I think it would be equally easy to put Cucumber or Robot Framework or something like that on top, too. The whole point of these tools is that they allow you to easily write executable tests in a language anyone with knowledge of the domain can read. This is a huge benefit to developers when talking to business people.

Closing advice

Having said all this about how to make your Selenium tests more readable, robust and faster, I still find my Selenium test suite frustratingly slow and fragile (I haven't used WebDriver enough yet to know if that is the case there too). Compared to PyUseCase, it is in the dark ages. (Unfortunately PyUseCase has no support for web applications, so there's probably no point in mentioning it, actually. But it is a great tool!).

The difference between PageObjects and Scenario classes is not huge, but I like the latter, because they allow me to easily exchange Selenium for a WebService. My tests are written in terms of the services the application offers, and if I indeed implement those services as WebServices, I can call them directly from the tests. Using a WebService instead of Selenium immediately makes my tests much faster and more robust.
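One way to get that exchangeability is to put the scenarios behind an interface. The sketch below is an assumption about how it could be arranged, not the code from my project: both implementations are fakes, and all the names apart from PublisherScenarios are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** The domain actions the tests are written against. */
interface PublisherScenarios {
    void createPublisher(String name);
    boolean publisherExists(String name);
}

/** GUI-backed version. A real one would drive Selenium; this fake only records the steps. */
class GuiPublisherScenarios implements PublisherScenarios {
    final List<String> guiSteps = new ArrayList<>();
    private final Set<String> shownInTable = new HashSet<>();

    public void createPublisher(String name) {
        guiSteps.add("click menuItemPNewPublisher");
        guiSteps.add("type name=" + name);
        guiSteps.add("click save");
        shownInTable.add(name);
    }

    public boolean publisherExists(String name) {
        guiSteps.add("read publisher table");
        return shownInTable.contains(name);
    }
}

/** Service-backed version. A real one would call the WebService; this fake keeps a set. */
class ServicePublisherScenarios implements PublisherScenarios {
    private final Set<String> stored = new HashSet<>();

    public void createPublisher(String name) { stored.add(name); }

    public boolean publisherExists(String name) { return stored.contains(name); }
}

/** The test itself does not know which implementation it is talking to. */
class PublisherTest {
    static boolean createNewPublisher(PublisherScenarios scenarios) {
        scenarios.createPublisher("Disney");
        return scenarios.publisherExists("Disney");
    }
}
```

The same test body can then run against the service in most builds, and against the GUI for the parts where the browser behaviour matters.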

The problem with testing against a WebService is the reason we chose Selenium in the first place - ajax and browsers are not covered.

So for web 2.0 application testing, I recommend both API tests and Selenium tests. Use Selenium to test the crucial parts of your GUI, but write most of your tests against some kind of WebService or API. Write your tests in terms a business user would understand, even in the Java code, and then it is easy to create a Domain Specific Language on top using another tool.

Sunday, 10 January 2010

Testing PyTDDmon

One of the things I like about GothPy is that lots of the people in the group enjoy writing code in their spare time, and like to share with us what they're up to.

A little while back, Olof came up with this little tool, PyTDDmon, which is there to help you when you're doing TDD. Instead of you running your tests by hand, PyTDDmon sits in the background and runs them for you. When they are green, it shows a little green blob; when they are red, it tells you how many are failing, and clicking on it gives you the stacktrace. Olof put together a screencast to show how it works.

Following Geoff's presentation of PyUseCase at GothPy in December, he and Olof started to discuss whether it could be used to test PyTDDmon. Since it's using the GUI library Tkinter instead of pyGTK, it didn't look all that straightforward.

Of course, Olof developed the app using TDD in the first place, (definite own dog food eating going on!), and his unit tests have about 48% statement coverage. The parts with least coverage are GUI-related, so there seemed to be scope to improve matters by adding tests via the GUI.

Geoff sat down a few days ago with a branch of the PyTDDmon code, started trying to generalize PyUseCase to work with Tkinter, and to write some tests for PyTDDmon. (Geoff also likes writing code in his spare time :-)

As I write, he's just checking in his changes and a new suite of tests for PyTDDmon that use the new, improved PyUseCase, with (basic) Tkinter support! He says the hardest part was understanding Tkinter, and after a spot of refactoring, only an extra couple of hundred lines of code were needed to support it.

Subsequently adding tests for PyTDDmon was very easy (Geoff wrote most of them while I made biscuits with the children this afternoon :-) All he had to do to Olof's code was assign names to some of the widgets, and sort out a problem with closing a window. (There was a listener for a mouse click that would destroy the window, before other listeners got a chance to do anything, such as record the click in a use case log...)

So now the statement coverage of the PyTDDmon tests is at 98%. I look forward to future, equally productive GothPy meetings and discussions!

Friday, 8 January 2010

Joining eLabs

I've enjoyed my time working for Iptor, but I found I just couldn't refuse Carl-Johan when he offered me a job at eLabs. I first met C-J a few years ago at Got.rb, the local Ruby User Group, and I know he has a great entrepreneurial spirit and deep technical expertise. eLabs is his company (a startup he co-owns with Edithouse), and he's attracted a really competent, friendly bunch of coders to join him. This is a chance for me to get to learn Ruby and Rails properly, and work in a truly agile team. Hopefully I'll be able to contribute some insights from my years working with agile advocacy, automated testing, and development in Python and Java. I'm looking forward to starting at eLabs in April.

Monday, 4 January 2010

JUseCase, dreams of resurrection

JUseCase is the Java version of PyUseCase, for testing GUIs written in Swing. It was originally written as a master's thesis by Claes Verdoes in 2005, under Geoff's supervision. Since then it hasn't been used much, and has lain idle and unmaintained for a while. In the meantime, Geoff has made some major changes to PyUseCase, culminating in the recent release of version 3.0, which I think is hugely better than any competing tools. But then I am biassed :-)

I would love to see JUseCase resurrected and to support more of the features of PyUseCase 3.0. We originally got a lot of criticism of JUseCase that it required too many changes in application code; many developers are leery of changing their code to put in hooks "just for the tests". PyUseCase 3.0 removes the need for much of that, and I think JUseCase could do the same.

In JUseCase, you have to connect widgets to domain actions by adding bits of code like this:

JTextField originField = new JTextField("ANY");
ScriptedTextField.connect("choose origin", originField, ScriptedTextField.EDIT);

i.e. the text field "originField" should record a domain action "choose origin" when it is edited. This enables JUseCase to record and replay user actions in a domain language, at a higher level of abstraction than just widget names.

In PyUseCase 3.0 these kinds of code changes are replaced by the UI map file. Widgets are identified by name, something like this:

edited = choose origin

So all you have to do in the code is make sure all your widgets have unique names. Good names are useful for accessibility (blind people etc), and for internationalization, so you can argue you're not just doing it for the tests.

The second thing that PyUseCase 3.0 does to reduce code changes is to automatically generate the UI log, rather than you having to insert log statements by hand. The UI log is the part of the test definition that TextTest uses to assert the application is behaving correctly. It should be a low-fidelity rendering of what the UI looks like, in plain text.

For example, for the simple application in my last post, the UI log might look like this:

---------- Window 'Book Animals for Procedures' ----------
Focus widget is 'available procedures'
Showing available procedures with columns: available procedures
-> abdominocentesis
-> haircut
-> re-shoe
-> milking ,
Showing available animals with columns: available animals , animal type
-> Good Morning Sunshine | mare
-> Goat 3 | goat
-> Goat 4 | goat
-> Guicho | gelding
-> Misty | mare
Button 'book'

This log is created just by inspecting the GUI elements in turn, and writing out a sketch of their contents.

In order to add to this log as the user interacts with the application, PyUseCase attaches itself as a listener to each widget, so it finds out when they change. Then after the user has done some actions, when the GUI event queue is empty, it takes the opportunity to write a few log statements about what has changed since the last time it logged something. In this way it produces a neat summary of what the user does and what the UI looks like after each action. I should think JUseCase could do something similar.
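The core of that idea fits in a few lines. This is only a sketch of the bookkeeping, under my own assumptions: UiChangeLogger and its method names are hypothetical, the log wording is invented rather than PyUseCase's actual format, and the Swing side (attaching listeners and detecting an empty event queue) is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: remembers what was last logged and reports only what has changed. */
class UiChangeLogger {
    private final Map<String, String> lastLogged = new LinkedHashMap<>();
    private final Map<String, String> current = new LinkedHashMap<>();

    /** Called from a widget listener whenever a named widget's content changes. */
    void widgetChanged(String widgetName, String content) {
        current.put(widgetName, content);
    }

    /** Called when the GUI event queue is empty: one line per widget that differs from the last log. */
    List<String> flush() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> entry : current.entrySet()) {
            String previous = lastLogged.get(entry.getKey());
            if (!entry.getValue().equals(previous)) {
                lines.add("Updated '" + entry.getKey() + "' to: " + entry.getValue());
            }
        }
        lastLogged.putAll(current); // the next flush compares against what was just logged
        return lines;
    }
}
```

Because only the state at flush time is compared, several intermediate edits to the same widget collapse into a single log line, which keeps the log a summary rather than a keystroke trace.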

Then all that is left to hand code is to put in the application events - basically writing a log statement that you're doing something in another thread of execution, that the tests should wait for before proceeding. There shouldn't be too many of those calls, so hopefully they can slip under the radar.

Anyone looking for a master's thesis topic fancy giving it a go?