Friday, December 22, 2006

Implementing a JavaScript Language Interpreter for Selenium Testing

In another article, I explained why HTML Selenese has no support for "if" statements and conditional "for" loops.  There, I argued that implementing flow control "properly" in Selenium Core would require writing an entire language parser/interpreter in JavaScript.  Here I'll catalog some of our failed attempts to do precisely that.

Use the Browser's Native JS Parser

The first question everybody asks us is: "why don't you let people write Selenium tests in JavaScript directly, rather than writing them in HTML Selenese tables?"  JavaScript already includes (as part of its language) an "eval" function that can execute arbitrary JavaScript encoded as a string; JavaScript will parse the string, interpret the code, and even return the final result.
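As a quick illustration of that last point, here is a minimal sketch (the string being evaluated is made up for this example) showing that "eval" hands back the value of the last expression it ran:

```javascript
// The code to run arrives as an ordinary string.
var source = "var x = 6; x * 7";

// eval parses the string, runs it, and returns the value of the last expression.
var result = eval(source);

console.log(result); // prints 42
```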

You can actually write tests in JavaScript today using Selenium Remote Control and Rhino, the JavaScript interpreter for Java.  The disadvantage is that your JS test runs in its own separate process (a JVM), and it requires you to set up a Selenium Server.  It doesn't run the JavaScript directly in the browser.

Using Selenium RC, you can write a JS test like this:

    selenium.open("/mypage");
    selenium.type("login", "alice");
    selenium.type("password", "aaaaa");
    selenium.click("login");
    selenium.waitForPageToLoad(5000);
    selenium.click("link=Colors");
    selenium.waitForPageToLoad(5000);
    selenium.click("blue");
    selenium.waitForPageToLoad(5000);
    if (!selenium.isTextPresent("blue moon")) throw new Error("blue not present!");

However, this test requires Selenium RC and a running Selenium Server.  You can't run that test directly in the browser for a very important reason: JavaScript has no "sleep" function; the JavaScript interpreter in all browsers is, by design, single-threaded.  That means that there's no way to implement "waitForPageToLoad" as it's written here.  In another language you might implement a function like "waitForPageToLoad" by doing something like this:

    function waitForPageToLoad() {
        while (!isPageLoaded() && !timeIsUp()) {
            sleep(500);
        }
    }

How can we do this without a sleep function?  Let's try a standard "busy wait" function (i.e. constantly perform mathematical calculations until our time is up).

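    // Current time in milliseconds.
    function now() {
        return new Date().getTime();
    }

    // Busy wait: spin until "interval" milliseconds have passed.
    function sleep(interval) {
        var start = now();
        while (start + interval > now()) {
            // burn CPU; nothing else can run in the meantime
        }
    }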
    
    function runTest() {
        frameLoaded = false;
        var myFrame = document.getElementById('myFrame');
        var myWindow = myFrame.contentWindow;
        myWindow.location = "hello.html";
        var start = now();
        var finish = start + 5000;
        while (!frameLoaded && finish > now()) {
            sleep(500);
        }
        alert("frameLoaded="+frameLoaded);
    }

You can try this test here: Busy Wait.  I tested it in Firefox and IE.  What you'll find is that the window frame refuses to load as long as the busy wait loop is running.  As soon as the test is finished (failing), the frame loads.  (In FF the frame loads while the alert pop-up appears; in IE the frame doesn't load until after you click OK.)

Under the hood, here's what's really happening: when you set the window.location in JavaScript, you haven't actually done anything yet; you've just scheduled a task to be done.  Once your task is completely finished, the browser goes on to perform the next task on the queue (which, in this case, is loading up a new web page).

To put that another way, there's a reason why JavaScript has no "sleep" function: architecturally, it would be pointless.  Normally, you sleep for a few seconds while something else happens in the background.  But as long as your JavaScript is running in the foreground, nothing can happen in the background!  (Try clicking around in the menus, or even clicking the close button, while the busy wait test is running.  The entire browser is locked!)

For a while, I thought I had found a clever workaround to this problem: using a Java applet to do the sleeping.  You can see this in action here: Applet Wait.  The page loads!  That's great.  But then we try to wait for the additional salutations onLoad... and they never come.  (If you run "top" or the Windows Task Manager you can see that this mechanism doesn't pin the CPU; the CPU is idle.)

Actions you perform in JavaScript can have multi-threaded effects, even adding items to the queue of work to be done, but as long as any JS function is working, no other JavaScript work can happen in the background.
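The same effect can be reproduced without a frame at all.  The following is a minimal sketch, not the Busy Wait test page itself; the "busyWait" helper and the "events" array are names invented for this example.  A callback is scheduled with a zero-millisecond delay, and then the script hogs the thread:

```javascript
// Record the order in which things actually happen.
var events = [];

function now() {
    return new Date().getTime();
}

// Busy wait: burn CPU until "interval" milliseconds have elapsed.
function busyWait(interval) {
    var start = now();
    while (now() - start < interval) {
        // spin
    }
}

// Ask for a callback "immediately" (0 ms delay)...
setTimeout(function () {
    events.push("timeout fired");
}, 0);
events.push("timeout scheduled");

// ...then hog the only thread for 100 ms.
busyWait(100);
events.push("busy wait finished");

// The callback has still not run: it cannot start until this script returns.
console.log(events.join(", ")); // prints "timeout scheduled, busy wait finished"
```

Even though the callback was due 100 ms earlier, it only gets its turn after the busy wait (and the rest of the script) has finished.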

Waiting With setTimeout

So how does Selenium do it?  JavaScript exposes a simple mechanism that allows you to schedule events for the future: setTimeout.  It places an item on the queue of events to be done, after a short delay.  It's like saying to the JS interpreter "run this snippet of code (as soon as you can) 500 milliseconds from now."  That parenthetical "as soon as you can" is critical, because if the JS interpreter is busy at that time (e.g. if some other function is still running, or is in the middle of a busy wait), the timeout event won't happen until that earlier job is done.

That means that you still can't write the JS test I highlighted at the beginning of this article:

    selenium.click("blue");
    selenium.waitForPageToLoad(5000);
    if (!selenium.isTextPresent("blue moon")) throw new Error("blue not present!");

That's because you cannot simply sleep and then assert something.  Instead, you have to let the interpreter sleep on your behalf, and then call you back when it thinks you might be done.
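Here is a rough sketch of what "call me back later" looks like in practice.  This is not Selenium Core's actual code; "pollUntil" and "pageLoaded" are hypothetical names, and the page load is faked with a second timeout:

```javascript
// Hypothetical helper: poll isDone() every intervalMs until it returns true
// or timeoutMs has elapsed, then report the outcome through callback.
function pollUntil(isDone, timeoutMs, intervalMs, callback) {
    var deadline = new Date().getTime() + timeoutMs;

    function check() {
        if (isDone()) {
            callback(true);
        } else if (new Date().getTime() >= deadline) {
            callback(false);
        } else {
            // Give control back to the browser and ask to be called again later.
            setTimeout(check, intervalMs);
        }
    }

    check();
}

// Stand-in for a page load: some other queued task flips the flag after 30 ms.
var pageLoaded = false;
setTimeout(function () { pageLoaded = true; }, 30);

pollUntil(function () { return pageLoaded; }, 1000, 10, function (ok) {
    console.log(ok ? "page loaded" : "timed out");
});
```

Note that "pollUntil" returns almost immediately; anything that depends on the page being loaded has to live inside the callback.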

The Scriptaculous Test.Unit.Runner exposes a common mechanism for writing asynchronous tests using setTimeout: a "wait" function, appearing as the very last line of your test, that includes the "rest" of your test (everything to be done after you're finished waiting).  It must be the very last line of your test because (like the setTimeout function), "wait" simply schedules an event to happen in the future; it doesn't synchronously wait for the job to get done.  A Test.Unit.Runner test looks a little like this:

    selenium.click("blue");
    waitForPageToLoad(5000, function() {
        if (!selenium.isTextPresent("blue moon")) throw new Error("blue not present!");
    });

That might not look so bad from here... but remember that you have to nest these "wait" statements every time you want to wait; a standard Selenium test needs to wait after almost every line!  So the simple looking JS test I quoted at the beginning of this article would look like this:

    selenium.open("/mypage");
    selenium.type("login", "alice");
    selenium.type("password", "aaaaa");
    selenium.click("login");
    waitForPageToLoad(5000, function() {
        selenium.click("link=Colors");
        waitForPageToLoad(5000, function() {
            selenium.click("blue");
            waitForPageToLoad(5000, function() {
                if (!selenium.isTextPresent("blue moon")) throw new Error("blue not present!");
            });
        });
    });

Every time you "wait", you have to create a new nested block.  This code starts to look pretty gnarly pretty quickly...  But that's really the least of it.  The real kicker is that you can't really use for loops together with "wait" statements!

Don't forget, the "wait" statement has to appear at the end of your function.  That means that you can't click/wait/assert on 3 things in a for loop, like this:

    for (var i = 1; i <= 3; i++) {
        open(page[i]);
        wait(50, function() {
            assertTrue(document.foo == i);
        });
    }

Naively, we might assume that this code would do the following (which is what we'd want):

    open(page[1])
    sleep(50)
    assert(foo==1)
    open(page[2])
    sleep(50)
    assert(foo==2)
    open(page[3])
    sleep(50)
    assert(foo==3)

But, instead, that code will try to open pages 1, 2 and 3 without any delay between them,[*] then wait 50ms, then do all three of your assertions.  In other words, you'd get this:

    open(page[1])
    open(page[2])
    open(page[3])
    sleep(50)
    assert(foo==1)
    assert(foo==2)
    assert(foo==3)

[*] In fact, it's even worse than that, because page 1 wouldn't really open until your JS function was finished running.  So really it would only ever open page 3 and then do all three assertions on page 3.
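You can watch the reordering happen with a few stand-in functions.  This is only a sketch: "openPage", "assertOn", "wait" and "makeAssertion" are hypothetical helpers that record what they were asked to do, and "wait" here is a bare setTimeout wrapper in the spirit of the function described above, not the Test.Unit.Runner implementation:

```javascript
var log = [];

// Stand-ins for the real commands: they only record what was asked for.
function openPage(page) {
    log.push("open " + page);
}
function assertOn(page) {
    log.push("assert " + page);
}

// Schedule the rest of the test and return immediately.
function wait(ms, rest) {
    setTimeout(rest, ms);
}

// Capture the page name per iteration so each callback sees its own value.
function makeAssertion(page) {
    return function () { assertOn(page); };
}

for (var i = 1; i <= 3; i++) {
    openPage("page" + i);
    wait(50, makeAssertion("page" + i));
}

// The loop has finished, but no assertion has run yet.
console.log(log.join(", ")); // prints "open page1, open page2, open page3"

// A later timeout, queued after the other three, shows the final order.
wait(100, function () {
    console.log(log.join(", "));
});
```

The second line of output lists all three opens first and all three assertions afterwards, which is the second ordering shown above rather than the first.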

I don't want anyone to get the mistaken impression that I think the Scriptaculous Test.Unit.Runner "wait" function is altogether bad.  It's a useful function for doing unit testing, where you'll test one or maybe two asynchronous events at a time.  (Good unit tests test only one thing at a time, after all.)  It's a clever solution to a difficult problem.

But in functional integration tests like ours, where you have to wait after almost every line, using the "wait" function doesn't really help that much; it doesn't give you the ability to write complicated integration tests in a fully-powered language with logic, try/catch, and recursion.

"wait" gives you some of the power you need/expect in a modern programming language, but not enough: although you can use if statements and for loops within each nested "wait" block, you can't use them across code blocks.  (You still can't put an "open" command in a while loop, or in a try/catch block.)

Using JavaScript in HTML Selenese

In fact, HTML Selenese makes it pretty easy for you to write asynchronous tests in JavaScript, simply by scheduling a timeout between every line of your Selenese test.  Since Selenese also supports running arbitrary JS in a table cell, it's not too hard to write your test like this:

    | storeEval | selenium.open("/mypage"); |
    | storeEval | selenium.type("login", "alice"); |
    | storeEval | selenium.type("password", "aaaaa"); |
    | storeEval | selenium.click("login"); |
    | storeEval | selenium.waitForPageToLoad(5000); |
    | storeEval | selenium.click("link=Colors"); |
    | storeEval | selenium.waitForPageToLoad(5000); |
    | storeEval | selenium.click("blue"); |
    | storeEval | selenium.waitForPageToLoad(5000); |
    | storeEval | if (!selenium.isTextPresent("blue moon")) throw new Error("blue not present!"); |

That's not quite as nice as the test you can write with Selenium RC and Rhino, though, for a couple of reasons.  First, it suffers from all of the same problems as the Scriptaculous "wait" function: although you can use "if" statements and "for" loops within each storeEval block, you can't use them across functional blocks.  (Again, you still can't put an "open" command in a "while" loop, or in a "try/catch" block.)

But in fact it's even a little worse than Test.Unit.Runner, because it means that you can't write a test like this:

    | storeEval | var foo = 'bar'; |
    | storeEval | selenium.click(foo); |

That's because "foo" was defined as a local variable in the first block; it goes out of scope as soon as the block finishes, and is undefined by the time we start the second block.  (Test.Unit.Runner doesn't suffer from this problem, because it uses closures to encapsulate the variables from the first block inside the second block.)
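To make the scoping problem concrete, here is a sketch of a row runner that evaluates each row's script in its own function call, with a timeout between rows.  This is not how Selenium Core implements storeEval; "runRow" and "runRows" are hypothetical names used only to show why a "var" from one row is gone by the next:

```javascript
// Hypothetical row runner: evaluate one row's script inside its own function call.
function runRow(script) {
    return eval(script);
}

// Run rows one at a time with a timeout in between, collecting each result.
function runRows(scripts, delayMs, done) {
    var results = [];
    function next(index) {
        if (index >= scripts.length) {
            done(results);
            return;
        }
        results.push(runRow(scripts[index]));
        setTimeout(function () { next(index + 1); }, delayMs);
    }
    next(0);
}

// The first row defines foo and can see it; the second row cannot.
runRows(["var foo = 'bar'; typeof foo", "typeof foo"], 10, function (results) {
    console.log(results.join(", ")); // prints "string, undefined"
});
```

Each call to "runRow" gets a fresh local scope, so "foo" disappears as soon as the first call returns.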

Generating Turing-Complete Selenese

Some people, when they see the storeEval tables I show above, start thinking: "perhaps I could generate that table, from code written in a high-level functional language."

And, of course, you certainly could.  But, as we know, you certainly can't convert JS like this into Selenese:

    for (var i = 0; i < 3; i++) {
        selenium.open("page" + i);
    }

... or could you?  If Selenium had a "gotoIf" command, you could write the code like this:

    | store | i | 0 |
    | gotoIf | !(i < 3) | 5 |
    | storeEval | selenium.open("page"+i) | |
    | storeEval | i | i+1 |
    | gotoIf | TRUE | 1 |
    | echo | done | |

Every "for" loop is just a convenient way of saying "gotoIf"; with the addition of a simple command to the language, we would make it possible to translate simple "if" and "for" statements into Selenese.  "gotoIf" makes Selenese "Turing-complete". (In fact, there's a Selenium Core extension that provides this; I discuss the flowControl extension in more detail in an earlier article.)
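To see that a jump really is all you need, here is a toy runner for a table like the one above.  It is only a sketch under stated assumptions: the row format, the "runTable" function, and the reading of "gotoIf" targets as zero-based row indexes are all invented for this example, and none of it is taken from the flowControl extension.

```javascript
// Hypothetical table: each row is a command; "gotoIf" targets are row indexes.
var rows = [
    { cmd: "store",  name: "i", value: function (vars) { return 0; } },
    { cmd: "gotoIf", test: function (vars) { return !(vars.i < 3); }, target: 5 },
    { cmd: "open",   page: function (vars) { return "page" + vars.i; } },
    { cmd: "store",  name: "i", value: function (vars) { return vars.i + 1; } },
    { cmd: "gotoIf", test: function (vars) { return true; }, target: 1 },
    { cmd: "echo",   text: "done" }
];

function runTable(rows) {
    var vars = {};
    var output = [];
    var pc = 0; // index of the row to execute next

    while (pc < rows.length) {
        var row = rows[pc];
        pc++;
        if (row.cmd === "store") {
            vars[row.name] = row.value(vars);
        } else if (row.cmd === "gotoIf") {
            // Jump only when the condition holds; otherwise fall through.
            if (row.test(vars)) {
                pc = row.target;
            }
        } else if (row.cmd === "open") {
            output.push("open " + row.page(vars));
        } else if (row.cmd === "echo") {
            output.push("echo " + row.text);
        }
    }
    return output;
}

console.log(runTable(rows).join(", "));
// prints "open page0, open page1, open page2, echo done"
```

The loop body runs three times and then control jumps to the final row, just as the "for" loop would.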

"if" and "for" statements are easy, but what about try/catch statements?  What about JavaScript functions?  What about strings run in "eval" blocks?  How would we handle scoped variables (and guarantee that they go out of scope correctly)?  

If you understand this problem completely, you quickly see that the "translator" of JavaScript into Turing-complete Selenese is really a full compiler; it would require a complete language parser and interpreter to know how to translate try blocks, functions, nested scopes, etc. into their "gotoIf" equivalents.  It wouldn't be inappropriate to call it "Selenese Assembly", since it has a lot in common with assembly language: it's powerful enough to handle anything, but so complicated to write that you probably wouldn't want to write a lot of it by hand.

As I argued in the previous article, writing a full compiler for "Selenese Assembly" would be a lot of work, for basically no benefit, because we already have JavaScript support in Selenium RC.

Writing a Full Language Parser in JavaScript

Writing a compiler for "Selenese Assembly" would be a lot of work, but that didn't stop a few people from trying.

Somebody has, in fact, written a language interpreter in JS: Brendan Eich, the "father" of JavaScript, has written a meta-circular JavaScript interpreter in JavaScript called "Narcissus" (named after the Greek myth of the boy who fell in love with his own reflection).  Jason Huggins, the original author of Selenium, began working on trying to integrate Narcissus with Selenium, but never finished.

He never finished because we don't merely need a JS interpreter written in JS.  To implement a "sleep" function, we would also need the meta-circular interpreter to be able to interrupt its flow of execution using setTimeout.  That means that the meta-circular interpreter would need to be written under all of the constraints that I described in the previous section: it can't use any local variables (except as temporary storage until they get written out to permanent global variables) and it has (at best) limited use of for loops, try/catch blocks, and other language features of JavaScript.

Another way of putting that is that the meta-circular interpreter would need to be written in "continuation-passing style", meaning that all of the information about the current state of the running JS program (what line you're on, where you are in the stack, all of the local variables, scopes, etc.) would need to be stored in a variable that would be "passed around" to all of the other functions; this would allow you to "setTimeout" in the future, simply passing in the continuation object to the next chunk of your code.
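For a feel of what that means on a very small scale, here is the three-page "for" loop rewritten so that all of its state lives in an object that is handed from one step to the next.  This is a hand-written sketch, nowhere near a real interpreter; "loopStep", the "state" fields and the "schedule" parameter are hypothetical names:

```javascript
// One iteration of the loop. Everything the loop needs to resume later
// (counter, limit, results so far) is carried in "state".
function loopStep(state, schedule, done) {
    if (state.i >= state.limit) {
        done(state);
        return;
    }
    state.opened.push("page" + state.i); // stand-in for selenium.open(...)
    state.i++;
    // Hand the rest of the loop to the scheduler instead of looping directly.
    schedule(function () {
        loopStep(state, schedule, done);
    });
}

// Resume each step from a timeout, leaving the browser free in between.
loopStep(
    { i: 0, limit: 3, opened: [] },
    function (next) { setTimeout(next, 10); },
    function (state) { console.log(state.opened.join(", ")); }
);
// prints "page0, page1, page2"
```

Even this trivial loop had to be turned inside out by hand; an interpreter would have to do the same for every construct in the language.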

Take a look at the Narcissus code for yourself... it's pretty complicated.  Rewriting it in continuation-passing style would require rewriting the whole thing from scratch... with one hand tied behind your back!

Having said that, there is another language in the world that is considerably easier to parse and interpret, and which actually makes it very easy to write your code in that style: LISP.  And, indeed, someone has written a LISP interpreter in JS.  Rewriting that code to support calls to "setTimeout" probably wouldn't be too hard, and at that point, you'd be able to write in-browser Selenium tests using the full power of LISP.

The only disadvantage?  You'd have to write your tests in LISP! ;-)

Oh, and did I mention that Bill Atkins has written a Selenium LISP client?  Or that you can use the SISC Java-based Scheme interpreter with our existing Java Client Driver?
