Friday, June 25, 2010

Spider and Validate

One of the things we've been trying to achieve is automation in every sense. As my old mate Otu once said, "automate until it hurts!" Well, hopefully this will make it a little less painful. We wanted to point the application at a single page, have it spider the entire site to find any pages that return anything other than 200 OK, and validate all the pages that do return 200 OK.
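To make the idea concrete, here is a rough sketch of that kind of spider. It is written in Python purely for illustration and is not how PHPCrawl or the Validator project works internally. The `crawl` function, the `LinkParser` class and the caller-supplied `fetch` function are all made-up names for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, urldefrag


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Breadth-first crawl of a single site.

    fetch(url) is a hypothetical caller-supplied function that returns
    a (status_code, html_body) tuple. Returns a dict of url -> status.
    """
    host = urlparse(start_url).netloc
    statuses = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in statuses:
            continue  # already visited
        status, body = fetch(url)
        statuses[url] = status
        if status != 200:
            continue  # only follow links from pages that loaded
        parser = LinkParser()
        parser.feed(body)
        for href in parser.links:
            # Resolve relative links and drop any #fragment
            link = urldefrag(urljoin(url, href)).url
            # Stay on the site we were pointed at
            if urlparse(link).netloc == host and link not in statuses:
                queue.append(link)
    return statuses
```

Passing `fetch` in as a parameter keeps the sketch free of any particular HTTP library, and means it can be tried out against a fake site held in a dictionary.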

By combining two PHP projects, PHPCrawl (http://sourceforge.net/projects/phpcrawl/) and the frontend-test-suite (http://github.com/NeilCrosby/frontend-test-suite), it's been possible to do just that. Throw in a little Ant build script, deploy it into a continuous integration container like Hudson, and you start to get a feel for the state of websites in development, or even in production as part of a monitoring tool.

To make use of this, edit the build.xml file to set SITE_URL to the endpoint you want to test, then run 'ant validate'. This does an Ant copy with filtering to create a file called test.php. Ant then runs that file and captures its output. The script records which pages returned a 500, a 404 or 200 OK. It then passes an array of the 200 OK pages into the frontend-test-suite, which uses PHPUnit to report on HTML validation.
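The sorting step described above, where crawl results are split into pages worth validating and pages that failed, might look something like this. Again this is only a Python sketch under my own assumptions, not the code in test.php. The `split_results` name and the sample URLs are invented for the example.

```python
def split_results(statuses):
    """Split crawl results into pages to validate and pages that failed.

    statuses is a dict of url -> HTTP status code (an assumed shape,
    chosen for this illustration). Returns (ok_pages, failures), where
    ok_pages is a sorted list of URLs that returned 200 and failures
    maps each other status code to a sorted list of URLs.
    """
    ok_pages = []
    failures = {}
    for url, status in statuses.items():
        if status == 200:
            ok_pages.append(url)
        else:
            failures.setdefault(status, []).append(url)
    ok_pages.sort()
    for urls in failures.values():
        urls.sort()
    return ok_pages, failures


# Made-up sample data for illustration only
sample = {
    "http://example.test/": 200,
    "http://example.test/about": 200,
    "http://example.test/old-page": 404,
    "http://example.test/broken": 500,
}
ok_pages, failures = split_results(sample)
print("Validate these:", ok_pages)
print("Report these:", failures)
```

Only the `ok_pages` list would go on to the HTML validator. The `failures` dictionary is what you would want surfaced in the build output.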

There's still plenty of work to do, mainly introducing the ability to supply your own W3C validator endpoint and introducing CSS validation. It'll come though, so keep an eye out!

The result of this is over at GitHub: http://github.com/robb1e/Validator

As a side note, to get frontend-test-suite as a submodule on Github I followed instructions from FND, thanks!
