PlebGUI: WebGUI Meets Plack

WebGUI is an Apache mod_perl application. Not just any mod_perl application; reputedly the most deployed mod_perl application on the planet. You’d be forgiven for thinking that we love Apache. And we do. Mostly.

But you see mod_perl is an overly zealous lover. Every intimate phase of the Apache request cycle is offered up to your eager Perl embrace. Sure, you have to learn a few new tricks to get decent performance (such as two-tier mod_proxy/mod_perl) but hey it’s 2001 and mod_perl is SO much better than CGI. Without hesitation you commit to a life of PerlResponseHandlers and Apache2::Const::OKs.

Years pass. Life is good.

Your user base has become very adept at deploying Apache. In fact you make life easier for them by distributing the complete Perl/Apache/MySQL stack as a simple installer, pre-configured for optimum performance.

Every now and then someone appears on the mailing list asking questions about WebGUI 5, a throw-back to the days when it used to be possible to deploy WebGUI on cheap shared hosting cPanel servers in CGI mode. You want to help, but WebGUI has become so powerful that CGI mode isn’t feasible anymore. That’s the price that was paid for evolving into an Enterprise-grade system. Developers lament the fact that small-time users can’t take advantage of all the awesome things WebGUI can do, not to mention all the word-of-mouth promotion WebGUI is missing out on, but them’s the breaks. WebGUI continues to grow. “Carrier-grade” is the new black. And across town, the non-enterprise crowd is left to content themselves with WordPress, Joomla and Drupal.

And then a guy called Tatsuhiko Miyagawa comes along.

He says, gee, look at these wonderful server abstractions that Python (WSGI) and Ruby (Rack) have. The Perl world might have moved on from CGI to things like Catalyst::Engine::* and HTTP::Engine, but there’s still duplicated effort everywhere. We can do better.

So he sits down and writes PSGI, an absurdly simple, manifestly beautiful specification for an interface between Perl web apps and web servers (drawing heavily on WSGI and Rack for inspiration).

PSGI

The idea goes like this: Web applications, when all is said and done, are really just on about sending three pieces of information to web browsers: a HTTP status code, a list of HTTP headers, and some content (a file or some text, normally HTML).

This is the specification:

[                                           # an array ref, containing..
    200,                                    # a HTTP status code
    [ 'Content-Type' => 'text/html', .. ],  # an array of HTTP headers
    [ '<html>...</html>' ],                 # the text content (or a $filehandle)
]

And that’s really all there is to it.

Some people write web apps according to the interface. Other people write server backends according to the interface. Web app developers use whatever whiz-bang technology they like, and as long as they return an array that complies with the spec their web app can run on any server. And server backend developers can do lots of clever things to get that information back to web browsers really fast, and have the results of their work benefit all PSGI/Plack consumers. No more duplicated effort. No more server-specific love lock-in.
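To make “complies with the spec” a little more concrete, here is a rough sketch of a checker for the three-element response. The function name looks_like_psgi_response is made up for illustration and is not part of Plack, and the checks are a simplification of what the full PSGI document requires.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper (not part of Plack): a simplified check that a value
# looks like the three-element PSGI response described above.
sub looks_like_psgi_response {
    my ($res) = @_;
    return 0 unless ref $res eq 'ARRAY' && @$res == 3;

    my ( $status, $headers, $body ) = @$res;

    # Status: a three-digit HTTP status code.
    return 0 unless defined $status && $status =~ /\A[1-9][0-9]{2}\z/;

    # Headers: an array ref of key/value pairs, so an even number of items.
    return 0 unless ref $headers eq 'ARRAY' && @$headers % 2 == 0;

    # Body: an array ref of strings, a filehandle, or an object with getline().
    return 1 if ref $body eq 'ARRAY';
    return 1 if ref $body eq 'GLOB';
    return 1 if blessed($body) && $body->can('getline');
    return 0;
}

my $good = [ 200, [ 'Content-Type' => 'text/html' ], ['<html>...</html>'] ];
my $bad  = [ 200, ['Content-Type'], ['missing header value'] ];

print looks_like_psgi_response($good) ? "ok\n" : "not ok\n";
print looks_like_psgi_response($bad)  ? "ok\n" : "not ok\n";
```

Anything that passes a check like this can, in principle, be handed to any server backend, which is the whole point.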

And folks took notice. Not just any folks either. Really, really smart people like Yuval Kogman, Stevan Little, Shawn Moore, Matt Trout, Jesse Vincent, Chia-liang Kao, Dave Rolsky, Simon Cozens, ..  (They’re some of the well-known names that jumped out at me from the PSGI.pod spec doc. The others are probably even smarter, stealth hackers. I know, it’s scary).

Plack

Before you could say Plack there was a reference implementation, server backend support for CGI, FastCGI, mod_perl (welcome back!), AnyEvent, Coro, Perlbal, Nginx, .. framework support for Catalyst, CGI-Application, Mason, Continuity, Maypole, Mojo .. and a whole suite of Middleware and Utilities.

Meaning that all of a sudden any web app written in one of those frameworks can now be deployed on any of those servers. Or on the pure-perl standalone server that runs from the command line. Or on one of the more experimental/exotic servers I haven’t listed. If you’re not excited yet, just wait until Google AppEngine appears in that list.

WebGUI

But where is WebGUI? The problem is that WebGUI is both a framework and a web app. A really big web app. With a mod_perl addiction. Frameworks are built with multiple servers in mind, so they generally already have an in-built server abstraction layer. Which makes adding PSGI support relatively simple. WebGUI, on the other hand, deliberately eschews an abstraction layer so that it can fully embrace mod_perl and eke out every last ounce of power and performance it can from Apache.

So, faced with extreme framework envy, I did what any reasonable person would do.

I built a PSGI/Plack layer for WebGUI.

PlebGUI

I’ve codenamed the project “PlebGUI”, which I think aptly describes the way it makes it possible for the little people to run WebGUI on low-cost shared hosting.

And it actually works. Take for instance plebgui.patspam.com, a demo PlebGUI site running in FastCGI mode on HostMonster (the prototypical low-cost shared webhost).

app.psgi

The second wondrously simple idea in the PSGI spec is that a web app is just a plain old Perl subroutine. Here’s one I prepared earlier:


sub { [ 200, [ 'Content-Type' => 'text/html' ], [ 'Hello World' ] ] }

I know, it’s almost insulting. I’m a web developer man! I do sophisticated things! But try putting that single-line sub into a test file called app.psgi. And then after you’ve installed Plack (it’s not on CPAN yet so you have to install it from miyagawa++’s git repo) try running this:


$ plackup
Accepting connections at http://0:8080/

Go on, visit that url in your web browser. Hello to you too!
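If you’re wondering what the server does with your sub, here is a toy sketch in plain Perl. It is not how Plack implements its backends; render_response is a made-up name, and the environment hash passed in is a bare-minimum stand-in for the one a real PSGI server builds from the incoming request.

```perl
use strict;
use warnings;

# The same one-line app as in app.psgi.
my $app = sub { [ 200, [ 'Content-Type' => 'text/html' ], ['Hello World'] ] };

# Toy stand-in for a server backend: call the app with a minimal
# environment hash and serialise the result as HTTP text.
sub render_response {
    my ( $app, $env ) = @_;
    my ( $status, $headers, $body ) = @{ $app->($env) };

    my @pairs = @$headers;
    my $out   = "HTTP/1.0 $status\r\n";    # reason phrase omitted for brevity
    while ( my ( $name, $value ) = splice @pairs, 0, 2 ) {
        $out .= "$name: $value\r\n";
    }
    return $out . "\r\n" . join '', @$body;
}

my $out = render_response( $app, { REQUEST_METHOD => 'GET', PATH_INFO => '/' } );
print $out;
```

The app never needs to know whether the thing calling it is a prefork server, FastCGI or a test script.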

Middleware

Ok that was a cute trick, let’s try something more exciting:


use Plack::Builder;
builder {
    add "Plack::Middleware::Static", path => qr/./, root => '/var/www';
    sub { [ 404, [ "Content-Type" => "text/plain" ], [ "Not Found" ] ] };
};

Congratulations, you just added your first Middleware. Assuming you have some static files located at /var/www, you’ll get the static files returned to your browser with the correct mimetype (thanks to Plack::Middleware::Static). Middleware just wraps your web app (a plain old Perl sub) with another plain old Perl sub. Middleware can do logging, pretty HTML stack traces, pre/post processing, or anything else you like. Simple, but immensely powerful.
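To see the “sub wrapping a sub” idea without any Plack machinery, here is a hand-rolled sketch. The add_header function and the X-Powered-By value are invented for illustration; real Plack middleware is normally packaged as a module, so treat this as the bare concept rather than the recommended way to write one.

```perl
use strict;
use warnings;

# Inner app: always answers 404, like the fallback in the builder example.
my $app = sub { [ 404, [ 'Content-Type' => 'text/plain' ], ['Not Found'] ] };

# Hypothetical hand-rolled middleware: takes an app, returns a new app that
# calls the original and then appends a header to whatever comes back.
sub add_header {
    my ( $inner, $name, $value ) = @_;
    return sub {
        my ($env) = @_;
        my $res = $inner->($env);
        push @{ $res->[1] }, $name => $value;
        return $res;
    };
}

my $wrapped = add_header( $app, 'X-Powered-By' => 'PlebGUI' );
my $res     = $wrapped->( { PATH_INFO => '/nothing-here' } );

print "$res->[0] @{ $res->[1] }\n";
```

Because the wrapper returns another sub with the same calling convention, wrappers can be stacked as deep as you like.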

Plackup

Plackup is a simple utility script that launches your web app with a specified server backend. By default it runs the pure-perl standalone development server. It expects your webapp to live in a file, similar to the ones we just created. Want to run your web app on another server backend? Try one of these:


plackup                                         # dev server with StackTrace and AccessLog enabled
plackup -s CGI                                  # remember how slow web apps used to be?
plackup -s AnyEvent                             # nonblocking
plackup -s Coro --port 9090                     # coroutines
plackup -s Standalone::Prefork --max-workers 20 # blazingly fast preforking ftw!

dev.localhost.localdomain.psgi

Here’s what the current per-site .psgi file looks like for PlebGUI:


use Plack::Builder;
use lib '/data/WebGUI/lib';
use WebGUI;

builder {

 # Populate $env from site.conf
 add 'Plack::Middleware::WebGUI',
   root => '/data/WebGUI',
   config => 'dev.localhost.localdomain.conf';

 # Handle /extras via Plack::Middleware::Static
 add 'Plack::Middleware::Static',
   path => qr{^/extras/},
   root => '/data/WebGUI/www';

 # Handle /uploads via Plack::Middleware::WGAccess (including .wgaccess)
 add 'Plack::Middleware::WGAccess',
   path     => qr{^/uploads/},
   root => '/data/domains/dev.localhost.localdomain/public';

 sub { WebGUI::handle_psgi(shift) };
}

What you can see there are 3 Middleware layers added in, one to set up the WebGUI site-specific environment, one to handle /extras static content, and one to handle /uploads static content (taking into account .wgaccess file permissions).

All of those plackup command variations above can be used to launch WebGUI outside of mod_perl. Prefer running inside of Apache? How about one of these:


<VirtualHost *:80>
 PerlOptions +Parent
 PerlSwitches -I/data/WebGUI/lib

 # CGI
 #AddHandler cgi-script cgi
 #ScriptAlias / /data/WebGUI/etc/dev.localhost.localdomain.cgi/
 #<Directory /data/WebGUI/etc>
 #   Options +ExecCGI
 #</Directory>

 # mod_perl
 #SetHandler perl-script
 #PerlHandler Plack::Server::Apache2
 #PerlSetVar psgi_app /data/WebGUI/etc/dev.localhost.localdomain.psgi

 # FastCGI
 FastCgiServer /data/WebGUI/etc/dev.localhost.localdomain.fcgi
 ScriptAlias / /data/WebGUI/etc/dev.localhost.localdomain.fcgi/

 # mod_psgi
 #<Location />
 #    SetHandler psgi
 #    PSGIApp /data/WebGUI/etc/dev.localhost.localdomain.psgi
 #</Location>

</VirtualHost>

Using those directives you can run WebGUI in CGI, mod_perl or FastCGI mode, or even under the in-development mod_psgi Apache module.

Benchmarks

Ok so how fast are these different backends? Let’s use ApacheBench to do some simple, unscientific tests of how many requests per second we can squeeze out of WebGUI.

First we’ll start with WebGUI in its original, un-plebified form, running on the WRE (more is better):

$ ab -n 1000 -c 10 -k http://dev.localhost.localdomain:8081/ | grep 'Requests per'
Requests per second:    122.77 [#/sec] (mean)

The result is of course completely dependent on your Apache configuration – in this case I have (StartServers, MinSpareServers, MaxSpareServers, MaxClients) = (5,5,10,20).

Ok, now have a look at these numbers:

$ ./ab.pl --app /data/WebGUI/etc/dev.localhost.localdomain.psgi
Testing implementations: AnyEvent, Standalone, Standalone::Prefork, ServerSimple, Coro
app: /data/WebGUI/etc/dev.localhost.localdomain.psgi
ab:  ab -n 1000 -c 10 -k
URL: http://127.0.0.1/

– server: AnyEvent
Accepting requests at http://0.0.0.0:10001/
Requests per second:    68.06 [#/sec] (mean)

– server: Standalone
Accepting connections at http://0:10001/
Requests per second:    64.92 [#/sec] (mean)

– server: Standalone::Prefork
Accepting connections at http://0:10001/
Requests per second:    214.54 [#/sec] (mean)

– server: ServerSimple
Plack::Server::ServerSimple: You can connect to your server at http://localhost:10001/
Requests per second:    66.43 [#/sec] (mean)

– server: Coro
2009/10/11-23:17:32 Plack::Server::Coro::Server (type Net::Server::Coro) starting! pid(1581)
Requests per second:    67.55 [#/sec] (mean)

Did you see that? Standalone::Prefork is almost twice as fast as WebGUI in the WRE! The pre-forking server’s max-workers setting defaults to 10, so the comparison might actually be fair too.
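For the record, here is the arithmetic behind that comparison, using only the two ApacheBench figures quoted above. The variable names are mine.

```perl
use strict;
use warnings;

# Requests per second reported by ApacheBench in the runs above.
my $wre     = 122.77;    # stock WebGUI on the WRE
my $prefork = 214.54;    # PlebGUI on Standalone::Prefork

my $ratio = $prefork / $wre;
printf "ratio: %.2fx, i.e. %.0f%% more requests per second\n",
    $ratio, ( $ratio - 1 ) * 100;
```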

Holy Shit

I think that’s worth repeating. PlebGUI, which is currently less than one week old and running on a compatibility layer optimised for “let’s just get this thing working and worry about performance later”, is already capable of out-performing the WRE in terms of raw speed by roughly 75% (214.54 versus 122.77 requests per second). If you believe the benchmarks.

But Plack/PSGI is not just about speed. It’s about flexibility. Think Koen de Jonge, WebGUI hosting extraordinaire at Procolix, deploying a high availability “follow the sun” WebGUI cluster on Nginx servers. Think Colin Kuskie, WebGUI test suite overlord, probing the dark corners of the WebGUI API using standard HTTP::Request and HTTP::Response pairs through mocked HTTP and live HTTP servers. Think web developers working on WebGUI client code, with a web server fully integrated into Padre. Think WebGUI entirely deployable from the CPAN.

Think of every new project added to the PSGI/Plack ecosystem as a potential new PlebGUI feature.

Where to Now

The approach I took in turning WebGUI into PlebGUI was to create a fake Apache2::Request object, since that’s the closest thing WebGUI has to a server abstraction layer. Plack contains two helper classes Plack::Request and Plack::Response that make this really easy. Currently though, that leaves us in the curious situation where WebGUI does all of its work thinking it’s talking to mod_perl, only to have its real output re-routed through the PSGI-compatibility layer, to be subsequently processed by a specific server backend. If you don’t get the joke yet, just think about what happens when the PSGI server backend happens to be mod_perl.
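Here is a stripped-down sketch of the fake-request idea in plain Perl, without Plack::Request or Plack::Response. It is not PlebGUI’s actual code: the FakeApacheRequest class, its finalize method and the legacy_handler function are invented for illustration, and the handful of method names are only chosen to resemble the sort of calls a mod_perl handler makes on its request object.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a mod_perl request object: it reads from the
# PSGI $env and buffers whatever the app "sends".
package FakeApacheRequest;

sub new {
    my ( $class, $env ) = @_;
    return bless { env => $env, status => 200, headers => [], body => [] }, $class;
}

sub uri    { $_[0]{env}{PATH_INFO} }
sub method { $_[0]{env}{REQUEST_METHOD} }
sub status { my $self = shift; $self->{status} = shift if @_; $self->{status} }

sub content_type {
    my ( $self, $type ) = @_;
    push @{ $self->{headers} }, 'Content-Type' => $type;
}

sub print { my $self = shift; push @{ $self->{body} }, @_; return 1 }

# Turn everything the handler did into a PSGI response.
sub finalize { my $self = shift; [ @{$self}{qw(status headers body)} ] }

package main;

# Legacy-style handler that believes it is talking to mod_perl.
sub legacy_handler {
    my ($r) = @_;
    $r->content_type('text/html');
    $r->print( 'You asked for ', $r->uri );
}

my $psgi_app = sub {
    my $r = FakeApacheRequest->new(shift);
    legacy_handler($r);
    return $r->finalize;
};

my $res = $psgi_app->( { REQUEST_METHOD => 'GET', PATH_INFO => '/home' } );
print $res->[0], ' ', join( '', @{ $res->[2] } ), "\n";
```

The handler is untouched; only the object it is handed has changed, which is what makes this approach such a quick way to get an existing code base onto PSGI.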

The benchmarks would clearly improve if we ripped out mod_perl altogether. Lots of code would simplify too. Certain parts of WebGUI could disappear altogether, such as URL Handlers which could be entirely replaced with Middleware.

The only feature that PlebGUI currently lacks is content streaming. I deliberately left that out since it will be a lot easier to achieve once mod_perl disappears. In Plack land the way to drip-feed browsers with “chunked” content is to return an IO::Handle-like object that responds to getline() and close(). The Plack folks are planning a fancy new module called IO::Writer that will Do The Right Thing under both blocking and non-blocking servers. Expect awesome things.
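As a rough illustration of the getline()/close() contract, here is a sketch with a made-up DripFeed class standing in for the body object, followed by the sort of loop a blocking server backend might run over it. None of this is Plack code, and a non-blocking server would drive the object differently.

```perl
use strict;
use warnings;

# Hypothetical body object: hands out one chunk per getline() call and
# returns undef when there is nothing left, like reading from a filehandle.
package DripFeed;

sub new {
    my ( $class, @chunks ) = @_;
    return bless { chunks => [@chunks], closed => 0 }, $class;
}
sub getline { shift @{ $_[0]{chunks} } }
sub close   { $_[0]{closed} = 1 }

package main;

my $app = sub {
    my $body = DripFeed->new( "first chunk\n", "second chunk\n", "done\n" );
    return [ 200, [ 'Content-Type' => 'text/plain' ], $body ];
};

# What a blocking server backend would do with such a body: pull chunks
# until getline() returns undef, then close the object.
my $res  = $app->( {} );
my $body = $res->[2];
my @sent;
while ( defined( my $chunk = $body->getline ) ) {
    push @sent, $chunk;    # a real server would write this to the socket
}
$body->close;

print @sent;
```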

Padre::Plugin::WebGUI Tab Icons

Version 0.03 of Padre::Plugin::WebGUI (my WebGUI plugin for the Padre Perl IDE) is on its way to the CPAN, with one small but rather nice feature: tab icons. Just as the plugin uses WebGUI’s native icons to display the Asset Tree for your site, you now get the same attractive icon in the tab that opens up when you double-click to edit something. Apart from the aesthetics, it’s handy to know at a glance what sort of asset you’re editing (WebGUI uses the same trick for the edit icons on your site when you turn admin mode on). I was hoping to be able to add hover-text too (in case you don’t recognise the Asset type from its icon) but thus far I haven’t figured out if wxPerl and/or Padre supports it.

The following screenshot shows a nice example of several tabs open at once with different WebGUI Asset types displaying their own icons.

[Screenshot: Tab_Icons]

Padre::Plugin::WebGUI now with remote editing

Up until now the WebGUI Padre plugin I’ve been working on has been more or less a proof-of-concept / toy / test-bed for experimental ideas. Whilst that has been fun (and made for good conference slides), the reality is that I haven’t actually been using the plugin for my daily WebGUI development, so how could I expect other WebGUI developers to do so?

I’m hoping to change that. As of today I’ve stripped out all the experimental features and concentrated on the single feature that seems to have the widest possible appeal, namely, remote editing.

Version 0.02 of Padre::Plugin::WebGUI, which is on its way to the CPAN, allows you to connect over HTTP to your WebGUI site, view your asset tree and edit your assets from inside the Padre text editor. That hopefully makes it as useful for designers and content managers as it is for developers.

Previously the plugin got most of its power from the WGDev library. However this meant that you had to install WebGUI and WGDev as a prerequisite. Given that the WRE ships with non-threaded perl, this meant either installing these prereqs into your system perl or building another perl specifically to run them.  Not an overly arduous task, but enough of a hurdle to dissuade most WebGUI developers from playing with the plugin. That obstacle has now been removed. You do not need to have WebGUI or WGDev installed to run the plugin.

Using WGDev also meant that you could only connect locally to your site. That restriction is now gone.

Installation

Firstly, install Padre.

To allow your WebGUI site to talk to the plugin, add this content handler (WebGUI::Content::Padre) to your site and restart mod_perl. The easiest way to do this is to drop the module into your /data/WebGUI/lib/WebGUI/Content dir and add the new content handler to your site config file. There is also an install script that will do the site config editing for you.

Next, install the plugin from the CPAN:


cpan Padre::Plugin::WebGUI

Usage

Inside Padre, go to the Plugin Manager and enable the WebGUI plugin. You should then see a WebGUI item appear in your Plugins menu.

From the WebGUI menu, tick “Show Asset Tree” and double-click on “Connect” in the new panel that appears.

[Screenshot: connect]

You will be prompted for the URL of the site you want to connect to. Currently you need to enter the site in the form: http://username:password@site.com (later we will add a site manager to make this nicer). Remember that this can be any WebGUI site that you have access to, not just a local site.

[Screenshot: connect2]

You will then see the Asset Tree dynamically populated with information from your site.

[Screenshot: asset_tree]

Try double-clicking on an article.

[Screenshot: getting_started]

A new file tab should open with the name set to the name of your article (with [wg] prepended). The article content will be loaded into the tab, with Padre’s HTML syntax highlighting in effect. If you have Padre::Plugin::HTML installed, you can Tidy/Lint/Validate the HTML to your heart’s content. And every time you hit Save, your changes will be pushed up to the site as a new Version Tag.

[Screenshot: look_ma]

Try double-clicking on another sort of asset, for instance a Template. Once again, the template html will be loaded into a new tab with nice syntax highlighting.

[Screenshot: template]

And for fun, try opening up a CSS Snippet. The plugin will detect that your snippet is set to the ‘text/css’ mime type and use CSS code highlighting instead of HTML.

[Screenshot: css]

And similarly for JavaScript Snippets.

[Screenshot: javascript]

Design and Extensions

As far as Padre is concerned the file you are editing is just a regular file on your file system. For example when you edit the text, an asterisk (*) appears next to the name of the file indicating that you haven’t saved it yet. The integration between the WebGUI content handler and the plugin is minimal, meaning that you could conceivably use this plugin to do remote editing on non-WebGUI sites too.

Padre has a nice architectural design, which makes it really easy for us to associate different WebGUI::Asset types with Padre::Document types. Currently most assets delegate to the default Padre::Document::WebGUI::Asset module which does HTML syntax highlighting and expects only generic content from the content handler running on your site. Padre::Document::WebGUI::Snippet has a slightly higher IQ level in that it asks the server to tell it what mimetype the asset is set to, so that it can use the appropriate syntax highlighter.
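The Snippet behaviour boils down to a lookup with a fallback. The sketch below is purely illustrative: the highlighter labels and the highlighter_for_mime function are my own placeholders, not Padre’s internal identifiers or the plugin’s real code.

```perl
use strict;
use warnings;

# Illustrative mapping only: these labels are assumptions, not Padre's
# internal identifiers.
my %highlighter_for = (
    'text/html'       => 'html',
    'text/css'        => 'css',
    'text/javascript' => 'javascript',
);

# Fall back to HTML, as the default asset document type does.
sub highlighter_for_mime {
    my ($mime) = @_;
    return $highlighter_for{ $mime // '' } // 'html';
}

print highlighter_for_mime('text/css'), "\n";
print highlighter_for_mime(undef),      "\n";
```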

We could do some really interesting things by building asset-specific Padre::Document modules. For instance you could build a document type that uses fancy wxWidgets controls to do things that aren’t possible on the web (desktop integration etc.). And it’s just as easy to build these for your own custom asset types as it is for core ones. Given that Padre is cross-platform, you’re basically looking at a way of extending WebGUI interactions out into desktop-land.

If you’re a WebGUI designer/developer/content manager and you want to start playing with Padre::Plugin::WebGUI as a novel way of editing content, come find me on #webgui or #padre.

Finding missing WebGUI Templates

Following on from my previous WGDev post, here’s another example of wgd ls that lets you find missing Templates on a WebGUI site.

One problem that content managers run into is that after deleting a template in WebGUI, any assets that still refer to the template will die a horrible death when users try to view them. What would be nice is an automated way of finding all such assets that refer to a missing template, so that you can update them to refer to a different template.

For starters, WGDev can give you a list of all available templates on your site via:


wgd ls root -r --format %assetId% --includeOnlyClass WebGUI::Asset::Template | sort | uniq > available

You can also ask for a list of all in-use templateIds, via:


wgd ls root -r --format %templateId% --filter '%templateId% ~~ /.+/' | sort | uniq > in_use

All you need to do then is ask the comm command to tell you which templateIds are in_use but not available:


comm -23 in_use available

Putting it all together on one line (without the temporary files), that becomes:


comm -23 <(wgd ls root -rf %templateId% --filter '%templateId% ~~ /.+/' | sort | uniq) <(wgd ls root -rf %assetId% --includeOnlyClass WebGUI::Asset::Template | sort | uniq)

The output of this command is a nice list of missing templateIds.
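If the comm -23 step looks like magic, it is just a set difference: IDs that are in use but not available. Here is the same logic as a Perl sketch on made-up data; the templateA/B/C IDs are placeholders I invented, and in practice the two lists would come from the wgd ls commands above.

```perl
use strict;
use warnings;

# Made-up sample data; only the long ID is one used as an example in this post.
my @available = qw(templateA templateB templateC);
my @in_use    = qw(templateA templateC eHMsLCMqoXYpYrsZ7CWUtg templateA);

# Equivalent of `sort | uniq` followed by `comm -23 in_use available`:
# keep each in-use ID once, and only if it is not an available template.
my %is_available = map { $_ => 1 } @available;
my %seen;
my @missing = sort { $a cmp $b }
    grep { !$is_available{$_} && !$seen{$_}++ } @in_use;

print "$_\n" for @missing;
```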

Of course, what you probably then want to do is ask WGDev to give you a list of asset urls that are trying to use these missing Templates. For example, if one item in your list is eHMsLCMqoXYpYrsZ7CWUtg, you would do:


wgd ls root -rf %url% --filter '%templateId% ~~ eHMsLCMqoXYpYrsZ7CWUtg'

Which would print something like


/my/broken/asset/url

Combining this with the previous command to do it all on one line is left as an exercise for the reader ;)

Now, templateId isn’t the only Asset field that templates can be referenced under. For example, if you want to also find missing Style and Printable Style templates, you’d want to do something like:


for tmpl in templateId styleTemplateId printableStyleTemplateId
do
    comm -23 \
        <(wgd ls root -rf %${tmpl}% --filter "%${tmpl}% ~~ /.+/" | sort | uniq) \
        <(wgd ls root -rf %assetId% --includeOnlyClass WebGUI::Asset::Template | sort | uniq)
done

It’s not optimised for speed, but it sure is handy. And you could apply the same approach to search for invalid group permissions, etc. Of course, if you find yourself doing this often you’re probably better off writing a dedicated WGDev plugin to wrap up the functionality.

There’s a WebGUI branch on GitHub containing some extra Template Usage-related functions – the plan is to add the ability to find all the different ways that templates are referenced (in Asset definitions, Account module definitions, …) so perhaps we will end up adding a “find missing Templates” button to the Edit Template page.
