PlebGUI: WebGUI Meets Plack

WebGUI is an Apache mod_perl application. Not just any mod_perl application; reputedly the most deployed mod_perl application on the planet. You’d be forgiven for thinking that we love Apache. And we do. Mostly.

But you see, mod_perl is an overly zealous lover. Every intimate phase of the Apache request cycle is offered up to your eager Perl embrace. Sure, you have to learn a few new tricks to get decent performance (such as two-tier mod_proxy/mod_perl), but hey, it’s 2001 and mod_perl is SO much better than CGI. Without hesitation you commit to a life of PerlResponseHandlers and Apache2::Const::OKs.

Years pass. Life is good.

Your user base has become very adept at deploying Apache. In fact you make life easier for them by distributing the complete Perl/Apache/MySQL stack as a simple installer, pre-configured for optimum performance.

Every now and then someone appears on the mailing list asking questions about WebGUI 5, a throw-back to the days when it used to be possible to deploy WebGUI on cheap shared hosting cPanel servers in CGI mode. You want to help, but WebGUI has become so powerful that CGI mode isn’t feasible anymore. That’s the price that was paid for evolving into an Enterprise-grade system. Developers lament the fact that small-time users can’t take advantage of all the awesome things WebGUI can do, not to mention all the word-of-mouth promotion WebGUI is missing out on, but them’s the breaks. WebGUI continues to grow. “Carrier-grade” is the new black. And across town, the non-enterprise crowd is left to content themselves with WordPress, Joomla and Drupal.

And then a guy called Tatsuhiko Miyagawa comes along.

He says, gee, look at these wonderful server abstractions that Python (WSGI) and Ruby (Rack) have. The Perl world might have moved on from CGI to things like Catalyst::Engine::* and HTTP::Engine, but there’s still duplicated effort everywhere. We can do better.

So he sits down and writes PSGI, an absurdly simple, manifestly beautiful specification for an interface between Perl web apps and web servers (drawing heavily on WSGI and Rack for inspiration).

PSGI

The idea goes like this: Web applications, when all is said and done, are really just on about sending three pieces of information to web browsers: a HTTP status code, a list of HTTP headers, and some content (a file or some text, normally HTML).

This is the specification:

[                                           # an array ref, containing..
    200,                                    # a HTTP status code
    [ 'Content-Type' => 'text/html', .. ],  # an array of HTTP headers
    [ '<html>...</html>' ],                 # the text content (or a $filehandle)
]

And that’s really all there is to it.

Some people write web apps according to the interface. Other people write server backends according to the interface. Web app developers use whatever whiz-bang technology they like, and as long as they return an array that complies with the spec their web app can run on any server. And server backend developers can do lots of clever things to get that information back to web browsers really fast, and have the results of their work benefit all PSGI/Plack consumers. No more duplicated effort. No more server-specific love lock-in.

And folks took notice. Not just any folks either. Really, really smart people like Yuval Kogman, Stevan Little, Shawn Moore, Matt Trout, Jesse Vincent, Chia-liang Kao, Dave Rolsky, Simon Cozens, ..  (They’re some of the well-known names that jumped out at me from the PSGI.pod spec doc. The others are probably even smarter, stealth hackers. I know, it’s scary).

Plack

Before you could say Plack there was a reference implementation, server backend support for CGI, FastCGI, mod_perl (welcome back!), AnyEvent, Coro, Perlbal, Nginx, .. framework support for Catalyst, CGI-Application, Mason, Continuity, Maypole, Mojo .. and a whole suite of Middleware and Utilities.

Meaning that all of a sudden any web app written in one of those frameworks can now be deployed on any of those servers. Or on the pure-perl standalone server that runs from the command line. Or on one of the more experimental/exotic servers I haven’t listed. If you’re not excited yet, just wait until Google AppEngine appears in that list.

WebGUI

But where is WebGUI? The problem is that WebGUI is both a framework and a web app. A really big web app. With a mod_perl addiction. Frameworks are built with multiple servers in mind, so they generally already have an in-built server abstraction layer. Which makes adding PSGI support relatively simple. WebGUI, on the other hand, deliberately eschews an abstraction layer so that it can fully embrace mod_perl and eke out every last ounce of power and performance it can from Apache.

So, faced with extreme framework envy, I did what any reasonable person would do.

I built a PSGI/Plack layer for WebGUI.

PlebGUI

I’ve codenamed the project “PlebGUI”, which I think aptly describes the way it makes it possible for the little people to run WebGUI on low-cost shared hosting.

And it actually works. Take for instance plebgui.patspam.com, a demo PlebGUI site running in FastCGI mode on HostMonster (the prototypical low-cost shared webhost).

app.psgi

The second wondrously simple idea in the PSGI spec is that a web app is just a plain old Perl subroutine. Here’s one I prepared earlier:


sub { [ 200, [ 'Content-Type' => 'text/html' ], [ 'Hello World' ] ] }

I know, it’s almost insulting. I’m a web developer, man! I do sophisticated things! But try putting that single-line sub into a test file called app.psgi. And then after you’ve installed Plack (it’s not on CPAN yet so you have to install it from miyagawa++’s git repo) try running this:


$ plackup
Accepting connections at http://0:8080/

Go on, visit that url in your web browser. Hello to you too!
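If you’re wondering what the server is doing on the other side of that sub, here’s a rough sketch in plain Perl (no Plack required). It is not how plackup is actually implemented; run_once is a made-up helper, and the two-key $env hash is a drastically cut-down stand-in for the real PSGI environment. It only shows the shape of the contract: the server hands the app a hash describing the request, the app hands back the three-element array, and the server turns that into bytes.

```perl
use strict;
use warnings;

# The hello-world app, extended slightly to read the request path from $env.
my $app = sub {
    my $env = shift;
    return [ 200, [ 'Content-Type' => 'text/html' ], [ "Hello $env->{PATH_INFO}" ] ];
};

# Hypothetical helper: call the app once and serialise its response the way
# a very naive server backend might. Only array-ref bodies are handled, and
# the status line carries no reason phrase.
sub run_once {
    my ($app, $env) = @_;
    my ($status, $headers, $body) = @{ $app->($env) };

    my $out = "HTTP/1.0 $status\r\n";
    my @pairs = @$headers;                  # flat list of name/value pairs
    while (my ($name, $value) = splice @pairs, 0, 2) {
        $out .= "$name: $value\r\n";
    }
    return $out . "\r\n" . join('', @$body);
}

print run_once($app, { REQUEST_METHOD => 'GET', PATH_INFO => '/world' }), "\n";
```

Swap in a different run_once (one that talks FastCGI, say) and the app doesn’t need to change. That’s the whole trick.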

Middleware

Ok that was a cute trick, let’s try something more exciting:


use Plack::Builder;
builder {
    add "Plack::Middleware::Static", path => qr/./, root => '/var/www';
    sub { [ 404, [ "Content-Type" => "text/plain" ], [ "Not Found" ] ] };
};

Congratulations, you just added your first Middleware. Assuming you have some static files located at /var/www, you’ll get the static files returned to your browser with the correct MIME type (thanks to Plack::Middleware::Static). Middleware just wraps your web app (a plain old Perl sub) with another plain old Perl sub. Middleware can do logging, pretty HTML stack traces, pre/post processing, or anything else you like. Simple, but immensely powerful.
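To make the “sub wrapping a sub” idea concrete, here’s a hand-rolled sketch that doesn’t use Plack at all. The add_header_middleware function and the X-Wrapped-By header are invented for illustration; real Plack middleware is packaged as classes, but the underlying move is the same.

```perl
use strict;
use warnings;

# Innermost app: always answers 404, like the fallback sub in the builder example.
my $app = sub { [ 404, [ 'Content-Type' => 'text/plain' ], [ 'Not Found' ] ] };

# Hypothetical middleware: takes an app and returns a new app that appends
# one extra header to whatever the inner app returns.
sub add_header_middleware {
    my ($inner, $name, $value) = @_;
    return sub {
        my $env = shift;
        my $res = $inner->($env);               # call the wrapped app
        push @{ $res->[1] }, $name => $value;   # post-process its response
        return $res;
    };
}

my $wrapped = add_header_middleware($app, 'X-Wrapped-By' => 'demo');
my $response = $wrapped->({ REQUEST_METHOD => 'GET', PATH_INFO => '/' });
print join(', ', @{ $response->[1] }), "\n";
```

Because the result of wrapping is itself just an app, you can wrap it again, which is all that stacking Middleware amounts to.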

Plackup

Plackup is a simple utility script that launches your web app with a specified server backend. By default it runs the pure-perl standalone development server. It expects your webapp to live in a file, similar to the ones we just created. Want to run your web app on another server backend? Try one of these:


plackup                                         # dev server with StackTrace and AccessLog enabled
plackup -s CGI                                  # remember how slow web apps used to be?
plackup -s AnyEvent                             # nonblocking
plackup -s Coro --port 9090                     # coroutines
plackup -s Standalone::Prefork --max-workers 20 # blazingly fast preforking ftw!

dev.localhost.localdomain.psgi

Here’s what the current per-site .psgi file looks like for PlebGUI:


use Plack::Builder;
use lib '/data/WebGUI/lib';
use WebGUI;

builder {

 # Populate $env from site.conf
 add 'Plack::Middleware::WebGUI',
   root => '/data/WebGUI',
   config => 'dev.localhost.localdomain.conf';

 # Handle /extras via Plack::Middleware::Static
 add 'Plack::Middleware::Static',
   path => qr{^/extras/},
   root => '/data/WebGUI/www';

 # Handle /uploads via Plack::Middleware::WGAccess (including .wgaccess)
 add 'Plack::Middleware::WGAccess',
   path => qr{^/uploads/},
   root => '/data/domains/dev.localhost.localdomain/public';

 sub { WebGUI::handle_psgi(shift) };
}

What you can see there are three Middleware layers: one to set up the WebGUI site-specific environment, one to handle /extras static content, and one to handle /uploads static content (taking into account .wgaccess file permissions).

All of those plackup command variations above can be used to launch WebGUI outside of mod_perl. Prefer running inside of Apache? How about one of these:


<VirtualHost *:80>
 PerlOptions +Parent
 PerlSwitches -I/data/WebGUI/lib

 # CGI
 #AddHandler cgi-script cgi
 #ScriptAlias / /data/WebGUI/etc/dev.localhost.localdomain.cgi/
 #<Directory /data/WebGUI/etc>
 #   Options +ExecCGI
 #</Directory>

 # mod_perl
 #SetHandler perl-script
 #PerlHandler Plack::Server::Apache2
 #PerlSetVar psgi_app /data/WebGUI/etc/dev.localhost.localdomain.psgi

 # FastCGI
 FastCgiServer /data/WebGUI/etc/dev.localhost.localdomain.fcgi
 ScriptAlias / /data/WebGUI/etc/dev.localhost.localdomain.fcgi/

 # mod_psgi
 #<Location />
 #    SetHandler psgi
 #    PSGIApp /data/WebGUI/etc/dev.localhost.localdomain.psgi
 #</Location>

</VirtualHost>

Using those directives you can run WebGUI in CGI, mod_perl or FastCGI mode, or even under the in-development mod_psgi Apache module.

Benchmarks

Ok so how fast are these different backends? Let’s use ApacheBench to do some simple, unscientific tests of how many requests per second we can squeeze out of WebGUI.

First we’ll start with WebGUI in its original, un-plebified form, running on the WRE (more is better):

$ ab -n 1000 -c 10 -k http://dev.localhost.localdomain:8081/ | grep 'Requests per'
Requests per second:    122.77 [#/sec] (mean)

The result is of course completely dependent on your Apache configuration – in this case I have (StartServers, MinSpareServers, MaxSpareServers, MaxClients) = (5,5,10,20).

Ok, now have a look at these numbers:

$ ./ab.pl --app /data/WebGUI/etc/dev.localhost.localdomain.psgi
Testing implementations: AnyEvent, Standalone, Standalone::Prefork, ServerSimple, Coro
app: /data/WebGUI/etc/dev.localhost.localdomain.psgi
ab:  ab -n 1000 -c 10 -k
URL: http://127.0.0.1/

-- server: AnyEvent
Accepting requests at http://0.0.0.0:10001/
Requests per second:    68.06 [#/sec] (mean)

-- server: Standalone
Accepting connections at http://0:10001/
Requests per second:    64.92 [#/sec] (mean)

-- server: Standalone::Prefork
Accepting connections at http://0:10001/
Requests per second:    214.54 [#/sec] (mean)

-- server: ServerSimple
Plack::Server::ServerSimple: You can connect to your server at http://localhost:10001/
Requests per second:    66.43 [#/sec] (mean)

-- server: Coro
2009/10/11-23:17:32 Plack::Server::Coro::Server (type Net::Server::Coro) starting! pid(1581)
Requests per second:    67.55 [#/sec] (mean)

Did you see that? Standalone::Prefork is almost twice as fast as WebGUI in the WRE! The pre-forking server’s max-workers setting defaults to 10, so the comparison might actually be fair too.
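For anyone who wants the ratios rather than the raw numbers, here’s a throwaway script that divides each backend’s result by the WRE baseline. The figures are copied straight from the runs above; the speedup helper is just a name I made up for the division.

```perl
use strict;
use warnings;

my $baseline = 122.77;    # WebGUI on the WRE, requests per second

# Requests per second for each PSGI backend, from the ab.pl run above.
my @results = (
    [ 'AnyEvent'            => 68.06  ],
    [ 'Standalone'          => 64.92  ],
    [ 'Standalone::Prefork' => 214.54 ],
    [ 'ServerSimple'        => 66.43  ],
    [ 'Coro'                => 67.55  ],
);

# Ratio of a backend's requests/sec to the WRE baseline.
sub speedup {
    my ($rps) = @_;
    return $rps / $baseline;
}

printf "%-20s %7.2f req/s  %.2fx\n", $_->[0], $_->[1], speedup($_->[1]) for @results;
```

Standalone::Prefork comes out at about 1.75x the baseline, and the single-process backends at a little over half of it, which is what you’d expect when one worker is up against a pool of Apache children.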

Holy Shit

I think that’s worth repeating. PlebGUI, which is currently less than one week old and running on a compatibility layer optimised for “let’s just get this thing working and worry about performance later” is already capable of out-performing the WRE in terms of raw speed by roughly 75% (214.54 vs 122.77 requests per second). (If you believe the benchmarks).

But Plack/PSGI is not just about speed. It’s about flexibility. Think Koen de Jonge, WebGUI hosting extraordinaire at Procolix, deploying a high availability “follow the sun” WebGUI cluster on Nginx servers. Think Colin Kuskie, WebGUI test suite overlord, probing the dark corners of the WebGUI API using standard HTTP::Request and HTTP::Response pairs through mocked HTTP and live HTTP servers. Think web developers working on WebGUI client code, with a web server fully integrated into Padre. Think WebGUI entirely deployable from the CPAN.

Think of every new project added to the PSGI/Plack ecosystem as a potential new PlebGUI feature.

Where to Now

The approach I took in turning WebGUI into PlebGUI was to create a fake Apache2::Request object, since that’s the closest thing WebGUI has to a server abstraction layer. Plack contains two helper classes, Plack::Request and Plack::Response, that make this really easy. Currently though, that leaves us in the curious situation where WebGUI does all of its work thinking it’s talking to mod_perl, only to have its real output re-routed through the PSGI-compatibility layer, to be subsequently processed by a specific server backend. If you don’t get the joke yet, just think about what happens when the PSGI server backend happens to be mod_perl.
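To give a feel for the fake-request trick, here’s a heavily simplified sketch. It is not the PlebGUI code: the real layer leans on Plack::Request and Plack::Response, whereas this version is plain Perl so it runs anywhere, and the FakeRequest class, its finalize method and legacy_handler are all hypothetical names chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical shim: looks a little like a mod_perl request object to the
# application, but reads from a PSGI env and collects output for a PSGI response.
package FakeRequest;

sub new {
    my ($class, $env) = @_;
    return bless { env => $env, status => 200, headers => [], body => [] }, $class;
}
sub uri          { $_[0]{env}{PATH_INFO} }
sub content_type { my ($self, $type) = @_; push @{ $self->{headers} }, 'Content-Type' => $type }
sub status       { my $self = shift; $self->{status} = shift if @_; $self->{status} }
sub print        { my $self = shift; push @{ $self->{body} }, @_ }
sub finalize     { my $self = shift; [ @$self{qw(status headers body)} ] }

package main;

# Stand-in for legacy code written against the mod_perl-style interface.
sub legacy_handler {
    my $r = shift;
    $r->content_type('text/html');
    $r->print('You asked for ', $r->uri);
}

# The PSGI app: build the shim, let the legacy code run, return the triplet.
my $app = sub {
    my $r = FakeRequest->new(shift);
    legacy_handler($r);
    return $r->finalize;
};

print join('', @{ $app->({ PATH_INFO => '/home' })->[2] }), "\n";
```

The legacy handler never finds out it isn’t talking to Apache, which is exactly the joke.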

The benchmarks would clearly improve if we ripped out mod_perl altogether. Lots of code would simplify too. Certain parts of WebGUI could disappear altogether, such as URL Handlers which could be entirely replaced with Middleware.

The only feature that PlebGUI currently lacks is content streaming. I deliberately left that out since it will be a lot easier to achieve once mod_perl disappears. In Plack land the way to drip-feed browsers with “chunked” content is to return an IO::Handle-like object that responds to getline() and close(). The Plack folks are planning a fancy new module called IO::Writer that will Do The Right Thing under both blocking and non-blocking servers. Expect awesome things.
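Here’s roughly what that getline()/close() arrangement looks like, sketched in plain Perl. ChunkedBody and drain are invented names, and a real server would write each chunk to the socket rather than collect it in an array; the point is just how little an object needs to do to qualify as a response body.

```perl
use strict;
use warnings;

# Hypothetical body object: hands out one chunk per getline() call and
# returns undef once the chunks run out.
package ChunkedBody;

sub new     { my ($class, @chunks) = @_; return bless { chunks => [@chunks], closed => 0 }, $class }
sub getline { my $self = shift; return shift @{ $self->{chunks} } }
sub close   { my $self = shift; $self->{closed} = 1; return 1 }

package main;

# An app whose body is an object instead of an array ref.
my $app = sub {
    my $env = shift;
    return [ 200, [ 'Content-Type' => 'text/plain' ], ChunkedBody->new("one\n", "two\n", "three\n") ];
};

# What a server backend would do with such a body: pull until undef, then close.
sub drain {
    my $body = shift;
    my @out;
    while (defined(my $chunk = $body->getline)) {
        push @out, $chunk;    # a real server would send this to the browser
    }
    $body->close;
    return @out;
}

print join('', drain($app->({})->[2]));
```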

13 thoughts on “PlebGUI: WebGUI Meets Plack”

  1. Hey Patrick,

    Awesome post! I heard about WSGI last year and lamented the fact that it was a Python-only implementation as it offered some very savvy features. I’m psyched about PSGI and look forward to digging in further!

    Thanks for sharing,
    William

  2. Patrick, this is so sexy it hurts. Here’s hoping we can get this into core (for WebGUI 8 maybe?)

  3. Hi Patrick,

    I can’t say I understand all of it, but the implications are amazing! I never thought I would type these three letters, but… OMG!

    Rogier

  4. Minor updates to the technical details in this post as of Oct. 17 2009:

    - “add” in .psgi builder is renamed to “enable”. ‘add’ will continue working for the moment.
    - IO::Writer idea is abandoned and we explicitly require callback style response if you want to delay response or stream content: more on http://bulknews.typepad.com/blog/2009/10/psgiplack-streaming-is-now-complete.html
    - PSGI and Plack are uploaded to CPAN. So you can install them from the CPAN shell :)
