[wp-hackers] GSoC Proposal: Integrate WP-cache / WP Super Cache into WordPress

Andy Skelton skeltoac at gmail.com
Sat Mar 1 16:42:20 GMT 2008


On Fri, Feb 29, 2008 at 12:17 AM, Ronald Heft <ron at cavemonkey50.com> wrote:
>  excellent opportunity to start fresh with a new caching system. If anyone
>  has any examples of a PHP-based caching system they feel is better than
>  WP Super Cache, I would love to see it.

I've done a lot of work on the caching systems that WordPress.com
uses, severely hacking on Ryan's memcached client, and I wrote the
HTML cache that helps burst-traffic event blogs like live.gizmodo.com
stay up when most others fail.

I don't know of anything better than Donncha's Super Cache for what it
does. My HTML cache (I named it supercache before Donncha released his
Super Cache, but I never released mine so he won the name recognition
in the end) is based on memcached and that makes it a poor fit for the
mainstream right now. That's not because memcached is hard to set up
(it's not) or hard to code for (we have working clients) but because
shared hosting setups typically don't allow it.

Some form of worthwhile persistent caching should work in the default
install with the minimum requirements. Obviously we aren't there yet.
If we keep the requirements as they are, we will have to put the cache
in the database. Caching in the database would only be worthwhile if
we cached very high-level objects, the highest level being fully
rendered pages. Obviously this would not be as good as an APC object
cache, but for the minimum install it could do wonders.

What I learned from my experiences with WordPress and caching is that
there are more variables than you can hope to account for. (You can
take the WordPress out of the dynamic but you can't take the dynamic
out of WordPress?) You can get a lot of mileage from caching at any
level (queries, values, objects, rendered HTML elements, entire pages)
but the higher up that stack you go, the harder it gets because there
are so many factors contributing to the rendered page.

Back to my supercache: it's painfully simple and extremely effective
at serving a page to new visitors. It allows pages to be freshly
rendered for cookie holders or whatever criteria you care to configure
for identifying visitors who require it. The whole thing is about 200
lines of PHP including comments. Using a similar test for cache
eligibility, a db-backed HTML cache could extend the operational
availability of sites that experience surges of new visitors to single
posts (a typical Digg scenario). This would have to be significantly
more complex than my supercache because memcached handles expiration
and eviction for us, whereas a db-backed cache would have to implement
even the basic mechanisms of expiration and cleanup itself. However, if
we are only caching entire pages, this can be pretty easy.
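
To make the idea concrete, here is a rough sketch of a db-backed
full-page cache with an eligibility test, expiration, and cleanup. It
is not the supercache described above and not WordPress code: it is
written in Python with SQLite standing in for the blog's database, and
every name in it (is_cacheable, PageCache, serve, the page_cache table,
the 300-second default TTL) is a hypothetical chosen for illustration.
The eligibility rule, "cookie-less GET requests only", is likewise just
one assumed criterion.

```python
import sqlite3
import time


def is_cacheable(method, cookies):
    """Hypothetical eligibility test: only cookie-less GET requests."""
    return method == "GET" and not cookies


class PageCache:
    """Stores fully rendered pages in a database table with an expiry time."""

    def __init__(self, conn, ttl=300):
        self.conn = conn
        self.ttl = ttl  # seconds a cached page stays fresh (assumed default)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS page_cache ("
            "url TEXT PRIMARY KEY, html TEXT NOT NULL, expires REAL NOT NULL)"
        )

    def get(self, url, now=None):
        now = time.time() if now is None else now
        row = self.conn.execute(
            "SELECT html FROM page_cache WHERE url = ? AND expires > ?",
            (url, now),
        ).fetchone()
        return row[0] if row else None

    def put(self, url, html, now=None):
        now = time.time() if now is None else now
        self.conn.execute(
            "INSERT OR REPLACE INTO page_cache (url, html, expires) "
            "VALUES (?, ?, ?)",
            (url, html, now + self.ttl),
        )
        self.conn.commit()

    def cleanup(self, now=None):
        """Delete expired rows; returns how many were removed."""
        now = time.time() if now is None else now
        cur = self.conn.execute(
            "DELETE FROM page_cache WHERE expires <= ?", (now,)
        )
        self.conn.commit()
        return cur.rowcount


def serve(cache, method, url, cookies, render, now=None):
    """Return cached HTML for eligible requests, else render fresh."""
    if not is_cacheable(method, cookies):
        return render(url)  # cookie holders always get a fresh page
    html = cache.get(url, now)
    if html is None:
        html = render(url)
        cache.put(url, html, now)
    return html
```

In this sketch expired rows are simply ignored on read, so cleanup()
only reclaims space and could be run occasionally rather than on every
request.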

One thing that kills an HTML cache is too-dynamic content. Random
output (e.g. random-ordered blogroll), time-dependent elements (e.g.
clocks), and HTTP header-dependent elements (e.g. search term
highlighting [Referer] or Firefox buttons [User-Agent]) all become
frozen in the cache and should not be used on blogs that serve
pre-generated pages unless they are generated client-side with JS,
Flash, etc., or the cache is set up to vary accordingly.
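
As an illustration of a cache "set up to vary accordingly", the sketch
below folds selected request headers into the cache key, so that pages
whose output depends on those headers are stored separately. It is
Python rather than WordPress code, and the function name cache_key and
its vary parameter are assumptions made for this example.

```python
import hashlib


def cache_key(url, headers, vary=()):
    """Build a cache key from the URL plus the headers the page depends on.

    `vary` lists header names (case-insensitive) whose values change the
    rendered output, e.g. ["User-Agent"] or ["Referer"].
    """
    # Header names are case-insensitive, so normalize them before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    parts = [url]
    for name in vary:
        key = name.lower()
        parts.append("%s=%s" % (key, normalized.get(key, "")))
    return hashlib.sha1("\n".join(parts).encode("utf-8")).hexdigest()
```

Note that varying on a raw header such as User-Agent splits the cache
into one entry per distinct value, which lowers the hit rate. Reducing
the header to a few buckets before keying on it would limit that.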

Wrapping up, I think there should always be attention devoted to
improving WordPress resilience on the minimum setup and I would like
to see this as a GSoC project.

Cheers,
Andy

