[wp-hackers] Help with the API on WordPress.org?

Fri Jan 2 05:56:44 GMT 2009

"DD32" <wordpress at dd32.id.au> wrote:
> Because it has no need to be (RESTful)?

Oh, now there you've gone and done it. You've just unleashed my "RESTful
Zealot" persona and now I feel compelled to enter proselytizing mode. :-)

But before I start, let me say that my mention of a JSON return type was an
arbitrary example; more on that in a bit.

I spent about a year between 2006 & 2007 studying HTTP-based web services
and came away with the fervent belief that almost all HTTP-based APIs are
better off when they are RESTful, especially on domains as publicly relevant
as WordPress.org. Unfortunately most of the benefits, while real and
tangible with many use-cases, are not immediately apparent without a healthy
combination of motivated study and a moderately intuitive nature as well as
ideally a bit of mentoring such as can be found on the rest-discuss list.
Suffice it to say that many of the benefits are enabling serendipity,
something which is rarely the most obvious requirement unless someone goes
explicitly searching for it.

REST is basically the same as the broader HTTP-based web but with numerous
added constraints and thus, when done right, brings along many of the
benefits of the HTTP web "for free." And those constraints add structure and
compatibility, both of which are benefits. As it has been said, "Embracing
constraints can be very liberating."

By comparison, the RESTians would call the style of the WordPress.org Plugin
API "RPC over HTTP," whereas with REST most of the benefits of the HTTP web
are provided automatically.  Let me give you one such real and tangible
benefit of having a RESTful API for Plugins on WordPress.org vs. using the
existing RPC over HTTP API: "HTTP caching."

Here is one BIG reason why your API *does* need to be RESTful.  Using the
approach I suggested you get HTTP caching "for free."  HTTP caching
would 1.) reduce the load on the WordPress.org API servers and 2.) speed up
the end-user experience as they page through the multiple pages of the
same results. Yes, you could cache the values you retrieve from your API, but
you'd have to write caching code, whereas caching on RESTful APIs comes with
(almost) no extra code.
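
To make the caching point concrete, here is a minimal Python sketch of the
server side of a conditional GET. It is only an illustration: the plugin data,
the handle_get() name and the one-hour max-age are assumptions, not anything
the WordPress.org API actually does.

```python
import hashlib
import json

# Hypothetical plugin data standing in for whatever the real back-end returns.
PLUGINS_PAGE_1 = [{"slug": "plugin1"}, {"slug": "plugin2"}]

def handle_get(request_headers):
    """Answer a GET for one page of results, honouring If-None-Match."""
    body = json.dumps(PLUGINS_PAGE_1, sort_keys=True).encode("utf-8")
    # An ETag derived from the body: same bytes, same tag.
    etag = '"' + hashlib.sha1(body).hexdigest() + '"'
    headers = {"ETag": etag, "Cache-Control": "public, max-age=3600"}
    if request_headers.get("If-None-Match") == etag:
        # The client's (or proxy's) copy is still good; send no body.
        return 304, headers, b""
    return 200, headers, body

# First request gets the full body; the revalidation gets an empty 304.
status, headers, body = handle_get({})
status_again, _, body_again = handle_get({"If-None-Match": headers["ETag"]})
print(status, status_again, len(body_again))
```

Once the response carries these headers, browsers and proxies can reuse or
revalidate it on their own; the client code does not need a cache of its own.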

And that's the thing about RESTful APIs; you get so much of what you'd
otherwise have to write for free. Basically anything that understands the
HTTP web can also understand a RESTful web service and add additional
functionality, such as bookmarking services, etc.

> Well.. Just checking, It can accept $_GET too, but since
> it was originally spec'd for POST, i thought it'd be
> simpler to just not mention it..

Actually allowing GET and POST to do the same thing is, from a RESTful
perspective, a big no-no and for good reason (and I used to write web apps
that did it, so I'm calling my prior self out on this one.) With RESTful web
services the two verbs mean two different things. GET retrieves
representations and can be cached by intermediaries on the web, including
browsers, and POST explicitly should not be cached. A bit more on this below.
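
As a rough illustration of why the verb matters to intermediaries, here is a
toy Python cache that reuses GET responses and always forwards POST. The
TinyCache class and the origin() stand-in are made up for this sketch, and a
real HTTP cache also deals with expiry and invalidation, which is left out.

```python
class TinyCache:
    """Toy intermediary: reuses GET responses, always forwards POST."""

    def __init__(self, origin):
        self.origin = origin  # callable(method, url, body) -> response
        self.store = {}       # url -> cached GET response

    def request(self, method, url, body=None):
        if method == "GET":
            if url not in self.store:
                self.store[url] = self.origin(method, url, body)
            return self.store[url]
        # POST may change state on the server, so never answer it from cache.
        return self.origin(method, url, body)

calls = []

def origin(method, url, body):
    """Stand-in for the real server; records every request that reaches it."""
    calls.append((method, url))
    return "response #%d" % len(calls)

cache = TinyCache(origin)
url = "http://restapi.wordpress.org/plugins/sets/abd123.json"
cache.request("GET", url)
cache.request("GET", url)          # served from the cache
cache.request("POST", url, "x=1")
cache.request("POST", url, "x=1")  # forwarded again
print(len(calls))                  # 3 requests reached the origin
```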

> And keep all the params tied together (since passing
> a serialised string via GET would break standards - the
> url would reach maximum length quickly)

Although the HTTP spec does not define a maximum URL length, googling
"maximum url length" gives a recommendation of no more than 2000 characters
because of arbitrary limits that browsers and servers place on URLs (
http://www.boutell.com/newfaq/misc/urllength.html). But 2000 is a lot of
characters and I don't think any of your current API capabilities would
exceed that.

Of course allowing for retrieval of information about a list of plugins
given the list of plugin slugs could easily exceed 2000 characters (it is
ironic that that was my needed use-case.) The way that could be addressed
would be to allow them to be encoded into the URL for short lists and for
longer lists split into two HTTP transactions, i.e.

   #1 - POST to "http://restapi.wordpress.org/plugins/sets/new" with the
POST body being "plugin_list=plugin1&plugin2&plugin3&...&plugin"

The response to the POST would be "201 Created" (or maybe "303 See Other",
I'd need to ask on the rest-discuss list) with a header of "Location:
http://restapi.wordpress.org/plugins/sets/{set-id}" where {set-id} would be
the unique ID given to that set of plugins on the server. Note that the
first time this POST was submitted for a list of plugins it would create the
new set; each time after it would simply retrieve the set ID for the
existing set.

   #2 – GET http://restapi.wordpress.org/plugins/sets/{set-id}.{ext}

This would return the list of plugin information for the plugins contained
in the set identified by {set-id}. The representation returned by the
response would be the mime-type that mapped to the {ext}, i.e.

   json – Javascript object notation
   xml – Basic XML layout (as formatted by your API)
   rss – An RSS feed
   atom – An Atom feed
   sphp – Serialized PHP (is there a "standard" extension for this?)
   html – The data in HTML format

So for short lists it's a single transaction but for longer lists it's a
two-part transaction. Since most calls from plugins_api() would fit into the
2000 character limit this isn't a big deal.
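
The two-part transaction above could be sketched on the server side roughly as
follows, in Python. Everything here is an assumption for illustration: the
in-memory SETS store, deriving the set ID from a hash of the sorted slugs, and
answering 201 the first time and 303 on repeats (the post leaves that choice
open).

```python
import hashlib

BASE = "http://restapi.wordpress.org/plugins/sets"  # hypothetical host from the post
SETS = {}  # set-id -> sorted list of slugs (stand-in for server storage)

def post_new_set(slugs):
    """Handle POST .../sets/new: create the set, or find the existing one."""
    canonical = sorted(set(slugs))
    # Deriving the id from the contents makes repeat POSTs land on the same set.
    set_id = hashlib.sha1(",".join(canonical).encode("utf-8")).hexdigest()[:12]
    created = set_id not in SETS
    SETS[set_id] = canonical
    status = 201 if created else 303  # assumed split between the two codes
    return status, {"Location": "%s/%s" % (BASE, set_id)}

def get_set(set_id):
    """Handle GET .../sets/{set-id}: return the slugs in the set."""
    if set_id not in SETS:
        return 404, None
    return 200, SETS[set_id]

# The same slugs in a different order resolve to the same set.
first_status, first_headers = post_new_set(["plugin2", "plugin1", "plugin3"])
second_status, second_headers = post_new_set(["plugin1", "plugin2", "plugin3"])
set_id = first_headers["Location"].rsplit("/", 1)[1]
print(first_status, second_status, get_set(set_id))
```

Because the GET URL for a given set never changes, it is also a natural
candidate for caching by intermediaries.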

> As for the multiple output formats.. That'd
> complicate the output stages, Not something
> that was needed, So it was skipped.

When architected in the way I just mentioned it really does not complicate
the output stage; you just use an internal representation (PHP objects and
arrays) and then each response representation can be generated by a
pluggable serializer. Piece of cake. :-)
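
A minimal sketch of that pluggable-serializer idea, in Python rather than PHP
and with only two formats, might look like this. The SERIALIZERS registry, the
render() function and the sample plugin fields are all hypothetical.

```python
import json
from xml.etree import ElementTree as ET

def to_json(plugins):
    """Serialize the internal representation as JSON text."""
    return json.dumps(plugins)

def to_xml(plugins):
    """Serialize the internal representation as a very basic XML layout."""
    root = ET.Element("plugins")
    for plugin in plugins:
        node = ET.SubElement(root, "plugin", slug=plugin["slug"])
        node.text = plugin["name"]
    return ET.tostring(root, encoding="unicode")

# Extension -> (mime-type, serializer). Adding a format means adding a row.
SERIALIZERS = {
    "json": ("application/json", to_json),
    "xml": ("application/xml", to_xml),
}

def render(plugins, ext):
    """Turn the internal representation into the requested representation."""
    mime_type, serialize = SERIALIZERS[ext]
    return mime_type, serialize(plugins)

data = [{"slug": "plugin1", "name": "Plugin One"}]
print(render(data, "json"))
print(render(data, "xml"))
```

The code that gathers the plugin data never needs to know which format was
requested; only the last step changes.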

But I do understand it wasn't needed and "Good Today" is usually better than
"Great Tomorrow." :-)

> Well, The fact its not a human-use API really..
> It accepts PHP, it gives back PHP, It makes sense
> to test from within PHP,

My response is "Every API is a human-use API, or at least should be
considered one."

Why? Because programmers are humans and they have to figure out how to
program them. The more conceptually accessible and easily testable an API,
the more often people actually use it (or at least that's what statistical
logic would tell me… :-)  That's one reason I advocate always having an HTML
response representation so that developers can quickly and easily see the
result of their API GETs.

It might make sense for *you* to test the API you developed with PHP because
you already know all your own assumptions and don't have to discover how to
call your API; that's not true for others (like me!) But for someone who
doesn't know your assumptions writing code is a lot more effort than
performing a simple URL request in a browser. And where there is more effort
required it reduces how often people decide to do it. (btw, convincing
people to make their APIs browser-accessible is a crusade of mine and if you
google enough you'll find me advocating it elsewhere too. :)

One thing you find when you start building RESTful APIs is that almost all
of the interaction patterns you might need are covered by HTTP and thus you
don't need to "design" the API; you just follow the best practice patterns.
As an example, consider this URL:

   http://restapi.wordpress.org/plugins/sets/abd123.yaml

Here let's assume the web service has not (yet?) implemented a renderer for
a YAML representation, but the web service still "handles" it just fine: it
returns a "501 Not Implemented" status code and all intermediaries that
"speak" HTTP understand what that means. Easy peasy.
:-)
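
In Python, that behaviour could be sketched as below. The SUPPORTED set and
the status_line_for() helper are assumptions for this example, and the choice
of 501 simply follows the suggestion above; a service could also reasonably
answer 404 or 406 for an unknown format.

```python
import posixpath
from http import HTTPStatus
from urllib.parse import urlparse

SUPPORTED = {"json", "xml"}  # assumed: formats that have a renderer so far

def status_line_for(url):
    """Pick the status line from the extension at the end of the URL path."""
    path = urlparse(url).path
    ext = posixpath.splitext(path)[1].lstrip(".")
    status = HTTPStatus.OK if ext in SUPPORTED else HTTPStatus.NOT_IMPLEMENTED
    # The reason phrase comes from the standard library's table of HTTP
    # status codes, so no custom error vocabulary is needed.
    return "%d %s" % (status.value, status.phrase)

print(status_line_for("http://restapi.wordpress.org/plugins/sets/abd123.yaml"))
print(status_line_for("http://restapi.wordpress.org/plugins/sets/abd123.json"))
```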

> And since the primary use of it is for WP, calling
> plugins_api('action', (object)array('test' => 'a'));
> makes a lot more sense than allowing for browsers/etc.

I'd (respectfully) argue that your implementation will (almost) certainly
doom it to be so (i.e. only ever used by plugins_api().)

However I can actually envision lots of different use-cases for the WordPress
Plugin API if it were more accessible.  For example, if there were a RESTful
API with an RSS representation format then plugin authors could use an RSS
Widget to list their own plugins on their own blog. Here's the
(hypothetical) URL that would list your (dd32's) plugins:

   http://restapi.wordpress.org/authors/plugins/dd32.rss

See how the serendipity of the RESTful approach starts to expose itself? :-)

BTW, the Twitter API does most of this and just look at how many Twitter
add-ons there are! (though notably it doesn't have HTML response rendering.)
Clearly there is more to it than an accessible API, but that accessible API
has helped; I know that for certain in my local peer group.  As an example,
here's my own stream which you can view in the browser:

The nice thing about APIs on the server is they are just interfaces to a
back-end so it's possible and easy to implement multiple APIs to the same
back-end. Which brings me back to my offer to implement a truly RESTful API
for WordPress.org plugin repository: I've got lots of the code already
implemented from other projects; so who do I propose this to? :-)

-Mike Schinkel
http://mikeschinkel.com

P.S.  Now imagine that we had such a RESTful API.  We could easily extend it
to include plugin usage logging. We could set up a scheduled task to perform
a daily PUT to the following URL with the HTTP body being the list of
plugins currently installed.

   PUT http://restapi.wordpress.org/sites/{site-id}/plugins/

The {site-id} could be a GUID that is generated the first time the scheduled
task is called and, if this were included in core it could be an option that
administrators could turn off if they don't want to report this information
(we could even ask for permission on the install screen.)
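
As a sketch of what that scheduled task might build (without actually sending
anything), here is a Python version. The JSON body, the
build_usage_report() helper and the example slugs are assumptions; the URL
shape is the hypothetical one above.

```python
import json
import uuid
import urllib.request

def build_usage_report(site_id, installed_slugs):
    """Build (but do not send) the daily PUT describing installed plugins."""
    url = "http://restapi.wordpress.org/sites/%s/plugins/" % site_id
    # Assumed body format: a JSON array of plugin slugs.
    body = json.dumps(sorted(installed_slugs)).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

# The GUID would be generated once and stored, then reused on every run.
site_id = str(uuid.uuid4())
request = build_usage_report(site_id, ["plugin2", "plugin1"])
print(request.get_method(), request.full_url)
```

PUT fits here because it is idempotent: sending the same list twice leaves the
server in the same state, so a retried or repeated report does no harm.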

But with this information collected we could know which plugins are in use
and, if we also later added the ability for people to rank how important a
plugin is to their site, we could find out which plugins people depend on
most and thus which ones most need to be upgraded.  This could also be used
to rank plugin search results, much as Google uses PageRank.

Just a thought.  A RESTful approach opens up huge possibilities without
having to spend lots of time doing API design.