[wp-hackers] Statistics Based on WordPress Trac

Jacob Santos wordpress at santosj.name
Wed Oct 20 17:19:35 UTC 2010


I developed a basic HTML scraper for WordPress Trac and I want to go
through the data and produce some graphs. Is there anything anyone would
like to know before I release my own observations?

What I'm working on:

1. Number of Attachments based on username [1].
2. Average size of Attachments for each username.
3. Number of people who have more than 100 attachments versus those who
have fewer than 10.
4. Number of Patches based on username [1].
5. Time between the last patch and the commit of that patch (only the
initial patch and commit will be considered; patches after the first
commit will not be considered).
6. Number of patches for those with 25+ patches between different time
periods (quarters).
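
As a rough illustration of items 1 to 3, the per-username counts, average
sizes, and the "more than 100 versus fewer than 10" comparison could be
computed along these lines. This is only a sketch: the record layout
(username, filename, size in bytes), the sample rows, and the function names
are assumptions made for the example, not the scraper's actual output.

```python
from collections import defaultdict

# Hypothetical scraped records: (username, filename, size_in_bytes).
# The usernames and sizes are made up for illustration, not real Trac data.
attachments = [
    ("alice", "fix-1234.diff", 2048),
    ("alice", "screenshot.png", 40960),
    ("bob", "12345.patch", 1024),
]


def attachment_stats(records):
    """Return {username: (attachment_count, average_size_in_bytes)}."""
    totals = defaultdict(lambda: [0, 0])  # username -> [count, total bytes]
    for username, _filename, size in records:
        totals[username][0] += 1
        totals[username][1] += size
    return {user: (count, total / count) for user, (count, total) in totals.items()}


def bucket_counts(stats, high=100, low=10):
    """Count users with more than `high` attachments and with fewer than `low`."""
    heavy = sum(1 for count, _avg in stats.values() if count > high)
    light = sum(1 for count, _avg in stats.values() if count < low)
    return heavy, light


stats = attachment_stats(attachments)
heavy_users, light_users = bucket_counts(stats)
print(stats)
print(heavy_users, light_users)
```

Users with between 10 and 100 attachments fall into neither bucket, which
matches the wording of item 3.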

It took about 9 hours to scrape the 15k tickets, but now that I know which
tickets have attachments, it should be faster to do some of the more
involved tasks.

Jacob Santos

[1] Attachments are the raw files that were uploaded by the username, so
they include pictures, compressed files, PHP files, etc., as well as patches
and diff files. Patches will only include files that are .patch or .diff.
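
The attachment/patch distinction in [1] might be expressed as a simple
filename check, sketched below. The function names and sample filenames are
hypothetical, and treating the extension case-insensitively is an assumption
that the note above does not state.

```python
import os
from collections import Counter

# Per note [1]: only .patch and .diff files count as patches.
PATCH_EXTENSIONS = {".patch", ".diff"}


def is_patch(filename):
    """True if the filename ends in .patch or .diff (case-insensitive: an assumption)."""
    _root, ext = os.path.splitext(filename)
    return ext.lower() in PATCH_EXTENSIONS


def patch_counts(records):
    """Count patches per username from (username, filename) pairs."""
    return Counter(user for user, filename in records if is_patch(filename))


# Hypothetical sample data, not real Trac attachments.
uploads = [
    ("alice", "12345.2.diff"),
    ("alice", "screenshot.png"),
    ("bob", "fix.patch"),
    ("bob", "plugin.zip"),
]
counts = patch_counts(uploads)
print(counts)
```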

