[wp-hackers] Network activations and WP SlimStat

Otto otto at ottodestruct.com
Tue Nov 9 02:11:07 UTC 2010


On Mon, Nov 8, 2010 at 7:53 PM, Dino Termini <dino at duechiacchiere.it> wrote:
> Good question. Of course I could have introduced a blog_id and combined all
> the tables to make the thing scale no matter how big the network is. But
> what is the difference between 100 tables containing 1 million records each
> and 1 table containing 100 million records? The second one will be much, much
> slower when performing SELECTs and other operations, and will be much bigger
> than the 100 tables combined, if I enable indexing on the most queried
> fields. I'm not a DB guru, so there's probably something I'm missing here.

You have a point here, but I think you're thinking of an edge
case. 1 million records of hits is too much to be storing in the first
place. That should be pared down on a regular basis to far fewer. Get
the data you want, then dump it out elsewhere. Realistically, if I were
storing raw data like that, then I wouldn't put it in a database to
begin with. I'd use a flat file and process out what I actually need
in batches, then store that.
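
A minimal sketch of the flat-file idea, in Python rather than PHP for
brevity. The function names, the CSV layout, and the ISO 8601 timestamps
are all assumptions made for illustration, not anything SlimStat does.
The write path only appends a row; the batch step reduces the rows to
per-day counts.

```python
import csv
import io
from collections import Counter

def append_hit(log, timestamp, ip, resource):
    """Write path: append one raw hit as a CSV row, with no lookups."""
    csv.writer(log).writerow([timestamp, ip, resource])

def summarize(log):
    """Batch step: collapse raw rows into per-day, per-resource counts."""
    counts = Counter()
    for timestamp, _ip, resource in csv.reader(log):
        day = timestamp[:10]  # assumes ISO 8601, e.g. 2010-11-08T19:53:00
        counts[(day, resource)] += 1
    return counts

# In-memory stand-in for a real log file opened in append mode.
log = io.StringIO()
append_hit(log, "2010-11-08T19:53:00", "192.0.2.1", "/about/")
append_hit(log, "2010-11-08T20:01:12", "192.0.2.7", "/about/")
append_hit(log, "2010-11-09T02:11:07", "192.0.2.1", "/")

log.seek(0)
daily = summarize(log)
```

Only the contents of `daily` would need to go into the database; the
log itself can be rotated or deleted once the batch has run.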

> Hardcoded? I had thought about that, but in other words you're suggesting
> that it's better to load a 3.5 MB array into memory EVERY TIME the plugin
> has to record a new hit, than querying the database to find the same
> information, am I right?

No, I'm saying that you don't need to record the country at all in the
first place. You need to know the country at display time, not at raw
data grabbing time. Raw data should be *raw*. Fast. Efficient for
writing. Save the processing for later.
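
A rough sketch of that split, again in Python. The range table below is
a hypothetical placeholder built from documentation-only address blocks
with made-up country codes; a real one would be loaded from a GeoIP data
file. The point is only that the write path stores the bare IP, and the
lookup runs when a report is rendered.

```python
import bisect
import ipaddress

def ip_to_int(ip):
    return int(ipaddress.ip_address(ip))

# Hypothetical range table: (first address, last address, country code).
RANGES = sorted([
    (ip_to_int("192.0.2.0"), ip_to_int("192.0.2.255"), "AA"),
    (ip_to_int("198.51.100.0"), ip_to_int("198.51.100.255"), "BB"),
])
STARTS = [row[0] for row in RANGES]

def record_hit(store, ip, resource):
    """Write path: keep the raw IP and do no lookups at all."""
    store.append((ip, resource))

def country_of(ip):
    """Display path: binary-search the range table for one address."""
    value = ip_to_int(ip)
    i = bisect.bisect_right(STARTS, value) - 1
    if i >= 0 and RANGES[i][0] <= value <= RANGES[i][1]:
        return RANGES[i][2]
    return None  # address not covered by the table

hits = []
record_hit(hits, "192.0.2.44", "/")
record_hit(hits, "203.0.113.9", "/about/")

# The country is resolved only when the report is built.
report = [(ip, country_of(ip)) for ip, _resource in hits]
```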

> Sorry, but this is not the case with WP SlimStat. It allows you to
> drill-down into the metrics, filtering data, and ultimately giving you
> access to the 'raw data' that users can browse and inspect line by line.
> It's not just a bunch of graphs and charts; please try it yourself :-)

Yes, but do you have any stats on how often this is actually done?
The fact that you've made it possible to do something that is
rather pointless doesn't mean that lots and lots of people actually do
that pointless thing.

Nobody wants the raw data except for eggheads like you and me, and
even then we really want it so we can see trends over time, charts,
the most-used search queries, etc. Do you really spend your day
going to the basic data level to see individual hits? Why would you
ever do that?

It's totally *pointless* to store the last million hits to your site.
Abstract out the data you actually want from it, store that, then
trash the data.
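
One way that roll-up could look, sketched with Python's sqlite3 module
so it runs standalone. The table names hits_raw and hits_daily and their
columns are hypothetical, chosen for illustration only. Raw rows older
than a cutoff are summarized and then deleted in the same transaction.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE hits_raw (ts TEXT, resource TEXT);
    CREATE TABLE hits_daily (day TEXT, resource TEXT, hits INTEGER,
                             PRIMARY KEY (day, resource));
""")
db.executemany("INSERT INTO hits_raw VALUES (?, ?)", [
    ("2010-11-08 19:53:00", "/about/"),
    ("2010-11-08 20:01:12", "/about/"),
    ("2010-11-09 02:11:07", "/"),
])

def roll_up(db, before):
    """Summarize raw hits older than `before`, then delete them."""
    with db:  # commits on success, rolls back if either statement fails
        db.execute("""
            INSERT INTO hits_daily (day, resource, hits)
            SELECT substr(ts, 1, 10), resource, COUNT(*)
            FROM hits_raw
            WHERE ts < ?
            GROUP BY substr(ts, 1, 10), resource
        """, (before,))
        db.execute("DELETE FROM hits_raw WHERE ts < ?", (before,))

# Cutoff on a day boundary, so each day is summarized exactly once.
roll_up(db, "2010-11-09 00:00:00")

summary = db.execute("SELECT day, resource, hits FROM hits_daily").fetchall()
remaining = db.execute("SELECT COUNT(*) FROM hits_raw").fetchone()[0]
```

The sketch assumes the cutoff always falls on a day boundary; a cutoff
in the middle of a day would need an upsert instead of a plain INSERT.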

-Otto

