[HyperDB] Any advantage to partitioning on a single database server?

Callum Macdonald lists.automattic.com at callum-macdonald.com
Fri Aug 27 17:30:20 UTC 2010


Hola Jim,

Yes, I believe it is beneficial to partition into multiple databases on
a single server. I had a little experience with MultiDB from Incsub and
that was the primary reason behind partitioning on a single server. I
believe that edublogs.org uses 4096 databases on a single server, in
production. They might have changed their configuration in the last year
or two, but that was how it went a while back.
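
As a rough illustration of how a blog could be mapped to one of 4096
databases, here is a minimal Python sketch. It assumes a hash-based
scheme: the first three hex digits of an MD5 of the blog ID, which gives
16^3 = 4096 buckets. The function name and the "wp_" prefix are made up
for the example, and this is not necessarily what MultiDB, HyperDB or
edublogs.org actually do.

```python
import hashlib


def database_for_blog(blog_id: int, hex_digits: int = 3, prefix: str = "wp_") -> str:
    """Return a hypothetical database name for a blog ID.

    Assumption for illustration only: the database is chosen from the
    leading hex digits of md5(blog_id). Three hex digits give
    16**3 = 4096 possible databases.
    """
    digest = hashlib.md5(str(blog_id).encode("ascii")).hexdigest()
    return prefix + digest[:hex_digits]


if __name__ == "__main__":
    # Show which (hypothetical) database a few blog IDs would land in.
    for blog_id in (1, 2, 50000):
        print(blog_id, database_for_blog(blog_id))
```

The mapping is deterministic, so the same blog always resolves to the
same database, and no lookup table is needed for the blog tables.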

I don't remember if it was because of the number of files per directory
or the number of simultaneously open files, but I know we partitioned
the data (for about 300k blogs at that point) across multiple databases
on a single database server.
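
To put numbers on the files-per-directory concern, here is a small
Python sketch of the arithmetic from Jim's message below. It assumes
MyISAM's three files per table (.frm, .MYD, .MYI), one directory per
database, and blogs spread evenly across the databases; the function
name is just for illustration.

```python
def files_per_directory(blogs: int, tables_per_blog: int,
                        files_per_table: int, databases: int = 1) -> int:
    """Estimate the number of table files in each database directory.

    Assumes blogs are spread evenly across the databases, so this is a
    rounded-up average rather than an exact figure.
    """
    total_files = blogs * tables_per_blog * files_per_table
    # Ceiling division without floating point.
    return -(-total_files // databases)


if __name__ == "__main__":
    # 50,000 blogs, 9 tables per blog, 3 MyISAM files per table.
    for databases in (1, 16, 256, 4096):
        print(databases, files_per_directory(50_000, 9, 3, databases))
```

With those numbers a single database holds 1,350,000 files in one
directory, while 4096 databases bring it down to roughly 330 per
directory.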

Best of luck with your project.

Love and joy - Callum.

On Thu, 2010-08-26 at 13:26 -0700, Jim McQuillan wrote:
> The number of files in a directory is exactly the reason I was thinking that
> partitioning would be needed.  We're using MyISAM so that would be 3 files
> per table, and if there are 50,000 blogs with 9 tables per blog that would
> be 1,350,000 files in a single directory.
> 
> I'm no database or filesystem expert, but wouldn't that large number of
> files be an issue?  Unfortunately there isn't an easy way to "test" different
> configurations before actually growing to that size, but we're looking to
> set things up ahead of time to be prepared.
> 
> Thanks!
> -Jim
> 
> 
> On Thu, Aug 26, 2010 at 12:53 PM, Andy Skelton <skeltoac at gmail.com> wrote:
> 
> > Jim McQuillan wrote:
> > > I've been doing more research into scaling a WordPress multi-site
> > > installation, and I've been under the assumption that partitioning the
> > > tables into multiple databases would be beneficial for a large installation
> > > (hypothetically 50,000 blogs).
> >
> > At some point we decided to have multiple databases on a single
> > machine. Maybe it had something to do with the number of files in a
> > directory, or the number of hard drives in the machine; I never quite
> > understood it. Maybe Barry or Donncha can explain.
> >
> > > I know that using multiple database servers would definitely bring a
> > > performance enhancement, but we're starting with one... so is it even worth
> > > it to implement partitioning at all right now?
> >
> > There is some measurable overhead to partitioning. So if you had two
> > blogs and you put them in separate partitions I assume there would be
> > a tiny downside and no measurable upside.
> >
> > At some point (hopefully) you'll need to use partitioning due to
> > hardware limitations. When that time comes, you may or may not benefit
> > from any partitioning you did on that first server. Your scheme
> > might have to be thrown out or it might actually make it easier to
> > migrate to a scaled-up system. The only guaranteed benefit is having
> > the experience of working with the more complex configuration.
> >
> > Without further information I'd say proceed without partitioning.
> > Don't try to do all your scaling in advance; it doesn't work. Run some
> > experiments once you have real-world data and traffic.
> >
> > Andy
> > _______________________________________________
> > HyperDB mailing list
> > HyperDB at lists.automattic.com
> > http://lists.automattic.com/mailman/listinfo/hyperdb
> >



