[wp-hackers] Development for 2.x : Improved Search
Denis de Bernardy
denis at semiologic.com
Sun Feb 5 13:06:13 GMT 2006
> > a) Simple : Add MySQL Full Text Indexing to Wordpress and
> > modify the search hooks to use it. Moving to FT indices on
> > MyISAM tables gives actually quite good search out of the
> > gate. (...)
> >
> > Difficulty: not huge. Willing to do in full myself.
>
> This is mostly done already, and MyISAM full text indexing is
> about as bad as bad can get.
>
> http://www.semiologic.com/software/search-reloaded/
As additional information, a past version of the plugin did a slightly
better job than the above, at the cost of a great deal of compute power.
To spare yourself some time:
1. Using a FT index on the text-only version of the formatted post excerpt
and content does not improve the results in any significant manner.
2. MySQL has a number of issues that are related to charsets.
These tend to worsen after MySQL 4.1 (at which point they introduced a
collection of new bugs, for good measure). The underlying mess is a
nightmare to sort out.
3. Trying to tweak the results by reworking the raw MySQL score can produce
meaningful enhancements, but it involves significant overhead.
Things I tried include the keyword order, their presence in the post title,
the presence or absence of double quotes to create keyword groups, and later
on the use of Soundex.
I eventually dropped all of these ideas because working around MySQL's lack
of features by using PHP was simply ridiculous. If you give this a shot
yourself, store your indexes and search procedures in a real database, such
as pgsql.
4. Last but not least, several users sent me messages along the lines of the
following:
"Search reloaded returns results in a random order. Why doesn't it sort
results by date?"
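For illustration only, here is a minimal sketch of the kind of score rework
described in item 3, written in Python rather than the plugin's PHP. The
function names, the boost factors and the row layout (title, content, score)
are all assumptions made for the example; this is not what Search Reloaded
actually did.

```python
import re


def parse_query(query):
    """Split a query into lowercase terms.

    Text inside double quotes is kept together as one keyword group;
    everything else is split on whitespace.
    """
    parts = re.findall(r'"([^"]+)"|(\S+)', query)
    return [(phrase or word).lower() for phrase, word in parts]


def rerank(rows, query, title_boost=2.0, phrase_boost=1.5):
    """Re-sort rows by an adjusted copy of the raw full-text score.

    rows: list of dicts with hypothetical keys 'title', 'content', 'score',
          where 'score' stands in for the raw relevance returned by MySQL.
    The boost factors are arbitrary assumptions, not tuned values.
    """
    terms = parse_query(query)

    def adjusted(row):
        score = row["score"]
        title = row["title"].lower()
        content = row["content"].lower()
        for term in terms:
            if term in title:
                # A keyword (or keyword group) found in the title counts more.
                score *= title_boost
            elif " " in term and term in content:
                # A quoted keyword group found verbatim in the content.
                score *= phrase_boost
        return score

    return sorted(rows, key=adjusted, reverse=True)
```

Every row coming back from the database has to go through this second pass,
which is where the overhead mentioned in item 3 comes from.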
Denis