[wp-hackers] WordPress Search

Simon Dunton - WP Sites simon at wpsites.co.uk
Thu Oct 31 12:33:16 UTC 2013


Elasticsearch is a more complicated route but if search is important to you and you want total control then it's a good option.

Elasticsearch has a number of analyzers that can be used to break up the query and index tokens (see http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-standard-analyzer.html), so you could, for instance, pretty easily handle textual representations of numbers in the query: "5 Pillars" could be automatically converted to "(5 OR five) Pillars".
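
As a rough illustration of that "5 Pillars" rewrite, here is a minimal Python sketch that expands the query string before it is sent to the search engine. It does not use Elasticsearch itself; the NUMBER_WORDS table and the expand_numbers() helper are made-up names for this example, and inside Elasticsearch a synonym token filter would be one way to get a similar effect at analysis time.

```python
import re

# Hypothetical lookup table for this sketch; extend it to suit your content.
NUMBER_WORDS = {
    "1": "one", "2": "two", "3": "three", "4": "four", "5": "five",
    "6": "six", "7": "seven", "8": "eight", "9": "nine", "10": "ten",
}


def expand_numbers(query: str) -> str:
    """Rewrite each standalone number in the query as '(digits OR word)'."""

    def replace(match: re.Match) -> str:
        digits = match.group(0)
        word = NUMBER_WORDS.get(digits)
        # Numbers without a known word form are left untouched.
        return f"({digits} OR {word})" if word else digits

    # \b keeps us from rewriting digits embedded in words such as "Top10".
    return re.sub(r"\b\d+\b", replace, query)


print(expand_numbers("5 Pillars"))  # (5 OR five) Pillars
```

The expanded string could then be passed to whatever query type you use that understands OR, which is an assumption about your setup rather than a requirement.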

It's probably an impossible task to create your own version of Google, but in many cases, especially if your data is a bit more specialised, I think a custom Elasticsearch solution is going to be more relevant than Google could ever be. Google doesn't know what is important on your website as well as you do; it just takes a generic approach, whereas you can tailor the search to your content. For generic text, i.e. if your site is a blog about general stuff, you're not going to have too much to work with. But if your site were the WordPress codex/forums, you'd know a whole lot more about your site's data and structure than Google does, and you can use that edge to create a better search experience for your users than you could ever get from Google Custom Search.
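
To make the "tailor it to your content" point a bit more concrete, here is a small sketch of a search request body that weights some fields more heavily than others. It only builds the JSON; it does not talk to a server. The field names (title, tags, content) and the boost values are assumptions for illustration, so swap in whatever your own index actually contains.

```python
import json


def build_search_body(user_query: str) -> dict:
    """Build a multi_match query that favours titles and tags over body text."""
    return {
        "query": {
            "multi_match": {
                "query": user_query,
                # "field^N" boosts a field; these names and weights are
                # assumptions for this sketch, not a recommendation.
                "fields": ["title^3", "tags^2", "content"],
            }
        }
    }


print(json.dumps(build_search_body("advanced taxonomy posts"), indent=2))
```

The idea is simply that you, unlike Google, know a match in a post title or tag on your site probably matters more than a match buried in the body.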

Sorry if I waffled a bit there, in a bit of a rush with work!


On 31 Oct 2013, at 11:18, Haluk Karamete wrote:

> Thank you for your feedback Simon.
> After you pointed out that a custom Google search is not as good as a normal
> Google search, I compared these two:
> this one searches for "advanced taxonomy posts" on the wordpress.org web site
> http://wordpress.org/search/advanced%20taxonomy%20posts
> and this one searches "advanced taxonomy posts" on google.com with a search
> operator attached (site:wordpress.org)
> https://www.google.com/search?num=100&q=site%3Awordpress.org+advanced+taxonomy+posts&oq=site%3Awordpress.org+advanced+taxonomy+posts&gs_l=serp.3...68176.72806.0.76162.
> Obviously the results are pretty close but not identical. Both are good and
> share a lot in common.
> I have done some other searches and I feel the Google search results are a
> little better, more mature. But it makes me wonder why there is a
> difference.
> I guess some settings are set internally somewhere, and they modify
> the way the results are served. (PS. I'm *not* referring here to how the
> search results are displayed format-wise; I mean the actual order and the
> result set...) They are definitely not identical.
> As to the Elasticsearch you refer to, from an earlier look, it looks like a
> complicated route to me.
> Would Elasticsearch be able to handle the "5 Pillars" and "Tariq
> Ramadan" examples I gave in my earlier post?
> Can it handle a 5+5 type of search? (Not that I need this, but...)
> Could Elasticsearch leverage Google search with all of its intelligence, or
> is it a complete "DIY" situation here?
> I'm just curious.
> On Thu, Oct 31, 2013 at 3:38 AM, Simon Dunton - WP Sites <
> simon at wpsites.co.uk> wrote:
>> Hi,
>> WordPress.org must be using https://www.google.co.uk/cse/
>> In my opinion Google custom search engines are useless. Yes, you can
>> specify which sites you want to index and tweak some settings, but in my
>> experience the results aren't as good as a normal Google search (I used it
>> years ago, so it might have improved since then). Besides, do you really
>> want Google to decide which factors are most relevant when it comes to
>> searching on your website?
>> I think the best way is to get yourself an Elasticsearch instance/cluster,
>> have all your post content automatically fed into Elasticsearch to be
>> indexed, and then you're totally in control.
>> Simon
>> On 31 Oct 2013, at 09:15, Haluk Karamete wrote:
>>> Hi Guys...
>>> I have a question that has two parts:
>>> one philosophical and the other practical. Before I get into that, let
>>> me set the context of this question.
>>> This question does not apply to small businesses or blogs.
>>> It applies to huge sites that have thousands of posts, perhaps over
>>> 100,000.
>>> Search is a key feature to me, as it is to many other people.
>>> I know there are a ton of great plugins out there specializing in search.
>>> There are great minds & work behind those plugins & I respect the work
>>> highly.
>>> But when it comes to search, I don't think Google is beatable.
>>> I think no matter how dedicated a group might be, they won't be able to
>>> come up with something that does better than what Google can. I'm
>>> including Yahoo & Bing in this statement, let alone the plugins that I've
>>> talked about.
>>> There are 2 kinds of searches to me.
>>> The first kind is super accurate (accurate to the dot), and this kind of
>>> search usually comes with no wisdom. It is handy for certain
>>> implementations such as searching a code base: you can go really accurate
>>> with all kinds of and's & or's & contains etc., like an editor's search.
>>> And there is the other kind of search; this one comes with wisdom.
>>> It won't match certain results because it *somehow* factors in some
>>> wisdom, and it simply avoids some results that the first type of search
>>> I referred to above would return. For example, a query on "Ramadan" won't
>>> match "Tariq Ramadan" here. Yet a query on "5 Pillars" matches "Five
>>> Pillars". Well, that's Google.
>>> I'd like to hear your opinions on this, because I may be seeing it wrong;
>>> there could be some solutions that come somewhat close to Google's way of
>>> doing it. But honestly, I'm almost 100% sure that there is no better way.
>>> Until you convince me otherwise, I would think that if you are in charge
>>> of a site like TechCrunch, New York Times or NPR etc., the search must be
>>> based on Google.
>>> The second part of my question is: if you agree with this point of view
>>> of mine, would you please give me a few leads as to which plugins or
>>> solutions you would recommend for integrating Google search into a
>>> WordPress site?
>>> And BTW, I just did a search on wordpress.org to see how WordPress.org
>>> was handling search (because honestly, I did not know how the codex
>>> handled the search aspect & I was going to compare wordpress.org's search
>>> results to Google with site:wordpress.org), but it turned out that
>>> WordPress.org too adopted Google when it comes to search. :)
>>> In that case, I could ask now if there is a recommended practice for
>>> setting up the custom Google search the way WordPress.org did.
>>> Thank you
>>> _______________________________________________
>>> wp-hackers mailing list
>>> wp-hackers at lists.automattic.com
>>> http://lists.automattic.com/mailman/listinfo/wp-hackers
