[wp-hackers] [WPSoC] The search system improvements
kodieg at gmail.com
Wed Jul 9 11:42:06 GMT 2008
I'm a participant in Google Summer of Code, and I'm working on a new
search system. I was asked to post some information about my project
here before the mid-term evaluation. A description of the project
follows; if you're not interested, you can skip this paragraph. My idea
was to create a framework, integrated with WordPress, that manages
searches. First, it would allow new search engines to be added (for
example, packaged as normal WordPress plugins). That way users could
simply switch to the Zend_Lucene search engine or to one based on the
Google Search (now AJAX) API. Second, I wanted to create a new search
engine for WordPress that searches posts, pages and comments. I based my
design loosely on Lucene, but made it much simpler and used MySQL to
store the index. I also wanted to create plugins for Google and
Zend_Lucene. You can find some information at:
The testing blog is at: http://inzynieriawiedzy.org/kodie/gsoc/ (login:
admin, password: wpgsoc). Please don't destroy it, and please do no
harm to the server. Its content is mostly text taken from some wikis and
other blogs. However, some of the posts may help you understand how
everything should work.
So, what is implemented? Basically, there is a search framework
integrated with WordPress. You can reach it through the $wpsearch
variable. Using this object you can register a new search engine,
unregister one, and so on. It also gives administrators a simple
management page (Management/Search engines) that shows which plugins are
active and links to their configuration pages (where available). The
framework also handles changing the template for search results.
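
To make the register/unregister idea concrete, here is a minimal sketch
of such a registry. It is written in Python for brevity, not in the
plugin's PHP, and every name in it (SearchFramework, register_engine,
unregister_engine, EchoEngine) is hypothetical; the real $wpsearch
object may look quite different.

```python
class SearchEngine:
    """Interface a pluggable engine is assumed to expose (hypothetical)."""
    name = "base"

    def search(self, query):
        raise NotImplementedError


class SearchFramework:
    """Tracks registered engines and sends queries to the active one."""

    def __init__(self):
        self._engines = {}
        self._active = None

    def register_engine(self, engine):
        self._engines[engine.name] = engine
        if self._active is None:
            # Assumption: the first registered engine becomes the default.
            self._active = engine.name

    def unregister_engine(self, name):
        self._engines.pop(name, None)
        if self._active == name:
            # Fall back to any remaining engine, or to none at all.
            self._active = next(iter(self._engines), None)

    def engines(self):
        # Sorted names, e.g. for listing on a management page.
        return sorted(self._engines)

    def search(self, query):
        if self._active is None:
            raise LookupError("no search engine registered")
        return self._engines[self._active].search(query)


class EchoEngine(SearchEngine):
    """Toy engine used only to exercise the registry."""
    name = "echo"

    def search(self, query):
        return ["echo:" + query]
```

An engine plugin would call register_engine() once when it loads, and
the management page would only need engines() to list what is
available.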
The new search engine is also working. It searches posts, pages,
comments and even attachments. You can exclude words from a query, use
"terms like this", search in titles (title: prefix), by author (author:
prefix) or within a date range (date_start: and date_end:). See this
post for more details: http://inzynieriawiedzy.org/kodie/gsoc/?p=22
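
As an illustration of the query syntax described above, the following
Python sketch splits a query into plain terms, quoted phrases, excluded
words and prefixed fields. It is an assumption-laden sketch, not the
engine's actual parser: in particular, the leading "-" used here to
exclude a word is a guess (the exclusion syntax is not stated in this
message), and parse_query and its result layout are hypothetical.

```python
import re

# One token: optional "-", optional "field:", then a quoted phrase or a bare word.
TOKEN = re.compile(
    r'(-?)(?:(title|author|date_start|date_end):)?(?:"([^"]*)"|(\S+))'
)


def parse_query(text):
    parsed = {"terms": [], "phrases": [], "excluded": [], "fields": {}}
    for minus, field, phrase, word in TOKEN.findall(text):
        value = phrase or word
        if not value:
            continue  # skip empty quotes such as ""
        if field:
            # Prefixed values keep their original case (e.g. dates, names).
            parsed["fields"].setdefault(field, []).append(value)
        elif minus:
            # Assumption: a leading "-" marks a word to exclude.
            parsed["excluded"].append(value.lower())
        elif phrase:
            parsed["phrases"].append(phrase.lower())
        else:
            parsed["terms"].append(word.lower())
    return parsed
```

For example, parse_query('wordpress "search engine" -spam title:lucene')
yields one term, one phrase, one excluded word and one title field.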
This engine is based on the idea of an inverted index (used, for
example, in Lucene). It provides an administration page with options to
clear and rebuild the index and to delete selected documents from it.
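
For readers unfamiliar with inverted indexes, here is a minimal
in-memory sketch in Python. It maps each term to the set of documents
containing it and answers a query by intersecting those sets. This is
only an illustration: the engine described here stores its index in
MySQL, and the class and method names below are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    def __init__(self):
        # term -> set of ids of the documents that contain the term
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in tokenize(text):
            self._postings[term].add(doc_id)

    def remove(self, doc_id):
        # Delete one document from the index, dropping terms left empty.
        for term in list(self._postings):
            self._postings[term].discard(doc_id)
            if not self._postings[term]:
                del self._postings[term]

    def clear(self):
        self._postings.clear()

    def search(self, query):
        # Return ids of documents that contain every query term.
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result
```

The add(), remove() and clear() methods correspond roughly to the
rebuild, delete and clear options on the administration page.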
Known issues and missing features:
- date issue in the admin panel
- duplicate entries for some attachments
- no type: (post, page, comment, attachment) option in queries
I've also moved the old search code into a plugin, so anyone who needs
the old search engine will still be able to use it.
You can download the code here:
I couldn't find where Google wanted me to upload it (I only found a
place for uploads after the project ends).
See this page for some notes:
I would be very grateful for comments, ideas, testing, code review or
anything else that might help me.