OK I have a mySQL Database that looks something like this
ID – an int and the unique ID of the recorded
Title – The name of the item
Description – The items description
I want to search both title and description of key words, currently I’m using.
SELECT * From ‘item’ where title LIKE %key%
And this works and as there’s not much in the database, as however searching for “this key” doesn’t find “this that key” I want to improve the search engine of the site, and may be even add some kind of ranking system to it (but that’s a long time away).
So to the question, I’ve heard about something called “Full text search” it is (as far as I can tell) a staple of database design, but being a Newby to this subject I know nothing about it so…
1) Do you think it would be useful?
And an additional questron…
2) What can I read about database design / search engine design that will point me in the right direction.
If it’s of relevance the site is currently written in stright PHP (I.E. without a framework) (thro the thought of converting it to Ruby on Rails has crossed my mind)
Thanks all, I’ll go for Fulltext search.
And for any one finding this later, I found a good tutorial on fulltext search as well.
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
The problem with the ‘%keyword%’ type search is that there is no way to efficiently search on it in a regular table, even if you create an index on that column. Think about how you would look that string up in the phone book. There is actually no way to optimize it – you have to scan the entire phone book – and that is what MySQL does, a full table scan.
If you change that search to ‘keyword%’ and use an index, you can get very fast searching. It sounds like this is not what you want, though.
So with that in mind, I have used fulltext indexing/searching quite a bit, and here are a few pros and cons:
- Very fast
- Returns results sorted by relevance (by default, although you can use any sorting)
- Stop words can be used.
- Only works with MyISAM tables
- Words that are too short are ignored (default minimum is 4 letters)
- Requires different SQL in where clause, so you will need to modify existing queries.
- Does not match partial strings (for example, ‘word’ does not match ‘keyword’, only ‘word’)
Another option is to use a searching system such as Sphinx. It can be extremely fast and flexible. It is optimized for searching and integrates well with MySQL.
You might also consider Zend_Lucene. It’s slightly easier to integrate than Sphinx, because it is pure PHP.
I would guess that MySQL fulltext is sufficient for your needs, but it’s worth noting that the built in support doesn’t scale very well. For average size documents it starts to become unusable for table sizes as small as a few hundred thousand rows. If you think that this might become a problem further on you should probably look into Sphinx already. It’s becoming the defacto standard for MYSQL-users, even though I personally prefer to implement my own solution using java lucene. 🙂
Also, I’d like to mention that full text search is fundamentally different from the standard LIKE ‘%keyword%’-search. Unlike the LIKE-search full text indexing allows you to search for several keywords that doesn’t have to appear right next to each other. Standard search engines such as google are full text search engines, for example.