Sphinx Install


Warning: Some steps in this document are intended for advanced users and require root access to your web server, which is usually available only with dedicated or VPS hosting. Do not attempt them if you are not sure of what you are doing.

Introduction
IP.Board 3.x supports Sphinx for full-text searching of your site's content out of the box. Installing and configuring Sphinx itself is still your responsibility, and this article walks you through doing that so you can use it with IP.Board 3.x.

Please be advised that an application must define a Sphinx template for its content to be searchable. If you install a third-party application that does not properly define a Sphinx template file, its content will not be searchable through Sphinx.

Installing Sphinx
The first thing you must do is install Sphinx itself. Sphinx is a third-party search engine available at http://sphinxsearch.com . The documentation on that site explains how to install Sphinx, but below you will find the general commands you will need to run.

Log in to your web server as root. If you do not have root access to your web server, contact your host for assistance.

Change directory to a temp directory and download sphinx. Untar the package afterwards, and move into the untarred Sphinx directory.

cd /tmp
wget http://sphinxsearch.com/downloads/sphinx-0.9.8.1.tar.gz
tar xzvf sphinx-0.9.8.1.tar.gz
cd sphinx-0.9.8.1
Next, you need to configure, build and install the package.

./configure
make
make install

If you get an error at any of these steps, stop and correct it before continuing. For instance, if Sphinx cannot find your MySQL installation, you can tell it where to look by passing "--with-mysql=(path)" to the ./configure command.
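
As a minimal sketch of the "stop at the first error" advice, the helper below runs the build steps in order and halts as soon as one of them fails. The function name run_steps is hypothetical (it is not part of Sphinx), and the commented invocation assumes you are inside the unpacked Sphinx source directory.

```shell
#!/bin/sh
# run_steps (hypothetical helper): run each command string in order and
# stop at the first failure, so later steps never run on a broken build.
run_steps() {
    for step in "$@"; do
        echo "==> $step"
        if ! sh -c "$step"; then
            echo "step failed: $step" >&2
            return 1
        fi
    done
}

# Intended use, from inside the unpacked Sphinx source directory:
#   run_steps "./configure" "make" "make install"
```

The last "==>" line printed shows which step needs attention before you retry.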

Once this is done, Sphinx is installed and ready to be used (though there is still more work to do).

You will also need to copy the api/sphinxapi.php file provided in the Sphinx download to your forum root directory.

cp api/sphinxapi.php /path/to/forums/here
Next, you should create the directories in which Sphinx will store its log files and index files. The suggested directory is /var/sphinx, but you can create the directory anywhere you wish. Just remember where you put it.

mkdir -p /var/sphinx/log
Configuring IPB
Now, log in to your IPB admin control panel. Visit System -> System Settings -> Search Set-Up. Change "Type of search" to "Sphinx" in the dropdown, and configure the Sphinx settings appropriately. In most cases you do not need to change any of the Sphinx settings. However, if you created a directory other than /var/sphinx, or if you are installing Sphinx on your MySQL server in a multi-server setup, you will need to adjust them. Save the settings.

Visit System -> Manage Applications & Modules next, and click on Build Sphinx Config. You will be presented with a downloadable copy of sphinx.conf. Download this file, and then upload it to your server (the exact location is unimportant, but remember where you put it).

Creating the index and starting the search daemon
Back in the shell, you need to index your searchable content. This is an expensive task, but even with very large databases (4 million posts or more) it does not take very long.

Run the following command, replacing the path to the sphinx.conf file appropriately

/usr/local/bin/indexer --config /path/to/sphinx.conf --all
Once this is done, you need to start the search daemon.

/usr/local/bin/searchd --config /path/to/sphinx.conf
Once the search daemon is running, the search feature on your site should use Sphinx as its back end. Run a quick search to make sure everything is working before proceeding further. Note that "View new content", "Active posts", and "Find posts/topics by member" still use internal searching and do not use Sphinx in 3.0.0, so you need to perform an actual keyword search to test this.

Final "tweaks"
There are two more steps you need to do.

First, you need to create two cron jobs to rebuild the indexes at intervals. The first rebuilds the "delta" indexes every 15 minutes; because it only picks up new content, it is not overly resource-heavy. The second rebuilds the entire index once a day (to ensure edited posts and so forth are re-indexed properly). Since it has to rebuild everything, schedule it for the time when your server is least busy (e.g. 4 AM).

crontab -e
*/15 * * * * /usr/local/bin/indexer --config /path/to/sphinx.conf core_search_delta members_search_delta forums_search_posts_delta --rotate
0 4 * * * /usr/local/bin/indexer --config /path/to/sphinx.conf --all --rotate
Again, remember to replace the path to the Sphinx configuration file appropriately. Note that the first cron job has to specify which indexes to rebuild (only the _delta indexes). There should be one delta index for each installed application that supports Sphinx (except for the "forums" application, which has two). Thus, if you install Calendar, Blog, Gallery and Downloads, you should change the cron job like so:

*/15 * * * * /usr/local/bin/indexer --config /path/to/sphinx.conf core_search_delta members_search_delta forums_search_posts_delta forums_search_topics_delta calendar_search_delta downloads_search_delta blog_search_delta gallery_search_delta --rotate
If you omit one, it will simply mean that new content won't be added to the index until the full search index is rebuilt at 4 AM. If in doubt, just search the sphinx.conf file you have downloaded for "_delta" and note all of the indexes you find that have this suffix.
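
The "_delta" lookup can also be scripted. This is a rough sketch that assumes the generated sphinx.conf declares each index on a line of the form "index NAME" or "index NAME : PARENT" (with spaces around the colon), which is standard Sphinx configuration syntax. The function names list_delta_indexes and delta_cron_line are hypothetical, and the indexer path is the one used elsewhere in this article.

```shell
#!/bin/sh
# list_delta_indexes (hypothetical helper): print the name of every index
# declared in the given sphinx.conf whose name ends in "_delta".
list_delta_indexes() {
    # An index declaration looks like "index NAME" or "index NAME : PARENT";
    # "source ..." lines are ignored because their first word is not "index".
    awk '$1 == "index" && $2 ~ /_delta$/ { print $2 }' "$1"
}

# delta_cron_line (hypothetical helper): print a 15-minute crontab entry
# that rebuilds every delta index found in the given sphinx.conf.
delta_cron_line() {
    indexes=$(list_delta_indexes "$1" | tr '\n' ' ')
    if [ -z "$indexes" ]; then
        echo "no _delta indexes found in $1" >&2
        return 1
    fi
    echo "*/15 * * * * /usr/local/bin/indexer --config $1 ${indexes}--rotate"
}

# Intended use:
#   delta_cron_line /path/to/sphinx.conf
```

Compare the printed line with your existing crontab entry after installing or removing an application, and update the entry by hand if the index list has changed.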

Finally, you want to make sure that Sphinx is started again whenever the server restarts. The method of doing this varies from system to system, so contact your system administrator if you are unsure. On CentOS, we generally use the following:

nano /etc/init.d/rc.local
and add the following to the file:

rm -f /var/sphinx/*.spl
/usr/local/bin/searchd --config /path/to/sphinx.conf

This will remove any lingering lock files that may have been left and restart Sphinx. Adjust the paths as appropriate.
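
A slightly more defensive variant of those two startup lines might look like the sketch below. The function name start_sphinx and its arguments are hypothetical; the paths in the commented invocation are the ones used in this article and should be adjusted to match your setup.

```shell
#!/bin/sh
# start_sphinx (hypothetical helper): remove stale lock files and start
# searchd, but only if the configuration file is readable.
#   $1 = path to sphinx.conf
#   $2 = directory holding the index files (and any leftover *.spl locks)
#   $3 = searchd binary (optional, defaults to /usr/local/bin/searchd)
start_sphinx() {
    conf=$1
    index_dir=$2
    searchd_bin=${3:-/usr/local/bin/searchd}

    if [ ! -r "$conf" ]; then
        echo "cannot read config: $conf" >&2
        return 1
    fi

    # Clear lock files left behind by an unclean shutdown, then start the daemon.
    rm -f "$index_dir"/*.spl
    "$searchd_bin" --config "$conf"
}

# Intended use in the boot script:
#   start_sphinx /path/to/sphinx.conf /var/sphinx
```

Checking the configuration file first means a wrong path produces a clear message at boot, and the lock files are left untouched until the path is fixed.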

Conclusion
Sphinx is an excellent search engine and can reduce resource usage on your servers once it is set up and in use. You will need some command-line/Linux technical knowledge to do so, but once it's set up you shouldn't have to make many changes to it (only when you install new applications and want them to be searchable). We hope this article provides the information necessary to set up and use Sphinx with IP.Board 3.0.
