DeutschEnglish

Submenu

 - - - By CrazyStat - - -

15. July 2016

Request-Tracker4: Enabling fast fulltext search with Sphinxsearch on Debian Jessie

I just successfully made fulltext search in our request tracker super fast. This is a guide for Debian Jessie and request tracker set up to use MySQL (or MariaDB).

  • Don’t forget to backup your RT database:
    mysqldump -p -u root rtdb > rtdb.sql
  • Log out of request tracker. I had some strange issue because I did not 😉
  • Make sure you are using request-tracker4 version 4.2.12 from jessie-backports. If you don’t already, this is how you update:
    – add this to the end of your /etc/apt/sources.list:

    deb http://ftp.debian.org/debian jessie-backports main

    – run

    apt-get update

    – install request-tracker4 from jessie-backports:

    apt-get -t jessie-backports install request-tracker4

    – follow the update procedure, make sure your database is upgraded
    (We need the package from backports as jessie’s rt4-db-mysql package depends on mysql-client, whereas the backports version depends on virtual-mysql-client and thus can also be easily used with MariaDB.)

  • Second, make sure you are using MariaDB instead of MySQL. MariaDB is a drop-in replacement for MySQL. The big advantage of MariaDB here is that it comes with the SphinxSE storage engine that we will use for the fulltext search. SphinxSE is also available for MySQL, but you would need to compile it yourself whereas MariaDB includes SphinxSE by default. MariaDB has also a lot more advantages. Switching to MariaDB is very simple, no need to convert databases or configuration. I used aptitude and selected to install mariadb-client and maridb-server. Then I checked the dependency resolutions and choose the option to uninstall mysql-client, mysql-server and related packages. Then everything went smooth automatically, I just needed to enter the root password again, everything else was updated automatically.
  • Now check if your RT is still working well with MariaDB.
  • Now enable the SphinxSE storage engine in MariaDB:
    mysql -u root -p
    [enter password]
    INSTALL SONAME 'ha_sphinx';
    SHOW ENGINES;
    EXIT;
  • Now install the Sphinxsearch daemon:
    apt-get install sphinxsearch
  • Now set up indexed full text search in RT:
    rt-setup-fulltext-index --dba root --dba-password secret

    (Let me know if you find a way without passing the password as a parameter)
    This asks you what type of index you want to use. Enter “sphinx”. Then keep the defaults.

  • The command above gave you code for /etc/request-tracker4/RT_SiteConfig.pm – like this:
    Set( %FullTextSearch,
        Enable     => 1,
        Indexed    => 1,
        MaxMatches => '10000',
        Table      => 'AttachmentsIndex',
    );

    Add / adjust this in RT_SiteConfig.pm.

  • Then you need to configure the index in sphinx. The above tool also gave you a config file for this, but it does not work with the current sphinxsearch version of jessie. So here is what works. Create an /etc/sphinxsearch/sphinx.conf like this:
    source rt {
        type            = mysql
    
        sql_host        = localhost
        sql_db          = rtdb
        sql_user        = rtuser
        sql_pass        = RT_PASSWORD
    
        sql_query_pre   = SET NAMES utf8
        sql_query       = \
            SELECT a.id, t.Subject, a.content FROM Attachments a \
            JOIN Transactions txn ON a.TransactionId = txn.id AND txn.ObjectType = 'RT::Ticket' \
            JOIN Tickets t ON txn.ObjectId = t.id \
            WHERE a.ContentType LIKE 'text/%' AND t.Status != 'deleted'
    
    #    sql_query_info  = SELECT * FROM Attachments WHERE id=$id
    }
    
    index rt {
        source                  = rt
        path                    = /var/lib/sphinxsearch/data/rt
        docinfo                 = extern
    #    charset_type            = utf-8
    }
    
    indexer {
        mem_limit               = 32M
    }
    
    searchd {
        listen                  = 3312
        log                     = /var/log/sphinxsearch/searchd.log
        query_log               = /var/log/sphinxsearch/query.log
        read_timeout            = 5
        max_children            = 30
        pid_file                = /var/run/sphinxsearch/searchd.pid
    #    max_matches             = 10000
        seamless_rotate         = 1
        preopen_indexes         = 0
        unlink_old              = 1
        # For sphinx >= 1.10:
        binlog_path             = /var/lib/sphinxsearch/data
    }

    Make sure to adjust the mysql database, user and password.
    Note that there are some differences to the config provided by rt-setup-fulltext-index. Especially, you need to use listen instead of port. If you don’t adjust this, sphinxsearch will start on a different port and RT will not find any results in fulltext searches! If your RT is not behind a firewall, better also set listen to 127.0.0.1:3312 . Also, I used the Debian style paths from the Debian sample. The fields commented out are not necessary anymore in current sphinxsearch.
    UPDATE 26.08.2016: I changed the SQL Query to also index the Subject of the tickets and index all ‘text/%’ types and not only ‘text/plain’. This change is based on a comment by Claes Merin in the Request Tracker Mailing List. Thanks a lot. Without this change, the fulltext search will only search within ticket content, but not within ticket titles… Also, before it would not index text/html attachments.

  • Now run the indexer to build the initial index. This will take some time, for a 6 GB Attachments Table, it took not more than 30 minutes. Maybe run this in a screen in case your ssh connection breaks:
    indexer rt
  • Enable the Sphinx daemon. To do so, adjust START to yes in /etc/default/sphinxsearch
  • Finally, start the sphinxsearch daemon:
    /etc/init.d/sphinxsearch start
  • Now you can log into RT and do a fulltext search by entering in the search field:
    fulltext:search_word
  • In my case, this made the fulltext searches run within a second instead of 5 minutes 🙂
  • Make sure the indexer is run regularily to update the index. To do so, enter
    crontab -e

    And add a line like this:

    13 * * * * /usr/bin/indexer rt --rotate

I hope this helps somebody to setup this a little faster.

 

 

 

Recommendation

Try my Open Source PHP visitor analytics script CrazyStat.

3 Comments »

  1. Great article, worked like a charm!

    You can use MySQL too, but in Debian it does not have Sphinx built in, so you will have to compile it.

    Comment by Nepto — 26. August 2016 @ 09:20

  2. Works perfectly. Thanks!

    Comment by Sorin — 19. May 2017 @ 11:33

  3. Thank you for this article, it really helped me a lot. It also works with the latest available MariaDB and sphinxsearch from maintainers, it’s not necessary to stay with little bit outdated Debian packages.

    Some tips:
    – when you want to have all sphinxsearch files in different than default prefix, eg. in /opt/rt4/var/sphinx/, you need to also update PIDFILE variable in /etc/init.d/sphinxsearch script (for the sphinxsearch version from maintainer, I have not checked the one from Debian repo).

    – shouldn’t be the files produced by indexer owned by sphinxsearch instead of root? If so, you need to also enhance the crontab line with something like ‘runuser’.

    Comment by Petr Hanousek — 13. July 2017 @ 15:58

RSS feed for comments on this post. TrackBack URL

Leave a comment