DeutschEnglish

Submenu

 - - - By CrazyStat - - -

15. July 2016

Request-Tracker4: Enabling fast fulltext search with Sphinxsearch on Debian Jessie

I just successfully made fulltext search in our request tracker super fast. This is a guide for Debian Jessie and request tracker set up to use MySQL (or MariaDB).

  • Don’t forget to backup your RT database:
    mysqldump -p -u root rtdb > rtdb.sql
  • Log out of request tracker. I had some strange issue because I did not 😉
  • Make sure you are using request-tracker4 version 4.2.12 from jessie-backports. If you don’t already, this is how you update:
    – add this to the end of your /etc/apt/sources.list:

    deb http://ftp.debian.org/debian jessie-backports main

    – run

    apt-get update

    – install request-tracker4 from jessie-backports:

    apt-get -t jessie-backports install request-tracker4

    – follow the update procedure, make sure your database is upgraded
    (We need the package from backports as jessie’s rt4-db-mysql package depends on mysql-client, whereas the backports version depends on virtual-mysql-client and thus can also be easily used with MariaDB.)

  • Second, make sure you are using MariaDB instead of MySQL. MariaDB is a drop-in replacement for MySQL. The big advantage of MariaDB here is that it comes with the SphinxSE storage engine that we will use for the fulltext search. SphinxSE is also available for MySQL, but you would need to compile it yourself whereas MariaDB includes SphinxSE by default. MariaDB has also a lot more advantages. Switching to MariaDB is very simple, no need to convert databases or configuration. I used aptitude and selected to install mariadb-client and maridb-server. Then I checked the dependency resolutions and choose the option to uninstall mysql-client, mysql-server and related packages. Then everything went smooth automatically, I just needed to enter the root password again, everything else was updated automatically.
  • Now check if your RT is still working well with MariaDB.
  • Now enable the SphinxSE storage engine in MariaDB:
    mysql -u root -p
    [enter password]
    INSTALL SONAME 'ha_sphinx';
    SHOW ENGINES;
    EXIT;
  • Now install the Sphinxsearch daemon:
    apt-get install sphinxsearch
  • Now set up indexed full text search in RT:
    rt-setup-fulltext-index --dba root --dba-password secret

    (Let me know if you find a way without passing the password as a parameter)
    This asks you what type of index you want to use. Enter “sphinx”. Then keep the defaults.

  • The command above gave you code for /etc/request-tracker4/RT_SiteConfig.pm – like this:
    Set( %FullTextSearch,
        Enable     => 1,
        Indexed    => 1,
        MaxMatches => '10000',
        Table      => 'AttachmentsIndex',
    );

    Add / adjust this in RT_SiteConfig.pm.

  • Then you need to configure the index in sphinx. The above tool also gave you a config file for this, but it does not work with the current sphinxsearch version of jessie. So here is what works. Create an /etc/sphinxsearch/sphinx.conf like this:
    source rt {
        type            = mysql
    
        sql_host        = localhost
        sql_db          = rtdb
        sql_user        = rtuser
        sql_pass        = RT_PASSWORD
    
        sql_query_pre   = SET NAMES utf8
        sql_query       = \
            SELECT a.id, t.Subject, a.content FROM Attachments a \
            JOIN Transactions txn ON a.TransactionId = txn.id AND txn.ObjectType = 'RT::Ticket' \
            JOIN Tickets t ON txn.ObjectId = t.id \
            WHERE a.ContentType LIKE 'text/%' AND t.Status != 'deleted'
    
    #    sql_query_info  = SELECT * FROM Attachments WHERE id=$id
    }
    
    index rt {
        source                  = rt
        path                    = /var/lib/sphinxsearch/data/rt
        docinfo                 = extern
    #    charset_type            = utf-8
    }
    
    indexer {
        mem_limit               = 32M
    }
    
    searchd {
        listen                  = 3312
        log                     = /var/log/sphinxsearch/searchd.log
        query_log               = /var/log/sphinxsearch/query.log
        read_timeout            = 5
        max_children            = 30
        pid_file                = /var/run/sphinxsearch/searchd.pid
    #    max_matches             = 10000
        seamless_rotate         = 1
        preopen_indexes         = 0
        unlink_old              = 1
        # For sphinx >= 1.10:
        binlog_path             = /var/lib/sphinxsearch/data
    }

    Make sure to adjust the mysql database, user and password.
    Note that there are some differences to the config provided by rt-setup-fulltext-index. Especially, you need to use listen instead of port. If you don’t adjust this, sphinxsearch will start on a different port and RT will not find any results in fulltext searches! If your RT is not behind a firewall, better also set listen to 127.0.0.1:3312 . Also, I used the Debian style paths from the Debian sample. The fields commented out are not necessary anymore in current sphinxsearch.
    UPDATE 26.08.2016: I changed the SQL Query to also index the Subject of the tickets and index all ‘text/%’ types and not only ‘text/plain’. This change is based on a comment by Claes Merin in the Request Tracker Mailing List. Thanks a lot. Without this change, the fulltext search will only search within ticket content, but not within ticket titles… Also, before it would not index text/html attachments.

  • Now run the indexer to build the initial index. This will take some time, for a 6 GB Attachments Table, it took not more than 30 minutes. Maybe run this in a screen in case your ssh connection breaks:
    indexer rt
  • Enable the Sphinx daemon. To do so, adjust START to yes in /etc/default/sphinxsearch
  • Finally, start the sphinxsearch daemon:
    /etc/init.d/sphinxsearch start
  • Now you can log into RT and do a fulltext search by entering in the search field:
    fulltext:search_word
  • In my case, this made the fulltext searches run within a second instead of 5 minutes 🙂
  • Make sure the indexer is run regularily to update the index. To do so, enter
    crontab -e

    And add a line like this:

    13 * * * * /usr/bin/indexer rt --rotate

I hope this helps somebody to setup this a little faster.

 

 

 

Recommendation

Try my Open Source PHP visitor analytics script CrazyStat.

11. August 2013

RT Request Tracker: Migrate From SQlite to mySQL

Filed under: DBMS,Linux,Server Administration — Tags: , , , , , , — Christopher Kramer @ 23:07

I lately had to migrate an RT installation (version 4.0.4) from SQLite to MySQL. In case anyone else has to do this, here is a brief description of how it worked out.

  1. Setup a working MySQL server in case you have not already
  2. Create a MySQL user for RT (e.g. rt4)
  3. Configure RT to use MySQL using this username and a dbname (e.g. rt4) of a not yet existing db (See /etc/request-tracker4/RT_SiteConfig.pm and RT_SiteConfig.d/)
    I’d recommend to keep a copy of the SQlite-Config…
  4. Run
    rt-setup-database --action init

    to create a blank RT DB in MySQL. Check if it worked.

  5. Delete all rows from the MySQL DB, only keep the schema.
  6. Create a copy of your SQLite db
    cp rtdb mydbcopy
  7. Open the copy of the db in the sqlite shell
    sqlite3 mydbcopy
  8. For each table in the DB:
    1. Set the Output file to something like this (“Attachments” is the table name)

    .out data_Attachments

    2. Set the mode to insert (“Attachments” is again the table name)

    .mode insert Attachments

    3. Get all the data

    SELECT * FROM Attachments;

    4. Now you have a file with lots of INSERT statements for the Attachments table. Try to run it in mysql:

    mysql -u rt4 -p rt4 < data_Attachments

    (On a linux shell, not the sqlite shell of course. Here the first rt4 is the username and the second the dbname. data_Attachments is the dump file created before)
    5. In case mysql complains some NOT NULL constraints are violated:
    Go back to the sqlite shell and set these cells to the default-value (0):

    UPDATE Attachments SET someColumn=0 WHERE someColumn IS NULL;

    (Only do this on the copy, not the original db 😉 )
    Now recreate the dump and retry to insert this in mysql. Do it with all columns where NOT NULL constraints are violated.

This works table by table, there are no foreign keys that would get in the way. You can also do several tables in one output file, but you might run into problems when NOT NULL constraints are violated by one table. After all your tables are filled with your data, RT should work. Maybe restart Apache.

apache2ctl restart

This worked without problems so far for me. I first tried pumping the whole SQLite dump into MySQL (using this conversion script) but the schema that this ended up in was different, missed indexes and RT only liked it until I restarted Apache (which then refused to start). Better start with a schema created by rt-setup-database, not with one that originates from SQLite.

I hope this is of some help for somebody. Please let me know in case it helped you or if you have any comments or questions.

14. August 2012

MySQL: How to do a (simple/complex) CHECK CONSTRAINT?

Filed under: DBMS — Tags: , , , , , , , , , — Christopher Kramer @ 02:27

Another “badly-missing-feature”-post. I should create a category for that I guess…

Usually, you can add a constraint to a table in SQL with something like:

ALTER TABLE foo ADD CONSTRAINT my_constraint CHECK (bla<>0)

Some sites like w3school even tell you this is the syntax for it in MySQL. Well, I guess they have never tried this and not even read the manual. Well, you can execute it in MySQL without an error. But also without any effect whatsoever. I tried it in InnoDB (MySQL 5.5) and it did not check the constraint at all.

The MySQL manual says:

The CHECK clause is parsed but ignored by all storage engines

Hallelujah, MySQL is great. A bug request for that exists since 2004, so don’t expect it to be fixed soon…

So we go with a trigger instead. For constraints, we use a BEFORE trigger. So next thing I tried looked something like this:

CREATE TRIGGER check_my_constraint BEFORE UPDATE ON `foo`
     FOR EACH ROW 
     BEGIN 
     IF (NEW.bla=0)
     THEN 
     ROLLBACK;
     END IF; 
     END

Should work, but doesn’t. Problem is that MySQL doesn’t like ROLLBACK here. Hallelujah, MySQL is great. So what worked in the end is sending an error signal:

CREATE TRIGGER check_my_constraint BEFORE UPDATE ON `foo`
     FOR EACH ROW 
     BEGIN 
     DECLARE msg varchar(255);
     IF (NEW.bla=0)
     THEN 
        SET msg = concat('Constraint my_constraint violated: bla must not be zero ', cast(new.bla as char));
        SIGNAL sqlstate '45000' SET message_text = msg;
     END IF; 
     END

Thanks to Rui Da-Costa for the solution with the SIGNAL.

The cool thing about triggers is that you can do complex checks here. So I just used it for something like:

 IF (NEW.ref_id IS NOT NULL AND
     NEW.ref2_id<>(SELECT `ref2_id` FROM `ref` WHERE `ref`.id=NEW.`ref_id`))

So that makes sure that ref_id is either null or it refers to a row in table `ref` with this id that has the same ref2_id as the table we are checking. So a more complex type of foreign key relationship.

Hope this helped somebody to get things solved more quickly. Don’t forget to add an INSERT trigger as well.

Another remark: Make sure you do not use semicolon as delimiter when using SQL/PL syntax. phpMyAdmin has an extra field to change the delimiter (below the textarea where you enter your SQL) for example.