This documentation is for Dovecot v2.x; see wiki1 for v1.x documentation.

Full text search indexing

The following FTS indexers (in preferred order) are supported:

  • Solr communicates with an Apache Solr server (which is built on Lucene).

  • Lucene uses Lucene's C++ library. (Requires v2.1+)

  • fts-dovecot is Dovecot Pro's new search index, and is not available without a commercial agreement. (Requires v2.2+)

  • Squat is Dovecot's own search index. (Obsolete in v2.1+)

  • fts-xapian is a Xapian-based plugin maintained by <jom AT NOSPAM grosjo DOT net>. (Requires v2.3+)

Indexing

By default the FTS indexes are updated only while searching, so neither the LDA nor an IMAP APPEND command updates the indexes immediately. This means that if a user has received a lot of mail since the last indexing (that is, the last search operation), it may take a while to index all the mails before replying to the search command. Dovecot sends periodic "* OK Indexed n% of the mailbox" updates, which webmail implementations can catch to implement a progress bar.

In v2.2.9+ the indexing can be done automatically with the fts_autoindex=yes setting (see below).

The indexing can also be done manually (e.g. from a cron job) or by an LDA script by running:

  • v2.1: doveadm index -u user@domain -q INBOX

  • v2.0: printf "a select INBOX\nb search text xyzzy\nc logout\n" | /usr/local/libexec/dovecot/imap -u user@domain

Replace INBOX with whichever mailbox needs to be indexed.
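
For a cron job, the v2.1 command above can be wrapped in a small script. The following is a minimal sketch rather than a tested deployment: the function name index_mailboxes, the DOVEADM override variable and the mailbox names are assumptions made for illustration. Only the doveadm index -u ... -q ... invocation is taken from the text above.

```shell
#!/bin/sh
# Hypothetical helper: queue FTS indexing for several mailboxes of one user.
# DOVEADM can be overridden (e.g. DOVEADM=echo) to get a dry run.
DOVEADM="${DOVEADM:-doveadm}"

index_mailboxes() {
    user="$1"
    shift
    for mailbox in "$@"; do
        # Same invocation as the v2.1 command above, once per mailbox.
        "$DOVEADM" index -u "$user" -q "$mailbox" || return 1
    done
}

# Example call (user and mailbox names are placeholders):
# index_mailboxes user@domain INBOX Archive
```

Setting DOVEADM=echo before calling the function prints the commands instead of running them, which is a cheap way to check the mailbox list first.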

Indexing Attachments (v2.1+)

Attachments can be indexed either via a script that translates the attachment to UTF-8 plaintext or via an Apache Tika server.

  • fts_decoder = <service>: Decode attachments to plaintext using this service and index the resulting plaintext. See the decode2text.sh script included in Dovecot for how to use this. (v2.1+)

  • fts_tika = http://tikahost:9998/tika/: An Apache Tika server must be running at this URL (e.g. started with java -jar tika-server/target/tika-server-1.5.jar) (v2.2.13+)
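
As an illustration of the fts_decoder option, the fragment below wires a service named decode2text to the bundled script. Treat it as a sketch under assumptions: the service name, the script path /usr/libexec/dovecot/decode2text.sh, the user and the socket mode all depend on the installation and are not stated in the text above.

```
plugin {
  # Name of the service that converts attachments to plaintext.
  fts_decoder = decode2text
  # Alternative (v2.2.13+): use a Tika server instead of a decoder script.
  #fts_tika = http://tikahost:9998/tika/
}

service decode2text {
  # Assumed install path; adjust to where decode2text.sh actually lives.
  executable = script /usr/libexec/dovecot/decode2text.sh
  user = dovecot
  unix_listener decode2text {
    mode = 0666
  }
}
```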

Rescan (v2.1+)

Since v2.1 Dovecot keeps track of indexed messages in the dovecot.index files. If this record becomes out of sync with the actual FTS indexes (it lists either too many or too few mails), you'll need to do a rescan:

doveadm fts rescan -u user@domain
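
A rescan is typically followed by an index run, so that any mails found to be missing are indexed again. The sketch below chains the two commands. The function name rescan_and_index, the DOVEADM override variable and the INBOX default are hypothetical. The commands themselves are the ones shown on this page, with -q left out so that the index run is not queued.

```shell
#!/bin/sh
# Hypothetical helper: rescan a user's FTS index, then index one mailbox again.
# DOVEADM can be overridden (e.g. DOVEADM=echo) to get a dry run.
DOVEADM="${DOVEADM:-doveadm}"

rescan_and_index() {
    user="$1"
    mailbox="${2:-INBOX}"
    # Bring the dovecot.index bookkeeping back in line with the FTS backend.
    "$DOVEADM" fts rescan -u "$user" || return 1
    # Index whatever the rescan marked as missing.
    "$DOVEADM" index -u "$user" "$mailbox"
}

# Example call (placeholder user):
# rescan_and_index user@domain INBOX
```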

Other Settings

All the FTS settings go inside the plugin {} section of 90-plugin.conf.

  • fts_autoindex=yes: Index new messages immediately after they've been saved/copied. (v2.2.9+)

  • fts_autoindex_exclude=pattern1, fts_autoindex_exclude2=pattern2, ...: Exclude given mailboxes, one pattern per setting. Supports "*" and "?" wildcards. If a name starts with '\', it's treated as a case-insensitive special-use flag. (v2.2.25+)

    • Example:

      plugin {
        fts_autoindex_exclude = \Junk
        fts_autoindex_exclude2 = \Trash
        fts_autoindex_exclude3 = DUMPSTER
      }
  • fts_autoindex_max_recent_msgs=n: Skip autoindexing the mailbox if it has more than n \Recent messages (implying that the mailbox is never actually being accessed). (v2.2.9+)

  • fts_enforced:

    • no (default): All body searches will index all missing mails in FTS. Header searches will use FTS if the mails are indexed, otherwise they fall back to parsing the headers (usually from dovecot.index.cache). If the FTS search fails, Dovecot falls back to reading and parsing all mails.
    • yes: All header and body searches will index all missing mails in FTS. If the FTS search fails, an error is returned to the client.
  • fts_index_timeout: When SEARCH notices that the index isn't up to date, it tells the indexer to index the mails and waits until it has finished. This setting adds a maximum timeout to this wait. If the timeout is reached, the SEARCH fails with: NO [INUSE] Timeout while waiting for indexing to finish (v2.1+)
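
Taken together, the settings above might be combined as in the following fragment. It is only a sketch: the threshold of 999 and the choice of excluded mailboxes are arbitrary illustrative values, and fts_index_timeout is left out because its value format is not given here.

```
plugin {
  # Index new mail as soon as it is saved or copied (v2.2.9+).
  fts_autoindex = yes
  # Illustrative threshold: skip autoindexing mailboxes with more than
  # 999 \Recent messages, since they are probably never opened.
  fts_autoindex_max_recent_msgs = 999
  # Skip the special-use Junk and Trash mailboxes (v2.2.25+).
  fts_autoindex_exclude = \Junk
  fts_autoindex_exclude2 = \Trash
  # Return an error to the client instead of falling back to
  # reading and parsing all mails when an FTS search fails.
  fts_enforced = yes
}
```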

Plugins/FTS (last edited 2019-02-09 02:47:00 by wf126-083)