This documentation is for Dovecot v2.x, see wiki1 for v1.x documentation.
Differences between revisions 6 and 7
Revision 6 as of 2016-12-28 17:19:13
Size: 10197
Editor: AkiTuomi
Comment:
Revision 7 as of 2021-09-24 15:04:33
Size: 90
Editor: TimoSirainen
Comment:
Deletions are marked like this. Additions are marked like this.
Line 1: Line 1:
= Main index =

The main index can be used to quickly look up messages' UIDs, flags, keywords and extension-specific data, such as cache file or mbox file offsets.

== Reading, writing and locking ==

Reading {{{dovecot.index}}} file requires locking, unfortunately. Shared read locking is done using the standard index locking method specified in {{{lock_method}}} setting ({{{lock_method}}} parameter for {{{mail_index_open()}}}).

Writing to index files requires transaction log to be exclusively locked first. This way the index locking only has to worry about existing read locks. The locking works by first trying to lock the index with the standard locking method, but if it couldn't acquire the lock in two seconds, it'll fallback to copying the index file to a temporary file, and when unlocking it'll {{{rename()}}} the temporary file over the {{{dovecot.index}}} file. Note that this is safe only because of the exclusive transaction log lock. This way the writers are never blocked by readers who are allowed to keep the shared lock as long as they want.

The copy-locking is used always when doing anything that could corrupt the index file if it crashed in the middle of an operation. For example if the header or record size changes, or if messages are expunged. New messages can be appended however, because the message count in the header is updated last. Expunging the last messages would probably be safe also (because only the header needs updating), but it's not done currently.

The index file should never be directly modified. Everything should go through the transaction log, and the only time the index needs to be write-locked is when transactions are written to it.

Currently the index file is updated whenever the backend mailbox is synchronized. This isn't necessary, because an old index file can be updated using the transaction log. In future there could be some smarter decisions about when writing to the index isn't worth the extra disk writes.

== Header ==

{{{
struct mail_index_header {
        uint8_t major_version;
        uint8_t minor_version;

        uint16_t base_header_size;
        uint32_t header_size;
        uint32_t record_size;

        uint8_t compat_flags;
        uint8_t unused[3];

        uint32_t indexid;
        uint32_t flags;

        uint32_t uid_validity;
        uint32_t next_uid;

        uint32_t messages_count;
        uint32_t unused_old_recent_messages_count;
        uint32_t seen_messages_count;
        uint32_t deleted_messages_count;

        uint32_t first_recent_uid;
        uint32_t first_unseen_uid_lowwater;
        uint32_t first_deleted_uid_lowwater;

        uint32_t log_file_seq;
        uint32_t log_file_tail_offset;
        uint32_t log_file_head_offset;

        uint64_t unused_old_sync_size;
        uint32_t unused_old_sync_stamp;

        uint32_t day_stamp;
        uint32_t day_first_uid[8];
}
}}}

Fields that won't change without recreating the index:

 major_version:: If this doesn't match {{{MAIL_INDEX_MAJOR_VERSION}}}, don't try to read the index. Dovecot recreates the index file then.
 minor_version:: If this doesn't match {{{MAIL_INDEX_MINOR_VERSION}}} there are some backwards compatible changes in the index file (typically header fields). Try to preserve the headers and the minor version when updating the index file.
 base_header_size:: Extension headers begin after the base headers. This is normally the same as {{{sizeof(struct mail_index_header)}}}.
 header_size:: Records begin after base and extension headers.
 record_size:: Size of each record and its extensions. Initially the same as {{{sizeof(struct mail_index_record)}}}.
 compat_flags:: Currently there is just one compatibility flag: {{{MAIL_INDEX_COMPAT_LITTLE_ENDIAN}}}. Dovecot doesn't try to bother to read different endianess files, they're simply recreated.
 indexid:: Unique index file ID. This is used to make sure that the main index, transaction log and cache file are all part of the same index.

Header flags:

 MAIL_INDEX_HDR_FLAG_CORRUPTED:: Set whenever the index file is found to be corrupted. If the reader notices this flag, it shouldn't try to continue using the index.
 MAIL_INDEX_HDR_FLAG_HAVE_DIRTY:: This index has records with {{{MAIL_INDEX_MAIL_FLAG_DIRTY}}} flag set.
 MAIL_INDEX_HDR_FLAG_FSCK:: Call {{{mail_index_fsck()}}} as soon as possible. This flag isn't actually set anywhere currently.

Message UIDs and counters:

 uid_validity:: IMAP UIDVALIDITY field. Initially can be 0, but after it's set we don't currently try to even handle the case of UIDVALIDITY changing. It's done by marking the index file corrupted and recreating it. That's a bit ugly, but typically the UIDVALIDITY never changes.
 next_uid:: UID given to the next appended message. Only increases.
 messages_count:: Number of records in the index file.
 recent_messages_count:: Number of records with {{{MAIL_RECENT}}} flag set.
 seen_messages_count:: Number of records with {{{MAIL_SEEN}}} flag set.
 deleted_messages_count:: Number of records with {{{MAIL_DELETED}}} flag set.
 first_recent_uid_lowwater:: There are no UIDs lower than this with {{{MAIL_RECENT}}} flag set.
 first_unseen_uid_lowwater:: There are no UIDs lower than this '''without''' {{{MAIL_SEEN}}} flag set.
 first_deleted_uid_lowwater:: There are no UIDs lower than this with {{{MAIL_DELETE}}} flag set.

The lowwater fields are used to optimize searching messages with/without a specific flag.

Fields related to syncing:

 log_file_seq:: Log file the log_*_offset fields point to.
 log_file_int_offset, log_file_ext_offset:: All the internal/external transactions before this offset in the log file are synced to the index. External transactions are synced more often than internal, so {{{log_file_int_offset}}} <= {{{log_file_ext_offset}}}.
 sync_size, sync_stamp:: Used by the mailbox backends to store their synchronization information. Some day these should be removed and replaced with extension headers.

Then there are day fields:

 day_stamp:: UNIX timestamp to the beginning of the day when new records were last added to the index file.
 day_first_uid[8]:: These fields are updated when {{{day_stamp}}} < today. The [0..6] are first moved to [1..7], then [0] is set to the first appended UID. So they contain the first UID of the day for last 8 days when messages were appended.

The {{{day_first_uid[]}}} fields are used by cache file compression to decide when to drop {{{MAIL_CACHE_DECISION_TEMP}}} data.

== Extension headers ==

After the base header comes a list of extensions and their headers. The first extension begins from {{{mail_index_header.base_header_size}}} offset. The second begins after the first one's {{{data[]}}} and so on. The extensions always begin 64bit aligned however, so you may need to skip a few bytes always. Read the extensions as long as the offset is smaller than {{{mail_index_header.header_size}}}.

{{{
struct mail_index_ext_header {
 uint32_t hdr_size; /* size of data[] */
 uint32_t reset_id;
 uint16_t record_offset;
 uint16_t record_size;
 uint16_t record_align;
 uint16_t name_size;
 /* unsigned char name[name_size] */
 /* unsigned char data[hdr_size] (starting 64bit aligned) */
};
}}}

{{{reset_id}}}, record offset, size and alignment is explained in [[Design/Indexes/TransactionLog]]'s {{{struct mail_transaction_ext_intro}}}.

== Records ==

There are {{{hdr.messages_count}}} records in the file. Each record contains at least two fields: Record UID and flags. The UID is always increasing for the records, so it's possible to find a record by its UID with binary search. The record size is specified by {{{mail_index_header.record_size}}}.

The flags are a combination of {{{enum mail_flags}}} and {{{enum mail_index_mail_flags}}}. There exists only one index flag currently: {{{MAIL_INDEX_MAIL_FLAG_DIRTY}}}. If a record has this flag set, it means that the mailbox syncing code should ignore the flag in the mailbox and use the flag in the index file instead. This is used for example with mbox and {{{mbox_lazy_writes=yes}}}. It also allows having modifiable flags for read-only mailboxes.

The rest data is stored in record extensions.

== Keywords ==

The keywords are stored in record extensions, but for better performance and lower disk space usage in transaction logs, they are quite tightly integrated to the index file code.

The list of keywords is stored in "keywords" extension header:

{{{
struct mail_index_keyword_header {
 uint32_t keywords_count;
 /* struct mail_index_keyword_header_rec[] */
 /* char name[][] */
};
struct mail_index_keyword_header_rec {
 uint32_t unused; /* for backwards compatibility */
 uint32_t name_offset; /* relative to beginning of name[] */
};
}}}

The unused field originally contained {{{count}}} field, but while writing this documentation I noticed it's not actually used anywhere. Apparently it was added there accidentally. It'll be removed in later versions.

So there exists {{{keywords_count}}} keywords, each listed in a NUL-terminated string beginning from {{{name_offset}}}.

Since crashing in the middle of updating the keywords list pretty much breaks the keywords, adding new keywords causes the index file to be always copied to a temporary file and be replaced.

The keywords in the records are stored in a "keywords" extension bitfield. So the nth bit in the bitfield points to the nth keyword listed in the header.

It's not currently possible to safely remove existing keywords.

== Extensions ==

The extensions only specify their wanted size and alignment, the index file syncing code is free to assign any offset inside the record to them. The extensions may be reordered at any time.

Dovecot's current extension ordering code works pretty well, but it's not perfect. If the extension size isn't the same as its alignment, it may create larger records than necessary. This will be fixed later.

The records size is always divisible by the maximum alignment requirement. This isn't strictly necessary either, so it could be fixed later as well.
Moved to https://doc.dovecot.org/developer_manual/design/indexes/index_file_format_main/

None: Design/Indexes/MainIndex (last edited 2021-09-24 15:04:33 by TimoSirainen)