This documentation is for Dovecot v2.x, see wiki1 for v1.x documentation.
Differences between revisions 4 and 5
Revision 4 as of 2013-02-08 16:13:38
Size: 8120
Editor: TimoSirainen
Comment:
Revision 5 as of 2013-02-08 16:24:12
Size: 8911
Editor: TimoSirainen
Comment:
Deletions are marked like this. Additions are marked like this.
Line 3: Line 3:
Most of the design is about getting as many compiler warnings and errors as possible. Issues found by static analyzers should also be fixed in some way, e.g. adding extra asserts. On runtime errors should be caught by asserts or NULL pointer dereferences, which cleanly crash the program instead of it continuing and possibly corrupting data or causing other bad things. Most of the design is about getting as many compiler warnings and errors as possible. Dovecot already has some patched-Clang-specific features to get more warnings, and will likely have more in future. Issues found by static analyzers should also be fixed in some way, e.g. adding extra asserts. On runtime errors should be caught by asserts or NULL pointer dereferences, which cleanly crash the program instead of it continuing and possibly corrupting data or causing other bad things. Dovecot's bottleneck is either disk I/O or memory, so using extra CPU for extra checks costs practically nothing.
Line 18: Line 18:
  * Also unless you know that a parameter can be NULL, don't bother wastine code on checking if it is. Ideally those are noticed by the patched clang check, but even if not, it's not that bad to crash on NULL pointer dereference.
Line 62: Line 63:
Use dynamically growing strings/buffers wherever necessary instead of a static sized buffer, where on larger input the function fails or truncates the data. It's of course not good to allow users to infinitely grow memory usage, so there should be some limits added, but it shouldn't fail even if the limit is set to infinite.

Code Design

Most of the design is about getting as many compiler warnings and errors as possible. Dovecot already has some patched-Clang-specific features to get more warnings, and will likely have more in future. Issues found by static analyzers should also be fixed in some way, e.g. adding extra asserts. On runtime errors should be caught by asserts or NULL pointer dereferences, which cleanly crash the program instead of it continuing and possibly corrupting data or causing other bad things. Dovecot's bottleneck is either disk I/O or memory, so using extra CPU for extra checks costs practically nothing.

Types

  • Unsigned types are used whenever the value isn't allowed to be negative. This makes it easier to do "value too large" checks when you don't also have to check for negative values. Also in arithmetic it's better to have the value wrap (and hopefully checked later!) than cause undefined behavior with a signed integer overflow.
  • char * always points to a NUL-terminated string, unsigned char * doesn't.

  • size_t should generally be used when pointing to a large memory area, especially for mmap(). Since size_t can be slower to access than unsigned int (or at least use memory), it's fine to use unsigned int when it's "very unlikely" that the size ever goes beyond 4 GB (e.g. string_t).

  • uoff_t is used for file offsets/sizes. This is usually 64bit, even with 32bit machines.

  • uint32_t vs. unsigned int: Use uint32_t when the type really should be 32bit, but don't spend too much energy trying to avoid mixing it with unsigned int, since they are going to be the same types probably for the rest of Dovecot's life..

  • uint8_t vs. unsigned char: I doubt Dovecot will ever be compiled anywhere where these differ from one another, but for readability use uint8_t for binary data and unsigned char for text data.

Function parameters

  • Try to avoid using/allowing NULL pointers. For example for a public API instead of having foo(struct bar *bar) where bar can be NULL, you could have both foo(void) and foo_with_bar(struct bar *bar). Of course don't try too hard if it makes the API otherwise ugly.

    • Dovecot v2.2+ marks all such parameters with ATTR_NULL. These can be verified with a patched clang. Unfortunately the API isn't very nice, it would be better to mark each parameter separately with ATTR_NULL instead of having a numbered list. This will likely change in future.
    • Also unless you know that a parameter can be NULL, don't bother wastine code on checking if it is. Ideally those are noticed by the patched clang check, but even if not, it's not that bad to crash on NULL pointer dereference.
  • _r suffix in parameter is used for values that are returned (e.g. foo(const char **error_r)).

  • Use const for pointers pretty much whenever possible.
  • Some day in future I'd like compiler to give warnings if function parameters are modified by the function itself. There's probably a lot of code that does this currently, but try avoiding it in future.

Function return values

  • int type is commonly used for return values. Usually this is used to mean one of:

    • -1 = error, 0 = ok
    • -1 = error, 0 = unfinished (e.g. non-blocking call), 1 = finished
    • unfortunately there are also other uses. I've been wondering about using different types for these some day, such as "err" and "err3", but haven't figured out an easy enough to use design, especially one where compiler verifies that types aren't accidentally mixed.
  • bool shouldn't be used as return value for ok/error, use int 0/-1 instead. (Old code has some of these. Sometimes it's also not quite clear if something is an "error" or not, so this rule isn't perfect.)

  • Functions returning pointers shouldn't use NULL pointer to mean an error, only for things like "not found". For example instead of if ((ctx = init()) == NULL) use if (init(&ctx) < 0)

  • Usually when calling a function, either save/check the return value or explicitly say that you don't care about it by saying (void)func();

    • Some functions where the return value can nearly always be ignored it's annoying to add the (void) prefix. You can avoid these by adding ATTR_NOWARN_UNUSED_RESULT to such function's prototype. A patched clang can be used to find missing return value checks.

  • If a system call fails unexpectedly, always log an error about it, possibly even kill the process if it's vital for correct functionality (e.g. gettimeofday()). %m in Dovecot's printf()-style functions expands to strerror(errno).

Boolean expressions

Try to use boolean expressions the way they work in Java. C doesn't require this, I think it makes the code easier to understand and reduces bugs in some cases (e.g. if (!foo()) when thinking foo() returns bool/FALSE, but actually returns int/-1 on error). I'm planning on patching clang some day to give warnings when booleans aren't used properly.

  • bool x and unsigned int x:1 are the boolean types

  • !=, ==, <, >, etc. create a boolean

  • if, for, while, etc. require a boolean

So:

  • if (!ptr) -> if (ptr == NULL)

  • if (!num) -> if (num == 0)

  • if (flags & FLAG_FOO) -> if ((flags & FLAG_FOO) != 0)

Memory

Memory is always allocated through one of Dovecot's wrappers, e.g. i_malloc() or i_new(). All of Dovecot's memory allocations always succeed or kill the process. There's no point in writing a lot of code to check for memory allocation failures that happen just about never. The only reason some memory allocations fail in Dovecot currently is because a process VSZ limit is reached, which usually indicates either a memory leak or trying to access a mailbox that is too large. In either of these cases it's better to just completely restart the process than try to limp along without getting anything useful done anymore.

Memory allocations can be assumed to be zero-initialized. All of the memory allocation functions do it, except t_malloc() and t_buffer_get(), which you should almost never use directly anyway. The code currently also assumes that pointers in zero-initialized memory area are NULL, which isn't guaranteed by ANSI-C, but practically Dovecot isn't going to be run in systems where it's not true and you're not going to remember to NULL-initialize all of your pointers anyway without compiler/runtime failure.

When using a struct, always zero-initialize it with memset() instead of setting each field separately to 0. It's too easy to cause bugs by adding a new field to the struct and forgetting to initialize it.

Double-frees in C are bad. Better to crash with NULL pointer dereference than possibly allow an attacker to exploit heap corruption and run executable code. Most of the pointers are set to NULL when they are freed:

  • i_free_and_null() is a macro to free the memory and set the pointer to NULL.

  • i_free() also currently sets the pointer to NULL, but another thought was to set this pointing to some other invalid pointer. But arguably this should be defined to be exactly the same as i_free_and_null().

  • Most deinit() functions take a pointer-to-pointer parameter and set the original one to NULL. There's no need to explicitly set the same pointer to NULL afterwards.

Use dynamically growing strings/buffers wherever necessary instead of a static sized buffer, where on larger input the function fails or truncates the data. It's of course not good to allow users to infinitely grow memory usage, so there should be some limits added, but it shouldn't fail even if the limit is set to infinite.

Type safety

  • Try to avoid using void pointers.
  • Try to avoid casting types to other types, especially if the cast isn't necessary to avoid a compiler warning.

It's better if compiler can give a warning when something is accidentally used wrong.

Callback functions

Callback functions make the code more difficult to follow, especially when a callback calls another callback, or when using function pointers to jump to different callbacks depending on state. Of course with asynchronous C code it's pretty much impossible to avoid callbacks. Still, try to avoid them where possible to keep the code readable.

Often callback functions can be avoided by creating iterator functions instead. For example instead of parse(callback, context use ctx = parse_init(); while (parse_next(ctx)) { .. } parse_deinit(&ctx);

Dovecot has some helper macros to make callbacks' context parameters type-safe. In v2.2+ see CALLBACK_TYPECHECK() macro and for example io_add() for example usage.

None: Design/Code (last edited 2021-09-24 15:08:59 by TimoSirainen)