Feb 20, 2018

Migrate mbox to Maildir and disassemble your Email setup

Note: On 6th of August 2018 Thunderbird 60 brought along experimental bidirectional conversion between mbox and Maildir.

When I lost some work and started to use rsnapshot for simple incremental backups, the disadvantage of Thunderbird's mbox format showed. All messages of a mail folder are combined in one file, so a small change to its content causes a complete re-transfer, which claims the full file size at the backup destination again. Maildir instead uses one file per message, ideal for this kind of backup. It has other disadvantages, though none relevant to small mail server setups. From what I read, the appendable mbox format was less of a concern in the past, when backups went to tape. Block-based disk snapshots or more capable mechanisms (rdiff, bup, borg) are not bothered by small changes to big files either. But retrieval with rsnapshot is very approachable: just navigate the parallel directory tree. Since version 38, Thunderbird supports a Maildir-like format for newly created accounts, so a migration seemed worth the effort! So I read up on ways to convert a lot of mbox history¹.
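To make the backup argument concrete, here is a minimal sketch using Python's standard mailbox module. It fills an mbox and a Maildir with the same messages, adds one more, and reports how many files a file-level backup would have to copy again and how many bytes that is. The sample messages, helper names and the figure of 50 existing messages are assumptions for illustration; real numbers depend on your mail.

```python
import hashlib
import mailbox
import tempfile
from pathlib import Path


def make_message(n: int) -> str:
    """Build a tiny message as text (hypothetical sample data)."""
    return (
        "From: sender@example.org\n"
        "To: me@example.org\n"
        f"Subject: message {n}\n"
        "\n"
        f"body of message {n}\n"
    )


def snapshot(root: Path) -> dict:
    """Map each file below root to (size, sha256) of its content."""
    result = {}
    for path in root.rglob("*"):
        if path.is_file():
            data = path.read_bytes()
            digest = hashlib.sha256(data).hexdigest()
            result[str(path.relative_to(root))] = (len(data), digest)
    return result


def bytes_to_retransfer(before: dict, after: dict) -> tuple:
    """Return (number of new or changed files, their total size in bytes)."""
    changed = [name for name, state in after.items() if before.get(name) != state]
    return len(changed), sum(after[name][0] for name in changed)


def compare_formats(existing: int = 50) -> dict:
    """Add one message to an mbox and a Maildir that hold `existing` messages."""
    with tempfile.TemporaryDirectory() as tmp:
        mbox_dir = Path(tmp) / "mbox"
        maildir_dir = Path(tmp) / "maildir"
        mbox_dir.mkdir()
        box = mailbox.mbox(str(mbox_dir / "Inbox"))
        maildir = mailbox.Maildir(str(maildir_dir))
        for n in range(existing):
            box.add(make_message(n))
            maildir.add(make_message(n))
        box.flush()
        before_mbox = snapshot(mbox_dir)
        before_maildir = snapshot(maildir_dir)

        # The "small change": a single new message arrives.
        box.add(make_message(existing))
        box.flush()
        maildir.add(make_message(existing))

        result = {
            "mbox": bytes_to_retransfer(before_mbox, snapshot(mbox_dir)),
            "maildir": bytes_to_retransfer(before_maildir, snapshot(maildir_dir)),
        }
        box.close()
        return result


if __name__ == "__main__":
    print(compare_formats())
```

In both cases exactly one file changes, but for mbox that file is the whole folder, while for Maildir it is only the new message.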


I tried the options listed in a superuser.com question. As the mbox files had been written by different clients and operating systems, the scripts probably met some untested behaviour. mb2md.pl (also in the Debian repositories), suggested on blog.philippklaus.de, didn't work for me. mb2md.sh uses procmail's formail to pipe the individual messages to the addtomaildir.py script. It seemed to choke on ^M Windows-style line endings, and with my input it was better to pass formail the -Y option (traditional Berkeley format). Still, when I used mboxcheck to calculate reference message counts, the output of the splitting scripts didn't add up. I didn't look into it any further and moved on to the Dovecot wiki.
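For an independent reference count, a small script can count the "From " separator lines itself. The sketch below is not how mboxcheck or formail work internally. It assumes the common convention that a message starts with a line beginning with "From " at the start of the file or after a blank line, and it tolerates Windows-style line endings. The function names and the sample data are made up for illustration.

```python
import io

# Hypothetical sample: three messages, two of them with CRLF line endings.
SAMPLE = (
    b"From a@example.org Mon Jan  1 00:00:00 2018\r\n"
    b"Subject: one\r\n"
    b"\r\n"
    b"hello\r\n"
    b"From the body, not a separator\r\n"
    b"\r\n"
    b"From b@example.org Mon Jan  1 00:00:01 2018\r\n"
    b"Subject: two\r\n"
    b"\r\n"
    b">From escaped by the writing client\r\n"
    b"\r\n"
    b"From c@example.org Mon Jan  1 00:00:02 2018\n"
    b"Subject: three\n"
    b"\n"
    b"bye\n"
)


def count_messages(handle) -> int:
    """Count mbox separators in a binary file object."""
    count = 0
    previous_blank = True  # the start of the file counts as a boundary
    for raw in handle:
        line = raw.rstrip(b"\r\n")  # accept both LF and CRLF endings
        if previous_blank and line.startswith(b"From "):
            count += 1
        previous_blank = line == b""
    return count


def count_mbox_messages(path: str) -> int:
    """Count messages in the mbox file at `path`."""
    with open(path, "rb") as handle:
        return count_messages(handle)


if __name__ == "__main__":
    print(count_messages(io.BytesIO(SAMPLE)))
```

An unescaped "From " line directly after a blank line in a message body would still be miscounted, so treat the result as a cross-check rather than ground truth.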

There's also an approach with ImportExportTools and the advice on blog.adambros.org to feed the individual .eml files one by one to getmail_maildir, but I didn't test this either.

Carefree and fast: Dovecot's dsync

Dovecot's dsync was the most robust, easiest and quickest option, and as the wiki suggests, you can skip the other methods if this works for you. Set mail_location to maildir:~/Maildir and separator to / in /etc/dovecot/conf.d/10-mail.conf, then run dsync -Dv mirror mbox:~/mail:INBOX=Inbox. After some renaming, moving and cleanup, you can go ahead and use the converted folder directly with Thunderbird.
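After the conversion it is worth checking that nothing got lost. The following sketch counts messages per folder in the resulting Maildir with Python's mailbox module, so the numbers can be compared with counts taken from the old mbox files. It assumes the Maildir++ layout, where subfolders are directories whose names start with a dot. The helper names and the throwaway demo Maildir are illustrative only; for real use, point maildir_counts at ~/Maildir.

```python
import mailbox
import os
import tempfile


def maildir_counts(root: str) -> dict:
    """Count messages in the top-level Maildir and in each Maildir++ subfolder."""
    top = mailbox.Maildir(os.path.expanduser(root), create=False)
    counts = {"INBOX": len(top)}
    for name in top.list_folders():
        counts[name] = len(top.get_folder(name))
    return counts


def build_demo_maildir(root: str) -> None:
    """Create a throwaway Maildir with two inbox messages and one in Sent."""
    top = mailbox.Maildir(root, create=True)
    top.add("Subject: first\n\nhello\n")
    top.add("Subject: second\n\nhello again\n")
    top.add_folder("Sent").add("Subject: reply\n\nthanks\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "Maildir")
        build_demo_maildir(demo)
        print(maildir_counts(demo))
```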

Beyond Maildir

As I want to put the mailbox on a remote host soon, I started to serve ~/Maildir locally via Dovecot/IMAP. This needed some additional tooling for retrieval and sending, and it made the plumbing character of mail very visible. It's also somewhat a case of xkcd #349 until you have recovered all the functionality the local GUI client provided. I used mozfilter2sieve to convert msgFilterRules.dat to Sieve. I gave getmail and fetchmail a try for retrieval and handing over to dovecot-lda, where Sieve does its filtering and a trained bogofilter stamps a spam score on each message. getmail supports arbitrary password commands, so there's no need to store the password in a plaintext config file. fetchmail does not, but it supports multiple accounts and has a handy preconnect option to create an SSH tunnel before fetching mail, useful if the remote only offers POP/IMAP on a locally bound interface. msmtp can do command-line sending if the client does not handle it by itself. A user cron script checks for new mail and can send desktop notifications. offlineimap keeps the Drafts folder synced via IMAP, even when I fetch via POP and do nothing else with IMAP. It also accounts for folder names on the remote that differ from the local ones.

The modular setup allowed for some email client hopping. Making the rounds produced a survey of clients able to make use of a fast server-side search, for what it's worth.

Why?

"Modern Mail" is involved (see everything about this), at the same time archaic and hardly a secure message mechanism. New communication tools and habits on handheld devices are replacing the person-to-person Email, at least within my family. At the same time, the long technical history and many components bolted on top of each other recruit my curiosity and DIY-spirit in self-hosting. New ideas like delta.chat and Autocrypt still emerge. If you're not employed to make Email work in all critical circumstances, it can be interesting by itself. With filters doing the sorting, it's an independent notification and message medium. There's useful mailing lists and relevant newsletters. And for better and worse a todo-list that writes itself. I used mail for a big part of my life and I'm curious for how long it's going to be around.

Addendum

As the conversion is covered in the excellent Dovecot wiki, these are some config files I use for the setup. Not all of them are necessary.

├── .doveconf
│   └── local.conf
├── .fetchmailrc
├── .getmail
│   └── getmailrc
├── .msmtprc
├── .mutt
│   ├── mailcap
│   └── muttrc
├── .offlineimap.py
├── .offlineimaprc
├── .sieve
│   └── webmail.sieve
└── bin
    └── checkmail.sh

.doveconf/local.conf (output of doveconf -n, concatenated /etc/dovecot/dovecot.conf -> conf.d/*)

listen = 127.0.0.1
mail_location = maildir:~/Maildir
mail_plugins = " fts fts_solr"
namespace inbox {
  inbox = yes
  location = 
  mailbox Drafts {
    special_use = \Drafts
  }
  mailbox Junk {
    special_use = \Junk
  }
  mailbox Sent {
    special_use = \Sent
  }
  mailbox "Sent Messages" {
    special_use = \Sent
  }
  mailbox Trash {
    special_use = \Trash
  }
  prefix = 
  separator = /
}
passdb {
  driver = pam
}
plugin {
  debug = true
  fts = solr
  fts_solr = url=http://127.0.0.1:8081/solr/ break-imap-search
  sieve = file:~/.sieve/webmail.sieve
}
postmaster_address = localuser@maschine
protocols = " imap"
ssl = no
userdb {
  driver = passwd
}
protocol lda {
  mail_plugins = " fts fts_solr sieve"
}

.fetchmailrc

# set logfile "/home/user/log/fetchmail.log"
defaults
  no rewrite
  mda "/usr/lib/dovecot/deliver -e"

# webmail
poll pop.webmail.net with protocol pop3
  user '123456789' there with password 'hunter2' is 'localuser' here options ssl sslcertck sslproto 'TLS1.2+' keep

# vserver
skip localhost with protocol pop3 and port 11110:
  preconnect  'ssh -f -o "ControlMaster no" remotehost -L 11110:127.0.0.1:110 sleep 1'
  password hunter2

.getmail/getmailrc

[retriever]
type = SimplePOP3SSLRetriever
server = pop.webmail.net
port = 995
# uses python2.7 ssl module, doesn't seem to work though
# ssl_version = TLSv1.2
# ssl_version = PROTOCOL_TLSv1.2
# ssl_cipher = ECDHE
username = webmailuser
password_command = ("/usr/bin/pass","mailpass.pw")

# without dovecot:
# [destination]
# type = Maildir
# path = ~/Maildir/

[destination]
type = MDA_external
# -e: make dovecot return an error if necessary
path = /usr/lib/dovecot/deliver
arguments = ("-e",)

[options]
verbose = 1
# get only new messages
read_all = false
delete = false
delete_after = 10
# message_log_syslog = true

.msmtprc

defaults
 # logfile /home/user/log/msmtp.log

 # make tls and submission port default
 port 587
 tls on
 tls_trust_file /etc/ssl/certs/ca-certificates.crt

account webmail
  host mail.webmail.net
  from me@webmail.net
  auth on
  user me@webmail.net
  passwordeval "pass mailpass.pw | head -n1 | grep -E '^(.*)$'"
  # see "secret-tool" in msmtp manpage for alternative methods

account example
  host example.de
  from me@example.de
  auth on
  user me
  passwordeval "pass example.pw | head -n1"

# default can be set after account definition
account default : webmail

.mutt/muttrc

# inactive IMAP config
# set folder="imaps://me@webmail.net@imap.webmail.net"
# set spoolfile="imaps://me@webmail.net@imap.webmail.net/INBOX"

# alternative_order text/plain text/enriched text/html
auto_view text/html

# maildir Config
set mbox_type=Maildir
set folder="~/Maildir"
set mbox="~/Maildir"
set mask="!^\\.[^.]"

# save file after send
set record="+.Sent"
set postponed="+.Drafts"
set spoolfile="~/Maildir"

# speed up mailbox listings
set header_cache=~/.cache/mutt

# mailbox Setup
mailboxes + +'..todo' +'.Mailbox.reading' +'.Mailbox.lists' +'.Sent' +'.Unsent Messages' +'.Junk' +'.Drafts' +'.Templates' +'.Trash' +'.Archives'

# reply with email received-to
set reverse_name=yes

# multiaccount
alias webmail  fullname   <me@webmail.net>
alias example  selfhost   <me@example.de>
macro compose v "<edit-from>^Uidentity\_<tab>" "Select from"

# Outgoing mail
set sendmail="/usr/bin/msmtp"
set use_from=yes
set from="me@webmail.net"
set envelope_from=yes

# use different smtp sending account depending on from
send-hook "~f '@example.de$'" set sendmail="/usr/bin/msmtp -a example"

# helpers in navigation
macro index c "<change-folder>?<toggle-mailboxes>" "open a different folder"
macro pager c "<change-folder>?<toggle-mailboxes>" "open a different folder"
macro index C "<copy-message>?<toggle-mailboxes>" "copy a message to a mailbox"
macro index M "<save-message>?<toggle-mailboxes>" "move a message to a mailbox"
macro compose A "<attach-message>?<toggle-mailboxes>" "attach message(s) to this message"

# modern mutt gnupg config
# https://www.henrytodd.org/notes/2014/simpler-gnupg-mutt-config-with-gpgme/
set crypt_use_gpgme=yes

set crypt_autosign=no
set crypt_verify_sig=yes

set crypt_replysign=yes
set crypt_replyencrypt=yes
set crypt_replysignencrypted=yes

# when this variable is set, sent emails will be stored unencrypted and unsigned
set fcc_clear=yes

# if your mutt is current enough, you have these options to control this behaviour
# set pgp_self_encrypt
# set pgp_default_key

.mutt/mailcap

text/html; w3m -I %{charset} -T text/html; copiousoutput;

.offlineimaprc

[general]
accounts = imapsync
maxsyncaccounts = 1
pythonfile = ~/.offlineimap.py

[Account imapsync]
remoterepository = dovecot
localrepository = webmail

[Repository webmail]
type = IMAP
remotehost = imap.webmail.net
remoteuser = me@webmail.net
remotepasseval = subprocess.check_output(["pass", "show", "mailpass.pw"]).strip()
ssl = yes
sslcacertfile = /etc/ssl/certs/ca-certificates.crt
maxconnections = 1
folderfilter = lambda folder: folder in ('Entw&APw-rfe',)
nametrans = lambda folder: re.sub('Entw&APw-rfe', 'Drafts', folder)

[Repository dovecot]
type = IMAP
remotehost = 127.0.0.1
remoteuser = localuser
remotepasseval = subprocess.check_output(["pass", "show", "localuser.pw"]).strip()
ssl = no
maxconnections = 1
folderfilter = lambda folder: folder in ('Drafts',)
nametrans = lambda folder: re.sub('Drafts', 'Entw&APw-rfe', folder)
# readonly = true
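The odd-looking folder name Entw&APw-rfe in the nametrans rules is the remote's German drafts folder, written in the modified UTF-7 encoding that IMAP uses for mailbox names (RFC 3501). A small decoder, sketched here on top of Python's regular utf-7 codec, shows what the name stands for. The function name is made up for this example and it is not part of offlineimap.

```python
import re


def decode_imap_utf7(name: str) -> str:
    """Decode an IMAP mailbox name from modified UTF-7 (RFC 3501, section 5.1.3)."""

    def _decode(match):
        chunk = match.group(1)
        if chunk == "":
            return "&"  # "&-" is the escape for a literal ampersand
        # Modified UTF-7 uses "&" instead of "+" and "," instead of "/".
        standard = "+" + chunk.replace(",", "/") + "-"
        return standard.encode("ascii").decode("utf-7")

    return re.sub(r"&([^-]*)-", _decode, name)


if __name__ == "__main__":
    print(decode_imap_utf7("Entw&APw-rfe"))
```

Plain ASCII names such as Drafts pass through unchanged, which is why only the remote side needs the encoded form.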

.offlineimap.py

#!/usr/bin/env python2
import subprocess

Set the spamicity format in /etc/bogofilter.cf (on the receiving Dovecot host) to %0.2f so that the local Sieve spamtest extension can filter by scores directly.

/etc/dovecot/conf.d/90-sieve.conf

sieve_extensions = +spamtest +spamtestplus
sieve_spamtest_status_type = score
sieve_spamtest_status_header = X-Bogosity: [A-Za-z]+, tests=bogofilter, spamicity=([0-9]{1}\.[0-9]+[^,]).*
sieve_spamtest_max_value = 1.00
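To see what the status header pattern above picks out of an X-Bogosity line, it can be tried in Python. This is only an approximation: Dovecot evaluates the expression with its own regex engine, and the linear scaling of the score against sieve_spamtest_max_value into a percentage is my assumption about how the :percent test behaves. The example header values and the function name are invented for illustration.

```python
import re

# Value part of sieve_spamtest_status_header, after "X-Bogosity: ".
STATUS_PATTERN = re.compile(
    r"[A-Za-z]+, tests=bogofilter, spamicity=([0-9]{1}\.[0-9]+[^,]).*"
)
MAX_VALUE = 1.00  # sieve_spamtest_max_value


def spam_percent(header_value: str):
    """Return the spam score as 0..100, or None if the header does not match."""
    match = STATUS_PATTERN.match(header_value)
    if match is None:
        return None
    score = float(match.group(1))
    # Assumed linear scaling of the score against the configured maximum.
    return round(min(score, MAX_VALUE) / MAX_VALUE * 100)


if __name__ == "__main__":
    print(spam_percent("Spam, tests=bogofilter, spamicity=0.99, version=1.2.4"))
```

Note that the capture group needs at least two characters after the decimal point, which the %0.2f spamicity format provides.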

.sieve/webmail.sieve

if header :contains "X-Bogosity" "Spam," {
  fileinto "Junk";
  stop;
}
elsif spamtest :percent :value "gt" :comparator "i;ascii-numeric" "1" {
  fileinto "Junk";
  stop;
}
elsif header :matches "From" "*mailings@webmailer.net*"
{
  fileinto "Trash";
}
elsif header :matches "From" "*invitations@linkedin.com*"
{
  fileinto "Junk";
}

Test your rules with sieve-test -t - -Tlevel=matching -x "+imapflags -enotify +spamtest +spamtestplus" -C ~/.sieve/webmail.sieve ~/Maildir/.Inbox/mailinquestion; the test mail must contain an X-Bogosity header line with a spamicity value.

Cutoff values for spam classification can be customized on the remote too, which might be better but comes at the cost of flexibility. I am still looking for better ways. Filtering only by text statistics seems interesting, but blocklists seem a sane necessity.

~/bin/checkmail.sh with desktop notification

#!/usr/bin/env bash
#
# get the display context
user=$(whoami)
env_reference_process=$( { pgrep -u "$user" xfce4-session || pgrep -u "$user" cinnamon-session || pgrep -u "$user" gnome-session || pgrep -u "$user" gnome-shell || pgrep -u "$user" kdeinit; } | head -1 )

# /proc/<pid>/environ is NUL-separated; turn it into lines before filtering
export DBUS_SESSION_BUS_ADDRESS=$(tr '\0' '\n' < /proc/"$env_reference_process"/environ | sed -n 's/^DBUS_SESSION_BUS_ADDRESS=//p')
export DISPLAY=$(tr '\0' '\n' < /proc/"$env_reference_process"/environ | sed -n 's/^DISPLAY=//p')

# the actual mail check, customize to your locale
checkmail=$(fetchmail --uidl -F)
if echo "$checkmail" | grep -Fq "wird gelesen"; then
    notify-send --icon=mail-message-new "new mail" "${checkmail}"
else
    :
fi

Alternatively, create, enable and start fetchmail as a systemd service in /etc/systemd/system/fetchmail.service

[Unit]
Description=Fetchmail
After=network.target

[Service]
User=<username>
ExecStart=/usr/bin/fetchmail -f /home/<username>/.fetchmailrc --idle --uidl --flush
RestartSec=3

[Install]
WantedBy=multi-user.target

  1. That history began in 2001 with Netscape Communicator 4.74 on SuSE 7.0.