02.08.07

Posted in Quick Guides at 1:52 am by jasonb

My relationship with dspam, a Bayesian-classifier-based spam filter, has been unwavering since I first installed it in 2005. I previously used SpamAssassin. While SpamAssassin offers a wide variety of tests drawing on potentially dozens of sources, including rulesets, DNS blacklists, URI blacklists, checksum databases, and Bayesian classification, I found its false negative rate unacceptable and its resource overhead excessive.

Recently, the time came for a fresh installation of the latest stable release of dspam. My prior install used version 3.0, which has become quite old, though it is still effective. It was simply a brute-force effort involving procmail and some other nastiness. Instead of continuing that tradition, I have configured dspam to work with Exim4 and PostgreSQL on Ubuntu 6.06.

The configuration that follows allows Exim4 to pass incoming, non-local messages without a dspam header to dspam. The message is checked, then reinjected into Exim for local delivery if the message is deemed innocent. If the message is believed to be spam, the mail will be quarantined on the system. All tokens and user preferences are stored within a PostgreSQL database. The Web interface allows you to report false positives and negatives. Should that sound appealing, read on for details.

To install dspam, you will need the following entry uncommented in your sources.list.

deb http://us.archive.ubuntu.com/ubuntu/ dapper universe
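
Before installing, it can help to confirm that the entry really is uncommented. The following is a minimal sketch; the function name has_universe is made up for illustration, and it only looks for an active "deb" line for dapper that lists the universe component.

```shell
#!/bin/sh
# has_universe is a hypothetical helper, not part of APT.
# Succeeds if FILE contains an uncommented "deb" line for dapper
# that includes the universe component.
has_universe() {
    grep -Eq '^[[:space:]]*deb[[:space:]].*[[:space:]]dapper[[:space:]].*universe' "$1"
}

# Usage on a real system:
#   has_universe /etc/apt/sources.list && echo "universe is enabled"
```

A commented-out line (one starting with #) or a deb-src line does not count as a match.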

Then refresh the package lists and install the packages as usual.

# apt-get update
# apt-get install dspam \
  libdspam7-drv-pgsql dspam-webfrontend

For dspam to run at start up, you must edit /etc/default/dspam accordingly.

START=yes

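For dspam to speak to PostgreSQL, you will want to modify /etc/dspam/dspam.d/pgsql.conf accordingly. PgSQLPass is left empty because the connection relies on trust authentication in pg_hba.conf rather than a password.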

PgSQLServer     127.0.0.1
PgSQLUser       dspam
PgSQLPass
PgSQLDb         libdspam7drvpgsql

It’s necessary to create a database and populate it with the current schema.

# su - postgres -c 'createuser -R -D -S -e dspam'
# su - postgres -c 'createdb libdspam7drvpgsql -O dspam'
# su - postgres -c \
  'cat /usr/share/doc/libdspam7-drv-pgsql/pgsql_objects.sql | psql -U postgres libdspam7drvpgsql'

Finally, PostgreSQL must allow trust authentication for the dspam user to the dspam database on localhost. While having dspam use a password is supposedly possible, I was never successful in having it connect to localhost via TCP with a password required. Add the following to the /etc/postgresql/8.1/main/pg_hba.conf file and restart PostgreSQL.

# Most specific first.
# Must come before more general host rules.
host libdspam7drvpgsql dspam 127.0.0.1/32 trust
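
Because PostgreSQL uses the first pg_hba.conf line that matches a connection, the ordering matters. Here is a rough sketch of a check that reports which auth method would win for a given database and user. The function name first_hba_method is hypothetical, and the matching is deliberately simplified: it only understands "host" lines in CIDR form and compares the address literally to 127.0.0.1/32.

```shell
#!/bin/sh
# first_hba_method is a hypothetical helper with simplified matching:
# it prints the auth method of the first "host" line whose database is
# DB or "all", whose user is USER or "all", and whose address is
# exactly 127.0.0.1/32. Comment lines never match because their first
# field is not "host".
first_hba_method() {
    db=$1 user=$2 file=$3
    awk -v db="$db" -v user="$user" '
        $1 == "host" &&
        ($2 == db || $2 == "all") &&
        ($3 == user || $3 == "all") &&
        $4 == "127.0.0.1/32" { print $5; exit }
    ' "$file"
}

# Usage on a real system:
#   first_hba_method libdspam7drvpgsql dspam /etc/postgresql/8.1/main/pg_hba.conf
```

If this prints anything other than trust, a more general rule is probably sitting above the dspam line.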

Configuring dspam itself is necessary and no easy task. Given the wide range of setups dspam supports, there are many, many options. The configuration that follows runs dspam as a daemon reached over a UNIX socket. Many options are described in some detail in the dspam.conf file, while the purpose of others is entirely mystifying.

StorageDriver /usr/lib/dspam/libpgsql_drv.so
TrustedDeliveryAgent "/usr/sbin/exim4 -oi -oMr despammed"
OnFail error
 
Trust root
Trust dspam
Trust mail
Trust mailnull
Trust smmsp
Trust daemon
Trust Debian-exim
 
TrainingMode teft
TestConditionalTraining on
Feature chained
Feature whitelist
Algorithm graham burton
PValue graham
 
Preference "spamAction=quarantine"
Preference "signatureLocation=headers"
Preference "showFactors=on"
 
AllowOverride trainingMode
AllowOverride spamAction spamSubject
AllowOverride statisticalSedation
AllowOverride enableBNR
AllowOverride enableWhitelist
AllowOverride signatureLocation
AllowOverride showFactors
AllowOverride optIn optOut
AllowOverride whitelistThreshold
 
HashRecMax              98317
HashAutoExtend          on
HashMaxExtents          0
HashExtentSize          49157
HashMaxSeek             100
HashConnectionCache     10
Notifications   off
 
PurgeSignatures 14          # Stale signatures
PurgeNeutral    90          # Tokens with neutralish probabilities
PurgeUnused     90          # Unused tokens
PurgeHapaxes    30          # Tokens with less than 5 hits (hapaxes)
PurgeHits1S     15          # Tokens with only 1 spam hit
PurgeHits1I     15          # Tokens with only 1 innocent hit
 
LocalMX 127.0.0.1
 
SystemLog on
UserLog   on
 
Opt in
 
ServerMode dspam
ServerPass.heh    "huh"
ServerDomainSocketPath  "/var/spool/dspam/dspam.sock"
ClientHost      /var/spool/dspam/dspam.sock
ClientIdent     "huh@heh"
 
ProcessorBias on
 
Include /etc/dspam/dspam.d/
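
One pairing in this file is easy to get wrong: the ServerPass and ClientIdent lines. Judging from the values above, the client identifies itself as password@name, where the server has a matching ServerPass.name "password" entry. The sketch below checks that pairing under that assumption; the function name ident_matches_server_pass is made up, and the parsing assumes one directive per line with whitespace-separated, double-quoted values.

```shell
#!/bin/sh
# ident_matches_server_pass is a hypothetical helper. It assumes the
# convention ClientIdent "password@name" <-> ServerPass.name "password".
# Succeeds only if the ClientIdent in FILE has a matching ServerPass.
ident_matches_server_pass() {
    awk '
        /^ServerPass\./ {
            name = substr($1, 12)      # text after "ServerPass."
            gsub(/"/, "", $2)          # strip the surrounding quotes
            pass[name] = $2
        }
        /^ClientIdent/ {
            gsub(/"/, "", $2)
            ident = $2
        }
        END {
            n = split(ident, parts, "@")
            if (n == 2 && parts[1] != "" && pass[parts[2]] == parts[1]) exit 0
            exit 1
        }
    ' "$1"
}

# Usage on a real system:
#   ident_matches_server_pass /etc/dspam/dspam.conf || echo "ident mismatch"
```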

In the configuration above, dspam usage is entirely opt-in. With a database driver, you must set all preferences, such as the optIn flag, with the dspam_admin tool, which stores them in the database. The per-user preference files on disk are completely ignored under the database drivers.

# dspam_admin list preferences jasonb
# dspam_admin add preference jasonb optIn on
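
With more than a handful of users, typing that command for each one gets tedious. The sketch below only prints the commands so they can be reviewed first. The function name opt_in_commands and the second user name alice are made up for illustration.

```shell
#!/bin/sh
# opt_in_commands is a hypothetical helper. It prints, but does not
# run, one "dspam_admin add preference ... optIn on" line per user.
opt_in_commands() {
    for u in "$@"; do
        printf 'dspam_admin add preference %s optIn on\n' "$u"
    done
}

# Review the output, then pipe it to a root shell to apply it:
#   opt_in_commands jasonb alice
#   opt_in_commands jasonb alice | sh
```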

Also, add the dspam user to the Debian-exim group.

# usermod -G Debian-exim -a dspam

The Exim4 portion of my configuration is based almost entirely upon one by Simon McVittie.

First, in /etc/exim4/conf.d/router/550_exim4-local-dspam

dspam_router:
  no_verify
  check_local_user
  condition = "${if and { \
    {!def:h_X-My-Dspam:} \
    {!eq {$received_protocol}{local}} \
    {!eq {$received_protocol}{despammed}} \
    { <= {$message_size}{3M}} \
    }\
    {1}{0}}"
  headers_add = "X-My-Dspam: scanned by $primary_hostname, $tod_full"
  driver = accept
  transport = dspam_transport
 
dspam_error_spam_router:
  driver = accept
  domains = example.com
  local_part_suffix = -spam
  transport = dspam_error_spam_transport
 
dspam_error_ham_router:
  driver = accept
  domains = example.com
  local_part_suffix = -fp
  transport = dspam_error_ham_transport
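
The condition on dspam_router is dense, so here is the same logic restated as a small shell function, purely as an illustration. The function name should_scan and its arguments are made up; Exim itself evaluates the expansion, not this script. Exim reads the M suffix as 1024*1024, so 3M is 3145728 bytes.

```shell
#!/bin/sh
# should_scan is a hypothetical restatement of the router condition.
# Arguments: HAS_HEADER (yes/no, whether X-My-Dspam is already present),
#            PROTOCOL   (the value of $received_protocol),
#            SIZE       (message size in bytes).
# Succeeds when the message would be handed to dspam_transport.
should_scan() {
    has_header=$1 protocol=$2 size=$3
    [ "$has_header" = no ] &&
    [ "$protocol" != local ] &&
    [ "$protocol" != despammed ] &&
    [ "$size" -le $((3 * 1024 * 1024)) ]
}

# Examples:
#   should_scan no esmtp 20480        # succeeds: fresh inbound mail
#   should_scan no despammed 20480    # fails: already reinjected by dspam
```

The despammed check is what stops a loop: dspam reinjects clean mail with -oMr despammed, so the router skips it the second time around.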

Next, in /etc/exim4/conf.d/transport/40_exim4-config_local_dspam, the following will actually summon dspam. Notice that only innocent mail is delivered, allowing spam to be quarantined. If you deliver both innocent and spam, the latter will not be quarantined even if quarantining is enabled in dspam.conf.

dspam_transport:
  driver = pipe
  command = "/usr/bin/dspam --client --deliver=innocent --user ${lc:$local_part} -f '$sender_address' -oi -oMr despammed -- %u"
  user = dspam
  group = dspam
  log_output = true
  return_fail_output = true
  return_path_add = false
  message_prefix =
  message_suffix =
 
dspam_error_spam_transport:
  driver = pipe
  command = "/usr/bin/dspam --client --source=error --class=spam --user ${lc:$local_part} -f '$sender_address' -oi -oMr despammed -- %u"
  user = dspam
  group = dspam
  log_output = true
  return_fail_output = true
  return_path_add = false
  message_prefix =
  message_suffix =
 
dspam_error_ham_transport:
  driver = pipe
  command = "/usr/bin/dspam --client --source=error --class=innocent --user ${lc:$local_part} -f '$sender_address' -oi -oMr despammed -- %u"
  user = dspam
  group = dspam
  log_output = true
  return_fail_output = true
  return_path_add = false
  message_prefix =
  message_suffix =
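
As I read the routers and transports above, mail sent to a user's address with -spam appended is fed to dspam as missed spam, and mail sent with -fp appended is fed back as a false positive. The sketch below just builds those addresses; the function name retrain_address is made up, and example.com stands in for your own domain as it does in the router configuration.

```shell
#!/bin/sh
# retrain_address is a hypothetical helper. It prints the address to
# forward a misclassified message to, given USER, CLASS and DOMAIN.
# CLASS is "spam" (missed spam) or "innocent" (false positive).
retrain_address() {
    user=$1 class=$2 domain=$3
    case $class in
        spam)     suffix=-spam ;;
        innocent) suffix=-fp ;;
        *)        echo "unknown class: $class" >&2; return 1 ;;
    esac
    printf '%s%s@%s\n' "$user" "$suffix" "$domain"
}

# Examples:
#   retrain_address jasonb spam example.com
#   retrain_address jasonb innocent example.com
```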

To access the dspam Web interface, you must ensure that the dspam CGI executes as the dspam group so it can read and write files in /var/spool/dspam. For Apache 1.3 on Ubuntu 6.06, you have to rename the disabled suexec wrapper to enable it.

# mv /usr/lib/apache/suexec.disabled /usr/lib/apache/suexec
# invoke-rc.d apache restart

dspam will use the valid-user from AuthType Basic against whatever authentication framework you configure. For a small number of users I am simply using AuthUserFile.

AuthType Basic
AuthUserFile /etc/apache/authz
AllowOverride None
AuthName "DSPAM Control Center"
Require valid-user

If dspam does segfault, which sadly has happened, your mail will bounce.

Client exited with error -5
R=dspam_router T=dspam_transport: Child process of dspam_transport transport returned 251
  (could mean shell command ended by signal 123 (Unknown signal 123))
  from command: /usr/bin/dspam
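
To see how often this has happened, you can count the matching lines in a log file. This is only a sketch: the function name count_dspam_failures is made up, and the log path in the usage comment assumes the usual Debian and Ubuntu location for the Exim main log.

```shell
#!/bin/sh
# count_dspam_failures is a hypothetical helper. It prints the number
# of lines in FILE where dspam_transport reported a failed child
# process. grep -c exits non-zero on zero matches, hence "|| true".
count_dspam_failures() {
    grep -c 'T=dspam_transport.*returned' "$1" || true
}

# Usage on a real system (path is an assumption):
#   count_dspam_failures /var/log/exim4/mainlog
```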

Therefore, though it is beyond the scope of this discussion, if you use monit to monitor your services, have it watch dspam as well.

check process dspamd with pidfile /var/run/dspam.pid
  group dspam
  start program = "/etc/init.d/dspam start"
  stop program = "/etc/init.d/dspam stop"
  if failed unixsocket /var/spool/dspam/dspam.sock then restart
  if 3 restarts within 5 cycles then timeout
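
If monit is not available, the same basic idea can be checked by hand. The sketch below only tests that the socket file exists and is a socket, which is weaker than monit's check since it does not try to connect. The function name dspam_socket_ok is made up.

```shell
#!/bin/sh
# dspam_socket_ok is a hypothetical helper. It succeeds if PATH
# (default: the ServerDomainSocketPath from dspam.conf above)
# exists and is a UNIX socket. It does not attempt a connection.
dspam_socket_ok() {
    [ -S "${1:-/var/spool/dspam/dspam.sock}" ]
}

# Usage on a real system:
#   dspam_socket_ok || /etc/init.d/dspam start
```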
