Import mail into a Django model

Author: pbx
Posted: April 12, 2007
Language: Python
Version: 0.96
Score: 4 (after 4 ratings)

A mildly crufty script to slurp mail from an mbox into a Django model. I use a variant of this script to pull the contents of my scammy-spam mbox into the database displayed at http://purportal.com/spam/

#!/usr/bin/env python
"""
This code presumes this model:

class Message(models.Model):
    subject = models.CharField(maxlength=250)
    date = models.DateField()
    body = models.TextField()
    raw = models.TextField()
"""

import os, mailbox, email, datetime, shutil, sys
from email.Utils import parsedate  # change to email.utils for Python 2.5
os.environ["DJANGO_SETTINGS_MODULE"] = "YOURPROJECT.settings"
# set sys.path as needed

from models import Message
from MySQLdb import OperationalError

MAILBOX = '/path/to/mbox'

mbox = file(MAILBOX, 'rb')
for message in mailbox.PortableUnixMailbox(mbox, email.message_from_file):
    try:
        date = datetime.datetime(*parsedate(message['date'])[:6])
    except TypeError:  # missing or badly-formed date: fall back to the current time
        date = datetime.datetime.now()
    try:
        msg = Message(
            subject=message['subject'],
            date=date,
            body=message.get_payload(decode=False),
            raw=message.as_string(),
            )
        print "Adding: %s..." % msg.subject[:40]
        msg.save()
    except OperationalError:
        print "Trouble saving message (%s...)" % msg.subject[:40]

print "Archive now contains %s messages" % Message.objects.count()
# Depending on your application, you might clear the mbox now: open(MAILBOX, "w").write("")
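For readers on Python 3, where `mailbox.PortableUnixMailbox` and the `file()` builtin no longer exist, the loop above can be sketched with `mailbox.mbox` instead. This is only a rough sketch, not the script I actually run: the helper name `iter_mbox_fields` is made up for illustration, and it yields plain dicts rather than saving `Message` objects so that it runs without a Django project.

```python
import datetime
import mailbox
from email.utils import parsedate


def iter_mbox_fields(path):
    """Yield one dict per message, with the same four fields the model uses.

    `iter_mbox_fields` is a hypothetical helper name, not part of any library.
    """
    box = mailbox.mbox(path, create=False)
    try:
        for message in box:
            # parsedate() returns None for a missing or unparseable header.
            parsed = parsedate(message['date'] or '')
            try:
                date = datetime.datetime(*parsed[:6])
            except (TypeError, ValueError):
                # Same fallback as the original script: use the current time.
                date = datetime.datetime.now()
            yield {
                'subject': str(message['subject'] or ''),
                'date': date,
                # For multipart mail this is a list of parts, as in the original.
                'body': message.get_payload(decode=False),
                'raw': message.as_string(),
            }
    finally:
        box.close()
```

Each dict could then be passed to the model, e.g. `Message(**fields).save()`, assuming the model shown in the docstring.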

Comments

dyadya-zed (on April 12, 2007):

Great idea! But it seems to break one of Python's main strengths: one language, many platforms (IMHO, of course :). This will be usable only on Unix-like OSes. An option for me is to connect via POP3 and get all the data from there.
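A rough sketch of that POP3 route, using the standard library's `poplib` (Python 3 syntax). The function names and the host, username, and password arguments are placeholders invented for illustration; nothing here comes from the original script, and it has not been tried against a real server.

```python
import email
import poplib


def open_pop3(host, username, password):
    """Log in to a POP3 server over SSL (all three arguments are placeholders)."""
    conn = poplib.POP3_SSL(host)
    conn.user(username)
    conn.pass_(password)
    return conn


def fetch_messages(conn):
    """Yield parsed email messages from an authenticated POP3 connection."""
    count, _size = conn.stat()  # (number of messages, mailbox size in bytes)
    for number in range(1, count + 1):  # POP3 message numbers start at 1
        _response, lines, _octets = conn.retr(number)
        # retr() returns the message as a list of byte strings, one per line.
        yield email.message_from_bytes(b"\r\n".join(lines))
```

The messages yielded by `fetch_messages` support the same `message['subject']`, `get_payload()` and `as_string()` calls the mbox loop relies on, so the saving code would not need to change.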

pbx (on April 13, 2007):

Yes, it's definitely Unix-specific. I'll leave writing a generalized version as an exercise for the reader, since it would likely quadruple in size (and not be any more useful to me personally!).

pbx (on April 13, 2007):

Oh, and it also has a gratuitous hardcoded reference to MySQLdb, simply because the DB was frequently barfing on bad dates (the mail I'm processing with this is spam). One could certainly find a way to catch that without resorting to a DB-specific reference.
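One possible way to do that, sketched here in Python 3 syntax, is to validate the date before it ever reaches the database, so no driver-specific exception needs catching. The helper name `safe_message_date` and the `min_year` cutoff are assumptions made up for this example; which dates a given database rejects depends on the backend.

```python
import datetime
from email.utils import parsedate


def safe_message_date(header_value, fallback=None, min_year=1970):
    """Return a datetime for a Date header, or `fallback` if it is unusable.

    `min_year` is an arbitrary illustrative cutoff, not a database rule.
    """
    if fallback is None:
        fallback = datetime.datetime.now()
    # parsedate() returns None when the header cannot be parsed.
    parsed = parsedate(header_value) if header_value else None
    if parsed is None:
        return fallback
    try:
        result = datetime.datetime(*parsed[:6])
    except (TypeError, ValueError, OverflowError):
        # Out-of-range fields, such as day 32, end up here.
        return fallback
    if result.year < min_year:
        return fallback
    return result
```

In the script, `date = safe_message_date(message['date'])` would replace the first `try`/`except` block, and the `MySQLdb` import could probably go if bad dates were the only cause of the errors.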
