- May 8, 2010
- compress text field model
I found myself doing a data migration for a client and discovered that their old system (CSV) had four text fields of 2 to 4 MB each. After about two minutes of hammering the MySQL server as I parsed the data and tried to insert it, the server would drop all connections. In my testing I found that if I didn't send those four fields, the MySQL server was happy to let me migrate all of my data (all 240 GB of it). So I started thinking, "I should just store these fields compressed anyway. It adds a little overhead to render the data, but that's fine by me."
Thus was born CompressedTextField. It bz2-compresses the contents, then base64-encodes the result so that it plays nicely with the server's storage of text fields. Once I started using it, my data migration ran along happily, though a bit slower than it did without the field data.
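from django.db import models


class CompressedTextField(models.TextField):
    """
    Model field that stores text in a compressed format (bz2, then base64).
    """
    # Python 2 / older Django: SubfieldBase makes Django call to_python()
    # whenever a value is assigned to the field.
    __metaclass__ = models.SubfieldBase

    def to_python(self, value):
        if not value:
            return value

        try:
            # Reverse the storage steps: base64 -> bz2 -> unicode text.
            return value.decode('base64').decode('bz2').decode('utf-8')
        except Exception:
            # Not compressed (or already decoded): return it unchanged.
            return value

    def get_prep_value(self, value):
        if not value:
            return value

        try:
            # If the value already decodes as base64, assume it is compressed.
            # Python 2's base64 codec skips non-alphabet characters, so plain
            # text can occasionally pass this check and be stored as-is.
            value.decode('base64')
            return value
        except Exception:
            try:
                tmp = value.encode('utf-8').encode('bz2').encode('base64')
            except Exception:
                return value
            else:
                # Keep the original when compression does not make it smaller.
                if len(tmp) > len(value):
                    return value

                return tmp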
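The field above relies on Python 2 string codecs. As a rough illustration of the same storage idea outside Django, here is a minimal Python 3 sketch using only the standard library `bz2` and `base64` modules. The helper names `pack` and `unpack` are hypothetical and not part of the snippet. The sketch keeps the "only store the compressed form if it is smaller" guard and the "fall back to the raw value" behaviour, but it leaves out the already-encoded check and all of the Django wiring.

```python
import base64
import bz2


def pack(text):
    """Return bz2 + base64 text, or the original text if that is not shorter."""
    if not text:
        return text
    compressed = bz2.compress(text.encode("utf-8"))
    packed = base64.b64encode(compressed).decode("ascii")
    # Same guard as get_prep_value: keep the original when packing does not help.
    return text if len(packed) > len(text) else packed


def unpack(stored):
    """Reverse pack(); return the stored value unchanged if it is not packed data."""
    if not stored:
        return stored
    try:
        raw = base64.b64decode(stored, validate=True)
        return bz2.decompress(raw).decode("utf-8")
    except (ValueError, OSError):
        # Not valid base64, not a bz2 stream, or not UTF-8: treat as plain text.
        return stored
```

Long, repetitive text comes back from `pack` much shorter than it went in, while a very short string is returned untouched because the bz2 header and base64 overhead would make it longer. Either way, `unpack(pack(text))` should give back the original text.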