- June 10, 2009
- image thumbnail s3 amazon sorl
Set MEDIA_URL (or whatever you use for uploaded content) to point to S3, e.g. MEDIA_URL = "http://s3.amazonaws.com/MyBucket/".
Put django-storage in project_root/libraries, or change the paths to make you happy.
This uses the functionality of django-storage, but not as DEFAULT_FILE_STORAGE.
The functionality works like so:
Getting stuff to S3
On file upload of a noted model, a copy of the uploaded file is saved to S3.
On any thumbnail generation, a copy is also saved to S3.
On a page load:
1. We check whether the thumbnail exists locally. If so, we assume it's been sent to S3 and move on.
2. If it's missing, we check whether S3 has a copy. If so, we download it and move on.
3. If S3 doesn't have the thumb either, we check whether the source image exists locally. If so, we make a new thumb (which uploads itself to S3) and move on.
4. If the source is also missing, we see if it's on S3, and if so, get it, thumb it, push the thumb back up, and move on.
If all of that fails, somebody deleted the image, or things have gone fubar.
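The lookup order above can be sketched as a small, self-contained function. This is a simplified model, not the patched Sorl code: `local` and `remote` are hypothetical dictionaries standing in for the local media directory and the S3 bucket, and `make_thumb` is a hypothetical stand-in for the real thumbnailing step.

```python
def resolve_thumb(thumb, source, local, remote, make_thumb):
    """Return a label describing how the thumbnail was obtained.

    local and remote are plain dicts mapping a path to file bytes. They
    stand in for the media directory and the S3 bucket (illustration only).
    """
    # Thumb exists locally: assume it already went to S3.
    if thumb in local:
        return "local"
    # Thumb is on S3: download it.
    if thumb in remote:
        local[thumb] = remote[thumb]
        return "pulled thumb"
    # No thumb anywhere: we need the source, fetching it from S3 if necessary.
    pulled_source = False
    if source not in local:
        if source not in remote:
            return "missing"
        local[source] = remote[source]
        pulled_source = True
    # Make the thumb and push it back up.
    local[thumb] = make_thumb(local[source])
    remote[thumb] = local[thumb]
    return "pulled source, regenerated" if pulled_source else "regenerated"


# Demo: the server's disk is empty, only the source is on "S3".
remote = {"shots/a.jpg": b"full-size image bytes"}
local = {}
shrink = lambda data: data[:4]  # hypothetical stand-in for thumbnailing

first = resolve_thumb("shots/a_thumb.jpg", "shots/a.jpg", local, remote, shrink)
second = resolve_thumb("shots/a_thumb.jpg", "shots/a.jpg", local, remote, shrink)
print(first)
print(second)
```

The first call walks the whole chain; the second one stops at the local check, which is why everything after the initial creation stays cheap.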
Thumbs are checked locally, so everything after the initial creation is very fast.
You can clear out local files to save disk space on the server (one assumes you needed S3 for a reason), and trust that only the thumbs should ever be downloaded.
If you want to be really clever, you can delete the original source files, and zero-byte the thumbs. This means very little space cost, and everything still works.
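A minimal sketch of the zero-byte trick, assuming nothing on the server reads the thumbnail's contents locally (the existence check is all that matters, since the content is served by S3). The helper name `zero_byte` and the temporary file are made up for the demonstration.

```python
import os
import tempfile


def zero_byte(path):
    """Truncate a file to zero bytes while keeping it in place."""
    with open(path, "wb"):
        pass  # opening in "wb" mode truncates the file


# Demonstration on a throwaway file standing in for a thumbnail.
fd, thumb_path = tempfile.mkstemp(suffix=".jpg")
os.write(fd, b"pretend this is JPEG data")
os.close(fd)

zero_byte(thumb_path)
still_there = os.path.isfile(thumb_path)   # the local existence check still passes
size_after = os.path.getsize(thumb_path)   # but the file no longer takes up space
os.remove(thumb_path)
```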
If you're not actually low on disk space, Sorl Thumbnail keeps working just like it did, except your content is served by S3.
My python-fu is not as strong as those who wrote Sorl Thumbnail. I did tweak their code. Something may be wonky. YMMV.
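The relative_source property is a hack: it finds the relative path by searching the source path for the first 7 characters of the thumbnail's relative path. If those 7 characters also appear earlier in the source path, step 4 above will fail.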
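To make the failure mode concrete, here is a standalone sketch of the same slicing logic. The paths and thumbnail names are invented for illustration.

```python
def relative_source(source, relative_dest):
    """Mimic the hack: cut source at the first match of relative_dest[:7]."""
    start_str = relative_dest[:7]
    return source[source.find(start_str):]


# Works: "screens" first appears where the relative path starts.
ok = relative_source(
    "/var/www/media/screenshots/home.png",
    "screenshots/home_png_100x100_q85.jpg",
)

# Fails: "screens" also appears earlier, in a parent directory name,
# so the slice starts too soon and no longer matches the S3 key.
bad = relative_source(
    "/srv/screenshare/media/screenshots/home.png",
    "screenshots/home_png_100x100_q85.jpg",
)

print(ok)
print(bad)
```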
Upload is slow, and the first thumbnailing is slow, because we wait for the transfers to S3 to complete. This isn't django-storage, so things do genuinely take longer.
# sorl/thumbnail/base.py:
def generate(self):
    """
    Generates the thumbnail if it doesn't exist or if the file date of the
    source file is newer than that of the thumbnail.
    """
    # Imported up front so the push at the end works on every branch.
    import events.s3 as s3_events

    # Ensure dest(ination) attribute is set
    if not self.dest:
        raise ThumbnailException("No destination filename set.")

    new_generated = False
    if not isinstance(self.dest, basestring):
        # We'll assume dest is a file-like instance if it exists but isn't
        # a string.
        self._do_generate()
        new_generated = True
    elif not isfile(self.dest) or (self.source_exists and
            getmtime(self.source) > getmtime(self.dest)):
        if s3_events.is_on_s3(self.relative_dest):
            # "thumb is on s3"
            s3_events.pull_from_s3(self.relative_dest)
            self._source_exists = True
        else:
            # "thumb not on s3"
            if not self.source_exists:
                # file's missing.
                if s3_events.is_on_s3(self.relative_source):
                    s3_events.pull_from_s3(self.relative_source)
                    self._source_exists = True
                else:
                    # "source is not on S3!"
                    self._source_exists = False
            if self.source_exists:
                # Ensure the directory exists
                directory = dirname(self.dest)
                if not isdir(directory):
                    os.makedirs(directory)
                self._do_generate()
                new_generated = True
    if new_generated:
        s3_events.push_to_s3(self.relative_dest)

def _get_relative_source(self):
    # Hack: find where the relative path starts inside the source path by
    # searching for the first 7 characters of relative_dest.
    try:
        start_str = self.relative_dest[:7]
        index = self.source.find(start_str)
        if index == -1:
            # No match: fall back to the full path rather than slicing at -1.
            return self.source
        return self.source[index:]
    except Exception:
        return self.source
relative_source = property(_get_relative_source)


# events/s3.py
from django.conf import settings
import libraries.backends.s3 as s3

def push_to_s3(file_path):
    s3_storage = s3.S3Storage()
    # Images are binary, so read the local file in binary mode.
    img_file = open("%s%s" % (settings.MEDIA_ROOT, file_path), 'rb')
    s3_img_file = s3_storage.open("%s" % (file_path), 'w')
    s3_img_file.write(img_file.read())
    img_file.close()
    s3_img_file.close()

def is_on_s3(file_path):
    s3_storage = s3.S3Storage()
    return s3_storage.exists(file_path)

def pull_from_s3(file_path):
    s3_storage = s3.S3Storage()
    # Write the local copy in binary mode as well.
    img_file = open("%s%s" % (settings.MEDIA_ROOT, file_path), 'wb')
    s3_img_file = s3_storage.open(file_path, 'r')
    img_file.write(s3_img_file.read())
    s3_img_file.close()
    img_file.close()


# models.py
from django.db import models

# SixLinksModel is this project's own base model class.
class Screenshot(SixLinksModel):
    shot = models.ImageField("Screenshot", upload_to="screenshots")

    def save(self):
        super(Screenshot, self).save()
        import events.s3 as s3_events
        # Pass the path relative to MEDIA_ROOT, not the field file object.
        s3_events.push_to_s3(self.shot.name)

    def __unicode__(self):
        return "%s" % (self.shot)


# assumes django-storage is sitting in libraries, e.g. libraries/backends/s3.py is a file
# settings.py
AWS_ACCESS_KEY_ID = "YOUR-KEY"
AWS_SECRET_ACCESS_KEY = "YOUR-SECRET-KEY"
AWS_STORAGE_BUCKET_NAME = "YOUR-BUCKET"
from S3 import CallingFormat
AWS_CALLING_FORMAT = CallingFormat.PATH