Put this code in the `__init__.py` file of one of your INSTALLED_APPS. It replaces `django.template.loader.get_template` with a cached version. Django's standard `get_template` function reads and parses the template source every time it is called. This version does that only once per template (when DEBUG is set to False); after that it returns the Template object from the `template_cache` dictionary.

On Django's HTTP server, with template code like this:
    {% extends "index.html" %}
    {% block content %}
    {% if form.has_errors %}
    <p>Your username and password didn't match. Please try again.</p>
    {% endif %}
    <form method="post" action=".">
    <table>
    <tr><td><label for="id_username">Username:</label></td><td>{{ form.username }}</td></tr>
    <tr><td><label for="id_password">Password:</label></td><td>{{ form.password }}</td></tr>
    </table>
    <input type="submit" value="login" />
    <input type="hidden" name="next" value="{{ next }}" />
    </form>
    {% endblock %}
running `ab -n 100` on Mac OS X 10.5 (Core 2 Duo, 2 GHz, 2 GB of RAM) gives
    forge-macbook:~ forge$ ab -n 100 http://127.0.0.1:8000/login/
    This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
    Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Copyright 2006 The Apache Software Foundation, http://www.apache.org/
    Benchmarking 127.0.0.1 (be patient).....done
    Server Software:        WSGIServer/0.1
    Server Hostname:        127.0.0.1
    Server Port:            8000
    Document Path:          /login/
    Document Length:        934 bytes
    Concurrency Level:      1
    Time taken for tests:   0.432934 seconds
    Complete requests:      100
    Failed requests:        0
    Write errors:           0
    Total transferred:      120200 bytes
    HTML transferred:       93400 bytes
    Requests per second:    230.98 [#/sec] (mean)
    Time per request:       4.329 [ms] (mean)
    Time per request:       4.329 [ms] (mean, across all concurrent requests)
    Transfer rate:          270.25 [Kbytes/sec] received
    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        0    0   0.0      0       0
    Processing:     3    3   1.5      4      12
    Waiting:        3    3   1.2      3      12
    Total:          3    3   1.5      4      12
    Percentage of the requests served within a certain time (ms)
      50%      4
      66%      4
      75%      4
      80%      4
      90%      4
      95%      5
      98%     10
      99%     12
     100%     12 (longest request)
without template caching, and
    forge-macbook:~ forge$ ab -n 100 http://127.0.0.1:8000/login/
    This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
    Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Copyright 2006 The Apache Software Foundation, http://www.apache.org/
    Benchmarking 127.0.0.1 (be patient).....done
    Server Software:        WSGIServer/0.1
    Server Hostname:        127.0.0.1
    Server Port:            8000
    Document Path:          /login/
    Document Length:        934 bytes
    Concurrency Level:      1
    Time taken for tests:   0.369860 seconds
    Complete requests:      100
    Failed requests:        0
    Write errors:           0
    Total transferred:      120200 bytes
    HTML transferred:       93400 bytes
    Requests per second:    270.37 [#/sec] (mean)
    Time per request:       3.699 [ms] (mean)
    Time per request:       3.699 [ms] (mean, across all concurrent requests)
    Transfer rate:          316.34 [Kbytes/sec] received
    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        0    0   0.0      0       0
    Processing:     3    3   0.9      3       9
    Waiting:        2    3   0.9      3       8
    Total:          3    3   0.9      3       9
    Percentage of the requests served within a certain time (ms)
      50%      3
      66%      3
      75%      3
      80%      3
      90%      3
      95%      5
      98%      8
      99%      9
     100%      9 (longest request)
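with caching enabled.
In both cases DEBUG is set to False. With caching, throughput rises from 230.98 to 270.37 requests per second (about 17% more), and the mean time per request drops from 4.329 ms to 3.699 ms.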
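The replacement of `get_template` described at the top can be sketched roughly as follows. This is a minimal illustration of the idea, not the snippet's original code: the helper name `make_cached_get_template` is hypothetical, and the wrapper assumes the loader function takes only a template name. The Django wiring is left commented out so the block runs without Django installed.

```python
def make_cached_get_template(original_get_template, debug=False):
    """Wrap a template-loading function so each template is loaded only once.

    `original_get_template` is assumed to take a template name and return
    a parsed template object (as django.template.loader.get_template does).
    """
    template_cache = {}

    def cached_get_template(template_name):
        if debug:
            # With DEBUG on, always reload so template edits show up at once.
            return original_get_template(template_name)
        try:
            return template_cache[template_name]
        except KeyError:
            # First request for this template: load, parse, and remember it.
            template = original_get_template(template_name)
            template_cache[template_name] = template
            return template

    # Expose the dictionary so it can be inspected or cleared if needed.
    cached_get_template.template_cache = template_cache
    return cached_get_template


# Hypothetical wiring in an installed app's __init__.py (requires Django):
# from django.conf import settings
# from django.template import loader
# loader.get_template = make_cached_get_template(loader.get_template,
#                                                settings.DEBUG)
```

One caveat with this kind of patch: modules that ran `from django.template.loader import get_template` before the replacement keep a reference to the original function, so the patch should be applied as early as possible. Cached templates also stay in memory until the process restarts, so template edits are not picked up while DEBUG is False.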
                
                    
                    
Tags: template, cache, performance, optimization