Login

urlize HTML

Author:
maguspk
Posted:
June 18, 2010
Language:
Python
Version:
1.2
Score:
2 (after 2 ratings)

The default Django urlize filter does not work with html nicely so here I've used an HTML parser BeautifulSoup to quickly search through each text node and run the django urlize filter on it.

Optimizations could be made to include a regex in the soup.findAll() method's text argument to only search those text nodes which matched a url pattern. You could also modify the method to convert the text to urls, such as using your own custom url filter.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
@register.filter("urlize_html")
def urlize_html(html):
    """
    Returns urls found in an (X)HTML text node element as urls via Django urlize filter.
    """
    try:
        from BeautifulSoup import BeautifulSoup
        from django.utils.html import urlize
    except ImportError:
        if settings.DEBUG:
            raise template.TemplateSyntaxError, "Error in urlize_html The Python BeautifulSoup libraries aren't installed."
        return html
    else:
        soup = BeautifulSoup(html)
           
        textNodes = soup.findAll(text=True)
        for textNode in textNodes:
            urlizedText = urlize(textNode)
            textNode.replaceWith(urlizedText)
            
        return str(soup)

More like this

  1. Template tag - list punctuation for a list of items by shapiromatron 3 months, 1 week ago
  2. JSONRequestMiddleware adds a .json() method to your HttpRequests by cdcarter 3 months, 2 weeks ago
  3. Serializer factory with Django Rest Framework by julio 10 months, 2 weeks ago
  4. Image compression before saving the new model / work with JPG, PNG by Schleidens 11 months ago
  5. Help text hyperlinks by sa2812 12 months ago

Comments

Samus_ (on June 19, 2010):

congratulations!

#

Please login first before commenting.