I like using unusual text characters to decorate my site. For example, my home page uses lots of mid-dots (&#183; ·) and chevrons (&#187; »), as well as other special characters. To keep the HTML source from being cluttered with those inscrutable numeric entities, I wrote this Django tag:
from django import template

register = template.Library()

# Maps a mnemonic to the character it stands for.
special_ch = {
    '': '',         # empty piece left by a leading or trailing space
    '>>': '»',      # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
    '<<': '«',      # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
    '(c)': '©',     # COPYRIGHT SIGN
    'S': '§',       # SECTION SIGN
    '*': '•',       # BULLET
    '.': '·',       # MIDDLE DOT
    '-': '–',       # EN DASH
    '--': '—',      # EM DASH
    ':>': '▶',      # BLACK RIGHT-POINTING TRIANGLE
    'o': '◦',       # WHITE BULLET
    '[]': '▫',      # WHITE SMALL SQUARE
    '<>': '◇',      # WHITE DIAMOND
}

@register.simple_tag
def ch(value):
    # Join with U+00A0 so spaces in the argument become non-breaking spaces.
    return '\u00a0'.join([special_ch[s] for s in value.split(' ')])
Now I can use the ch tag with a mnemonic representation of the character in question. Spaces become non-breaking spaces to help control the layout around these characters:
<p>{% ch ">> " %}more text..</p>
<p>{% ch "(c) " %}2002{% ch "-" %}2009</p>
becomes
» more text..
© 2002–2009
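The lookup is easy to try outside of Django. Here is a minimal sketch, assuming only a small subset of the table; the names `SPECIAL_CH` and `NBSP` are mine for illustration, and the `@register.simple_tag` decorator is left off so the function runs as plain Python:

```python
# Illustrative subset of the mnemonic table, including the empty-string
# entry that absorbs the empty piece left by a trailing space.
SPECIAL_CH = {
    '': '',
    '>>': '\u00bb',   # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
    '(c)': '\u00a9',  # COPYRIGHT SIGN
    '-': '\u2013',    # EN DASH
}

NBSP = '\u00a0'  # NO-BREAK SPACE

def ch(value):
    """Translate space-separated mnemonics, joining them with NBSP."""
    return NBSP.join(SPECIAL_CH[s] for s in value.split(' '))

# ">> " splits into ['>>', ''], so the result is the chevron plus one NBSP.
print(repr(ch('>> ')))
print(repr(ch('(c) ')))
print(repr(ch('-')))
```

An unknown mnemonic raises `KeyError` in this sketch, which seems like a reasonable way to catch typos in a template early.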
The tag reference takes more space than the entities, but I can tell how they will display, without having to memorize the Unicode code points.
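When building a table like this, the code points don't have to be memorized or looked up by hand. As a rough sketch, Python's standard `unicodedata` module can translate between characters and their official names; the helper name `describe` is just something I made up for the example:

```python
import unicodedata

# Look up a character by its official Unicode name...
em_dash = unicodedata.lookup('EM DASH')

# ...or go the other way and report the code point and name of a character.
def describe(char):
    return 'U+%04X %s' % (ord(char), unicodedata.name(char))

for char in '\u00bb\u00b7' + em_dash:
    print(describe(char))
```

This is also a handy way to check that the comments in the table match the characters they sit next to.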
Comments
Why not use a Unicode encoding for the templates and place those characters directly, without HTML ampersand entities?
Because &copy; isn't mnemonic enough for ©? Or &raquo; for »?
http://www.bryanlprice.com/specials.html is my list of ornamentals.
http://www.chami.com/tips/internet/050798I.html seems to be a comprehensive list of named entities.
I thought we had a conversation about this a few years ago, but that was Keith Devens, not you. :-p
@vvd: I'm not accustomed to entering non-ASCII characters directly into source files. That's probably the best way to go in the long run..
@Bryan: I've got a bit of Stockholm syndrome from working with XML files that makes me think I have to use numeric entities. Named entities are a good option for HTML files (and templates) though.
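For anyone weighing named against numeric entities, here is a small sketch using Python's standard `html.entities` table. The helper `entity_for` is a hypothetical name of mine, not part of the tag above. It prefers a named entity where the HTML 4 table has one and falls back to a numeric entity otherwise:

```python
import html
from html.entities import codepoint2name

def entity_for(char):
    """Return a named entity if one exists, else a decimal numeric entity."""
    name = codepoint2name.get(ord(char))
    if name is not None:
        return '&%s;' % name
    return '&#%d;' % ord(char)

# The chevron and copyright sign have names; the black triangle does not
# appear in this table, so it falls back to a numeric entity.
for char in '\u00bb\u00a9\u25b6':
    entity = entity_for(char)
    print(entity, html.unescape(entity) == char)
```

Several of the ornaments in my table fall into the second group, which is part of why the numeric form kept creeping into my files.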
A big moment for me was when I realized that I could remap CapsLock, which I had not used for at least two decades, to the "Compose" key used to introduce multi-key Unicode character abbreviations under Linux. Now I just run:
every so often when I want to read back through the (many!) possible key combinations and see what characters I can type. Most of them were easy to guess without looking them up. An en-dash is "--." while an em-dash is "---" while the Copyright symbol © is "co" or "oc" (most of the codes work when typed either way). The â character is either "a^" or "^a", and so forth. I type them; they appear in this text box; and now I'll hit "Post" and they'll appear. It's easy. It's magic.