Vowels and consonants

Wednesday 7 December 2005

Last week's New Yorker magazine had a good story on Matthew Carter, a renowned type designer. They got all the details right, but one factoid they mentioned stood out like a sore thumb. They said, "Dickens preferred vowels, Thackeray used more consonants". How could that be? They were both writing in English in the same time period. How much personal preference for one letter over another could you express?

So I did an experiment. I downloaded plain text versions of David Copperfield and Vanity Fair from Project Gutenberg. A quick histogram of the letters in each reveals this distribution of frequency of use (for letters more than 1%):

Dickens (%)    Thackeray (%)
e 12.08        e 12.39
t  8.85        t  8.43
a  8.17        a  8.31
o  7.74        o  7.58
i  7.24        n  6.78
n  6.84        h  6.57
h  6.06        s  6.46
s  6.05        i  6.41
r  5.75        r  6.19
d  4.70        d  4.71
l  3.79        l  3.97
m  3.15        u  2.70
u  2.83        m  2.66
w  2.60        c  2.53
y  2.26        w  2.52
c  2.24        f  2.13
f  2.17        g  2.13
g  2.10        y  2.04
p  1.70        p  1.74
b  1.52        b  1.64

The most significant difference I can see is the i: 7.24% for Dickens and 6.41% for Thackeray. And Thackeray's s, h, and r are more common than Dickens'. But Thackeray used more e's and fewer t's. It's all a wash as far as I can see. Maybe there's a slight truth to it, but enough to make a difference to a type designer? I don't see it.
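For anyone who wants to repeat the count, here is a minimal sketch in Python of the kind of letter histogram described above. It is not the script used for the table: the function names are made up for illustration, it assumes only the letters a-z are counted (case-insensitively), and it treats just a, e, i, o, u as vowels. It also does nothing to strip the Project Gutenberg header and license text, which would nudge the numbers slightly.

```python
from collections import Counter

# Assumption: "vowels" means a, e, i, o, u only (y is counted as a consonant).
VOWELS = set("aeiou")


def letter_percentages(text):
    """Return {letter: percent of all letters}, ignoring case and non-letters."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    total = len(letters)
    if total == 0:
        return {}
    counts = Counter(letters)
    return {ch: 100.0 * n / total for ch, n in counts.items()}


def vowel_share(percentages):
    """Sum the percentages of the vowel letters."""
    return sum(p for ch, p in percentages.items() if ch in VOWELS)


def print_histogram(text, threshold=1.0):
    """Print letters used more than `threshold` percent, most common first."""
    pcts = letter_percentages(text)
    for ch, p in sorted(pcts.items(), key=lambda kv: -kv[1]):
        if p > threshold:
            print(f"{ch} {p:5.2f}")


# Hypothetical usage, assuming the novel was saved locally as copperfield.txt:
#   with open("copperfield.txt", encoding="utf-8") as f:
#       print_histogram(f.read())
```

Running print_histogram on each novel's text gives one column of the table, and vowel_share gives a single number per author to compare.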

BTW: every time I have to make a data table on this site, I struggle with it. Some day I'll learn the CSS to do it right.

Comments

Dave Delay 10:12 AM on 7 Dec 2005

While you were counting, which did you enjoy more -- David Copperfield or Vanity Fair? ;-)

andrew 11:18 AM on 7 Dec 2005

It's just the New Yorker. They lie about everything. Did Seymour Hersh write the article?

tani 12:26 PM on 7 Dec 2005

Well, I guess looking a bit closer, Dickens is using roughly 0.6% more vowels. I'm not sure that 0.6% would be noticeable over 200 pages.

Richard Schwartz 12:30 PM on 7 Dec 2005

Writing in first person narration, which memory tells me Dickens did quite a lot of, is an obvious potential reason for the frequency of "I".

Ned Batchelder 12:52 PM on 7 Dec 2005

Good point!

ELY Mustapha 4:19 PM on 7 Dec 2005

Nice and efficient way to tell authors' styles apart... e.g., try this method on "Mein Kampf" and you'll (perhaps) discover a universal rule of human style ;-)
Cheers!
