SI prefixes

Sunday 9 September 2007

This is close to 16 years old. Be careful.

Discussion of what size external disk drive to buy (I’m thinking of a 500GB model to put an end to data squeeze for a good long time) led to a question about the meaning of the SI prefix tera-. Wikipedia’s SI prefix page explains that the prefixes started from Greek roots for large things, then switched over to numbers as sources:

  • giga- (G, 10^9): From the Greek root for giant.
  • tera- (T, 10^12): From the Greek root for monster.
  • peta- (P, 10^15): Since tera- looks like tetra- (Greek for four with a letter missing), peta- is from penta-, Greek for five with a letter missing.
  • exa- (E, 10^18): From the Greek prefix hexa- (six), with a letter missing.
  • zetta- (Z, 10^21): From the Latin septem (seven), with the p dropped, and the first letter changed from s to avoid confusion with existing SI symbols.
  • yotta- (Y, 10^24): From the Greek octo- (eight), with the c dropped and a y added to avoid having a symbol of O, indistinguishable from zero.

The official Bureau International des Poids et Mesures page about the prefixes is interesting because of its links to the actual resolutions that created the prefixes. Unlike ISO, BIPM manages to make their official output refreshingly brief: their resolutions are only a few paragraphs at most.

One aspect of this I don’t understand: why the BIPM approves new prefixes in such small increments. Peta- and exa- appeared in 1975, and zetta- and yotta- in 1991. Why dole them out so slowly? Of course, there are proposals, including one that goes all the way to 10^63 (luma-).

The scientists take exception to the computer geeks using their decimal prefixes for binary amounts. 2^20 is not 10^6, and the inaccuracy gets quite large by the time you are discussing exabytes of data. There is a system of binary prefixes (gibibytes, anyone?), but they have not caught on yet, and sound silly, so I don’t think they ever will.
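To put rough numbers on how that inaccuracy grows, here is a small Python sketch comparing each decimal prefix with the power of two that often gets substituted for it. The names `PREFIXES` and `binary_excess_percent` are invented for this illustration.

```python
# Decimal SI prefixes paired with the power of two often used in their place.
# Each entry is (prefix name, decimal exponent, binary exponent).
PREFIXES = [
    ("kilo", 3, 10),
    ("mega", 6, 20),
    ("giga", 9, 30),
    ("tera", 12, 40),
    ("peta", 15, 50),
    ("exa", 18, 60),
]


def binary_excess_percent(decimal_exp, binary_exp):
    """Return how much larger 2**binary_exp is than 10**decimal_exp, in percent."""
    return (2 ** binary_exp / 10 ** decimal_exp - 1) * 100


for name, dec, bin_ in PREFIXES:
    excess = binary_excess_percent(dec, bin_)
    print(f"{name:>5}: 2^{bin_} exceeds 10^{dec} by {excess:.1f}%")
```

The gap starts at about 2.4% for kilo and reaches about 15.3% by exa.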


Is it that hard for us to just say "Hey, I want a five-hundred-billion-byte hard drive!"?
As it happens, I did buy a "500 GB" external hard drive, and Windows tells me its capacity is 465 GB:

465 * 2^30 == 499,289,948,160
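
The same check can be sketched in Python, assuming Windows reports sizes in units of 2^30 bytes while the drive label uses units of 10^9 bytes. The function names are made up for this example.

```python
GIB = 2 ** 30  # bytes in a binary "gigabyte" (gibibyte)
GB = 10 ** 9   # bytes in a decimal gigabyte


def reported_to_bytes(binary_gb):
    """Convert a size shown in binary gigabytes to a byte count."""
    return binary_gb * GIB


def labelled_to_binary(decimal_gb):
    """Convert a size labelled in decimal gigabytes to binary gigabytes."""
    return decimal_gb * GB / GIB


print(f"{reported_to_bytes(465):,}")     # 499,289,948,160
print(f"{labelled_to_binary(500):.2f}")  # 465.66
```
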
I want my 500YB hard drive now!
Reminds me of this glorious attempt to bring meaning to common units: "So, what's the velocity of a sheep in a vacuum"

- Paddy.
2^10 is not 10^6

True. And 2^20 isn't, either.
@Ed: good catch! Thanks for reading so closely... :-)
Well, I wonder why computer geeks hang on to the binary prefixes when talking about hard drive sizes. While powers of 2 can be an advantage for page alignment, CPU caches, and RAM sizes, no such advantage exists for modern hard drives. Given the choice between a drive with 2^40 bytes and a drive with 1.1 * 10^12 bytes, if the price is the same, I'll take the 1.1 TB.
Thanks for bringing this up. Whenever I try to discuss it with my friends, they say, "Yotta getta life".
The binary prefixes may sound silly, but I don't think they sound any sillier than the power-of-ten prefixes; they're only less familiar.

They're becoming more and more standard in operating systems and programs that deal with large amounts of data, for exactly the reason you state: the difference between power-of-two prefix and power-of-ten prefix gets large enough that the difference is confusing to the user. Consequently, I expect they'll catch on to the extent that the user encounters them in use, just as users originally learned the large power-of-ten prefixes.
I agree with Ben Finney, and there is a slow but steady uptake of those binary prefixes. An example is the Dutch magazine "PC Active" that has been running a small box explaining them on their "Q&A" pages for a couple of years now.
