Distributed proofreaders

Thursday 13 May 2004

In one of the comments to my entry about Read Print from Tuesday, "Blues" suggested trying out PG Distributed Proofreaders, and I did. It's a fascinating web artefact. They've found a way to get through the labor-intensive job of proofing and correcting the OCR scans of books.

The site is a web application for handing out units of work and getting back results. They have over 11,700 people signed up to proof pages, and they are proofing 6,200 pages a day. You sign up on the site, then log in to proof pages. You are presented with a scan of a page and the text as produced by the OCR software. Your job is simply to compare the two and make corrections. Mostly it seems to come down to re-joining hyphenated words (why can't OCR software do that itself?). All they ask is that you proof one page a day.
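On the hyphenation question, here is a rough Python sketch of what the naive automatic fix might look like. It is not how Distributed Proofreaders or any particular OCR package works; the function name rejoin_hyphenated and the rule it applies (a lowercase letter, a hyphen, a line break, then another lowercase letter) are assumptions made up for illustration.

```python
import re

def rejoin_hyphenated(text: str) -> str:
    """Naively rejoin words split across a line break with a hyphen.

    Hypothetical helper: removes "-<newline>" wherever it sits between
    two lowercase ASCII letters. It cannot tell a line-break hyphen
    from a hyphen that belongs in the word.
    """
    return re.sub(r"(?<=[a-z])-\n(?=[a-z])", "", text)

# A word split only because the line ran out is repaired correctly.
print(rejoin_hyphenated("the dis-\ntributed proofreaders"))

# A genuinely hyphenated word that happens to break at its hyphen is damaged.
print(rejoin_hyphenated("a labor-\nintensive job"))
```

The second call shows the catch: "labor-intensive" loses a hyphen it should have kept. Deciding which hyphens are real takes a dictionary or a person looking at the page, which may be part of why the job falls to proofreaders.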

It's a cool way to provide a little bit of labor for a noble cause: the dissemination of public domain information electronically.

Comments

Jessica 1:25 AM on 14 May 2004

Read Print also does the same thing... they have a bunch of volunteers who proof and format all the works. I have asked them earlier about this... they said they are currently working on a system whereby visitors would also be able to contribute.
