
All of us have, at one time or another, encountered a set of those funky, distorted CAPTCHA characters when registering or filling out a form on a website. Websites around the world use these CAPTCHA tests to tell actual people apart from programmatic bots that try to create fake accounts or otherwise bypass security. CAPTCHA, by the way, stands for Completely Automated Public Turing test to tell Computers and Humans Apart.
But did you know that on the majority of sites, when you type the characters you see, you are participating in reCAPTCHA, a massive-scale online collaboration project to digitize the world's books?
- "While you're typing a CAPTCHA, during those 10 seconds, your brain is doing something amazing. Your brain is doing something that computers cannot yet do," says Luis von Ahn (one of the inventors of CAPTCHA) in the video below from TEDxCMU at Carnegie Mellon University.
Digitizing the world's books is, of course, a noble cause. It will help us democratize knowledge by making it accessible to anyone, anywhere. But you've probably never stopped to think about the logistics of the digitizing process. Scanning a book obviously involves taking a picture of every page. Then, using OCR (Optical Character Recognition), the scanning computer attempts to recognize all of the words on each page. The problem, however, is that computers are not yet very good at this OCR step of turning a picture of a word into actual text. With older books especially, in which the pages have yellowed and the ink has faded, computers often fail to recognize 30% or more of the words. This flaw in OCR makes digitizing the world's books a problem computers can't solve by themselves.
Meanwhile, roughly 200 million reCAPTCHA codes are hand-typed into websites every day by people around the world. At about 10 seconds each, that's more than 500,000 hours spent every single day entering reCAPTCHAs. reCAPTCHA harnesses this effort by feeding the words that OCR could not recognize out to reCAPTCHA users. Thanks to this stroke of genius, 100,000,000 words a day are being transcribed by CAPTCHA-encountering humans. That's the equivalent of digitizing 2.5 million books a year, one word at a time, by random people on the Internet. The reCAPTCHA team reports that 750 million people have participated in the reCAPTCHA digitizing project so far.
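If you want to check those numbers yourself, here is a quick back-of-the-envelope sketch in Python. It only uses the figures quoted above (200 million CAPTCHAs a day, about 10 seconds each, 100 million words a day, 2.5 million books a year). The function names are made up for illustration, and the words-per-book figure is simply what those numbers imply, not something the reCAPTCHA team has stated.

```python
# Figures quoted in this post; everything else is derived from them.
CAPTCHAS_PER_DAY = 200_000_000
SECONDS_PER_CAPTCHA = 10          # from the "those 10 seconds" quote
WORDS_PER_DAY = 100_000_000
BOOKS_PER_YEAR = 2_500_000


def hours_spent_per_day(captchas: int, seconds_each: int) -> float:
    """Total human time spent typing CAPTCHAs in one day, in hours."""
    return captchas * seconds_each / 3600


def implied_words_per_book(words_per_day: int, books_per_year: int) -> float:
    """Average book length implied by the words-per-day and books-per-year figures."""
    return words_per_day * 365 / books_per_year


hours = hours_spent_per_day(CAPTCHAS_PER_DAY, SECONDS_PER_CAPTCHA)
words_per_book = implied_words_per_book(WORDS_PER_DAY, BOOKS_PER_YEAR)

print(f"Hours spent per day: {hours:,.0f}")
print(f"Implied words per book: {words_per_book:,.0f}")
```

The hours figure comes out a little above the rounded 500,000 quoted above, so the headline number is, if anything, conservative.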
Luis von Ahn and reCAPTCHA are changing the world. Stay tuned: next, Luis and the reCAPTCHA team are going to tackle translating the world's information into multiple languages.

