Field of Science

The best spam filter ever

The famous Turing Test tests the intelligence of computers in the following way: if a computer can convince us it is a human, it is probably as intelligent as a human (that's not Turing's original version, but it's better known).

What is interesting is that although Turing focused on language and problem solving, one of the easiest ways of telling a human from a machine is by testing perception, especially vision, which in humans is the dominant of the five senses. So one of the most important forms of the Turing Test today is actually a vision test.

To get an account on just about any website, you must prove you are human by copying a few sloppily written letters. Machines, despite decades of research, are very bad at visual recognition of objects, including alphanumeric characters. These bits of text that you have to retype are called CAPTCHAs.

Enter reCAPTCHA. The website says it all:

About 60 million CAPTCHAs are solved by humans around the world every day. In each case, roughly ten seconds of human time are being spent. Individually, that's not a lot of time, but in aggregate these little puzzles consume more than 150,000 hours of work each day. What if we could make positive use of this human effort? reCAPTCHA does exactly that by channeling the effort spent solving CAPTCHAs online into "reading" books.


That is, each CAPTCHA is in fact a snippet of text from a book being digitized, one that their computer is unable to read. I have no idea how well the system has worked so far, nor do I know the details of the implementation, but the idea is brilliant, and it really captures part of what makes the Web so powerful: millions of people each donating just a few minutes of their time. This is, of course, what has given us the Fray, Wikipedia, and Web-based experiments. But unlike those cases, filling out CAPTCHAs is something people have to do anyway.
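
For what it's worth, the arithmetic in the figures quoted from the reCAPTCHA site checks out. Here is a quick sketch in Python that uses only the two numbers from the quote (60 million CAPTCHAs a day, roughly ten seconds each); the variable and function names are my own, not anything from reCAPTCHA.

```python
# Figures taken from the reCAPTCHA quote above; both are rough estimates.
CAPTCHAS_PER_DAY = 60_000_000
SECONDS_PER_CAPTCHA = 10

SECONDS_PER_HOUR = 3600


def daily_hours(captchas_per_day, seconds_each):
    """Total human time spent per day, in hours."""
    return captchas_per_day * seconds_each / SECONDS_PER_HOUR


hours = daily_hours(CAPTCHAS_PER_DAY, SECONDS_PER_CAPTCHA)
print(f"{hours:,.0f} hours per day")  # prints "166,667 hours per day"
```

So "more than 150,000 hours" is, if anything, a conservative way of putting it.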

2 comments:

Anonymous said...

Do you know the website Games With a Purpose? This website is the brainchild of Luis von Ahn, who developed the CAPTCHA, and it is based on the same principle.

josh said...

Funny you should ask. Check out today's post.