reCAPTCHA: Human-Based Character Recognition via Web Security Measures

12 SEPTEMBER 2008 | Luis von Ahn, Benjamin Maurer, Colin McMillen, David Abraham, Manuel Blum
reCAPTCHA is a security measure that uses human effort to help digitize old printed material. It asks users to decipher distorted words from scanned books that optical character recognition (OCR) software cannot recognize, so each solved challenge both verifies that the user is human and contributes to the digitization of texts. The system has been deployed on over 40,000 websites and has transcribed over 440 million words.
reCAPTCHA presents two words to each user: one unknown word and one known word (the control word). Users must type both words, and the answer to the control word is used to verify the answer to the unknown word. If the user types the control word correctly, the system assumes they are human and treats their answer to the unknown word as a likely correct transcription. In this way, human accuracy is used to correct OCR errors and improve the digitization process.
The system achieves a word accuracy of over 99%, matching the accuracy of professional human transcribers. Humans transcribe text more accurately than OCR software, especially for older texts with faded ink and yellowed pages. Because the work is done in the time users already spend solving CAPTCHAs, the system also reduces the workload of human transcribers. The words transcribed so far are equivalent to over 17,600 books transcribed manually, and the system continues to grow in popularity, with over 4 million suspicious words transcribed per day.
reCAPTCHA is also effective at preventing automated abuse of online services, and many websites use it to prevent spam, ticket scalping, and other forms of abuse. It is more secure than conventional CAPTCHAs because the words it uses are ones that OCR software cannot recognize, which makes it harder for automated programs to guess the correct answer.
reCAPTCHA is an example of a broader concept called "human computation," in which human effort is harnessed to solve problems that computers cannot yet solve. Here, that effort digitizes old texts while also serving as a security measure.
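The two-word check described above can be sketched in a few lines of Python. This is a minimal illustration only, not the deployed system: the names `Challenge`, `check_response`, and `unknown_word_id` are hypothetical, and the case-insensitive comparison is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class Challenge:
    """One challenge: a control word with a known answer, plus an unknown word."""
    control_answer: str   # known transcription of the control word
    unknown_word_id: str  # hypothetical identifier for the scanned unknown word


def normalize(text: str) -> str:
    # Assumption: ignore surrounding whitespace and letter case when comparing.
    return text.strip().lower()


def check_response(challenge: Challenge, control_typed: str,
                   unknown_typed: str, transcriptions: dict) -> bool:
    """Return True if the user passes; if so, record their unknown-word answer."""
    if normalize(control_typed) != normalize(challenge.control_answer):
        # Control word is wrong: reject, and do not trust the other answer.
        return False
    # Control word is right: treat the user as human and keep their
    # answer to the unknown word as a candidate transcription.
    candidates = transcriptions.setdefault(challenge.unknown_word_id, [])
    candidates.append(normalize(unknown_typed))
    return True


# Example usage with made-up words.
store = {}
challenge = Challenge(control_answer="morning", unknown_word_id="scan-001")
passed = check_response(challenge, "Morning", "overlooks", store)
failed = check_response(challenge, "mourning", "anything", store)
```

In this sketch only the control word decides whether the user passes; the unknown word's answer is simply stored. A real deployment would presumably compare answers from several users before accepting a transcription, which is not shown here.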