KittenAuth and Extensions

Schneier posts about KittenAuth, a reverse Turing test in which you're asked to distinguish pictures of kittens from pictures of other kinds of animals. He raises the obvious issue of how to brute-force these kinds of systems:
Of course you could increase the security by adding more images or requiring the person to choose more images. Another worry -- which I didn't see mentioned -- is that the computer could brute-force a static database. If there are only a small fixed number of actual kittens, the computer could be told -- by a person -- that they're kittens. Then, the computer would know that whenever it sees that image it's a kitten.

Of course, the ability to brute-force is basically a function of database size. My guess is that their database is pretty small, since I just got three copies of the same image in my challenge, though it does look like they're using some low-resolution fuzzing to make the problem harder. What you want is a large database of pictures that are pre-labelled for you and that people can't get a labelled dump of. I don't know of one for cats vs. other animals (there's the Flickr I Love My Cat Photo Pool, but it's pre-sorted), but telling male from female is also quite a difficult problem for a computer. The obvious choice for this is Am I Hot Or Not, which (1) has a large database, (2) sorts pictures into male/female, and (3) doesn't seem to let you get a copy of the database. They also sort things into age ranges of 18-25, 26-32, 33-40, and over 40, so you could probably get another bit from 18-25 vs. over 40.
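To put a rough number on the database-size argument, here is a minimal sketch. It assumes a model that is not taken from KittenAuth itself: each challenge shows a fixed number of distinct images drawn uniformly from a static database, and the attacker has had a person label some of them. The function name and all the figures are made up for illustration.

```python
from math import comb

def prob_all_known(db_size: int, labelled: int, shown: int) -> float:
    """Chance that every image in one challenge is already in the attacker's labelled set.

    Hypothetical model: the challenge draws `shown` distinct images uniformly
    at random from a static database of `db_size` images, of which the
    attacker has `labelled` images already tagged by a person.
    """
    if shown > db_size:
        raise ValueError("challenge cannot show more images than the database holds")
    return comb(labelled, shown) / comb(db_size, shown)

# Illustrative numbers only: 9 images per challenge, 500 human-labelled images.
budget = 500
for db_size in (500, 5_000, 50_000):
    print(db_size, prob_all_known(db_size, min(budget, db_size), 9))
```

With the labelling budget held fixed, the chance of seeing a fully known challenge falls off quickly as the database grows, which is the sense in which a bigger database is harder to brute-force.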


Brute forcing by trying to obtain a copy of the kitten-picture database seems like the naïve solution. It'd be way easier to just pick images at random and submit 84 times as many guesses. It's not like there's a shortage of bandwidth to submit 84x the number of queries, and it's not like there's a shortage of IP addresses from which to submit the requests to avoid looking like a bot. Assuming, of course, you have a bot network of a few hundred thousand nodes.
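The 84 figure is not explained in the comment. It matches the number of ways to pick 3 images out of 9, so the sketch below assumes a 9-image challenge with 3 kittens; that layout is an assumption, not something stated above, and the function names are made up for illustration.

```python
from math import comb

def guess_space(grid: int, kittens: int) -> int:
    """Number of equally likely ways to choose `kittens` images out of `grid`."""
    return comb(grid, kittens)

def p_success_within(attempts: int, grid: int, kittens: int) -> float:
    """Chance that at least one of `attempts` independent random guesses is right."""
    p = 1 / guess_space(grid, kittens)
    return 1 - (1 - p) ** attempts

# Assumed layout: 9 images shown, 3 of them kittens.
print(guess_space(9, 3))           # 84
print(p_success_within(84, 9, 3))  # chance of at least one hit in 84 tries
```

On this model a random guesser needs 84 attempts on average per success, since the number of tries until the first hit is geometric with mean 1/p.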

The best way to break it is to use a porn-serving captcha farm to build a database of which images are kittens, and then run fully automated from there.

