I’m working on an analysis that compares the classifications of volunteers at Snapshot Serengeti with the classifications of experts for several thousand images from Season 4. This analysis will do two things. First, it will give us an idea of how good (or bad) our simple vote-counting method is for figuring out species in pictures. Second, it will allow us to see if more complicated systems for combining the volunteer data work any better. (Hopefully I’ll have something interesting to say about it next week.)
Right now I’m curating the expert classifications. I’ve allowed the experts to classify an image as “impossible,” which, I know, is totally unfair, since Snapshot Serengeti volunteers don’t get that option. But we all recognize that for some images, it really isn’t possible to figure out what the species is — either because it’s too close or too far or too off the side of the image or too blurry or …. The goal is that whatever our combining method is, it should be able to figure out “impossible” images by combining the non-“impossible” classifications of volunteers. We’ll see if we can do it.
Another challenge that I’m just running into is that our data set of several thousand images contains a duiker. A what? A common duiker, also known as a bush duiker:
You’ve probably noticed that “duiker” is not on the list of animals we provide. While the common duiker is widespread, it’s not commonly seen in the Serengeti, being small and active mainly at night. So we forgot to include it on the list. (Sorry about that.)
The result is that it’s technically impossible for volunteers to properly classify this image. Which means that it’s unlikely that we’ll be able to come up with the correct species identification when we combine volunteer classifications. (Interested in what the votes were for this image? 10 reedbuck, 6 dik dik, and 1 each of bushbuck, wildebeest(!), and impala.)
The duiker is not the only animal that’s popped up unexpectedly since we put together the animal list and launched the site. I never expected we’d catch a bat on film:
Our friends over at Bat Detective tell us that the glare on the face makes it impossible to truly identify, but they did confirm that it’s a large, insect-eating bat. Anyway, how to classify it? It’s not a bird. It’s not a rodent. And we didn’t allow for an “other” category.
I also didn’t think we’d see insects or spiders.
Moths fly by, ticks appear on mammal bodies, spiders spin webs in front of the camera and even ants have been seen walking on nearby branches. Again, how should they be classified?
And here’s one more uncommon antelope that we’ve seen:
It’s a steenbok, again not commonly seen in Serengeti. And so we forgot to put it on the list. (Sorry.)
Luckily, all these animals we missed from the list are rare enough in our data that when we analyze thousands of images, the small error in species identification won’t matter much. But it’s good to know that these rarely seen animals are there. When Season 5 comes out (soon!), if you run into anything you think isn’t on our list, please comment in Talk with a hash-tag, so we can make a note of these rarities. Thanks!