Some Results from Season 4
I was asked in the comments to last week’s blog post if I could provide some feedback about the results of Season 4. If you felt like you were seeing a lot of “nothing here” images, you’re right: of the 158,098 unique capture events we showed you, 70% were classified as having no animals in them. That left 47,320 with animals in them to classify, and the vast majority of these (94%) contained just one species. Here’s the breakdown of what was in all those images:
Maybe it won’t surprise you that Season 4 covered 2012’s wet season, when over a million wildebeest, zebra, and Thomson’s gazelle migrate through our study area. I find it interesting that hartebeest are also pretty numerous, but I wonder if it’s because of that one hartebeest that stood in front of the camera for hours on end.
This pie chart is based on the number of what we call “capture events,” which is the set of 1 or 3 pictures you see every time you make a classification. Once a camera has taken a set of pictures, we delay it from triggering again for about a minute. That way we don’t fill up the camera’s memory card with too many repeats of the same animals before we have a chance to replace it. But a minute isn’t a very long time for an animal that has decided to camp out in front of a camera, and so we frequently get sequences of many capture events that are all of the same animal. One of the things we’ll have to do in turning your classifications into valid research results is to figure out how to find these sequences in the data automatically.
Here’s a sequence of an elephant family hanging out around our camera for the night about a year ago. (Hat tip to dms246 who put together a collection of most of these images to answer the concerned question of some classifiers who saw just one image out of the whole sequence: is that elephant dead or just sleeping?)
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31
If you’re interested in how I made the above pie chart, keep reading. But we’re going to get technical here, so if algorithms don’t interest you, feel free to stop.
The Nitty Gritty
The first thing I had to do with the data is to figure out what the “right” animal is based on all the volunteer classifications. Each capture event gets at least 10 pairs of eyeballs looking at it, so we always have 10 or more classifications to consider.
Now I have an admission to make. In doing this analysis, I used an old table of classifications from January 17, five days before we actually finished Season 4 and when we were about 90% through it. This means that not all capture events in my analysis table actually had 10 classifications. This minor discrepancy doesn’t matter for the big overall picture, like the pie chart above, but you’ll see it does matter for the details.
To figure out the correct species, I took the 47,320 capture events with animals in them and counted how many species were marked in each classification for each capture event. I used the median number of species classified for each capture event (rounded up) as the “real” number of species in that image. The result of this was that 44,471 capture events contained just one species, 2,811 contained two species, and 38 had three species.
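For readers who like to see this kind of step written out, here is a minimal Python sketch of the "median, rounded up" rule. It is not the script behind the numbers above. The function name and the input layout (one list of species names per volunteer classification) are assumptions made for illustration.

```python
import math
from statistics import median


def estimate_species_count(classifications):
    """Estimate how many species are in one capture event.

    `classifications` is a hypothetical layout: one list of species
    names per volunteer, e.g. [["zebra"], ["zebra", "wildebeest"]].
    """
    # Number of distinct species each volunteer marked.
    counts = [len(set(species)) for species in classifications]
    # Median across volunteers, rounded up to a whole number.
    return math.ceil(median(counts))


example = [["zebra"], ["zebra", "wildebeest"], ["zebra"], ["zebra", "wildebeest"]]
print(estimate_species_count(example))
```

With two volunteers seeing one species and two seeing two, the median is 1.5, and rounding up treats the capture event as a two-species image.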
Next, I decided to use a really simple strategy to assign the “correct” species to each capture event. For capture events with just one species, I determined the correct species as the one with at least 50% of the vote. If a species had exactly 50% of the vote, I declared it the winner if there wasn’t another species that also had 50% of the vote. So, for example, if a capture event had 10 votes for Thomson’s gazelle and another 10 votes for Grant’s gazelle, then neither would win. But if it had 10 votes for Thomson’s gazelle, 8 votes for Grant’s gazelle, and 2 votes for impala, I would declare Thomson’s gazelle the winner. This strategy left me with “correct” species identifications for 96% of the capture events! There remained just 1,604 capture events where no one species had at least half of the vote. I left these as “hard to figure out” in the pie chart above.
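The voting rule for single-species capture events can be sketched in a few lines of Python. This is an illustration only: the function name is made up, and it assumes each vote is one species name and that percentages are taken over the total number of votes.

```python
from collections import Counter


def single_species_winner(votes):
    """Return the species with at least half of the vote, or None.

    `votes` is a list of species names (an assumed layout). A tie in
    which two species each hold exactly 50% has no winner.
    """
    if not votes:
        return None
    tally = Counter(votes)
    total = len(votes)
    # Integer comparison avoids floating-point trouble at exactly 50%.
    leaders = [species for species, n in tally.items() if 2 * n >= total]
    return leaders[0] if len(leaders) == 1 else None


tie = ["gazelleThomsons"] * 10 + ["gazelleGrants"] * 10
clear = ["gazelleThomsons"] * 10 + ["gazelleGrants"] * 8 + ["impala"] * 2
print(single_species_winner(tie))
print(single_species_winner(clear))
```

The two examples mirror the gazelle cases above: the 10-to-10 tie gives no winner, while 10 votes out of 20 with no rival at 50% makes Thomson’s gazelle the winner.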
How certain am I that my simple algorithm to determine species identifications is correct? I obviously haven’t flipped through all the capture events by hand to check. But I did note that 25,279 (57%) of the single-species capture events were unanimous – in other words everyone who saw those images agreed on the species. And remember that’s at least ten people. Not bad. A total of 37,430 (84%) capture events had at least 75% agreement among classifiers. So I feel pretty good that this simple algorithm can do well on the majority of capture events.
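Here is a hedged sketch of how agreement figures like these could be computed. The names are hypothetical, and "agreement" is taken here to mean the share of all votes that went to the most popular species, which is an assumption about the definition.

```python
from collections import Counter


def agreement(votes):
    """Share of all votes that went to the most popular species (0 to 1)."""
    if not votes:
        raise ValueError("need at least one vote")
    return max(Counter(votes).values()) / len(votes)


def agreement_summary(events, threshold=0.75):
    """Count unanimous events and events at or above `threshold`.

    `events` is an assumed layout: a list of vote lists, one per
    capture event.
    """
    scores = [agreement(votes) for votes in events]
    unanimous = sum(1 for s in scores if s == 1.0)
    high = sum(1 for s in scores if s >= threshold)
    return unanimous, high


events = [
    ["zebra"] * 10,
    ["zebra"] * 8 + ["wildebeest"] * 2,
    ["zebra"] * 5 + ["wildebeest"] * 5,
]
print(agreement_summary(events))
```

Note that a capture event with a single classification scores 1.0 under this measure, so agreement only means much once enough people have looked at the image.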
What about those capture events with multiple species in them? I used the same simple logic to assign species to those. For two-species capture events, I looked for exactly two species that each had at least 25% of the vote. That would mean that together, they had at least 50% of the vote. If I ended up with 3 or 1 or 0 species with at least 25% of the vote, I considered the capture event “hard to figure out.” For three-species capture events, I looked for exactly three species that each had at least 16.66% of the vote. The result was that I could determine “correct” identifications on 96% (2,700/2,811) of two-species capture events and 82% (31/38) of three-species capture events.
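The thresholds above (50%, 25%, 16.66%) all follow the pattern of one half divided by the number of species. A minimal Python sketch of that generalization follows. The function name and input layout are assumptions, and it uses an exact one-sixth cutoff for three species rather than the rounded 16.66%.

```python
from collections import Counter


def assign_species(votes, n_species):
    """Return a sorted tuple of species, or None if "hard to figure out".

    `votes` is a list of species names (an assumed layout) and
    `n_species` is the estimated number of species in the image.
    A species passes if it holds at least 1 / (2 * n_species) of the
    vote, and exactly `n_species` species must pass.
    """
    if not votes or n_species < 1:
        return None
    tally = Counter(votes)
    total = len(votes)
    # count / total >= 1 / (2 * n_species), written with integers only.
    passing = sorted(s for s, c in tally.items() if 2 * n_species * c >= total)
    return tuple(passing) if len(passing) == n_species else None


votes = ["wildebeest"] * 10 + ["zebra"] * 8 + ["otherBird"] * 2
print(assign_species(votes, 2))
```

In the example, wildebeest and zebra each clear the 25% bar while otherBird (10%) does not, so the capture event is labeled as a wildebeest-and-zebra image.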
Below are the tabulated results. And now here is where it’s important to remind you that I didn’t use the fully-finished Season 4 data. I probably should have started by removing capture events that hadn’t yet accumulated ten classifications, but I didn’t think to do that at the time. When I started looking at the actual images of supposedly rarely captured animals, I discovered that many of them were errors. If only one person had classified a capture event, then that identification was treated as having 100% agreement, even if the classification was completely wrong.
So take the numbers at the bottom of the table with a grain of salt; many of them would decrease or turn to zero if I were to run this algorithm again on the fully classified data. For example, I found only one real rhino and one real wildcat in the several images that were putatively “rhino” and “wildcat” due to too few classifications, and no civets, genets, or zorillas. (Sorry for those of you trying to collect them all!) However, the multi-species rare combinations seem to be pretty good. The two female-lion-and-other-bird images (here and here) checked out, as did the one secretary-bird-wildebeest-zebra image, and several others that I spot-checked.
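The filtering step mentioned above, setting aside capture events that have not yet reached ten classifications, could look something like this. It is a sketch only: the function name, the dictionary layout, and the example capture event IDs are all made up for illustration.

```python
def split_by_classification_count(events, minimum=10):
    """Separate fully classified capture events from the rest.

    `events` is an assumed layout: a dict mapping a capture event ID
    to its list of classifications.
    """
    ready = {cid: c for cid, c in events.items() if len(c) >= minimum}
    pending = {cid: c for cid, c in events.items() if len(c) < minimum}
    return ready, pending


events = {
    "event_A": [["wildebeest"]] * 10,  # fully classified
    "event_B": [["rhinoceros"]],       # one lone classification
}
ready, pending = split_by_classification_count(events)
print(sorted(ready), sorted(pending))
```

Running the vote-counting only on the `ready` group would keep a single wrong classification from looking like a confident identification.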
Species | Number of Capture Events |
wildebeest | 17,443 |
zebra | 7,886 |
hartebeest | 2,628 |
buffalo | 2,275 |
gazelleThomsons | 2,213 |
wildebeest, zebra | 2,024 |
“hard to figure out” | 1,722 |
impala | 1,556 |
giraffe | 1,160 |
elephant | 1,066 |
warthog | 969 |
human | 883 |
otherBird | 732 |
gazelleGrants | 720 |
guineaFowl | 678 |
hyenaSpotted | 504 |
lionFemale | 302 |
hippopotamus | 284 |
reedbuck | 259 |
baboon | 192 |
eland | 175 |
otherBird, wildebeest | 163 |
otherBird, buffalo | 153 |
topi | 141 |
cheetah | 100 |
dikDik | 96 |
koriBustard | 82 |
lionMale | 54 |
jackal | 48 |
otherBird, zebra | 48 |
serval | 43 |
aardvark | 40 |
waterbuck | 38 |
gazelleThomsons, wildebeest | 38 |
secretaryBird | 37 |
vervetMonkey | 37 |
porcupine | 34 |
bushbuck | 30 |
ostrich | 30 |
hare | 29 |
leopard | 28 |
gazelleThomsons, hartebeest | 24 |
hartebeest, otherBird | 24 |
mongoose | 23 |
gazelleGrants, gazelleThomsons | 22 |
guineaFowl, wildebeest | 22 |
otherBird, wildebeest, zebra | 21 |
giraffe, otherBird | 20 |
gazelleThomsons, otherBird | 16 |
batEaredFox | 13 |
giraffe, zebra | 13 |
impala, zebra | 13 |
eland, zebra | 11 |
gazelleGrants, hartebeest | 11 |
lionFemale, lionMale | 11 |
guineaFowl, warthog | 10 |
impala, otherBird | 10 |
warthog, zebra | 10 |
hartebeest, zebra | 9 |
aardwolf | 8 |
buffalo, zebra | 8 |
gazelleThomsons, zebra | 8 |
otherBird, warthog | 7 |
caracal | 6 |
rodents | 6 |
guineaFowl, hartebeest | 6 |
hyenaSpotted, wildebeest | 6 |
wildcat | 5 |
hyenaStriped | 4 |
rhinoceros | 4 |
buffalo, warthog | 4 |
eland, otherBird | 4 |
giraffe, warthog | 4 |
guineaFowl, otherBird | 4 |
guineaFowl, zebra | 4 |
secretaryBird, wildebeest | 4 |
topi, warthog | 4 |
warthog, wildebeest | 4 |
reptiles | 3 |
baboon, impala | 3 |
elephant, otherBird | 3 |
gazelleGrants, warthog | 3 |
hartebeest, wildebeest | 3 |
human, wildebeest | 3 |
buffalo, giraffe | 2 |
buffalo, wildebeest | 2 |
elephant, zebra | 2 |
gazelleGrants, otherBird | 2 |
gazelleGrants, wildebeest | 2 |
gazelleGrants, zebra | 2 |
gazelleThomsons, giraffe | 2 |
gazelleThomsons, human | 2 |
guineaFowl, wildebeest, zebra | 2 |
human, otherBird | 2 |
human, wildcat | 2 |
impala, warthog | 2 |
koriBustard, zebra | 2 |
lionFemale, otherBird | 2 |
rodents, zebra | 2 |
civet | 1 |
genet | 1 |
zorilla | 1 |
baboon, gazelleGrants | 1 |
baboon, giraffe | 1 |
baboon, zebra | 1 |
buffalo, elephant | 1 |
buffalo, otherBird, zebra | 1 |
eland, wildebeest | 1 |
elephant, guineaFowl | 1 |
elephant, vervetMonkey | 1 |
elephant, wildebeest | 1 |
gazelleGrants, giraffe | 1 |
gazelleGrants, impala | 1 |
gazelleGrants, koriBustard | 1 |
gazelleGrants, otherBird, warthog | 1 |
gazelleThomsons, hartebeest, ostrich | 1 |
gazelleThomsons, ostrich | 1 |
gazelleThomsons, topi | 1 |
gazelleThomsons, warthog | 1 |
gazelleThomsons, wildebeest, zebra | 1 |
giraffe, impala | 1 |
giraffe, ostrich | 1 |
giraffe, topi | 1 |
giraffe, vervetMonkey | 1 |
giraffe, wildebeest | 1 |
hare, zebra | 1 |
hartebeest, impala | 1 |
hartebeest, warthog | 1 |
hartebeest, wildebeest, zebra | 1 |
human, impala | 1 |
human, warthog | 1 |
human, wildebeest, zebra | 1 |
human, zebra | 1 |
hyenaSpotted, zebra | 1 |
impala, reedbuck | 1 |
impala, wildebeest | 1 |
jackal, otherBird | 1 |
koriBustard, reedbuck | 1 |
koriBustard, wildebeest | 1 |
koriBustard, wildebeest, zebra | 1 |
mongoose, otherBird | 1 |
otherBird, topi | 1 |
otherBird, waterbuck | 1 |
secretaryBird, wildebeest, zebra | 1 |
secretaryBird, zebra | 1 |
topi, zebra | 1 |
19 responses to “Some Results from Season 4”
This is fascinating. : ) Is that one secretary-bird-wildebeest-zebra image really a secretary bird, or a stork?
I agree with Muffy, it might as well be a stork. The head is not really visible. The tail is, though, and on a secretary bird I’d expect a much longer tail than the pictures currently show.
“one secretary-bird-wildebeest-zebra image”: http://talk.snapshotserengeti.org/#/subjects/ASG000b7ux
http://www.google.nl/search?q=sagittarius+serpentarius&tbo=u&tbm=isch
http://www.google.nl/search?q=Ciconia+ciconia&tbo=u&tbm=isch
Oh, man. I am _not_ a bird expert. I will leave it up to you birders to decide whether it’s really a stork or not. I think it’s an understandable mistake, though, if it is, as we have secretary bird on our list and not stork. I wish we could do better justice to all the cool birds out in Serengeti…
Don’t worry, we won’t hold it against you, I’m not an expert either :).
We won’t be able to make a definitive decision with just these pictures. As a scientist you might have access to the high resolution pictures, or have easy access to the pictures just before or after these in the timeline. Because of the randomizer something hard for us might be easy for you :).
But these results are fantastic. With these results and the results of Seasons 1-3 you cover a complete year and might evaluate the species list. Maybe you could have a look at the “otherBirds” section and upgrade the top 3 birds in there (if they’ve been identified correctly at least x amount of times) to have their own button in the list. Maybe a button “othercatlike” to group caracal, wildcat, civet, genet and zorilla together.
For the category “hard to figure out” you might create buttons like “show previous/next in timeline”. I think some of those pictures of the hartebeest camping in front of the same tree might have ended up in this category. Sometimes knowing the previous / next picture in the timeline will make something hard to figure out suddenly pretty easy to figure out. Take a picture completely covered by a patch of fur. We could count the hairs, but that’s about it; the species is impossible to tell. Knowing that just a minute earlier there was a hartebeest camping there makes it easy to tell. Don’t enable these buttons by default; only enable them on pictures which, after at least x amount of people have looked at them, still fall into the category “hard to figure out”.
Super!! great results. So glad I was able to participate.
Fascinating! This is a great project.
I am a little surprised by the impala results. My impression was that I saw more than that, certainly more than Giraffe + Elephant. I hope that doesn’t mean I’ve been misidentifying them!
I find all of this info so informative. Even if it is “technical”!
Oh this is fascinating. Thanks so much for the update and all of the technical info!
Oh and btw, I am totally convinced that one Hartebeest is the cause for such a large percentage. I have seen that guy far too many times to count. 🙂
I was also surprised that the percentage of impala was so low. But on another note, does anyone agree with me in wondering if our hartebeest could have been in labor, since a newborn did show up in later shots?
I too wondered whether this hartebeest was in labour; as you say, a newborn did show up later. It’s a shame we didn’t get some birthing shots. There was mention of one that had just calved, and a shot of her from behind showed evidence of having recently given birth. I’d be interested to know if there were any with a baby on its way into the world.
Excellent post! And thanks for the “Nitty Gritty”, that was very interesting as well. I was a bit surprised that buffalo was quite a bit more frequent than impalas. I would have estimated that I saw them about equally often. And, as others have pointed out, I think the hartebeest number may be significantly skewed by that “one” guy camping out in front of the camera.
Anyway, it’s great to see the actual numbers. It helps explain why I still have 10 animals on the list that I haven’t seen yet.
I too was surprised about the impala number being so low. Maybe I and the others who reported lots of impala just happened to get a larger number of impala shots, and fewer of the others, in our mix.