Last week I posted an animated GIF of hourly carnivore sightings. To clarify, the map showed patterns of temporal activity across all days over the last 3 years — so the map at 9am shows sites where lions, leopards, cheetahs, and hyenas like to be in general at that time of day (not on any one specific day).
These maps here actually show where the carnivores are on consecutive days and months (the dates are printed across the top). [For whatever reason, the embedded .GIFs hate me; click on the map to open in a new tab and see the animation!]
Keep in mind that in the early days (June-Sept 2010) we didn’t have a whole lot of cameras on the ground, and that the cameras were taken down from Nov 2010-Feb 2011 (so that’s why those maps are empty).
The day-by-day map is pretty sparse, and in fact looks pretty random. The take-home message is that lions, hyenas, cheetahs, and leopards are all *around*, but the chances of one walking past a camera on any given day are kinda low. I’m still trying to find a pattern in the monthly distributions below.
So this is what I’ve been staring at in my turkey-induced post-Thanksgiving coma. Could be worse!
Truth be told, I *have* been working on data analysis from the start. It’s actually one of my favorite parts of research — piecing together the story from all the different puzzle pieces that have been collected over the years.
But right now I am knee-deep in taking a closer look at the camera trap data. Since we have *so* many cameras taking pictures every day, I want to look at where the animals are not just overall, but from day to day and hour to hour. I’m not 100% sure what analytical approaches are out there, but my first step is to simply visualize the data. What does it look like?
So I’ve started making animations within the statistical programming software R. Here’s one of my first ones (stay tuned over the holidays for more). Each frame represents a different hour on the 24-hour clock: 0 is midnight, 12 is noon, 23 is 11pm, and so on. Each dot is sized proportionally to the number of captures of that species at that site at that time of day. The dots are set to be a little transparent so you can see when sites are hotspots for multiple species. [*note: if the .gif isn't animating for you in the blog, try clicking on it so it opens in a new tab.]
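For the curious, the recipe is roughly the following. This is just a sketch, not my exact script; the `captures` data frame and its column names are stand-ins for the real data.

```r
## Rough sketch: one map frame per hour of day, dot size scaled to the
## number of captures, semi-transparent colors so overlaps show through.
## Assumes a data frame `captures` with columns site, lon, lat, species,
## and hour (illustrative names, not the project's actual schema).

species.list <- c("lion", "leopard", "cheetah", "hyena")
cols <- c(rgb(1, 0, 0, 0.4), rgb(0, 0, 1, 0.4),
          rgb(0, 0.6, 0, 0.4), rgb(0.6, 0, 0.6, 0.4))  # semi-transparent

for (h in 0:23) {
  png(sprintf("frame_%02d.png", h))
  # Empty plot sets up the full spatial extent for this frame.
  plot(captures$lon, captures$lat, type = "n",
       xlab = "Longitude", ylab = "Latitude",
       main = sprintf("Hour %02d:00", h))
  for (i in seq_along(species.list)) {
    sub <- captures[captures$hour == h & captures$species == species.list[i], ]
    n <- table(sub$site)                    # captures per site at this hour
    xy <- sub[match(names(n), sub$site), ]  # one lon/lat per site
    # sqrt() so the dot *area* is proportional to the number of captures.
    points(xy$lon, xy$lat, pch = 16, col = cols[i], cex = sqrt(as.numeric(n)))
  }
  dev.off()
}
# The frames can then be stitched into a GIF, e.g. with ImageMagick:
#   convert -delay 50 frame_*.png hourly.gif
```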
Deep breath; I promise it will be okay.
By now, many of you have probably seen the one image that haunts your dreams: the backlit photo of the towering acacia that makes the wildebeest in front look tiny, with those two terrible words in big white print across the front, “We’re Done!” Now what are you going to do when you drink your morning coffee?? Need a break from staring at spreadsheets?? Are you in desperate need of an African animal fix?? Trust me, I know the feeling.
Deep breath. (And skip to the end if you can’t wait another minute to find out when you can ID Snapshot Serengeti animals again.)
I have to admit that as a scientist using the Snapshot Serengeti data, I’m pretty stoked that Seasons 5 and 6 are done. I’ve been anxiously watching the progress bars inch along, hoping that they’d be done in time for me to incorporate them in my dissertation analyses that I’m finally starting to hash out. Silly me for worrying. You, our Snapshot Serengeti community, have consistently awed us with how quickly you have waded through our mountains of pictures. Remember when we first launched? We put up Seasons 1-3 and thought we’d have a month or so to wait. In three days we were scrambling to put up Season 4. This is not usually the problem that scientists with big datasets have!
Now that Seasons 5 and 6 are done, we’ll download all of the classifications for every single capture event and try to make sense of them using the algorithms that Margaret’s written about here and here. We’ll also need to do a lot of data “cleaning”: fixing errors in the database. Our biggest worry is incorrect timestamps; for whatever reason, when a camera trap gets injured, the timestamps are the first things to malfunction (usually shuttling back to 1970 or jumping ahead to a futuristic 2029). Timestamps matter to us for two reasons. First, we care about when animals are at different sites, so knowing the time is important. Second, many of the cameras are rendered non-functional for various reasons, meaning that sometimes a site isn’t taking pictures for days or even weeks. To properly analyze the data, we need to line up the number of animal captures with the record of camera activity, so we know that a record of 0 lions for the week really means 0 lions, and not just that the camera was face down in the mud.
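To give a flavor of what that cleaning looks like, here is a rough sketch in R. The `events` data frame, its columns, and the study window dates are invented for illustration, not our actual database schema.

```r
## Sketch of the timestamp checks described above (all names and dates
## are placeholders).

study.start <- as.POSIXct("2010-06-01", tz = "UTC")
study.end   <- as.POSIXct("2013-12-01", tz = "UTC")

# Flag the classic failure modes: clocks reset to the 1970 epoch or
# jumped ahead to 2029, plus anything else outside the study window.
bad <- events$timestamp < study.start | events$timestamp > study.end
cat(sum(bad), "of", nrow(events), "records have suspect timestamps\n")

# For the "0 lions really means 0 lions" problem: tally which days each
# camera was actually taking pictures, so a gap in the animal record can
# be separated from a gap in camera activity.
ok <- !bad
active.days <- aggregate(
  list(n.photos = events$timestamp[ok]),
  by = list(site = events$site[ok], day = as.Date(events$timestamp[ok])),
  FUN = length
)
```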
So, we now have a lot of work in front of us. But what about you? First, Season 7 will be on its way soon, and we hope to have it online in early 2014. But that’s so far away! Yes, so in the meanwhile, the Zooniverse team will be “un-retiring” images like they’ve done in previous seasons. This means that we’ll be collecting more classifications on photos that have already been boxed away as “done.” Especially for the really tricky images, this can help us refine the algorithms that turn your classifications into a “correct answer.”
But there are also a whole bunch of awesome new Zooniverse projects out there that we’d encourage you to try in the meanwhile. For example, this fall, Zooniverse launched Plankton Portal, which takes you on a whole different kind of safari. Instead of identifying different gazelles by the white patches on their bums, you identify different species of plankton by their shapes. Although plankton are small, they have big impacts on the system — as the Plankton Portal scientists point out on their new site, “No plankton = No life in the ocean.”
Wherever you choose to spend your time, know that all of us on the science teams are incredibly grateful for your help. We couldn’t do this without you.
Last week, william garner asked me in the comments to my post ‘Better with experience’ how well the experts did on the roughly 4,000 images that I’ve been using as the expert-identified data set. How do we know that those expert identifications are correct?
Here’s how I put together that expert data set. I asked a set of experts to classify images on snapshotserengeti.org, just like you do, but I asked them to keep track of how many they had done and any that they found particularly difficult. When I had reports back that we had 4,000 done, I told them that they could stop. Since the experts were reporting back at different times, we actually ended up doing more than 4,000. In fact, we’d done 4,149 sets of images (captures), and we had 4,428 total classifications of those 4,149 captures; the difference is because some experts classified the same captures.
Once I had those expert classifications, I compared them with the majority algorithm. (I hadn’t yet figured out the plurality algorithm.) Then I marked (1) those captures where experts and the algorithm disagreed, and (2) those captures that experts had said were particularly tricky. I went through these marked captures to catch any obvious blunders. For example, in one expert-classified capture, the expert classified the otherBirds in the images, but forgot to classify the giraffe the birds were on! The rest of the marked images I sent to Ali to look at. I didn’t tell her what the expert had marked or what the algorithm said; I just asked her to give me a new classification. If Ali’s classification matched either the algorithm or the expert, I set hers as the official classification. If it didn’t, then she, Craig, and I examined the capture further together; there were very few of these.
And that is how I came up with the expert data set. I went back this week to tally how the experts did on their first attempt versus the final expert data set. Out of the 4,428 classifications, 30 were marked as ‘impossible’ by Ali, 1 was the duiker (which the experts couldn’t get right by using the website), and 101 mistakes were made. That makes for a 97.7% rate of success for the experts. (If you look at last week’s graph, you can see that some of you qualify as experts too!)
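For anyone who wants to check the arithmetic, either way you count the denominator the success rate rounds to the same figure:

```r
# Either reading of the denominator rounds to 97.7%:
(4428 - 101) / 4428                       # mistakes against all classifications
(4428 - 30 - 1 - 101) / (4428 - 30 - 1)   # against the 4,397 "gettable" ones
```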
Okay, and what did the experts get wrong? About 30% of the mistakes were what I call wildebeest-zebra errors. That is, there are wildebeest and zebra, but someone just marks the wildebeest. Or there are only zebra, and someone marks both wildebeest and zebra. Many of the wildebeest and zebra herd pictures are plain difficult to figure out, especially if animals are in the distance. Another 10% of the mistakes were otherBird errors — either someone marked an otherBird when there wasn’t really one there, or (more commonly) forgot to note an otherBird. About 10% of the time, experts listed an extra animal that wasn’t there. And another 10% of the time, they missed an animal that was there. Some of these were obvious blunders, like missing a giraffe or eland; other times it was more subtle, like a bird or rodent hidden in the grass.
The other 40% of the mistakes were mis-identifications of the species. I didn’t find any obvious patterns in the mistakes; here are the mis-identified species, how often they were missed, and what they were mistaken for:
| species | mistakes | mistaken for |
|---|---|---|
| wildebeest | 6 | buffalo, hartebeest, elephant, lionFemale |
| hartebeest | 5 | gazelleThomsons, impala, topi, lionFemale |
| gazelleGrants | 4 | impala, gazelleThomsons, hartebeest |
| reedbuck | 3 | dikDik, gazelleThomsons, impala |
Does experience help with identifying Snapshot Serengeti images? I’ve started an analysis to find out.
I’m using the set of about 4,000 expert-classified images for this analysis. I’ve selected all the classifications that were done by logged-in volunteers on the images that had just one species in them. (It’s easier to work with images with just one species.) And I’ve thrown out all the images that experts said were “impossible.” That leaves me with 68,535 classifications for 4,084 images done by 5,096 different logged-in volunteers.
I’ve counted the total number of classifications each volunteer has done and given each volunteer a score: the percentage of those classifications that match the expert answer. Then I’ve averaged the scores within each group of volunteers who did the same number of classifications.
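In R, the calculation looks roughly like this (a sketch only; the `cls` data frame, with a volunteer `user` ID and a logical `correct` flag per classification, is a stand-in for my actual tables):

```r
## Sketch of the per-volunteer scoring and grouping described above.

n.per.user     <- tapply(cls$correct, cls$user, length)  # classifications done
score.per.user <- tapply(cls$correct, cls$user, mean)    # fraction correct

# Average score within each group of volunteers who did the same
# number of classifications.
group.score <- tapply(score.per.user, n.per.user, mean)

# Plot against a log-scaled x-axis, as in the figure below.
plot(as.numeric(names(group.score)), 100 * group.score, log = "x",
     xlab = "Number of classifications (log scale)",
     ylab = "Average score (%)")
```

And here are the results: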
Here we have the number of classifications done on the bottom. Note that the scale is a log scale, which means that higher numbers get grouped closer together. We do this so we can more easily look at all the data on one graph. Also, we expect someone to improve more quickly with each additional classification at lower numbers of classifications.
On the left, we have the average score for each group of volunteers who did that many classifications. So, for example, the group of people who did just one classification in our set had an average score of 78.4% (black square on the graph). The group of people who did two classifications had an average score of 78.5%, and the group of people who did three classifications had an average score of 81.6%.
Overall, the five thousand volunteers got an average score of 88.6% correct (orange dotted line). Not bad, but it’s worth noting that it’s quite a bit lower than the 96.6% that we get if we pool individuals’ answers together with the plurality algorithm.
And we see that, indeed, volunteers who did more classifications tended to get a higher percentage of them correct (blue line). But there’s quite a lot of individual variation. You can see that despite doing 512 classifications in our set, one user had a score of only 81.4% (purple circle). This is a rate of success similar to what you might expect for someone doing just 4 classifications! Similarly, it wasn’t the most prolific volunteer who scored the best; instead, the volunteer who did just 96 classifications got 95 correct, for a score of 99.0% (blue circle).
We have to be careful, though, because this set of images was drawn randomly from Season 4, and someone who has just one classification in our set could have already classified hundreds of images before this one. Counting the number of classifications done before the ones in this set will be my task for next time. Then I’ll be able to give a better sense of how the total number of classifications done on Snapshot Serengeti is related to how correct volunteers are. And that will give us a sense of whether people learn to identify animals better as they go along.
I’ve been working on a federal grant application the last couple of weeks. It’s left me feeling a bit like this:
The grant was originally due this upcoming Thursday, but with the government shutdown showing no signs of ending, who knows what will happen? The National Science Foundation’s website is unavailable during the furlough, meaning that nobody can submit applications. So we’ve all been granted an unexpected extension, but we’re not sure until when.
The grant I’m applying for is called the Doctoral Dissertation Improvement Grant. It’s an opportunity for Ph.D. students to acquire funding to add on a piece to their dissertation that they wouldn’t otherwise be able to do. I’m applying for funds to go down to South Africa and work with a couple of folks from the conservation organization Panthera to collate data from two sites with long-term carnivore research projects. Their research team currently has camera surveys laid out in two reserves in KwaZulu-Natal, South Africa: Phinda Private Game Reserve and Mkhuze Game Reserve. Now, the cool thing about these reserves is that they are small, fenced, and pretty much identical to each other…except that lions have been deliberately excluded from Mkhuze.
Now, one of the biggest frustrations of working with large carnivores is that I can’t experimentally isolate the processes I’m studying. If I want to know how lions affect the ranging patterns and demography of hyenas, well, I should take out all the lions from a system and see what happens to the hyenas. For obvious reasons, this is never going to happen. But the set-up in Phinda and Mkhuze is the next best thing: by holding everything else constant – habitat, prey – I can actually assess the effect of lions on the ranging and dynamics of hyenas, cheetahs, and leopards by comparing the two reserves.
So, that’s what I’m working on non-stop until whenever it turns out to be due. Because this would be a really cool grant to get. I’m currently working on analyzing some of the camera trap data from Seasons 1-4 and hope to share some of the results with you next week. Until then, I’m going to continue to be a bit of a zombie.
From last week’s post, we know that we can identify images that are particularly difficult using information about classification evenness and the fraction of “nothing here” votes cast. However, the algorithm (and really, all of you volunteers) gets the right answer even on hard images most of the time. So we don’t necessarily want to just throw out those difficult images. But could we afford to?
Let’s think about two classes of species: (1) the common herbivores and (2) carnivores. We want to understand the relationship between the migratory and non-migratory herbivores. And Ali is researching carnivore coexistence. So these are important classes to get right.
First the herbivores. Here’s a table showing the most common herbivores and our algorithm’s results based on the expert-classified data of about 4,000 images. “Total” is the total number of images that our algorithm classified as that species, and “correct” is the number of those that our experts agreed with.
We see that we do quite well on the common herbivores. Perhaps we’d wish for Thomsons gazelles to be a bit higher (Grants gazelles are most commonly mis-classified as Thomsons), but these results look pretty good.
If we wanted to be conservative about our estimates of species ranges, we could throw out some of the images with high Pielou scores. Let’s say we threw out the 10% most questionable wildebeest images. Here’s how we would score. (Note that I didn’t do the zebra, since they’d be at 100% again, no matter how many we dropped.) The columns are the same as the above table, except this time, I’ve listed the threshold Pielou score used to throw out 10% of the images of that species.
| species | Pielou cutoff | total | correct | % correct |
|---|---|---|---|---|
We do quite a bit better with our Thomsons gazelle and increase the accuracy of all the other species at least a little. But do we sacrifice anything throwing out data like that? If wildebeest make up a third of our images and we have a million images, then we’re throwing away 33,000 images(!), but we still have another 300,000 left to do our analyses. One thing we will look at in the future is how much dropping the most questionable images affects estimates of species ranges. I’m guessing that for wildebeest it won’t be much.
What if we did the same thing for Thomsons gazelle or impala? We would expect about 50,000 images of each of those per million images. Throwing out 5,000 images still leaves us with 45,000, which seems like it might be enough for many analyses.
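Mechanically, applying such a cutoff is simple. Here’s a sketch, where the `answers` data frame (one row per capture, with the plurality `species` answer and that capture’s `pielou` score) stands in for the algorithm’s real output:

```r
## Sketch of the "drop the 10% most questionable images" rule for one
## species. Data frame and column names are illustrative.

drop.most.uncertain <- function(answers, sp, frac = 0.10) {
  rows <- answers$species == sp
  # The Pielou score above which a capture falls in the top `frac`
  # most questionable images of this species.
  cutoff <- quantile(answers$pielou[rows], probs = 1 - frac)
  # Keep everything except this species' most questionable captures.
  answers[!(rows & answers$pielou > cutoff), ]
}

trimmed <- drop.most.uncertain(answers, "wildebeest")
```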
Now let’s look at the carnivore classifications from the expert-validated data set:
Wow! You guys sure know your carnivores. The two wrong answers were the supposed bat-eared fox that was really a jackal and the supposed striped hyena that was really an aardwolf. These two wrong answers had high Pielou scores: 0.77 and 0.83 respectively.
Judging by this data set, about 2.5% of all images are carnivores, which gives us about 25,000 carnivore images for every million we collect. That’s a lot of great data on these relatively rare animals! But it’s not so much that we want to throw any of it away. Fortunately, we won’t have to. We can use the Pielou score to have an expert look at the most difficult images.
Let’s say Ali wants to be very confident of her data. She can choose the 20% most difficult carnivore images — which is only about 5,000 per million images, and she can go through them herself. Five thousand images is nothing to sneeze at, of course, but the work can be done in a single day of intense effort.
In summary, we might be able to throw out some of the more difficult images (based on Pielou score) for the common herbivores without losing much coverage in our data. Further analyses are needed, though, to see if doing so is worthwhile and whether we lose anything by throwing out so many correct answers. For carnivores, the difficult images can be narrowed down sufficiently that an expert can double-check them by hand.
Back in June, I wrote about algorithms I was working on to take the volunteer data and spit out the “correct” classification for each image. First, I made a simple majority-rules algorithm and compared its results to several thousand classifications done by experts. Then, when the algorithm came up with no answer for some of the images (because there were no answers in the majority), I tried a plurality algorithm, which just looked to see which species got the most votes, even if it didn’t get more than half the votes. It worked well, so I’m using the plurality algorithm going forward.
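To make the difference between the two rules concrete, here’s a sketch of each for a single capture’s votes (my shorthand reconstruction, not the actual pipeline code):

```r
## Given a named vector of vote counts for one capture,
## e.g. votes <- c(wildebeest = 7, zebra = 3).

majority.rule <- function(votes) {
  top <- which.max(votes)
  # Only answer if one species got more than half of all votes.
  if (votes[top] > sum(votes) / 2) names(votes)[top] else NA
}

plurality.rule <- function(votes) {
  # Answer with whichever species got the most votes, full stop.
  names(votes)[which.max(votes)]
}

votes <- c(wildebeest = 4, zebra = 3, buffalo = 3)
majority.rule(votes)   # NA: 4 of 10 votes is not a majority
plurality.rule(votes)  # "wildebeest"
```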
One of the things I’ve been curious about is whether we can detect when particular images are “hard.” You know what I mean by hard: animals smack up in front of the camera lens, animals way back on the horizon, animals with just a tip of the ear or a tuft of tail peeking onto the image from one side, animals obscured by trees or the dark of night.
So how can we judge “hard”? One way is to look at the “evenness” of the volunteer votes. Luckily, in ecology, we deal with evenness a lot. We frequently want to know what species are present in a given area. But we also want to know more than that. We want to know if some species are very dominant in that area or if species are fairly evenly distributed. For example, in a famous agricultural ecology paper*, Cornell entomologist Richard Root found that insect herbivore (pest) species on collard greens were less even on collards grown in a big plot with only other collards around than on those grown in a row surrounded by meadow plants. In other words, the insect community in the big plot was skewed toward many individuals of just a few species, whereas the meadow rows had a lot more species with fewer individuals of each.
We can adopt a species evenness metric called “Pielou’s evenness index” (which, for you information theorists, is closely related to Shannon entropy).
[An aside: I was surprised to learn that this index is named for a woman: Dr. Evelyn Chrystalla Pielou. Upon reflection, this is the first time in my 22 years of formal education (in math, computer science, and ecology) that I have come across a mathematical term named for a woman. Jacqueline Gill, who writes a great paleo-ecology blog, has a nice piece honoring Dr. Pielou and her accomplishments.]
Okay, back to the Pielou index: we can use it to judge how even the votes are. If all the votes are for the same species, we can have high confidence. But if we have 3 votes for elephant and 3 votes for rhino and 3 votes for wildebeest and 3 votes for hippo, then we have very low confidence. The way the Pielou index works out, a 0 means all the votes are for the same species (high skew, high confidence) and a 1 means there are at least two species and they all got the same number of votes (high evenness, low confidence). Numbers in between 0 and 1 are somewhere between highly skewed (e.g. 0.2) and really even (e.g. 0.9).
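For the formula-minded, the index is just the Shannon entropy of the vote proportions divided by its maximum possible value, log(S), where S is the number of species receiving votes. A small sketch:

```r
## Pielou's evenness index for one capture's votes.
## 0 = unanimous (high skew, high confidence);
## 1 = at least two species with perfectly even votes (low confidence).

pielou <- function(votes) {
  votes <- votes[votes > 0]
  if (length(votes) < 2) return(0)  # unanimous: maximally skewed
  p <- votes / sum(votes)
  -sum(p * log(p)) / log(length(votes))
}

pielou(c(10))          # 0: all votes for one species
pielou(c(3, 3, 3, 3))  # 1: the elephant/rhino/wildebeest/hippo example
pielou(c(8, 2))        # ~0.72: skewed toward one species
```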
Another way we could measure the difficulty of an image is to look at how many people click “nothing here.” I don’t like it, but I suspect that some people use “nothing here” as an “I don’t know” button. Alternatively, if animals are really far away, “nothing here” is a reasonable choice. We might assume that the percentage of “nothing here” votes correlates with the difficulty of the image.
I calculated the Pielou evenness index (after excluding “nothing here” votes) and the fraction of “nothing here” votes for the single-species images that were classified by experts. And then I plotted them. Here I have the Pielou index on the x-axis and the fraction of “nothing here” votes on the y-axis. The small pink dots are the 3,775 images that the algorithm and the experts agreed on, the big blue dots are the 84 images that the plurality algorithm got wrong, and the open circles are the 29 images that the experts marked as “impossible.” (Click to enlarge.)
And sure enough, we see that the images the algorithm got wrong had relatively high Pielou scores. And the images that were “impossible” had either high Pielou scores or a high fraction of “nothing here” votes (or both). I checked out the four anomalies over on the left with a Pielou score of zero. All four were unanimously voted as wildebeest. For the three “impossibles,” both Ali and I agree that wildebeest is a reasonable answer. But Ali contends that the image the algorithm got wrong is almost certainly a buffalo. (It IS a hard image, though — right up near the camera, and at night.)
So we do seem to be able to get an idea of which images are hardest. But note that there are a lot more correct answers with high Pielou scores and high “nothing here” fractions than errors or “impossibles”. We don’t want to throw out good data, so we can’t just ignore the high-scorers. But we can attach a measure of certainty to each of our algorithm’s answers.
I did some analysis of what happened during our recent crowdfunding campaign that I want to share.
But first, I want to note a couple things that do not appear in the numbers. Several Snapshot Serengeti fans told us that they wanted to donate, but could not. This was because the Indiegogo site is set up to only accept major credit cards, and will not accept PayPal payments for campaigns (like ours) that send money to non-profits. As many folks outside the U.S. do not have credit cards, this was a barrier. We were unfortunately unable to figure out an alternative method of donation during the campaign, and as a result, a number of you couldn’t donate (or get the perks). And we’re sorry for that; lesson learned.
There were also several people who found our campaign after it had ended. They, too, couldn’t donate via the Indiegogo site. But we do have a webpage set up to accept any future gifts you might like to make. Donations go directly to the University of Minnesota Foundation; simply write “Snapshot Serengeti” in the text field after checking the “yes” box, and we’ll get the money.
And now for the numbers.
We raised $36,324 on Indiegogo, with 701 donors contributing. Of these 701, about a quarter (187) provided a mailing address for their perks. These 187 donors hail from 18 countries and 155 cities, most of them in the United States. We had donations from 27 U.S. states plus the District of Columbia. Here are some maps of where they are. Don’t forget that this is just a quarter of our generous donors! (And also, I just fell in love with MultiPlottr. It took me about a minute to make these maps.)
When we look over time, we see that some interesting things happened. This graph looks a bit busy, but hang in there while I explain it. (And many thanks to Rob Simpson, who helped put this graph together.) Click for a larger version.
Along the bottom, we have the days of the campaign, starting on July 14 and running to August 9. The dots along the lines mark individual days. The green line shows the total dollar amount raised that day, and you can see the dollar values on the left side of the graph. The red line shows the total number of (unique) visitors to the Snapshot Serengeti site on that day. And I got Indiegogo to send me the data on visitors to the Indiegogo webpage; that’s the yellow line. You can see the values for the red and yellow lines on the right side of the graph.
So what do we see? Here are some things I’ve noticed; maybe you can point out some others. Our initial donations came from our hardcore supporters in the first few days – those of you who read this blog or have liked us on Facebook. The National Geographic article that came out the same day was cool, but didn’t have any links to our campaign for several days.
On July 26, we sent out a newsletter to everyone who’s ever worked on Snapshot Serengeti, announcing Season 6 and our crowdfunding campaign. You can see an immediate uptick in donations (green line) that remains elevated for the next few days. Traffic to Snapshot Serengeti (red line) also increases, but not by a lot. On July 29, we got some coverage at KSTP, our favorite local Minnesota TV station, which may have contributed a little.
But the big event was the following day, when the Zooniverse sent out a newsletter to all of its users. You can see the impact. Lots of folks rushed over to check out Snapshot Serengeti (red line), some of them also checked out the Indiegogo page (yellow line) and 107 of them contributed $4,700 (green line).
All that activity helped secure us a spot on Indiegogo’s front page on August 1. You can see that for the next few days, visits to our Indiegogo site (yellow line) and funding (green line) increased, while attention over at Snapshot Serengeti itself waned. This suggests that being on Indiegogo’s front page was useful and helped keep donations flowing.
On August 4, National Geographic gave us some more great coverage, this time with appropriate links to our campaign. And on August 5, a piece I wrote about the campaign was published on a blog that is frequented by scientists interested in crowdfunding. Both things appeared to give us a boost.
On August 6, we marked down the damaged camera traps, and had a bunch of takers. We also had coverage on BoingBoing the same day, but they initially linked to our About page, so I’m not sure how much impact that had on our fundraising.
It was August 8, our second-to-last day, that sealed the deal. We made Indiegogo’s top “Last Chance” projects. And the Zooniverse sent out another newsletter asking for help getting us the last part of the way to our goal. And the Zoonites (Zoonorians? Zooners?) responded. That day, 219 people gave almost $10,000!
Thanks again to everyone who contributed and also to those who tried but weren’t able.