Machine learning and citizen science…a winning combination!
* Sarah Huebner, who heads up the Snapshot Safari team, has written the following blog to give all participants of Snapshot Safari projects the lowdown on machine learning advances that are being introduced today*
In the era of Big Data, when equipment allows us to collect data faster than we can assess it, researchers are always looking for ways to enhance and accelerate the process between data collection and analysis. We here at Snapshot Safari are proud to have been the first camera trapping project partnering with citizen scientists on Zooniverse when we introduced Snapshot Serengeti, and to have expanded that model from one African park to dozens. Now we hope to improve the data pipeline once again by integrating machine learning to reduce the amount of volunteer effort required to classify data from our participating sites.
‘Machine learning’ refers to Artificial Intelligence algorithms that have been trained for a specific task or purpose. These algorithms are fed millions of images labeled with their correct names and are ‘trained’ to recognize those animals again in different settings. These models generate ‘predictions’ based on the training they’ve received and provide confidence levels to let us know how sure they are that each prediction is the correct label. Because Snapshot Serengeti has been running since 2010, it has generated millions of images over the years, which make a perfect training dataset for machine learning (ML) algorithms. We are employing ML models to drastically reduce the effort required to retire empty images (no animals present) and to retire images of common animals like wildebeest and zebra.
First, our ML models have become quite good at telling us whether animals are present or not. This helps us to more easily spot cameras where vegetation has grown in front of the lens, resulting in hundreds of pictures of grass blowing in the wind. Pretty, but not quite what we’re after, so we can eliminate those prior to upload. Secondly, we have modified the retirement rules on Snapshot projects (implemented starting today as new seasons are launched) so that only two volunteers need to confirm the computer’s prediction of ‘empty’. This means instead of 10 or even 20 people viewing those photos, only two people will see them and can push them out of the dataset quickly.
Those of you who have been working on this project for a while know that the wildlife you’re most likely to see are zebras and wildebeest, and you all are great at identifying those! Because those are easy identifications, they too will retire with fewer views than before. What this means practically is that you should see more images of rare and cryptic species like predators and fewer blank images. We have implemented a number of retirement rules behind the scenes to make this happen, based on varying confidence levels produced by the algorithm. For example, our simulations have proven that even at only 50% confidence, the computer is right 99.6% of the time when it tells us that an image is empty. Therefore, any ‘empty’ prediction with confidence of 50% or more will only need two human views to confirm that the computer is correct. Likewise, if the model tells us that it’s a human with a confidence level of 80% or higher, we will retire with just two confirmations.
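For readers curious how such rules might look in practice, here is a minimal sketch of the two thresholds mentioned above. It is an illustration only, not the actual Snapshot Safari or Zooniverse code: the names `CONFIDENCE_RULES`, `Prediction` and `should_retire` are made up for this example, and the default of 10 views is an assumption based on the "10 or even 20" figure quoted earlier.

```python
from dataclasses import dataclass

# Thresholds quoted in the post; the real rule set has more entries.
CONFIDENCE_RULES = {
    "empty": 0.50,  # 'empty' at 50% confidence or more
    "human": 0.80,  # 'human' at 80% confidence or more
}
FAST_TRACK_CONFIRMATIONS = 2  # two volunteers must agree with the model
DEFAULT_VIEWS = 10            # assumed fallback when no rule applies


@dataclass
class Prediction:
    label: str         # the model's predicted label
    confidence: float  # between 0.0 and 1.0


def should_retire(pred, volunteer_labels):
    """Return True once an image has been seen enough to leave the dataset."""
    threshold = CONFIDENCE_RULES.get(pred.label)
    fast_track = threshold is not None and pred.confidence >= threshold
    if fast_track:
        # Count the volunteers who confirm the model's prediction.
        confirmations = sum(1 for label in volunteer_labels if label == pred.label)
        if confirmations >= FAST_TRACK_CONFIRMATIONS:
            return True
    # Otherwise fall back to the ordinary number of views.
    return len(volunteer_labels) >= DEFAULT_VIEWS


if __name__ == "__main__":
    grass = Prediction(label="empty", confidence=0.62)
    print(should_retire(grass, ["empty", "empty"]))  # two confirmations
    print(should_retire(grass, ["empty", "zebra"]))  # volunteers disagree
```

In this sketch a volunteer who disagrees with the computer simply keeps the image in circulation, so a wrong ‘empty’ prediction still gets plenty of human eyes on it.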
We will continue to improve the algorithm’s capabilities by using our most valuable asset: all of you! We hope that you will be as interested as we are in advancing the use of ML to make the classifying process more fun and satisfying. The algorithm is pretty good at species, but now we need to improve its ability to count animals, so we will soon be introducing a special project, ‘Snapshot Focus’, which will feature images in which the algorithm has marked each animal with a bounding box. We will ask you to tell us whether the ML model got it right. Stay tuned for that and other special projects!
We are launching three new sites today: Camdeboo National Park, Kgalagadi Transfrontier Park, and De Hoop Nature Reserve, all in South Africa. These three projects have the new retirement rules in place, as will Season 12 of Snapshot Serengeti, which will launch in June. As new seasons or new projects come online, they will be set up with these rules and perhaps more as we refine the data pipeline. Let us and the moderators know how it goes. We are so thankful for your efforts and support, which help us to return data to our collaborators at reserves in Africa quickly and with confidence that it is correct, thanks to the combination of citizen science and machine learning. Happy classifying!
Research Manager, Snapshot Safari
May 28, 2019
For more information about the machine learning algorithms created using Snapshot Serengeti images, see:
Willi, Marco, Pitman, Ross Tyzack, Cardoso, Annabelle W., Locke, Christina, Swanson, Alexandra, Boyer, Amy, Veldthuis, Marten, and Fortson, Lucy. (2019) Identifying animal species in camera trap images using deep learning and citizen science. Methods in Ecology and Evolution 10(1):80-91.
Norouzzadeh, Mohammad Sadegh, Nguyen, Anh, Kosmala, Margaret, Swanson, Alexandra, Palmer, Meredith S., Packer, Craig, and Clune, Jeff. (2018) Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning. Proceedings of the National Academy of Sciences 115(25):E5716-E5725.
To read about how algorithms make decisions in comparison to humans, see:
Miao, Z., Gaynor, K.M., Wang, J., Liu, Z., Muellerklein, O., Norouzzadeh, M.S., McInturff, A., Bowie, R.C., Nathan, R., Yu, S.X. and Getz, W.M. (2018) A comparison of visual features used by humans and machines to classify wildlife. bioRxiv, p.450189.
Whilst we are waiting for the next season of Snapshot Serengeti images, I have been reviewing some of the amazing images that Season 11 turned up. I always have to remind myself that these cameras do not have some avid photographer sat behind them snapping away at the opportune moment but are activated merely by sensing a change in heat within their detection zone. It is truly amazing how often we get stunning images.
Here a beautiful male ostrich struts across the field of view showing off his amazingly pink legs. The bare parts of male ostrich are usually a pale grey to pink colour but during the breeding season hormones influence the pigmentation and a flush of red blazes through his neck and legs. Given the extent of these legs and neck contrasted with the bold black and white feathers it makes for an arresting sight. Compared with the drab browny grey of the female the male is a real show off.
What’s strange about this scenario is that in most bird species where the male changes feather or bare-part colour, it is the female alone that rears the chicks. The colourful male would perhaps attract too much predator attention around the vulnerable nest or chicks. In ostrich though, the male takes his share of sitting on eggs and looking after chicks. In fact he and his primary female will take turns incubating a clutch of eggs that typically includes both hers and other females’ eggs, so he will generally have more invested in the chicks as a direct parent than the female, who may only be ‘aunt’ to some of the chicks.
I guess the sheer size, power and speed of the ostrich, who is perfectly adapted to the open plains of the Serengeti, means he can afford the fancy show of pink legs if it means winning the ladies.
Wondering About African Wild Dogs
The Serengeti is renowned for being one of the few relatively intact large ecosystems remaining in the world. Sure, it isn’t without its problems; nowhere, not even protected areas (PAs), is exempt from the onslaught of human effects, whether directly on the ground or through climate change, but in general the health of the Serengeti is robust.
So why is it that in this well-balanced, large ecosystem we don’t ever see African wild dogs (Lycaon pictus) on the Snapshot Serengeti cameras? Of course, African wild dog numbers everywhere are low, even within PAs, but they are part of the carnivore guild alongside lion, leopard, hyena and cheetah in most of the big PAs like South Africa’s Kruger National Park and Botswana’s Okavango Delta. So why not the Serengeti National Park (SNP)?
Not so long ago wild dog were present in the SNP in small numbers. In 1970, when studies began, there were an estimated 95 individuals in 12 packs. They were studied sporadically until 1991, when all 12 packs had seemingly died or disappeared. Herein lies the mystery: what killed them?
At the time, the rapid disappearance of the wild dogs coincided with a renewed period of research that saw individuals from several packs immobilised and fitted with radio collars. The research community at the time were baffled, and a hypothesis was proposed by Roger Burrows that implicated researchers’ handling of the dogs as causal to their decline, the theory being that the stress imparted to the animals made them susceptible to rabies, which eventually killed them. It is a controversial hypothesis and has had the research community at each other’s throats for the last 25 years. Some argue that invasive handling of study animals is unethical and can lead to tragic outcomes (as hypothesised by Burrows for the African wild dog); others argue that collaring and taking blood samples from study animals is vital to understanding processes which affect conservation management.
A recent paper by Jackson et al., “No evidence of handling‐induced mortality in Serengeti’s African wild dog population”, an open access paper published in Ecology and Evolution, revisits the question and aims to shed new light on the argument with research.
I am not trying to weigh in on the argument, my expertise is inadequate for that, but I thought it was an interesting take on the question of wild dog in the Serengeti, and it used data generated by Snapshot Serengeti, albeit in a small way, to help with its argument.
Bordering the eastern side of the SNP are two PAs, the Ngorongoro Conservation Area (NCA, a multi-use area occupied by a large population of Maasai pastoralists) and Loliondo Game Controlled Area (LGCA, a multi-use area with some settlements, hunting and tourist concessions). One would argue that these areas, with a good dose of human impact, are not as pristine as SNP itself, but the wild dogs have been studied here since 2005 and as of 2017 there was an estimated population of 120 individuals in 10 packs.
Collaring has shown that the wild dogs do venture back to the Serengeti plains from time to time (proving there is no physical barrier to dispersal) but that they do not settle there, choosing instead to stay on the periphery of human habitation, in an environment where one would imagine it was harder to survive. Our own Snapshot Serengeti work comes into play here to show that, even with an extensive network of cameras that have been in operation for several years, no wild dog has been recorded in SNP.
So what do Jackson et al. imagine could be the root cause, if not the hypothesis that direct handling led to stress-related rabies outbreaks?
The team have studied wild dogs for over a decade in areas adjacent to SNP that arguably have an equal or higher rabies risk (think of the domestic dogs associated with people) than the SNP. They have used the same invasive methods of study as the SNP researchers, including using intervention to fit collars, take blood and, in one instance, attempt a relocation back to SNP. They believe that they have the perfect scenario in which to test the hypothesis.
They found, in contrast to the earlier study which saw the entire population disappear, that the 12-month survival rate in handled wild dogs was 87.6%, and that in a group of 67 wild dogs captured and translocated back to SNP, and held for almost a year in translocation enclosures (a long-term stress situation), 95.5% survived over 12 months. Incidentally, most of the relocated dogs did not stay in the SNP, returning to the adjoining PAs. One pack did remain, not in the former study area of grassland plains but rather in a rugged area just outside the original study area.
The team argues two points. First, the wild dogs in both NCA and LGCA have been subject to handling just as much as those in the original study, yet have shown a population increase. Second, there has been no repopulation of the original study area, either naturally through dispersal or through attempted reintroduction, something that arguably should have happened if the only reason for the die-off was human-induced.
Instead they believe that the demise of the wild dog coincided with a rise in lion and hyena numbers on the plains of the Serengeti. Wild dog are killed by these predators but, perhaps more importantly, they also have their hard-earned prey stolen from them by larger predators. Their theory is that the competition from increasing lion and hyena numbers, as well as outbreaks of rabies and canine distemper, saw the death of some packs and the dispersal of the rest into the adjoining PAs. Compared with the endless grassy plains of the Serengeti, the NCA and LGCA are much more varied terrain, with a mixture of hilly, rocky areas as well as open woodland and open grassland. This kind of mosaic gives wild dogs a much better chance of avoiding larger, more dominant predators and so their chance of survival is greater.
So could this be why we see no wild dog in our camera traps? Whatever the reason, it highlights that even with what we imagine to be well-protected areas, the space we have left for wildlife is minimal, and to protect a wide range of biodiversity we should be doing more to protect a wide range of ecosystems and habitats.
If you want to read the Jackson paper you can find it here: https://onlinelibrary.wiley.com/doi/abs/10.1002/ece3.4798
There are links within the paper to the original Burrows work so you can get a feel for both sides of the story.