Plurality algorithm

On Wednesday, I wrote about how well the simple algorithm I came up with does against the experts. The algorithm looks for species that have more than 50% of the votes in a given capture (i.e. species that have a majority). Commenter Tor suggested that I try looking at which species have the most votes, regardless of whether they cross the 50% mark (i.e. a plurality). It’s a great idea, and easy to implement because any species that has more than 50% of the vote ALSO has the plurality. Which means all I have to do is look at the handful of captures that the majority algorithm had no answer for.

You can see why it might be a good idea in this example. Say that for a particular capture, you had these votes:

10	impala
4	gazelleThomsons
4	dikDik
3	bushbuck

You’d have 21 votes total, but the leading candidate, impala, would be just shy of the 11 needed to have a majority. It really does seem like impala is the likely candidate here, but my majority algorithm would come up with “no answer” for this capture.
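To make the difference concrete, here is a minimal Python sketch of the two rules applied to the votes above. The function names `majority_winner` and `plurality_winners` are made up for illustration, and the sketch assumes each vote is simply a species name; it is not the project's actual code.

```python
from collections import Counter

def majority_winner(votes):
    """Return the species with more than 50% of the votes, or None."""
    if not votes:
        return None
    counts = Counter(votes)
    total = sum(counts.values())
    species, top = counts.most_common(1)[0]
    # Strictly more than half: 10 of 21 does not qualify, 11 of 21 would.
    return species if top * 2 > total else None

def plurality_winners(votes):
    """Return every species sharing the highest vote count (one item if no tie)."""
    if not votes:
        return []
    counts = Counter(votes)
    top = max(counts.values())
    return sorted(s for s, n in counts.items() if n == top)

# The example capture from the post: 21 votes in total.
votes = (["impala"] * 10 + ["gazelleThomsons"] * 4
         + ["dikDik"] * 4 + ["bushbuck"] * 3)

print(majority_winner(votes))    # None, because 10 of 21 is not more than half
print(plurality_winners(votes))  # ['impala']
```

Returning a list from the plurality function is a design choice for this sketch: it leaves room for captures where two or more species share the top count.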

So I tried out Tor’s plurality algorithm. The good news is that 57% of those “no answers” got the correct answer with the plurality algorithm. So that brings our correct percentage from 95.8% to 96.6%. Not bad! Here’s how that other 3.4% shakes out:

So now we have a few more errors. (About a quarter of the “no answers” were errors when the plurality algorithm was applied.) And we’ve got a new category called “Ties”. When you look for a plurality that isn’t over 50%, there can be ties. And there were. Five of them. And in every case the right answer was one of the two that tied.
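As a rough sketch of how those categories could be tallied, the Python below labels a capture as correct, error, or tie by comparing the plurality result with an expert's answer. The function name `score_capture` and the three sample captures are invented for illustration; they are not real Snapshot Serengeti data, and this is not necessarily how the comparison was actually run.

```python
from collections import Counter

def score_capture(votes, expert_species):
    """Label one capture 'correct', 'error', or 'tie' under the plurality rule."""
    counts = Counter(votes)
    top = max(counts.values())
    leaders = [s for s, n in counts.items() if n == top]
    if len(leaders) > 1:
        # Two or more species share the top count, so there is no single answer.
        return "tie"
    return "correct" if leaders[0] == expert_species else "error"

# Made-up captures, each paired with a hypothetical expert answer.
examples = [
    (["impala"] * 10 + ["gazelleThomsons"] * 4
     + ["dikDik"] * 4 + ["bushbuck"] * 3, "impala"),
    (["zebra"] * 3 + ["wildebeest"] * 3 + ["impala"] * 1, "wildebeest"),
    (["dikDik"] * 4 + ["bushbuck"] * 3 + ["impala"] * 3, "bushbuck"),
]

tally = Counter(score_capture(v, e) for v, e in examples)
print(dict(tally))
```

In the second made-up capture the expert's answer is one of the two tied species, which is the situation described above for all five real ties.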

And now, because it’s Friday, a few images I’ve stumbled upon so far in Season 5. What will you find?

Tags: algorithm, error, majority, plurality

About Margaret Kosmala

I am an ecologist exploring the complex dynamics of plant and animal systems. I am especially interested in understanding how species communities change over time and how humans impact them.


william garner says : June 9, 2013 at 2:30 pm
How do you know when the algorithm is ‘good enough’? Or, what are the criteria for success? Having a few percentage points difference in these judgments seems logical given the difficulty of correctly identifying some of these images.

Is there some way to go back to a previous image & correct errors? It’s happened several times that I accidentally hit the finish key before identifying a separate species.

- Margaret Kosmala says : June 10, 2013 at 12:54 am
  Hmm… that’s a good question. We’ve got to think about scaling up. So if we have a 4% error rate and a million captures, then that means that we have 40,000 errors. So I’d like to see if we can get much smaller, under 1%. We also have to take into consideration what the errors are. If they’re because zebra and wildebeest are being excluded here and there, then that’s not a big deal. But if we have a high percentage of errors in, say, leopard images, that’s a problem because we don’t have that many leopard images in total.
  
  No, there’s no way to go back and correct errors, I’m afraid. But we may think about putting borderline images back in for additional classifications if we think that will help. We’ll see…
  

