On Wednesday, I wrote about how well the simple algorithm I came up with does against the experts. The algorithm looks for species that have more than 50% of the votes in a given capture (i.e. species that have a majority). Commenter Tor suggested that I try looking at which species have the most votes, regardless of whether they cross the 50% mark (i.e. a plurality). It’s a great idea, and easy to implement because any species that has more than 50% of the vote ALSO has the plurality. Which means all I have to do is look at the handful of captures that the majority algorithm had no answer for.
You can see why it might be a good idea in this example. Say that for a particular capture, you had these votes:
You’d have 21 votes total, but the leading candidate, impala, would be just shy of the 11 needed to have a majority. It really does seem like impala is the likely candidate here, but my majority algorithm would come up with “no answer” for this capture.
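For the curious, here is a minimal Python sketch of the majority rule applied to a capture like this one. It is an illustration, not the actual code behind these numbers. The function name `majority_species` and the vote tally are made up: the tally is a hypothetical one that simply matches the description above (21 votes, with impala one short of the 11 needed).

```python
from collections import Counter


def majority_species(votes):
    """Return the species with more than 50% of the votes, or None."""
    counts = Counter(votes)
    total = sum(counts.values())
    if total == 0:
        return None
    species, top = counts.most_common(1)[0]
    # Strict majority: the top count must exceed half of all votes.
    return species if 2 * top > total else None


# Hypothetical tally (not real data): 21 votes, impala leads with 10.
example_votes = (
    ["impala"] * 10 + ["gazelleThomsons"] * 6 + ["gazelleGrants"] * 5
)
print(majority_species(example_votes))  # None, i.e. "no answer"
```

With 10 of 21 votes, impala falls below the 50% mark, so the rule returns nothing even though impala is clearly in the lead.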
So I tried out Tor’s plurality algorithm. The good news is that 57% of those “no answers” got the correct answer with the plurality algorithm. So that brings our correct percentage from 95.8% to 96.6%. Not bad! Here’s how that other 3.4% shakes out:
So now we have a few more errors. (About a quarter of the “no answers” were errors when the plurality algorithm was applied.) And we’ve got a new category called “Ties”. When you look for a plurality that isn’t over 50%, there can be ties. And there were. Five of them. And in every case the right answer was one of the two that tied.
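A plurality rule with explicit tie handling might look something like the sketch below. Again, this is only an illustration under my own assumptions: the name `plurality_species` and the sample votes are hypothetical, and returning every tied species is just one way to handle ties.

```python
from collections import Counter


def plurality_species(votes):
    """Return the species with the most votes as a sorted list.

    One entry means a clear plurality winner; more than one means a tie.
    """
    counts = Counter(votes)
    if not counts:
        return []
    top = max(counts.values())
    return sorted(s for s, n in counts.items() if n == top)


# Hypothetical votes (not real data).
clear_winner = ["impala"] * 10 + ["gazelleThomsons"] * 6 + ["gazelleGrants"] * 5
tied = ["zebra"] * 4 + ["wildebeest"] * 4 + ["buffalo"]

print(plurality_species(clear_winner))  # ['impala']
print(plurality_species(tied))          # ['wildebeest', 'zebra']
```

Returning all of the tied species, rather than picking one arbitrarily, keeps the tie visible so those captures can be counted separately or set aside for another look.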
And now, because it’s Friday, a few images I’ve stumbled upon so far in Season 5. What will you find?
7 responses to “Plurality algorithm”
How do you know when the algorithm is ‘good enough’? Or, what are the criteria for success? Having a few percentage points difference in these judgments seems logical given the difficulty of correctly identifying some of these images.
Is there some way to go back to a previous image & correct errors? It’s happened several times that I accidentally hit the finish key before identifying a separate species.
Hmm… that’s a good question. We’ve got to think about scaling up. If we have a 4% error rate and a million captures, then that means that we have 40,000 errors. So I’d like to see if we can get the error rate much lower, under 1%. We also have to take into consideration what the errors are. If they’re because zebra and wildebeest are being excluded here and there, then that’s not a big deal. But if we have a high percentage of errors in, say, leopard images, that’s a problem because we don’t have that many leopard images in total.
No, there’s no way to go back and correct errors, I’m afraid. But we may think about putting borderline images back in for additional classifications if we think that will help. We’ll see…