Unicorn Ranking System

Sorry, no new binary release. The last one can be found here.Sorry, no new source release. The last one can be found here.

The most stupid thing would be to reinvent the wheel on creating a ranking system for the votes collected by the DRM's nag screen. There are a lot of ranking system we could use, but we have to keep in mind that our ranking system behaves a little bit differently compared to most of them.

  • The web server determines what pictures should be ranked so we are able to influence the number of votes per image.
  • We do not collect numerical values, instead we collect relative information such as "Picture A is better than Picture B".
  • It is not similar to a poll, because not every picture had an equal chance of being selected (new images have less votes)
  • Besides voting it is also possible to report certain images.
  • Because we expect lots of images we need to create fair battles (so relatively bad images should still have a reasonable chance of being voted on). This is required because the top segment and bottom segment will also need a good relative ranking.

Ranking systems with similar properties are used in online games such as Startcraft 2 laddering. These systems are little bit more complex than what we need because they have to take into account that not everybody plays the same number of games or that some players have stopped playing. The idea of different leagues (Silver, Gold, Platinum, ...) makes sure that everybody gets a decent challenge. In the scenario that a player wins from somebody ranked higher, he'll get more points. Lets just use this laddering example as a basis for our ranking system and assume that voting is a 4 person free for all battle on which winning or losing is decided by the user making a vote. Unfortunately these games do not want to expose the exact algorithms they use so lets copy-past as much as we can and try to fill in the blanks. The rank of a certain picture is determined by the so called MMR (Match Making Rating) and how the MMR is influenced by a loss or a win is determined by it's opponents MMR and a certainty-value sigma that represents how certain our ranking system is on approximating the correct MMR. Remember these two names. We are going to talk a lot about MMR and sigma in the text below. Okay, let's put on our stupid-hat and start reinventing the wheel. Round wheels are overrated, let go for something more flashy. Spherical, yes, lets make it spherical. Each league has certain number of images in them and we need to try and keep them balanced. We will go for a system where we aspire to get the following distribution (that's right, copy-pasted from Starcraft 2):

  • Diamond (top 10%)
  • Platinum (10-25%)
  • Gold (25-45%)
  • Silver (45-70%)
  • Bronze (70-100%)

We will start by figuring out how to use the MMR and Sigma values to select 4 images that will battle each other. When we know how these values are going to be used we will try and find a nice way of actually calculating these values.

Server Selection and Client Voting

The server might create a promotion or demotion battle to restore balance, but the client will have no knowledge of this. To complicate things even more, the client downloads the images for the battle in a background thread, to be used for the next time the program starts. It is possible that a battle created by the server will take place days later or worse: not at all. When the server receives a result from the battle (= a vote on the most awesome unicorn), even the server will not know if it was part of some promotion/demotion battle. By the time the vote has been received the image-rankings might have changed in such way that demoting or promoting isn't even reasonable anymore! So the server just receives a vote from 4 unicorn images and depending on the current Leagues, Ranks and the League distribution it will decide to promote or demote one or multiple images. We will still create promotion/demotion battles to give the server a higher chance that the server is able to promote/demote an image when a vote has been received.

Server Selection - Forced Promotion/Demotion Battle

The higher a league-size discrepancy, the bigger the chance that a forced promotion/demotion battle will be created. The idea is to use a form of fuzzy logic to create a self balancing system. Lets figure out though an example what the 'best' method is. Lets assume we have 1500 images (wow, that would be so cool!) and the distribution is as follows:

LeagueDesired SizeActual SizeDiscrepancy
Diamond10% (150)141 (95% of desired)5%
Platinum15% (225)215 (95.55% of desired)4.45%
Gold20% (300)332 (110.67% of desired)10.67%
Silver25% (375)369 (98.4% of desired)1.6%
Bronze30% (450)443 (98.44% of desired)1.56%

We will allow a little bit of discrepancy to exist (it may resolve itself), but if the discrepancy gets too high (such as the Gold League), we need to create a couple of battles that will help the system in choosing to demote or promote images from one league to another. The league with the highest discrepancy is picked (Gold League) and if the discrepancy is above a certain threshold (lets say 7%) than we will create forced battles to resolve this issue. We could use complex math formulea for choosing to create a demotion battle or promotion battle, but this isn't really necessary. This is the fun part of self balancing fuzzy systems! We will not generate forced promotion or demotion battles for the Gold League 100% of the time, the more discrepancy the League has, the higher the chance that force promotion or demotion battles will be created. Lets make it a 1-on-1 dependency on the discrepancy percentage and see how this will work. Chance of a forced promotion or demotion battle to resolve discrepancy in the Gold League: 10.67% and the chance at creating a normal battle: 89.33%. Discrepancy is theoretically able to go over 100%, but for the forced battles we'll just cap this percentage at 80%. Hmm now that I think about it some more, there will be a lot of discrepancy at the beginning when there aren't many drawings. We should probably start with just one league and increase this number as the amount of images grows.

When a game requests 4 images for the next time the nag-'battle'-screen and the server decides you will receive a forced battle than the server will pick 4 candidates for promotion/demotion. Most of the time the server will have two options for promotion or demotion. But when the Diamond or Bronze league is too full, only one option is available. When two options are available one of them is picked with a 50% probability. There is no need to be all fancy with math to pick 'the best option', because this system is good enough to balance itself. To make sure this system works when we have very few images in the database, we also require that there are at least 16 images in the league we want to remove one of the image from. If this is not the case we will just default to a normal battle selection.

Selecting the four individuals reminds me of Tournament selection as used in Genetic Algorithms. In our situation we are working with a database (without stored procedures). We need to limit the number of queries to the database, while still have a non-deterministic selection process where the best or worst have a higher chance of being selected. We could use deterministic tournament selection, but I don't think it is very wise to do it on the whole League. Some images will have a low sigma-value (low certainty value) and it would be unfair to demote an image while it is still unknown if the image has the correct rating. To create the tournament we will create a selection with a query that looks something like this:

SELECT * FROM Images WHERE Sigma >= 0.9 AND LeagueId = 2 ORDER BY RAND() LIMIT 16

This will most likely select 16 images, but it may be less. To ensure we actually select 16 images we will continue executing this query while lowering the sigma value with steps 0f 0.1 until we have reached a selection that contains 16 images. It might take up to 10 queries to select 16 images, but as the database grows it becomes more likely that only 1 query is needed. After we have 16 images it is time for tournament selection. Tournament selection will look at the current rating of the image in the league. To quote Wikipedia on tournament selection:

choose k (the tournament size) individuals from the population at random
choose the best individual from pool/tournament with probability p
choose the second best individual with probability p*(1-p)
choose the third best individual with probability p*((1-p)^2)
and so on...

So k=16 and lets pick p=0.5. Why 0.5? Hmm don't know. It looks like that will give us a decent chance distribution.


It is still a somewhat arbitrary chosen number, but we don't have to fuzz about stuff like that. It will become an adjustable variable in some config file so making adjustments shouldn't be too hard. We could repeat this whole process 4 times to select all 4 drawings but that will take too many queries. Instead lets just use 4x4 tournament selections. So forget about the table above and lets use k=4 and p=0.75:

So this gives us 4 images of which it is likely that it will contain images up for promotion or demotion to restore the balance of the leagues.

Server Selection - Normal Battle

A normal battle selection should keep the league balance intact (if/when a winner is declared). We should obviously still allow the possibility of demotion or promotion, but for every promotion we should also demote an image. The most important part of a normal battle selection is to create balance in sigma values. This behavior clearly deviates from online game ranking systems, but it is something we can make use of, so we should. A normal battle has two possible creation methods:

  • Promotion Battle A promotion battle gives images the chance of being promoted to a higher league. This opportunity will only be given to images that are highly ranked within the league and also have a high sigma-value.
  • Internal Battle In internal battle picks 4 images from the same league that are somewhat equally matched. The outcome from this battle will most likely not promote or demote any image. Images with a low sigma have a higher chance of being selected for an internal battle.

Lets start with the promotion battle. By performing promotion battles.. lets see... 10% of the time, images get the chance of being promoted to a new league. By modifying this percentage we are able to fine-tune how stable the leagues are. In a promotion battle we will pick two images that will get a chance of getting promoted and two images of the next league that may get demoted. Lets use the top-15% of the League and the bottom 15% of the next League. To know what the top/bottom 15% of the Leagues are (without selecting them all) we'll make handy use of MySQL LIMIT functionality to select the minimum/maximum MMR value should be to be considered in the top 15%.

(SELECT MMR FROM Images WHERE LeaugeId = 2 ORDER BY MMR DESC LIMIT %15PercentOfGoldLeagueSize%, 1) UNION ALL
(SELECT MMR FROM Images WHERE LeagueId = 1 ORDER BY MMR ASC LIMIT %15PercentOfPlatinumLeagueSize%, 1)

The two numerical values we get by executing this query will give us the range for the images we need to select. We now have to pick 2 from each region that have a high enough sigma value.

(SELECT * FROM Images WHERE LeagueId = 2 AND MMR > %MinimumMMRForPromotion% AND Sigma >= 0.9 ORDER BY RAND() LIMIT 2)
(SELECT * FROM Images WHERE LeagueId = 1 AND MMR < %MaximumMMRForDemotion% AND Sigma >= 0.9 ORDER BY RAND() LIMIT 2)

We 'could' have done this in one query if MySQL was friendly enough to allow the use of LIMIT in sub-queries. Ah well. As software engineer you need to accept that some things cannot be solved in the most beautiful way. It is possible that we do not get 4 images in the scenario that not enough images in the bottom/top rank have a sigma high enough to be sure they should be considered for promotion/demotion. If that scenario we will default to the normal behavior of "Internal Battle".

Using all these fancy percentage and magic numbers that we pulled out of our hat (though I was going to say 'ass'?), we'll be seeing a minor side effect. When there aren't many images in the database no battle for promotion will be created, not even forced battles! Is this really a problem? In the previous forced-battle-paragraph we have already decided to start with one league and to increase this number when we have enough images. So when we increase the number of leagues with care it should all work out fine, probably.

Only one more battle selection to go, Internal Battles. These battles makes up for the gross majority of the votes that we will receive so we need to make sure that this selection will not create nasty side effects. The goal of these battles is to increase certainty of MMR values, so images with a lower sigma should have a higher chance of being selected. These battles give new images the chance of climbing the ladder of the current league, while not giving them an unfair advantage. There a three possibilities that the sigma value of an image is low.

  • The image is relatively new and has not receive many votes.
  • The image has just been demoted or promoted to a new league.
  • A losing/winning streak against images with unexpected results.

Because these battles will take place within one league we will start by selecting the league at random. Each league will have a chance of being selected proportional to the size of the league, with the requirement that the league must have a minimum of 4 images. Lets use an example for clarification:

LeagueSizeLeague Selection chance
Diamond30% (minimum requirement not met)
Platinum77 / 34 = 20.6%
Gold88 / 34 = 23.5%
Silver88 / 34 = 23.5%
Bronze1111 / 34 = 32.4%
TOTAL excluding diamond3434 / 34 = 100%

After a league has been selected, four images will be selected from that league to compete with each other. We could make a very complicated selection that even within the league images will get a fair selection (battles with images of similar MMR's), but I have the gut-feeling that this won't be necessary. We will just randomly select four images from the league giving higher chances to those with low sigma-values, but how can we achieve this efficiently? Something similar too tournament selection on the sigma-values? Sure why not. For tournament selection, p<0.5 will result in strange effects on the chances for the worst individual in the tournament, so it should at least be 0.5. If the tournament-size is 4, than with any p≥0.5 the chances of being selected for images with high sigma will be very slim. For the forced promotion/demotion battles this isn't a big issue, but for normal battles it is. Pictures with high sigma should still have a decent chance of being selected. For this reason we will use a tournament-size of 2 with p=0.6. Lets say there are 100 images in the league all with different sigma values. If every image had an equal chance of being selected than there will be 4% chance of being 1 of the 4 images. With tournament selection with k=2 and p=0.6 the image with the highest sigma value has a chance of 3.2% while the image with the lowest sigma value gets a chance of 4.8%. I hope this difference will restore balance in sigma values without giving too much bias. If not, i'll just tinker with the config values :)

We'll pick 8 random images from the league and perform 4x2 tournament selections, but it would be best if this method also worked if there are less than 8 images in the league. The selection will result in 4 to 8 images (hopefully 8) that we will split up in 4 groups of similar size. Each group will hold a tournament selection with p=0.6 resulting in the selection of 4 images. Some tournament-sizes may become 1 to account for a low image-count which will result in an automatic win in the tournament selection.

Server Selection - Logs and Fraud

The server selection has become quiet difficult and all that work just to create a nag screen for the DRM :). Collecting unicorn images is one of the more important motivations for me to start this project so this needs to be done right. There are a lot of variables in the selection methods described above and a lot can go wrong. It would be best to keep a log of all selections that the server made so we can figure out what is going wrong when the ranking is not working properly. We also don't want people to use fraud to submit a lot of fake votes so we also need to secure ourselves against fake submissions. It would be easy (for a programmer) to copy-past part of the vote-submitting code to a new project and put it in a for-loop to bump his own unicorn image to the top of the ranks. I think it is possible to combine the log and a fraud detection system in one simple MySQL table. Each time a new selection has been made by the server it will be stored in the Selection-table with a timestamp. This selection will be signed using my private key and both the signature and the selection-info will be sent to the cliënt (the game's DRM). When the cliënt makes his vote, the selection-info, signature and vote will be sent to the server. If the signature is fake we know that the user tried to vote on a selection that has not been created by the server. To prevent somebody for voting again-and-again on the same 4 unicorns we will mark a selection as 'vote received'. So if no selection exists in the database over a time period of 30 days (that has not already been voted on), we will simply ignore the vote. This system is nice and clean and allows us to purge selections made more than 30 days ago. So no IP-blocking (and logging) will be necessary. This system also lets the database-size scale linearly with the number of unicorn-images (instead of the number of votes).

Wow, I'm already over 3000 words, maybe I should stop. NO!! This should be done right! We're talking about unicorns. I will never forgive myself for generating an inferior voting system that would shame the existence of unicorns! All that text just for writing down my thoughts on the server selection process, there must be something seriously wrong with me. Didn't I stated somewhere that it would be stupid to reinvent the wheel of voting systems? Stupid, that must be the word that describes me best.

Client Voting - Calculating MMR and Sigma

It is time to start thinking about the actual calculation of MMR and sigma values. All this time we just used these values pretending they exist while we didn't even define how they come into existence. Lets define them in words again for good measure.

  • MMR An MMR-value is a number that represents your current ranking in the league. The bigger this value is (compared to the other images in your league) the higher you are ranked.
  • Sigma Sigma is a letter in the Greek alphabet used by scientists to come across as someone who is very knowledgeable. Sigma is a a value somewhere between 0 and 1 that represents the certainty that the system has on your current MMR value. If this value is 0, than the server has absolutely no certainty about the correctness of the MMR value and when this value is 1 the system is very certain that your current MMR value accurately describes your current position in the league.

To zero-sum or not to zero-sum, that's the question. Should each vote result in a redistribution of points, or should we just assign points to the winner? When we are not-zero summing we might end up with an unbalanced system where points keep growing and growing until some integer overflows. On the other hand if we only distribute points than it might be the case that the top x% of images will hog up all the points if the system is running for too long. I think I'll go for a zero-sum game, because the behavior of those systems are a little bit more predictable. I just need a way of preventing images from hogging up all the points, leaving the rest with 0. Lets see it like a poker-game where 4 players each bring their MMR-chips to the table.

  • We don't want this game to stop because the commission (rake) is so high that the dealer is taking all the chips away.
  • We also don't want the dealer to give out free additional chips to the winner because the game makes less sense when some players have a gazillion chips.
  • We also do not want anyone to run out of chips, or wait till one person has won them all. The game should be able to be played for all eternity.
  • When you have few chips and you win, you get more chips than somebody with lots of chips would have won. Conversely when you have a lot of chips, you don't get as many chips as somebody with less chips would have won.
  • When you have few chips and you lose, you don't lose too many chips as somebody with lots of chips would have lost. Conversely if you have lots of chips, you will lose more chips than somebody with less chips would have lost.

Seems reasonable, lets do this! Oh wait. What about images that have already received lots of votes compared to those who haven't? Isn't it unfair to punish/reward them equally based on MMR value alone? It would probably best to also use the sigma value in calculating how much points are won/lost. Arg I want to finish this post already, this has taken me too long. Okay the higher the sigma, the less points you will win/lose, but never to the point where your position in the league will be fixed for all eternity. Hmm, this in combination with a zero-sum game may become tricky, we need an example to see if this results in a reasonable point-system. I think the best way to go about doing this, is by dividing the formula in separate, manageable, chunks.

To continue using our poker-analogy, how much money (MMR-points) will each unicorn need to put in the pot to be allowed to play? Each unicorn 'must' play, but some unicorns don't have any money to spare. I've chosen the formula: [boooooring]. Who cares for the formula? This is the effect:

So if the normal bet-size is 0.01 (2% of the average stack-size of 0.5), the unicorns sigma=0.4 and the MMR=0.8 than the unicorn needs to put in 2.09 times this amount, so 0.0209 chips. If the sigma=0.9 and MMR=0.6 than you only need to put 0.0069 chips in the pot. In poker it's "winner takes all", but for our voting system it is better that the amount is determined by how expected or unexpected the outcome is. So how much chips is the winner able to take from the pot?

  • If the image ranked 1st wins than he may take 10% from the pot (and the rest will be given back to the 4 contestants proportionally to the amount  payed)
  • If the image ranked 2nd wins than he may take 40% from the pot (and the rest will be given back to the 4 contestants proportionally to the amount payed)
  • If the image ranked 3rd wins than he may take 70% from the pot (and the rest will be given back to the 4 contestants proportionally to the amount payed)
  • If the image ranked 4th wins than he may take the whole pot.

This system does everything I want. I'm not sure if every number and percentage will work exactly as I intended, but there are enough variables we can modify to tweak the overall algorithm. It should be noted that images may end up with an MMR value higher than 1, so for these algorithms we need to cap MMR to 1. Next up:  Updating Sigma-values.

There is no need to update sigma values as a zero-sum game, it's not even reasonable. There are 4 possible outcomes. The image ranked 1st wins, the image ranked second wins, the image ranked 3rd wins or the image ranked 4th wins. Actually it is also possible that all 4 images are reported and no one wins, but this situation is discussed below at "Reported Images". So there are only 4 possible situations and I don'f feel the need to define this behavior in an algorithm. Here is a cause-and-effect table:

ResultEffect on image ranked 1stEffect on image ranked 2ndEffect on image ranked 3rdEffect on image ranked 4th
Image ranked 1st wins:sigma += (1-sigma) * 0.10sigma += (1-sigma) * 0.06sigma += (1-sigma) * 0.04sigma += (1-sigma) * 0.02
Image ranked 2nd wins:sigma -= (1-sigma) * 0.01sigma -= (1-sigma) * 0.02sigma += (1-sigma) * 0.04sigma += (1-sigma) * 0.02
Image ranked 3rd wins:sigma -= (1-sigma) * 0.03sigma -= (1-sigma) * 0.02sigma -= (1-sigma) * 0.06sigma += (1-sigma) * 0.02
Image ranked 4th wins:sigma -= (1-sigma) * 0.05sigma -= (1-sigma) * 0.03sigma -= (1-sigma) * 0.01sigma -= (1-sigma) * 0.10

I just guessed on these calculations. I tried to come up with functions that are somewhat reasonable, but I can't be sure. This is one of those "Let's try it and see if it works"-algorithms. It's science! ... Stop judging me!

Client Voting - Promotion and Demotion

So far all calculations when a vote is received with MMR and sigma are done with the assumption that they occur within the same league. If the images are scattered across multiple leagues it's either because the server has created a promotion/demotion battle, or simply because the vote is from a very old server selection (although not older than 30 days). Votes received from battles with images spanning more than 2 different leagues are ignored. No battle will be created by the server that spans across 3 or even 4 leagues so receiving a vote from a situation like this should be very rare.

Most of the algorithms used so far are more complex than you might expect of a 'simple' voting system, but most of the complexity is to make sure that the system will not derail after a flock of votes (or is it called a swarm?). One important property of my voting system that I want to keep intact is: "For all images within the same league the average MMR is 0.5". If I'm able to keep this property, than balancing the voting algorithms becomes a whole lot easier. I have to keep this property intact even when we are promoting or demoting some images (which might sound counter intuitive). When we receive a vote that spans two or more leagues we will not shuffle around MMR values. The outcome of the battle will only effect sigma-values and possibly whether images should be promoted or demoted.

  • If all four images are from same league.
    • If the winner is in the top 15% of his league and has a sigma >= 0.9.
      • If promoting the winner would help in balancing the leagues.
        • Promote the winner.
  • If one image is from one league higher than the other three.
    • If he loses.
      • If he is in the bottom 15% of his league and has a sigma >= 0.9.
        • If demoting him would help in balancing the leagues.
          • Demote him.
        • Else if the winner is in the top 15% of his league and has a sigma >= 0.9.
          • Demote the image from one league higher and promote the image that has won.
  • If two images are from one league higher than the other two.
    • If the winner is from the lower league.
      • If the winner is in the top 15% of his league and has a sigma >= 0.9.
        • If the image from the higher league with the lowest MMR (of the two) is in the bottom 15% and has a sigma >= 0.9.
          • Demote this image from the higher league and promote the winner.
        • Else if the other image from the higher league is in the bottom 15% and has a sigma >= 0.9.
          • Demote this image from the higher league and promote the winner.
        • Else if promoting the winner would help in balancing the leagues.
          • Promote the winner.
  • If one image is in one league lower than the other three.
    • If he wins.
      • If he is in the top 15% of his league and has a sigma >= 0.9
        • If promoting him would help balancing the leagues
          • Promote him
        • Else if the image from the higher league with the lowest MMR (of the three) is in the bottom 15% and has a sigma >= 0.9.
          • Demote this image and promote the winner.
        • Else if the image from the higher league with the second-lowest MMR (of the three) is in the bottom 15% and has a sigma >= 0.9.
          • Demote this image and promote the winner.
        • Else if the image from the higher league with the third-lowest MMR (of the three) is in the bottom 15% and has a sigma >= 0.9.
          • Demote this image and promote the winner.

When an image is promoted/demoted he may not keep his MMR-points, that would be too unfair. An image that will be promoted will get an MMR-value of 0.2 with a sigma of 0.4. An image that gets demoted gets an MMR-value of 0.8 also with sigma 0.4. This results in changing the average MMR-value within leagues. To fix this we keep track, per league, how much additional MMR-points the league has or how many MMR points are lacking. We will again use fuzzy logic to pull the average towards 0.5. Every time images play out an internal battle they put chips on the table. If the some MMR-points needs to be taken away to restore the MMR-balance the server simply 'steals' 1% of the chips from the table. If the league is lacking a couple of MMR points the server just increases the number of chips on the table with 1%. The actual average MMR of a league may not be exactly 0.5, but it will be very close. Any change in this average will automatically restore itself. Fuzzy logic hooray!

Alrighty. All that is left are images that are reported as "Not a unicorn" or "offensive".

Client Voting - Reported Images

When a user votes on 4 images he's also able to report one or multiple images for offensive content. Images that are reported often should probable be removed from the selection process, but I'm feeling really lazy today. I should probably also create as system that images marked as 'slapware-approved' can no longer be reported, but that too is something for later. I will just store a record in the database that they have been reported and don't do anything with that information (for now). When I'm going to implement this i will add a link a the bottom of this sentence to the text where I discuss the implementation*.

* Still not implemented

What's Next

The whole C# part of of submitting votes and downloading battles is scheduled for the next update. The server-side is as good as done. When this system is up and running we need to keep a close eye on the battles created, the MMR-values and the sigma values. It is very likely we need to tweak some of the variables to make it work as intended.

Music + DRM Part 2Multithreaded Renderloop - 1