OK, so here is the story. I’m a coach for one of the robotics teams and, back in March/April, was forced to “stay at home” due to the virus. I started working on a VEX rankings system. I remembered reading about the Elo rating system years ago, so I thought I’d start there.
Long story short, I have a system that grabs the events from RobotEvents and imports the match results into a database. I have modified the standard Elo rating system (https://en.wikipedia.org/wiki/Elo_rating_system) a bit and I think this system is pretty good - although there is no doubt that I will continually make tweaks to the formula.
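For reference, the unmodified Elo update I started from looks roughly like this (a minimal sketch of the Wikipedia formula; K=32 and the 1500 starting rating are just the common defaults, not my actual tweaked values):

```python
def elo_expected(r_a: float, r_b: float) -> float:
    """Expected score (win probability) of A against B under standard Elo."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """Return (new_a, new_b) after one match; score_a is 1 for a win, 0.5 tie, 0 loss."""
    e_a = elo_expected(r_a, r_b)
    return r_a + k * (score_a - e_a), r_b + k * ((1.0 - score_a) - (1.0 - e_a))
```

With these defaults, two 1500-rated teams swap 16 points when one beats the other.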
I’m not a programmer and certainly not a web designer (hence the ease of Wix), but it’s been kind of fun.
Yep, very familiar with TBA and, yes, VEX needs something like that. It’s been fun - we will see how long it’s fun. Had to learn PHP and SQL to do it all.
One note - I counted all matches, but not any scrimmages. If anyone thinks I missed some, let me know. Sometimes an event isn’t posted on RE when I look, and then I may miss it later. I think there are a handful of events this weekend, so as more and more data comes in we will see how it shakes out.
What I like about it is that it assumes performance is drawn from a normal distribution. As a team competes more, the “certainty” of the rating increases, making a tighter band. It also doesn’t over-reward teams that compete significantly more than others.
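A toy illustration of that idea (the general TrueSkill-style concept, not the site’s actual math): treat each team’s rating as a mean plus an uncertainty, and let every observed result shrink the uncertainty:

```python
import math

class Rating:
    """Toy normal-distribution rating: mean skill (mu) plus uncertainty (sigma).
    The defaults here are illustrative, not the site's real parameters."""
    def __init__(self, mu: float = 1500.0, sigma: float = 200.0):
        self.mu = mu
        self.sigma = sigma

    def observe(self, performance: float, obs_sigma: float = 150.0) -> None:
        """Fold in one observed match performance (normal-normal Bayesian update).
        The posterior sigma always shrinks, so more matches mean a tighter band."""
        precision = 1.0 / self.sigma ** 2 + 1.0 / obs_sigma ** 2
        self.mu = (self.mu / self.sigma ** 2 + performance / obs_sigma ** 2) / precision
        self.sigma = math.sqrt(1.0 / precision)
```

Each call to `observe` narrows the band a little, which is why a team with many matches gets a more “certain” rating than one with few.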
To address Meng’s concerns:
By tracking teams over all of their competitions, it builds up a picture of how competitive a region or tournament is, assuming that eventually there aren’t many islands. That’s obviously not entirely true in VEX, particularly in COVID times, but certainly within the US this approach should produce reasonable results.
Per above, TrueSkill only tightens the confidence interval of a rating for teams that participate in a lot of competitions.
Very true; would be interested to know what percent of tournaments are not listed in RE. As far as I know, the US has a high percentage of tournaments in RE. It would be wonderful if all tournaments had a public record.
True. This is where FRC is ahead of VEX. The FRC data available from TBA provides a breakdown of individual scoring elements (e.g., in last year’s game, FRC would have provided the number and color of each cube stacked and in towers). Over enough games, one can reasonably infer what type of robot a team has and better predict how well robots will complement each other on an alliance.
The point of a system like TrueSkill (and Elo) is precisely to predict matches. Both produce an estimate, with a confidence interval, of how likely one team is to beat another.
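That prediction step can be sketched like this (assuming normally distributed skills; `beta`, the per-match performance noise, is an assumed constant in the style of TrueSkill’s formulation, not a value from either system discussed here):

```python
import math

def win_probability(mu_a: float, sigma_a: float,
                    mu_b: float, sigma_b: float,
                    beta: float = 100.0) -> float:
    """P(A beats B) when both skills are normal with the given means/uncertainties.
    Larger sigmas widen the denominator and pull the prediction toward 50/50."""
    denom = math.sqrt(2.0 * beta ** 2 + sigma_a ** 2 + sigma_b ** 2)
    return 0.5 * (1.0 + math.erf((mu_a - mu_b) / (denom * math.sqrt(2.0))))
```

Uncertain teams (big sigmas) push the probability back toward 50/50, which is exactly the confidence-interval behavior being described.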
Agree with you about #2. That’s why I’m not using a straight Elo system. It was very obvious that teams that simply played more were ranked higher (and lower). I’ve accounted for that, but is it accurate? Need more data.
Not accounting for tournaments not in RE or scrimmages … oh well.
As for the “strength of the region”: with enough data, it should capture that, as long as there are some matches outside the region.
Well… hope you guys will understand where I am coming from; I will always need to take a global perspective… Singapore is just too small.
I do understand how Elo rankings work, etc., and yes, I know the predictions get more accurate with more data.
But therein lies the biggest bugbear I have with such systems.
Regions that do not have lots of tournaments will most likely be flying below the radar.
And in order to have an accurate ranking between regions, you do need quite a fair bit of cross-region tournament data.
So, yes… it might work well for the USA, but don’t expect it to be useful for international teams. And I wouldn’t rely on it for Worlds either.
Perhaps this year, with the advent of Remote Live Tournaments (which should be assessed separately from in-person tournaments), there will be an opportunity for Singapore teams to compete more, and more widely (time zones notwithstanding).
I don’t think anyone is holding this, or methods like it, as some sort of Holy Grail of who the “best” teams are. There’s always going to be disagreement on that.
I’m taking this at face value - someone spent some time understanding how to get a decent data set out and tried to apply some “data science”-y techniques to it. As the saying goes, “All models are wrong, but some are useful” - and this one seems interesting. Having seen a number of the teams ranked highly under this methodology, I agree with some and disagree with others. Probably much the same as if we had compared notes at Worlds.
I am not against the person doing this… I am just highlighting the limitations and cautioning against relying too much on this sort of ranking system.
In fact, I always wish some of my students would take up the challenge and do something of this sort… even just for fun.
And I have seen quite a number of teams that depend so much on it that they neglect the usual scouting, etc.
Of course, no algorithm will ever beat good manual scouting, but this is still a cool list that could give a general idea of which teams are the best each season. Not particularly useful this year with the very irregular tournaments and many regions not competing at all, but interesting.
Updated after seven events this past weekend. PigPen gained the most points after winning the Cornerstone event. Dutch barely hangs onto the top spot after losing in their final. Revelation would surely have taken it from them had they won that final.
Updated after the NOV-7 weekend. 10 events! 120+ new teams, for 354 teams now that have played. I changed the formatting of the W-L-T column to be centered for better readability. I’ve only posted the Top 100; this was never really meant to be a “let’s see who the bottom teams are” thing.
Lots of movement in the Top 15. Dutch loses its top spot (didn’t play) and Gears takes it. Freedom Gladiators takes the biggest jump - all the way to #2 with a big win. Iron Pride C and D teams debut at 11 and 13.
Updated after the NOV-14 weekend. 78 new teams … now up to 432 teams so I’ve increased the display to show the top 150 teams.
A couple of notes: I got a couple of emails about records not being correct for a few teams. After auditing, I found a mistake in my system where it could skip a match. There was also one early event in Michigan where half of the event was counted twice - I fixed that as well.
One other minor tweak was made to the formula to, again, adjust the number of points that teams gain/lose. Teams were gaining too much just by playing more. Playing more is a good thing and likely helps you, but as you get higher in the standings you SHOULD beat lower-ranked teams, and perhaps by a lot.
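One common way to damp that play-more-gain-more effect (a guess at the kind of tweak being described, not the actual formula; all the constants here are made up for illustration) is to shrink the Elo K-factor as a team’s match count grows:

```python
def k_factor(matches_played: int, k_max: float = 40.0, k_min: float = 16.0,
             half_life: int = 20) -> float:
    """K decays from k_max toward k_min as a team plays more matches,
    so sheer volume of play stops inflating (or deflating) ratings."""
    return k_min + (k_max - k_min) * half_life / (half_life + matches_played)
```

A brand-new team moves quickly toward its true rating, while a veteran team’s rating only shifts a lot when results genuinely surprise the model.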
A new #1 team - Hopkinetics Hydra. 15-0 and pretty much destroyed everything they’ve played.
Hard to adjust for regional difficulty, but it is still a fun list and looks reasonably accurate.
I’d like to figure out how to do a search, but haven’t got there yet. I stopped posting the complete list because people were focusing on the “worst teams in the world” and that isn’t the point. With 432 teams, maybe I’ll expand that to 300 though.