The impact on lucky pairings in qualification matches

In larger tournaments, where each team gets fewer qualifying matches, the impact of the random pairings can be substantial.

If you get paired with 6 great teams, you’re golden. If you get paired with 6 bad teams, there’s no way you’re making it into the finals.

The average-score mechanism is fair on paper, but it’s noisy because of these random pairings.

The result: Sometimes, very bad teams make it into the finals (our team got lucky in Tournament 1 and ended up in the finals, even though our robot barely contributed any points in our Q matches—we were carried by other teams). Other times, excellent teams don’t make it into the finals due to bad Q pairings.

To see this most clearly, look at Tournament 3 here and the fate of 5125Y (not my team). They got the highest score in the Qualifying matches (107), and they won the Solo Skills with 82 driver points on their own, but they didn’t make it to the finals due to bad pairings:

https://www.robotevents.com/robot-competitions/vex-iq-challenge/RE-VIQC-19-9862.html

So here’s a team that can contribute 82 points on their own, and they’re not in the finals? If the tournament is supposed to help us identify the very best teams, this is a huge problem.

How can we fix this?

What about other metrics for picking finalists? What about the teams with the best best scores? Looking at Tournament 3, out of the 10 teams with the best single Q-match scores, only 5 of them made it into the finals. So here are a bunch of teams from top-scoring alliances that showed us they were capable of high scores, and they weren’t given a shot at the finals.

In fact, one of the alliances in the finals only got 40 points. How did that make team 5125Y feel? The team that could get 82 points as a solo robot?

Or what about another metric, the teams with the highest worst scores? Or maybe, to avoid the impact of one very unlucky match, the highest second-worst score?

Both of these metrics (best worst and best best) also involve pairing luck, but maybe a bit less. I guess it’s always easier to detect the real dogs than the real winners (real dogs will have low worst scores and low best scores… but we’re not sure about real winners).

Some kind of meta score is possible too, where you try to assess each team’s impact on the matches they participated in. Each team can have a rolling average, and then we try to see whether the other team pushed you above your own average or below it. Did they help you or hurt you? Did you help them or hurt them? This would be an Elo-like way for scores to move up or down. The rolling average could keep being adjusted throughout the Q matches, and the rankings recomputed as more information comes in (so we keep adjusting the impact of past matches based on the new information).

The best teams would help everyone they played with to score above their own average. The worst teams would push everyone they played with to score below their own average.

The downside to more complicated formulas is that the ranking turns into more of a “black box”, but I’m not sure people are really digging into the averages on the tournament floor anyway. Elo is more complex, and in chess and elsewhere no one seems to care that they don’t understand the underlying math.

Essentially, for each match, we can ask, “How surprising is this result?” If the result is surprising in either direction, the team’s score should change in that direction.

Currently, we’re doing that with the average, but we’re only factoring in how surprising the result is based on this team’s past performance, not the other team’s. Did you get a bad score because you did poorly (big negative surprise; your score should go down a lot), or because you were paired with a team that always gets bad scores (no actual surprise; your score should barely go down)?

A simple example:

Your average is 90 points.

You get paired with a team that has an average of 30 points.

Together, you score 70 points.

Wow, you raised them up 40 above their own average. +40 for you.

Ouch, they pulled you down 20 below your own average. -20 for them.

This of course makes playing with bad teams really valuable, and two good teams playing together mostly worthless (if we both have an average of 90 and we score 90 together, do we each get 0 points?).
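A minimal sketch of what that adjustment could look like, using the naive update rule from the example above (each team gets full credit or blame for how far the shared score lands from its partner’s average; a real rating system would presumably scale and smooth this):

```python
# Toy sketch of the "surprise" adjustment described above. Each team
# carries a rolling average of its match scores; this update rule is the
# naive version from the example, not a tuned Elo formula.

def adjust_ratings(avg_a, avg_b, match_score):
    """Return (delta_a, delta_b) for one teamwork match.

    Team A is credited for how far the shared score landed above
    team B's own average, and vice versa.
    """
    delta_a = match_score - avg_b   # A's credit: did A lift B above B's average?
    delta_b = match_score - avg_a   # B's credit: did B lift A above A's average?
    return delta_a, delta_b

# The example above: you average 90, your partner averages 30, together you score 70.
print(adjust_ratings(90, 30, 70))   # (40, -20): +40 for you, -20 for them
```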

It seems like any fixed, predetermined pairing might be susceptible to this, which leads us to the next idea: dynamic pairing during qualification. I haven’t done the math, but my hunch is that valuable information would come out of always pairing the best teams (so far) with the worst teams. Maybe even a regular average would work if you did this. Pair up randomly, and each pair plays one match. For the second match, pair up best with worst. Play another match. Repeat.
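A rough sketch of that scheduling idea, assuming an even number of teams and ignoring real-world constraints like field assignments and back-to-back matches (the team labels and averages are placeholders):

```python
import random

# Sketch of dynamic pairing: round 1 is random, later rounds pair the
# current best with the current worst. `ratings` maps team -> current
# average (or Elo-like score); how it gets updated between rounds is
# left out here.

def next_round_pairings(ratings, first_round=False):
    """Return a list of two-team alliances for the next qualification round."""
    teams = list(ratings)
    if first_round:
        random.shuffle(teams)
        return list(zip(teams[0::2], teams[1::2]))
    teams.sort(key=ratings.get, reverse=True)                # best first
    half = len(teams) // 2
    return list(zip(teams[:half], reversed(teams[half:])))   # best with worst

# Placeholder teams and averages after the first round:
ratings = {"A": 95.0, "B": 80.0, "C": 45.0, "D": 30.0}
print(next_round_pairings(ratings))   # [('A', 'D'), ('B', 'C')]
```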

4 Likes

It’s hard, but that’s the way it is. At our state championship we have 42 teams and the bottom tier is bad. The last 2 years we have been paired with bad teams but made it to the finals. Sometimes you have to tell the other team to just sit there and you do the rest. This also happened at Worlds, where we were paired with lower teams and it showed in our total score.

1 Like

Part of robotics is being able to carry your own weight and also pull others along as well. We have all been in the situation where we are paired with a teammate that might not perform as well as you do. But these situations are what separate the great teams from the good teams. As IQ is a cooperative game, you either go up together or you go down together.

1 Like

Please, pretty, pretty please, never tell another robot to just sit there.

6 Likes

Aren’t a certain number of scores dropped from the calculation depending on the number of matches?

Yes, one out of every four.

I don’t agree with OP’s idea that the system needs to change, but if a robot cannot contribute, I think good sportsmanship is to not hinder the other team. They are trying their best to win, too.

2 Likes

I’m sorry, I just can’t agree. Every robot can contribute something. How disheartening would it be if someone informed you that all of your hard work was worth nothing? That’s what’s happening in this circumstance…

4 Likes

Yes, I do now see that the lowest score is dropped from the average. This is explained on page 24 of this Game Manual:

https://www.robotevents.com/events/39567/uploads/5139/download

Dropping one score from the average doesn’t really address the core lucky-pairing problem. Dropping the lowest just inflates everyone’s scores; why not drop the highest?
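A toy illustration with made-up scores, just to show the size of the shift:

```python
# Made-up Q-match scores for one team.
scores = [70, 65, 60, 20]

raw_avg      = sum(scores) / len(scores)                         # 53.75
drop_lowest  = (sum(scores) - min(scores)) / (len(scores) - 1)   # 65.0   (inflated)
drop_highest = (sum(scores) - max(scores)) / (len(scores) - 1)   # ~48.33 (deflated)

# Either way, every team's average shifts by the same mechanism; the
# luck of who you were paired with in those matches is untouched.
```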

If you’re in a tournament with 43 teams, and you get paired up with only 6 of them at random, some people are going to get paired with the bottom 6 teams, and some people are going to get paired with the top 6 teams. Since this is completely random, and there is no “pole position” from previous tournaments, it is going to happen sometimes.

And when it happens, the result may very well be that the very best team present does not make it into the finals.

The more qualifying matches each team gets, the better. However, at a big tournament, this isn’t possible.

It’s also my understanding that the removal of the earlybird system this year encouraged our local tournament to grow in size so that there would be enough spots to accommodate all the local participants (because there is no longer a mechanism stopping non-local teams from signing up for all three of our local tournaments).

As engineers, I think this is a solvable problem.

It’s just that getting assigned six partners at random and then taking the average isn’t the right solution.

Here’s another idea: a tournament tree structure. The way you advance is that you are grouped with three other teams into a cluster of four, and six matches are run:

AB
CD
CA
BD
AD
CB

Each of the 4 teams gets an average from this process, and the average is truly fair within the cluster. Depending on how much differentiation is present, some subset of these four teams (from 0 to 4 of them) advances to the next level. If there’s no differentiation and the scores are all low, no teams advance. If there’s no differentiation and the scores are all high, then all four might advance. If there is differentiation, then the top group from these four will advance.

Then repeat the process as many times as necessary to get your list of finalists.

To avoid having teams get knocked out after only 3 matches, you could also run a losers bracket.
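Here’s a sketch of how one cluster could be scored under this idea. The `run_match` function, team labels, and the fixed advancement threshold are placeholders; the differentiation rule described above would need a more careful definition in a real implementation:

```python
from itertools import combinations

# One cluster of the proposed tree: four teams, six matches (every
# pairing exactly once), then a within-cluster average per team.

def score_cluster(teams, run_match, threshold=40):
    """Play all six pairings of a 4-team cluster.

    `run_match(a, b)` is a placeholder that plays one teamwork match and
    returns the shared alliance score. The fixed `threshold` stands in
    for the differentiation-based advancement rule described above.
    """
    match_scores = {t: [] for t in teams}
    for a, b in combinations(teams, 2):        # AB, AC, AD, BC, BD, CD
        score = run_match(a, b)
        match_scores[a].append(score)
        match_scores[b].append(score)
    averages = {t: sum(s) / len(s) for t, s in match_scores.items()}
    advancing = [t for t, avg in averages.items() if avg >= threshold]
    return averages, advancing
```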

It’ll go one way or the other: dropping the lowest means teams aren’t penalised for one bad match or one bad pairing. Dropping the highest will stop teams being carried in one match, but won’t help with those that are carried in all of them.
But at the end of the day, it is unlikely that a team will get carried high into the top 20 if the other teams at the event are all strong, since a number of strong-strong pairings will massively boost those teams’ averages.

I like the IQ system, it’s always worked well for me. Yeah, the odd team sometimes gets carried and they get to experience being in elims but it’s rarely a big deal. And a really well driven/programmed Clutch with a solid strategy can sometimes be a better partner than a badly driven mega-bot.

1 Like

How disheartening is it for a team that put in just as much hard work, maybe over several seasons to become quite accomplished, to lose their chance to move on because a partner robot prevents them from doing their best or, even worse, descores? For example, this year I’ve seen robots drop green cubes out of the field while attempting to score on the platform, when those cubes would have been a guaranteed 20 points if left for the partner.

5 Likes

When I competed in Ringmaster, there were many instances where we told a push-bot type team to just move out of our way, stay put against the wall, and hit the Bonus tray when there were 20 seconds left. I do believe that this was appropriate, as we ended the match with a score of 250, which was great for both our averages. Having an alliance partner just sit there is sometimes an effective strategy.

1 Like

I’ve been there on the other end of that, where our robot and team were so bad, and we were teamed up in Q’s with a team who had a shot at the finals, that I told our team: Don’t touch your controller, just sit there.

In a previous Q match, our team was causing more harm than good, and in practicing with this team before the next Q match, it was clear that our team was just getting in their way. It didn’t seem fair: we were just going to bring this promising team down.

Fortunately, the kids on the other team were like, “No, please don’t. You don’t need to do that!” That was a moment of stellar sportsmanship on their part. In the end, they got 14th place and missed their shot at the finals.

Yes, their score with us was their lowest and was dropped. However, if they had been paired with a better team, a higher low score would have been dropped, and they would have been in the finals.

This is the fundamental nature of a random partnership challenge. It is actually what makes this whole thing interesting (it’s highly unusual in the realm of competitive anything to be partnered with one of your direct competitors).

However, the ranking formula needs to do a better job of teasing apart each team’s contribution.

I’m not at all worried about the odd bad team that gets carried into the finals.

I’m terribly worried about the odd excellent team, one that would have won the whole darn thing, that gets left out of the finals. In our most recent tournament, it is very likely that this happened (with the Skills winner and highest-scoring Q-match team not even ending up in the finals).

Consider again the plight of 5125Y (not my team, just a great example of this problem):

https://www.robotevents.com/robot-competitions/vex-iq-challenge/RE-VIQC-19-9862.html

And if bad teams DO get into the finals, that means they took the place of other good teams. So you can’t have one oddity without the other.

Let’s start with this premise - why? I think you are trying to view the competition through the lens of what you believe it should be rather than what it is defined as. You seem to have some notion of what makes a team “good” or “bad”, but maybe your personal definition isn’t what the designers of the game want or intend?

Take a look at the definition of the Teamwork Challenge, which is the name of the matches you are referring to:

In the Teamwork Challenge, an Alliance of two (2) Robots, operating under driver control, work together in each Match.

(emphasis mine)

The defined goal of these matches is to see how well teams work together. If RECF wanted to know how well each team can do on their own, they’d let each team compete separately. And, of course, they already do exactly that in the skills challenges! The goal of the teamwork matches should be focusing on teamwork. In my humble opinion, any team that asks their teamwork partner to sit still during the match is not engaging in teamwork - in fact I believe this could be considered a G1 violation (telling a “lesser” team to sit still is extremely disrespectful) and thus potentially subject to disqualification.

Now, all that said: 6 teamwork matches is not sufficient. Having only 10 finalists out of a 50 team event also seems insufficient in my opinion. I agree with those points. Events frankly need to provide more, either by adding fields, extending the day, or just by letting in fewer teams (and expecting that others will step up to host other events to distribute the load). None of those things can be done though without additional volunteers. Channeling @Foster here: have you been proactive in helping these events expand so that each team can get in more teamwork matches, either by rounding up volunteers, fields, etc. and/or hosting your own events? All VEX events are “for the people, by the people”, so volunteering to help expand an existing event or run your own event is the easiest way right now to improve the team experience.

Teams should be providing feedback to the events that 6 teamwork matches are not enough. If the events don’t change, then vote with your wallet (so to speak) and stop attending that event (but be warned that you might have to run your own instead to give your team(s) somewhere to play).

Discussing the rankings being unfair is a time-honored tradition in competitive robotics that’s been happening for at least 25 years. There’s nothing wrong with that of course, but the reality is that it is very unlikely that things will be changed. Even if RECF agreed with all your points here and immediately said “we need to change this”, it likely wouldn’t happen until the 2021-2022 season (because next season’s rules are probably pretty much wrapped up already). That’s why I say your best chance of improvement is to focus on getting your teams more teamwork matches, because most everyone agrees that the more teamwork matches you play the closer the rankings will be to what people seem to expect that they should be.

9 Likes

One of my favorite phrases is “It’s not about the robot”. It’s about all the other things that go on around the robot: collaboration, brainstorming, design, communications, inspiration, math, physics, strategy, planning, retrospectives, changing directions, engineering, programming, etc.

More than 6 matches is easy peasy. IQ fields are small, take less than 10 minutes to set up, game elements are cheap, and half the parts are legal components that can be reused to build robots next season. Each one of your clubs should be able to bring a field and elements. Put them on the floor if you don’t have tables.

We run a 24-team event with 4 tables and we get 1-minute turn times. Roboteers are just exhausted after playing 12 matches. With 12 matches, the odds of being paired with only lower-performing robots are lower. Plus, there are 3 low-scoring matches that come off the averages.

My big 60-team event runs on 8 fields; divisions are your friend.

Your great roboteers are missing a chance. What can THEY do to help other teams? Help put a hook on the front of the claw to help lift green cubes onto platforms?

I’d be very sad to say to a team “sit there, don’t do anything”. There isn’t any way that can inspire roboteers.

5 Likes

If 6 matches are too few, and that is indeed the root of the problem, then VEX tournament rules should cover this, specifying a minimum fraction of other teams present that each team must be paired with in Q matches.

For example, last year, the same tournaments had a cap of 24, and each team got 8 qualifying matches. That meant you were paired with 1/3 of the other teams, at random, which is a beefy sample.

This year, there were 43 teams, and each team got 6 qualifying matches, meaning they were paired with roughly 1/7 of the other teams. That’s less than half of last year’s coverage.
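The coverage figures above are just matches per team divided by the number of possible partners (assuming no repeat pairings):

```python
# Fraction of the other teams each team actually gets paired with.
def coverage(matches_per_team, teams_at_event):
    return matches_per_team / (teams_at_event - 1)

print(coverage(8, 24))   # ~0.35, about 1/3 (last year: 24 teams, 8 matches)
print(coverage(6, 43))   # ~0.14, exactly 1/7 (this year: 43 teams, 6 matches)
```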

Looking at the rules, I can’t find anything preventing each team from only having one qualifying match each.

Yes, we can “vote with our wallets” and avoid tournaments that have the most random outcomes. However, so many other details are specified and controlled by the rules (for good reason) that this factor might as well be specified and controlled too.

Also, the local tournament in question runs a very tight ship and a very WATCHABLE tournament all the way through. That seems to be their focus. All the Q matches are held in a stage-type area with stands nearby for the audience. The kids aren’t off in some corner running Q matches that nobody sees. It’s all happening front-and-center.

They have 3 teamwork fields up there, but they don’t run the matches in parallel. They run one table at a time, round-robin, and manage one match every 2 minutes, 30 matches per hour, and stay on schedule all day.

With such speedy turnaround times, how do teams do pit repairs, change batteries, and get practice and skills runs in?
We recently participated in an event with 27 teams, running 3 fields, and the drivers practically spent the entire day in the queue. If there were robot repairs to be done, they happened on a lap next to the field. They only got 3 of their 6 skills runs in before the skills fields closed.
I’m all for maximizing play time, but when VIQC has sideshows going on, it’s hard to have the best product on the field at all times, and even harder to evolve a robot or strategy at an event.

The pits are right next to the fields. We run one match at a time and teams seem to have time to fix things. TM lets you bypass matches, so we can skip a match to wrangle a team to the table and run them later (i.e., A, B, skip C, D, A, then C, then B, C, D, …).

Remember, 1-minute turn times are from match to match; the actual gap between a given robot’s matches may be 8-12 minutes.

Skills are run twice a day on the same set of match tables, the only difference is we run all 4 tables at the same time. (Shotgun start)

1 Like

Foster, sounds like you have 4 tables, but are only running 1 match every 2 minutes, or 30 per hour, just like our local tournament (they have 3 tables and are running one match every 2 minutes).

The difference on your end is simply fewer teams (24 vs 43).

Again, I emphasize that our local event felt forced to raise its entry cap this year (up to 50, from 24 last year) because of the death of the Early Bird registration option. They wanted to make sure there were enough slots for all their local teams, even if remote teams jumped the gun and registered for all three tournaments.