Notes on the Team Rankings

compiled by Jon Bruschke of CSU, Fullerton

WHY?

 

NOTE: These are my own thoughts and do not represent a community consensus.  There are many critics of this ranking system and many points they make are insightful and valid.  This discussion serves as an argument for the system; its validity is certainly not above debate, and I welcome discussion, critique, and challenge as a way to improve the system.  In the end, I believe the system adds interest, especially in the JV and novice divisions, and I will defend it as a useful approach that is not, at present, a substitute for district or NDT rankings.

 

There are rankings of schools with NDT points and CEDA points, but no rankings of individual teams, except for the pre-bid rankings to the NDT.  The team rankings I offer are a supplement to those ranking systems.  With all tournament results entered into a single database, there are a number of calculations that can be made that would have heretofore been too labor intensive to try.  I offer them not as a definitively superior system, but as a starting point so that we as a community can start imagining new and different ways of measuring success.  One kind of cool thing is that it allows the comparison of Novice and JV teams in a way that has almost never happened heretofore. 

In 2002-2003, this system correctly predicted 15 of 16 first-rounds.  The team on this list that didn't get a bid was NYU GG, who won CEDA nationals and got to the octos of the NDT where they were also the 15th seed.

There are 2 basic calculations:  One is a raw score that awards points based on a team's finish at a tournament and the quality of that tournament (called the "raw Bruschke score").  The second is a broader score that uses the raw scores and other information, like elim win percentages, record against the top 25, etc.  These latter scores I'm calling the debate RPI.

 

THE RAW SCORE: HOW IS IT CALCULATED?

 

The basic assumption of the system is that the best measure of tournament quality is tournament size.  The more teams, the better the tournament.  While this may not be a perfectly valid assumption, I will mention that (a) teams, including the good teams, go to tournaments that are well run, (b) tournaments with good competition tend to attract more good competition, and (c) I don't believe that anyone can seriously name a time they went to a 200-team tournament that really sucked.  More on this in a later section, but for now I will mention that the system is only as good as this assumption.

 

The first calculation is to compute a percentile finish for each team at each tournament.  If there are 121 teams at a tournament, they are ranked 1-121, and assigned a percentile finish.  Obviously, first and second place are determined by who won the final round, third and fourth are determined by ranking the semi-finalists by normal criteria (prelim wins first, then adjusted points, then total points), fifth through eighth by ranking the quarter-finalists, and so on, finally ranking all the teams that didn't clear.  The team in first gets a percentile rank of 100%, and each team gets a percentile by dividing their finish by the total number of teams (team 17 at a 121-team tournament, for example, gets an 86% score).  This is, by the way, how the SAT is scored.
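The percentile step can be sketched in a few lines of Python. This is only an illustration, not the code behind the rankings: the function name is made up, and the formula (one minus finish divided by field size, with the champion fixed at 100%) is an assumption chosen because it reproduces both figures quoted above.

```python
def percentile_finish(rank: int, num_teams: int) -> float:
    """Percentile for a team finishing `rank`-th (1 = champion) in a field of `num_teams`."""
    if not 1 <= rank <= num_teams:
        raise ValueError("rank must be between 1 and num_teams")
    if rank == 1:
        # The text states that the team in first gets 100%.
        return 100.0
    # Assumption: "dividing their finish by the total number of teams"
    # is read as 1 - rank / num_teams, expressed as a percentage.
    return 100.0 * (1 - rank / num_teams)


# Team 17 at a 121-team tournament, as in the example above.
print(round(percentile_finish(17, 121)))  # 86
```

Other readings are possible (for instance, counting the team itself among those it finished ahead of), and they shift the result by less than one percentile point at a tournament of this size.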

 

The next step is to have tournaments weighted.

 

For the total points score, each tournament is given a weighting.  The best-attended tournament of the year receives a weighting of 1.00, and each other tournament is weighted by dividing the number of teams at that tournament by the number of teams at the best-attended tournament.  For example, if the best-attended tournament had 211 teams at it, a tournament with 113 teams would have a weighting of 113 divided by 211, or .54.  Points are calculated by multiplying the percentile rank at each tournament by the weighting for that tournament and summing the results.  Here's an example:

 
Percentile Finish    Tournament Weight    Score
84.23                1.00                 84.23
90                   .73                  65.7
81.31                .65                  52.85
97.3                 .93                  90.49
Total:                                    293.27
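For readers who want to check the arithmetic, here is a minimal Python sketch of the weighting and the total. The function names are invented for illustration; the numbers are the ones from the example above.

```python
def tournament_weight(num_teams: int, largest_field: int) -> float:
    """Weight of a tournament relative to the best-attended one of the year."""
    return num_teams / largest_field


def raw_score(results) -> float:
    """Sum of percentile finish times tournament weight over (percentile, weight) pairs."""
    return sum(percentile * weight for percentile, weight in results)


# A 113-team tournament in a year when the largest tournament had 211 teams.
print(round(tournament_weight(113, 211), 2))  # 0.54

# The four tournaments from the example table.
season = [(84.23, 1.00), (90, 0.73), (81.31, 0.65), (97.3, 0.93)]
print(round(raw_score(season), 2))  # 293.27
```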

 

IS TOURNAMENT SIZE THE BEST MEASURE OF TOURNAMENT QUALITY?

 

Well, maybe not, but I think it's the best measure that we have.  See points (a), (b), and (c) above.  The system breaks down if a tournament has 100 crappy teams attend, but I again assert, as an empirical point, that this doesn't really happen.  I believe that teams start to attend well-run tournaments, which raises the quality of competition, and then more teams start to attend to debate against the good competition.  So if you hit a tournament with a size of about 60 teams, you have one where there's excellent regional competition with a good national draw, and if you hit over 100 teams, then everybody who's anybody is there.

 

WHAT ABOUT ROUND ROBINS?

 

I hate round robins.  The best teams get together to debate each other and get better, and nobody else can get as good as those teams are getting because they can't get a bunch of consecutive rounds against good teams.  Invitations to round robins also always leave out deserving teams, and you can't really say that a deserving team that was left out should miss out on the points it would have earned had it been invited.  All the same teams go to all the same major invitationals anyway and should be debating each other from octos on if they really are the best teams in the nation.  In my view, little is lost by not including round robins in ranking systems, and much is to be gained in community inclusiveness by not having them at all.

 

In the current ranking system, round robins count the same as any other small tournament.

 

I will begrudgingly say that there IS a way that they can be incorporated into the current system.  Simply make the weighting for each tournament depend on two things instead of one:  The number of teams at the tournament AND the average point totals for the teams attending the tournament.  Each factor could count equally or some unequal weighting could be generated (tournament weights could, for example, depend 75% on the average points of teams in attendance plus 25% based on size).  This is not done here due mostly to my basic dislike for round robins, but analytically it poses no problem.  With the addition of the Win Quality Index (see below), wins at round robins count just like wins against any other good team.
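To make the idea concrete, here is a rough Python sketch of a two-factor tournament weight. It is hypothetical, since the current system does not do this. The function name, the choice to scale average points against the highest tournament average of the year, and the sample numbers are all assumptions made for illustration.

```python
def blended_weight(num_teams: int, avg_points: float,
                   largest_field: int, highest_avg_points: float,
                   quality_share: float = 0.75) -> float:
    """Blend a field-quality factor and a size factor into one weight between 0 and 1."""
    # Size factor: the same ratio the current system uses on its own.
    size_part = num_teams / largest_field
    # Quality factor (assumed scaling): average points of the teams attending,
    # relative to the highest such average at any tournament.
    quality_part = avg_points / highest_avg_points
    return quality_share * quality_part + (1 - quality_share) * size_part


# Made-up numbers: an 8-team round robin with the strongest field of the year,
# in a season whose largest tournament had 211 teams.
print(round(blended_weight(8, 400.0, 211, 400.0), 2))  # 0.76
```

Under the 75/25 split mentioned above, a tiny tournament with a very strong field would carry a weight close to .75, compared with roughly .04 under size alone.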

 

WHAT ABOUT THE NDT?

 

It's true that the NDT is NOT one of the 5-6 largest tournaments of the year, and this system that weights tournaments solely by size may not weight it as heavily.  However, there are three points to be made.

 

First, it doesn't really matter what the rankings are AFTER the NDT.  Much as nobody cares about the coaches' poll when the NCAA basketball tournament is over (because you know who the national champion is and no longer have to rely on polls) you don't really need rankings AFTER the NDT because you know who the champion is.

 

Second, the NDT is NOT a small tournament.  Its weighting would be fairly substantial, although not determinative in the end-of-the-year rankings.

 

Third, there are at least two ways the system could be altered to incorporate the NDT (neither of which is used in the current system).  (a) The NDT could be assigned a weight equal to or even larger than that of the largest tournament of the year.  Since the top weight is always 1.0, the NDT could be given a weight of 1.0, or even a weight as large as 2.0 (making it count twice as much as the largest tournament).  (b) If the quality of competition weightings, as described in the round robin section above, were adopted, the NDT weighting would shoot right up there.

 

CONCLUDING THOUGHTS

 

Undoubtedly, this project will strike some as elitist.  I guess in some ways it is.  My only defense is that we are pretty much an elitist activity at our core -- virtually every tournament declares a champion and calls one team better than all the rest.  We have at least 2 major tournaments to crown national champions.  Part (but only part) of the value of our activity comes from its competitive nature, and part of competition is that when someone wins someone loses.  Whether there is a Bruschke ranking system or not, there will be attempts to figure out who the best teams are at various points in the season, if only for invitations to the vice-ridden round robins.

 

Right now, those decisions are made in ways that involve either politics, gut instincts, or judgment calls.  What this system introduces is an attempt to find an objective way to rank the teams, one that doesn't rely on who you know, who you drink with, how you did last year, or what high school camp you went to.  It depends exclusively on how you've finished at the tournaments you've attended.  It isn't perfect, and it won't correct the other imbalances in our community, but it is an attempt at an equitable way to rank the teams during the course of the season.  I hope it will stimulate discussion of what our community is and what we should be about.

THE BCS OF DEBATE: The Debate RPI score

 

The raw Bruschke Points are an attempt to rank teams (rather than schools) based on tournament finishes and nothing else.  Basically, teams get points based on (a) their finish at tournaments, and (b) the tournament size.  The best you could do was win the largest tournament of the year.  I think they remain a useful index of overall team quality.

 

It is, however, possible to do quite well at a tournament while missing the bulk of the competition.  The move to use “opponent wins” as a tie-breaker for seeding has revealed that there is often a vast difference in the quality of competition that teams with equal finishes faced.  The debate RPI (in 2002, the Bruschke Points 2) compensates for this and adds some measures of opponent strength.

 

The funky new addition is a “Win Quality Index,” which is a measure of the strength of your best wins.  It begins by selecting a number of wins to count based on a sliding scale depending on the number of debates that have happened at a given point in the season.  It finds the team with the most wins and makes 40% of their wins the number counted.  For example, if the team with the most wins in the country has 50 wins, 40% of that is 20 wins, and so the WQI counts everybody’s best 20 wins.  Opponent strength is measured by the original Bruschke Points (see the accompanying explanation for how those are calculated).  The function would sum the Bruschke Points for the 20 opponents with the most Bruschke Points.  In one sentence, the WQI is: “The summed opponent Bruschke Points for the best 40% of their wins.”  (Slightly inaccurate – it’s 40% of the wins that the team with the most wins has, not the team in question).
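Here is a small Python sketch of the WQI as described above. The function names are invented, rounding the 40% figure down is an assumption (the text does not say how fractions are handled), and so is treating every win as its own entry even when the same opponent is beaten twice.

```python
def wins_to_count(win_totals, percent: int = 40) -> int:
    """Number of wins counted: 40% of the highest win total in the country (rounded down)."""
    return max(win_totals) * percent // 100


def win_quality_index(beaten_opponent_points, n_counted: int) -> float:
    """Sum the raw Bruschke Points of the strongest `n_counted` opponents a team beat."""
    best = sorted(beaten_opponent_points, reverse=True)[:n_counted]
    return sum(best)


# If the winningest team in the country has 50 wins, everybody's best 20 count.
print(wins_to_count([50, 22, 31]))  # 20

# Toy case with made-up point totals, counting only the best 2 wins.
print(win_quality_index([300.0, 100.0, 250.0], 2))  # 550.0
```

A team with fewer wins than the number counted simply has all of its wins summed, which is one way of seeing why the index favors teams that debate more.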

 

Like the original Bruschke Points, the WQI favors teams that debate more but does so in a less direct way.  If the WQI is counting the top 20 wins, a team with 50 wins will have a considerable advantage over a team with only 22 wins.  Nonetheless, if a team has only 22 wins but beat 18 of the top 20 teams in the country they have a shot.  I believe that this is as it should be – if you are debating less often, you need more quality wins to prove you belong at the top.

 

However, relying exclusively on a measure of opponent win strength disadvantages teams at the very top of the bracket, since they will have the best records and they can’t beat themselves.  In addition, since they are winning the most, they are bumping other teams out of the tournaments and lowering those teams’ records in prelims, creating fewer opponent wins in traditional terms and a lower WQI in Bruschke Points 2 nomenclature.  What is needed is a measure that accounts BOTH for finishes at tournaments AND opponent strength, or at least a measure of how teams fare against really good opponents.

 

With the WQI in hand, the rest of the Bruschke Points 2 are calculated fairly easily.  A team is given a percentile finish on each of five different measures (which standardizes the units across measures), and those percentiles are weighted as follows:

 

Original Bruschke Points: 35%

Win Quality Index: 35%

Elim round record: 10%

Record against the Top 25 Bruschke Point teams: 10%

Overall win percentage: 10%

 

The result is a scale where all teams are ranked between 0 and 1, with 1 being the highest score.
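The weighted combination can be sketched in Python as follows. The dictionary keys and the function name are made up for illustration, and the sample percentiles are invented; only the five weights come from the list above. Percentiles are expressed here on a 0-to-1 scale so that the result also falls between 0 and 1.

```python
# Weights from the list above; the key names are hypothetical labels.
RPI_WEIGHTS = {
    "raw_points": 0.35,         # Original Bruschke Points
    "win_quality_index": 0.35,  # Win Quality Index
    "elim_record": 0.10,        # Elim round record
    "record_vs_top_25": 0.10,   # Record against the Top 25 Bruschke Point teams
    "win_percentage": 0.10,     # Overall win percentage
}


def debate_rpi(percentiles) -> float:
    """Weighted sum of a team's percentile (0 to 1) on each of the five measures."""
    return sum(weight * percentiles[name] for name, weight in RPI_WEIGHTS.items())


# Invented percentiles for one team.
team = {
    "raw_points": 0.90,
    "win_quality_index": 0.80,
    "elim_record": 0.70,
    "record_vs_top_25": 0.60,
    "win_percentage": 0.50,
}
print(round(debate_rpi(team), 3))  # 0.775
```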