Elimination Drafting for Policy Debate
Debate seeding is broken. Top teams should draft their opponents in elimination rounds.
The format of debate tournaments is not optimal. In most tournaments, all competitors, including the top teams, must “rank themselves” for the purposes of the bracket. The preliminary rounds’ ultimate purpose is to create a fair bracket for the elimination rounds.
Debate is uniquely high variance in that there is no objective scoring, no universal understanding of what arguments are to be presented, and no methodical way for judges to comprehend those arguments. There is already plenty of controversy within sports that have objective rules, point systems, and professional referees whose sole purpose is to determine the winner. Judging in debate, by contrast, is an obligation, not a privilege. Generally, the experience necessary to judge a debate is minimal: essentially, judges just need to be out of high school to judge high school, or out of college to judge college. Debate experience is generally a prerequisite as well, but how meaningful that experience is differs drastically from round to round.
The prelim system is thus generally unrepresentative of the actual strength of competitors. Six or seven rounds is simply not enough, given the high variance and pure chance, to accurately seed competitors. There are two reasons for this:
(1) The law of large numbers.
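Consider an example of a top, but not elite, team. Let’s suppose that they are at a tournament with six preliminary rounds. If they have a 70% chance of winning each prelim round, the number of wins x follows a binomial distribution, with approximately these probabilities at that tournament: 0 wins, 0.1%; 1 win, 1.0%; 2 wins, 6.0%; 3 wins, 18.5%; 4 wins, 32.4%; 5 wins, 30.3%; 6 wins, 11.8%.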
Presume this tournament clears all 4-2 teams and above. This team thus has a 74.431% chance of clearing, so for every 10 tournaments, they will clear at about 7. For every 10 tournaments, they will likely have 1 tournament where they go 6-0. So, in a normal competitive season schedule of 10 tournaments, they will fail to clear at 2 or 3 tournaments and be one of the top seeds at one. In a purely mathematical sense, not all 4-2 teams are created equal. This analysis doesn’t even account for bad luck in judging and pairings, which can increase the variance even further.
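For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation. It assumes, as the 74.431% figure implicitly does, that each prelim is an independent round with the same 70% win probability; the function name win_distribution is just an illustrative choice.

```python
from math import comb

def win_distribution(rounds, p_win):
    """Probability of each win total (0..rounds), assuming independent
    rounds that all share the same win probability (a binomial model)."""
    return [
        comb(rounds, wins) * p_win**wins * (1 - p_win) ** (rounds - wins)
        for wins in range(rounds + 1)
    ]

# The example team: six prelims, 70% chance of winning each one.
dist = win_distribution(6, 0.7)
for wins, prob in enumerate(dist):
    print(f"{wins}-{6 - wins}: {prob:.1%}")

# Chance of clearing when every 4-2 or better record breaks.
p_clear = sum(dist[4:])
print(f"4-2 or better: {p_clear:.3%}")
```

Changing the two arguments to win_distribution lets you try other tournament lengths or team strengths; the independence assumption is a simplification, since real pairings and judging are not independent draws.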
(2) “4-2 variance” and the failure of speaker points.
Let’s assume for a moment that all the teams that break are the best teams at the tournament, each finishing with its most likely number of wins (in the above case, this team would have 4 wins). Even then, there would be tremendous variance within seeding. 34.375% of teams should clear at debate tournaments under the ideal case (this is P(x≥4) with 6 trials and p of success = .5, i.e., a symmetric binomial distribution with a µ of 3), with 23.4% of total teams being 4-2, 9.4% of total teams being 5-1, and 1.6% of total teams being 6-0. There will be more 4-2 teams than 5-1 and 6-0 teams combined. But there is no effective way of distinguishing those teams in the 4-2 bracket. Speaker points attempt to do this, but speaker points also represent things about the team other than their p of winning. For example, strategically mediocre teams (i.e., p=.5) that have a lot of persuasion skills can be, if they are in the 4-2 bracket, unfairly seeded ahead of strategically superior teams (e.g., the p=.7 case).
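The bracket shares above, and the overlap between a p=.5 team and a p=.7 team at 4-2, can be reproduced with a short Python sketch. It uses only the numbers already in this post and the same simplifying assumption of independent rounds; prob_record is a hypothetical helper name.

```python
from math import comb

def prob_record(wins, rounds, p_win):
    """Probability of exactly `wins` wins in `rounds` independent rounds."""
    return comb(rounds, wins) * p_win**wins * (1 - p_win) ** (rounds - wins)

# Share of an evenly matched field (p = .5) finishing 4-2, 5-1, and 6-0.
even_field = {wins: prob_record(wins, 6, 0.5) for wins in (4, 5, 6)}
share_clearing = sum(even_field.values())
for wins, share in even_field.items():
    print(f"{wins}-{6 - wins}: {share:.1%} of the field")
print(f"clearing: {share_clearing:.3%}")

# Both an average team and a strong team land on exactly 4-2 quite often,
# which is why the record alone cannot tell them apart.
p_42_average = prob_record(4, 6, 0.5)
p_42_strong = prob_record(4, 6, 0.7)
print(f"P(4-2 | p=.5) = {p_42_average:.1%}, P(4-2 | p=.7) = {p_42_strong:.1%}")
```

Under these assumptions the p=.7 team finishes exactly 4-2 about a third of the time and the p=.5 team a bit under a quarter of the time, so identical 4-2 records routinely hide very different teams.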
Not only do speaker points have objectives other than separating teams within a bracket, but they also involve variance, perhaps even more variance than winning rounds. At the TOC last year, my speaker points had a mean, µ, of 29.23 and a standard deviation, σ, of .324. That’s extraordinarily large in an activity where tenths of speaker points can determine not only speaker awards but team rankings (also, in a tremendous example of the variance of debate, we dropped ASPEC round 1 and got our points nuked). Speaker points mean different things to different people: a 29.3 is not a universally understood concept, and core questions like “Should average speaker points at the TOC be lowered to account for the higher quality of debater?” are unanswered. The law of large numbers still applies to speaker points, too, so speaker points are essentially attempting to resolve the problem of variance with another high-variance model.
The general solution to high-variance models is to make them low variance by increasing n. However, that is logistically impossible in a debate tournament. While tournaments could push for 7 or 8 prelim rounds, the variance wouldn’t diminish to a negligible degree. That would require tens, if not hundreds, of trials. Additionally, these mathematically small differences in n make a tournament much more difficult for a debater: 6-7 rounds is the agreed-upon “sweet spot” where debaters don’t get too tired. Thus, it has been “accepted” that variance is an innate part of debate tournaments, something that is to be expected and then “weeded out” until the finals. However, as the 2023 Glenbrooks showed, the top three teams can hit each other in the octafinals (i.e., top 16). Such issues are common with such a low number of trials.
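To put rough numbers on how slowly the noise shrinks with n, here is a small Python sketch. It is my own illustration, not a standard debate metric: it uses the textbook standard error of an observed win rate, sqrt(p(1-p)/n), under the same assumption of independent rounds with a fixed p, and the function name is hypothetical.

```python
from math import sqrt

def win_rate_std_error(rounds, p_win):
    """Standard deviation of the observed win rate after `rounds`
    independent rounds with true win probability `p_win`."""
    return sqrt(p_win * (1 - p_win) / rounds)

# The p = .7 team from the earlier example, at various tournament lengths.
for rounds in (6, 7, 8, 50, 100):
    error = win_rate_std_error(rounds, 0.7)
    print(f"n = {rounds:3d}: observed win rate is 0.70 give or take {error:.3f}")
```

Going from 6 to 8 rounds only moves the spread from about 0.19 to about 0.16, while getting it under 0.05 takes on the order of 100 rounds, which is the sense in which a couple of extra prelims cannot fix the problem.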
The question then becomes: how can one decrease variance while keeping n the same? My solution: Draft elimination rounds.
A draft elimination system would entail having the highest seeds choose, in order, who they debate in elimination rounds. So, for example, in a doubles-breaking tournament (i.e., 32 teams), the top seeds get to choose who they debate from a diminishing pool of remaining teams. This process would continue until the finals. Such a system is common in esports and used in select cases in professional sports.
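One way to picture a single drafting round is the Python sketch below. It is only one possible reading of the idea: it assumes the highest remaining seed always picks next and may pick any team still in the pool, and the names draft_round, choose, and pick_lowest_seed are hypothetical. Nothing here reflects an existing Tabroom feature.

```python
def draft_round(seeds, choose):
    """Pair teams for one elimination round by draft.

    seeds:  team names ordered best seed first (must be an even count).
    choose: function (picker, available) -> opponent, a stand-in for
            whatever decision the picking team actually makes.
    """
    if len(seeds) % 2 != 0:
        raise ValueError("need an even number of teams")
    available = list(seeds)
    pairings = []
    while available:
        picker = available.pop(0)  # highest remaining seed picks next
        opponent = choose(picker, available)
        if opponent not in available:
            raise ValueError(f"{opponent} is not available to draft")
        available.remove(opponent)  # the pool shrinks after every pick
        pairings.append((picker, opponent))
    return pairings

def pick_lowest_seed(picker, available):
    """Example strategy: always draft the lowest seed still in the pool."""
    return available[-1]

seeds = [f"Seed {n}" for n in range(1, 9)]  # an 8-team quarterfinal field
pairings = draft_round(seeds, pick_lowest_seed)
for picker, opponent in pairings:
    print(f"{picker} drafts {opponent}")
```

If every team simply drafts the lowest remaining seed, this reproduces the usual 1 v 8, 2 v 7, 3 v 6, 4 v 5 bracket, so the conventional pairing is just one special case of the draft. Swapping in a different choose function models teams that pick on reputation or matchup instead.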
This method is logistically feasible for debate: a tournament could announce all breaking teams and their seeds, and then give each team 10 minutes to create a drafting strategy. Then, drafting could begin either physically or digitally on Tabroom. While this is harder to implement than automatic pairings based on prelim seeds, it allows the top teams to control their destiny. There are distinct advantages to such a system:
(1) It increases, in an immaterial way, the n value of the tournament. Teams have a perceptual understanding of how “truly” good another team is via past results. This understanding of the “true p” is more consistent and applicable to future results of the tournament than a selective snapshot of 6 or 7 rounds.
(2) It increases the incentive for elite teams to care about prelim rounds. Teams that are confident in their ability to clear don’t care about whether they’re the 6-0 team, the top 5-1 team, the second 5-1 team, etc., because brackets are basically random. They generally care about not being 4-2 so they don’t have a difficult first elimination round, but otherwise, there is no difference in difficulty given the randomness of prelim rounds and speaker points. Under a draft, though, the incentive to have the first pick becomes more important, especially as the tournament goes on. That strategic advantage would mean more exciting debates between 5-0 teams in round 6 of a 6-prelim tournament or 6-0 teams in round 7 of a 7-prelim tournament.
(3) It’s incredibly exciting. One can only imagine the “underdog” potential of the team first picked by the top team: if they won that round, it would perhaps be one of the most intriguing storylines of the entire season. The drama would be insane into the octafinal rounds: Imagine, for example, the effects on the coaches’ poll as the teams actually prove who deserves to make it further into the tournament, rather than having an easy or hard road to finals. Debaters trash talk, but in the shadows of Discord and Twitter DMs. Let’s make them say it out loud.
(4) It would incentivize specific strategies, particularly on the neg. Because of variance, it is difficult to determine before a tournament who to write a case neg to. But, if one knew that there would be more of a choice in who to debate, teams would be more incentivized to write detailed case negs as opposed to generics because the power is in their hands.
It’s difficult to think of disadvantages to this system. Perhaps it makes the lives of tournament organizers more difficult, but I don’t think it makes them impossible. For early adopters, it would add a flair to their tournament that might incentivize a bigger, better tournament, which could justify a higher tournament price (the hype factor of such a tournament would be greatly appreciated, at least as an experiment). Perhaps it would be “too competitive,” as if the debates that clear at national tournaments weren’t already. The biggest flaw might be that it could increase the K/policy divide, but (1) that seems non-unique and (2) good clash debaters on both sides could be incentivized to debate the other teams (my understanding is that most policy teams fear clash debates, and K teams would prefer them, so it would probably privilege K teams more; yay equality).
This has actually been done before, in some capacity. For many years, Emporia State hosted their college tournament with a “call-out” system in elims (George RR Pflaum on Tabroom). Unfortunately, with the demise of their program, it seems to have fallen into obscurity.
It’s an interesting and fun experiment (I participated in it myself), but I don’t think it resolves your issues about the arbitrariness of speaker points, or brackets, but rather magnifies them. Nor do I think it results in many of the purported benefits.
If a bracket of 32 teams has 10 6-0/5-1 teams and 22 4-2 teams, that still relies on arbitrary seeding because 5 of the 4-2 teams (seeds 11-15; seed 16 would not draft) would have the advantage of drafting their opponents, while the others don’t. Which 4-2s get to choose (or the order of the 5-1s, for that matter) still relies on tiebreakers such as speaker points, opponent win rate, etc. Unlike a typical bracket, though, the draft weights those tiebreakers more heavily, because teams that lucked out are arbitrarily granted more power, and teams that were close, but not there, have zero power (the 16th and 17th seeds, for example).
I also agree with the framing of writing tailored, specific strategies generally being good, but I don’t think this really leads to that.
First, I don’t think it incentivizes top teams to write tailored strategies more than they already would. The top 10 teams are still going to prioritize their prep vs the other top 10 teams, because those are the teams they are likely to debate in either system. But they would avoid choosing those teams for as long as they can. What’s the point of the top seed choosing another top 10 team in doubles? Clout, sure. But why risk losing in Doubles (in a bid round in HS, or too early to be considered for a First Round in college), when you could choose the worst team in Doubles and Octas, and then start debating the “best” teams in Quarters, when you already would’ve started to?
Second, having this power doesn’t mean you could accurately predict who your elim choices will be either, which means the top 10 teams would be better off writing tailored strategies to other top 10 teams, and the lower seeds would never be able to accurately predict a.) if they’re in elims, b.) if they’ll be able to draft teams IF they are in elims, and c.) who their options will be. Meaning, it doesn’t really lead to either subset of teams writing tailored strategies or prep, because it would once again be up to the whims of speaker point / arbitrary seeding.
Third, it doesn’t really increase the quality of debates. Again, top teams choosing worse teams, etc., but that’d be even more true in clash debates: if policy teams despise the K, they’ll avoid those debates. If K teams prefer them, they’d look for the best team to exploit, leading to worse quality debates. For example, a lot of HS debaters are horrible at answering the K, or understanding it. K teams would never choose to debate any policy team that would have a slight advantage in those rounds. They’d always choose the random sophomores who will drop everything on the flow because they don’t know what’s happening.
It would also be a logistical nightmare for tournaments. Online coin flips and strike cards are a great example: while simple in nature, they inevitably run into problems with teams entering their choices on time. I recently ran a tournament with both, and we had to manually enter strikes for multiple teams due to a missed deadline. Now take those problems to the level of drafting teams: just under half of the teams in elims would have to make sure they went on Tabroom at the right time and drafted teams in an order they were happy with, all before a pairing could be blasted. Tabroom staff would have to make many, many manual fixes to make sure it’s correct in a world where teams make mistakes or miss the deadline.
Any solution to this would require a decent amount of time dedicated in a schedule per elimination round, adding to the length of already long schedules.
It would also give an unfair advantage once again to the top seeds. In the first elim, the top seed would know exactly who they are debating for longer than their opponents (up to an alarming amount of time, depending on how much is allocated for a drafting period). This becomes even worse in later elims, particularly the first of a morning. The top seed would know they are the top seed, so they have all night to decide, while the team they choose would not know until that morning. While this already happens with current brackets, it’s at least equalized in that both teams get the large amount of prep, because they both know they are debating.
Apologies for the length of this comment; I kept having more and more thoughts as I went along. This is by no means an attempt to say your idea is bad. I quite like it, and am super glad someone is at least trying to come up with a solution.
I have concerns about how this could affect debaters of racial and gender minorities. The reason is the idea of "Teams have a perceptual understanding of how “truly” good another team is via past results". I know debaters and structural inequalities well enough to know this translates roughly to "teams get picked based on their reps". We can say it would be based on past records, but it's really based on their perceived debate ability. I believe this could result in teams avoiding K teams, which could even leave K teams pitted against each other earlier because they were not chosen (K v K debates don't make as much sense to initiate, and policy debaters are often scared of the K, especially antiblackness Kritiks). It is incredibly bad for minorities to be pitted against each other early, since that leaves fewer of them in late elimination rounds overall. The other possibility is that teams perceived as less strong because they are gender minorities or have a gender minority debater would be a more likely first pick for early-picking seeds. It should be a rather agreed-upon fact that there is a very large "dudebro" culture in debate, especially among top debaters in every event, and the disparity in wins of gender minorities vs men is glaring. If we are going by the premise that higher seeds are likely stronger debaters, this means stronger debaters pick gender minority debaters first, and debaters who are men are less likely to pick their friends, who are mostly men, to go against. This results in gender minorities having strictly harder routes to late elims. You could try to argue that the goal should be for them to win, but how is it fair to risk that gender minority debaters get picked out for harder brackets, which could only hurt gender minority success more? Not to mention that the embarrassment of being picked first, which would likely fall on a gender minority, only adds to this public perception.
These thoughts are early and scattered, but I believe we should make sure we consider minority voices in how we structure tournaments, especially given that debate is already so inaccessible to minority debaters. It would be really bad if the first teams chosen were teams from urban debate leagues with fewer resources or teams who are at their first tournaments of the season. While this system possibly leads to better incentives for top seeds, it adds embarrassment and possibly active harm to lower seeds.