Spellweaver was the first game I ever played competitively. I had some arguments about card balance on the forums, and I was always mad that seemingly everyone in the community upheld statistics as the supreme arbiter of good game balance. I was kind of a fool myself back then, so I could never really articulate their fallacies as well as I can now. From what I've seen, Spellweaver's community was partly an anomaly, and other game communities aren't as deep into this perverse valuation as it was, but it's still a common attitude that's worth debunking.

Imagine that the Dream Reactor devs look at their stats and see that players who play Cataclysm near the top of the ladder win 50% of the time. Is this a good indication that Cataclysm is not overpowered?


Besides the question of how they're counting Cataclysm mirrors (I don't remember them ever specifying), it could be the case, and this is not far off from how it was during part of my era, that there are two main decks running around at the top of the ladder, one revolving around Cataclysm and the other around another broken card. Both cards are overpowered to the point that few other decks see much play at the top, so a disproportionate number of matches are between Vamp-Lamp (the first tyrannical Cataclysm deck I remember) and Hatebears (its biggest competitor during that time). The stats show both decks, as well as the key cards in each, winning about 50% of their matches, which is exactly what you'd expect when two equally broken decks mostly play each other. So the devs and the community assume that the cards are balanced, and aren't open to any reasoning showing that they're close to strictly better versions of other good cards.
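To put rough numbers on that scenario, here is a minimal sketch in Python. Every figure in it is made up for illustration: the meta shares, the 70% winrate against a generic "Fair deck", and the function name observed_winrate are my assumptions, not real Spellweaver data. It only shows how a deck that crushes everything fair can still land near 50% overall when most of its games are against the other broken deck or its own mirror.

```python
# Hypothetical numbers throughout; none of this is real Spellweaver data.
# Share of top-of-ladder games in which each deck shows up as the opponent.
META_SHARE = {"Vamp-Lamp": 0.45, "Hatebears": 0.45, "Fair deck": 0.10}

# WIN_PROB[a][b] = assumed chance that deck a beats deck b.
# Mirrors are counted as 50%, since one side wins and one side loses.
WIN_PROB = {
    "Vamp-Lamp": {"Vamp-Lamp": 0.5, "Hatebears": 0.5, "Fair deck": 0.7},
    "Hatebears": {"Vamp-Lamp": 0.5, "Hatebears": 0.5, "Fair deck": 0.7},
    "Fair deck": {"Vamp-Lamp": 0.3, "Hatebears": 0.3, "Fair deck": 0.5},
}


def observed_winrate(deck, include_mirrors=True):
    """Aggregate winrate of `deck` against opponents drawn by meta share."""
    wins = 0.0
    games = 0.0
    for opponent, share in META_SHARE.items():
        if opponent == deck and not include_mirrors:
            continue  # optionally drop mirror matches from the stat
        wins += share * WIN_PROB[deck][opponent]
        games += share
    return wins / games


if __name__ == "__main__":
    for deck in META_SHARE:
        print(f"{deck:10s} overall {observed_winrate(deck):.1%}, "
              f"vs fair decks {WIN_PROB[deck]['Fair deck']:.1%}")
```

With these assumed inputs, the two broken decks come out at roughly 52% overall even though each beats fair decks 70% of the time. The aggregate number mostly reflects who they get paired against, not how strong the cards are.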

Another example happened in Prismata, although to a far lesser extent. In the days of old (5EE) Wild Drone, most top players agreed that Wild Drone was extremely player-favoring: in many sets its presence gave a large advantage to player 1, and in many other sets a large advantage to player 2. But the lead developer said in at least one Reddit statement that he wasn't going to change it because "our stats showed that top-level Wild Drone games have an almost 50/50 winrate between player 1 and player 2". I'm sure his stats did. Unfortunately for the community, such stats can't tell the difference between a player-balanced unit and a coinflip. (To his credit, he did eventually change the unit.)
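The same point can be sketched with a few invented numbers. The per-set probabilities below are hypothetical, not Prismata data, and the names SWINGY_UNIT, BALANCED_UNIT, aggregate_p1_winrate and average_seat_edge are just labels I made up for the illustration. One unit hands a big edge to whichever seat the set happens to favor; the other is even in every set. Their aggregate player 1 winrates match, and only a per-set view tells them apart.

```python
# Hypothetical per-set chance that player 1 wins; not real Prismata data.
SWINGY_UNIT = [0.8, 0.2, 0.8, 0.2]    # big edge for one seat, which seat depends on the set
BALANCED_UNIT = [0.5, 0.5, 0.5, 0.5]  # even in every set


def aggregate_p1_winrate(per_set):
    """The headline stat: player 1's winrate averaged over all sets."""
    return sum(per_set) / len(per_set)


def average_seat_edge(per_set):
    """Mean distance from 50% across sets: how lopsided a typical game is."""
    return sum(abs(p - 0.5) for p in per_set) / len(per_set)


if __name__ == "__main__":
    for name, unit in (("swingy", SWINGY_UNIT), ("balanced", BALANCED_UNIT)):
        print(f"{name:8s} aggregate P1 winrate {aggregate_p1_winrate(unit):.0%}, "
              f"average seat edge {average_seat_edge(unit):.0%}")
```

Of course, a developer could compute the per-set breakdown too, but that takes knowing in advance that the set is the variable worth splitting on, which is something you learn from playing and reasoning about the unit, not from the headline number.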

None of this is to suggest, of course, that statistics are useless or that the devs of a competitive game shouldn't gather them. Statistics are useful for quickly looking at a large amount of data. But when you treat them as the supreme arbiter, which they aren't, they become actively counterproductive: they can never truly isolate the variable you're interested in, so they're always inconclusive and easily misleading, and they never provide a complete picture of what it's like to play your game. Pro-and-con comparisons of game elements, and the experience of playing, are the most important tools for balancing a game.

(Neither of those tools alone is sufficient either, because the experience of playing also doesn't isolate the variables you're interested in and provides only a very limited range of personal experience, while out-of-game comparisons require weighing how much each factor is worth, which is only reliable when your intuition is informed by experience and skill at the game.)

This page was last modified 2020 May 08, Friday, 00:56 (UTC)