The misguided use of statistics as an indication of game balance

last edited 2024-06-06

Spellweaver was the first game I ever played competitively. I had many arguments about card balance on the forums, and I was always mad that the community upheld statistics as the supreme arbiter of good game balance. Other game communities aren't as deep into this perverse valuation as Spellweaver's was, but it's still a common attitude worth debunking.

Imagine the Dream Reactor devs look at their stats and see that players who play Cataclysm near the top of the ladder win 50% of the time. Is this a good indication that Cataclysm is not overpowered?

No!

It could be the case - and this is not far from how it was during part of my era - that there are two main decks at the top of the ladder, one of which revolves around Cataclysm and the other around another broken card. Both cards are overpowered and few other decks are seeing much play at the top, so a disproportionate number of matches are between Vamp-Lamp (the first tyrannical Cataclysm deck I remember) and Hatebears (its biggest competitor during that time). The stats show both decks - as well as the key cards in each - winning about 50% of their matches, so the devs - and the community - assume that the cards are balanced and aren't open to any reasoning that shows the cards are close to strictly better versions of other good cards.
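To put rough numbers on that scenario, here is a small sketch. Every figure in it is invented for illustration (the meta shares, the matchup probabilities, and the catch-all "Other" deck are my assumptions, not real Spellweaver data). It just computes each deck's overall winrate as its matchup winrates weighted by how often it meets each opponent.

```python
# All numbers below are hypothetical, chosen only to illustrate the argument.
# Share of top-ladder matches in which each deck shows up as the opponent.
meta_share = {"Vamp-Lamp": 0.45, "Hatebears": 0.45, "Other": 0.10}

# win_prob[a][b] = assumed probability that deck a beats deck b.
win_prob = {
    "Vamp-Lamp": {"Vamp-Lamp": 0.5, "Hatebears": 0.5, "Other": 0.8},
    "Hatebears": {"Vamp-Lamp": 0.5, "Hatebears": 0.5, "Other": 0.8},
    "Other":     {"Vamp-Lamp": 0.2, "Hatebears": 0.2, "Other": 0.5},
}


def overall_winrate(deck):
    """Winrate the stats page would report: matchups weighted by opponent frequency."""
    return sum(meta_share[opp] * win_prob[deck][opp] for opp in meta_share)


for deck in meta_share:
    print(f"{deck}: {overall_winrate(deck):.0%}")
```

Under these assumptions both dominant decks report an overall winrate only a few points above 50%, even though each beats everything else 80% of the time. The headline number mostly reflects the two decks playing each other.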

Another example happened in Prismata, although to a lesser extent. In the days of old (5EE) Wild Drone, most top players agreed that Wild Drone was extremely player-favoring; in many sets its presence gave a large advantage to player 1, and in many other sets a large advantage to player 2. But the lead developer said he wasn't going to change it because "our stats showed that top-level Wild Drone games have a near 50/50 winrate between player 1 and player 2". I'm sure his stats did. Unfortunately, such stats can't distinguish between a player-balanced unit and a coinflip. (To his credit, he did eventually change the unit.)
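The same arithmetic shows why the 50/50 figure says so little. The sketch below uses made-up numbers (the half-and-half split of sets and the 80/20 seat advantage are assumptions, not Prismata data) and compares the aggregate player 1 winrate with how often the favored seat wins.

```python
# Hypothetical split of Wild Drone sets: (share of games, player 1 win probability).
set_types = [
    (0.5, 0.80),  # sets where the unit heavily favors player 1
    (0.5, 0.20),  # sets where the unit heavily favors player 2
]

# What an aggregate stats page would show: player 1's winrate across all sets.
aggregate_p1_winrate = sum(share * p1 for share, p1 in set_types)

# What players actually experience: how often the favored seat wins.
favored_seat_winrate = sum(share * max(p1, 1 - p1) for share, p1 in set_types)

print(f"Aggregate player 1 winrate: {aggregate_p1_winrate:.0%}")
print(f"Favored seat winrate:       {favored_seat_winrate:.0%}")
```

With these numbers the aggregate comes out at an even split while the favored seat wins four games out of five. A unit that is genuinely fair in every set would produce the same aggregate.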

This is not to suggest that statistics are useless or that the devs shouldn't gather them. But because they can't isolate the variable you're interested in, they're always inconclusive and open to interpretation, and they don't provide a complete picture of what it's like to play the game. Pro-and-con comparisons of game elements and the experience of playing are the most important tools for balancing a game.