If you keep up on the gaming press, you’ve probably read about various industry figures arguing about next gen costs and quality. There’s Midway marketing guy Steve Allison arguing that 93% of new IP games fail (followed by this retort), Blast Entertainment CEO Sean Brennan decrying next gen console development costs, and industry analysts divided over the amazing success of the Nintendo Wii in spite of its comparatively underpowered graphics hardware.
Developers and marketers alike are struggling with the idea that the costs associated with next generation game development may make it unprofitable, which I’ve ranted about before. Next gen is a high-risk environment right now, and as previously discussed, risk means a dearth of innovative or niche games.
Game companies have a few problems. First and foremost, developing games for Xbox360 and PS3 is way expensive. Secondly, developing for the Wii is easier and cheaper, but it means you have to compete with Nintendo’s first party games, which are always powerful market forces. Finally, since development of any next gen game will take several years, companies are having to place a bet on which consoles are likely to be the most profitable in 2009. Of course, it’s possible to do games in less time than that, but usually the quality suffers dramatically. So the real question here is how much can quality suffer without impacting sales? Some people believe that there’s no correlation between quality and sales, and thus think that the way to make money is to make things that are easily marketable (read: licenses). Game developers themselves usually argue that sales above a certain level require a game to be of sufficient quality. I decided to see which of these perspectives was correct for the Playstation 2 era.
Quality vs Sales
It is time for some graphs. I took sales data from December 2006 for PS2 games and correlated it with each game’s score from Metacritic.com. About 500 games were thrown out because they were not released in North America or had no Metacritic ranking. I used the remaining 1281 games in the data set to look for a correlation between game quality (as defined by Metacritic’s score) and sales.
This graph shows the data as a whole, with number of units sold on the horizontal axis and Metacritic rating on the vertical axis. Most of the data is scrunched way over to the left side of the graph because most games sold an order of magnitude fewer units than games like Grand Theft Auto 3. You can see a curve in the data, though: as more units are sold, fewer and fewer games rank below 80%. This initial view is encouraging; it seems to suggest that there may be a correlation between sales and quality after all. To get more information we need to cut out the outliers. Capping our graph at two million units shows the curve a little more clearly.
There is clearly a trend in this data: no game that sold over 1 million units scored less than 60%. Though the distribution between 60% and 90% is fairly random, the lack of titles below 60% past the 1 million mark means that really bad games have an upper bound on sales regardless of the marketing or license applied to the title. This graph has some problems, however. Most of the data points are still clumped together on the left side of the graph, an area we can call the Problem Zone. It is a problem because the points are packed too densely to tell what is really going on in there. We need to zoom in further.
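The “no bad million-sellers” observation amounts to computing a quality floor above a sales threshold. A minimal sketch, again with invented placeholder data rather than the real PS2 list:

```python
# Lowest Metacritic score among titles that cleared a sales threshold.
# The `games` sample is invented for illustration.
def quality_floor(games, threshold):
    """Return the minimum score among games selling >= threshold units,
    or None if no game reached the threshold."""
    scores = [score for units, score in games if units >= threshold]
    return min(scores, default=None)

games = [
    (6_500_000, 97), (1_200_000, 61), (450_000, 72),
    (300_000, 38), (90_000, 90), (40_000, 48),
]

print(quality_floor(games, 1_000_000))  # floor among million-sellers -> 61
```

On the real data set this floor sits at 60% for the 1-million-unit threshold, which is exactly the trend visible in the graph.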
If we limit the view to 800,000 units, the Problem Zone gets a little clearer. We can still see the trend towards higher scores as the units increase, but the distribution also becomes more random, which means that some games with rankings in the 50% range are still able to sell around 500,000 units. This may be where marketing is flexing its muscles, though we can see from the overall shape of the graph that marketing alone can only take a game so far. Even at this resolution the Problem Zone is pretty noisy, which means that there are a lot of games that did not sell very well on PS2.
This is a close-up of the Problem Zone. We can see that at this range (between 0 and 300,000 units) the distribution of scores to units is almost random. All of the games in this range sold fairly poorly (though some may still have been profitable depending on how much was spent on development), and they represent the entire spectrum of scores. The randomness of this distribution means that below 300,000 units, the marketing people are absolutely right: there is no correlation between sales and quality. We see plenty of games scoring 90% and higher that sold just as well (or as poorly) as games that scored below 50%. What this graph does not tell us is why bad games sold; it only shows that Metacritic rating was not a major factor.
The other interesting thing about this last graph is the number of data points. The graph is much, much denser than the ones before it, which means that most PS2 games fall within this range. In fact, even though we have zoomed way in to look at the Problem Zone, we can still see a cluster below the 100,000 mark. This means there are a whole lot of games that never even moved 100,000 units, making them almost certainly financial failures.
So what does all of this information mean? Here are my conclusions:
- Any game can fail, regardless of its quality. There are a great many games at the low end of the graph, and some of them received extremely high scores. Making a high quality game is therefore not an automatic guarantee of financial success.
- However, bad games have a much more difficult time succeeding. While making a high-quality game does not assure that a lot of units will sell, making a low-quality game does cap the number of units that can be sold.
- There are no bad games that sold over a million units.
- Of the 19 games that sold over two million units, only one received a score of less than 80%.
- If we assume that game scores are assigned independently of marketing budgets, we can see that there is an upper bound for marketing’s influence on sales; if there were no bound, we could expect to see many more bad games selling past the 500,000 units mark.
- However, I think we can also assume that even good games cannot succeed without excellent marketing. If sales were driven by quality alone, there would not be any high-scoring games in the Problem Zone portion of the graph.
- A huge number of PS2 games (about 45%) failed to ship more than 100,000 units.
This means that there is a correlation between game quality and sales, which can be stated thusly: bad games do not sell. This does not mean that good games always sell, just that bad games cannot be saved by marketing. The data also suggests that the games that sell the most have to be not only really good but also heavily marketed. The conclusion is not that marketing is irrelevant, only that its powers are limited without the help of high quality game play. Developers who want to sell units should be striving to make good games if only because quality will allow their marketing department to actually be effective.
Since I went to all the trouble of compiling this data, I figured I could get a few more graphs out of it before this article is done.
Here is a comparison of unit sales. As stated in the previous section, a lot of games failed to ship over 100,000 units, and about 80% of titles released for PS2 shipped less than 300,000 units. Depending on the cost of development and the sticker price of the game in stores, these games likely generated very little profit, if any. By next generation budget standards, these games are all abysmal failures.
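The unit-sales breakdown is just a threshold count. A sketch with invented per-title figures (the article’s real numbers are roughly 45% under 100,000 units and 80% under 300,000):

```python
# Fraction of titles selling fewer than a given unit cap.
# The `sales` list is an invented sample, not the real data.
def share_below(unit_counts, cap):
    """Fraction of titles that sold fewer than `cap` units."""
    return sum(1 for u in unit_counts if u < cap) / len(unit_counts)

sales = [6_500_000, 1_200_000, 450_000, 300_000,
         250_000, 90_000, 80_000, 40_000]

print(f"under 100k: {share_below(sales, 100_000):.0%}")
print(f"under 300k: {share_below(sales, 300_000):.0%}")
```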
This graph compares the average number of units sold against score ranges. It suggests the same conclusion that we came to above, but it is a little less accurate because it deals with averages (especially in the 90% range, where the GTA games really bias the result). Still, the message to developers should be clear: good games have a much better chance of selling than bad games.
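The bracket averages behind a graph like this come from grouping titles into 10-point score brackets. A sketch with invented data; note how a single GTA-sized hit drags the 90s-bracket mean upward, which is exactly the bias described above (a median would be the more robust statistic):

```python
# Mean units sold per 10-point Metacritic bracket.
# The `games` sample is invented; one huge 90s-bracket hit skews its mean.
from collections import defaultdict
from statistics import mean

def avg_units_by_bracket(games):
    """Mean units sold per 10-point score bracket (70 = 70-79, etc.)."""
    buckets = defaultdict(list)
    for units, score in games:
        buckets[(score // 10) * 10].append(units)
    return {bracket: mean(v) for bracket, v in sorted(buckets.items())}

games = [
    (12_000_000, 97), (400_000, 92),
    (600_000, 85), (200_000, 83),
    (150_000, 74), (90_000, 71),
]
print(avg_units_by_bracket(games))
```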
This last one is the distribution of rankings across all 1281 games. This is probably more of a commentary on game journalism than anything else. It shows that most games score in the 70% range in aggregate, and that there is almost a bell curve with 75% at the peak. Ratings lower than 60% are generally meaningless, as all the reviewer needs to communicate to the reader is that the game is not worth buying.
What I think is interesting about this graph is the drop off between the 70% range and the 80% range. Many game developers believe that game reviewers subconsciously abide by a rule called the “80% Divide,” which stipulates that a game must impress the reviewer in some way to achieve a rating of 80% or higher. If the game has no major flaws and yet fails to impress the reviewer because it is not “new” enough, it will often receive a score of 70%. Games that are broken in some way but have some impressive aspect or feature can still make it into the 80% range (like Indigo Prophecy, for example). This graph seems to suggest that this “80% Divide” represents a real bias amongst the game journalism community.
So there you have it. I hope that this article adds to the interesting debate between game industry pundits about how games should be created, marketed, and sold for the next generation. I do not claim a side in this argument, but this research suggests that neither marketing nor the developers are wholly responsible for driving sales. That said, it also suggests that it is in the best interests of game developers to make high quality games.