Ratings Changes

Somebody set up us the skull

Ever since I started thinking about what game scores really mean, I’ve been bothered by the warped scale that the game review industry seems to use. If you follow sites like metacritic.com or gamerankings.com, you have probably noticed that most games fall into a very small range of scores. There’s no difference, for example, between a score of 30% and a score of 40%; both mean that the game is terrible and you shouldn’t buy it. On the other hand, there is a huge difference between 70% and 80%; one means that the game is so-so and the other means that it’s very good. Even the delta between 79% and 80% is enormous. And finally, ratings over 90% are just fodder for angry fanboys; is a 96% game really perceptibly better than a 95% game?

In short, the ratings curve for video games is extremely warped. Interestingly, aggregate review systems for other types of media don’t seem to have this problem; a review of 50% on rottentomatoes.com means what you would expect: that the film is thoroughly average.

Anyway, I had been thinking of changing my scoring system on this site to better reflect what I think reviewers mean when they select scores. I wanted to get away from actual numbers (too granular) and compress the range of possible scores down into a smaller set. While I was thinking about how best to do this, 1up.com switched from using numbers to letter grades. Letter grades seem like a pretty good system, but I’ve decided to go for something even less granular.

As of now, all games on this site are rated with a system that I ripped off from Leonard Maltin: 4 stars or a skull. I’ve detailed the individual ratings in the FAQ, but generally, four stars means “awesome,” one star means “just ok,” and a skull means the game is terrible. I think that this system will better communicate the value of individual games without getting too specific about how many percentage points better one game is than another. And, in keeping with 1Up’s policy, the internal table for converting between my star rating and the warped scale used by the rest of the world will not be published.

I’ve changed the Horror Games and Reviews pages (and some others) to reflect the rating system changes. Check it out and let me know what you think.

14 thoughts on “Ratings Changes”

  1. I’ve never really understood the percentage version of a score. It’s so large that it’s hard to ‘get’ the difference between a 76% and a 78% game.

    I knew someone who worked at a games magazine and explained the whole system to me, but to be honest, the end result always turned out vague. A tally of guesswork. Hence, why they picked scores out of 10. Out of 10 itself is, again, vague for me too. An 8/10 is better than 7/10. But is it really that much of a difference? Really. The larger the numbers, the larger the scale to choose an answer from.

    I guess that’s why I prefer to look at a glance the Out of 5 system. It’s what I use when reviewing films and music for a what’s on magazine and it sums up neatly what I write.

    Edge magazine once tried losing the score system altogether, but quickly reverted (since, let’s face it, they’re pretty snobby as it is). People seem to like a score more than the reasons. I don’t blame them, critics are just there pouring failed novelistic witterings on to print. I’ll admit, I’m one of them.

  2. One or two reviewers have actually been known to use percentages and /10 sensibly. PC Gamer UK, Edge, GamesTM and Eurogamer all do.

  3. http://www.GamesThatWontSuck.com
    it’s all too human to put things into scales and hierarchies. classification is fun and an easy organized way to look at things. I for one think that rating games is not something that can be put into a scale. I prefer a simple review statement for each game. PlanetPhillip.com has employed this, and it gets a clear message to the viewer about who should play the game. that is, after all, the entire point.

    PlanetPhillip’s rating system is good enough, and is as follows:

    Play it Now!
    Play it when you have time
    Consider it
    Think Twice Before Playing
    Avoid IT!
    Too Buggy to Rate!

  4. The new review system works for me. I was never a fan of the percentage based system (too many gradations for something that’s not easily measurable).

  5. For me, really, any scale is equal, just as long as it’s consistent. But I can see how the new ratings would be better for people unfamiliar to game scores, since they give a quick and simple look at a game without wasting time splitting hairs. Avoid the skulls. That makes it so that readers might consider giving other games with reasonably low percentages a chance because after all, it didn’t get a skull.

  6. http://www.GamesThatWontSuck.com/
    the reason I dislike scale based ratings is that people end up giving [url=http://www.dreamdawn.com/sh/info.php?name=The%20X-Files:%20Resist%20or%20Serve]the X-Files game[/url] the same rating as [url=http://www.dreamdawn.com/sh/info.php?name=Cold%20Fear]Cold Fear[/url], or [url=http://www.dreamdawn.com/sh/info.php?name=Siren,Forbidden%20Siren]Siren[/url] the same rating as [url=http://www.dreamdawn.com/sh/info.php?name=Fatal%20Frame,%20Rei%20Zero,%20Project%20Zero]each[/url] [url=http://www.dreamdawn.com/sh/info.php?name=Fatal%20Frame%202:%20Crimson%20Butterfly,%20Rei:%20Beni%20Chou]Fatal[/url] [url=http://www.dreamdawn.com/sh/info.php?name=Fatal%20Frame%203,%20Rei%203]Frame[/url].

    it’s a collection of reviews from people who both loved and hated the game, and it’s nothing more than an average of opinions, and what does that actually say about how much any random person will like the game? it’s a gamble of odds.

    What’s wrong with a simple bottom-line quotation saying who might enjoy the game? that’s far more accurate, the only fair way to rate a game.

    but what I find most odd is that you alone rated Siren higher than Fatal Frame. what the heck, man? tell me thats a typo or something.

  7. Hey, Chris, this system is a lot better.

    It’s always bothered me when I see a “93%” versus a “95%” and a scale of 1-5 (or bomb-4) seems far clearer.

    The only time I think those kinds of scales don’t work is when the reviewer is inconsistent with their scoring (but of course, that would cause any kind of rating system to fail)– Roger Ebert does that — not in the show, with the “Thumbs,” but in his written reviews, there is a star system (out of 5, I think), and it can be maddening how his stipulations for quality vary.

  8. Badmovies.org has had the skull/1 out of 5 rating system (they use slimes instead of stars) for a long time now. It’s always been easy to use and does generally show where the movies lie in terms of “badness” or “B-ness”. Of course, on that site, the skull movie reviews are always the most fun to read. The reviewer really goes to town on how bad the movie is and it is always extremely amusing 🙂

  9. I once saw a magazine’s review system that was a percentage based system that broke games down into their individual components. As far as I recall, it was broken down as:

    Graphics: 30
    Sound: 20

    Obviously this gave a mark out of 100 and weighted each of the criteria as appropriate. It also allows one to measure the individual components of games with reasonable and not excessive accuracy. I agree, the outright percent system is arbitrary and often pointless, but do you think that there is perhaps room for a divided up system? For me, I prefer the star rating system as it leaves more room for the subjectivity of games and less room for inaccurate ratings.

  10. Late to the party here. I do have to question the idea (and I’m not singling out Chris; I’ve seen it broached several times) that 50% represents an “average”, blandly-OK game. If I got 50% on a test, I wouldn’t expect to pass. 50% does not represent success for me; it represents an effort that maybe did some things right but ultimately is a run-of-the-mill failure. (70% would represent the “good game” threshold.)

    (Roger Ebert, for his part, kinda goes along the same lines; he considers anything less than 3 stars a negative review. Then again, he’s referred to the star system as the “bane” of his career, so.)

    I do agree with you that the percentage system has too many gradations; I also like how you’ve always aggregated scores from fans and other sources. Also, after reading a certain critic for a while, I think, you can get a feel of what his or her scoring system “means”, idiosyncratic or not. The new ratings, though, don’t always seem *quite* to match up with the impressions of the full review. (Haunting Ground, for instance, is compared favorably throughout the review to Clock Tower 3, and yet the latter has the higher score.)

    I would suggest rethinking pegging one-star reviews as “just OK”. I don’t know of any system where one star stands for anything less than bad, and that “bad” connotation is just too strong to overcome. (If Maltin does it, it’s an aberrancy.)

  11. > Synonymous

    I’m not sure I understand your point. 1 star doesn’t necessarily map to the 50th percentile; the whole problem is that the percent-based system is skewed so that 70 is the mean and 50 is the lower bound. So I’m trying to use a different ranking system to un-skew my reviews and provide a more useful metric: how good is this game? So I’ve intentionally provided no mapping between the percentage-based system and my new star-based system; rather than trying to apply a percentage rank to each star, I think it’s more useful to use the definitions of each ranking that I provided in the FAQ.

    I think that you are also saying that there is precedent for 1 star meaning “bad.” In my case, the skull ranking means “terrible” and 1 star means “ok if you like this sort of thing,” which means “bad unless you are really a fan.” I think that’s consistent with most other ranking systems I’m aware of; Maltin’s 4-stars-and-bomb system certainly works this way.
