Metacritic Matters: How Review Scores Hurt Video Games

Bugs in Fallout: New Vegas might have eaten your save file. Maybe they took away a few hours of progress, or forced you to reset a couple of quests. Maybe game-crashing bugs pissed you off to the point where you wished you could get your money back. But they probably didn’t cost you a million dollars.

Maybe you’ve heard the story: publisher Bethesda was due to give developer Obsidian a bonus if their post-apocalyptic RPG averaged an 85 on Metacritic, the review aggregation site. It got an 84 on PC and Xbox 360, and an 82 on PS3.

“If only it was a stable product and didn’t ship with so many bugs, I would’ve given New Vegas a higher score,” wrote a reviewer for the website 1up, which gave New Vegas a B, or 75 on Metacritic’s scale.

“It’s disappointing to see such an otherwise brilliant and polished game suffer from years-old bugs, and unfortunately our review score for the game has to reflect that,” said The Escapist’s review, which gave the game an 80.

If New Vegas had hit an 85, Obsidian would have gotten their bonus. And according to one person familiar with the situation who asked not to be named while speaking to Kotaku, that bonus was worth $US1 million. For a team of 70 or so, that averages out to around $US14,000 a person. Enough for a cheap car. Maybe a few mortgage payments.

Those sure were some costly bugs.

This is not an anomaly: for years now, video game publishers have been using Metacritic as a tool to negotiate with developers. And for years now, observers have been criticising the practice. But it still happens. Over the past few months, I’ve talked to some 20 developers, publishers, and critics about Metacritic’s influences, and I’ve found that the system is broken in quite a few ways.

There is something inherently wrong with the way publishers use Metacritic. And something needs to change.

Why Metacritic Matters

Hop into a debate with some video game fans on your favourite message board, and there’s one subject that will always come up: review scores. Which game scored the highest? Which scored the lowest? Which are the best review websites? Which are the worst?

Inevitably, at some point, someone will jump into the fray and say something like “lol review scores mean nothing anyway.” To some people, maybe that’s true. But to the people who make and sell video games, review scores are more important than many casual fans realise. Mostly because of Metacritic.

For the uninitiated: Metacritic is an aggregation website that rounds up review scores for all sorts of media, including video games. The people who run Metacritic take those scores, convert them to a 100-point scale, average them out using a mysterious weighting formula (more on that later), and spit out a number that they call a Metascore, meant to grade the quality of that game. The Metascore for BioShock Infinite, for example, is currently a 94. Aliens: Colonial Marines? 48.
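To make that process concrete, here is a minimal Python sketch of the general idea: rescale each review to 100 points, then take a weighted average. Metacritic does not publish its weights, so the weights below, the three sample reviews, and the function names `to_100` and `weighted_mean` are all hypothetical. This is an illustration of a weighted average, not Metacritic's actual formula.

```python
def to_100(score, scale_max):
    """Rescale a numeric review score to a 100-point scale (3 out of 5 becomes 60)."""
    return score * 100 / scale_max


def weighted_mean(scores, weights):
    """Average the scores, letting each one count in proportion to its weight."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


# Hypothetical reviews: (score, top of that outlet's scale, made-up weight).
reviews = [
    (8.0, 10, 1.5),   # a 10-point outlet, assumed here to count for more
    (3, 5, 1.0),      # a five-star outlet
    (84, 100, 1.0),   # an outlet already on a 100-point scale
]

normalized = [to_100(score, top) for score, top, _ in reviews]
weights = [weight for _, _, weight in reviews]

metascore_like = weighted_mean(normalized, weights)
plain_average = sum(normalized) / len(normalized)

print(normalized)
print(round(metascore_like, 1), round(plain_average, 1))
```

Changing the made-up weights moves the weighted figure relative to the plain average, and nobody outside Metacritic knows which weights are real.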

To people who work in gaming, these Metascores can mean a lot. Say you’re a developer who needs money. You’ve got some ideas to pitch to publishers. You take some meetings. They’re going to ask: just how good have your games been?

“Typically, when you go into pitch meetings and whatnot, publishers are going to want to know your track record as far as Metacritic,” said Kim Swift, a game designer best known for helping create games like Portal and Quantum Conundrum. “As a company, what is your Metacritic average? As an individual, what is your Metacritic average?”

Swift works for Airtight Games, an independent studio that is tied to no publishers. Their Metacritic history: Dark Void, which has a 59 on Metacritic, and last year’s Quantum Conundrum, which sits at 77.

In order to survive, studios like Airtight have to negotiate deals with big companies like Capcom and Square Enix. Often that means talking about Metacritic. Sometimes that means wearing their history of Metacritic scores like a scarlet letter.

This is common. An employee of a well-known game studio told me about a recent pitch meeting with a publisher, during which the publisher brought up the studio’s last two Metacritic scores, which were both average. The studio employee asked that I not name the parties involved, but claimed the publisher used the Metascores as leverage against the studio, first to negotiate for less favourable terms, and then to turn down the pitch entirely.

Often, developer bonuses or royalties are tied to game review scores. Fallout: New Vegas is one high-profile example, but it happens fairly often.

“It’s pretty common in the industry these days, actually,” Swift told me. “When you’re negotiating with the publisher for a contract, you build in bonuses for the team based on Metacritic score. So if you get above a 90, then you get X amount for a bonus. If you get below that, you don’t get anything at all or get a smaller amount.”

In other words, a developer’s priority is sometimes not just to make a good game, but to make a game that they think will resonate with reviewers.

“When you’re working on a game, part of what you want to do is have a high score,” said Swift. She said she’d never seen a developer change part of a video game just for the sake of raising scores, but the influence is undoubtedly there.

“It’s usually some other thing like, ‘Hey, we could use another couple hours on this game because people perceive a longer game to be a higher value,’” Swift said. “It’s never directly pointing back to, ‘This is gonna improve our score by X number of points.’”

Matt Burns, a longtime game designer who worked for a number of big shooter companies and now makes indies with his company Shadegrown Games, wrote about his personal experiences with Metacritic back in 2008. Burns said he watched firsthand as a development studio worked as hard as possible to make a game that would snag high review scores.

“Armed with the knowledge that higher review scores meant more money for them, game producers were thus encouraged to identify the elements that reviewers seemed to most notice and most like — detailed graphics, scripted set piece battles, ‘robust’ online multiplayer, ‘player choice’ and more, more of everything,” Burns wrote.

“Like a food company performing a taste test to find out that people basically like the saltiest, greasiest variation of anything and adjusting its product lineup accordingly, the big publishers struggled to stuff as much of those key elements as possible into every game they funded. Multiplayer modes were suddenly tacked on late in development. More missions and weapons were added to bulk up their offering — to be created by outsource partners. Level-based games suddenly turned into open-world games.

“Before you cry in despair, keep in mind that all these people wanted in the end was the best game possible — or, more precisely, the best-reviewed game possible.”

And then there’s this wry joke by Warren Spector, talking about the words that influenced his career during a talk at the DICE conference earlier this year. Powerful words. Legacy. Mentor. And…


While chatting with Obsidian head Feargus Urquhart for the profile I wrote last December, I asked him about what had happened with Fallout: New Vegas. For legal reasons, he couldn’t get into the specifics.

“I can’t comment on contracts directly,” he said. “But what I can say is that in general, publishers like to have Metacritic scores as an aspect of contracts. As a developer, that’s challenging for a number of reasons. The first is that we have no control over that, though we do have the responsibility to go make a brilliant game that can hopefully score an 80 or an 85 or a 90 or something like that.”

According to Metacritic’s rating scale, any game above a 75 is considered “good”, but realistically, according to multiple developers I spoke with, publishers expect scores of 85 or higher. Sometimes, Urquhart told me, the demands can get unreasonable.

“A lot of times when we’re talking to publishers — and this is no specific publisher — but there are conversations I’ve had in which the royalty that we could get was based upon getting a 95,” he said. “I’ve had this conversation with a publisher, and I explained to them, I said, ‘Okay, there are six games in the past five years who have averaged a 95, and all of those have a budget of at least three times what you’re offering me.’ They were like, ‘Well, we just don’t think we should do it if you don’t hit a 95.’”

That’s the developer’s perspective. Now let’s look at this from the other side. Say you’re a publisher. You’re about to sign a seven- or eight-figure deal with a development studio, and you want to make sure they’re not going to hand you a clunker. Why not use Metacritic as a security blanket in order to minimise risks and ensure you get yourself a great game?

Here’s some very reasonable rationalisation from a person who worked at a major publisher and asked not to be named:

“Let’s say that [a publisher] wanted to pay $US1 million up front (through milestone payments over the course of development), but the developer wanted $US1.2 million. If they wouldn’t budge, sometimes we would offer to make up the difference in a bonus, paid out only if the game hit a certain Metacritic.

“That conversation could happen during development too. Maybe a developer wanted more time and money in the middle of the production, to make a better game. So the counter was, ‘If you’re so sure it will make the game better, we’re gonna tie the additional funds to the Metacritic score.’ It was a way to minimise risk.”

But a different person who once worked for major publishers says that Metacritic scores are just an excuse publishers use in order to deprive developers of the bonuses they deserve.

“Well, generally the whole Metacritic emphasis originated from publishers wanting to dodge royalties,” that source said. “So even if a game sold well, they could withhold payment based off review scores… The big thing about Metacritic is that it’s always camouflaged as a drive for quality but the intent is nothing of the sort.”

Multiple developers I spoke to echoed similar thoughts, although nobody could share hard evidence to back up this theory. I reached out to a number of major publishers including Activision, EA, and Bethesda, but none agreed to comment for this story.

Marc Doyle, the former lawyer who co-founded Metacritic in 2001 and keeps it running every day, told me during a phone conversation last week that he feels no responsibility for what video game publishers or developers do with his website.

“Metacritic has absolutely nothing to do with how the industry uses our numbers,” he said. “Metacritic has always been about educating the gamer. We’re using product reviews as a tool to help them make the most of their time and money.”

But gamers aren’t the only ones who use Metascores. Not by a long shot. Even the massive Japanese publisher Square Enix recently cited Metacritic as one of the factors they used to predict sales for their games.

“Let’s talk about Sleeping Dogs: we were looking at selling roughly 2~2.5 million units in the EUR/ NA market based on its game content, genre and Metacritic scores,” former Square Enix president Yoichi Wada wrote in a recent financial briefing. “In the same way, game quality and Metacritic scores led us to believe that Hitman had potential to sell 4.5~5 million units, and 5~6 million units for Tomb Raider in EUR/ NA and Japanese markets combined.”

“Review scores are a part of our industry and it’s something we pay attention to as developers,” said Swift. And they lead to trends. “Review scores of this year are gonna drastically affect what’s gonna be seen next year,” she said.

Even big retailers like Walmart and Target ask publishers for Metacritic predictions when deciding whether or not to feature certain games.

“One of the criteria [retailers] have is, ‘What’s the review score gonna be?’” said Tim Pivnicny, vice president of sales and marketing at Atlus USA. “That comes up a lot… They’re concerned if it’s going to be a good game.”

Metacritic has a significant influence on the way games are produced today. That’s a problem.

Why Metacritic Shouldn’t Matter

When I first heard about the Fallout: New Vegas bonus, I wrote an editorial about how silly it is for publishers to use Metacritic as a measure of quality. Video games are personal experiences, and they can’t be evaluated objectively, especially through some sort of arbitrary numerical score that means different things to different people. (Go ahead and try to explain to me the qualitative difference between an 8.1 and an 8.2.)

That’s the obvious reason. But there are others.

For one, people are gaming the system. On both sides of the aisle.

There’s the story of the mocked mock reviewer, for example. Some background: game publishers and developers often hire consultants or game critics to come into their offices, play early copies of games, and write up mock reviews that predict how those games will perform on Metacritic. Often, if possible, publishers and developers will make changes to their games based on what those mock reviews say. Mock reviewers are then ethically prohibited from writing consumer reviews of that game, as they have taken money from the publisher.

One developer — a high-ranking studio employee who we’ll call Ed — told me he hired someone to write a mock review, then threw that review in the shredder. Ed didn’t care what was inside. He just wanted to make sure the reviewer — a notoriously fickle scorer — couldn’t review his studio’s game. Ed knew that by eliminating at least that one potentially-negative review score from contention, he could skew the Metascore higher. Checkmate.

(In case you’re wondering, Kotaku writers are prohibited from doing mock reviews or taking any work from the publishers we cover.)

When I asked Metacritic’s Doyle about practices like this, he admitted that he had heard similar stories. He said he works closely with all 140 review publications that he aggregates on Metacritic, and he said he constantly evaluates and examines each one. “Trying to prevent people from gaming the system is something I always think about,” Doyle said.

But it’s still happening.

“Anything we can do to optimise the score, we’re gonna do,” Ed told me.

Sometimes it’s subtle things: lavish review events that force game critics to review games on a studio’s terms; flexible review embargoes; swag that gets sent to offices and discarded oh-so-often, like Gears of War beef jerky and Legos based on sets from Lego City Undercover. So long as Metacritic has an effect on the people who make games, the people who make games will find ways to influence it.

Those most susceptible to pressure from video game publishers may be the smaller websites that need traffic from aggregate sites like Metacritic in order to survive — websites that might make sketchy deals in order to get that traffic. Jeff Rivera, a game journalist who worked as an editor for a group of websites called Advanced Media Network (which later became, told me he saw one of those deals back in 2006.

“We had an agreement with Sega that we would run a week-long special with our top stories on the DS channel being dedicated to Super Monkey Ball,” Rivera said in an e-mail. “I was handling the review, and on the night before we were going to publish, I got an IM from a co-worker asking what I was going to score the game.

“I told him that I didn’t know yet and wondered why I was being asked, as it was something I’d never had happen before. He went on to tell me that PR said that our review would be guaranteed exclusive for a day if my score was to be 8.0 or better.”

Rivera said he had already written his review at that point, and that he had scored the game an 8.1. (The review is no longer online, but it’s still listed under Kombo on GameRankings.)

“I told them that I didn’t know what I would give it, because I didn’t want them feeling like they ‘bought’ my review score,” he said. “More pressure came to divulge my score, and I kept saying that I didn’t know, but that 8.0 was the ball park range.”

When I asked Sega for comment on this story, they sent over a statement: “Sega has a strict internal policy against soliciting high scores in exchange for early reviews and against the practice of influencing reviewers.” But Rivera said this happened in 2006. I asked Sega when they enacted this policy, but the publisher never got back to me.

From conversations I’ve had with developers and other press, it seems like this sort of thing happens less often these days. But there are always stories and whispers. Developers begging reviewers to change their scores. PR people intentionally sending out late review copies when they know a game is going to be bad, or sending early copies to websites known for handing out higher scores.

If you read about games online, you’re probably familiar with some of the websites on Metacritic: outlets like IGN and GameSpot are well-established publications that pay their writers and have solid reputations. But other names on Metacritic’s large list of publications are less recognisable. Some are run by volunteers; others are lesser-known to American gamers.

In order to give more importance to the bigger websites, Metacritic uses a weighting system that puts more emphasis on the heavy-hitters, making their scores count for more. But Doyle and his team won’t give any details about the system they use. This opaqueness has led to some controversy over the years: most recently, a Full Sail University study made headlines when the people behind it claimed to have modelled Metacritic’s formula, but their model turned out to be wrong. The event led many to ask: why doesn’t Metacritic just tell us how they weight outlets?

“We’re transparent about everything on Metacritic except for the critic weightings,” Doyle told me. “That may seem like a drastic thing, but I’m just telling you that, in my opinion, it’s not. If you simply stripped out all the weights, it wouldn’t have a huge effect on that number.”

Doyle gave me a few explanations: for one, he said he doesn’t want publishers pressuring the highest-weighted publications. Another reason: Metacritic tweaks the system frequently, and they don’t want to have to talk about it every time they do, potentially embarrassing a publication whose weight they’ve just lowered.

But people find it hard to trust what they don’t understand. And nobody understands how Metascores are computed.

One of Doyle’s other big policies has also been in the news recently: Metacritic’s refusal to change an outlet’s first review score, no matter what happens. It’s a policy they’ve had for a while now, Doyle told me. He enacted it because during the first few years of Metacritic, which launched in 2001, reviewers kept changing their scores for vague reasons that Doyle believes were caused by publisher pressure.

“I decided that if we can, as an aggregator, act as a disincentive for these outside entities, whoever they may be, to pull that kind of stuff, and we can protect our critics by backing up their first published and honest opinion, then we’re gonna do what we can to do that.”

Sometimes, however, this leads to some skewed Metacritic results. Late last year, GameSpot pulled their review of Natural Selection 2, which had been written by a freelancer. The review contained multiple factual inaccuracies. A different writer then reviewed the game, giving it an 8. But the original score — a 60 — remains on Metacritic to this day.

More recently, the website Polygon, which uses an adjustable review scale, gave SimCity a 9.5 out of 10 before it launched. On launch day, when crippling server errors rendered the game unplayable for most, Polygon changed their score to an 8. A few days later, as the catastrophic problems continued, they switched it to a 4. It’s currently a 6.5. Yet anyone who goes to SimCity’s Metacritic page will still see the 9.5.

Still, Doyle stands by his policy.

“Metacritic scores really are that snapshot in time when a game is released, or close to after it’s released,” said Doyle, “when the critics decide, ‘I’ve played this enough, I can evaluate this now fairly, and here’s the score.’”

Another problem for developers: outlier scores. What happens when tons of people like a game, but for one or two reviewers, it just doesn’t click?

“The problem is is the scale,” said Obsidian’s Urquhart. “There’s an expectation that a good game is between 80 and 90. If a good game is between 80 and 90, and let’s say an average game is gonna maybe get 50 scores, if you wanna hit that 85 and someone gives you a 35, that just took 10 90s down to 85… Just math-wise, how do you deal with that? Some guy who wants to make a name for himself can absolutely screw the numbers.”
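Urquhart's arithmetic checks out if you assume a plain, unweighted average, which is a simplification since Metacritic weights its outlets. A quick Python check, using the figures from his quote plus one extra case (a pool of 50 reviews) added here for comparison:

```python
def average(scores):
    """Plain unweighted mean; Metacritic's real average is weighted."""
    return sum(scores) / len(scores)


# Ten reviews of 90 plus a single 35, as in Urquhart's example.
small_pool = [90] * 10 + [35]

# The same single 35 in a pool of 50 reviews (the review count he mentions).
large_pool = [90] * 49 + [35]

print(average(small_pool))  # 85.0
print(average(large_pool))  # 88.9
```

In the small pool, one 35 drags ten 90s down to exactly 85. In a pool of 50, the same review costs about a point, which is still enough to matter when a bonus hangs on a single number.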

One reviewer well-known for aberrant scores is Tom Chick, who runs the blog Quarter To Three. Chick is listed for having the lowest Metacritic score on BioShock Infinite (a 60) and Halo 4 (a 20), among others. He uses a 1-5 scale that Metacritic converts into multiples of 20, so Chick’s “I liked this game,” — 3 out of 5 — is converted into a 60, which most Metacritic readers see as a bad score.

But Chick is OK with this system, and when I asked him his thoughts on how Metacritic uses his numbers, he defended the aggregation site.

“An aggregate is only as good as its individual components,” Chick said in an email. “And I feel that a lot of the data fed into Metacritic is of questionable value for how it clusters ratings into a narrow margin between seven and nine. But that’s not a Metacritic problem. That’s an IGN problem, a Game Informer problem, a GameSpot problem. And part of how we get past that problem is by recognising more varied data. That’s ultimately one of the reasons I’m on Metacritic: I believe a wider range of opinions can add to its value.”

Chick uses a totally different scale than many other websites on Metacritic: Game Informer, for example, describes their 6/10 as follows: “Limited Appeal: Although there may be fans of games receiving this score, many will be left yearning for a more rewarding game experience.”

Chick, on the other hand, says his 6/10 means something else entirely. “I believe strongly in using the entire range of a ratings scale, so three stars means that I like a game,” he said. “Quite literally. We have a ratings explanation on Quarter to Three that explains that three stars means ‘I like it.’ It’s that simple.”

Yet Chick’s 60 and Game Informer’s 60 are averaged together. They both affect developer bonuses. They both have an impact on contract negotiations. And they both change the way video games are made.

“The nature of an aggregate system is that multiple scores are aggregated,” Chick said. “You might as well blame IGN for giving a game a 92 instead of a 96. As for how I feel about a studio losing its bonus because the publisher has set an arbitrary number, that’s not my responsibility. My responsibility is solely to my readers.”

Chick’s message is admirable, and his criticism is always sharp, but his scores illustrate one of the biggest problems with how publishers and developers use Metacritic today: inconsistency. When Chick’s scale is so drastically different than Game Informer’s, how can any outside observer look at an average of the two and think that number has any meaning or significance?

There are other points to think about, too. If one person loves a game, and another person hates a game, is it an *average* game? Or just a game that one person loved and another person hated? If two people score a game 100 and two people score it 0, it’s not worth a 50 — it’s just polarising.
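The point can be shown with two made-up sets of scores, using a plain unweighted average as a simplifying assumption. Both sets average 50, but the spread (here the population standard deviation from Python's `statistics` module) tells them apart:

```python
from statistics import mean, pstdev

# Hypothetical score sets: one polarising game, one that everyone found middling.
polarising = [100, 100, 0, 0]
middling = [50, 50, 50, 50]

for name, scores in (("polarising", polarising), ("middling", middling)):
    # Same mean in both cases; only the spread differs.
    print(name, mean(scores), pstdev(scores))
```

A single averaged number throws away exactly the information that separates these two games.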

The system doesn’t work. And I’m not the only one who thinks so.

Outside Voices Weigh In

“Metacritic’s usefulness as a consumer aid is clear and obvious,” said longtime critic and Gears of War: Judgment writer Tom Bissell. “That the game industry has internalised its values, however, and uses its metrics, apparently uncritically, as a valuable source of self-appraisal, has to be one of the great mysteries of modern industry. It cannot be a coincidence that the form of modern entertainment most self-conscious about its status as an art form is also so slavishly attached to Metacritic.”

“It bastardises the editorial process for reviews,” said Justin Kranzl, an ex-game critic and current PR rep for Square Enix. “We’re conditioning readers to skip the copy or the video and just get the score. For people who love a dynamic and varied media landscape — and any self respecting PR person should fall into that category — that’s terrible.”

“I think Metacritic is something only publishers care about,” said Monkey Island designer and longtime game developer Ron Gilbert. “The devs I know only care about it to the extent that a publisher bonus has been tied to the game’s Metacritic score (which is a stupid, stupid, stupid thing to do). I’ve never looked up the Metacritic score for any game I’ve worked on. It’s completely irrelevant to me.”

“Metacritic encourages the fallacy that all opinions should be weighted equally, and that a ‘bad’ review is an unenthusiastic review,” said Bissell. “But that’s not true. There are some games I am *more* likely to play when a certain critic gives them what Metacritic regards as a ‘bad’ review. Metacritic leaves no room to discuss, much less pursue, guilty-pleasure games, noble failure games, or divisive games. Everything’s just a 7, or an 8, or a 6.5. That’s the least interesting conversation I can imagine.”

(Metacritic game hubs do include blurbs from each of the reviews they aggregate.)

“Rating a game is so subjective,” said Airtight’s Kim Swift. “I think one of the scarier things for a developer is when a reviewer opens up with ‘I typically hate this type of game’ and you’re like ‘Oh, crap’.”

“I don’t want to carry that burden… these are people with children and families,” said longtime critic Adam Sessler, an ex-TV host who now produces videos for Revision 3 Games. “It is a horrible feeling that what I’m saying — I’m giving my subjective evaluation of an experience that will not be the same experience as other people are going to have — that somehow withholds food and resources… To me it is noxious in the extreme.”

“In fact I would encourage more outlets to employ scoring scales that are incompatible with Metacritic,” said Kranzl, “and I’m always down to discuss with them different ways of getting there.”

“I’ll say this,” said Sessler. “I have considered not doing this job before because of this, because I think there’s something so morally questionable and repugnant about it.”

“I wish it would go away, but if not Metacritic, then some other service would pop up,” said Gilbert. “We humans love to quantify stuff. I wonder what the Metacritic of the Mona Lisa was. I heard it hung in King Louis XIV’s bathroom for many years.”


Perhaps it’s in our nature to make numbers out of everything. And it’s hard to deny that Metacritic is a useful tool for measuring how a small group of people felt about a game at one particular point.

But it’s not a useful tool for much else. There are too many variables, too many people trying to manipulate the system. There’s too much subjectivity in the review process for anyone to treat it like an objective measure of quality. Video games are designed to be personal experiences, and it is disingenuous for publishers to act like review scores are any more than the quantification of those personal experiences. It’s harmful to everyone. Everyone.

It’s harmful to critics, who have to deal with PR pressure and the guilt of taking money out of people’s pockets.

It’s harmful to developers, whose careers can be tied to the whims of a critic who may be in a bad mood when he or she plays their game.

It’s harmful to publishers, who must be concerned that they have to put so much value on a website that won’t tell anyone how they calculate their review score averages.

Most importantly, it’s harmful to gamers, because it has a palpable negative impact on the way our video games turn out every year. When developers change games because they think that’s what reviewers will want to see, nobody wins.

Metacritic is a useful tool, but video game publishers have turned it into a weapon. And something’s gotta change.


  • So if video game reviewers want to see better games next year, they should give everything a 100 this year? So that devers actually get paid what they need to make a good game?

    • defeating the point of metacritic.

      I have always never read what MC had to say to a game before buying it because the reviewers opinion hardly matches my own except maybe for the more famous classic failures(A:CM, DNF, etc) and I had suspected that the double edged blade of a collective review score would be used by publishers as well as developers to gauge the end result of my games that I am buying.

    • one persons trash is anothers treasure. reviews are subject to quantitative scores.

      Tomb Raider on PC was a stunning game with only a few easy to overlook flaws(QTE instructions at the start of the game being masked by subtitles) . I didn’t have ANY interest in day 0 reviews since reviewers were only given XB360 copies, as such its a totally different product, and even on PC where people were whining about issues caused by tressFX and bad nVIDIA drivers also didnt affect me because my gaming rig is an AMD build. When a review score reflects a qualitative score, and not a quantitative opinion- then you have my interest.

      Elvis never did no drugs

      • Haha. The reason the reviews gave out console versions and not PC as most of the (purchasing rather than torrenting) market is consoles. It’s just facts, mate. Most PC gamers I know spend waaaaay more on hardware than software. As a console gamer, I am not interested in how it played on a random PC.

    • +1. Boo hoo, developers miss out on bonus because they release a game broken and buggy for me to pay to test. They didn’t deserve a bonus. And the metacritic score reflected that.

    • I do kind of agree with you but what is the difference in quality between 84 and 85? Why should a development team lose money because of that?

      • Well, for better or worse, Metacritic is about the closest thing there is to an objective measure of quality for games. You can’t have a contract that promises a $1m bonus or whatever just if they create “a good game”, because what’s “a good game”? You have to define what “good” is in a way that is measurable.

        In the case of the Obsidian/Bethesda example given, both the developer and the publisher agreed on a deal that defined a good game as one scoring 85+ on metacritic. Saying it’s unfair that they missed out because they scored 84 or whatever is rubbish. If they’d scored 86 and the publisher didn’t pay up, then there would be outrage because they broke the agreement. Complaining they missed out on the bonus for a score of 84 when they agreed on 85 is the other side of the same thing. Perhaps a compromise would have been to negotiate bonuses on a sliding scale, say starting at $100k for a score of 70 and scaling up to the full $1m for 85.

        But, speaking for myself, if they’re going to be attaching bonuses to these deals then I’d much rather they were based on review scores than on sales figures.

      • It’s the difference between a Distinction and a High Distinction, at the university I went to at least.

        It’s also the difference between First and Second class honours at the same institution, if done from an Economics degree. From a Science degree, the dividing point is 80. (Same honours program in a different degree.)

        Sometimes we have to deal with seemingly arbitrary divisions. If we’d done a slightly better job we’d be on the preferred side of those dividing lines.

    • Congratulations Brodie and Jethronerdlinger… for failing to read the whole article and becoming part of the problem that sees us get more generic, bland and non-innovative games in an attempt to cater to reviewers rather than gamers… because so many of those gamers are happy to take an absurdly and moronically simplistic assessment of a situation.

      Metacritic scores are a horribly flawed beast for the myriad of reasons pointed out in this article. That a talented and accomplished studio like Obsidian was denied a bonus due to missing a set Metacritic score (by ONE POINT, no less), is a disgusting symptom of an inexcusably shallow and arbitrary means of determining the “value” of a product.

  • Number-based scales are a stupid and outdated way to rate something. Simple as that. You’re trying to put an objective value on something that is subjective.

  • All I want from a review is a commentary on the number of bugs or possible game-breaking glitches on release. That’s it. I don’t care if a writer enjoyed the game, I just want an outsider’s review of quality on a technical level.

    It will never happen obviously, because too many people need to be told what they’ll enjoy, but I can dream…

    • That’s not really a review, though. It’s just a technical assessment. It’s like reviewing a movie and only talking about how good the sets and special effects are, or an album and only talking about how good or bad the production is, or a car and only talking about how reliable it is. While those things are important, they’re far from the only thing that people want to see in a review.

      • I did say what “I” want in a review. To me a review is a commentary on quality, not an opinion on whether the writer enjoyed it. I really couldn’t care less if someone likes a painting, enjoyed a movie or really digs a certain car, because enjoyment is subjective. You might think expensive cheese is brilliant, but to me it may taste like old sweat. But you can still say whether or not the cheese is well made, and delivering the flavour as intended.

        • But what I’m getting at is that it would only be a commentary on one aspect of the quality of the game i.e. how many bugs, glitches or other technical issues are in there. It doesn’t say anything about the quality of any other aspect of the game.

          A review isn’t about “telling people what they’ll enjoy”. It’s about telling them about what some other person (or people) enjoyed or didn’t enjoy. From that people can decide for themselves if it sounds like something they’ll enjoy. Otherwise, to use your example, you’d end up buying and eating every competently-made cheese out there in the hope that you might end up with 2 or 3 that you actually like. Which is fine if you have the time, money and inclination to sample all those cheeses.

  • I’ve pretty much boycotted Metacritic since the New Vegas issue. It’s not Metacritic’s fault, mind you, but I just feel people should stop relying on it so much. Reviewing games based on issues outside the game itself is problematic. Has Polygon re-adjusted their score for Sim City now? It started at a 9.5 then was bumped to 8 for launch issues, then again to 4. It’s now back up to a dizzying 6.5 on the basis that, when you get right down to it, Sim City is actually kind of boring.

    So why did it get a 9.5 in the first place? Because giving a game a number is fundamentally wrong to begin with.

  • Tom Bissell is right (as usual): a conversation about 6.5, 7, and 9 is the most boring one imaginable. And that’s often what we end up with when we’re discussing games.


  • The New Vegas situation always gets raised in these discussions, but I don’t have a lot of sympathy. At the end of the day, this is the agreement Obsidian signed, and it was a bonus, not a given. The game was buggy, and it reviewed poorly. Like almost all of their games. They can’t have seriously believed they would get that bonus.

    • New Vegas was so buggy, if anything it’s a travesty that it nearly got an average of 85. I don’t recall anyone who played it praising it that highly to me- in stark contrast to the legions of Fallout 3 superfans.

      • After about six months with a combination of fan and developer patches, and a mod to add additional tracks to the soundtrack, it was an excellent game. But it just shouldn’t have been released in the state it was in.

  • I won’t get into a debate regarding Metacritic’s relevance with anyone if I can avoid it.

    The only thing I do know is, as a long-time fan of the Fallout series, New Vegas was a disaster, at least on PS3. I played Fallout 3 through 4 times on two different systems, and while Fallout 3 was far from perfect, it did a pretty good job. In fact most of its issues were introduced with the DLC; the base product was pretty good.

    I took 2 days off to play New Vegas when it was released. My first day was interrupted by my PS3 locking up around 8 times over a 4-5 hour period and losing save progress each and every time. Whilst a little sad and agitated, I knew such issues were going to cop a day one patch. Left it for the rest of the day, jumped on the next day and there was a patch. Joy! No. It was even worse. I couldn’t load any save files, and all new games died inside 5 minutes.

    I’d waited for this game for the better part of a year and a half, got my shiny collector’s edition, and that was what I got for my troubles. Another patch came later that day, and made it largely playable, but still with a fair few game breaking bugs. I was literally saving every ten minutes.

    The end result was I fought through the game, and promptly traded it in. New Vegas was a giant brown shit stain on what was a pretty darned good record for one of my all time favorite IPs. I’ve completed Fallout, Fallout 2, Fallout Tactics and Fallout 3 (plus all DLC) prior, and NV just made me so mad I can’t even describe it.

    Not saying they deserved to miss out on their bonuses – that part is pretty damned tragic. But the fact is, that game was an unfunny joke on release. I don’t have a lot of sympathy. The game issues were global as well. How the fuck does a title as big as NV get released in such a state? Did no one play test it? Of course they did. There’s no way on this green earth they couldn’t have known of these issues. They were widespread and completely unavoidable. It means they decided to release it regardless, probably figuring the shitstorm would be OK once they patched it. Releasing it in such a state and then knowing people would pay good money for it, and be forced to endure all that crap whilst waiting for a patch – that was the biggest insult to a Fallout fan I can think of.

    And that’s my rant – apologies, I’m a huge Fallout fan and still haven’t entirely forgiven Obsidian for New Vegas.

    • See my thread below. It wasn’t Obsidian’s fault; it was ostensibly Bethesda’s as QA was their responsibility.

      • It sounds about right. I still marvel at the decision to release it as an entirely busted product though, regardless of who’s responsible. It’s pretty much a giant fuck-you to all the fans, as they’re the ones who’ll buy it day one.

        So incredibly disappointing.

        • It wasn’t that bad, but it could have been better. And I played it on PS3 at the time: Gamebryo on PS3, need I say more, etc.

  • What wasn’t really discussed here is why MetaCritic is important to publishers.

    MetaCritic provides measures for 2 things, firstly, as discussed, the aggregate for review scores.

    Secondly, marketing. A very simple usage for MetaCritic is its marketing appeal. If someone is on the edge between buying and not buying a game the MetaCritic score can be the deciding factor.

    In the film industry, awards are used in a similar fashion. If a film gets an Oscar there’ll be an increase in royalty amounts, or an additional payment.

    The reasoning for all this is, better score = more money coming in. The fact of the matter is people DO buy games based off review scores, they DO buy/watch films based on awards. Until people stop doing this we won’t see these systems change.

  • You know what the kicker is about the New Vegas story?

    The developer doesn’t determine what amount of QA should be done on a project — I mean, they do a certain amount of QA as a matter of course. But ultimately the _publisher_ determines the amount of QA done on a project.

    In New Vegas’s case, Bethesda. Bethesda are ultimately responsible for the bugs in New Vegas, and they’re the ones responsible for giving them the bonus.

    • We don’t know the quality of the product Bethesda received for QA. They may have expected a close-to-finished product by date X so they could release in time for Christmas, and got an absolute mess instead. I suspect this is what occurred because some of the bugs went beyond ‘poor QA’ and entered the realm of badly broken (the save file problems were insane). It’s also consistent with Obsidian’s other releases (Neverwinter Nights 2 and Alpha Protocol both ran like dogs), and was predicted by an insider after the release of Alpha Protocol. As much as I love their story-telling, I still don’t fully understand why companies keep contracting Obsidian.

      • Yep, agree with you there. IMO if a publisher was going into business with Obsidian then they’d be mad to NOT have some bonus component attached to the quality of the game because, quite frankly, Obsidian have some form there.

    • Then Obsidian are pretty dumb for not putting in a contract clause that says something like “we still get the bonus if we’re within 5 points of the target and at least 5 reviews list stuff we’re not responsible for as a negative”.

      All this “boo hoo” is basically saying “raw metacritic scores aren’t great contractual milestones”. Then blame developers for accepting them! It’s a commercial negotiation! At least the Metacritic score bonuses can be achievable- if the publisher has so much power, they could just say “no bonus for you, take it or leave it”. The developer has SOME power if the publisher wants to make the game- it’s up to them to try and adjust these contracts and explain to the publisher why it’s best for the publisher for the incentives to line up with what will make a game popular, not with what will score best on Metacritic.

      Unfortunately many devs, like many other businesses, will either not use a lawyer for these negotiations or will use a crappy lawyer who doesn’t understand the issues and isn’t creative.

      If any devs want to hire me to negotiate their next contract, by the way, my schedule is not completely full and my firm’s rates are very reasonable.

    • Just to play devil’s advocate for a moment:

      It’s fine to argue that, but it’s also reasonable to expect that this was factored into the negotiation. If you’re making a program which you know is filled with bugs to the point that it will probably impact review scores, if you want your bonus you better make it so good that the lowered scores will still be in your ‘bonus’ range. Otherwise, you didn’t really deserve the bonus, did you?

      It’s like if you continually turn up ten minutes late to work because every other day the bus turns up late. But you keep catching a bus which – if it were on time – only drops you off at work five minutes before you need to start. Maybe you should take an earlier bus, so you won’t be late due to factors outside your control. Giving yourself some leeway is a factor that IS within your control.

      • Yeah, except it’s not really as simple as that. QA is how you find out you have bugs. If you skimp on that, then you don’t know you have them, and you can’t fix them.

        Yes, I mean, you can negotiate more time and money and factor in in-house QA (expensive), but then you might miss the opportunity to actually make the game.

        • There’s this misconception that QA is a task at the end of development, not a process that continues throughout the development cycle. Personally, I think that Obsidian shares this misconception, which is why their games are always so sloppy. They should be working to reduce bugs even entering the code, because this is significantly cheaper than identifying a litany of bugs at the end of development and trying to fix them years after a particular piece of code was written. This is how the best developers operate.

  • Fantastic article, and I definitely concur. I haven’t enjoyed games that got scores of ninety, and I’ve deeply adored games that got a sixty.
    That said though, it’s usually more likely that the Metacritic score will be similar to my feelings towards a game.

    • It seems likely, though, that the game that got 90 is going to be enjoyed by a lot more people than the game that got 60. No one in this process is asking, “I wonder if there is someone in the world who will love this game like their baby, because if we please just one person then all those millions of dollars will be worthwhile.” They’re asking, “Can we please ENOUGH people to justify the financial outlay on development?” Metacritic doesn’t have to have the same opinion as you; nor does it want to. It wants to have the same opinion as the average of informed commentators; the theory is that, marketing being equal (which it is not), people will prefer to buy games that they like to games that they don’t, and that therefore Metacritic scores will correlate to sales.

      • The cases of enjoying critically-mediocre games (and vice versa) that I was referring to are entirely infrequent, which is why I added the qualifier of saying that it’s much more likely that Metacritic will line up with my opinion. I’m pretty sure we’re agreeing.
        You’ve got some good thoughts, and managed to verbalise them in a much more intelligent way than I did, so props to you.

  • Must admit that I don’t bother checking Metacritic scores, I have a few reviewers I feel have similar taste to me and I rely on their reviews and “word of mouth” buzz. I do the same with films. To the extent I ever visit Metacritic it is as a hub to find reviews of stuff… I’d be interested to know how many people actually rely blindly on a Metacritic score without actually reading reviews as well, is it that common a practice? Is there any survey on this?

  • If Metacritic were to disappear tomorrow, the problem of publishers quantifying art into dollars wouldn’t go away. It would just take a different form. If Metacritic didn’t exist, publishers would have created it. Or created their own review-averaging metric in-house.

    Numbers. They like numbers. They need numbers. They base things on numbers and always have. Metacritic is just a cheaper, easier way of doing it so they don’t have to.

    • I love games, and though I do consider them art in some sense, they are first and foremost a consumer product. Products need real evaluation; the artistic merit of the product, if any, contributes to it but is only a part of the greater concern.

      • Even traditional art, to an extent, is validated by what people are willing to pay for it. To an extent, that is the difference in general sentiment between Banksy and graffiti.

  • It doesn’t reflect the discord between positive reviews and bad word of mouth.
    Or how a good review won’t account for market saturation, or poor audience interest.

    See Assassin’s Creed 3 for both of these…

  • See, I can’t get behind this article. After two decades or more of very worthwhile crusading to make publishers understand that the most important thing in selling a game is not manipulative marketing, but rather quality, we’re now saying that publishers should ignore the most objective quantifier of quality available to them and instead trust their instincts as to what quality is? What instincts do we think people like Kotick or Riccitiello have for quality? Zero, right? And Metacritic has its flaws, but it’s generally pretty on the money. Games with sub-par Metacritic scores generally have serious problems that should have been identified and fixed if the creative lead had known what they were doing. It’s not cases of budgets running short or trivial one-in-a-million bugs; it’s serious underlying design issues or an almost total incompetence in the QA process. In the case of Fallout: New Vegas, it was full of bugs because they were tracking their issues on scraps of paper and the back of napkins; they had no systematic QA whatsoever. These teams DON’T deserve bonuses.

  • Great article. Hit a lot of very important points. Assuming equivalent scales between sites was the biggest stand-out problem for me (6/10 meaning different things to different sites).

    • I agree, I give this opinion a 17/2. Which is to say I give it the second reason on my list as to why I agree with it, or it is a completely arbitrary number followed by another number that may or may not be related to the first, or it just doesn’t matter.

  • Having different reviewers use the scale in different ways shouldn’t be a problem if scores are standardised before aggregation.

    For example, if one review site averages a 7 for a selection of games with most falling within one point of the average, while a second averages a 5 with most games falling within two points of the average, you could standardise them by adjusting the first site’s scores by 2*N-9 (assuming the scores follow a normal distribution) so they both follow the same scale.

    This is independent of weighting a particular site’s scores in the aggregate: it is simply to correct for differences in the way they use the scale. And the more reviews a site does, the more accurate the standardisation process can be.
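
The standardisation described in the comment above can be sketched as a z-score rescaling. This is a minimal illustration, not something Metacritic is known to do: it treats "most falling within one point of the average" as a standard deviation of 1 (and "within two points" as 2), and the function name `standardise` and its parameters are hypothetical.

```python
def standardise(score, src_mean, src_sd, dst_mean, dst_sd):
    """Map a score from one site's scale onto another site's scale.

    The score is first expressed as a number of standard deviations
    from the source site's average, then re-expressed on the
    destination site's average and spread.
    """
    z = (score - src_mean) / src_sd
    return dst_mean + z * dst_sd


# Site A: averages 7, spread of about 1. Site B: averages 5, spread of about 2.
for n in (6, 7, 8):
    print(n, "->", standardise(n, src_mean=7, src_sd=1, dst_mean=5, dst_sd=2))
```

With those figures the mapping reduces to 5 + 2 × (N − 7) = 2N − 9, matching the adjustment given in the comment. Note that the result can leave the 0 to 10 range (a 10 from the first site maps to 11), so a real scheme would need to decide how to handle the extremes.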

    • As near as I can tell, they don’t do that. I actually was in email contact with Doyle in 2009 and offered to introduce a weighting system that used the full 10 point scale. They weren’t particularly interested, I suspect because they actually assign scores to reviews that don’t have them.

  • The biggest problem I have is that for so many review sites, a score of 5/10 is automatically given just for the game existing in the first place with an extra point given if it just loads successfully, making the lowest score possible a 6/10.

  • I bought New Vegas on release day and played it to conclusion in 59 hours on maximum graphics settings on PC and I think it crashed to desktop maybe two or three times during that period, which isn’t all too much. As a PC gamer I am used to games crashing. Fallout 3 on the other hand was a frustrating experience at times, but it was released not long after the onset of multi-core CPUs, which brought with them more issues to deal with, all of which were patched or took an easy trip into settings.ini to fix anyway. I fail to see why people have to have a seizure because a game isn’t flawless on release and smash it with a low review score.

    I think squeezing complex games such as Fallout and TES into the outdated consoles and their hardware constraints is a risky business, but that being said, there seem to be more console gamers around these days even though the PS3 is practically obsolete now for experiencing games at their highest performance potential. I don’t play games on console that are available on PC, for the reason that I’d give the game a ‘lesser score’ because of how it plays/looks/feels on console. Not having a dig at console gamers but if you really wanted to play FO:NV then you should have played it on PC rather than expecting a masterful game on the PS3. Same goes with Skyrim on PS3 which was a shambles… Oblivion too… even though I DID do 200 hours of Oblivion on PS3 and enjoy it through its random console freezing spaz attacks.

    To give a game 6/10 for not reaching ‘your’ expectations after what, a day’s play or maybe 10 hours, is absurd to me. I don’t think BUGS should affect scores as much as they do; people seem to love knocking 10% off a score each time they have to reset their PC. There are often ways around BUGS, such as Google searches and FIXING problems and TWEAKING hardware. Just because a game is buggy and crashes doesn’t mean it deserves a low score, it usually means your setup is flawed.

    I personally couldn’t care less if a developer’s family ‘starves’ because a game they made gets a low score, that is the stupidest thing I have ever read. People who don’t make games also miss out on bonuses. The idea that game developers are magic people who must be protected is a bizarre notion.

    That being said, there’s a massive divide between triple A rated games and the canyon of 70%-85% rated games and sometimes games are total rubbish for whatever reason and they deserve to be reviewed and rated as such, such as all art.

    I don’t usually play games on release (unless they’re the BIG ones) but during dry spells I rely on Metacritic for reviews and an overall impression of a game before I spend my time and money on a game that may already be two or three years old. In the end people need to focus more on the content of reviews rather than the big shiny number at the end before they make up their mind on whether to play it or not.

  • As stated beforehand, reviews are subjective, depending wholly on the person reviewing it. Show me a review that says it is unbiased and I will show you a hypocrite. A review will always be biased even if the views presented align with the majority of consumers. A:CM is a good example of this: crap game full of bugs and severe problems, got crap reviews from everyone, was called crap by people who bought/acquired it. Other games aren’t this easy however. WoW or specifically its clones are frequently reviewed with someone’s idea of how an MMO should behave in mind even if it isn’t worded this way. FF13 was polarizing for many reasons, almost all of which are biased. Minecraft, people who don’t get it will call it an awful game that is painful to look at and they are right to do this too. An opinion is precisely that, and it should remain that.

    If I am after reviews I will seek out community reviews that have a discussion about what is good and bad about the game. I will also go to GameFAQs and take a selection of reviews that give it a variety of scores, then read them for the information about what they thought was wrong or right with it, then formulate my own opinion.

    I don’t need someone to tell me a game is good or not, but I will listen to them tell me what they thought of the game. The problems arise when what they say is taken as gospel, and this problem is at the core of not so much Metacritic itself but the amount of weight people put in Metacritic as opposed to its sources. People are just far too lazy to research. It would be like flipping to the end of a book, reading “and they all lived happily ever after” and going, well I know it is a good book now and I don’t need to read it.

  • It’s the world of business. Everything is “targeted” around arbitrary KPIs which are usually, more often than not, wholly at the whim of a truly idiotic vocal minority which has nought to do with the business.

  • All I’ll say is that critic scores certainly sway my purchase decisions. If a game gets a review score of less than 70, I won’t even consider it. If it gets over 90… it’s a must-buy. Anything between 70-90 and I have to rely on other things to sway my decisions, like who made the game… my friends’ opinions… the genre… etc.
