With Baseball Stats, Trying To Synthesise Real, Fun

The more true-to-life the machine simulation, the more repulsive humans find it. That's the "uncanny valley." After hundreds of games I realise baseball on my Xbox 360 and PS3 will always battle that conundrum - and not in their visuals.

MLB 10 The Show is acclaimed and marketed as the most realistic baseball simulation on a games console. So I had to snort when I got to the end of a season and noticed that Alex Rodriguez, at 34, had clubbed 61 home runs for the New York Yankees. My God, given that team, that number and that player, the mind reels at it actually happening in real life. That's realistic?

It strains the imagination, but let's ask what would be, if not repellent, more disappointing to the human eye: A game that does allow A-Rod a one-for-the-ages home run performance, or one so painstakingly mathematically accurate he's handcuffed to half that total, which is more what he's expected to deliver this year?

When they build the game, said Jody Kelsey, a senior producer for MLB The Show, they try not to make realism and fun an either/or proposition. "The question is how do we accurately represent reality in a way that is fun and interesting to the user," he said. "You have to remember that baseball is a game where failing seven out of every 10 times is quite literally an amazing success. It's that 30 percent of the time when you finally succeed that we want the user to feel as rewarded as possible. It's a rush only the sport of baseball can produce."

Earlier in April, I reached out to Marc Normandin, a video game enthusiast and an analyst for the respected journal Baseball Prospectus. I asked if he would look at some simulation numbers in both MLB 10 The Show and MLB 2K10 and compare them with his publication's projections for certain players.

The purpose wasn't to ambush anyone's authenticity claim. I just wanted to see what went into what must be one hell of a balancing act. No sport has its stats become labels - if not titles - like baseball. The 40-home run slugger, the 20-game winner, the venerated .300 hitter, these season-long reputations create enormous expectations, not the least of which is how a hitter will perform in a video game. How close do they hew to reality, and how much do they indulge a fan whamming on the square button every time Albert Pujols is at bat?

I asked Normandin to look at data from five simulated seasons in MLB 2K10 and MLB 10 The Show. That's an anecdotal sample, but so would be 50 simulations, and compiling stats for even half that many would be prohibitively time consuming. Normandin figured the measure of how these games project players wouldn't be in the superstars' numbers, but in the up-and-comers' and the journeymen's performances - players that Baseball Prospectus figures would be due for a boost - or a drop - this year, for reasons like age or changing teams or leagues.

In The Show two caught his eye.

"Kevin Kouzmanoff's numbers look terrible, and Adrian Beltre's numbers look terrible, and those are two guys who we think should see improvement," Normandin said. Kouzmanoff and Beltre, third basemen for the A's and Red Sox, respectively, moved to more favourable home hitting venues (departing San Diego and Seattle).

In The Show, Beltre on average hit .250, with a .297 on-base percentage and a .405 slugging percentage (the total bases a player gets from all hits divided by his at-bats). His OPS, the sum of his on-base percentage and his slugging percentage (a now widely accepted measure of overall quality), was .703. All are pedestrian numbers.
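For readers who want to check the arithmetic, here is a minimal sketch of the two formulas described above. The function names are my own, and the inputs are Beltre's rounded simulation averages as quoted in this column, so the sum lands a point below the reported .703, presumably because the underlying figures carried more digits before rounding.

```python
def slugging(singles, doubles, triples, home_runs, at_bats):
    """Slugging percentage: total bases from all hits divided by at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def ops(on_base_pct, slugging_pct):
    """OPS: on-base percentage plus slugging percentage."""
    return on_base_pct + slugging_pct


# Beltre's averaged line in The Show, as rounded in the text above.
beltre_ops = ops(0.297, 0.405)
print(f"Beltre OPS from rounded inputs: {beltre_ops:.3f}")
```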

"Beltre's middle-of-the-road projection is better than anything in those sims," Normandin said. He reasoned that The Show does not adjust for "park factor," an advanced measure that controls for the favorability, or lack of it, of a hitter's home stadium. Kelsey confirmed that the game does not weight player performance according to league or park trends.

"The only reason I can come up with is that it's never been designed that way," Kelsey said via email.

"We use three years' worth of stats to determine each player's individual attributes," Kelsey said, with the most recent year weighted the most. Asked if pure simulation data was divorced from the game's player ratings - if it was calculated by some other means for purposes of realism - Kelsey said it was not. "Simulations mirror what an average user might do, but if the user is doing great or poor throughout their season, their results will vary from a sim."
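Kelsey did not share the actual weights, so what follows is only a sketch of how a recency-weighted three-year blend could work. The 1-2-3 weights and the slugging figures are hypothetical, chosen to show how one big recent season pulls the blended number above a plain average.

```python
def weighted_rate(seasons, weights):
    """Blend per-season rates (oldest first) using one weight per season."""
    if len(seasons) != len(weights):
        raise ValueError("need one weight per season")
    return sum(r * w for r, w in zip(seasons, weights)) / sum(weights)


# Hypothetical slugging percentages, oldest season first (not real player data).
slugging_by_year = [0.400, 0.420, 0.560]
# Hypothetical weights: the newest season counts three times as much as the oldest.
weights = [1, 2, 3]

blended = weighted_rate(slugging_by_year, weights)
plain = sum(slugging_by_year) / len(slugging_by_year)
print(f"weighted: {blended:.3f}  unweighted: {plain:.3f}")
```

With these made-up inputs the weighted figure comes out higher than the plain average, which is the effect Normandin describes when a short, hot recent stretch dominates a player's record.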

There also seem to be other factors tossed in that affect whether a player has an excellent or terrible season. In one year on The Show, Boston pitcher Clay Buchholz compiled a fat 6.49 earned run average even over a full season (170 innings pitched) backed by the Red Sox defence, considered among the best in the majors. "That's just not happening," Normandin said. Kelsey noted that in three other simulations I ran, Buchholz still had 12 to 14 victories and an ERA around 4.00. "As in real life, players can have a bad season in MLB The Show," Kelsey said. "The ability for a player to perform above or below his expectations adds to the realism of the simulation."

Back to Beltre. In MLB 2K10, "The performances look more like I expect to see from Beltre than The Show's," Normandin said. Beltre hit .262, with a .350 on-base percentage, .449 slugging percentage, and an OPS that just nicked .800 with rounding.

Sean Bailey of 2K Sports said MLB 2K10's designers did factor in Beltre's team and ballpark switch in building his ratings.

"Beltre definitely helped his virtual case by moving to the hitter-friendly park in Boston," Bailey said. "We also have to take into consideration whether an off-season move will result in the player being pitched around or if it means the player will be seeing better pitches. Beltre is a guy who should see better pitches now that he is part of a dangerous lineup."

A player like the Pirates' Garrett Jones also offers a good look at the games' behaviour, Normandin reasoned, because he has no three-year record (just two seasons prior to this one) and last year, in only half a season, returned numbers indicating a player on the rise (in 2009, he hit 21 home runs with a .938 OPS and 44 RBI). Baseball Prospectus will look at Jones and adjust his performance for what's expected of a major leaguer the same age - in this case, Jones got out of the minors late in life (he was 28 last year). Normandin expected Jones to have a strong performance in The Show and he was right. Jones averaged 31 home runs, hit for a batting average in the .290s, and had an OPS above .900.

"It took forever for him to hit for power in the minors," Normandin said, "I expected him to retain some of that power from last year; the Show has him retaining all of it. Because of the way they weight data and his small sample from last year, he comes out amazing this year, based on what they had to work with last year."

At MLB The Show, Kelsey said that 350 at-bats' worth of data is reliable enough to project his batting average plus-or-minus .050. "If we're talking about power, it's worth noting he's hit more home runs in the last 300 at bats than many hitters in 600," Kelsey said. "Even if he hit a grand total of zero home runs for another 300 at bats (highly unlikely), he'd still be worth his salt as home runs go."
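Kelsey's plus-or-minus .050 lines up with a simple back-of-the-envelope check. This is my own sketch, not the studio's method: it assumes every at-bat is an independent trial with the same chance of a hit, uses a margin of two standard errors, and plugs in .290 as an illustrative average.

```python
import math


def batting_average_margin(avg, at_bats, z=2.0):
    """Approximate margin of error for a batting average.

    Treats each at-bat as an independent hit-or-miss trial (an assumption),
    so the standard error is sqrt(p * (1 - p) / n).
    """
    return z * math.sqrt(avg * (1.0 - avg) / at_bats)


margin = batting_average_margin(0.290, 350)     # half a season of data
margin_600 = batting_average_margin(0.290, 600)  # closer to a full season
print(f"350 at-bats: +/- {margin:.3f}   600 at-bats: +/- {margin_600:.3f}")
```

Under those assumptions the margin for 350 at-bats comes out a shade under .050, and it narrows as the at-bat count grows.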

Normandin examined another slugger his publication figures to regress: Philadelphia's Raul Ibanez, at 37, who hit 34 home runs, batted .277 and had an .899 OPS last year. In MLB 2K10, Ibanez comes back with 30 home runs, a .285 batting average and an OPS near .870. Normandin thinks Ibanez had a strong first half of 2009 that overrode a more normal, and poorer, second half. For a game measuring three seasons, with the most recent the heaviest, "they're much too focused on what he was capable of in the first half of last year."

"With much respect to your BP analyst, we do not feel Ibanez is overrated," Bailey said. "The 34 home runs can't simply be attributed to going to a hitter's park in Philly [when] 21 of his 34 home runs came on the road. He had a great year and has been too consistent over his career to punish via ratings. However, our ratings guru says that Raul has one of the shortest leashes on him, as far as dropping his ratings [in midseason roster updates]. 37 years old is still 37 years old."

This is all just calculation, though. Since a player's rating also affects user performance, both studios have to design for a realistic outcome within the success or failure of an actual at-bat. "Players and ratings both play a dominant role in at-bat results, and the effects stack cumulatively," Kelsey said. "There are many components, from the physical update of the ball and batter to the influence of pitch speed, type, and location on the batter's aim and timing."

Bailey identified nine background calculations going on in MLB 2K10 when a player swings, but said the game is necessarily dependent on a user's timing and the physics that follow. "We want the gamers to dictate the result of their plate battle, not our background calculations. The calculations are there so that each batter accurately represents the strengths and weaknesses of their real life counterpart."

For an analyst like Normandin, the ability for a proficient player to warp players' numbers has him spending more time in MLB The Show's Road to the Show singleplayer mode, working on developing one guy, rather than playing all nine in a roster. "After a period of time, it's like everyone you control is hitting .500 and you get too good for the game," he said. Playing all nine hitters gives you the benefit of at least 27 plate appearances where you can experiment and practice your timing, whereas in singleplayer, you get three or four per game, increasing their importance and providing more realistic progression.

"The mode I use is a good indication of what I expect," he said. "If you play as that, it takes much longer to get good at the game; if you play regular season or franchise modes, you'll get good at it pretty fast."

Baseball fans like Normandin, and there are many, can derive some appreciation of seven failures in 10 tries, as long as they form the believable narrative of a major league baseball game. But video games are still driven in large measure by one's successes in them. And let's face it: No one buys a ticket to see someone hit .265. They don't spend twice as much to do it in a video game either.

"I think you do go into that uncanny valley a little," Normandin said. "The Show might be as accurate as they want it to be, because if you get any further, you alienate parts of the market. Game designers have to be aware of those guys, while making it for someone like me who wants to see it at its most realistic, they're still also making it for a guy who just wants to see his favourite player be as good as he thinks he is."

Stick Jockey is Kotaku's column on sports video games. It appears Saturdays at 2 p.m. U.S. Mountain time.

Editor's note: Due to travel and time off, Stick Jockey will not be appearing the next two Saturdays. The next column will publish May 22.

