Benchmarks are Wiggedy Wiggedy Whack...

I wrote this article for the latest edition of CPU Magazine. Remember, if you haven't had the opportunity to read this magazine you should go pick up a copy soon.
There has been a lot of speculation about AMD’s new Phenom processor, and up until now the chip’s performance benchmark scores vs. the competition’s were predicted to be somewhat underwhelming. I’d like to address that this month, as well as the validity of benchmarking as it pertains to the customer experience in general.
I like performance benchmarks, but I’m starting to think people are putting too much weight on them and not enough on the overall customer experience. This is roughly as silly as directly equating one’s physical strength to how much he can bench, or comparing drag racing to Formula One racing—there are some profound differences here.
When you really boil it down, no one cares about benchmarks when they are playing a game, churning out a report, or reading a medical image. Let’s face it, on the high end 300 points in 3DMark06 isn’t going to affect gameplay even at the highest resolution with maximum detail. Gamers care that their PCs display the best possible image while delivering the most compelling and most stable gaming experience. Users care that when they turn on their PC it boots reasonably quickly and works with all of their devices. If one of their components fails, they are looking for a simple way of replacing the component with the best access to customer support. Thus there are undeniable differences between overall experience and performance benchmarks. I like to refer to this internally as the “experience benchmark,” and it’s a tough one to measure because it’s so subjective.
Yes, there was a point when gamers thought the only things that mattered were frame rates, but I think that the tide is shifting. Ask any pro gamer to compare two similarly equipped PCs side by side running the same game at the same resolution, and I guarantee you that he will have no clue which delivers the highest frame rate. He’ll most likely be inclined to choose the one that he thinks delivers the best image quality.
These changes are occurring because current hardware delivers incredibly high levels of performance and the software has not quite caught up yet, although some games are starting to deliver, like MS Flight Simulator X, Company of Heroes, and upcoming titles such as Hellgate: London. Vista introduced new levels of complexity and experience to gamers and enthusiasts alike. Overall the initial experience of Vista sucked, but it has turned a corner and things are beginning to look up thanks to DirectX 10.
So again, I believe that performance benchmarks are not the true measure of a compelling experience. There are many more factors to consider when building and/or buying your next PC. You’ll want to consider operating noise, image quality, ease of access, ease of upgradeability, ease of replacing components, how “quick” it feels when you’re booting it up, storage space, stability, operating system usability, style and design, and, if you are a gamer, how well it delivers in the area of gameplay.
So, AMD decided to unveil Phenom running at 3.0GHz without showing actual benchmarks. What it showed was a game running smoothly with all details enabled, which makes perfect sense to me. And for the record, if you were to benchmark Phenom at 3GHz you would see that it kicks the living crap out of any current AMD or Intel processor. It is a stone cold killer (and that is at 3GHz; imagine how it would perform if they could squeak some more juice out of it).
I’m guessing that AMD will be able to launch some parts at higher clocks than it is currently showing in its roadmaps, and if the company can get these chips on shelves in a timely fashion, I think it could be a major coup and could even be the impetus for the turnaround the company so desperately needs. Of course, Intel probably won’t get caught flat-footed, but AMD has to start somewhere.
It’s interesting to note that AMD isn’t showing benchmarks on a part that delivers the goods—perhaps it too is seeing that performance benchmarks are only a small piece of the overall experience puzzle. That said, I suspect there will be some more shaking up at AMD before the sun starts to shine green again.
14 comments:
"So, AMD decided to unveil Phenom running at 3.0GHz without showing actual benchmarks. What it showed was a game running smoothly with all details enabled, which makes perfect sense to me."
-- Not really, games are VIDEO CARD dependent, and far less CPU reliant. You write it yourself:
"Ask any pro gamer to compare two similarly equipped PCs side by side running same game at the same resolution, and I guarantee you that he will have no clue which delivers the highest frame rate. He’ll most likely be inclined to choose the one that he thinks delivers the best image quality."
-- you can configure a PCIe system with a 2.4 C2D or 2.8 A64 and you'll get "the best image quality" as long as you have a powerful VGA card. Using games to show off raw CPU power in a useful manner is, in this day and age, not the way to do it, and doesn't make much sense.
Show us Divx/Audio encoding, Winrar compression, Autocad operations, 3D rendering, etc. Applications which are CPU limited, only then can you gauge system performance.
As it stands, AMD Phenom has to be introduced at 2.4+ at prices competitive with C2D, and they better have a higher clocked version to follow up, as Penryn seems to be shaping up nicely to gain a few % clock for clock compared to Conroe.
Seems they're pulling a HD 2900 XT? Launching their top end product as a mid-range/mid-priced product?
As always, price and availability will prevail over pure performance and benchmarking:)
AMD has stumbled severely in the past couple of years. I understand that we need to keep them around to keep Intel honest and innovative, but let's not put lipstick on this pig.
Proof of this is in their latest antitrust lawsuits and marketing campaign. They have turned themselves into the whiny tattletale on the playground.
AMD has completely forgotten how to engineer a good product and has fallen back on its old standby, which is to slash prices and hope for the best.
This has to be the best article that you have written in the last month or two. Benchmarks have too much hype surrounding them - that is so true.
Man, write a book. I'd read it like I read Harry Potter (which was a long time ago).
Benchmarks are so people can see how applications perform on a certain piece of hardware.
Synthetics = Garbage.
Sisoft sandra has no use in my life.
It does not calculate things for me.
It does not compile programs.
It does not check email.
It does not surf the web.
It does not disassemble or debug programs.
It does not play games.
It does not encode audio.
It does not encode video.
It does not synthesize audio.
(are you getting the point yet)
Now when I look at my benchmarks for all the things I generally do with my computer, I can see which processor is best for me.
Benchmarks are extremely useful, unless you are AMD's marketing dept.
I think you need to change the title to "Synthetic Benchmarks are Wiggedy Wiggedy Whack..."
Yes, it's true a game is not the way to benchmark a CPU's true performance... but in the case of the 3GHz Phenom demonstration, showing the CPU's raw number-crunching performance just wasn't the main priority for AMD, imo. When I first read the article, one thing simply jumped out at me: the small size of the cooler. Let's face it, that thing wasn't some bleeding-edge monster cooler; in fact it was rather small and plain looking (and considering the bling around it, it really stood out, as if someone wanted to attract attention there ;) ). And what does that tell me as a performance-per-watt-conscious user and a dedicated overclocker? That thing simply couldn't be running molten hot on a suicide/heavy overclock for that long on such a cooler. And they proved that by running the game so the GPUs and CPU would be under load all the time. I understand the CPU was hand-picked for the demonstration, but as a potential AM2+ user I now know this design is capable of 3GHz+ with stock cooling, which gives me more OC headroom with a more effective cooler. Guess we could call it a "stability" benchmark for a change. Don't know about you, but THAT does make a lot of sense to me :)
Rahul, What are your thoughts on Henri Richard leaving AMD? Is it a positive sign that upper management is making the right changes, or is it a sign of a sinking ship?
"And for the record, if you were to benchmark Phenom at 3GHz you would see that it kicks the living crap out of any current AMD or Intel processor"
Citation needed ;)
The really big problem with many benchmarks is that you don't know what is measured, because you don't have the source code. So looking at all these benchmark results at the hardware sites is really funny.
Hi Rahul :)
Any comments on Theo Valich's incredible post at The Inquirer about breaking the 30K mark with Phenom @ 3GHz and CF 2900XT cards?
Seems pretty aligned with your line of "cold blooded killer" :D
If Theo is correct, then a 3GHz Phenom is much, much faster than any Intel quad out there, even Penryn and possibly Nehalem!
After seeing the K10 GCC benchmark published on the AMD site today, I am humbly waiting for high-clocked parts.
I do wish for 64bit Linux indie benchmarks and reviews as well.
This may convince me to not go 2P, but stay with 1P.
(In all honesty, what I read so far regarding Shanghai, Barcelona doesn't look as tasty.) In any case I hope more will become clear in Q4.
Guess you guys are finally realizing what Kyle over at HardOCP has been saying now for 4 years about the user experience and benchmarks? Great article from 2005.
http://www.hardocp.com/article.html?art=ODYyLCwsaGVudGh1c2lhc3Q=
And others that look at real world performance, not just benchmarks.
http://www.hardocp.com/article.html?art=MTI2MiwsLGhlbnRodXNpYXN0
http://enthusiast.hardocp.com/article.html?art=MTEwOCwsLGhlbnRodXNpYXN0
http://enthusiast.hardocp.com/article.html?art=MTAwMiwsLGhlbnRodXNpYXN0
When you play a game, your framerate isn't going to be decided by your CPU; it depends on your GPU, so why would you compare the two? And obviously a 3.0 GHz quad processor RIGHT NOW would blow out anything else in the market.