The Facebook one I haven't seen, so I don't know. The restaurant one isn't necessarily an unfair test, but the result does have to be taken with some heavy caveats.
Did WP get a list of four-star restaurants up first? Yes (sort of - it got a list of restaurants up graded by rating, not specifically 4-star ones only). But that's where the comparison has to end really, because the people are likely looking at and doing different things in their apps. What if, for example, I want to filter by distance and then by star rating? That might be easily doable in another search app, but Local Scout doesn't allow that, so it would take longer to reach the same end result using it, as I'd have to eyeball the star rating for all local entries. Who has the best customer ratings system with the most feedback, and where is that info sourced from? Did the contestants all know the best way to just get a list up fast, or were they doing it in a way that got them the most information or the most accurate result for what they specifically wanted? This isn't saying Local Scout produced poor results at all, just that the results everyone produced aren't really directly comparable. Or to put it another way, someone searching via a restaurant app might have to wait a few seconds more, but they might get richer info that someone using Local Scout might have to run a new search to get. In some circumstances it might not matter, e.g. if you're just looking for the nearest coffee shop as fast as possible regardless of what chain it is or any other criteria. Or it might matter a lot if you want a vegetarian restaurant rated 3 stars and above within a 3-mile radius with customer reviews.
Getting the end result fast might indeed be the test, but there's no assessment of how useful that result is, so the test to me is flawed because speed isn't the only criterion that's relevant. I can get dressed fastest for the day if all I wear is a sack, but will I be comfortable? Probably not, so my boast about being quickest is consequently devalued.
That's why I liked the tweet test - it's doing the exact same task to get the exact same end result. And the fact that WP effectively drew with an iPhone 4S doing it is something to genuinely be proud of. I'd love to have seen WP7 beat Android and the iPhone running an identical search via Yelp!, as it's available on all three platforms, but we didn't get that - and that's what bugs me (with the caveat I'm a real stickler for detail and fairness so probably get annoyed by this sort of thing more than most would).
Now if you'll excuse me, I'm going to go and change out of this sack. It's itchy.