Windows Mobile 10 without AArch64 support?

The slides I linked are from ARM, not from Qualcomm.

Yeah. What I should have said is that I'd prefer a source that isn't directly involved in selling the architecture, which excludes both Qualcomm and ARM. Sorry about that.

Geekbench3 is not compiled for NEON (nor SSE/AVX for that matter). <snipped>
Most of the improvements of the ARMv8 ISA directly aim at performance gains. Take the register set as an example. You just don't increase your register set from 16 to 32 if you don't expect significant gains from it. It's not just the gates of the registers and the multiplexers you would waste; you are also losing 3 bits of instruction encoding space (one extra bit for each of up to three register operands), which is very precious when you want to stay at fixed-length 32-bit instructions. It's not only large functions which gain from the increased register set, but also small leaf functions in the call graph, because they now have 8 scratch registers available.
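To make that concrete, here is a toy sketch (my own made-up function; the cross-compiler names are just examples) of the kind of leaf function where the extra registers show up directly in the generated code:

```c
/* Toy leaf function with ten simultaneously live values (not from any real
 * benchmark). Compile it for both targets and compare the assembly, e.g.:
 *   arm-linux-gnueabihf-gcc -O2 -S regs.c    (AArch32: r0-r3, r12 caller-saved)
 *   aarch64-linux-gnu-gcc   -O2 -S regs.c    (AArch64: x0-x17 caller-saved)
 * The 32-bit build is likely to push/pop callee-saved registers or spill to
 * the stack; the 64-bit build typically keeps everything in scratch
 * registers and needs no stack traffic at all. */
unsigned mix10(const unsigned *p)
{
    unsigned a = p[0], b = p[1], c = p[2], d = p[3], e = p[4];
    unsigned f = p[5], g = p[6], h = p[7], i = p[8], j = p[9];

    /* keep all ten values live until the final expression */
    return (a * b) ^ (c * d) ^ (e * f) ^ (g * h) ^ (i * j)
         ^ (a + c + e + g + i) ^ (b + d + f + h + j);
}
```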

If it's not NEON, then those benchmarks are using some other ARMv8 hardware-accelerated feature to misrepresent what should be a measure of general, pure integer performance. Maybe their integer tests are running AES encryption or SHA-1 hashing, which the CPU offloads to ARMv8's dedicated crypto units. Either way, it's extremely fishy.
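For reference, this is roughly how an "integer" AES test silently ends up on that dedicated hardware when built for ARMv8 (a hedged sketch using the ACLE crypto intrinsics; the aes_round wrapper is my own):

```c
/* Sketch of how an AES workload can become a hardware test on ARMv8.
 * __ARM_FEATURE_CRYPTO and the vaeseq_u8/vaesmcq_u8 intrinsics come from
 * the ARM C Language Extensions and are available when building with
 * e.g. -march=armv8-a+crypto; an ARMv7 build falls through to plain C. */
#include <stdint.h>

#if defined(__ARM_FEATURE_CRYPTO)
#include <arm_neon.h>

static inline uint8x16_t aes_round(uint8x16_t state, uint8x16_t round_key)
{
    /* AESE = AddRoundKey + SubBytes + ShiftRows, AESMC = MixColumns,
     * i.e. one full AES round executed by the crypto unit. */
    return vaesmcq_u8(vaeseq_u8(state, round_key));
}
#else
/* ...a generic table-based AES round would go here, typically several
 * times slower than the dedicated instructions... */
#endif
```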

I understand that the ARMv8 ISA includes many advancements. I'm not saying they are irrelevant; they do matter when performing very specific tasks. However, despite all the advancements you mentioned, there is still no way those will get us general performance improvements in the range of >=20% just by recompiling for a new ISA. Not happening. I'm not buying it.

The nice thing about synthetic benchmarks is that they are not reliant on the OS or other libraries. Side effects from an updated OS or updated libraries are therefore removed.

Agreed. On the other hand, I don't know anybody who uses their smartphone primarily to run Dhrystone, AES, or other such algorithms that fit entirely into a CPU's L1 cache, and that's the only sort of thing Geekbench really tests. Such tests have their place if you're talking about CPU theory and want to understand very specific strengths and weaknesses of a given CPU architecture. They are almost meaningless if we're trying to gauge how a CPU will perform in the average, everyday computing tasks that Joe and Jane care about. Unfortunately, if we want to measure the impact of the ARMv8 ISA on the things that actually matter to Joe and Jane, we have no choice but to measure OS glitches and inconsistencies along with everything else. IMHO that's still far more interesting than what Geekbench provides, even if it isn't as pure. To capture the stats that actually matter to most users, I prefer something like PCMark.
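To illustrate what "fits into L1" means in practice, here is a toy sketch of my own (the sizes, pass count, and timing method are assumptions, not anything Geekbench actually runs): the very same loop looks blazingly fast on a 16 KB working set and much slower once the data no longer fits in the caches.

```c
/* Toy demonstration: the same summation loop over an L1-resident buffer
 * (16 KB) vs. a DRAM-bound buffer (64 MB). Build with e.g. gcc -O2. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static double sum_mb_per_s(size_t bytes)
{
    size_t n = bytes / sizeof(int);
    int *buf = malloc(bytes);
    if (!buf) return 0.0;
    for (size_t i = 0; i < n; i++)
        buf[i] = (int)i;

    struct timespec t0, t1;
    long long acc = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int pass = 0; pass < 100; pass++)
        for (size_t i = 0; i < n; i++)
            acc += buf[i];
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    fprintf(stderr, "checksum %lld\n", acc);  /* keeps the loop from being optimized away */
    free(buf);
    return 100.0 * bytes / (1024.0 * 1024.0) / secs;
}

int main(void)
{
    printf("16 KB working set: %.0f MB/s\n", sum_mb_per_s(16 * 1024));
    printf("64 MB working set: %.0f MB/s\n", sum_mb_per_s(64 * 1024 * 1024));
    return 0;
}
```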

In addition, there is a Cortex-A57 version of the Note 4 (model N910C with the Exynos 5433). The Note 4 was never updated to 64-bit Android, so there you have a comparison right at hand.

Okay, thanks for pointing that out. I've taken a closer look at the two devices you mentioned. These are my results:

Device                 | GPU           | CPU                  | Clock   | Clock %
Samsung Galaxy Note 4  | Mali-T760 MP6 | Exynos 5433 / 4x A57 | 1.9 GHz | 100%
Samsung Galaxy S6      | Mali-T760 MP8 | Exynos 7420 / 4x A57 | 2.1 GHz | 110.5%

As the GPU in the S6 is quite a bit more powerful than the GPU in the Note 4, we must disregard any graphics- or video-related benchmarks (video playback, photo editing, 3D rendering, etc.). We'd also have to compensate for the Galaxy S6's 10.5% higher clock rate. These are the benchmarks from Anandtech's Bench that most closely correlate with what we're trying to measure:

[attached image: bench-2.png, selected CPU benchmark results from Anandtech's Bench (source)]

These benchmarks are a far more accurate representation of what to expect in real-world usage; they mimic the CPU workloads created by browsing the web or editing a document on your phone. The results are either very similar for both CPUs, or they quite accurately reflect the roughly 10% clock-rate advantage the S6 has over the Note 4. Almost all the other results listed in Anandtech's Bench are GPU-related, so we can't rely on them to tell us anything about ARMv8. If both CPUs had the same clock rate, the measured results would be pretty much identical across the board.
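For anyone who wants to do the clock compensation themselves, the arithmetic is trivial. The sketch below assumes, as a rough approximation, that CPU-bound scores scale linearly with frequency; the score value is made up, not taken from the Anandtech results above.

```c
/* Crude clock-for-clock normalization of an S6 score to the Note 4's
 * clock rate (assumes roughly linear scaling with frequency). */
#include <stdio.h>

int main(void)
{
    const double note4_ghz = 1.9, s6_ghz = 2.1;

    double s6_score = 1000.0;                                  /* hypothetical score */
    double s6_at_note4_clock = s6_score * note4_ghz / s6_ghz;  /* ~905 */

    printf("S6 score scaled to 1.9 GHz: %.0f\n", s6_at_note4_clock);
    return 0;
}
```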

This is pretty much exactly what I expected to find. In everyday use, there is little to no discernible difference between 32-bit ARMv7 and 64-bit ARMv8 code running on the same CPU. If the claimed general >=20% improvement were in any way accurate, we'd have seen it reflected in these benchmarks.
 
should have said is that I'd prefer a source that isn't directly involved in selling the architecture, which excludes both Qualcomm and ARM.

So you think ARM hopes to sell more of its CPUs when it has to admit that its 32-bit performance is not up to par?

These benchmarks are a far more accurate representation...

*sigh* I have been talking the whole time about native compilation, and you come up with benchmarks that show the current state of the Java and JavaScript engines more than anything else. Besides that, why do you assume that those engines have been updated to 64-bit at all?
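If you want to know for sure, you can make the component report which ISA it was actually built for. A minimal sketch (the macros are standard GCC/Clang target defines, the function name is made up):

```c
/* Drop this into any natively compiled module to see which ISA the
 * build actually targets; the macros are predefined by the compiler. */
const char *built_for(void)
{
#if defined(__aarch64__)
    return "AArch64 (64-bit ARMv8)";
#elif defined(__arm__)
    return "AArch32 (32-bit ARM)";
#else
    return "something else";
#endif
}
```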

And regarding "real-world usage": does it matter much if a JavaScript code snippet on a website executes in 5 or 10 ms? I don't think so. The user will not notice a difference anyway.
I am obviously talking about use cases where performance matters, and where developers deliberately choose native compilation for performance reasons.
In games, for instance, a frame-rate difference of 20% (say, 60 fps versus 48 fps) would be very noticeable to the human eye without needing a stopwatch. The same goes for cases where the user has to wait a significant amount of time, like when an expensive video or photo effect is applied, or when creating a zip archive, etc.
 
If you don't like the JavaScript tests, there are still the PCMark benchmarks to look at.

Still, what runs JavaScript? A parser and a runtime environment. Those are anything but trivial; they are components that developers invest a lot of optimization effort into, and every part of them is implemented as natively compiled code. Where exactly is the relevant difference to a natively compiled game? For our purposes, as long as the functions we measure aren't trivial, mimic typical everyday computational loads, and are implemented by decently optimized native code, it's irrelevant what that code actually does. It's still a relevant measurement.

At the very least, it's still a lot more meaningful than the tiny mathematical algorithms tested by Geekbench, which all fit into L1 cache (and which would also be completely atypical for games, BTW).

You seem to be saying that it's reasonable to expect a natively compiled and optimized game to exhibit performance differences of >=20%, while a natively compiled and optimized JavaScript runtime environment or HTML rendering engine could exhibit performance differences of 0%. That seems like an extraordinary claim to me. Extraordinary claims require extraordinary evidence, and I doubt there is any.

If you can show me a benchmark that at least somewhat mimics a smartphone's typical computational load, and in which these two devices show very different results, I'll change my mind. From what I see here now, I very much doubt this is an issue we need to care about.
 
In games, for instance, a frame-rate difference of 20% (say, 60 fps versus 48 fps) would be very noticeable to the human eye without needing a stopwatch. The same goes for cases where the user has to wait a significant amount of time, like when an expensive video or photo effect is applied, or when creating a zip archive, etc.
I think games, video effects, and photo effects depend heavily on the GPU rather than the CPU.
 
