1. Cruncher04's Avatar
    It just came to my attention, that even in the latest version of Visual Studio, there is no support for aarch64. When thinking about it, there is also no way to submit aarch64 apps to the app-store.
    The conclusion would be that Windows 10 Mobile (and all apps in the store) will be 32-bit even on the Lumia 950, even though it contains a 64-bit processor. Likewise, Lumia 950 phones will be much slower than Android phones with the same Qualcomm SoC.

    Is Microsoft stupid, or am I missing something?
    09-25-2015 02:44 PM
  2. npoe's Avatar
    I haven't seen data about the difference in performance between 32-bit and 64-bit apps on ARM. I remember some Qualcomm head engineer saying that it was really about marketing and not about performance.
    09-25-2015 03:51 PM
  3. Cruncher04's Avatar
    I haven't seen data about the difference in performance between 32-bit and 64-bit apps on ARM. I remember some Qualcomm head engineer saying that it was really about marketing and not about performance.
    That was the statement of a Qualcomm engineer, when only Apple had 64 bit :p
    Cortex-A57 is about 20-25% faster with 64 bit code, which is significant. The aarch64 ISA is much more optimized and among other things has double the amount of registers (increase from 16 to 32).
    09-25-2015 06:17 PM
  4. jpal12's Avatar
    The main difference between 64-bit and 32-bit arm is the new arm version. The 64-bit part is mostly just marketing.
    a5cent likes this.
    09-25-2015 09:30 PM
  5. Cruncher04's Avatar
    The main difference between 64-bit and 32-bit arm is the new arm version. The 64-bit part is mostly just marketing.
    That is why I was referring to AArch64 and not just 64-bit. Programs compiled for AArch64 show a significant speed-up compared to the same programs compiled for AArch32/ARMv7 on the _SAME_ CPU (Cortex A57, Cortex A72).
    My issue is that Windows itself and all apps are compiled for AArch32/ARMv7 and not for AArch64. That makes the CPU run in compatibility mode and throws a lot of performance out of the window, in particular compared to Android and iOS, which both fully support AArch64.
    This just means, that while having the same SoC as Android Smartphones, the performance will be quite a bit worse.

    Or to make it more clear, Windows phones will be one generation behind Android phones using the same SoC performance wise.
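    For native code the instruction set is fixed at build time, so the same source has to be compiled once per target. A minimal sketch of how a C/C++ app could report what it was built for, assuming the usual predefined compiler macros (`__aarch64__` and `__arm__` on GCC and Clang, `_M_ARM64` and `_M_ARM` on MSVC toolchains that target ARM). The helper names `target_isa` and `pointer_bits` are made up for illustration:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: names the instruction set this binary was compiled
// for, based on macros the compiler predefines for the build target.
inline std::string target_isa() {
#if defined(__aarch64__) || defined(_M_ARM64)
    return "AArch64";
#elif defined(__arm__) || defined(_M_ARM)
    return "AArch32";
#elif defined(__x86_64__) || defined(_M_X64)
    return "x86-64";
#elif defined(__i386__) || defined(_M_IX86)
    return "x86";
#else
    return "unknown";
#endif
}

// Pointer width of this build in bits: typically 64 for an AArch64 build
// and 32 for an AArch32 build, even when both run on the same Cortex A57.
constexpr std::size_t pointer_bits() {
    return sizeof(void*) * 8;
}
```

    An AArch32 build of this running on a Cortex A57 would still return "AArch32", because the answer depends on how the binary was compiled and not on the CPU it runs on.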
    09-26-2015 08:13 AM
  6. npoe's Avatar
    I was kind of expecting 32 bits when they got 3 GB of RAM for the Lumia 950 but I didn't know the specific impact that 64 bits or 32 bits would have. I kind of wonder if they are going to change the OS once Qualcomm has their proprietary architecture in the market.
    09-26-2015 06:10 PM
  7. Cruncher04's Avatar
    I kind of wonder if they are going to change the OS once Qualcomm has their proprietary architecture in the market.
    Not sure why they would align the introduction of 64-bit support with Qualcomm's proprietary architecture. I mean, Snapdragon 820 with its proprietary architecture will be AArch64 compliant, as will any other phone SoC for the next years. Even low-end and mid-range phones are introducing AArch64 with Cortex A53.
    What irritates me even more is, that there is no sign of AArch64 support in Visual Studio 2015. I mean this would be a prerequisite for everything else.

    It already bothered me that Microsoft is introducing a high-end phone at the end of Snapdragon 810's life cycle. And then there is not even a chance to use this SoC as it was supposed to be used?
    09-26-2015 06:53 PM
  8. npoe's Avatar
    That bothers me too, but since I'm using a Lumia 920 the 950 XL will be an awesome upgrade. I just waited too much time to change my cellphone this time. Hopefully they won't take that much time for the 2016 flagships.

    Sadly, with the way that Microsoft usually works I don't hold a lot of hope for them to release a new 64 bit version, but I still want to believe that it'll be here "soon". I hope not to be disappointed.
    09-28-2015 09:05 PM
  9. AfroPhysics's Avatar
    The applications will probably show as AnyCPU not aarch64. Any platform specific optimizations will be handled by the CLR and not be specifically compiled to that and only that. Updating and extending compiler support in the CLR and framework for CPU based optimizations can be done later, which is a benefit of managed code.
    09-28-2015 09:57 PM
  10. Ivan05il's Avatar
    "just marketing" is not something MS can afford to ignore. Android phones and Apple are fighting who has more cores, more bits, more RAM. Yes, some of it are gimmicks or not to be used to the full potential, but that's not that important in the big picture. Dragging reason into this won't sell the phones. MS needs to show that Windows Mobile isn't just an afterthought, but that they care to give their customers the best.
    09-29-2015 04:24 AM
  11. Cruncher04's Avatar
    The applications will probably show as AnyCPU not aarch64. Any platform specific optimizations will be handled by the CLR and not be specifically compiled to that and only that. Updating and extending compiler support in the CLR and framework for CPU based optimizations can be done later, which is a benefit of managed code.
    Few statements:
    1) Updating runtime translation within the CLR requires that Windows 10 Mobile itself is compiled for 64 bit in the first place.
    2) Not all apps are managed code; there are lots of native applications, and Windows 10 Mobile itself, which are compiled native.
    3) It can be argued that native apps need the performance most, because that is one of the reasons they are compiled native in the first place. In fact most if not all high quality games are all C/C++. Runtime critical algorithms in photo and video apps including apps like VLC are all C/C++.
    09-29-2015 03:42 PM
  12. mortici's Avatar
    That's because ARM based chips are not what MS is going after... x86 SoC chips is where they are going. There is no reason to invest in a new arch type for dev when you can just use existing amd64/x86. You know, so you can run win32 apps on your phone with Continuum....

    MS isn't looking to keep the branch of ARM and x86, they want to get rid of ARM and position themselves with x86, thus completing the whole picture. Universal apps help this along because you have support for the old 32bit ARM chips, focusing on the new aarch64 is a wasted effort when x86 SoC chips can and should outperform them in performance and battery... that all depends on Intel at this point I suspect...

    This is all speculation and personal opinion.
    09-29-2015 06:38 PM
  13. a5cent's Avatar
    The answer is unfortunately "no". Visual Studio doesn't (yet?) include compilers that target ARMv8.

    On the other hand, we've been through this in the PC space before, and at no point did an updated ISA alone ever result in general performance gains >= 20%. While such gains (and more) were measurable under very specific circumstances (video encoding with QuickSync is one example), they were never general gains. For the average word/mail/office user, which requires little more than bread & butter integer performance, those ISA additions came and went largely unnoticed. That's how most of this will play out for smartphones too. That's why I'm highly sceptical that >=20% is a realistic percentage, at least not when measured under circumstances that mimic real life smartphone usage (not synthetic benchmarks).

    I've searched, but so far I haven't found anyone that benchmarked the ARMv7 and ARMv8 variants of a single app on the A57. Comparisons are always drawn to older A53 or A15 CPUs which is completely useless for this purpose. Can you help out?

    Don't get me wrong. I'm not excusing MS here. In contrast to the whole 64bit debate which is a complete joke, supporting ARMv8 is important and IMHO should already be part of Visual Studio. It is a big deal, I'm just sceptical it's the kind of deal-breaker big deal that a general performance improvement of >=20% would represent.

    Sources (other than Qualcomm marketing material) very much appreciated...

    @mortici
    I disagree with your take on ARM. Just look at how MS is evolving Visual Studio into a cross platform development environment for Android and iOS which also has direct support for cross platform development tools like Unity and Xamarin. Even MS' own UWP bridges (Astoria / Islandwood) depend on the availability of ARM compilers. As long as MS intends to play in the smartphone space, they can't afford to ignore ARM compiler technology. It's not going away or being phased out.
    Last edited by a5cent; 09-29-2015 at 07:27 PM. Reason: last paragraph
    npoe and mortici like this.
    09-29-2015 07:15 PM
  14. Cruncher04's Avatar
    Regarding aarch32 vs aarch64 on Cortex A57 there are some slides from ARM:



    In addition you can look-up Geekbench scores from Galaxy Note 4 (Cortex A57 32 bit) and Galaxy S6 (Cortex A57 64 bit)

    On the architecture side, the aarch64 ISA offers (among others) the following advantages:
    - 31 instead of 14 GP registers available
    - barriers with release and acquire semantics
    - integer divide
    - NEON/VFP embedded into ISA instead of using co-pro interface with separate status flags
    - 64-bit AAPCS enables more efficient procedure calls (e.g. 8 parameter/return + 8 scratch)
    - conditional select instead of only conditional move
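
    To make a few of these list items concrete, here is a small C++ sketch of source patterns that typically map onto them. Which instructions actually get emitted depends on the compiler and its flags, so the comments describe the usual outcome and not a guarantee. All function names are made up for illustration:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Release/acquire semantics: an AArch64 compiler usually emits STLR for the
// release store and LDAR for the acquire load. ARMv7 has no such
// instructions, so a separate DMB barrier is typically needed there.
inline void publish(std::atomic<int>& flag, int& payload, int value) {
    payload = value;
    flag.store(1, std::memory_order_release);
}

inline bool try_consume(const std::atomic<int>& flag, const int& payload, int& out) {
    if (flag.load(std::memory_order_acquire) == 0) {
        return false;  // nothing published yet
    }
    out = payload;
    return true;
}

// Integer divide: AArch64 always provides SDIV/UDIV. On ARMv7-A cores
// without hardware divide (Cortex-A9, for example) the compiler calls a
// runtime library routine instead.
inline std::int32_t divide(std::int32_t a, std::int32_t b) {
    assert(b != 0);
    return a / b;
}

// Conditional select: usually compiles to CMP + CSEL on AArch64, no branch.
inline std::int32_t pick_larger(std::int32_t a, std::int32_t b) {
    return a > b ? a : b;
}

// Procedure calls: when this call is not inlined, the 64-bit AAPCS passes
// all eight integer arguments in registers (x0-x7). The 32-bit AAPCS passes
// only the first four words in r0-r3 and spills the rest to the stack.
inline std::int64_t sum8(std::int64_t a, std::int64_t b, std::int64_t c, std::int64_t d,
                         std::int64_t e, std::int64_t f, std::int64_t g, std::int64_t h) {
    return a + b + c + d + e + f + g + h;
}
```

    The source is identical for both targets; the difference only shows up in the generated machine code, which is why recompiling for AArch64 matters at all.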

    focusing on the new aarch64 is a wasted effort when x86 SoC chips can and should outperform them in performance and battery... that all depends on Intel at this point I suspect...
    The latest ARM designs are outperforming Airmont (Atom) by a very healthy margin, up to the point where with Apple A9/A9x even Skylake is challenged. Atoms are moving fast to mid range/low end as we speak. If Microsoft is making its bets on x86 only, they will be left behind quickly.
    Last edited by Cruncher04; 09-30-2015 at 05:53 AM.
    a5cent likes this.
    09-30-2015 05:25 AM
  15. mortici's Avatar
    @mortici
    I disagree with your take on ARM. Just look at how MS is evolving Visual Studio into a cross platform development environment for Android and iOS which also has direct support for cross platform development tools like Unity and Xamarin. Even MS' own UWP bridges (Astoria / Islandwood) depend on the availability of ARM compilers. As long as MS intends to play in the smartphone space, they can't afford to ignore ARM compiler technology. It's not going away or being phased out.
    I agree with you, my take on the bridges is the fact that they are absolutely needed if you wish to bridge the gaps between architecture and software. There has to be a means for the apps to be developed for legacy and future devices from a single point. Hence the bridges were created to allow cross platform universal development. Point being that if you are developing an ARM variant, especially in Visual Studio, you can simply do an x86 variant at the same time. ARM was/is necessary due to power and form factor restrictions, with new SoC x86 chips on the way that compete equally if not better in this realm the need might be reduced if not eliminated.

    Only time will tell, but this whole thing hangs on developer adoption. There is incentive now, because those who only developed Win32 apps can now cross into the mobile space and vice versa, with the tools that MS is providing. Hence why I speculate that MS is looking into a future of better-than-ARM x86 SoC chips and officially being able to say Windows on any size device. Additionally this helps devs too, because you are developing for one architecture type, not ARMv7, ARMv8, 64/32, SPARC, etc., just x86/amd64 like it's been done in the past.

    The question is then whether x86 is the best architecture for all computing...

    Again this is all speculation and opinion, but the prospect of a phone sized x86 device with the power of say a Core i5 Skylake chip (Core M maybe?) that can be docked and scales up to a full desktop/tablet OS (depending on peripherals) would be nothing short of amazing. My PC travels with me, in my pocket.

    The goal here is to bring the legacy stuff along, and this is the hard part. You need the devs that only created apps for ARM to convert to universal apps and now target more than just a cellphone but ALL types of devices, provided their app actually has value on said devices. Either way, you could also set them up to only run on mobile screens with no scaling for desktop or tablet mode.

    Basically re-code, how you code :D
    Last edited by mortici; 09-30-2015 at 01:48 PM.
    09-30-2015 01:15 PM
  16. mortici's Avatar
    The latest ARM designs are outperforming Airmont (Atom) by a very healthy margin, up to the point where with Apple A9/A9x even Skylake is challenged. Atoms are moving fast to mid range/low end as we speak. If Microsoft is making its bets on x86 only, they will be left behind quickly.
    You are correct, but that's currently. The Atom chips have been around in 64-bit flavor since 2012/2013 but, due to power, were relegated to tablets. The new x7 lines coming up with the 14 nm chip dies are a different animal, based on Cherry Trail. Sadly they will need separate LTE chips as opposed to the x3... Either way, they claim to outperform the Qualcomm chips by up to 50%; the question is at what cost to the battery.

    Intel is in a position to shake things up at this point. If they can pack a better punch than the ARM chips at equal or better yet, better battery life, while providing the ability to run a "desktop" OS then things will get interesting in my opinion..

    Again speculation and opinion on my part, but I think the next two years will be interesting.
    09-30-2015 01:28 PM
  17. Cruncher04's Avatar
    The new x7 lines coming up with the 14 nm chip dies are a different animal, based on Cherry Trail. Sadly they will need separate LTE chips as opposed to the x3... Either way, they claim to outperform the Qualcomm chips by up to 50%; the question is at what cost to the battery.
    Sorry to burst your bubble but i was talking about the Airmont u-architecture, which you find in Cherrytrail Atom x7. Look up some performance comparisons on the web. Take for example Atom X7 Z8700 you'll find in the Surface 3 and then compare to Cortex A57 you'll find in Galaxy S6. And if you want some more fun, compare with Apple A9 in iPhone 6S. We are talking 50%-100% faster within a phone power envelope vs. tablet.
    09-30-2015 02:27 PM
  18. Cruncher04's Avatar
    double post
    09-30-2015 02:45 PM
  19. a5cent's Avatar
    @Cruncher

    I just really wish there was something better to go by than Qualcomm's marketing material, using the most synthetic of benchmarks no less, Geekbench. I saw that chart too, but I'm calling BS on the integer performance test under ARMv8 (I suspect ARMv8 NEON "shenanigans" that almost never translate into real world improvements). 😒

    Remove that and the differences are negligible.


    The Note 4 uses a S805 which is not based on the A57, so comparing that to the Galaxy S6 in order to gauge the impact of ARMv8 wouldn't result in a valid comparison either.

    I've never seen just recompilation targeting an updated ISA alone lead to general improvements in the >=20% range. Even today, we're still running a lot of 32bit PC software compiled for vanilla x86 because it's not had a notable impact.

    Sure, that could be different here, but until there is something comparing how each ISA handles real world computational loads, I'll remain doubtful it's that important. To a degree also because I just can't see MS outright ignoring ARMv8, as they currently are, if there truly was that much to gain from it.

    I'll keep looking for benchmarks that are less synthetic, but until we've got more reliable tests proving otherwise, I think it's safer to assume this capability is not a critical omission. And yes, I will keep looking...
    Last edited by a5cent; 09-30-2015 at 06:00 PM.
    09-30-2015 03:18 PM
  20. Cruncher04's Avatar
    I just really wish there was something better to go by than Qualcomm's marketing material, using the most synthetic of benchmarks no less, Geekbench. I saw that chart too, but I'm calling BS on the integer performance test under ARMv8 (I suspect ARMv8 NEON "shenanigans" that almost never translate into real world improvements).
    The slides i linked are from ARM not from Qualcomm. The main message of the slide is, that Cortex A57 is 45% faster than A15 and not so much the fact, that you will not gain the speed-up with 32 bit code (which is rather a restriction and not something you would brag about).
    Geekbench3 is not compiled for NEON (nor SSE/AVX for that matter). The nice thing about synthetic benchmarks is, that they are not reliant on the OS or other libraries. Side effects from updated OS/libraries are therefore removed.
    In addition there is a Cortex A57 version of the Note 4 (version N910C with Exynos 5433). Note 4 was never updated to 64 bit Android. So there you have a comparison right at hand.

    I've never seen just recompilation targeting an updated ISA alone lead to general improvements in the >=20% range. Even today, we're still running a lot of 32bit PC software compiled for vanilla x86 because it's not had a notable impact.
    Most of the improvements of the ARMv8 ISA directly aim at performance gains. Take the register set as an example. You just don't increase your register set from 16 to 32 if you don't expect significant gains from that. It's not just the gates of the registers and multiplexers you would waste, but you are losing 3 bits of instruction encoding space, which is very precious when you want to stay at fixed-length 32-bit instructions. It's not only large functions which gain from the increased register set but also small leaf functions in the call graph, because they now have 8 scratch registers available.
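    The "3 bits" figure is simple arithmetic, assuming a typical three-operand instruction (one destination and two source registers): naming one of 16 registers takes 4 bits, naming one of 32 takes 5, so each operand costs one extra bit. A minimal sketch of that calculation, with made-up helper names:

```cpp
// Bits needed to name one register out of `count` (rounded up to whole bits).
constexpr unsigned bits_per_register(unsigned count) {
    unsigned bits = 0;
    while ((1u << bits) < count) {
        ++bits;
    }
    return bits;
}

// Encoding space taken by the register fields of one instruction.
// `operands` is an assumption: 3 covers the common "dst, src1, src2" form.
constexpr unsigned operand_bits(unsigned registers, unsigned operands) {
    return bits_per_register(registers) * operands;
}

static_assert(bits_per_register(16) == 4, "16 registers need 4 bits each");
static_assert(bits_per_register(32) == 5, "32 registers need 5 bits each");
static_assert(operand_bits(32, 3) - operand_bits(16, 3) == 3,
              "three register operands cost 3 extra bits in total");
```

    Out of a fixed 32-bit instruction word, that is 15 bits of register fields instead of 12 for such an instruction.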

    To a degree also because I just can't see MS outright ignoring ARMv8, as they currently are, if there truly was that much to gain from it.
    This concern was precisely why I opened this thread. It could be, for example, that they are under so much pressure to finish Windows Mobile that there was just no room for anything on top, like moving the whole toolchain and OS to 64-bit ARM, despite the known gains.
    09-30-2015 09:16 PM
  21. a5cent's Avatar
    The slides i linked are from ARM not from Qualcomm.
    Yeah. What I should have said is that I'd prefer a source that isn't directly involved in selling the architecture, which excludes both Qualcomm and ARM. Sorry about that.

    Geekbench3 is not compiled for NEON (nor SSE/AVX for that matter). <snipped>
    Most of the improvements of the ARMv8 ISA directly aim at performance gains. Take the register set as an example. You just don't increase your register set from 16 to 32 if you don't expect significant gains from that. It's not just the gates of the registers and multiplexers you would waste, but you are losing 3 bits of instruction encoding space, which is very precious when you want to stay at fixed-length 32-bit instructions. It's not only large functions which gain from the increased register set but also small leaf functions in the call graph, because they now have 8 scratch registers available.
    If it's not NEON, then those benchmarks are using some other ARMv8 hardware accelerated feature to misrepresent what should be a measure of general and pure integer performance. Maybe their integer tests are running AES or SHA1 cyphers which the CPU offloads to ARMv8's hardware accelerated cypher units. Either way, it's extremely fishy.

    I understand that the ARMv8 ISA includes many advancements. I'm not saying they are irrelevant. They do matter when performing very specific tasks. However, despite all the advancements you mentioned, there is still no way those will get us to general performance improvements in the range of >=20%, just by recompiling to a new ISA. Not happening. I'm not buying it.

    The nice thing about synthetic benchmarks is, that they are not reliant on the OS or other libraries. Side effects from updated OS/libraries are therefore removed.
    Agreed. On the other hand, I don't know anybody that uses their smartphone primarily to run Dhrystone, AES, or other such algorithms which fit entirely into a CPU's L1 cache. That's the only sort of thing Geekbench really tests. Such tests have their place if you're talking about CPU theory and want to understand very specific strengths and weaknesses of a given CPU architecture. They are almost meaningless if we're trying to gauge how a CPU will perform in average, everyday computing tasks which Joe and Jane care about. Unfortunately, if we want to measure the impact of the ARMv8 ISA on the types of things that actually matter to Joe and Jane, we have no choice but to measure OS glitches and inconsistencies along with everything else. IMHO that's still far more interesting than what Geekbench provides, even if it isn't as pure. To convey the stats that actually matter to most users, I prefer something like PCMark.

    In addition there is a Cortex A57 version of the Note 4 (version N910C with Exynos 5433). Note 4 was never updated to 64 bit Android. So there you have a comparison right at hand.
    Okay, thanks for pointing that out. I've taken a closer look at the two devices you mentioned. These are my results:

    Device                   GPU             CPU                    Clock     Clock %
    Samsung Galaxy Note 4    Mali-T760 MP6   Exynos 5433 / 4x A57   1.9 GHz   100%
    Samsung Galaxy S6        Mali-T760 MP8   Exynos 7420 / 4x A57   2.1 GHz   110.5%

    As the GPU in the S6 is quite a bit more powerful than the GPU in the Note 4, we must disregard any graphics/video related benchmarks (video playback, photo editing, 3D rendering etc). We'd also have to compensate for the Galaxy S6's 10.5% higher clock rate. These are the benchmarks from Anandtech's bench, which most closely correlate with what we're trying to measure:

    [Image: bench-2.png (source: Anandtech's bench)]

    These benchmarks are a far more accurate representation of what to expect in real world usage. They mimic the CPU workloads created by browsing the web, or editing a document on your phone. The results are either very similar for both CPUs, or they quite accurately reflect the 10% bump in clock rate the S6 has over the Note 4. Almost all the other results listed in Anandtech's bench are GPU related, so we can't rely on them to tell us anything about ARMv8. If both CPUs had the same clock rate, the measured results would be pretty much identical across the board.
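
    The clock compensation can be sketched as follows, assuming that performance scales linearly with clock rate (a simplification) and that higher scores are better. The function names are made up, and any scores fed in are up to the reader; none are taken from the chart:

```cpp
// Ratio of two clock rates, e.g. 2.1 GHz (S6) over 1.9 GHz (Note 4).
constexpr double clock_ratio(double clock_ghz, double baseline_ghz) {
    return clock_ghz / baseline_ghz;
}

// Score scaled down to what it would be at the baseline clock,
// under the linear-scaling assumption stated above.
constexpr double normalize_score(double score, double clock_ghz, double baseline_ghz) {
    return score / clock_ratio(clock_ghz, baseline_ghz);
}

// Gain in percent that is left over after removing the clock advantage.
// A value near 0 means the clock bump alone explains the difference.
constexpr double residual_gain_percent(double score, double clock_ghz,
                                       double baseline_score, double baseline_ghz) {
    return (normalize_score(score, clock_ghz, baseline_ghz) / baseline_score - 1.0) * 100.0;
}
```

    With placeholder scores of 110.5 for the S6 and 100 for the Note 4, `residual_gain_percent(110.5, 2.1, 100.0, 1.9)` comes out at roughly zero, which is the "pretty much identical" case described above. A real >=20% ISA gain would show up as a residual of about 20.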

    This is pretty much exactly what I expected to find. In everyday use, there is little to no discernible difference between 32bit ARMv7 and 64bit ARMv8 code when run on the same CPU. If the general >=20% improvement was in any way accurate, we'd have seen that reflected in these benchmarks.
    Last edited by a5cent; 10-01-2015 at 10:36 AM. Reason: spelling
    10-01-2015 08:14 AM
  22. Cruncher04's Avatar
    should have said is that I'd prefer a source that isn't directly involved in selling the architecture, which excludes both Qualcomm and ARM.
    So you think ARM hopes to sell more of its CPUs, when they have to admit that their 32 bit performance is not up to par?

    These benchmarks are a far more accurate representation...
    *sigh* I am talking the whole time about native compilation, and you are coming up with benchmarks which show the current status of the Java and JavaScript engines more than anything else. Besides that, why do you assume that those engines have been updated to 64 bit at all?

    And regarding "real world usage": does it matter much if a JavaScript code snippet on a web site executes in 5 or 10 ms? I don't think so. The user will not notice a difference anyway.
    I am obviously talking about use cases where performance matters, and where developers deliberately choose native compilation for performance reasons.
    In games, for instance, a frame-rate difference of 20% would be very noticeable to the human eye without needing a stopwatch. The same goes for cases where the user has to wait for a significant amount of time, like when an expensive video or photo effect is applied, or when you are creating a zip archive etc.
    Last edited by Cruncher04; 10-01-2015 at 04:04 PM.
    10-01-2015 03:42 PM
  23. a5cent's Avatar
    If you don't like the java script tests, there are still the PCMark benchmarks to look at.

    Still, what runs java script? That would be a parser and runtime environment. Those are anything but trivial, they are components that developers invest a lot of optimization effort into, and every part of them is implemented by natively compiled components. Where exactly is the relevant difference to a natively compiled game? For our purposes, as long as the functions we measure aren't trivial, mimic typical everyday computational loads, and are implemented by decently optimized native code, it's irrelevant what that code actually does. It's still a relevant measurement.

    At the very least, it's still a lot more meaningful than the tiny mathematical algorithms tested by Geekbench that all fit into L1 cache (which would also be completely atypical for games BTW).

    You seem to be saying that it's reasonable to expect a natively compiled and optimized game to exhibit performance differences >=20%, while a natively compiled and optimized java script runtime environment or HTML rendering engine could exhibit performance differences of 0%. That seems like an extraordinary claim to me. Extraordinary claims require extraordinary evidence. I doubt there is any.

    If you can show me a benchmark that at least somewhat mimics a smartphone's typical computational load, where these two devices show very different results, I'll change my mind. From what I see here now I very much doubt this is an issue we need to care about.
    10-01-2015 04:39 PM
  24. npoe's Avatar
    In games, for instance, a frame-rate difference of 20% would be very noticeable to the human eye without needing a stopwatch. The same goes for cases where the user has to wait for a significant amount of time, like when an expensive video or photo effect is applied, or when you are creating a zip archive etc.
    I think that games, video and photo effects depend heavily on the GPU and not the CPU.
    10-06-2015 08:20 PM
