I don't think you have the technical background to understand those issues. Just your assertion that "the number of cores is not something that software can be optimized for" is completely false. The exact opposite is true, namely that those cores are only taken advantage of if software is explicitly designed to make use of them. Most of your other assertions are also false, particularly in regard to graphics acceleration. The characteristics of every GPU are very different... it's anything but standardized. I'm not sure what I could say to convince you however. Tell me what you'd need and I will try and provide it.
Anyway, if it was as simple as you assume, we would have long seen WP on a whole host of different SoCs, not just on Qualcomm chips.
I can assure you I have the technical background to understand these issues. I have run projects to port software stacks (including modern OSes) between chipsets and processor platforms, including porting components myself. I understand the pain of staying up all night fixing weird bugs that appear on one platform and not another, the shouting at chipset vendors when they finally admit their products don't work according to the specification, and the joy of getting it working seconds before the deadline.
Now, I agree with you about graphics hardware. What I said was wrong (written too early in the morning). It is the software interfaces (OpenGL, DirectX) that are more standardised. The chipset vendor will normally supply a driver for the GPU that presents one of those interfaces, which the OS can then use directly.
Regarding number of cores - I agree that my point was also poorly expressed. Software can, of course, be optimised for the number of processor cores. However, except for very specific cases, it would be a bad idea to write an application or OS specifically targeting, say, two cores. What you would do is allow the code to scale to make use of the available cores - for a simple example, starting as many threads as there are cores, and dividing the work between them (there might also be good reasons to start more or fewer threads than the number of cores).
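To make the "scale to the available cores" idea concrete, here is a rough sketch in Python. It is only an illustration, not how WP or Android do it: the function names (split_evenly, parallel_sum_of_squares) and the workload are made up for the example. Also note that in standard CPython, threads will not actually speed up CPU-bound work like this; the point is the structure, which is to ask the system how many cores there are rather than hard-coding "two".

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_evenly(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks each take one additional item.
        end = start + base + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def parallel_sum_of_squares(values, workers=None):
    """Sum v*v over values, using one worker thread per core unless told otherwise."""
    # os.cpu_count() can return None, so fall back to a single worker.
    workers = workers or os.cpu_count() or 1
    chunks = split_evenly(values, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker reduces its own chunk; the partial results are combined here.
        partials = pool.map(lambda chunk: sum(v * v for v in chunk), chunks)
        return sum(partials)
```

Calling parallel_sum_of_squares(data) picks the worker count from the machine it runs on, while parallel_sum_of_squares(data, workers=3) lets whoever is tuning the system override it, which covers the "more or fewer threads than cores" case.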
Optimisations can also be made for available RAM. Memory bandwidth and cache management can be absolutely critical to performance, but optimisation here is very hard and often targets extremely restricted uses.
What you end up with is a set of parameters that can be tuned, or that the OS and applications automatically adapt to. Whoever is doing the porting (Microsoft for WP, the handset vendor for Android) will choose some sensible parameters, and perhaps play with them a bit to find what works best. In practice, optimisation is something of an art and the results are not always what you expect.
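As a sketch of what I mean by "a set of parameters that can be tuned", something like the following would do. Everything in it is hypothetical: the Tuning fields, the 64 KiB default, and the "example-dual-core" platform name are invented for illustration and are not taken from any real OS. The idea is just that defaults are derived from the hardware, and whoever ports the system can override them per platform after experimenting.

```python
import os
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Tuning:
    """Hypothetical tunable parameters for one platform."""
    worker_threads: int
    io_buffer_bytes: int


def detect_defaults():
    """Derive defaults from the machine the code is running on."""
    cores = os.cpu_count() or 1  # cpu_count() may return None
    return Tuning(worker_threads=cores, io_buffer_bytes=64 * 1024)


# Hand-picked overrides, as a porting team might settle on after measurement.
# The platform name and values are made up.
PLATFORM_OVERRIDES = {
    "example-dual-core": {"worker_threads": 2, "io_buffer_bytes": 32 * 1024},
}


def tuning_for(platform, defaults=None):
    """Return the defaults, with any overrides for `platform` applied on top."""
    if defaults is None:
        defaults = detect_defaults()
    return replace(defaults, **PLATFORM_OVERRIDES.get(platform, {}))
```

An unknown platform simply gets the auto-detected values, so the generic build still runs sensibly before anyone has tuned it.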
I don't really agree that porting WP to other chipsets would be that hard. Android handset vendors do this all the time, and WP and Android are not fundamentally very different from a technical perspective. Both are modular, pre-emptive multitasking OSes with architectures ultimately derived from MULTICS. For some reason, Microsoft have chosen to restrict the OS to Qualcomm chips. Given appropriate support from Microsoft, I expect it would be quite possible to port WP to ARM-based non-Qualcomm chipsets without major difficulty. Android is even officially supported on ARM and x86, and has been ported to MIPS, so even moving to different processor types is possible when source code is available.
As for what I need to know? Well, I don't need to know anything, but I would be curious:
- Is my analysis in the paragraph above correct? Or is there something fundamental that ties WP to Qualcomm chips and means it would be very difficult to run it on, say, an Nvidia device?
- What do Microsoft actually do when they 'optimise' WP for a specific chipset? What do they change compared to the 'generic WP' they presumably develop internally?