I don't get the issue; this is normal for an onboard sound solution. The audio chip used costs AT MOST $2... and can reach $5-10 on an ultra high-end desktop motherboard. The sound quality simply sucks. If you want rich, quality sound that pushes the full potential of your headphones, and you want to be able to drive high-impedance headphones with no static or distortion, you need to shell out for a dedicated sound card.
On my desktop, I have the ASUS Xonar Essence STX on PCI-E, a $200 sound card. This is nothing pro grade. Sound cards are expensive because the components needed to give you proper clean sound are costly. Moreover, it's not something you actually change; you keep it until it breaks. Mine usually last 2-3 computer builds... so an 8-10 year life cycle. And even then, it's more that it doesn't support the latest version of Windows, or that I'm looking for a better sound card. So not a bad investment. Now, of course, I know the Surface Pro can't be opened to insert a PCI-E card. But there are USB sound cards for about $70-100 which are pretty good, and fairly small.
If you are on a tight budget, you have the ASUS Xonar U3 for $30-35. It's not going to give you amazing sound, but it may be better than what you have (I don't have an SP2 yet to compare, it's sold out everywhere), and it will give you a nice clean sound, almost static free... well, the best $35 can get you.
The reason you have the sound only when you pause the video is that on mobile devices, the sound card "turns off" when not in use to save power.
In case you're wondering:
-> My laptop's onboard sound makes it seem like I am using dollar-store earphones, despite using costly ones. It has no bass and no mid-range.
-> My desktop's onboard sound has static, and everything sounds like my speakers are inside a metal barrel.
-> My old desktop's onboard sound had heavy static, and sounded like I had the Transformers doing the music.
Onboard sound cards are designed to offer you basic sound, for basic video watching and Windows sounds, not to give you a rich, lifelike experience. They are designed to be virtually free. There is no SPU (Sound Processing Unit); the chip just takes the sound, simplifies it, sends it to the CPU for processing, gets it back, and converts it from digital to analogue using the cheapest components possible.