Windows on x86 and 4GB of RAM

A few months ago I decided to get a shiny new gaming system from Dell. I eventually went with the XPS 720 with pretty much all the bells and whistles I thought were reasonable, one of which was 4GB of RAM. After all, just about everyone agrees that more RAM leads to better performance. The machine also came with Windows Vista Home Premium; I’m the early adopter type, so I saw no issue with this.

To my surprise, Windows only saw 3GB of RAM! Since I do a little hobby OS development, I immediately had an idea of what could cause this. I jumped to the conclusion that PAE was not properly enabled, and started sifting through different sites to find the proper way to enable it. Of course, if this were the solution, I wouldn’t really have much to say here… It turns out that PAE is in fact enabled by default on Windows Vista if your system supports it. After all, it has to be in order to make use of the no-execute bit on x86.

At this point, I should probably explain what PAE is and why it should make 4GB available without a problem. Basically there are two ways to look at memory on an x86 system. There is “physical memory,” which is your system memory start to finish in order. Note that this includes memory-mapped devices. This is how memory would look if you used an OS which does not use paging. Then there is “virtual memory.” The idea here is that your physical memory is viewed as blocks (“pages”) that can be mapped to any location you wish in your virtual memory space. For example, as the OS writer, I could ask the system to map the 4K of memory at physical address 0x11223000 to virtual address 0x00000000. In this case, any reads and writes that programs do in the first 4K of virtual memory will actually occur at the physical address associated with it. Virtual memory is also what allows modern OSes to protect one application from another. The OS does this by switching the virtual memory layout during a task switch so that each process has its own view of what memory looks like.
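The mapping example above can be sketched in a few lines of Python. This is only a toy model, assuming a single-level table kept in a dict; real x86 paging uses multi-level tables walked by the hardware. The names `translate`, `process_a` and `process_b`, and the second physical address, are made up for illustration.

```python
PAGE_SIZE = 4096  # 4K pages, matching the example in the text


def translate(page_table, virtual_address):
    """Translate a virtual address using a dict of virtual page -> physical page."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        # On real hardware this would be a page fault handled by the OS.
        raise LookupError(f"page fault at {virtual_address:#010x}")
    return page_table[page] * PAGE_SIZE + offset


# Map the 4K at physical 0x11223000 to virtual 0x00000000, as in the text.
process_a = {0x00000000 // PAGE_SIZE: 0x11223000 // PAGE_SIZE}
# A second process with its own (hypothetical) mapping for the same virtual page.
process_b = {0x00000000 // PAGE_SIZE: 0x44556000 // PAGE_SIZE}

print(hex(translate(process_a, 0x00000010)))  # 0x11223010
print(hex(translate(process_b, 0x00000010)))  # 0x44556010
```

Both processes use the same virtual address but land on different physical memory, which is the isolation described above. Switching tasks amounts to switching which table is in use.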

The problem is that memory-mapped devices occupy physical address space as well, so if you have a 512MB video card, then that’s half a gig of that 4GB physical address space which can’t be RAM. Here’s where PAE comes in. Before PAE, your physical address space was limited to 4GB and your virtual address space was limited to 4GB (per process). PAE raises the physical address space to 64GB (36 bits) while keeping 4GB of virtual addresses per process. Sure, no single process can use more than 4GB at a time, but all the RAM can be put to use. There is one catch though: your memory-mapped devices must still be below the 4GB mark because they use 32-bit addressing when doing DMA. So the natural solution is to relocate the RAM that is displaced by devices to above the 4GB mark. Most modern motherboards support this.
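The arithmetic behind this can be sketched as follows. It is a simplified model, not how any real chipset or OS computes things: `usable_ram` is a hypothetical helper, the device hole is treated as one contiguous block just below 4GB, and the 1GB hole size is an assumption chosen to match the 3GB that Windows reported.

```python
GIB = 1 << 30


def address_space(bits):
    """Number of bytes addressable with the given number of address bits."""
    return 1 << bits


def usable_ram(installed, device_hole, remap, os_limit):
    """RAM (in bytes) an OS can use under this simplified model.

    installed:   bytes of RAM fitted
    device_hole: bytes of address space below 4GB reserved for devices
    remap:       True if the motherboard relocates displaced RAM above 4GB
    os_limit:    highest physical address the OS is willing to use
    """
    four_gb = 4 * GIB
    below_hole = min(installed, four_gb - device_hole)
    low = min(below_hole, os_limit)
    if not remap:
        return low
    # RAM displaced by the device hole reappears starting at the 4GB mark.
    displaced = installed - below_hole
    high = max(0, min(four_gb + displaced, os_limit) - four_gb)
    return low + high


print(address_space(32) // GIB)  # 4
print(address_space(36) // GIB)  # 64
# OS that ignores everything above 4GB vs. one that uses the full 36 bits:
print(usable_ram(4 * GIB, 1 * GIB, True, address_space(32)) // GIB)  # 3
print(usable_ram(4 * GIB, 1 * GIB, True, address_space(36)) // GIB)  # 4
```

With remapping enabled, the only thing that changes between the last two lines is how far up the physical address space the OS is willing to look.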

It turns out that for “compatibility reasons” Microsoft has opted to simply ignore any RAM it sees above 4GB. Some people are convinced it is a hardware problem, saying:

To be perfectly clear, this isn’t a Windows problem– it’s an x86 hardware problem. The memory hole is quite literally invisible to the CPU, no matter what 32-bit operating system you choose. The following diagram from Intel illustrates just where the memory hole is:

This simply isn’t the case (sorry Jeff, I love your blog BTW). It is a design choice by the Windows engineers to take the easy way out. A perfectly viable solution is to divide your memory up into types, namely “suitable for DMA” and “not suitable for DMA.” I know this works, because Linux does it. In fact, here’s a screenshot of my shiny new Dell using all 4GB of my RAM.

[Screenshot: kinfocenter showing 4GB]

This isn’t a 64-bit build; it’s 32-bit (arch reports “i686”, not “x86_64”) with the 64GB support config option enabled (which basically means PAE is turned on). Plenty of people online say that all 32-bit operating systems have this problem. This isn’t true.
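The “suitable for DMA” / “not suitable for DMA” split mentioned above can be illustrated with a small sketch. It is a deliberately simplified two-zone model with the boundary at 4GB, not Linux’s actual zone layout; `split_into_zones` and the zone names are hypothetical.

```python
FOUR_GB = 4 << 30


def split_into_zones(ram_ranges, dma_limit=FOUR_GB):
    """Split (start, end) physical RAM ranges (end exclusive) into two zones.

    Anything below dma_limit is reachable by a device doing 32-bit DMA;
    anything at or above it is only good for non-DMA use.
    """
    zones = {"dma_ok": [], "not_dma": []}
    for start, end in ram_ranges:
        if start < dma_limit:
            zones["dma_ok"].append((start, min(end, dma_limit)))
        if end > dma_limit:
            zones["not_dma"].append((max(start, dma_limit), end))
    return zones


# 4GB of RAM with 1GB relocated above the 4GB mark (assumed layout).
ram = [(0, 3 << 30), (4 << 30, 5 << 30)]
zones = split_into_zones(ram)
print(zones["dma_ok"] == [(0, 3 << 30)])         # True
print(zones["not_dma"] == [(4 << 30, 5 << 30)])  # True
```

Once memory is tagged this way, nothing has to be thrown away: the high zone simply never gets handed to a caller that needs DMA.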

Jeff gets it right with this statement though:

As far as 32-bit Vista is concerned, the world ends at 4,096 megabytes. That’s it. That’s all there is. No más.

Addressing more than 4 GB of memory is possible in a 32-bit operating system, but it takes nasty hardware hacks like 36-bit PAE extensions in the CPU, together with nasty software hacks like the AWE API. Unless the application is specifically coded to take advantage of these hacks, it’s confined to 4 GB. Well, actually, it’s stuck with even less– 2 GB or 3 GB of virtual address space, at least on Windows.

Except that PAE isn’t a nasty hack by any stretch; in fact, Vista already uses it, as previously mentioned. User space software doesn’t need to be specially programmed to take advantage of the extra memory, since it will only see 4GB at a time anyway (minus kernel land). Also, the AWE API addresses the 4GB virtual memory limitation, not the physical limitation! What AWE does is allow an application to selectively map physical RAM locations to user space virtual locations. A program can thus access much more than the 2GB of user space that Windows gives it by default, just not all at the same time.
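The windowing idea behind AWE can be modelled roughly like this. It is a conceptual sketch only and does not use or imitate the real Windows API; the `Window` class and its methods are invented for illustration, and the “physical pages” are just byte arrays.

```python
PAGE_SIZE = 4096


class Window:
    """A one-page virtual window that can be pointed at different physical pages."""

    def __init__(self, physical_pages):
        self.physical_pages = physical_pages  # page number -> bytearray
        self.mapped = None

    def map(self, page_number):
        # Re-point the window; the data in the old page stays where it was.
        self.mapped = self.physical_pages[page_number]

    def write(self, offset, value):
        self.mapped[offset] = value

    def read(self, offset):
        return self.mapped[offset]


# Eight "physical" pages, but the program only ever sees one at a time.
pages = {n: bytearray(PAGE_SIZE) for n in range(8)}
window = Window(pages)

window.map(5)
window.write(0, 42)
window.map(2)
window.write(0, 7)
window.map(5)
print(window.read(0))  # 42
```

The program touches more memory than its window could ever hold at once, which is exactly the “just not all at the same time” trade-off.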

Microsoft of course does support upwards of 4GB on its x86-64 builds. That’s all fine and dandy, but my Dell didn’t come with that. And to my knowledge Dell (as a company, not the hardware) doesn’t officially support x86-64 yet. So maybe I’ll take that up with them and demand a 64-bit copy.

All in all, it’s a little lame that Windows doesn’t support all the RAM that it could on 32-bit builds. It really wouldn’t be hard, but it would likely require that driver writers start passing a flag to the allocator specifying that the memory must be OK for DMA. I can see why this is a problem, since there are simply tons of drivers. But at the very least, the extra RAM could have been used for things where DMA is clearly not involved (pretty much all user space uses, since only drivers should be doing DMA). Also, Microsoft could have done something clever like add a new flag to the driver’s PE header which, when not present, would make the allocator only return addresses below 4GB and, when set, would allow the driver to use a more robust allocator API.
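Here is a rough sketch of what such a flag-aware allocator could look like. Everything in it is hypothetical: `PageAllocator`, `alloc_for_driver` and the `driver_is_high_aware` flag are made-up names standing in for the PE header idea above, not anything Windows actually provides.

```python
PAGE_SIZE = 4096
FIRST_HIGH_PFN = (4 << 30) // PAGE_SIZE  # first page frame at or above 4GB


class PageAllocator:
    """Hands out page frame numbers from a low (below 4GB) and a high pool."""

    def __init__(self, low_pages, high_pages):
        self.low = list(low_pages)
        self.high = list(high_pages)

    def alloc(self, dma_ok_required):
        # Prefer high memory when DMA is not needed, keeping low pages in reserve.
        if not dma_ok_required and self.high:
            return self.high.pop()
        if self.low:
            return self.low.pop()
        raise MemoryError("out of suitable pages")


def alloc_for_driver(allocator, driver_is_high_aware, wants_dma):
    """Legacy drivers (flag not set) always get memory below 4GB."""
    needs_low = wants_dma or not driver_is_high_aware
    return allocator.alloc(dma_ok_required=needs_low)


allocator = PageAllocator(low_pages=[1, 2, 3],
                          high_pages=[FIRST_HIGH_PFN, FIRST_HIGH_PFN + 1])
print(allocator.alloc(dma_ok_required=False) >= FIRST_HIGH_PFN)          # True
print(alloc_for_driver(allocator, False, wants_dma=False) < FIRST_HIGH_PFN)  # True
```

Old drivers keep working unchanged because they never see a high address, while user space and updated drivers can soak up the RAM above 4GB.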

I hope this sheds some light on the subject, because there is unfortunately a lot of misinformation out there.

6 thoughts on “Windows on x86 and 4GB of RAM”

  1. Pingback: Micro-optimization is stupid « Evan Teran’s Blog

  2. CrashCoder

    Nice article. Please discuss the implications for servers, e.g. Does an x86 based implementation of 32-bit Windows Server 2003 Datacenter Edition really support 64GB? If so, is it due to PAE or some other technology? Do memory mapped devices reduce the amount of useable RAM when 64GB are installed?

  3. Evan Teran Post author

    The answer is “it depends,” as you can see from this page: http://msdn.microsoft.com/en-us/library/aa366778.aspx#physical_memory_limits_windows_server_2008. Microsoft has imposed different limits based on which version of Windows Server you have. It appears that the 32-bit version of Windows Server 2003 Datacenter Edition does in fact support 64GB of RAM. Yes, it is due to PAE; pretty much every newer motherboard and CPU supports this.

    However, your observation is correct. Memory-mapped devices do in fact take away a small chunk of this. The big difference is that something like 512MB (or likely less; why would a server need a good video card?) is not a significant portion of 64GB.

    Additionally, as I mentioned in the blog, this 64GB applies to physical memory, not virtual memory.

    So the big gain is that you will be able to run more processes, and the OS can cache more (read: avoid disk usage), without resorting to paging. If an individual process needs more than 4GB, then you will need to use a 64-bit OS.

  4. Evan Teran Post author

    Starting with SP2, XP has PAE mode enabled (in order to access the NX bit) but explicitly ignores all physical memory located above the 4GB mark. From http://msdn.microsoft.com/en-us/windows/hardware/gg487512#EKD:

    In order to limit the impact to device driver compatibility, changes to the hardware abstraction layer (HAL) were made to Windows XP SP2 and Windows Server 2003 SP1 Standard Edition to limit physical address space to 4 GB. Driver developers are encouraged to read about DEP.

    This is pretty much exactly what I explained in this blog entry, but it comes straight from Microsoft.
