.NET OutOfMemoryException

Why does this:

class OutOfMemoryTest02
{
    static void Main()
    {
        string value = new string('a', int.MaxValue);
    }
}

throw the exception, but this won't:

class OutOfMemoryTest
{
    private static void Main()
    {
        Int64 i = 0;
        ArrayList l = new ArrayList();
        while (true)
        {
            l.Add(new String('c', 1024));

            i++;
        }
    }
}

What's the difference?



Solution 1:[1]

Have you looked up int.MaxValue in the docs? It is 2,147,483,647, so you are asking for about 2 billion 'a' characters in one contiguous block. At 2 bytes per character that is roughly 4 GB, which is probably more RAM than you have available for a single allocation.

http://msdn.microsoft.com/en-us/library/system.int32.maxvalue.aspx

Your infinite loop will eventually cause the same exception (or a different one indirectly related to overuse of RAM), but it will take a while. Try increasing 1024 to 10 * 1024 * 1024 to reproduce the symptom faster in the loop case.

When I run with this larger string size, I get the exception in under 10 seconds after 68 loops (checking i).
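The faster reproduction described above could look roughly like the sketch below. The `FillUntilOutOfMemory` name and the `maxChunks` cap are assumptions added here so the loop can be stopped before it exhausts the machine; the original loop has no cap. The count you see before the exception depends entirely on your machine and runtime.

```csharp
using System;
using System.Collections;

class OutOfMemoryLoopSketch
{
    // Allocates strings of chunkChars characters until allocation fails or
    // maxChunks is reached (hypothetical cap, not in the original code).
    // Returns how many strings were successfully added.
    public static long FillUntilOutOfMemory(int chunkChars, long maxChunks)
    {
        ArrayList l = new ArrayList();
        long i = 0;
        try
        {
            while (i < maxChunks)
            {
                l.Add(new string('c', chunkChars));
                i++;
            }
        }
        catch (OutOfMemoryException)
        {
            // Drop the references so the memory can be reclaimed.
            l.Clear();
        }
        return i;
    }

    static void Main()
    {
        // 10 * 1024 * 1024 chars is about 20 MB of character data per string.
        long count = FillUntilOutOfMemory(10 * 1024 * 1024, 8);
        Console.WriteLine("Allocated " + count + " strings");
    }
}
```

Raising the cap (or passing long.MaxValue) makes the sketch behave like the original infinite loop.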

Solution 2:[2]

Your

new string('a', int.MaxValue);

throws an OutOfMemoryException simply because .NET's string has a length limitation. The "Remarks" section in the MSDN docs says:

The maximum size of a String object in memory is 2 GB, or about 1 billion characters.

On my system (.NET 4.5 x64) new string('a', int.MaxValue/2 - 31) throws, whereas new string('a', int.MaxValue/2 - 32) works.
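A minimal sketch of how such a threshold can be probed, assuming nothing beyond the string constructor itself. `CanAllocate` is a hypothetical helper name; the exact length at which it flips from true to false varies by runtime version and bitness, so the figures quoted above should not be expected on every system.

```csharp
using System;

class StringLimitProbe
{
    // Returns true if a string of the given length could be created,
    // false if the runtime refused with an OutOfMemoryException.
    public static bool CanAllocate(int length)
    {
        try
        {
            string s = new string('a', length);
            return s.Length == length;
        }
        catch (OutOfMemoryException)
        {
            return false;
        }
    }

    static void Main()
    {
        Console.WriteLine(CanAllocate(1024));
        Console.WriteLine(CanAllocate(int.MaxValue));
    }
}
```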

In your second example, the infinite loop allocates ~2048-byte blocks until your OS cannot allocate any more blocks in the virtual address space. When this limit is reached, you'll get an OutOfMemoryException too.

(~2048 bytes = 1024 chars * 2 bytes per UTF-16 code unit + string overhead bytes)
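The arithmetic can be written out as a small sketch. `PayloadBytes` is a hypothetical helper that counts only the character data; the per-object overhead is left out because its exact size depends on the runtime.

```csharp
using System;

class StringSizeEstimate
{
    // Bytes used by the character data alone: 2 bytes per UTF-16 code unit.
    // Object header, length field and terminator are not included.
    public static long PayloadBytes(long chars)
    {
        return chars * sizeof(char);
    }

    static void Main()
    {
        Console.WriteLine(PayloadBytes(1024));          // 2048
        Console.WriteLine(PayloadBytes(int.MaxValue));  // 4294967294
    }
}
```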

Try this great article by Eric.

Solution 3:[3]

Because int.MaxValue is 2,147,483,647, which means about 2 billion characters (roughly 4 GB at 2 bytes per character) that need to be allocated contiguously.

In the second example, the OS only needs to find room for 1024 characters (about 2 KB) each time and can swap to the hard drive. I am sure if you left it running long enough you'd end up in a dark place :)

Solution 4:[4]

The String object can use a backing shared string pool to reduce memory usage. In the former case, you're generating one string that's several gigabytes. In the second case, it's likely the compiler is auto-interning the string, so you're generating a 1024-character string and then referencing that same string many times.

That being said, an ArrayList of that size should run you out of memory, but it's likely you haven't let the code run long enough for that to happen.
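Whether the strings in the loop are actually shared can be checked directly. A minimal sketch, with `SameInstance` as a hypothetical helper name: if it prints False, each `new string('c', 1024)` call produced a separate object, and the loop really does consume new memory on every iteration rather than reusing one pooled string.

```csharp
using System;

class InterningCheck
{
    // True if two strings built at run time from the same arguments
    // turn out to be the very same object.
    public static bool SameInstance()
    {
        string a = new string('c', 1024);
        string b = new string('c', 1024);
        return object.ReferenceEquals(a, b);
    }

    static void Main()
    {
        Console.WriteLine(SameInstance());
    }
}
```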

Solution 5:[5]

The 2nd snippet will crash as well. It just takes a wholeheckofalot longer since it is consuming memory much more slowly. Pay attention to your hard disk access light: it's furiously blinking while Windows chucks pages out of RAM to make room. The first string constructor fails immediately since the heap manager won't allow you to allocate 4 gigabytes.

Solution 6:[6]

Both versions will cause an OOM exception; it's just that (on a 32-bit machine) you will get it immediately with the first version, when you try to allocate a single very large object.

The second version will take much longer, however, as there will be a lot of thrashing before the OOM condition is reached, for a couple of reasons:

  • You will be allocating millions of small objects which are all reachable by the GC. Once you start putting the system under pressure, the GC will spend an inordinate amount of time scanning generations with millions and millions of objects in. This will take a considerable amount of time and start to play havoc with paging as cold and hot memory will be constantly paged in and out as generations are scanned.

  • There will be page thrashing as GC scans millions of objects in generations to try and free memory. Scanning will cause huge amounts of memory to be paged in and out constantly.

The thrashing will cause the system to grind to a halt under the processing overhead, so the OOM condition will take a long time to be reached. For the second version, most of the time will be spent thrashing in the GC and paging.

Solution 7:[7]

In your first sample you are trying to create a 2 GB string in one go.

In the second example you keep adding 1k to an array. You will need to loop more than 2 million times to reach the same amount of consumption.
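The loop count mentioned above follows from a simple division. A sketch of that estimate, with `ChunksNeeded` as a hypothetical helper that compares character counts only and ignores per-string overhead:

```csharp
using System;

class LoopCountEstimate
{
    // How many strings of chunkChars characters hold as many characters
    // as one string of totalChars characters (rounded down).
    public static long ChunksNeeded(long totalChars, long chunkChars)
    {
        return totalChars / chunkChars;
    }

    static void Main()
    {
        Console.WriteLine(ChunksNeeded(int.MaxValue, 1024)); // 2097151
    }
}
```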

And it's also not all stored at once, in one variable. Thus, some of your memory usage can be persisted to disk to make room for the new data, I think.

Solution 8:[8]

Because a single object cannot be larger than 2 GB:

First some background; in the 2.0 version of the .Net runtime (CLR) we made a conscious design decision to keep the maximum object size allowed in the GC Heap at 2GB, even on the 64-bit version of the runtime

In your first example, you try to allocate one object that is at least 2 GB; with the object overhead (8 bytes?) it's simply too big.

I don't know how the ArrayList works internally, but you allocate multiple objects of about 2 KB each, and the ArrayList, to my knowledge, only holds pointers, which are 4 bytes (8 on x64?) regardless of how big the object they point to is.

To quote another article:

Also, objects that have references to other objects store only the reference. So if you have an object that holds references to three other objects, the memory footprint is only 12 extra bytes: one 32-bit pointer to each of the referenced objects. It doesn't matter how large the referenced object is.
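The reference size in the current process can be read from IntPtr.Size. A rough sketch, where `ReferenceBytes` is a hypothetical helper that estimates only the reference slots and ignores the list's own header and any spare capacity:

```csharp
using System;

class ReferenceSizeSketch
{
    // Approximate bytes needed for `count` object references.
    public static long ReferenceBytes(long count)
    {
        return count * IntPtr.Size;
    }

    static void Main()
    {
        Console.WriteLine("Reference size: " + IntPtr.Size + " bytes");
        Console.WriteLine("1,000,000 references: " + ReferenceBytes(1000000) + " bytes");
    }
}
```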

Solution 9:[9]

One reason your system could be coming to a halt is because .NET's code runs closer to the metal and you're in a tight loop which should consume 100% CPU provided the process priority allows it to. If you would like to prevent the application from consuming too much CPU while it performs the tight loop you should add something like System.Threading.Thread.Sleep(10) to the end of the loop, which will forcibly yield processing time to other threads.
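A sketch of the throttled loop suggested above. The `Run` helper and its `iterations` bound are assumptions added so the example terminates; the original loop runs forever.

```csharp
using System;
using System.Collections;
using System.Threading;

class ThrottledLoopSketch
{
    // Same allocation as the original loop, but sleeping after each
    // iteration so other threads get CPU time.
    public static int Run(int iterations, int sleepMs)
    {
        ArrayList l = new ArrayList();
        for (int i = 0; i < iterations; i++)
        {
            l.Add(new string('c', 1024));
            Thread.Sleep(sleepMs);
        }
        return l.Count;
    }

    static void Main()
    {
        Console.WriteLine(Run(5, 10));
    }
}
```

Note that sleeping only reduces CPU use; it does not change how much memory the loop eventually consumes.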

One major difference between the JVM and .NET's CLR (Common Language Runtime) is that the CLR does not limit the size of your memory on an x64 system/application (in 32-bit applications without the Large Address Aware flag, the OS limits any application to 2 GB due to addressing limitations). The JIT compiler creates native Windows code for your processor architecture and then runs it in the same scope that any other Windows application would run in. The JVM is a more isolated sandbox which constrains the application to a specified size depending on configuration/command line switches.

As for differences between the two algorithms:

The single string creation is not guaranteed to fail when running in an x64 environment with enough contiguous memory to allocate the 4 GB necessary to contain int.MaxValue characters (.NET strings are Unicode by default, which requires 2 bytes per character). A 32-bit application will always fail, even with the Large Address Aware flag set, because the maximum memory is still something like 3.5 GB.
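Since the outcome depends on whether the code runs as a 32-bit or 64-bit process, it can help to print that before testing. A minimal sketch using Environment.Is64BitProcess (available from .NET 4.0); `Describe` is a hypothetical helper name.

```csharp
using System;

class ProcessBitness
{
    // Describes the bitness of the current process, which can differ
    // from the bitness of the operating system.
    public static string Describe()
    {
        return Environment.Is64BitProcess ? "64-bit process" : "32-bit process";
    }

    static void Main()
    {
        Console.WriteLine(Describe());
        Console.WriteLine("64-bit OS: " + Environment.Is64BitOperatingSystem);
    }
}
```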

The while loop version of your code will likely consume more overall memory, provided you have plenty available, before throwing the exception because your strings can be allocated in smaller fragments, but it is guaranteed to hit the error eventually (although if you have plenty of resources, it could happen as a result of the ArrayList exceeding the maximum number of elements in an array rather than the inability to allocate new space for a small string). Kent Murra is also correct about string interning; you will either need to randomize the length of the string or the character contents to avoid interning, otherwise you're simply creating pointers to the same string. Steve Townsend's recommendation to increase string length would also make finding large enough contiguous blocks of memory harder to come by, which will allow the exception to happen more quickly.

EDIT:

Thought I'd give some links people may find handy for understanding .NET memory:

These two articles are a little older, but very good in depth reading:

Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework

Garbage Collection Part 2: Automatic Memory Management in the Microsoft .NET Framework

These are blog posts from a .NET garbage collection developer with information about newer versions of .NET memory management:

So, what’s new in the CLR 4.0 GC?

CLR 4.5: Maoni Stephens - Server Background GC

This SO Question may help you observe the inner workings of .NET memory:

.NET Memory Profiling Tools

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Glorfindel
Solution 3 Moo-Juice
Solution 4 Kent Murra
Solution 5 Hans Passant
Solution 6 Tim Lloyd
Solution 7 CaffGeek
Solution 8 Glorfindel
Solution 9 Community