'Why is stack memory size so limited?
When you allocate memory on the heap, the only limit is free RAM (or virtual memory). It makes Gb of memory.
So why is stack size so limited (around 1 Mb)? What technical reason prevents you to create really big objects on the stack?
Update: My intent might not be clear, I do not want to allocate huge objects on the stack and I do not need a bigger stack. This question is just pure curiosity!
Solution 1:[1]
My intuition is the following. The stack is not as easy to manage as the heap. The stack need to be stored in continuous memory locations. This means that you cannot randomly allocate the stack as needed, but you need to at least reserve virtual addresses for that purpose. The larger the size of the reserved virtual address space, the fewer threads you can create.
For example, a 32-bit application generally has a virtual address space of 2GB. This means that if the stack size is 2MB (as default in pthreads), then you can create a maximum of 1024 threads. This can be small for applications such as web servers. Increasing the stack size to, say, 100MB (i.e., you reserve 100MB, but do not necessarily allocated 100MB to the stack immediately), would limit the number of threads to about 20, which can be limiting even for simple GUI applications.
A interesting question is, why do we still have this limit on 64-bit platforms. I do not know the answer, but I assume that people are already used to some "stack best practices": be careful to allocate huge objects on the heap and, if needed, manually increase the stack size. Therefore, nobody found it useful to add "huge" stack support on 64-bit platforms.
Solution 2:[2]
One aspect that nobody has mentioned yet:
A limited stack size is an error detection and containment mechanism.
Generally, the main job of the stack in C and C++ is to keep track of the call stack and local variables, and if the stack grows out of bounds, it is almost always an error in the design and/or the behaviour of the application.
If the stack would be allowed to grow arbitrarily large, these errors (like infinite recursion) would be caught very late, only after the operating systems resources are exhausted. This is prevented by setting an arbitrary limit to the stack size. The actual size is not that important, apart from it being small enough to prevent system degradation.
Solution 3:[3]
It is just a default size. If you need more, you can get more - most often by telling the linker to allocate extra stack space.
The downside to having large stacks is that if you create many threads, they will need one stack each. If all the stacks are allocating multi-MBs, but not using it, the space will be wasted.
You have to find the proper balance for your program.
Some people, like @BJovke, believe that virtual memory is essentially free. It is true that you don't need to have physical memory backing all the virtual memory. You do have to be able to at least give out addresses to the virtual memory.
However, on a typical 32-bit PC the size of the virtual memory is the same as the size of the physical memory - because we only have 32 bits for any address, virtual or not.
Because all threads in a process share the same address space, they have to divide it between them. And after the operating system has taken its part, there is "only" 2-3 GB left for an application. And that size is the limit for both the physical and the virtual memory, because there just aren't any more addresses.
Solution 4:[4]
Think of the stack in the order of near to far. Registers are close to the CPU (fast), the stack is a bit further (but still relatively close) and the heap is far away (slow access).
The stack lives on the heap ofcourse, but still, since it's being used continuously, it probably never leaves the CPU cache(s), making it faster than just the average heap access. This is a reason to keep the stack reasonably sized; to keep it cached as much as possible. Allocating big stack objects (possibly automatically resizing the stack as you get overflows) goes against this principle.
So it's a good paradigm for performance, not just a left-over from old times.
Solution 5:[5]
For one thing, the stack is continuous, so if you allocate 12MB, you must remove 12MB when you want to go below whatever you created. Also moving objects around becomes much harder. Here is a real world example that may make things easier to understand:
Say you are stacking boxes around a room. Which is easier to manage:
- stacking boxes of any weight on top of each other, but when you need to get something on the bottom you have to undo your entire pile. If you want to take a item out of the pile and give it to someone else you must take off all of the boxes and move the box to the other person's pile (Stack only)
- You put all of your boxes (except for really small boxes) over in a special area where you do not stack stuff on top of other stuff and write down where you put it on a piece of paper (a pointer) and put the paper on the pile. If you need to give the box to someone else you just hand them the slip of paper from your pile, or just give them a photocopy of the paper and leave the original where it was in your pile. (Stack + heap)
Those two examples are gross generalizations and there are some points that are blatantly wrong in the analogy but it is close enough that it hopefully will help you see the advantages in both cases.
Solution 6:[6]
Allocating large objects in a, say, 100MB stack would make it impossible on most machines to have them loaded at once into cache, which pretty much defeats the purpose of the stack.
The point of the stack is to have small objects that belong to the same scope (and are, therefore, usually needed together or close to each other) stored together in contiguous memory addresses, so that the program can have them all loaded into cache at the same time, minimizing cache misses and, in general, the time CPU has to wait until it gets some missing piece of data from the slower RAM.
A 50MB object stored in the stack would not fit into the cache, meaning after every cache line there would be a CPU waiting time until the next piece of data is brought from RAM, meaning one would be clogging the call stack and not getting any significant benefit (in terms of speed) as compared to loading from the heap.
Solution 7:[7]
Many of the things you think you need a big stack for, can be done some other way.
Sedgewick's "Algorithms" has a couple good examples of "removing" recursion from recursive algorithms such as QuickSort, by replacing the recursion with iteration. In reality, the algorithm is still recursive, and there is still as stack, but you allocate the sorting stack on the heap, rather than using the runtime stack.
(I favor the second edition, with algorithms given in Pascal. It can be had used for eight bucks.)
Another way to look at it, is if you think you need a big stack, your code is inefficient. There is a better way that uses less stack.
Solution 8:[8]
If you could have an infinte stack then every virtual address could potentially be used by the stack. If the stack can use evey address, then there is no place for the heap to go. Every address you picked for a heap variable could be overwritten by a growing stack.
To put it another way, variables on the stack and variables on the heap occupy the same virtual address space. We need some way of preventing the heap allocator from allocating data where the stack might grow into. A stack size is an easy way to do it. The heap allocator knows that the stack addresses are taken and so it uses something else.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Andreas Grapentin |
Solution 3 | |
Solution 4 | Ruud van Gaal |
Solution 5 | |
Solution 6 | mosegui |
Solution 7 | Mike Crawford |
Solution 8 | user2881022 |