What are the "stack" and "heap", and where are they?

What are the stack and heap that come up so often in discussions of memory management?

Are these really regions of memory, as some people say, or are they just abstract concepts that make it easier to understand how memory is managed?

Is one of them faster than the other? If one is clearly faster, why does the other exist?

Does it make a difference whether I'm using, for example, Assembly, C, Java, or JavaScript, on Windows or Linux? Is this controlled by the "language" or by the operating system?

Anyway, I wanted to better understand these concepts, which seem to be misunderstood by many programmers. An explanation would be very useful for those who are starting out or who learned it the wrong way.

Author: Maniero, 2014-02-02

4 answers

The stack, in this context, is an optimized way to organize data in memory: data is allocated in sequence and abandoned (yes, there is usually no actual deallocation) in the reverse order of its arrival.

[Image: a stack]

The heap (nobody really translates the term) is the most flexible memory organization; it allows the use of any available logical area.

[Image: a heap]

What stack are we talking about?

There are some very widespread stack concepts in computing, to name a few:

  • There is the execution stack of some architectures, where instructions and data are stacked and, after something is executed, unstacked.
  • There is the function call stack, which is often confused with memory management: functions are stacked as they are called, and each one leaves the stack when its execution ends.
  • There is the generic stack data structure, which stacks arbitrary data (example in C#).

Abstract concept

The two concepts in the question are abstract. There is no physically distinct memory area for the stack (let alone one that is physically stacked), and there is no reserved area for the heap; on the contrary, it is often quite fragmented. We use the concepts to better understand how memory works and what that implies, especially in the case of the stack.

Most popular modern computer architectures offer few facilities for handling this memory stack (usually just the stack pointer register). The same goes for the heap, although there are instructions for manipulating virtual memory that help organize it in a certain way; these apply to all memory, though, not only to the heap.

Getting a little more concrete

The operating system, on the other hand, is well aware of these concepts, and it is essential that it has some way, even if limited, to manipulate the memory of applications, especially in modern general-purpose systems. Modern systems have complex management through what is called virtual memory, which is also an abstract concept, often misunderstood.

Where we manipulate it directly

In Assembly or C it is very common to deal with this memory management directly. In Assembly it is common to manipulate the stack almost directly, and in both languages at least the allocation and deallocation of the heap must be done manually through the operating system API. In C the stack is managed by the compiler, unless some unusual operation is required.

Nothing prevents you from using a library that abstracts this manipulation, but this is only common in higher-level languages. In fact, it is very common for other languages to use the OS API internally for the bulk memory management while the fine-grained ("retail") management is done by a manager of their own, generally called a garbage collector. It works either by counting references to an object in the heap (some do not consider this a garbage collection technique) or by checking later whether references to the object still exist. Even with a more abstract library, the concepts remain.

The higher the level of the language, the less you need to manage all this, but understanding the overall workings is important in all languages.

Languages that do not need performance can leave everything in the heap and so "simplify" understanding and access.

Stack

Allocation

Under normal conditions, the stack is allocated at the beginning of application execution, or more precisely at the beginning of each thread, even if the application only has the main thread.

The stack is a contiguous portion of memory reserved for stacking the data needed during the execution of code blocks.

Each allocation is a slice of the stack, always used in sequence and delimited by a marker, that is, a pointer. This pointer "moves" to indicate that a new part of the reserved portion is now in use.

When something that occupied a segment is no longer needed, this marker moves in the opposite direction, indicating that some of that data can be discarded (overwritten by new data).

There is no real allocation of each piece of memory on the stack; there is only the movement of this pointer, indicating that an area will be used by some data.
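
To make the "moving marker" idea concrete, here is a minimal sketch in Python. It is only a toy model, not how any real runtime is implemented: the class name `StackRegion`, the use of a `bytearray` as the reserved block, and the upward growth direction are all assumptions made for illustration (real stacks often grow toward lower addresses).

```python
class StackRegion:
    """Toy model of a thread's stack: one reserved block plus a moving marker."""

    def __init__(self, size):
        self.memory = bytearray(size)  # reserved once, up front
        self.top = 0                   # the "stack pointer" of this model

    def push(self, data):
        """Place data at the marker and advance the marker."""
        if self.top + len(data) > len(self.memory):
            raise MemoryError("stack overflow in the toy model")
        start = self.top
        self.memory[start:start + len(data)] = data
        self.top += len(data)          # "allocating" is just moving the marker
        return start

    def pop(self, size):
        """Move the marker back; the old bytes are abandoned, not erased."""
        self.top -= size
        return bytes(self.memory[self.top:self.top + size])
```

After a `pop`, the bytes are still physically in `memory` until a later `push` overwrites them, which is the "abandoned, not deallocated" behavior described above.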

Roughly speaking, we can say that the application has full control over the stack, except when the available space on it runs out.

There are features to manually change the size of the stack, but this is unusual.

Operation

The stack works in LIFO (last in, first out) order.

The scope of a variable usually determines how long its allocation on the stack lasts. The data used as function parameters and return values is allocated on the stack. That is why the function call stack gets confused with the memory stack.

We can say that the parameters are the first variables of a function allocated in the stack. Data access in the stack is usually done directly, but there are indirections as well.
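
The LIFO behavior of function calls can be observed with a small sketch. The function names (`call_order`, `outer`, `inner`) are hypothetical and exist only for this illustration; the point is that the last function entered is the first one to leave.

```python
def call_order():
    """Return the order in which nested calls are entered and left."""
    events = []

    def inner():
        events.append("enter inner")
        events.append("leave inner")

    def outer():
        events.append("enter outer")
        inner()                      # inner's frame sits on top of outer's
        events.append("leave outer")

    outer()
    return events
```

Calling `call_order()` shows `outer` being entered first and left last, with `inner` entirely nested inside it.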

[Image: stack allocation]

You have probably gathered that each thread has its own stack, right? The stack size of each thread can be set before the thread is created. A default value is often used.
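
As a hedged illustration of setting the stack size before a thread is created, Python's standard `threading.stack_size` can be used. The helper name `run_with_stack_size` and the 512 KiB figure in the usage note are assumptions for this sketch; the accepted sizes depend on the platform, and some platforms do not allow changing the size at all.

```python
import threading


def run_with_stack_size(size_bytes, target):
    """Run target() in a new thread whose stack size was requested as size_bytes."""
    # The setting applies to threads created after this call.
    previous = threading.stack_size(size_bytes)
    result = []
    try:
        worker = threading.Thread(target=lambda: result.append(target()))
        worker.start()
        worker.join()
    finally:
        threading.stack_size(previous)  # restore the earlier setting
    return result[0]
```

For example, `run_with_stack_size(512 * 1024, lambda: sum(range(10)))` runs the small computation in a thread created with a 512 KiB stack request.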

The stack is considered an automatic form of allocation. It is often confused with static allocation, which happens when the executable is loaded. Technically there is another area of memory that is actually static, allocated before execution starts. The effectively static area cannot be manipulated or written to (at least it should not be). The stack itself is static, although its data is not: data is placed and abandoned according to use, so its management is automatic.

Decision on where to allocate

Just as in the heap, it is not possible to allocate data on the stack before knowing its size. The size does not have to be known at compile time, only at the moment the allocation is performed, although the stack imposes some restrictions. If the size is indeterminate at compile time, or is known to be possibly large (perhaps a few tens of bytes), the allocation should probably occur in the heap.

High-level languages predetermine this. Others give the programmer more control, and even allow abusing the stack when that is useful and the programmer knows what they are doing.

Stack overflow

The famous stack overflow occurs when you try to allocate something on the stack and there is no reserved space left. Also, in some cases, if the language provides mechanisms that allow it, one piece of data may overflow onto another that follows it on the stack. Uncontrolled recursion causes stack overflow.
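
Uncontrolled recursion can be demonstrated safely in Python, with one caveat: the interpreter enforces its own recursion limit and raises `RecursionError` before the underlying native stack is exhausted, so this is an analogy for a stack overflow rather than the real thing. The function name `depth_reached` and the limit used in the usage note are assumptions for this sketch.

```python
import sys


def depth_reached(limit):
    """Recurse with no base case and report how deep Python let us go."""
    old_limit = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    depth = 0

    def dive():
        nonlocal depth
        depth += 1
        dive()                      # no termination condition on purpose

    try:
        dive()
    except RecursionError:
        pass                        # the interpreter stopped the runaway recursion
    finally:
        sys.setrecursionlimit(old_limit)
    return depth
```

`depth_reached(300)` returns a number somewhat below 300, because the frames that were already active when the function was called count toward the limit too.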

Other stack

There is also a call stack, which stores the addresses to which execution should return when a function ends.

Heap

Allocation

The heap, unlike the stack, does not impose a model, a memory allocation pattern. This is not very efficient, but it is quite flexible.

The heap is considered dynamic. In general you allocate and deallocate small pieces of memory, just enough for the data you need. Physically, this allocation can occur in any free part of the memory available to your process.

The operating system's virtual memory management, aided by processor instructions, helps organize this.

In a way we can say that the stack as a whole is the first object allocated in the heap.

In practice, these actual allocations often occur in fixed-size blocks called pages. This keeps the application from making dozens or hundreds of small allocations that would fragment memory in an extreme way, and it avoids calls to the operating system, which switch context and are usually much slower. In general, every memory allocation system allocates more than it needs and hands the memory to the application as required. In some cases it almost simulates a stack for some time, or reorganizes the memory (through a compacting GC).
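
The idea of reserving whole pages and handing out small pieces can be sketched as follows. This is a toy model under stated assumptions: `ChunkedAllocator` is a hypothetical name, the 4096-byte page size is assumed (real systems vary), and the "operating system call" is only simulated by a counter.

```python
PAGE_SIZE = 4096  # assumed page size for this model


class ChunkedAllocator:
    """Toy allocator: asks the simulated OS for whole pages, hands out small pieces."""

    def __init__(self):
        self.os_requests = 0  # how many times the simulated OS was called
        self.reserved = 0     # bytes obtained from the simulated OS so far
        self.used = 0         # bytes handed to the application

    def allocate(self, size):
        """Return the offset of a new block of `size` bytes."""
        while self.used + size > self.reserved:
            self.reserved += PAGE_SIZE   # stands in for the expensive OS call
            self.os_requests += 1
        offset = self.used
        self.used += size                # within a page, this behaves like a stack
        return offset
```

In this model, one hundred allocations of 64 bytes need only two simulated OS requests, which is the saving the paragraph above describes.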

Deallocation

Heap deallocation usually happens:

  • manually (at the risk of bugs), although this is not available in some languages;
  • through the garbage collector, which identifies when a part of the heap is no longer needed;
  • when an application terminates.
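
The two garbage collection approaches mentioned in this answer (reference counting and a later check for remaining references) can both be observed in CPython. The sketch below assumes CPython specifically; other Python implementations may free objects at different moments. The class `Node` and both function names are hypothetical, introduced only for this demonstration.

```python
import gc
import weakref


class Node:
    """Hypothetical heap object used only for this demonstration."""

    def __init__(self):
        self.other = None


def freed_by_refcount():
    """An object with no cycles is freed as soon as its last reference goes away."""
    obj = Node()
    probe = weakref.ref(obj)   # a weak reference does not keep the object alive
    del obj
    return probe() is None


def freed_only_by_collector():
    """Two objects that reference each other need the cycle collector."""
    was_enabled = gc.isenabled()
    gc.disable()               # keep the automatic collector out of the experiment
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a
        probe = weakref.ref(a)
        del a, b
        freed_before = probe() is None   # the cycle still holds both objects
        gc.collect()
        freed_after = probe() is None
    finally:
        if was_enabled:
            gc.enable()
    return freed_before, freed_after
```

The first function shows reference counting at work; the second shows that a cycle survives until the collector explicitly looks for unreachable objects.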

Depends on implementation

There are even languages with specialized heaps that may behave a little differently, but let's keep to the common cases.

Abstract concept

Clearly the heap is not one area of memory. Even thinking abstractly, it is a set of small areas of memory, and physically it is often fragmented all over memory. These parts are very flexible in size and lifetime.

For security reasons it is good to know that deallocation is an abstract concept as well. It is often possible to access an application's data even after the application has finished. Content is only erased by explicit request or when a freed area is overwritten.

Cost of heap

An allocation in the heap is "expensive". The operating system must perform many tasks to guarantee that an area is properly assigned to one part of the application, especially in concurrent environments, which are very common today. Even when the OS is not needed, the allocator still runs a complex algorithm. Deallocating, that is, making an area available again, also has its cost; in some cases, making allocation cheaper makes release quite expensive (ironically, this can be controlled with several stacks).

There are ways to avoid calling the operating system for every allocation, but the processing "cost" is still considered high. Keeping lists (in some cases linked lists) of allocated areas or pages is not trivial for the processor, at least compared with the pointer movement that is all the stack requires.
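
To give a feel for the bookkeeping involved, here is a minimal first-fit allocator sketch. It is a toy model, not the algorithm of any particular operating system or runtime: the class name `FreeListHeap` and the first-fit policy are assumptions chosen for brevity. Note how much more work `allocate` and `release` do compared with moving a single pointer.

```python
class FreeListHeap:
    """Toy first-fit heap over a fixed range of addresses."""

    def __init__(self, size):
        self.free = [(0, size)]  # holes as (start, length), kept sorted by start
        self.live = {}           # start -> length of each allocated block

    def allocate(self, size):
        """Return the start of the first hole that fits, or None if none does."""
        for i, (start, length) in enumerate(self.free):
            if length >= size:
                if length == size:
                    del self.free[i]
                else:
                    self.free[i] = (start + size, length - size)
                self.live[start] = size
                return start
        return None

    def release(self, start):
        """Give a block back and merge holes that touch each other."""
        length = self.live.pop(start)
        self.free.append((start, length))
        self.free.sort()
        merged = []
        for hole_start, hole_len in self.free:
            if merged and merged[-1][0] + merged[-1][1] == hole_start:
                merged[-1] = (merged[-1][0], merged[-1][1] + hole_len)
            else:
                merged.append((hole_start, hole_len))
        self.free = merged
```

The model also shows fragmentation: after freeing one block out of three in a 100-byte heap, 40 bytes may be free in total and yet a 40-byte request fails because the free space is not contiguous.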

Operation

[Image: heap allocation]

The heap is accessed via pointers. Even in languages that do not expose pointers to the programmer, this is done internally and transparently.

[Image: example of general allocation]

Note that in the example, an object of Type class1 is allocated in the heap. But there is a reference to this object, which is allocated in the stack (in some cases it might not be).

This allocation is necessary because the object may be too large to fit on the stack (or would at least occupy a considerable part of it), or because it may outlive the function that created it.

If it were on the stack, the "only" way to keep it "alive" would be to copy it to the calling function, and so on to every other function where it is needed. Imagine how "expensive" that gets. The way things are organized, only the reference, which is small, needs to be copied, and this can be done using only registers, which is very fast.
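
The difference between copying an object and copying a reference to it can be seen in a short sketch. It assumes CPython, where objects live on a heap managed by the interpreter and names hold references; the function names `make_record` and `touch` are hypothetical.

```python
def make_record(size):
    """Create a large object and hand back a reference to it."""
    record = bytearray(size)   # the object itself lives on the interpreter's heap
    return record              # only the reference travels back to the caller


def touch(ref):
    """Modify the object through the reference that was passed in."""
    ref[0] = 255               # changes the one shared object; nothing is copied
    return ref
```

If `r = make_record(1_000_000)`, then `touch(r) is r` holds and `r[0]` is 255 afterwards: the object outlived the function that created it, and passing it around never duplicated the million bytes.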

Conclusion

So the runtime of a programming language communicates with the OS to manage memory. Whether this runtime is exposed to the programmer depends on the purpose of the language. In so-called "managed" languages all of this still happens, and the two concepts exist and need to be understood, but you do not have to manipulate the heap manually. It ends up being as transparent as the stack is in lower-level languages (except Assembly).

Both are usually allocated in RAM, but nothing prevents them from being allocated elsewhere. Virtual memory can place all or part of the stack or heap in mass storage, for example.

I "stole" some images from this answer on SO, which are very good at illustrating all this.

 214
Author: Maniero, 2020-09-02 15:28:41

"Stack" literally means a pile, that is, a data structure in which the last element to enter is the first to leave (think of a stack of books). The stack therefore works quite simply: elements are added and removed in an organized, restricted way, which allows processors to be optimized for the operations involved (e.g. Intel processors have dedicated registers that store the addresses of the base and top of the stack). So one can say that the stack is faster than the heap.

A concept related to the stack is the "call frame". When a function is called, pointers and parameters are written to the top of the stack so that the called function has access to its parameters and the program can then resume from the point of the call. Again, the processor can support this (the CALL instruction in 8086 assembly, for example).

I don't remember seeing a translation of the term "heap", but the main feature of the portion of a program's memory called the "heap" is that it is intended for dynamic memory allocation (the famous malloc()/calloc()/realloc() in C and new in C++, for example). Precisely because elements in the heap can be allocated and deallocated at any time and in any order, heap allocation tends to be slower and can lead to memory fragmentation (space lost between used regions of memory).

As I mentioned, Intel processors support stack control directly. For other processors, the operating system may need to take care of this. In any case, when the program or thread is initialized, it is up to the operating system to reserve a portion of memory for it. In many cases this area is "shared" between the stack and the heap: one starts from the lowest address, the other from the highest, and the two grow towards each other. When one meets the other, an OutOfMemory exception or something similar occurs.

In Assembly/C, the programmer must be aware of these differences and is directly involved in choosing where to allocate each variable. In more "modern" languages such as Java, which has conveniences like garbage collection, the concepts still apply, but it is easier for a programmer to get by without knowing the details of what happens under the hood.

 76
Author: fpessoa, 2014-02-02 20:00:34

I would like to present my understanding, which is less technical than the answers above but may help the programmer who just wants to know what these are without delving into the subject.

Stack memory is used to store the arguments of a function, procedure, or method. Because it is statically pre-allocated at the start of the program and released at the end, it is faster than heap memory, which must be allocated and released every time it is needed.

The stack is usually allocated once, as a single large block, and for each function/method/procedure argument a part, or chunk, of that memory is used. Because stack memory has a fixed size specified in the project settings, it is a potential source of problems if the specified size is smaller than what the program actually needs.

That is why it was common in the past (less so nowadays) to see the famous error message that gives this site its name, "Stack Overflow", meaning that the stack memory has run out. This happens, or used to happen, a lot with cascading function calls (a function that calls another, which calls another, and so on) and with recursive calls of the same function/procedure/method that either have no termination control or would need so much stack memory before terminating that it would be necessary to increase the stack size and recompile the program.

Nowadays the famous "stack overflow" is rare, because today's compilers reserve a very large amount of stack by default in their project settings. That was not the case in the past, when computers had much less memory than current ones and could not afford to allocate a lot of stack.

Heap memory is heavily used for allocating objects, which are released when they are no longer in use. Memory, as a generic concept, can be static or dynamic. Static memory (also called global) uses the data segment (on Intel 80x86 processors), while dynamic memory uses the heap, and that is what differentiates one from the other.

But what is the difference between the heap and the stack? While the stack is used by functions/methods/procedures, the heap can be used at any point in the program, normally when creating objects or pointers to some data structure.

 45
Author: Claudio Ferreira, 2015-11-19 12:39:24

Definitions

Stack

The stack holds objects allocated within function scopes, including local variables, arguments, return values, and the addresses of the code that was executing before each function call.

Memory allocation occurs sequentially, and since the position of these objects is known at compile time, we can give these objects names and access them directly. When an object allocated on the stack leaves its scope, it is automatically deleted. So you don't have to worry about allocating and deallocating memory for stack objects, but be careful: the stack has a limited size.

Heap

The heap is the proper memory location to allocate many large objects, as this section of the program is much larger than the stack, and its size is limited only by the virtual memory available on your machine. The objects allocated in the heap are all those allocated using new or malloc() (dynamically allocated objects). Since the position these objects will be in during program execution is unknown at compile time, the only way to access them is via pointers. You should remember to control the deallocation of these objects, as they are not automatically destroyed.

Answers

Is one of them faster than the other? If one is clearly faster, why does the other exist?

The stack is faster because the layout of its variables/objects is determined at compile time. In addition, the stack does not extend into the machine's virtual memory on disk (HD), whereas at some point an object/variable allocated in the heap may be stored on disk and then have to be loaded back into RAM.

Does it make a difference if I'm using, for example, Assembly, C, Java, or JavaScript, Windows or Linux? Is this controlled by the "language" or by the operating system?

Languages such as Java and Python have a garbage collector that removes from memory the objects and variables that are no longer referenced. As for differences between operating systems, there may be one regarding addressing; I think this question is best explained by this link to the English Stack Overflow.

Reference: http://www.unidev.com.br/index.php?/topic/55299-entendendo-as-divis%C3%B5es-de-mem%C3%B3ria-stack-heap-global-e-code/

 40
Author: Ricardo, 2020-06-11 14:45:34