TRPL: the stack and the heap

This commit is contained in:
Steve Klabnik 2015-05-10 20:22:06 -04:00
parent a90dd32949
commit e66653bee5

View file

@ -1,3 +1,570 @@
% The Stack and the Heap
Coming Soon
As a systems language, Rust operates at a low level. If youre coming from a
high-level language, there are some aspects of systems programming that you may
not be familiar with. The most important one is how memory works, with a stack
and a heap. If youre familiar with how C-like languages use stack allocation,
this chapter will be a refresher. If youre not, youll learn about this more
general concept, but with a Rust-y focus.
# Memory management
These two terms are about memory management. The stack and the heap are
abstractions that help you determine when to allocate and deallocate memory.
Heres a high-level comparison:
The stack is very fast, and is where memory is allocated in Rust by default.
But the allocation is local to a function call, and is limited in size. The
heap, on the other hand, is slower, and is explicitly allocated by your
program. But its effectively unlimited in size, and is globally accessible.
# The Stack
Lets talk about this Rust program:
```rust
fn main() {
let x = 42;
}
```
This program has one variable binding, `x`. This memory needs to be allocated
from somewhere. Rust stack allocates by default, which means that basic
values go on the stack. What does that mean?
Well, when a function gets called, some memory gets allocated for all of its
local variables and some other information. This is called a stack frame, and
for the purpose of this tutorial, were going to ignore the extra information
and just consider the local variables were allocating. So in this case, when
`main()` is run, well allocate a single 32-bit integer for our stack frame.
This is automatically handled for you, as you can see, we didnt have to write
any special Rust code or anything.
When the function is over, its stack frame gets deallocated. This happens
automatically, we didnt have to do anything special here.
Thats all there is for this simple program. The key thing to understand here
is that stack allocation is very, very fast. Since we know all the local
variables we have ahead of time, we can grab the memory all at once. And since
well throw them all away at the same time as well, we can get rid of it very
fast too.
The downside is that we cant keep values around if we need them for longer
than a single function. We also havent talked about what that name, stack
means. To do that, we need a slightly more complicated example:
```rust
fn foo() {
let y = 5;
let z = 100;
}
fn main() {
let x = 42;
foo();
}
```
This program has three variables total: two in `foo()`, one in `main()`. Just
as before, when `main()` is called, a single integer is allocated for its stack
frame. But before we can show what happens when `foo()` is called, we need to
visualize whats going on with memory. Your operating system presents a view of
memory to your program thats pretty simple: a huge list of addresses, from 0
to a large number, representing how much RAM your computer has. For example, if
you have a gigabyte of RAM, your addresses go from `0` to `1,073,741,824`. That
number comes from 2<sup>30</sup>, the number of bytes in a gigabyte.
This memory is kind of like a giant array: addresses start at zero and go
up to the final number. So heres a diagram of our first stack frame:
| Address | Name | Value |
+---------+------+-------+
| 0 | x | 42 |
Weve got `x` located at address `0`, with the value `42`.
When `foo()` is called, a new stack frame is allocated:
| Address | Name | Value |
+---------+------+-------+
| 2 | z | 100 |
| 1 | y | 5 |
| 0 | x | 42 |
Because `0` was taken by the first frame, `1` and `2` are used for `foo()`s
stack frame. It grows upward, the more functions we call.
Theres some important things we have to take note of here. The numbers 0, 1,
and 2 are all solely for illustrative purposes, and bear no relationship to the
actual numbers the computer will actually use. In particular, the series of
addresses are in reality going to be separated by some number of bytes that
separate each address, and that separation may even exceed the size of the
value being stored.
After `foo()` is over, its frame is deallocated:
| Address | Name | Value |
+---------+------+-------+
| 0 | x | 42 |
And then, after `main()`, even this last value goes away. Easy!
Its called a stack because it works like a stack of dinner plates: the first
plate you put down is the last plate to pick back up. Stacks are sometimes
called last in, first out queues for this reason, as the last value you put
on the stack is the first one you retrieve from it.
Lets try a three-deep example:
```rust
fn bar() {
let i = 6;
}
fn foo() {
let a = 5;
let b = 100;
let c = 1;
bar();
}
fn main() {
let x = 42;
foo();
}
```
Okay, first, we call `main()`:
| Address | Name | Value |
+---------+------+-------+
| 0 | x | 42 |
Next up, `main()` calls `foo()`:
| Address | Name | Value |
+---------+------+-------+
| 3 | c | 1 |
| 2 | b | 100 |
| 1 | a | 5 |
| 0 | x | 42 |
And then `foo()` calls `bar()`:
| Address | Name | Value |
+---------+------+-------+
| 4 | i | 6 |
| 3 | c | 1 |
| 2 | b | 100 |
| 1 | a | 5 |
| 0 | x | 42 |
Whew! Our stack is growing tall.
After `bar()` is over, its frame is deallocated, leaving just `foo()` and
`main()`:
| Address | Name | Value |
+---------+------+-------+
| 3 | c | 1 |
| 2 | b | 100 |
| 1 | a | 5 |
| 0 | x | 42 |
And then `foo()` ends, leaving just `main()`
| Address | Name | Value |
+---------+------+-------+
| 0 | x | 42 |
And then were done. Getting the hang of it? Its like piling up dishes: you
add to the top, you take away from the top.
# The Heap
Now, this works pretty well, but not everything can work like this. Sometimes,
you need to pass some memory between different functions, or keep it alive for
longer than a single functions execution. For this, we can use the heap.
In Rust, you can allocate memory on the heap with the [`Box<T>` type][box].
Heres an example:
```rust
fn main() {
let x = Box::new(5);
let y = 42;
}
```
[box]: ../std/boxed/index.html
Heres what happens in memory when `main()` is called:
| Address | Name | Value |
+---------+------+--------+
| 1 | y | 42 |
| 0 | x | ?????? |
We allocate space for two variables on the stack. `y` is `42`, as it always has
been, but what about `x`? Well, `x` is a `Box<i32>`, and boxes allocate memory
on the heap. The actual value of the box is a structure which has a pointer to
the heap. When we start executing the function, and `Box::new()` is called,
it allocates some memory for the heap, and puts `5` there. The memory now looks
like this:
| Address | Name | Value |
+-----------------+------+----------------+
| 2<sup>30</sup> | | 5 |
| ... | ... | ... |
| 1 | y | 42 |
| 0 | x | 2<sup>30</sup> |
We have 2<sup>30</sup> in our hypothetical computer with 1GB of RAM. And since
our stack grows from zero, the easiest place to allocate memory is from the
other end. So our first value is at the highest place in memory. And the value
of the struct at `x` has a [raw pointer][rawpointer] to the place weve
allocated on the heap, so the value of `x` is 2<sup>30</sup>, the memory
location weve asked for.
[rawpointer]: raw-pointers.html
We havent really talked too much about what it actually means to allocate and
deallocate memory in these contexts. Getting into very deep detail is out of
the scope of this tutorial, but whats important to point out here is that
the heap isnt just a stack that grows from the opposite end. Well have an
example of this later in the book, but because the heap can be allocated and
freed in any order, it can end up with holes. Heres a diagram of the memory
layout of a program which has been running for a while now:
| Address | Name | Value |
+----------------------+------+----------------------+
| 2<sup>30</sup> | | 5 |
| (2<sup>30</sup>) - 1 | | |
| (2<sup>30</sup>) - 2 | | |
| (2<sup>30</sup>) - 3 | | 42 |
| ... | ... | ... |
| 3 | y | (2<sup>30</sup>) - 3 |
| 2 | y | 42 |
| 1 | y | 42 |
| 0 | x | 2<sup>30</sup> |
In this case, weve allocated four things on the heap, but deallocated two of
them. Theres a gap between 2<sup>30</sup> and (2<sup>30</sup>) - 3 which isnt
currently being used. The specific details of how and why this happens depends
on what kind of strategy you use to manage the heap. Different programs can use
different memory allocators, which are libraries that manage this for you.
Rust programs use [jemalloc][jemalloc] for this purpose.
[jemalloc]: http://www.canonware.com/jemalloc/
Anyway, back to our example. Since this memory is on the heap, it can stay
alive longer than the function which allocates the box. In this case, however,
it doesnt.[^moving] When the function is over, we need to free the stack frame
for `main()`. `Box<T>`, though, has a trick up its sleve: [Drop][drop]. The
implementation of `Drop` for `Box` deallocates the memory that was allocated
when it was created. Great! So when `x` goes away, it first frees the memory
allocated on the heap:
| Address | Name | Value |
+---------+------+--------+
| 1 | y | 42 |
| 0 | x | ?????? |
[drop]: drop.html
[moving]: We can make the memory live longer by transferring ownership,
sometimes called moving out of the box. More complex examples will
be covered later.
And then the stack frame goes away, freeing all of our memory.
# Arguments and borrowing
Weve got some basic examples with the stack and the heap going, but what about
function arguments and borrowing? Heres a small Rust program:
```rust
fn foo(i: &i32) {
let z = 42;
}
fn main() {
let x = 5;
let y = &x;
foo(y);
}
```
When we enter `main()`, memory looks like this:
| Address | Name | Value |
+---------+------+-------+
| 1 | y | 0 |
| 0 | x | 5 |
`x` is a plain old `5`, and `y` is a reference to `x`. So its value is the
memory location that `x` lives at, which in this case is `0`.
What about when we call `foo()`, passing `y` as an argument?
| Address | Name | Value |
+---------+------+-------+
| 3 | z | 42 |
| 2 | i | 0 |
| 1 | y | 0 |
| 0 | x | 5 |
Stack frames arent just for local bindings, theyre for arguments too. So in
this case, we need to have both `i`, our argument, and `z`, our local variable
binding. `i` is a copy of the argument, `y`. Since `y`s value is `0`, so is
`i`s.
This is one reason why borrowing a variable doesnt deallocate any memory: the
value of a reference is just a pointer to a memory location. If we got rid of
the underlying memory, things wouldnt work very well.
# A complex example
Okay, lets go through this complex program step-by-step:
```rust
fn foo(x: &i32) {
let y = 10;
let z = &y;
baz(z);
bar(x, z);
}
fn bar(a: &i32, b: &i32) {
let c = 5;
let d = Box::new(5);
let e = &d;
baz(e);
}
fn baz(f: &i32) {
let g = 100;
}
fn main() {
let h = 3;
let i = Box::new(20);
let j = &h;
foo(j);
}
```
First, we call `main()`:
| Address | Name | Value |
+-----------------+------+----------------+
| 2<sup>30</sup> | | 20 |
| ... | ... | ... |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
We allocate memory for `j`, `i`, and `h`. `i` is on the heap, and so has a
value pointing there.
Next, at the end of `main()`, `foo()` gets called:
| Address | Name | Value |
+-----------------+------+----------------+
| 2<sup>30</sup> | | 20 |
| ... | ... | ... |
| 5 | z | 4 |
| 4 | y | 10 |
| 3 | x | 0 |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
Space gets allocated for `x`, `y`, and `z`. The argument `x` has the same value
as `j`, since thats what we passed it in. Its a pointer to the `0` address,
since `j` points at `h`.
Next, `foo()` calls `baz()`, passing `z`:
| Address | Name | Value |
+-----------------+------+----------------+
| 2<sup>30</sup> | | 20 |
| ... | ... | ... |
| 7 | g | 100 |
| 6 | f | 4 |
| 5 | z | 4 |
| 4 | y | 10 |
| 3 | x | 0 |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
Weve allocated memory for `f` and `g`. `baz()` is very short, so when its
over, we get rid of its stack frame:
| Address | Name | Value |
+-----------------+------+----------------+
| 2<sup>30</sup> | | 20 |
| ... | ... | ... |
| 5 | z | 4 |
| 4 | y | 10 |
| 3 | x | 0 |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
Next, `foo()` calls `bar()` with `x` and `z`:
| Address | Name | Value |
+----------------------+------+----------------------+
| 2<sup>30</sup> | | 20 |
| (2<sup>30</sup>) - 1 | | 5 |
| ... | ... | ... |
| 10 | e | 4 |
| 9 | d | (2<sup>30</sup>) - 1 |
| 8 | c | 5 |
| 7 | b | 4 |
| 6 | a | 0 |
| 5 | z | 4 |
| 4 | y | 10 |
| 3 | x | 0 |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
We end up allocating another value on the heap, and so we have to subtract one
from 2<sup>30</sup>. Its easier to just write that than `1,073,741,823`. In any
case, we set up the variables as usual.
At the end of `bar()`, it calls `baz()`:
| Address | Name | Value |
+----------------------+------+----------------------+
| 2<sup>30</sup> | | 20 |
| (2<sup>30</sup>) - 1 | | 5 |
| ... | ... | ... |
| 12 | g | 100 |
| 11 | f | 4 |
| 10 | e | 4 |
| 9 | d | (2<sup>30</sup>) - 1 |
| 8 | c | 5 |
| 7 | b | 4 |
| 6 | a | 0 |
| 5 | z | 4 |
| 4 | y | 10 |
| 3 | x | 0 |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
With this, were at our deepest point! Whew! Congrats for following along this
far.
After `baz()` is over, we get rid of `f` and `g`:
| Address | Name | Value |
+----------------------+------+----------------------+
| 2<sup>30</sup> | | 20 |
| (2<sup>30</sup>) - 1 | | 5 |
| ... | ... | ... |
| 10 | e | 4 |
| 9 | d | (2<sup>30</sup>) - 1 |
| 8 | c | 5 |
| 7 | b | 4 |
| 6 | a | 0 |
| 5 | z | 4 |
| 4 | y | 10 |
| 3 | x | 0 |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
Next, we return from `bar()`. `d` in this case is a `Box<T>`, so it also frees
what it points to: (2<sup>30</sup>) - 1.
| Address | Name | Value |
+-----------------+------+----------------+
| 2<sup>30</sup> | | 20 |
| ... | ... | ... |
| 5 | z | 4 |
| 4 | y | 10 |
| 3 | x | 0 |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
And after that, `foo()` returns:
| Address | Name | Value |
+-----------------+------+----------------+
| 2<sup>30</sup> | | 20 |
| ... | ... | ... |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
And then, finally, `main()`, which cleans the rest up. When `i` is `Drop`ped,
it will clean up the last of the heap too.
# What do other languages do?
Most languages with a garbage collector heap-allocate by default. This means
that every value is boxed. There are a number of reasons why this is done, but
theyre out of scope for this tutorial. There are some possible optimizations
that dont make it true 100% of the time, too. Rather than relying on the stack
and `Drop` to clean up memory, the garbage collector deals with the heap
instead.
# Which to use?
So if the stack is faster and easier to manage, why do we need the heap? A big
reason is that Stack-allocation alone means you only have LIFO semantics for
reclaiming storage. Heap-allocation is strictly more general, allowing storage
to be taken from and returned to the pool in arbitrary order, but at a
complexity cost.
Generally, you should prefer stack allocation, and so, Rust stack-allocates by
default. The LIFO model of the stack is simpler, at a fundamental level. This
has two big impacts: runtime efficiency and semantic impact.
## Runtime Efficiency.
Managing the memory for the stack is trivial: The machine just
increments or decrements a single value, the so-called “stack pointer”.
Managing memory for the heap is non-trivial: heap-allocated memory is freed at
arbitrary points, and each block of heap-allocated memory can be of arbitrary
size, the memory manager must generally work much harder to identify memory for
reuse.
If youd like to dive into this topic in greater detail, [this paper][wilson]
is a great introduction.
[wilson]: http://www.cs.northwestern.edu/~pdinda/icsclass/doc/dsa.pdf
## Semantic impact
Stack-allocation impacts the Rust language itself, and thus the developers
mental model. The LIFO semantics is what drives how the Rust language handles
automatic memory management. Even the deallocation of a uniquely-owned
heap-allocated box can be driven by the stack-based LIFO semantics, as
discussed throughout this chapter. The flexibility (i.e. expressiveness) of non
LIFO-semantics means that in general the compiler cannot automatically infer at
compile-time where memory should be freed; it has to rely on dynamic protocols,
potentially from outside the language itself, to drive deallocation (reference
counting, as used by `Rc<T>` and `Arc<T>`, is one example of this).
When taken to the extreme, the increased expressive power of heap allocation
comes at the cost of either significant runtime support (e.g. in the form of a
garbage collector) or significant programmer effort (in the form of explicit
memory management calls that require verification not provided by the Rust
compiler).