TRPL: the stack and the heap
This commit is contained in:
parent
a90dd32949
commit
e66653bee5
1 changed files with 568 additions and 1 deletions
|
@ -1,3 +1,570 @@
|
|||
% The Stack and the Heap
|
||||
|
||||
Coming Soon
|
||||
As a systems language, Rust operates at a low level. If you’re coming from a
|
||||
high-level language, there are some aspects of systems programming that you may
|
||||
not be familiar with. The most important one is how memory works, with a stack
|
||||
and a heap. If you’re familiar with how C-like languages use stack allocation,
|
||||
this chapter will be a refresher. If you’re not, you’ll learn about this more
|
||||
general concept, but with a Rust-y focus.
|
||||
|
||||
# Memory management
|
||||
|
||||
These two terms are about memory management. The stack and the heap are
|
||||
abstractions that help you determine when to allocate and deallocate memory.
|
||||
|
||||
Here’s a high-level comparison:
|
||||
|
||||
The stack is very fast, and is where memory is allocated in Rust by default.
|
||||
But the allocation is local to a function call, and is limited in size. The
|
||||
heap, on the other hand, is slower, and is explicitly allocated by your
|
||||
program. But it’s effectively unlimited in size, and is globally accessible.
|
||||
|
||||
# The Stack
|
||||
|
||||
Let’s talk about this Rust program:
|
||||
|
||||
```rust
|
||||
fn main() {
|
||||
let x = 42;
|
||||
}
|
||||
```
|
||||
|
||||
This program has one variable binding, `x`. This memory needs to be allocated
|
||||
from somewhere. Rust ‘stack allocates’ by default, which means that basic
|
||||
values ‘go on the stack’. What does that mean?
|
||||
|
||||
Well, when a function gets called, some memory gets allocated for all of its
|
||||
local variables and some other information. This is called a ‘stack frame’, and
|
||||
for the purpose of this tutorial, we’re going to ignore the extra information
|
||||
and just consider the local variables we’re allocating. So in this case, when
|
||||
`main()` is run, we’ll allocate a single 32-bit integer for our stack frame.
|
||||
This is automatically handled for you, as you can see, we didn’t have to write
|
||||
any special Rust code or anything.
|
||||
|
||||
When the function is over, its stack frame gets deallocated. This happens
|
||||
automatically, we didn’t have to do anything special here.
|
||||
|
||||
That’s all there is for this simple program. The key thing to understand here
|
||||
is that stack allocation is very, very fast. Since we know all the local
|
||||
variables we have ahead of time, we can grab the memory all at once. And since
|
||||
we’ll throw them all away at the same time as well, we can get rid of it very
|
||||
fast too.
|
||||
|
||||
The downside is that we can’t keep values around if we need them for longer
|
||||
than a single function. We also haven’t talked about what that name, ‘stack’
|
||||
means. To do that, we need a slightly more complicated example:
|
||||
|
||||
```rust
|
||||
fn foo() {
|
||||
let y = 5;
|
||||
let z = 100;
|
||||
}
|
||||
|
||||
fn main() {
|
||||
let x = 42;
|
||||
|
||||
foo();
|
||||
}
|
||||
```
|
||||
|
||||
This program has three variables total: two in `foo()`, one in `main()`. Just
|
||||
as before, when `main()` is called, a single integer is allocated for its stack
|
||||
frame. But before we can show what happens when `foo()` is called, we need to
|
||||
visualize what’s going on with memory. Your operating system presents a view of
|
||||
memory to your program that’s pretty simple: a huge list of addresses, from 0
|
||||
to a large number, representing how much RAM your computer has. For example, if
|
||||
you have a gigabyte of RAM, your addresses go from `0` to `1,073,741,824`. That
|
||||
number comes from 2<sup>30</sup>, the number of bytes in a gigabyte.
|
||||
|
||||
This memory is kind of like a giant array: addresses start at zero and go
|
||||
up to the final number. So here’s a diagram of our first stack frame:
|
||||
|
||||
| Address | Name | Value |
|
||||
+---------+------+-------+
|
||||
| 0 | x | 42 |
|
||||
|
||||
We’ve got `x` located at address `0`, with the value `42`.
|
||||
|
||||
When `foo()` is called, a new stack frame is allocated:
|
||||
|
||||
| Address | Name | Value |
|
||||
+---------+------+-------+
|
||||
| 2 | z | 100 |
|
||||
| 1 | y | 5 |
|
||||
| 0 | x | 42 |
|
||||
|
||||
Because `0` was taken by the first frame, `1` and `2` are used for `foo()`’s
|
||||
stack frame. It grows upward, the more functions we call.
|
||||
|
||||
|
||||
There’s some important things we have to take note of here. The numbers 0, 1,
|
||||
and 2 are all solely for illustrative purposes, and bear no relationship to the
|
||||
actual numbers the computer will actually use. In particular, the series of
|
||||
addresses are in reality going to be separated by some number of bytes that
|
||||
separate each address, and that separation may even exceed the size of the
|
||||
value being stored.
|
||||
|
||||
After `foo()` is over, its frame is deallocated:
|
||||
|
||||
| Address | Name | Value |
|
||||
+---------+------+-------+
|
||||
| 0 | x | 42 |
|
||||
|
||||
And then, after `main()`, even this last value goes away. Easy!
|
||||
|
||||
It’s called a ‘stack’ because it works like a stack of dinner plates: the first
|
||||
plate you put down is the last plate to pick back up. Stacks are sometimes
|
||||
called ‘last in, first out queues’ for this reason, as the last value you put
|
||||
on the stack is the first one you retrieve from it.
|
||||
|
||||
Let’s try a three-deep example:
|
||||
|
||||
```rust
|
||||
fn bar() {
|
||||
let i = 6;
|
||||
}
|
||||
|
||||
fn foo() {
|
||||
let a = 5;
|
||||
let b = 100;
|
||||
let c = 1;
|
||||
|
||||
bar();
|
||||
}
|
||||
|
||||
fn main() {
|
||||
let x = 42;
|
||||
|
||||
foo();
|
||||
}
|
||||
```
|
||||
|
||||
Okay, first, we call `main()`:
|
||||
|
||||
| Address | Name | Value |
|
||||
+---------+------+-------+
|
||||
| 0 | x | 42 |
|
||||
|
||||
Next up, `main()` calls `foo()`:
|
||||
|
||||
| Address | Name | Value |
|
||||
+---------+------+-------+
|
||||
| 3 | c | 1 |
|
||||
| 2 | b | 100 |
|
||||
| 1 | a | 5 |
|
||||
| 0 | x | 42 |
|
||||
|
||||
And then `foo()` calls `bar()`:
|
||||
|
||||
| Address | Name | Value |
|
||||
+---------+------+-------+
|
||||
| 4 | i | 6 |
|
||||
| 3 | c | 1 |
|
||||
| 2 | b | 100 |
|
||||
| 1 | a | 5 |
|
||||
| 0 | x | 42 |
|
||||
|
||||
Whew! Our stack is growing tall.
|
||||
|
||||
After `bar()` is over, its frame is deallocated, leaving just `foo()` and
|
||||
`main()`:
|
||||
|
||||
| Address | Name | Value |
|
||||
+---------+------+-------+
|
||||
| 3 | c | 1 |
|
||||
| 2 | b | 100 |
|
||||
| 1 | a | 5 |
|
||||
| 0 | x | 42 |
|
||||
|
||||
And then `foo()` ends, leaving just `main()`
|
||||
|
||||
| Address | Name | Value |
|
||||
+---------+------+-------+
|
||||
| 0 | x | 42 |
|
||||
|
||||
And then we’re done. Getting the hang of it? It’s like piling up dishes: you
|
||||
add to the top, you take away from the top.
|
||||
|
||||
# The Heap
|
||||
|
||||
Now, this works pretty well, but not everything can work like this. Sometimes,
|
||||
you need to pass some memory between different functions, or keep it alive for
|
||||
longer than a single function’s execution. For this, we can use the heap.
|
||||
|
||||
In Rust, you can allocate memory on the heap with the [`Box<T>` type][box].
|
||||
Here’s an example:
|
||||
|
||||
```rust
|
||||
fn main() {
|
||||
let x = Box::new(5);
|
||||
let y = 42;
|
||||
}
|
||||
```
|
||||
|
||||
[box]: ../std/boxed/index.html
|
||||
|
||||
Here’s what happens in memory when `main()` is called:
|
||||
|
||||
| Address | Name | Value |
|
||||
+---------+------+--------+
|
||||
| 1 | y | 42 |
|
||||
| 0 | x | ?????? |
|
||||
|
||||
We allocate space for two variables on the stack. `y` is `42`, as it always has
|
||||
been, but what about `x`? Well, `x` is a `Box<i32>`, and boxes allocate memory
|
||||
on the heap. The actual value of the box is a structure which has a pointer to
|
||||
‘the heap’. When we start executing the function, and `Box::new()` is called,
|
||||
it allocates some memory for the heap, and puts `5` there. The memory now looks
|
||||
like this:
|
||||
|
||||
| Address | Name | Value |
|
||||
+-----------------+------+----------------+
|
||||
| 2<sup>30</sup> | | 5 |
|
||||
| ... | ... | ... |
|
||||
| 1 | y | 42 |
|
||||
| 0 | x | 2<sup>30</sup> |
|
||||
|
||||
We have 2<sup>30</sup> in our hypothetical computer with 1GB of RAM. And since
|
||||
our stack grows from zero, the easiest place to allocate memory is from the
|
||||
other end. So our first value is at the highest place in memory. And the value
|
||||
of the struct at `x` has a [raw pointer][rawpointer] to the place we’ve
|
||||
allocated on the heap, so the value of `x` is 2<sup>30</sup>, the memory
|
||||
location we’ve asked for.
|
||||
|
||||
[rawpointer]: raw-pointers.html
|
||||
|
||||
We haven’t really talked too much about what it actually means to allocate and
|
||||
deallocate memory in these contexts. Getting into very deep detail is out of
|
||||
the scope of this tutorial, but what’s important to point out here is that
|
||||
the heap isn’t just a stack that grows from the opposite end. We’ll have an
|
||||
example of this later in the book, but because the heap can be allocated and
|
||||
freed in any order, it can end up with ‘holes’. Here’s a diagram of the memory
|
||||
layout of a program which has been running for a while now:
|
||||
|
||||
|
||||
| Address | Name | Value |
|
||||
+----------------------+------+----------------------+
|
||||
| 2<sup>30</sup> | | 5 |
|
||||
| (2<sup>30</sup>) - 1 | | |
|
||||
| (2<sup>30</sup>) - 2 | | |
|
||||
| (2<sup>30</sup>) - 3 | | 42 |
|
||||
| ... | ... | ... |
|
||||
| 3 | y | (2<sup>30</sup>) - 3 |
|
||||
| 2 | y | 42 |
|
||||
| 1 | y | 42 |
|
||||
| 0 | x | 2<sup>30</sup> |
|
||||
|
||||
In this case, we’ve allocated four things on the heap, but deallocated two of
|
||||
them. There’s a gap between 2<sup>30</sup> and (2<sup>30</sup>) - 3 which isn’t
|
||||
currently being used. The specific details of how and why this happens depends
|
||||
on what kind of strategy you use to manage the heap. Different programs can use
|
||||
different ‘memory allocators’, which are libraries that manage this for you.
|
||||
Rust programs use [jemalloc][jemalloc] for this purpose.
|
||||
|
||||
[jemalloc]: http://www.canonware.com/jemalloc/
|
||||
|
||||
Anyway, back to our example. Since this memory is on the heap, it can stay
|
||||
alive longer than the function which allocates the box. In this case, however,
|
||||
it doesn’t.[^moving] When the function is over, we need to free the stack frame
|
||||
for `main()`. `Box<T>`, though, has a trick up its sleve: [Drop][drop]. The
|
||||
implementation of `Drop` for `Box` deallocates the memory that was allocated
|
||||
when it was created. Great! So when `x` goes away, it first frees the memory
|
||||
allocated on the heap:
|
||||
|
||||
| Address | Name | Value |
|
||||
+---------+------+--------+
|
||||
| 1 | y | 42 |
|
||||
| 0 | x | ?????? |
|
||||
|
||||
[drop]: drop.html
|
||||
[moving]: We can make the memory live longer by transferring ownership,
|
||||
sometimes called ‘moving out of the box’. More complex examples will
|
||||
be covered later.
|
||||
|
||||
|
||||
And then the stack frame goes away, freeing all of our memory.
|
||||
|
||||
# Arguments and borrowing
|
||||
|
||||
We’ve got some basic examples with the stack and the heap going, but what about
|
||||
function arguments and borrowing? Here’s a small Rust program:
|
||||
|
||||
```rust
|
||||
fn foo(i: &i32) {
|
||||
let z = 42;
|
||||
}
|
||||
|
||||
fn main() {
|
||||
let x = 5;
|
||||
let y = &x;
|
||||
|
||||
foo(y);
|
||||
}
|
||||
```
|
||||
|
||||
When we enter `main()`, memory looks like this:
|
||||
|
||||
| Address | Name | Value |
|
||||
+---------+------+-------+
|
||||
| 1 | y | 0 |
|
||||
| 0 | x | 5 |
|
||||
|
||||
`x` is a plain old `5`, and `y` is a reference to `x`. So its value is the
|
||||
memory location that `x` lives at, which in this case is `0`.
|
||||
|
||||
What about when we call `foo()`, passing `y` as an argument?
|
||||
|
||||
| Address | Name | Value |
|
||||
+---------+------+-------+
|
||||
| 3 | z | 42 |
|
||||
| 2 | i | 0 |
|
||||
| 1 | y | 0 |
|
||||
| 0 | x | 5 |
|
||||
|
||||
Stack frames aren’t just for local bindings, they’re for arguments too. So in
|
||||
this case, we need to have both `i`, our argument, and `z`, our local variable
|
||||
binding. `i` is a copy of the argument, `y`. Since `y`’s value is `0`, so is
|
||||
`i`’s.
|
||||
|
||||
This is one reason why borrowing a variable doesn’t deallocate any memory: the
|
||||
value of a reference is just a pointer to a memory location. If we got rid of
|
||||
the underlying memory, things wouldn’t work very well.
|
||||
|
||||
# A complex example
|
||||
|
||||
Okay, let’s go through this complex program step-by-step:
|
||||
|
||||
```rust
|
||||
fn foo(x: &i32) {
|
||||
let y = 10;
|
||||
let z = &y;
|
||||
|
||||
baz(z);
|
||||
bar(x, z);
|
||||
}
|
||||
|
||||
fn bar(a: &i32, b: &i32) {
|
||||
let c = 5;
|
||||
let d = Box::new(5);
|
||||
let e = &d;
|
||||
|
||||
baz(e);
|
||||
}
|
||||
|
||||
fn baz(f: &i32) {
|
||||
let g = 100;
|
||||
}
|
||||
|
||||
fn main() {
|
||||
let h = 3;
|
||||
let i = Box::new(20);
|
||||
let j = &h;
|
||||
|
||||
foo(j);
|
||||
}
|
||||
```
|
||||
|
||||
First, we call `main()`:
|
||||
|
||||
| Address | Name | Value |
|
||||
+-----------------+------+----------------+
|
||||
| 2<sup>30</sup> | | 20 |
|
||||
| ... | ... | ... |
|
||||
| 2 | j | 0 |
|
||||
| 1 | i | 2<sup>30</sup> |
|
||||
| 0 | h | 3 |
|
||||
|
||||
We allocate memory for `j`, `i`, and `h`. `i` is on the heap, and so has a
|
||||
value pointing there.
|
||||
|
||||
Next, at the end of `main()`, `foo()` gets called:
|
||||
|
||||
| Address | Name | Value |
|
||||
+-----------------+------+----------------+
|
||||
| 2<sup>30</sup> | | 20 |
|
||||
| ... | ... | ... |
|
||||
| 5 | z | 4 |
|
||||
| 4 | y | 10 |
|
||||
| 3 | x | 0 |
|
||||
| 2 | j | 0 |
|
||||
| 1 | i | 2<sup>30</sup> |
|
||||
| 0 | h | 3 |
|
||||
|
||||
Space gets allocated for `x`, `y`, and `z`. The argument `x` has the same value
|
||||
as `j`, since that’s what we passed it in. It’s a pointer to the `0` address,
|
||||
since `j` points at `h`.
|
||||
|
||||
Next, `foo()` calls `baz()`, passing `z`:
|
||||
|
||||
| Address | Name | Value |
|
||||
+-----------------+------+----------------+
|
||||
| 2<sup>30</sup> | | 20 |
|
||||
| ... | ... | ... |
|
||||
| 7 | g | 100 |
|
||||
| 6 | f | 4 |
|
||||
| 5 | z | 4 |
|
||||
| 4 | y | 10 |
|
||||
| 3 | x | 0 |
|
||||
| 2 | j | 0 |
|
||||
| 1 | i | 2<sup>30</sup> |
|
||||
| 0 | h | 3 |
|
||||
|
||||
We’ve allocated memory for `f` and `g`. `baz()` is very short, so when it’s
|
||||
over, we get rid of its stack frame:
|
||||
|
||||
| Address | Name | Value |
|
||||
+-----------------+------+----------------+
|
||||
| 2<sup>30</sup> | | 20 |
|
||||
| ... | ... | ... |
|
||||
| 5 | z | 4 |
|
||||
| 4 | y | 10 |
|
||||
| 3 | x | 0 |
|
||||
| 2 | j | 0 |
|
||||
| 1 | i | 2<sup>30</sup> |
|
||||
| 0 | h | 3 |
|
||||
|
||||
Next, `foo()` calls `bar()` with `x` and `z`:
|
||||
|
||||
| Address | Name | Value |
|
||||
+----------------------+------+----------------------+
|
||||
| 2<sup>30</sup> | | 20 |
|
||||
| (2<sup>30</sup>) - 1 | | 5 |
|
||||
| ... | ... | ... |
|
||||
| 10 | e | 4 |
|
||||
| 9 | d | (2<sup>30</sup>) - 1 |
|
||||
| 8 | c | 5 |
|
||||
| 7 | b | 4 |
|
||||
| 6 | a | 0 |
|
||||
| 5 | z | 4 |
|
||||
| 4 | y | 10 |
|
||||
| 3 | x | 0 |
|
||||
| 2 | j | 0 |
|
||||
| 1 | i | 2<sup>30</sup> |
|
||||
| 0 | h | 3 |
|
||||
|
||||
We end up allocating another value on the heap, and so we have to subtract one
|
||||
from 2<sup>30</sup>. It’s easier to just write that than `1,073,741,823`. In any
|
||||
case, we set up the variables as usual.
|
||||
|
||||
At the end of `bar()`, it calls `baz()`:
|
||||
|
||||
| Address | Name | Value |
|
||||
+----------------------+------+----------------------+
|
||||
| 2<sup>30</sup> | | 20 |
|
||||
| (2<sup>30</sup>) - 1 | | 5 |
|
||||
| ... | ... | ... |
|
||||
| 12 | g | 100 |
|
||||
| 11 | f | 4 |
|
||||
| 10 | e | 4 |
|
||||
| 9 | d | (2<sup>30</sup>) - 1 |
|
||||
| 8 | c | 5 |
|
||||
| 7 | b | 4 |
|
||||
| 6 | a | 0 |
|
||||
| 5 | z | 4 |
|
||||
| 4 | y | 10 |
|
||||
| 3 | x | 0 |
|
||||
| 2 | j | 0 |
|
||||
| 1 | i | 2<sup>30</sup> |
|
||||
| 0 | h | 3 |
|
||||
|
||||
With this, we’re at our deepest point! Whew! Congrats for following along this
|
||||
far.
|
||||
|
||||
After `baz()` is over, we get rid of `f` and `g`:
|
||||
|
||||
| Address | Name | Value |
|
||||
+----------------------+------+----------------------+
|
||||
| 2<sup>30</sup> | | 20 |
|
||||
| (2<sup>30</sup>) - 1 | | 5 |
|
||||
| ... | ... | ... |
|
||||
| 10 | e | 4 |
|
||||
| 9 | d | (2<sup>30</sup>) - 1 |
|
||||
| 8 | c | 5 |
|
||||
| 7 | b | 4 |
|
||||
| 6 | a | 0 |
|
||||
| 5 | z | 4 |
|
||||
| 4 | y | 10 |
|
||||
| 3 | x | 0 |
|
||||
| 2 | j | 0 |
|
||||
| 1 | i | 2<sup>30</sup> |
|
||||
| 0 | h | 3 |
|
||||
|
||||
Next, we return from `bar()`. `d` in this case is a `Box<T>`, so it also frees
|
||||
what it points to: (2<sup>30</sup>) - 1.
|
||||
|
||||
| Address | Name | Value |
|
||||
+-----------------+------+----------------+
|
||||
| 2<sup>30</sup> | | 20 |
|
||||
| ... | ... | ... |
|
||||
| 5 | z | 4 |
|
||||
| 4 | y | 10 |
|
||||
| 3 | x | 0 |
|
||||
| 2 | j | 0 |
|
||||
| 1 | i | 2<sup>30</sup> |
|
||||
| 0 | h | 3 |
|
||||
|
||||
And after that, `foo()` returns:
|
||||
|
||||
| Address | Name | Value |
|
||||
+-----------------+------+----------------+
|
||||
| 2<sup>30</sup> | | 20 |
|
||||
| ... | ... | ... |
|
||||
| 2 | j | 0 |
|
||||
| 1 | i | 2<sup>30</sup> |
|
||||
| 0 | h | 3 |
|
||||
|
||||
And then, finally, `main()`, which cleans the rest up. When `i` is `Drop`ped,
|
||||
it will clean up the last of the heap too.
|
||||
|
||||
# What do other languages do?
|
||||
|
||||
Most languages with a garbage collector heap-allocate by default. This means
|
||||
that every value is boxed. There are a number of reasons why this is done, but
|
||||
they’re out of scope for this tutorial. There are some possible optimizations
|
||||
that don’t make it true 100% of the time, too. Rather than relying on the stack
|
||||
and `Drop` to clean up memory, the garbage collector deals with the heap
|
||||
instead.
|
||||
|
||||
# Which to use?
|
||||
|
||||
So if the stack is faster and easier to manage, why do we need the heap? A big
|
||||
reason is that Stack-allocation alone means you only have LIFO semantics for
|
||||
reclaiming storage. Heap-allocation is strictly more general, allowing storage
|
||||
to be taken from and returned to the pool in arbitrary order, but at a
|
||||
complexity cost.
|
||||
|
||||
Generally, you should prefer stack allocation, and so, Rust stack-allocates by
|
||||
default. The LIFO model of the stack is simpler, at a fundamental level. This
|
||||
has two big impacts: runtime efficiency and semantic impact.
|
||||
|
||||
## Runtime Efficiency.
|
||||
|
||||
Managing the memory for the stack is trivial: The machine just
|
||||
increments or decrements a single value, the so-called “stack pointer”.
|
||||
Managing memory for the heap is non-trivial: heap-allocated memory is freed at
|
||||
arbitrary points, and each block of heap-allocated memory can be of arbitrary
|
||||
size, the memory manager must generally work much harder to identify memory for
|
||||
reuse.
|
||||
|
||||
If you’d like to dive into this topic in greater detail, [this paper][wilson]
|
||||
is a great introduction.
|
||||
|
||||
[wilson]: http://www.cs.northwestern.edu/~pdinda/icsclass/doc/dsa.pdf
|
||||
|
||||
## Semantic impact
|
||||
|
||||
Stack-allocation impacts the Rust language itself, and thus the developer’s
|
||||
mental model. The LIFO semantics is what drives how the Rust language handles
|
||||
automatic memory management. Even the deallocation of a uniquely-owned
|
||||
heap-allocated box can be driven by the stack-based LIFO semantics, as
|
||||
discussed throughout this chapter. The flexibility (i.e. expressiveness) of non
|
||||
LIFO-semantics means that in general the compiler cannot automatically infer at
|
||||
compile-time where memory should be freed; it has to rely on dynamic protocols,
|
||||
potentially from outside the language itself, to drive deallocation (reference
|
||||
counting, as used by `Rc<T>` and `Arc<T>`, is one example of this).
|
||||
|
||||
When taken to the extreme, the increased expressive power of heap allocation
|
||||
comes at the cost of either significant runtime support (e.g. in the form of a
|
||||
garbage collector) or significant programmer effort (in the form of explicit
|
||||
memory management calls that require verification not provided by the Rust
|
||||
compiler).
|
||||
|
|
Loading…
Add table
Reference in a new issue