Rollup merge of #22393 - kmcallister:macros-chapter , r=steveklabnik

This is a more introductory document, suitable for Part II. The arcane details move to an "Advanced macros" chapter in Part III.

Conflicts:
	src/doc/trpl/macros.md
This commit is contained in:
Manish Goregaokar 2015-02-17 17:33:53 +05:30
commit 09eb965903
3 changed files with 509 additions and 521 deletions

View file

@ -28,13 +28,14 @@
* [Generics](generics.md)
* [Traits](traits.md)
* [Static and Dynamic Dispatch](static-and-dynamic-dispatch.md)
* [Macros](macros.md)
* [Concurrency](concurrency.md)
* [Error Handling](error-handling.md)
* [Documentation](documentation.md)
* [III: Advanced Topics](advanced.md)
* [FFI](ffi.md)
* [Unsafe Code](unsafe.md)
* [Macros](macros.md)
* [Advanced Macros](advanced-macros.md)
* [Compiler Plugins](plugins.md)
* [Conclusion](conclusion.md)
* [Glossary](glossary.md)

View file

@ -0,0 +1,210 @@
% Advanced macros
This chapter picks up where the [introductory macro chapter](macros.html) left
off.
# Syntactic requirements
Even when Rust code contains un-expanded macros, it can be parsed as a full
syntax tree. This property can be very useful for editors and other tools that
process code. It also has a few consequences for the design of Rust's macro
system.
One consequence is that Rust must determine, when it parses a macro invocation,
whether the macro stands in for
* zero or more items,
* zero or more methods,
* an expression,
* a statement, or
* a pattern.
A macro invocation within a block could stand for some items, or for an
expression / statement. Rust uses a simple rule to resolve this ambiguity. A
macro invocation that stands for items must be either
* delimited by curly braces, e.g. `foo! { ... }`, or
* terminated by a semicolon, e.g. `foo!(...);`
Another consequence of pre-expansion parsing is that the macro invocation must
consist of valid Rust tokens. Furthermore, parentheses, brackets, and braces
must be balanced within a macro invocation. For example, `foo!([)` is
forbidden. This allows Rust to know where the macro invocation ends.
More formally, the macro invocation body must be a sequence of *token trees*.
A token tree is defined recursively as either
* a sequence of token trees surrounded by matching `()`, `[]`, or `{}`, or
* any other single token.
Within a matcher, each metavariable has a *fragment specifier*, identifying
which syntactic form it matches.
* `ident`: an identifier. Examples: `x`; `foo`.
* `path`: a qualified name. Example: `T::SpecialA`.
* `expr`: an expression. Examples: `2 + 2`; `if true then { 1 } else { 2 }`; `f(42)`.
* `ty`: a type. Examples: `i32`; `Vec<(char, String)>`; `&T`.
* `pat`: a pattern. Examples: `Some(t)`; `(17, 'a')`; `_`.
* `stmt`: a single statement. Example: `let x = 3`.
* `block`: a brace-delimited sequence of statements. Example:
`{ log(error, "hi"); return 12; }`.
* `item`: an [item][]. Examples: `fn foo() { }`; `struct Bar;`.
* `meta`: a "meta item", as found in attributes. Example: `cfg(target_os = "windows")`.
* `tt`: a single token tree.
There are additional rules regarding the next token after a metavariable:
* `expr` variables must be followed by one of: `=> , ;`
* `ty` and `path` variables must be followed by one of: `=> , : = > as`
* `pat` variables must be followed by one of: `=> , =`
* Other variables may be followed by any token.
These rules provide some flexibility for Rust's syntax to evolve without
breaking existing macros.
The macro system does not deal with parse ambiguity at all. For example, the
grammar `$($t:ty)* $e:expr` will always fail to parse, because the parser would
be forced to choose between parsing `$t` and parsing `$e`. Changing the
invocation syntax to put a distinctive token in front can solve the problem. In
this case, you can write `$(T $t:ty)* E $e:exp`.
[item]: ../reference.html#items
# Scoping and macro import/export
Macros are expanded at an early stage in compilation, before name resolution.
One downside is that scoping works differently for macros, compared to other
constructs in the language.
Definition and expansion of macros both happen in a single depth-first,
lexical-order traversal of a crate's source. So a macro defined at module scope
is visible to any subsequent code in the same module, which includes the body
of any subsequent child `mod` items.
A macro defined within the body of a single `fn`, or anywhere else not at
module scope, is visible only within that item.
If a module has the `macro_use` attribute, its macros are also visible in its
parent module after the child's `mod` item. If the parent also has `macro_use`
then the macros will be visible in the grandparent after the parent's `mod`
item, and so forth.
The `macro_use` attribute can also appear on `extern crate`. In this context
it controls which macros are loaded from the external crate, e.g.
```rust,ignore
#[macro_use(foo, bar)]
extern crate baz;
```
If the attribute is given simply as `#[macro_use]`, all macros are loaded. If
there is no `#[macro_use]` attribute then no macros are loaded. Only macros
defined with the `#[macro_export]` attribute may be loaded.
To load a crate's macros *without* linking it into the output, use `#[no_link]`
as well.
An example:
```rust
macro_rules! m1 { () => (()) }
// visible here: m1
mod foo {
// visible here: m1
#[macro_export]
macro_rules! m2 { () => (()) }
// visible here: m1, m2
}
// visible here: m1
macro_rules! m3 { () => (()) }
// visible here: m1, m3
#[macro_use]
mod bar {
// visible here: m1, m3
macro_rules! m4 { () => (()) }
// visible here: m1, m3, m4
}
// visible here: m1, m3, m4
# fn main() { }
```
When this library is loaded with `#[macro_use] extern crate`, only `m2` will
be imported.
The Rust Reference has a [listing of macro-related
attributes](../reference.html#macro--and-plugin-related-attributes).
# The variable `$crate`
A further difficulty occurs when a macro is used in multiple crates. Say that
`mylib` defines
```rust
pub fn increment(x: u32) -> u32 {
x + 1
}
#[macro_export]
macro_rules! inc_a {
($x:expr) => ( ::increment($x) )
}
#[macro_export]
macro_rules! inc_b {
($x:expr) => ( ::mylib::increment($x) )
}
# fn main() { }
```
`inc_a` only works within `mylib`, while `inc_b` only works outside the
library. Furthermore, `inc_b` will break if the user imports `mylib` under
another name.
Rust does not (yet) have a hygiene system for crate references, but it does
provide a simple workaround for this problem. Within a macro imported from a
crate named `foo`, the special macro variable `$crate` will expand to `::foo`.
By contrast, when a macro is defined and then used in the same crate, `$crate`
will expand to nothing. This means we can write
```rust
#[macro_export]
macro_rules! inc {
($x:expr) => ( $crate::increment($x) )
}
# fn main() { }
```
to define a single macro that works both inside and outside our library. The
function name will expand to either `::increment` or `::mylib::increment`.
To keep this system simple and correct, `#[macro_use] extern crate ...` may
only appear at the root of your crate, not inside `mod`. This ensures that
`$crate` is a single identifier.
# A final note
Macros, as currently implemented, are not for the faint of heart. Even
ordinary syntax errors can be more difficult to debug when they occur inside a
macro, and errors caused by parse problems in generated code can be very
tricky. Invoking the `log_syntax!` macro can help elucidate intermediate
states, invoking `trace_macros!(true)` will automatically print those
intermediate states out, and passing the flag `--pretty expanded` as a
command-line argument to the compiler will show the result of expansion.
If Rust's macro system can't do what you need, you may want to write a
[compiler plugin](plugins.html) instead. Compared to `macro_rules!`
macros, this is significantly more work, the interfaces are much less stable,
and the warnings about debugging apply ten-fold. In exchange you get the
flexibility of running arbitrary Rust code within the compiler. Syntax
extension plugins are sometimes called *procedural macros* for this reason.

View file

@ -1,588 +1,365 @@
% Macros
# Introduction
By now you've learned about many of the tools Rust provides for abstracting and
reusing code. These units of code reuse have a rich semantic structure. For
example, functions have a type signature, type parameters have trait bounds,
and overloaded functions must belong to a particular trait.
Functions are the primary tool that programmers can use to build abstractions.
Sometimes, however, programmers want to abstract over compile-time syntax
rather than run-time values.
Macros provide syntactic abstraction.
For an example of how this can be useful, consider the following two code fragments,
which both pattern-match on their input and both return early in one case,
doing nothing otherwise:
This structure means that Rust's core abstractions have powerful compile-time
correctness checking. But this comes at the price of reduced flexibility. If
you visually identify a pattern of repeated code, you may find it's difficult
or cumbersome to express that pattern as a generic function, a trait, or
anything else within Rust's semantics.
~~~~
# enum T { SpecialA(u32), SpecialB(u32) }
# fn f() -> u32 {
# let input_1 = T::SpecialA(0);
# let input_2 = T::SpecialA(0);
match input_1 {
T::SpecialA(x) => { return x; }
_ => {}
}
// ...
match input_2 {
T::SpecialB(x) => { return x; }
_ => {}
}
# return 0;
# }
~~~~
Macros allow us to abstract at a *syntactic* level. A macro invocation is
shorthand for an "expanded" syntactic form. This expansion happens early in
compilation, before any static checking. As a result, macros can capture many
patterns of code reuse that Rust's core abstractions cannot.
This code could become tiresome if repeated many times.
However, no function can capture its functionality to make it possible
to abstract the repetition away.
Rust's macro system, however, can eliminate the repetition. Macros are
lightweight custom syntax extensions, themselves defined using the
`macro_rules!` syntax extension. The following `early_return` macro captures
the pattern in the above code:
The drawback is that macro-based code can be harder to understand, because
fewer of the built-in rules apply. Like an ordinary function, a well-behaved
macro can be used without understanding its implementation. However, it can be
difficult to design a well-behaved macro! Additionally, compiler errors in
macro code are harder to interpret, because they describe problems in the
expanded code, not the source-level form that developers use.
~~~~
# enum T { SpecialA(u32), SpecialB(u32) }
# fn f() -> u32 {
# let input_1 = T::SpecialA(0);
# let input_2 = T::SpecialA(0);
macro_rules! early_return {
($inp:expr, $sp:path) => ( // invoke it like `(input_5, SpecialE)`
match $inp {
$sp(x) => { return x; }
_ => {}
}
);
}
// ...
early_return!(input_1, T::SpecialA);
// ...
early_return!(input_2, T::SpecialB);
# return 0;
# }
# fn main() {}
~~~~
These drawbacks make macros something of a "feature of last resort". That's not
to say that macros are bad; they are part of Rust because sometimes they're
needed for truly concise, well-abstracted code. Just keep this tradeoff in
mind.
Macros are defined in pattern-matching style: in the above example, the text
`($inp:expr, $sp:path)` that appears on the left-hand side of the `=>` is the
*macro invocation syntax*, a pattern denoting how to write a call to the
macro. The text on the right-hand side of the `=>`, beginning with `match
$inp`, is the *macro transcription syntax*: what the macro expands to.
# Defining a macro
# Invocation syntax
You may have seen the `vec!` macro, used to initialize a [vector][] with any
number of elements.
The macro invocation syntax specifies the syntax for the arguments to the
macro. It appears on the left-hand side of the `=>` in a macro definition. It
conforms to the following rules:
[vector]: arrays-vectors-and-slices.html
1. It must be surrounded by parentheses.
2. `$` has special meaning (described below).
3. The `()`s, `[]`s, and `{}`s it contains must balance. For example, `([)` is
forbidden.
4. Some arguments can be followed only by a limited set of separators, to
avoid ambiguity (described below).
```rust
let x: Vec<u32> = vec![1, 2, 3];
# assert_eq!(&[1,2,3], &x);
```
Otherwise, the invocation syntax is free-form.
This can't be an ordinary function, because it takes any number of arguments.
But we can imagine it as syntactic shorthand for
To take a fragment of Rust code as an argument, write `$` followed by a name
(for use on the right-hand side), followed by a `:`, followed by a *fragment
specifier*. The fragment specifier denotes the sort of fragment to match. The
most common fragment specifiers are:
```rust
let x: Vec<u32> = {
let mut temp_vec = Vec::new();
temp_vec.push(1);
temp_vec.push(2);
temp_vec.push(3);
temp_vec
};
# assert_eq!(&[1,2,3], &x);
```
* `ident` (an identifier, referring to a variable or item. Examples: `f`, `x`,
`foo`.)
* `expr` (an expression. Examples: `2 + 2`; `if true then { 1 } else { 2 }`;
`f(42)`.)
* `ty` (a type. Examples: `i32`, `Vec<(char, String)>`, `&T`.)
* `path` (a path to struct or enum variant. Example: `T::SpecialA`)
* `pat` (a pattern, usually appearing in a `match` or on the left-hand side of
a declaration. Examples: `Some(t)`; `(17, 'a')`; `_`.)
* `block` (a sequence of actions. Example: `{ log(error, "hi"); return 12; }`)
We can implement this shorthand, using a macro: [^actual]
The parser interprets any token that's not preceded by a `$` literally. Rust's usual
rules of tokenization apply,
[^actual]: The actual definition of `vec!` in libcollections differs from the
one presented here, for reasons of efficiency and reusability. Some
of these are mentioned in the [advanced macros chapter][].
So `($x:ident -> (($e:expr)))`, though excessively fancy, would designate a macro
that could be invoked like: `my_macro!(i->(( 2+2 )))`.
To avoid ambiguity, macro invocation syntax must conform to the following rules:
* `expr` must be followed by `=>`, `,` or `;`.
* `ty` and `path` must be followed by `=>`, `,`, `:`, `=`, `>` or `as`.
* `pat` must be followed by `=>`, `,` or `=`.
* `ident` and `block` can be followed by any token.
## Invocation location
A macro invocation may take the place of (and therefore expand to) an
expression, item, statement, or pattern. The Rust parser will parse the macro
invocation as a "placeholder" for whichever syntactic form is appropriate for
the location.
At expansion time, the output of the macro will be parsed as whichever of the
three nonterminals it stands in for. This means that a single macro might,
for example, expand to an item or an expression, depending on its arguments
(and cause a syntax error if it is called with the wrong argument for its
location). Although this behavior sounds excessively dynamic, it is known to
be useful under some circumstances.
# Transcription syntax
The right-hand side of the `=>` follows the same rules as the left-hand side,
except that a `$` need only be followed by the name of the syntactic fragment
to transcribe into the macro expansion; its type need not be repeated.
The right-hand side must be enclosed by delimiters, which the transcriber ignores.
Therefore `() => ((1,2,3))` is a macro that expands to a tuple expression,
`() => (let $x=$val)` is a macro that expands to a statement,
and `() => (1,2,3)` is a macro that expands to a syntax error
(since the transcriber interprets the parentheses on the right-hand-size as delimiters,
and `1,2,3` is not a valid Rust expression on its own).
Except for permissibility of `$name` (and `$(...)*`, discussed below), the
right-hand side of a macro definition is ordinary Rust syntax. In particular,
macro invocations (including invocations of the macro currently being defined)
are permitted in expression, statement, and item locations. However, nothing
else about the code is examined or executed by the macro system; execution
still has to wait until run-time.
## Interpolation location
The interpolation `$argument_name` may appear in any location consistent with
its fragment specifier (i.e., if it is specified as `ident`, it may be used
anywhere an identifier is permitted).
# Multiplicity
## Invocation
Going back to the motivating example, recall that `early_return` expanded into
a `match` that would `return` if the `match`'s scrutinee matched the
"special case" identifier provided as the second argument to `early_return`,
and do nothing otherwise. Now suppose that we wanted to write a
version of `early_return` that could handle a variable number of "special"
cases.
The syntax `$(...)*` on the left-hand side of the `=>` in a macro definition
accepts zero or more occurrences of its contents. It works much
like the `*` operator in regular expressions. It also supports a
separator token (a comma-separated list could be written `$(...),*`), and `+`
instead of `*` to mean "at least one."
~~~~
# enum T { SpecialA(u32), SpecialB(u32), SpecialC(u32), SpecialD(u32) }
# fn f() -> u32 {
# let input_1 = T::SpecialA(0);
# let input_2 = T::SpecialA(0);
macro_rules! early_return {
($inp:expr, [ $($sp:path),+ ]) => (
match $inp {
```rust
macro_rules! vec {
( $( $x:expr ),* ) => {
{
let mut temp_vec = Vec::new();
$(
$sp(x) => { return x; }
)+
_ => {}
temp_vec.push($x);
)*
temp_vec
}
)
};
}
// ...
early_return!(input_1, [T::SpecialA,T::SpecialC,T::SpecialD]);
// ...
early_return!(input_2, [T::SpecialB]);
# return 0;
# }
# fn main() {}
~~~~
### Transcription
As the above example demonstrates, `$(...)*` is also valid on the right-hand
side of a macro definition. The behavior of `*` in transcription,
especially in cases where multiple `*`s are nested, and multiple different
names are involved, can seem somewhat magical and unintuitive at first. The
system that interprets them is called "Macro By Example." The two rules to
keep in mind are (1) the behavior of `$(...)*` is to walk through one "layer"
of repetitions for all of the `$name`s it contains in lockstep, and (2) each
`$name` must be under at least as many `$(...)*`s as it was matched against.
If it is under more, it'll be repeated, as appropriate.
## Parsing limitations
For technical reasons, there are two limitations to the treatment of syntax
fragments by the macro parser:
1. The parser will always parse as much as possible of a Rust syntactic
fragment. For example, if the comma were omitted from the syntax of
`early_return!` above, `input_1 [` would've been interpreted as the beginning
of an array index. In fact, invoking the macro would have been impossible.
2. The parser must have eliminated all ambiguity by the time it reaches a
`$name:fragment_specifier` declaration. This limitation can result in parse
errors when declarations occur at the beginning of, or immediately after,
a `$(...)*`. For example, the grammar `$($t:ty)* $e:expr` will always fail to
parse because the parser would be forced to choose between parsing `t` and
parsing `e`. Changing the invocation syntax to require a distinctive token in
front can solve the problem. In the above example, `$(T $t:ty)* E $e:exp`
solves the problem.
# Macro argument pattern matching
## Motivation
Now consider code like the following:
~~~~
# enum T1 { Good1(T2, u32), Bad1}
# struct T2 { body: T3 }
# enum T3 { Good2(u32), Bad2}
# fn f(x: T1) -> u32 {
match x {
T1::Good1(g1, val) => {
match g1.body {
T3::Good2(result) => {
// complicated stuff goes here
return result + val;
},
_ => panic!("Didn't get good_2")
}
}
_ => return 0 // default value
}
# }
# fn main() {}
~~~~
All the complicated stuff is deeply indented, and the error-handling code is
separated from matches that fail. We'd like to write a macro that performs
a match, but with a syntax that suits the problem better. The following macro
can solve the problem:
~~~~
macro_rules! biased_match {
// special case: `let (x) = ...` is illegal, so use `let x = ...` instead
( ($e:expr) -> ($p:pat) else $err:stmt ;
binds $bind_res:ident
) => (
let $bind_res = match $e {
$p => ( $bind_res ),
_ => { $err }
};
);
// more than one name; use a tuple
( ($e:expr) -> ($p:pat) else $err:stmt ;
binds $( $bind_res:ident ),*
) => (
let ( $( $bind_res ),* ) = match $e {
$p => ( $( $bind_res ),* ),
_ => { $err }
};
)
}
# enum T1 { Good1(T2, u32), Bad1}
# struct T2 { body: T3 }
# enum T3 { Good2(u32), Bad2}
# fn f(x: T1) -> u32 {
biased_match!((x) -> (T1::Good1(g1, val)) else { return 0 };
binds g1, val );
biased_match!((g1.body) -> (T3::Good2(result) )
else { panic!("Didn't get good_2") };
binds result );
// complicated stuff goes here
return result + val;
# }
# fn main() {}
~~~~
This solves the indentation problem. But if we have a lot of chained matches
like this, we might prefer to write a single macro invocation. The input
pattern we want is clear:
~~~~
# fn main() {}
# macro_rules! b {
( $( ($e:expr) -> ($p:pat) else $err:stmt ; )*
binds $( $bind_res:ident ),*
)
# => (0) }
~~~~
However, it's not possible to directly expand to nested match statements. But
there is a solution.
## The recursive approach to macro writing
A macro may accept multiple different input grammars. The first one to
successfully match the actual argument to a macro invocation is the one that
"wins."
In the case of the example above, we want to write a recursive macro to
process the semicolon-terminated lines, one-by-one. So, we want the following
input patterns:
~~~~
# macro_rules! b {
( binds $( $bind_res:ident ),* )
# => (0) }
# fn main() {}
~~~~
...and:
~~~~
# fn main() {}
# macro_rules! b {
( ($e :expr) -> ($p :pat) else $err :stmt ;
$( ($e_rest:expr) -> ($p_rest:pat) else $err_rest:stmt ; )*
binds $( $bind_res:ident ),*
)
# => (0) }
~~~~
The resulting macro looks like this. Note that the separation into
`biased_match!` and `biased_match_rec!` occurs only because we have an outer
piece of syntax (the `let`) which we only want to transcribe once.
~~~~
# fn main() {
macro_rules! biased_match_rec {
// Handle the first layer
( ($e :expr) -> ($p :pat) else $err :stmt ;
$( ($e_rest:expr) -> ($p_rest:pat) else $err_rest:stmt ; )*
binds $( $bind_res:ident ),*
) => (
match $e {
$p => {
// Recursively handle the next layer
biased_match_rec!($( ($e_rest) -> ($p_rest) else $err_rest ; )*
binds $( $bind_res ),*
)
}
_ => { $err }
}
);
// Produce the requested values
( binds $( $bind_res:ident ),* ) => ( ($( $bind_res ),*) )
}
// Wrap the whole thing in a `let`.
macro_rules! biased_match {
// special case: `let (x) = ...` is illegal, so use `let x = ...` instead
( $( ($e:expr) -> ($p:pat) else $err:stmt ; )*
binds $bind_res:ident
) => (
let $bind_res = biased_match_rec!(
$( ($e) -> ($p) else $err ; )*
binds $bind_res
);
);
// more than one name: use a tuple
( $( ($e:expr) -> ($p:pat) else $err:stmt ; )*
binds $( $bind_res:ident ),*
) => (
let ( $( $bind_res ),* ) = biased_match_rec!(
$( ($e) -> ($p) else $err ; )*
binds $( $bind_res ),*
);
)
}
# enum T1 { Good1(T2, u32), Bad1}
# struct T2 { body: T3 }
# enum T3 { Good2(u32), Bad2}
# fn f(x: T1) -> u32 {
biased_match!(
(x) -> (T1::Good1(g1, val)) else { return 0 };
(g1.body) -> (T3::Good2(result) ) else { panic!("Didn't get Good2") };
binds val, result );
// complicated stuff goes here
return result + val;
# assert_eq!(&[1,2,3], &vec![1,2,3]);
# }
# }
~~~~
```
This technique applies to many cases where transcribing a result all at once is not possible.
The resulting code resembles ordinary functional programming in some respects,
but has some important differences from functional programming.
Whoa, that's a lot of new syntax! Let's break it down.
The first difference is important, but also easy to forget: the transcription
(right-hand) side of a `macro_rules!` rule is literal syntax, which can only
be executed at run-time. If a piece of transcription syntax does not itself
appear inside another macro invocation, it will become part of the final
program. If it is inside a macro invocation (for example, the recursive
invocation of `biased_match_rec!`), it does have the opportunity to affect
transcription, but only through the process of attempted pattern matching.
```ignore
macro_rules! vec { ... }
```
The second, related, difference is that the evaluation order of macros feels
"backwards" compared to ordinary programming. Given an invocation
`m1!(m2!())`, the expander first expands `m1!`, giving it as input the literal
syntax `m2!()`. If it transcribes its argument unchanged into an appropriate
position (in particular, not as an argument to yet another macro invocation),
the expander will then proceed to evaluate `m2!()` (along with any other macro
invocations `m1!(m2!())` produced).
This says we're defining a macro named `vec`, much as `fn vec` would define a
function named `vec`. In prose, we informally write a macro's name with an
exclamation point, e.g. `vec!`. The exclamation point is part of the invocation
syntax and serves to distinguish a macro from an ordinary function.
# Hygiene
## Matching
To prevent clashes, rust implements
[hygienic macros](http://en.wikipedia.org/wiki/Hygienic_macro).
The macro is defined through a series of *rules*, which are pattern-matching
cases. Above, we had
As an example, `loop` and `for-loop` labels (discussed in the lifetimes guide)
will not clash. The following code will print "Hello!" only once:
```ignore
( $( $x:expr ),* ) => { ... };
```
~~~
macro_rules! loop_x {
($e: expr) => (
// $e will not interact with this 'x
'x: loop {
println!("Hello!");
$e
}
);
This is like a `match` expression arm, but the matching happens on Rust syntax
trees, at compile time. The semicolon is optional on the last (here, only)
case. The "pattern" on the left-hand side of `=>` is known as a *matcher*.
These have [their own little grammar] within the language.
[their own little grammar]: ../reference.html#macros
The matcher `$x:expr` will match any Rust expression, binding that syntax tree
to the *metavariable* `$x`. The identifier `expr` is a *fragment specifier*;
the full possibilities are enumerated in the [advanced macros chapter][].
Surrounding the matcher with `$(...),*` will match zero or more expressions,
separated by commas.
Aside from the special matcher syntax, any Rust tokens that appear in a matcher
must match exactly. For example,
```rust
macro_rules! foo {
(x => $e:expr) => (println!("mode X: {}", $e));
(y => $e:expr) => (println!("mode Y: {}", $e));
}
fn main() {
'x: loop {
loop_x!(break 'x);
println!("I am never printed.");
}
foo!(y => 3);
}
~~~
```
The two `'x` names did not clash, which would have caused the loop
to print "I am never printed" and to run forever.
will print
# Scoping and macro import/export
```text
mode Y: 3
```
Macros are expanded at an early stage in compilation, before name resolution.
One downside is that scoping works differently for macros, compared to other
constructs in the language.
Definition and expansion of macros both happen in a single depth-first,
lexical-order traversal of a crate's source. So a macro defined at module scope
is visible to any subsequent code in the same module, which includes the body
of any subsequent child `mod` items.
A macro defined within the body of a single `fn`, or anywhere else not at
module scope, is visible only within that item.
If a module has the `macro_use` attribute, its macros are also visible in its
parent module after the child's `mod` item. If the parent also has `macro_use`
then the macros will be visible in the grandparent after the parent's `mod`
item, and so forth.
The `macro_use` attribute can also appear on `extern crate`. In this context
it controls which macros are loaded from the external crate, e.g.
With
```rust,ignore
#[macro_use(foo, bar)]
extern crate baz;
foo!(z => 3);
```
If the attribute is given simply as `#[macro_use]`, all macros are loaded. If
there is no `#[macro_use]` attribute then no macros are loaded. Only macros
defined with the `#[macro_export]` attribute may be loaded.
we get the compiler error
To load a crate's macros *without* linking it into the output, use `#[no_link]`
as well.
```text
error: no rules expected the token `z`
```
An example:
## Expansion
The right-hand side of a macro rule is ordinary Rust syntax, for the most part.
But we can splice in bits of syntax captured by the matcher. From the original
example:
```ignore
$(
temp_vec.push($x);
)*
```
Each matched expression `$x` will produce a single `push` statement in the
macro expansion. The repetition in the expansion proceeds in "lockstep" with
repetition in the matcher (more on this in a moment).
Because `$x` was already declared as matching an expression, we don't repeat
`:expr` on the right-hand side. Also, we don't include a separating comma as
part of the repetition operator. Instead, we have a terminating semicolon
within the repeated block.
Another detail: the `vec!` macro has *two* pairs of braces on the right-hand
side. They are often combined like so:
```ignore
macro_rules! foo {
() => {{
...
}}
}
```
The outer braces are part of the syntax of `macro_rules!`. In fact, you can use
`()` or `[]` instead. They simply delimit the right-hand side as a whole.
The inner braces are part of the expanded syntax. Remember, the `vec!` macro is
used in an expression context. To write an expression with multiple statements,
including `let`-bindings, we use a block. If your macro expands to a single
expression, you don't need this extra layer of braces.
Note that we never *declared* that the macro produces an expression. In fact,
this is not determined until we use the macro as an expression. With care, you
can write a macro whose expansion works in several contexts. For example,
shorthand for a data type could be valid as either an expression or a pattern.
## Repetition
The repetition behavior can seem somewhat magical, especially when multiple
names are bound at multiple nested levels of repetition. The two rules to keep
in mind are:
1. the behavior of `$(...)*` is to walk through one "layer" of repetitions, for
all of the `$name`s it contains, in lockstep, and
2. each `$name` must be under at least as many `$(...)*`s as it was matched
against. If it is under more, it'll be duplicated, as appropriate.
This baroque macro illustrates the duplication of variables from outer
repetition levels.
```rust
macro_rules! m1 { () => (()) }
// visible here: m1
mod foo {
// visible here: m1
#[macro_export]
macro_rules! m2 { () => (()) }
// visible here: m1, m2
macro_rules! o_O {
(
$(
$x:expr; [ $( $y:expr ),* ]
);*
) => {
&[ $($( $x + $y ),*),* ]
}
}
// visible here: m1
fn main() {
let a: &[i32]
= o_O!(10; [1, 2, 3];
20; [4, 5, 6]);
macro_rules! m3 { () => (()) }
// visible here: m1, m3
#[macro_use]
mod bar {
// visible here: m1, m3
macro_rules! m4 { () => (()) }
// visible here: m1, m3, m4
assert_eq!(a, [11, 12, 13, 24, 25, 26]);
}
// visible here: m1, m3, m4
# fn main() { }
```
When this library is loaded with `#[use_macros] extern crate`, only `m2` will
be imported.
That's most of the matcher syntax. These examples use `$(...)*`, which is a
"zero or more" match. Alternatively you can write `$(...)+` for a "one or
more" match. Both forms optionally include a separator, which can be any token
except `+` or `*`.
The Rust Reference has a [listing of macro-related
attributes](../reference.html#macro--and-plugin-related-attributes).
# Hygiene
# The variable `$crate`
Some languages implement macros using simple text substitution, which leads to
various problems. For example, this C program prints `13` instead of the
expected `25`.
A further difficulty occurs when a macro is used in multiple crates. Say that
`mylib` defines
```text
#define FIVE_TIMES(x) 5 * x
int main() {
printf("%d\n", FIVE_TIMES(2 + 3));
return 0;
}
```
After expansion we have `5 * 2 + 3`, and multiplication has greater precedence
than addition. If you've used C macros a lot, you probably know the standard
idioms for avoiding this problem, as well as five or six others. In Rust, we
don't have to worry about it.
```rust
pub fn increment(x: u32) -> u32 {
x + 1
macro_rules! five_times {
($x:expr) => (5 * $x);
}
#[macro_export]
macro_rules! inc_a {
($x:expr) => ( ::increment($x) )
fn main() {
assert_eq!(25, five_times!(2 + 3));
}
#[macro_export]
macro_rules! inc_b {
($x:expr) => ( ::mylib::increment($x) )
}
# fn main() { }
```
`inc_a` only works within `mylib`, while `inc_b` only works outside the
library. Furthermore, `inc_b` will break if the user imports `mylib` under
another name.
The metavariable `$x` is parsed as a single expression node, and keeps its
place in the syntax tree even after substitution.
Rust does not (yet) have a hygiene system for crate references, but it does
provide a simple workaround for this problem. Within a macro imported from a
crate named `foo`, the special macro variable `$crate` will expand to `::foo`.
By contrast, when a macro is defined and then used in the same crate, `$crate`
will expand to nothing. This means we can write
Another common problem in macro systems is *variable capture*. Here's a C
macro, using [a GNU C extension] to emulate Rust's expression blocks.
[a GNU C extension]: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html
```text
#define LOG(msg) ({ \
int state = get_log_state(); \
if (state > 0) { \
printf("log(%d): %s\n", state, msg); \
} \
})
```
This looks reasonable, but watch what happens in this example:
```text
const char *state = "reticulating splines";
LOG(state);
```
The program will likely segfault, after it tries to execute
```text
printf("log(%d): %s\n", state, state);
```
The equivalent Rust macro has the desired behavior.
```rust
#[macro_export]
macro_rules! inc {
($x:expr) => ( $crate::increment($x) )
# fn get_log_state() -> i32 { 3 }
macro_rules! log {
($msg:expr) => {{
let state: i32 = get_log_state();
if state > 0 {
println!("log({}): {}", state, $msg);
}
}};
}
fn main() {
let state: &str = "reticulating splines";
log!(state);
}
# fn main() { }
```
to define a single macro that works both inside and outside our library. The
function name will expand to either `::increment` or `::mylib::increment`.
This works because Rust has a [hygienic macro system][]. Each macro expansion
happens in a distinct *syntax context*, and each variable is tagged with the
syntax context where it was introduced. It's as though the variable `state`
inside `main` is painted a different "color" from the variable `state` inside
the macro, and therefore they don't conflict.
To keep this system simple and correct, `#[macro_use] extern crate ...` may
only appear at the root of your crate, not inside `mod`. This ensures that
`$crate` is a single identifier.
[hygienic macro system]: http://en.wikipedia.org/wiki/Hygienic_macro
# A final note
This also restricts the ability of macros to introduce new bindings at the
invocation site. Code such as the following will not work:
Macros, as currently implemented, are not for the faint of heart. Even
ordinary syntax errors can be more difficult to debug when they occur inside a
macro, and errors caused by parse problems in generated code can be very
tricky. Invoking the `log_syntax!` macro can help elucidate intermediate
states, invoking `trace_macros!(true)` will automatically print those
intermediate states out, and passing the flag `--pretty expanded` as a
command-line argument to the compiler will show the result of expansion.
```rust,ignore
macro_rules! foo {
() => (let x = 3);
}
If Rust's macro system can't do what you need, you may want to write a
[compiler plugin](plugins.html) instead. Compared to `macro_rules!`
macros, this is significantly more work, the interfaces are much less stable,
and the warnings about debugging apply ten-fold. In exchange you get the
flexibility of running arbitrary Rust code within the compiler. Syntax
extension plugins are sometimes called *procedural macros* for this reason.
fn main() {
foo!();
println!("{}", x);
}
```
Instead you need to pass the variable name into the invocation, so it's tagged
with the right syntax context.
```rust
macro_rules! foo {
($v:ident) => (let $v = 3);
}
fn main() {
foo!(x);
println!("{}", x);
}
```
This holds for `let` bindings and loop labels, but not for [items][].
So the following code does compile:
```rust
macro_rules! foo {
() => (fn x() { });
}
fn main() {
foo!();
x();
}
```
[items]: ../reference.html#items
# Further reading
The [advanced macros chapter][] goes into more detail about macro syntax. It
also describes how to share macros between different modules or crates.
[advanced macros chapter]: advanced-macros.html