Commit graph

64 commits

Author SHA1 Message Date
Michael Howell
fa2e214a43 rustdoc-search: add standalone trailing :: test
Follow up for #132569
2024-11-17 08:07:16 -07:00
bors
e84902d35a Auto merge of #133047 - matthiaskrgr:rollup-9se1vth, r=matthiaskrgr
Rollup of 4 pull requests

Successful merges:

 - #128197 (Skip locking span interner for some syntax context checks)
 - #133040 ([rustdoc] Fix handling of footnote reference in footnote definition)
 - #133043 (rustdoc-search: case-sensitive only when capitals are used)
 - #133046 (Clippy subtree update)

r? `@ghost`
`@rustbot` modify labels: rollup
2024-11-14 21:09:28 +00:00
Michael Howell
32500aa8e0 rustdoc-search: case-sensitive only when capitals are used
This is the "smartcase" behavior, described by vim and dtolnay.
2024-11-14 11:10:14 -07:00
Michael Howell
86da4be47f rustdoc: use a trie for name-based search
Preview and profiler results
----------------------------

Here's some quick profiling in Firefox done on the rust compiler docs:

- Before: https://share.firefox.dev/3UPm3M8
- After: https://share.firefox.dev/40LXvYb

Here's the results for the node.js profiler:

- https://notriddle.com/rustdoc-html-demo-15/trie-perf/index.html

Here's a copy that you can use to try it out. Compare it with [the nightly].
Try typing `typecheckercontext` one character at a time, slowly.

- https://notriddle.com/rustdoc-html-demo-15/compiler-doc-trie/index.html

[the nightly]: https://doc.rust-lang.org/nightly/nightly-rustc/

The fuzzy match algo is based on [Fast String Correction with
Levenshtein-Automata] and the corresponding implementation code in [moman]
and [Lucene]; the bit-packing representation comes from Lucene, but the
actual matcher is more based on `fsc.py`. As suggested in the paper, a
trie is used to represent the FSA dictionary.

The same trie is used for prefix matching. Substring matching is done with a
side table of three-character[^1] windows that point into the trie.

[Fast String Correction with Levenshtein-Automata]: https://github.com/tpn/pdfs/blob/master/Fast%20String%20Correction%20with%20Levenshtein-Automata%20(2002)%20(10.1.1.16.652).pdf
[Lucene]: https://fossies.org/linux/lucene/lucene/core/src/java/org/apache/lucene/util/automaton/Lev1TParametricDescription.java
[moman]: https://gitlab.com/notriddle/moman-rustdoc

User-visible changes
--------------------

I don't expect anybody to notice anything, but it does cause two changes:

- Substring matches, in the middle of a name, only apply if there's three
  or more characters in the search query.
- Levenshtein distance limit now maxes out at two. In the old version,
  the limit was w/3, so you could get looser matches for queries with
  9 or more characters[^1] in them.

[^1]: technically utf-16 code units
2024-11-13 12:04:46 -07:00
Michael Howell
9900ea48b5 Adjust ranking so that duplicates count against rank 2024-10-31 13:12:14 -07:00
Michael Howell
12dc24f460 rustdoc-search: simplify rules for generics and type params
This commit is a response to feedback on the displayed type
signatures results, by making generics act stricter.

Generics are tightened by making order significant. This means
`Vec<Allocator>` now matches only with a true vector of allocators,
instead of matching the second type param. It also makes unboxing
within generics stricter, so `Result<A, B>` only matches if `B`
is in the error type and `A` is in the success type. The top level
of the function search is unaffected.

Find the discussion on:

* <https://rust-lang.zulipchat.com/#narrow/stream/393423-t-rustdoc.2Fmeetings/topic/meeting.202024-07-08/near/449965149>
* <https://github.com/rust-lang/rust/pull/124544#issuecomment-2204272265>
* <https://rust-lang.zulipchat.com/#narrow/channel/266220-t-rustdoc/topic/deciding.20on.20semantics.20of.20generics.20in.20rustdoc.20search/near/476841363>
2024-10-30 12:27:48 -07:00
Michael Howell
20a4b4fea1 rustdoc-search: show types signatures in results 2024-10-30 10:35:39 -07:00
Michael Howell
5c7e7dfe10 rustdoc-search: pass original names through AST 2024-10-30 10:35:38 -07:00
binarycat
09773b4f24 allow type-based search on foreign functions
fixes https://github.com/rust-lang/rust/issues/131804
2024-10-25 12:19:04 -05:00
Michael Howell
3699e939e8 rustdoc-search: allow trailing Foo -> arg search 2024-09-05 17:58:05 -07:00
Noah Lev
dac7f20e13 Add test for Self not being a generic in search index 2024-08-04 12:49:28 -07:00
Michael Howell
8865b8c639 rustdoc-search: use lowercase, non-normalized name for type search
The type name ID map has underscores in its names, so the query
element should have them, too.
2024-06-09 11:56:52 -07:00
Sunshine
ceaa42b817 Update tests 2024-06-07 11:55:52 +08:00
Sunshine
f9f51839e5 Tidying 2024-06-07 06:09:30 +08:00
Sunshine
dd5103bb68 Add test for PR #126057 2024-06-07 05:49:46 +08:00
Nicholas Nethercote
c6fb703c05 rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
Michael Howell
3c4e180e68 rustdoc-search: add parser for & syntax 2024-04-19 14:31:21 -07:00
Michael Howell
8b47f67817 rustdoc-search: add index of borrow references 2024-04-19 14:31:21 -07:00
Michael Howell
f36c5af359 rustdoc-search: single result for items with multiple paths
This change uses the same "exact" paths as trait implementors
and type alias inlining to track items with multiple
reachable paths. This way, if you search for `vec`, you get
only the `std` exports of it, and not the one from `alloc`.

It still includes all the items in the search index so that
you can search for them by all available paths. For example,
try `core::option` and `std::option`, and notice that the
results page doesn't show duplicates, but still shows all
the items in their respective crates.
2024-04-08 17:07:14 -07:00
Matthias Krüger
a95e2f999a
Rollup merge of #122247 - notriddle:notriddle/search-unbox-limit, r=GuillaumeGomez
rustdoc-search: depth limit `T<U>` -> `U` unboxing

Profiler output:
https://notriddle.com/rustdoc-html-demo-9/search-unbox-limit/ (the only significant change is that one of the `rust` tests went from 378416ms to 16ms).

This is a performance enhancement aimed at a problem I found while using type-driven search on the Rust compiler. It is caused by [`Interner`], a trait with 41 associated types, many of which recurse back to `Self` again.

This caused search.js to struggle. It eventually terminates, after about 10 minutes of turning my PC into a space header, but it's doing `41!` unifications and that's too slow.

[`Interner`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/trait.Interner.html
2024-03-14 15:44:32 +01:00
Michael Howell
7b926555b7 rustdoc-search: add search query syntax Fn(T) -> U
This is implemented, in addition to the ML-style one,
because Rust does it. If we don't, we'll never hear the end of it.

This commit also refactors some duplicate parts of the parser
into a dedicated function.
2024-03-11 22:27:22 -07:00
Michael Howell
7f427f86bd rustdoc-search: parse and search with ML-style HOF
Option::map, for example, looks like this:

    option<t>, (t -> u) -> option<u>

This syntax searches all of the HOFs in Rust: traits Fn, FnOnce,
and FnMut, and bare fn primitives.
2024-03-11 21:22:03 -07:00
Michael Howell
fa5b9f0923 rustdoc-search: stress test for associated types 2024-03-11 09:20:49 -07:00
许杰友 Jieyou Xu (Joe)
6e48b96692
[AUTO_GENERATED] Migrate compiletest to use ui_test-style //@ directives 2024-02-22 16:04:04 +00:00
Matthias Krüger
d5fd88cb85
Rollup merge of #118194 - notriddle:notriddle/tuple-unit, r=GuillaumeGomez
rustdoc: search for tuples and unit by type with `()`

This feature extends rustdoc to support the syntax that most users will naturally attempt to use to search for tuples. Part of https://github.com/rust-lang/rust/issues/60485

Function signature searches already support tuples and unit. The explicit name `primitive:tuple` and `primitive:unit` can be used to match a tuple or unit, while `()` will match either one. It also follows the direction set by the actual language for parens as a group, so `(u8,)` will only match a tuple, while `(u8)` will match a plain, unwrapped byte—thanks to loose search semantics, it will also match the tuple.

## Preview

* [`option<t>, option<u> -> (t, u)`](<https://notriddle.com/rustdoc-html-demo-5/tuple-unit/std/index.html?search=option%3Ct%3E%2C option%3Cu%3E -%3E (t%2C u)>)
* [`[t] -> (t,)`](<https://notriddle.com/rustdoc-html-demo-5/tuple-unit/std/index.html?search=[t] -%3E (t%2C)>)
* [`(ipaddr,) -> socketaddr`](<https://notriddle.com/rustdoc-html-demo-5/tuple-unit/std/index.html?search=(ipaddr%2C) -%3E socketaddr>)

## Motivation

When type-based search was first landed, it was directly [described as incomplete][a comment].

[a comment]: https://github.com/rust-lang/rust/pull/23289#issuecomment-79437386

Filling out the missing functionality is going to mean adding support for more of Rust's [type expression] syntax, such as tuples (in this PR), references, raw pointers, function pointers, and closures.

[type expression]: https://doc.rust-lang.org/reference/types.html#type-expressions

There does seem to be demand for this sort of thing, such as [this Discord message](https://discord.com/channels/442252698964721669/443150878111694848/1042145740065099796) expressing regret at rustdoc not supporting tuples in search queries.

## Reference description (from the Rustdoc book)

<table>
<thead>
  <tr>
    <th>Shorthand</th>
    <th>Explicit names</th>
  </tr>
</thead>
<tbody>
  <tr><td colspan="2">Before this PR</td></tr>
  <tr>
    <td><code>[]</code></td>
    <td><code>primitive:slice</code> and/or <code>primitive:array</code></td>
  </tr>
  <tr>
    <td><code>[T]</code></td>
    <td><code>primitive:slice&lt;T&gt;</code> and/or <code>primitive:array&lt;T&gt;</code></td>
  </tr>
  <tr>
    <td><code>!</code></td>
    <td><code>primitive:never</code></td>
  </tr>
  <tr><td colspan="2">After this PR</td></tr>
  <tr>
    <td><code>()</code></td>
    <td><code>primitive:unit</code> and/or <code>primitive:tuple</code></td>
  </tr>
  <tr>
    <td><code>(T)</code></td>
    <td><code>T</code></td>
  </tr>
  <tr>
    <td><code>(T,)</code></td>
    <td><code>primitive:tuple&lt;T&gt;</code></td>
  </tr>
</tbody>
</table>

A single type expression wrapped in parens is the same as that type expression, since parens act as the grouping operator. If they're empty, though, they will match both `unit` and `tuple`, and if there's more than one type (or a trailing or leading comma) it is the same as `primitive:tuple<...>`.

However, since items can be left out of the query, `(T)` will still return results for types that match tuples, even though it also matches the type on its own. That is, `(u32)` matches `(u32,)` for the exact same reason that it also matches `Result<u32, Error>`.

## Future direction

The [type expression grammar](https://doc.rust-lang.org/reference/types.html#type-expressions) from the Reference is given below:

<pre><code>Syntax
    Type :
        TypeNoBounds
        | <a href="https://doc.rust-lang.org/reference/types/impl-trait.html">ImplTraitType</a>
        | <a href="https://doc.rust-lang.org/reference/types/trait-object.html">TraitObjectType</a>
<br>
    TypeNoBounds :
        <a href="https://doc.rust-lang.org/reference/types.html#parenthesized-types">ParenthesizedType</a>
        | <a href="https://doc.rust-lang.org/reference/types/impl-trait.html">ImplTraitTypeOneBound</a>
        | <a href="https://doc.rust-lang.org/reference/types/trait-object.html">TraitObjectTypeOneBound</a>
        | <a href="https://doc.rust-lang.org/reference/paths.html#paths-in-types">TypePath</a>
        | <a href="https://doc.rust-lang.org/reference/types/tuple.html#tuple-types">TupleType</a>
        | <a href="https://doc.rust-lang.org/reference/types/never.html">NeverType</a>
        | <a href="https://doc.rust-lang.org/reference/types/pointer.html#raw-pointers-const-and-mut">RawPointerType</a>
        | <a href="https://doc.rust-lang.org/reference/types/pointer.html#shared-references-">ReferenceType</a>
        | <a href="https://doc.rust-lang.org/reference/types/array.html">ArrayType</a>
        | <a href="https://doc.rust-lang.org/reference/types/slice.html">SliceType</a>
        | <a href="https://doc.rust-lang.org/reference/types/inferred.html">InferredType</a>
        | <a href="https://doc.rust-lang.org/reference/paths.html#qualified-paths">QualifiedPathInType</a>
        | <a href="https://doc.rust-lang.org/reference/types/function-pointer.html">BareFunctionType</a>
        | <a href="https://doc.rust-lang.org/reference/macros.html#macro-invocation">MacroInvocation</a>
</code></pre>

ImplTraitType and TraitObjectType (and ImplTraitTypeOneBound and TraitObjectTypeOneBound) are not yet implemented. They would mostly desugar to `trait:`, similarly to how `!` desugars to `primitive:never`.

ParenthesizedType and TuplePath are added in this PR.

TypePath is already implemented (except const generics, which is not planned, and function-like trait syntax, which is planned as part of closure support).

NeverType is already implemented.

RawPointerType and ReferenceType require parsing and fixes to the search index to store this information, but otherwise their behavior seems simple enough. Just like tuples and slices, `&T` would be equivalent to `primitive:reference<T>`, `&mut T` would be equivalent to `primitive:reference<keyword:mut, T>`, `*T` would be equivalent to `primitive:pointer<T>`, `*mut T` would be equivalent to `primitive:pointer<keyword:mut, T>`, and `*const T` would be equivalent to `primitive:pointer<keyword:const, T>`. Lifetime generics support is not planned, because lifetime subtyping seems too complicated.

ArrayType is subsumed by SliceType right now. Implementing const generics is not planned, because it seems like it would require a lot of implementation complexity for not much gain.

InferredType isn't really covered right now. Its semantics in a search context are not obvious.

QualifiedPathInType is not implemented, and it is not planned. I would need a use case to justify it, and act as a guide for what the exact semantics should be.

BareFunctionType is not implemented. Along with function-like trait syntax, which is formally considered a TypePath, it's the biggest missing feature to be able to do structured searches over generic APIs like `Option`.

MacroInvocation is not parsed (macro names are, but they don't mean the same thing here at all). Those are gone by the time Rustdoc sees the source code.
2024-01-06 16:07:46 +01:00
Michael Howell
0ea58e2346 rustdoc-search: count path edits with separate edit limit
Since the two are counted separately elsewhere, they should get
their own limits, too. The biggest problem with combining them
is that paths are loosely checked by not requiring every component
to match, which means that if they are short and matched loosely,
they can easily find "drunk typist" matches that make no sense,
like this old result:

    std::collections::btree_map::itermut matching slice::itermut
    maxEditDistance = ("slice::itermut".length) / 3 = 14 / 3 = 4
    editDistance("std", "slice") = 4
    editDistance("itermut", "itermut") = 0
        4 + 0 <= 4 PASS

Of course, `slice::itermut` should not match stuff from btreemap.
`slice` should not match `std`.

The new result counts them separately:

    maxPathEditDistance = "slice".length / 3 = 5 / 3 = 1
    maxEditDistance = "itermut".length / 3 = 7 / 3 = 2
    editDistance("std", "slice") = 4
        4 <= 1 FAIL

Effectively, this makes path queries less "typo-resistant".
It's not zero, but it means `vec` won't match the `v1` prelude.

Queries without parent paths are unchanged.
2023-12-26 18:46:17 -07:00
Michael Howell
f6a045cc6b rustdoc: search for tuples and unit by type with () 2023-12-26 12:54:17 -07:00
Michael Howell
6b69ebcae0 rustdoc-search: remove parallel searchWords array
This might have made sense if the algorithm could use `searchWords`
to skip having to look at `searchIndex`, but since it always
does a substring check on both the stock word and the normalizedName,
it doesn't seem to help performance anyway.
2023-12-15 16:26:35 -07:00
Michael Howell
9a9695a052 rustdoc-search: use set ops for ranking and filtering
This commit adds ranking and quick filtering to type-based search,
improving performance and having it order results based on their
type signatures.

Motivation
----------

If I write a query like `str -> String`, a lot of functions come up.
That's to be expected, but `String::from_str` should come up on top, and
it doesn't right now. This is because the sorting algorithm is based
on the functions name, and doesn't consider the type signature at all.
`slice::join` even comes up above it!

To fix this, the sorting should take into account the function's
signature, and the closer match should come up on top.

Guide-level description
-----------------------

When searching by type signature, types with a "closer" match will
show up above types that match less precisely.

Reference-level explanation
---------------------------

Functions signature search works in three major phases:

* A compact "fingerprint," based on the [bloom filter] technique, is used to
  check for matches and to estimate the distance. It sometimes has false
  positive matches, but it also operates on 128 bit contiguous memory and
  requires no backtracking, so it performs a lot better than real
  unification.

  The fingerprint represents the set of items in the type signature, but it
  does not represent nesting, and it ignores when the same item appears more
  than once.

  The result is rejected if any query bits are absent in the function, or
  if the distance is higher than the current maximum and 200
  results have already been found.

* The second step performs unification. This is where nesting and true bag
  semantics are taken into account, and it has no false positives. It uses a
  recursive, backtracking algorithm.

  The result is rejected if any query elements are absent in the function.

[bloom filter]: https://en.wikipedia.org/wiki/Bloom_filter

Drawbacks
---------

This makes the code bigger.

More than that, this design is a subtle trade-off. It makes the cases I've
tested against measurably faster, but it's not clear how well this extends
to other crates with potentially more functions and fewer types.

The more complex things get, the more important it is to gather a good set
of data to test with (this is arguably more important than the actual
benchmarking ifrastructure right now).

Rationale and alternatives
--------------------------

Throwing a bloom filter in front makes it faster.

More than that, it tries to take a tactic where the system can not only check
for potential matches, but also gets an accurate distance function without
needing to do unification. That way it can skip unification even on items
that have the needed elems, as long as they have more items than the
currently found maximum.

If I didn't want to be able to cheaply do set operations on the fingerprint,
a [cuckoo filter] is supposed to have better performance.
But the nice bit-banging set intersection doesn't work AFAIK.

I also looked into [minhashing], but since it's actually an unbiased
estimate of the similarity coefficient, I'm not sure how it could be used
to skip unification (I wouldn't know if the estimate was too low or
too high).

This function actually uses the number of distinct items as its
"distance function."
This should give the same results that it would have gotten from a Jaccard
Distance $1-\frac{|F\cap{}Q|}{|F\cup{}Q|}$, while being cheaper to compute.
This is because:

* The function $F$ must be a superset of the query $Q$, so their union is
  just $F$ and the intersection is $Q$ and it can be reduced to
  $1-\frac{|Q|}{|F|}.

* There are no magic thresholds. These values are only being used to
  compare against each other while sorting (and, if 200 results are found,
  to compare with the maximum match). This means we only care if one value
  is bigger than the other, not what it's actual value is, and since $Q$ is
  the same for everything, it can be safely left out, reducing the formula
  to $1-\frac{1}{|F|} = \frac{|F|}{|F|}-\frac{1}{|F|} = |F|-1$. And, since
  the values are only being compared with each other, $|F|$ is fine.

Prior art
---------

This is significantly different from how Hoogle does it.
It doesn't account for order, and it has no special account for nesting,
though `Box<t>` is still two items, while `t` is only one.

This should give the same results that it would have gotten from a Jaccard
Distance $1-\frac{|A\cap{}B|}{|A\cup{}B|}$, while being cheaper to compute.

Unresolved questions
--------------------

`[]` and `()`, the slice/array and tuple/union operators, are ignored while
building the signature for the query. This is because they match more than
one thing, making them ambiguous. Unfortunately, this also makes them
a performance cliff. Is this likely to be a problem?

Right now, the system just stashes the type distance into the
same field that levenshtein distance normally goes in. This means exact
query matches show up on top (for example, if you have a function like
`fn nothing(a: Nothing, b: i32)`, then searching for `nothing` will show it
on top even if there's another function with `fn bar(x: Nothing)` that's
technically a closer match in type signature.

Future possibilities
--------------------

It should be possible to adopt more sorting criteria to act as a tie breaker,
which could be determined during unification.

[cuckoo filter]: https://en.wikipedia.org/wiki/Cuckoo_filter
[minhashing]: https://en.wikipedia.org/wiki/MinHash
2023-12-13 10:37:15 -07:00
Michael Howell
7162cb9550 rustdoc-search: fix fast path unboxing bindings 2023-12-10 20:53:53 -07:00
Michael Howell
92b84f849a rustdoc-search: do not treat associated type names as types
Before: http://notriddle.com/rustdoc-html-demo-6/tor-before/tor_config/

After: http://notriddle.com/rustdoc-html-demo-6/tor-after/tor_config/

Profile: http://notriddle.com/rustdoc-html-demo-6/tor-profile/

As a bit of background information: in type-based queries, a type
name that does not exist gets treated as a generic type variable.

This causes a counterintuitive behavior in the `tor_config` crate,
which has a trait with an associated type variable called `T`.

This isn't a searchable concrete type, but its name still gets stored
in the typeNameIdMap, as a convenient way to intern its name.
2023-12-10 16:52:21 -07:00
Michael Howell
1b7b9540fe rustdoc-search: avoid infinite where clause unbox
Fixes #118242
2023-11-24 10:42:11 -07:00
Michael Howell
63c50712f4 rustdoc-search: add support for associated types 2023-11-19 18:54:36 -07:00
Michael Howell
a66972d551 rustdoc-search: fix accidental shared, mutable map 2023-11-17 18:22:31 -07:00
Guillaume Gomez
efac0b9c02 Add regression test for #115480 2023-10-11 11:41:39 +02:00
Guillaume Gomez
4be9cfabf2
Rollup merge of #109422 - notriddle:notriddle/impl-disambiguate-search, r=GuillaumeGomez
rustdoc-search: add impl disambiguator to duplicate assoc items

Preview (to see the difference, click the link and pay attention to the specific function that comes up):

| Before | After |
|--|--|
| [`simd<i64>, simd<i64> -> simd<i64>`](https://doc.rust-lang.org/nightly/std/?search=simd%3Ci64%3E%2C%20simd%3Ci64%3E%20-%3E%20simd%3Ci64%3E) | [`simd<i64>, simd<i64> -> simd<i64>`](https://notriddle.com/rustdoc-demo-html-3/impl-disambiguate-search/std/index.html?search=simd%3Ci64%3E%2C%20simd%3Ci64%3E%20-%3E%20simd%3Ci64%3E) |
| [`cow, vec -> bool`](https://doc.rust-lang.org/nightly/std/?search=cow%2C%20vec%20-%3E%20bool) | [`cow, vec -> bool`](https://notriddle.com/rustdoc-demo-html-3/impl-disambiguate-search/std/index.html?search=cow%2C%20vec%20-%3E%20bool)

Helps with #90929

This changes the search results, specifically, when there's more than one impl with an associated item with the same name. For example, the search queries `simd<i8> -> simd<i8>` and `simd<i64> -> simd<i64>` don't link to the same function, but most of the functions have the same names.

This change should probably be FCP-ed, especially since it adds a new anchor link format for `main.js` to handle, so that URLs like `struct.Vec.html#impl-AsMut<[T]>-for-Vec<T,+A>/method.as_mut` redirect to `struct.Vec.html#method.as_mut-2`. It's a strange design, but there are a few reasons for it:

* I'd like to avoid making the HTML bigger. Obviously, fixing this bug is going to add at least a little more data to the search index, but adding more HTML penalises viewers for the benefit of searchers.

* Breaking `struct.Vec.html#method.len` would also be a disappointment.

On the other hand:

* The path-style anchors might be less prone to link rot than the numbered anchors. It's definitely less likely to have URLs that appear to "work", but silently point at the wrong thing.

* This commit arranges the path-style anchor to redirect to the numbered anchor. Nothing stops rustdoc from doing the opposite, making path-style anchors the default and redirecting the "legacy" numbered ones.

### The bug

On the "Before" links, this example search calls for `i64`:

![image](https://github.com/rust-lang/rust/assets/1593513/9431d89d-41dc-4f68-bbb1-3e2704a973d2)

But if I click any of the results, I get `f64` instead.

![image](https://github.com/rust-lang/rust/assets/1593513/6d89c692-1847-421a-84d9-22e359d9cf82)

The PR fixes this problem by adding enough information to the search result `href` to disambiguate methods with different types but the same name.

More detailed description of the problem at:
https://github.com/rust-lang/rust/pull/109422#issuecomment-1491089293

> When a struct/enum/union has multiple impls with different type parameters, it can have multiple methods that have the same name, but which are on different impls. Besides Simd, [Any](https://doc.rust-lang.org/nightly/std/any/trait.Any.html?search=any%3A%3Adowncast) also demonstrates this pattern. It has three methods named `downcast`, on three different impls.
>
> When that happens, it presents a challenge in linking to the method. Normally we link like `#method.foo`. When there are multiple `foo`, we number them like `#method.foo`, `#method.foo-1`, `#method.foo-2`, etc.
>
> It also presents a challenge for our search code. Currently we store all the variants in the index, but don’t have any way to generate unambiguous URLs in the results page, or to distinguish them in the SERP.
>
> To fix this, we need three things:
>
> 1. A fragment format that fully specifies the impl type parameters when needed to disambiguate (`#impl-SimdOrd-for-Simd<i64,+LANES>/method.simd_max`)
> 2. A search index that stores methods with enough information to disambiguate the impl they were on.
> 3. A search results interface that can display multiple methods on the same type with the same name, when appropriate OR a disambiguation landing section on item pages?
>
> For reviewers: it can be hard to see the new fragment format in action since it immediately gets rewritten to the numbered form.
2023-10-10 18:44:43 +02:00
Michael Howell
1eb2a76641 rustdoc-search: fix bug with multi-item impl trait 2023-10-05 22:32:37 -07:00
Michael Howell
3fbfe2bca5 rustdoc-search: add impl disambiguator to duplicate assoc items
Helps with #90929

This changes the search results, specifically, when there's more than
one impl with an associated item with the same name. For example,
the search queries `simd<i8> -> simd<i8>` and `simd<i64> -> simd<i64>`
don't link to the same function, but most of the functions have the
same names.

This change should probably be FCP-ed, especially since it adds a new
anchor link format for `main.js` to handle, so that URLs like
`struct.Vec.html#impl-AsMut<[T]>-for-Vec<T,+A>/method.as_mut` redirect
to `struct.Vec.html#method.as_mut-2`. It's a strange design, but there
are a few reasons for it:

* I'd like to avoid making the HTML bigger. Obviously, fixing this bug
  is going to add at least a little more data to the search index, but
  adding more HTML penalises viewers for the benefit of searchers.

* Breaking `struct.Vec.html#method.len` would also be a disappointment.

On the other hand:

* The path-style anchors might be less prone to link rot than the numbered
  anchors. It's definitely less likely to have URLs that appear to "work",
  but silently point at the wrong thing.

* This commit arranges the path-style anchor to redirect to the numbered
  anchor. Nothing stops rustdoc from doing the opposite, making path-style
  anchors the default and redirecting the "legacy" numbered ones.
2023-09-21 15:16:44 -07:00
Michael Howell
269cb57947 rustdoc-search: fix bugs when unboxing and reordering combine 2023-09-09 16:58:37 -07:00
Michael Howell
6068850008 rustdoc: fix test case for generics that look like names 2023-09-03 13:06:08 -07:00
Michael Howell
0b3c617ec0 rustdoc-search: add support for type parameters
When writing a type-driven search query in rustdoc, specifically one
with more than one query element, non-existent types become generic
parameters instead of auto-correcting (which is currently only done
for single-element queries) or giving no result. You can also force a
generic type parameter by writing `generic:T` (and can force it to not
use a generic type parameter with something like `struct:T` or whatever,
though if this happens it means the thing you're looking for doesn't
exist and will give you no results).

There is no syntax provided for specifying type constraints
for generic type parameters.

When you have a generic type parameter in a search query, it will only
match up with generic type parameters in the actual function, not
concrete types that match, not concrete types that implement a trait.
It also strictly matches based on when they're the same or different,
so `option<T>, option<U> -> option<U>` matches `Option::and`, but not
`Option::or`. Similarly, `option<T>, option<T> -> option<T>`` matches
`Option::or`, but not `Option::and`.
2023-09-03 13:06:06 -07:00
Guillaume Gomez
e161fa1a6b Correctly handle paths from foreign items 2023-09-02 23:04:37 +02:00
Guillaume Gomez
05eda41658 Add tests for type-based search 2023-09-01 15:16:11 +02:00
bors
314c39d2ea Auto merge of #112233 - notriddle:notriddle/search-unify, r=GuillaumeGomez
rustdoc-search: clean up type unification and "unboxing"

This PR redesigns parameter matching, return matching, and generics matching to use a single function that compares two lists of types.

It also makes the algorithms more consistent, so the "unboxing" behavior where `Vec<i32>` is considered a match for `i32` works inside generics, and not just at the top level.
2023-06-15 03:04:46 +00:00
Michael Howell
db277f5284 rustdoc-search: search never type with !
This feature extends rustdoc to support the syntax that most users will
naturally attempt to use to search for diverging functions.
Part of #60485

It's already possible to do this search with `primitive:never`, but
that's not what the Rust language itself uses, so nobody will try it if
they aren't told or helped along.
2023-06-12 17:30:23 -07:00
Michael Howell
94badbe599 rustdoc-search: fix order-independence bug 2023-06-11 18:57:33 -07:00
Michael Howell
9946d67579 rustdoc-search: build args, return, and generics on one unifier
This enhances generics with the "unboxing" behavior where A<T>
matches T. It makes this unboxing transitive over generics.
2023-06-11 18:19:37 -07:00
Michael Howell
d3a4cd6813 rustdoc: add note about slice/array searches to help popup 2023-06-10 14:08:26 -07:00
Michael Howell
2e569274d3 rustdoc: search for slices and arrays by type with []
Part of #60485
2023-06-10 13:52:54 -07:00
Guillaume Gomez
9803651ee8 Update rustdoc-js* format 2023-06-09 17:00:47 +02:00