Rollup merge of #131936 - jalil-salame:rustdoc-types-rustc-hash, r=aDotInTheVoid

feat(rustdoc-json-types): introduce rustc-hash feature

This allows the public `rustdoc-types` crate to expose this feature easily and allows consumers of the crate to get the performance advantages from doing so.

The reasoning for this was discussed on [Zulip][1]

Changes:
- Make `rustc-hash` optional but default to including it
- Rename all occurrences of `FxHashMap` to `HashMap`.
- Feature gate the import and rename the imported `FxHashMap` to `HashMap`
- Introduce a type alias `FxHashMap` which resolves to the currently used `HashMap` (`rustc_hash::FxHashMap` or `std::collections::HashMap`) for use in `src/librustdoc`.

[1]: https://rust-lang.zulipchat.com/#narrow/channel/266220-t-rustdoc/topic/rustc-hash.20and.20performance.20of.20rustdoc-types

**extra context from the zulip thread:**

- `@obi1kenobi` requested benchmarks of the switch to `rustc-hash`
- I benchmarked switching `rustdoc-types` to `rustc-hash` which yielded a ~300ms improvement to `cargo-semver-checks`'s index building step (this step is done twice so the improvements are ~150ms per index).
- The benchmarks were presented in Zulip and people were in favor of introducing `rustc-hash` to the public `rustdoc-types` crate.
- There were differing opinions on how to introduce the dependency:
  1. "Hard" dependency: remove use of `std::collections::HashMap` in favor of `FxHashMap`.
  2. "Soft" dependency: make optional and introduce a feature then enable/disable it by default (this PR).
  3. ~~Make `rustdoc-types` generic and expose the `RandomState`~~ (a lot of work & complexity for little gain over a feature gate).

`@obi1kenobi` and I prefer the feature gate so that is what I am adding here.

My reasons for the preference are:
- `cargo-semver-checks` is especially perf sensitive, we don't expect people to care about ~150ms extra time when reading in a 500MB file (the size of the sample we used for benchmarking).
- Keeping `rustdoc-types` lean by having its only direct dependency be `serde` is nice for the general consumer of the crate.
- `rustc-hash` is not HashDOS resistant (but it is questionable whether `rustdoc-types` would be used on adversarial inputs).

r? `@aDotInTheVoid`
This commit is contained in:
Matthias Krüger 2024-10-19 22:01:00 +02:00 committed by GitHub
commit ab7e0d0bdd
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
2 changed files with 14 additions and 6 deletions

View file

@ -6,9 +6,12 @@ edition = "2021"
[lib]
path = "lib.rs"
[features]
default = ["rustc-hash"]
[dependencies]
serde = { version = "1.0", features = ["derive"] }
rustc-hash = "1.1.0"
rustc-hash = { version = "1.1.0", optional = true }
[dev-dependencies]
serde_json = "1.0"

View file

@ -3,11 +3,16 @@
//! These types are the public API exposed through the `--output-format json` flag. The [`Crate`]
//! struct is the root of the JSON blob and all other items are contained within.
#[cfg(not(feature = "rustc-hash"))]
use std::collections::HashMap;
use std::path::PathBuf;
pub use rustc_hash::FxHashMap;
#[cfg(feature = "rustc-hash")]
use rustc_hash::FxHashMap as HashMap;
use serde::{Deserialize, Serialize};
pub type FxHashMap<K, V> = HashMap<K, V>; // re-export for use in src/librustdoc
/// The version of JSON output that this crate represents.
///
/// This integer is incremented with every breaking change to the API,
@ -30,11 +35,11 @@ pub struct Crate {
pub includes_private: bool,
/// A collection of all items in the local crate as well as some external traits and their
/// items that are referenced locally.
pub index: FxHashMap<Id, Item>,
pub index: HashMap<Id, Item>,
/// Maps IDs to fully qualified paths and other info helpful for generating links.
pub paths: FxHashMap<Id, ItemSummary>,
pub paths: HashMap<Id, ItemSummary>,
/// Maps `crate_id` of items to a crate name and html_root_url if it exists.
pub external_crates: FxHashMap<u32, ExternalCrate>,
pub external_crates: HashMap<u32, ExternalCrate>,
/// A single version number to be used in the future when making backwards incompatible changes
/// to the JSON output.
pub format_version: u32,
@ -95,7 +100,7 @@ pub struct Item {
/// Some("") if there is some documentation but it is empty (EG `#[doc = ""]`).
pub docs: Option<String>,
/// This mapping resolves [intra-doc links](https://github.com/rust-lang/rfcs/blob/master/text/1946-intra-rustdoc-links.md) from the docstring to their IDs
pub links: FxHashMap<String, Id>,
pub links: HashMap<String, Id>,
/// Stringified versions of the attributes on this item (e.g. `"#[inline]"`)
pub attrs: Vec<String>,
/// Information about the items deprecation, if present.