Skip to content

Commit

Permalink
Transition to salsa coarse-grained tracked structs (#15763)
Browse files Browse the repository at this point in the history
## Summary

Transition to using coarse-grained tracked structs (depends on
salsa-rs/salsa#657). For now, this PR doesn't
add any `#[tracked]` fields, meaning that any changes cause the entire
struct to be invalidated. It also changes `AstNodeRef` to be
compared/hashed by pointer address, instead of performing a deep AST
comparison.

## Test Plan

This yields a 10-15% improvement on my machine (though weirdly some runs
were 5-10% without being flagged as inconsistent by criterion, is there
some non-determinism involved?). It's possible that some of this is
unrelated, I'll try applying the patch to the current salsa version to
make sure.

---------

Co-authored-by: Micha Reiser <[email protected]>
  • Loading branch information
ibraheemdev and MichaReiser authored Feb 11, 2025
1 parent 7fbd89c commit 69d86d1
Show file tree
Hide file tree
Showing 30 changed files with 137 additions and 223 deletions.
17 changes: 14 additions & 3 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -123,7 +123,7 @@ rayon = { version = "1.10.0" }
regex = { version = "1.10.2" }
rustc-hash = { version = "2.0.0" }
# When updating salsa, make sure to also update the revision in `fuzz/Cargo.toml`
salsa = { git = "https://github.com/salsa-rs/salsa.git", rev = "88a1d7774d78f048fbd77d40abca9ebd729fd1f0" }
salsa = { git = "https://github.com/salsa-rs/salsa.git", rev = "351d9cf0037be949d17800d0c7b4838e533c2ed6" }
schemars = { version = "0.8.16" }
seahash = { version = "4.1.0" }
serde = { version = "1.0.197", features = ["derive"] }
Expand Down
6 changes: 3 additions & 3 deletions crates/red_knot_python_semantic/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -12,9 +12,9 @@ license = { workspace = true }

[dependencies]
ruff_db = { workspace = true }
ruff_index = { workspace = true }
ruff_index = { workspace = true, features = ["salsa"] }
ruff_macros = { workspace = true }
ruff_python_ast = { workspace = true }
ruff_python_ast = { workspace = true, features = ["salsa"] }
ruff_python_parser = { workspace = true }
ruff_python_stdlib = { workspace = true }
ruff_source_file = { workspace = true }
Expand All @@ -31,7 +31,7 @@ drop_bomb = { workspace = true }
indexmap = { workspace = true }
itertools = { workspace = true }
ordermap = { workspace = true }
salsa = { workspace = true }
salsa = { workspace = true, features = ["compact_str"] }
thiserror = { workspace = true }
tracing = { workspace = true }
rustc-hash = { workspace = true }
Expand Down
42 changes: 29 additions & 13 deletions crates/red_knot_python_semantic/src/ast_node_ref.rs
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,29 @@ use ruff_db::parsed::ParsedModule;
/// Holding on to any [`AstNodeRef`] prevents the [`ParsedModule`] from being released.
///
/// ## Equality
/// Two `AstNodeRef` are considered equal if their wrapped nodes are equal.
/// Two `AstNodeRef` are considered equal if their pointer addresses are equal.
///
/// ## Usage in salsa tracked structs
/// It's important that [`AstNodeRef`] fields in salsa tracked structs are tracked fields
/// (attributed with `#[tracked`]). It prevents that the tracked struct gets a new ID
/// everytime the AST changes, which in turn, invalidates the result of any query
/// that takes said tracked struct as a query argument or returns the tracked struct as part of its result.
///
/// For example, marking the [`AstNodeRef`] as tracked on `Expression`
/// has the effect that salsa will consider the expression as "unchanged" for as long as it:
///
/// * belongs to the same file
/// * belongs to the same scope
/// * has the same kind
/// * was created in the same order
///
/// This means that changes to expressions in other scopes don't invalidate the expression's id, giving
/// us some form of scope-stable identity for expressions. Only queries accessing the node field
/// run on every AST change. All other queries only run when the expression's identity changes.
///
/// The one exception to this is if it is known that all queries tacking the tracked struct
/// as argument or returning it as part of their result are known to access the node field.
/// Marking the field tracked is then unnecessary.
#[derive(Clone)]
pub struct AstNodeRef<T> {
/// Owned reference to the node's [`ParsedModule`].
Expand Down Expand Up @@ -67,23 +89,17 @@ where
}
}

impl<T> PartialEq for AstNodeRef<T>
where
T: PartialEq,
{
impl<T> PartialEq for AstNodeRef<T> {
fn eq(&self, other: &Self) -> bool {
self.node().eq(other.node())
self.node.eq(&other.node)
}
}

impl<T> Eq for AstNodeRef<T> where T: Eq {}
impl<T> Eq for AstNodeRef<T> {}

impl<T> Hash for AstNodeRef<T>
where
T: Hash,
{
impl<T> Hash for AstNodeRef<T> {
fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
self.node().hash(state);
self.node.hash(state);
}
}

Expand Down Expand Up @@ -117,7 +133,7 @@ mod tests {
let stmt_cloned = &cloned.syntax().body[0];
let cloned_node = unsafe { AstNodeRef::new(cloned.clone(), stmt_cloned) };

assert_eq!(node1, cloned_node);
assert_ne!(node1, cloned_node);

let other_raw = parse_unchecked_source("2 + 2", PySourceType::Python);
let other = ParsedModule::new(other_raw);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -133,7 +133,7 @@ pub(crate) fn search_paths(db: &dyn Db) -> SearchPathIterator {
}

#[derive(Debug, PartialEq, Eq)]
pub(crate) struct SearchPaths {
pub struct SearchPaths {
/// Search paths that have been statically determined purely from reading Ruff's configuration settings.
/// These shouldn't ever change unless the config settings themselves change.
static_paths: Vec<SearchPath>,
Expand Down
9 changes: 5 additions & 4 deletions crates/red_knot_python_semantic/src/semantic_index.rs
Original file line number Diff line number Diff line change
@@ -1,13 +1,14 @@
use std::iter::FusedIterator;
use std::sync::Arc;

use rustc_hash::{FxBuildHasher, FxHashMap, FxHashSet};
use salsa::plumbing::AsId;

use ruff_db::files::File;
use ruff_db::parsed::parsed_module;
use ruff_index::{IndexSlice, IndexVec};

use rustc_hash::{FxBuildHasher, FxHashMap, FxHashSet};
use salsa::plumbing::AsId;
use salsa::Update;

use crate::module_name::ModuleName;
use crate::semantic_index::ast_ids::node_key::ExpressionNodeKey;
use crate::semantic_index::ast_ids::AstIds;
Expand Down Expand Up @@ -123,7 +124,7 @@ pub(crate) fn global_scope(db: &dyn Db, file: File) -> ScopeId<'_> {
}

/// The symbol tables and use-def maps for all scopes in a file.
#[derive(Debug)]
#[derive(Debug, Update)]
pub(crate) struct SemanticIndex<'db> {
/// List of all symbol tables in this file, indexed by scope.
symbol_tables: IndexVec<FileScopeId, Arc<SymbolTable>>,
Expand Down
5 changes: 3 additions & 2 deletions crates/red_knot_python_semantic/src/semantic_index/ast_ids.rs
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ use crate::Db;
///
/// x = foo()
/// ```
#[derive(Debug)]
#[derive(Debug, salsa::Update)]
pub(crate) struct AstIds {
/// Maps expressions to their expression id.
expressions_map: FxHashMap<ExpressionNodeKey, ScopedExpressionId>,
Expand Down Expand Up @@ -74,6 +74,7 @@ impl HasScopedUseId for ast::ExprRef<'_> {

/// Uniquely identifies an [`ast::Expr`] in a [`crate::semantic_index::symbol::FileScopeId`].
#[newtype_index]
#[derive(salsa::Update)]
pub struct ScopedExpressionId;

pub trait HasScopedExpressionId {
Expand Down Expand Up @@ -181,7 +182,7 @@ pub(crate) mod node_key {

use crate::node_key::NodeKey;

#[derive(Copy, Clone, Eq, PartialEq, Hash, Debug)]
#[derive(Copy, Clone, Eq, PartialEq, Hash, Debug, salsa::Update)]
pub(crate) struct ExpressionNodeKey(NodeKey);

impl From<ast::ExprRef<'_>> for ExpressionNodeKey {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ use rustc_hash::FxHashMap;

/// Describes an (annotated) attribute assignment that we discovered in a method
/// body, typically of the form `self.x: int`, `self.x: int = …` or `self.x = …`.
#[derive(Debug, Clone, PartialEq, Eq)]
#[derive(Debug, Clone, PartialEq, Eq, salsa::Update)]
pub(crate) enum AttributeAssignment<'db> {
/// An attribute assignment with an explicit type annotation, either
/// `self.x: <annotation>` or `self.x: <annotation> = …`.
Expand Down
12 changes: 3 additions & 9 deletions crates/red_knot_python_semantic/src/semantic_index/constraint.rs
Original file line number Diff line number Diff line change
Expand Up @@ -5,20 +5,20 @@ use crate::db::Db;
use crate::semantic_index::expression::Expression;
use crate::semantic_index::symbol::{FileScopeId, ScopeId};

#[derive(Clone, Copy, Debug, Hash, PartialEq, Eq)]
#[derive(Clone, Copy, Debug, Hash, PartialEq, Eq, salsa::Update)]
pub(crate) struct Constraint<'db> {
pub(crate) node: ConstraintNode<'db>,
pub(crate) is_positive: bool,
}

#[derive(Clone, Copy, Debug, Hash, PartialEq, Eq)]
#[derive(Clone, Copy, Debug, Hash, PartialEq, Eq, salsa::Update)]
pub(crate) enum ConstraintNode<'db> {
Expression(Expression<'db>),
Pattern(PatternConstraint<'db>),
}

/// Pattern kinds for which we support type narrowing and/or static visibility analysis.
#[derive(Debug, Clone, Hash, PartialEq)]
#[derive(Debug, Clone, Hash, PartialEq, salsa::Update)]
pub(crate) enum PatternConstraintKind<'db> {
Singleton(Singleton, Option<Expression<'db>>),
Value(Expression<'db>, Option<Expression<'db>>),
Expand All @@ -28,21 +28,15 @@ pub(crate) enum PatternConstraintKind<'db> {

#[salsa::tracked]
pub(crate) struct PatternConstraint<'db> {
#[id]
pub(crate) file: File,

#[id]
pub(crate) file_scope: FileScopeId,

#[no_eq]
#[return_ref]
pub(crate) subject: Expression<'db>,

#[no_eq]
#[return_ref]
pub(crate) kind: PatternConstraintKind<'db>,

#[no_eq]
count: countme::Count<PatternConstraint<'static>>,
}

Expand Down
Loading

0 comments on commit 69d86d1

Please sign in to comment.