Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transition to salsa coarse-grained tracked structs #15763

Merged
merged 6 commits into from
Feb 11, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 14 additions & 3 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -123,7 +123,7 @@ rayon = { version = "1.10.0" }
regex = { version = "1.10.2" }
rustc-hash = { version = "2.0.0" }
# When updating salsa, make sure to also update the revision in `fuzz/Cargo.toml`
salsa = { git = "https://github.com/salsa-rs/salsa.git", rev = "88a1d7774d78f048fbd77d40abca9ebd729fd1f0" }
salsa = { git = "https://github.com/salsa-rs/salsa.git", rev = "351d9cf0037be949d17800d0c7b4838e533c2ed6" }
schemars = { version = "0.8.16" }
seahash = { version = "4.1.0" }
serde = { version = "1.0.197", features = ["derive"] }
Expand Down
6 changes: 3 additions & 3 deletions crates/red_knot_python_semantic/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -12,9 +12,9 @@ license = { workspace = true }

[dependencies]
ruff_db = { workspace = true }
ruff_index = { workspace = true }
ruff_index = { workspace = true, features = ["salsa"] }
ruff_macros = { workspace = true }
ruff_python_ast = { workspace = true }
ruff_python_ast = { workspace = true, features = ["salsa"] }
ruff_python_parser = { workspace = true }
ruff_python_stdlib = { workspace = true }
ruff_source_file = { workspace = true }
Expand All @@ -31,7 +31,7 @@ drop_bomb = { workspace = true }
indexmap = { workspace = true }
itertools = { workspace = true }
ordermap = { workspace = true }
salsa = { workspace = true }
salsa = { workspace = true, features = ["compact_str"] }
thiserror = { workspace = true }
tracing = { workspace = true }
rustc-hash = { workspace = true }
Expand Down
42 changes: 29 additions & 13 deletions crates/red_knot_python_semantic/src/ast_node_ref.rs
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,29 @@ use ruff_db::parsed::ParsedModule;
/// Holding on to any [`AstNodeRef`] prevents the [`ParsedModule`] from being released.
///
/// ## Equality
/// Two `AstNodeRef` are considered equal if their wrapped nodes are equal.
/// Two `AstNodeRef` are considered equal if their pointer addresses are equal.
///
/// ## Usage in salsa tracked structs
/// It's important that [`AstNodeRef`] fields in salsa tracked structs are tracked fields
/// (attributed with `#[tracked`]). It prevents that the tracked struct gets a new ID
/// everytime the AST changes, which in turn, invalidates the result of any query
/// that takes said tracked struct as a query argument or returns the tracked struct as part of its result.
///
/// For example, marking the [`AstNodeRef`] as tracked on `Expression`
/// has the effect that salsa will consider the expression as "unchanged" for as long as it:
///
/// * belongs to the same file
/// * belongs to the same scope
/// * has the same kind
/// * was created in the same order
///
/// This means that changes to expressions in other scopes don't invalidate the expression's id, giving
/// us some form of scope-stable identity for expressions. Only queries accessing the node field
/// run on every AST change. All other queries only run when the expression's identity changes.
///
/// The one exception to this is if it is known that all queries tacking the tracked struct
/// as argument or returning it as part of their result are known to access the node field.
/// Marking the field tracked is then unnecessary.
#[derive(Clone)]
pub struct AstNodeRef<T> {
/// Owned reference to the node's [`ParsedModule`].
Expand Down Expand Up @@ -67,23 +89,17 @@ where
}
}

impl<T> PartialEq for AstNodeRef<T>
where
T: PartialEq,
{
impl<T> PartialEq for AstNodeRef<T> {
fn eq(&self, other: &Self) -> bool {
self.node().eq(other.node())
self.node.eq(&other.node)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the relationship of this change to the rest of the PR?

Could we do this today in main, independently of the tracked-struct changes? And if we did, how much of the perf win of this PR would come with it?

(Not really suggesting we need to split it out, just trying to understand this change and its perf impact better.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably revert this change now that AstNodeRef fields are tracked again. Let me try tomorrow.

I don't think changing the implementation on main would lead to a performance improvement because all AstNodeRef fields are marked with no_eq and are never marked as id (they're never hashed). That means this implementation should be mostly unused today.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you decided not to revert this change? Anything worth documenting about why?

}
}

impl<T> Eq for AstNodeRef<T> where T: Eq {}
impl<T> Eq for AstNodeRef<T> {}

impl<T> Hash for AstNodeRef<T>
where
T: Hash,
{
impl<T> Hash for AstNodeRef<T> {
fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
self.node().hash(state);
self.node.hash(state);
}
}

Expand Down Expand Up @@ -117,7 +133,7 @@ mod tests {
let stmt_cloned = &cloned.syntax().body[0];
let cloned_node = unsafe { AstNodeRef::new(cloned.clone(), stmt_cloned) };

assert_eq!(node1, cloned_node);
assert_ne!(node1, cloned_node);

let other_raw = parse_unchecked_source("2 + 2", PySourceType::Python);
let other = ParsedModule::new(other_raw);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -133,7 +133,7 @@ pub(crate) fn search_paths(db: &dyn Db) -> SearchPathIterator {
}

#[derive(Debug, PartialEq, Eq)]
pub(crate) struct SearchPaths {
pub struct SearchPaths {
/// Search paths that have been statically determined purely from reading Ruff's configuration settings.
/// These shouldn't ever change unless the config settings themselves change.
static_paths: Vec<SearchPath>,
Expand Down
9 changes: 5 additions & 4 deletions crates/red_knot_python_semantic/src/semantic_index.rs
Original file line number Diff line number Diff line change
@@ -1,13 +1,14 @@
use std::iter::FusedIterator;
use std::sync::Arc;

use rustc_hash::{FxBuildHasher, FxHashMap, FxHashSet};
use salsa::plumbing::AsId;

use ruff_db::files::File;
use ruff_db::parsed::parsed_module;
use ruff_index::{IndexSlice, IndexVec};

use rustc_hash::{FxBuildHasher, FxHashMap, FxHashSet};
use salsa::plumbing::AsId;
use salsa::Update;

use crate::module_name::ModuleName;
use crate::semantic_index::ast_ids::node_key::ExpressionNodeKey;
use crate::semantic_index::ast_ids::AstIds;
Expand Down Expand Up @@ -123,7 +124,7 @@ pub(crate) fn global_scope(db: &dyn Db, file: File) -> ScopeId<'_> {
}

/// The symbol tables and use-def maps for all scopes in a file.
#[derive(Debug)]
#[derive(Debug, Update)]
pub(crate) struct SemanticIndex<'db> {
/// List of all symbol tables in this file, indexed by scope.
symbol_tables: IndexVec<FileScopeId, Arc<SymbolTable>>,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ use crate::Db;
///
/// x = foo()
/// ```
#[derive(Debug)]
#[derive(Debug, salsa::Update)]
pub(crate) struct AstIds {
/// Maps expressions to their expression id.
expressions_map: FxHashMap<ExpressionNodeKey, ScopedExpressionId>,
Expand Down Expand Up @@ -74,6 +74,7 @@ impl HasScopedUseId for ast::ExprRef<'_> {

/// Uniquely identifies an [`ast::Expr`] in a [`crate::semantic_index::symbol::FileScopeId`].
#[newtype_index]
#[derive(salsa::Update)]
pub struct ScopedExpressionId;

pub trait HasScopedExpressionId {
Expand Down Expand Up @@ -181,7 +182,7 @@ pub(crate) mod node_key {

use crate::node_key::NodeKey;

#[derive(Copy, Clone, Eq, PartialEq, Hash, Debug)]
#[derive(Copy, Clone, Eq, PartialEq, Hash, Debug, salsa::Update)]
pub(crate) struct ExpressionNodeKey(NodeKey);

impl From<ast::ExprRef<'_>> for ExpressionNodeKey {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ use rustc_hash::FxHashMap;

/// Describes an (annotated) attribute assignment that we discovered in a method
/// body, typically of the form `self.x: int`, `self.x: int = …` or `self.x = …`.
#[derive(Debug, Clone, PartialEq, Eq)]
#[derive(Debug, Clone, PartialEq, Eq, salsa::Update)]
pub(crate) enum AttributeAssignment<'db> {
/// An attribute assignment with an explicit type annotation, either
/// `self.x: <annotation>` or `self.x: <annotation> = …`.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,20 +5,20 @@ use crate::db::Db;
use crate::semantic_index::expression::Expression;
use crate::semantic_index::symbol::{FileScopeId, ScopeId};

#[derive(Clone, Copy, Debug, Hash, PartialEq, Eq)]
#[derive(Clone, Copy, Debug, Hash, PartialEq, Eq, salsa::Update)]
pub(crate) struct Constraint<'db> {
pub(crate) node: ConstraintNode<'db>,
pub(crate) is_positive: bool,
}

#[derive(Clone, Copy, Debug, Hash, PartialEq, Eq)]
#[derive(Clone, Copy, Debug, Hash, PartialEq, Eq, salsa::Update)]
pub(crate) enum ConstraintNode<'db> {
Expression(Expression<'db>),
Pattern(PatternConstraint<'db>),
}

/// Pattern kinds for which we support type narrowing and/or static visibility analysis.
#[derive(Debug, Clone, Hash, PartialEq)]
#[derive(Debug, Clone, Hash, PartialEq, salsa::Update)]
pub(crate) enum PatternConstraintKind<'db> {
Singleton(Singleton, Option<Expression<'db>>),
Value(Expression<'db>, Option<Expression<'db>>),
Expand All @@ -28,21 +28,15 @@ pub(crate) enum PatternConstraintKind<'db> {

#[salsa::tracked]
pub(crate) struct PatternConstraint<'db> {
#[id]
pub(crate) file: File,

#[id]
pub(crate) file_scope: FileScopeId,

#[no_eq]
#[return_ref]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expression is Copy, no need to return a ref

pub(crate) subject: Expression<'db>,

#[no_eq]
#[return_ref]
pub(crate) kind: PatternConstraintKind<'db>,

#[no_eq]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Countme's eq implementation is a no-op anyway

count: countme::Count<PatternConstraint<'static>>,
}

Expand Down
Loading
Loading