Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

POC : static computation of source roots for code-generating tasks #4621

Draft
wants to merge 3 commits into
base: 0.12.x
Choose a base branch
from

Conversation

Baccata
Copy link
Contributor

@Baccata Baccata commented Feb 25, 2025

Context : #4530

@lihaoyi, I want to open this PR to continue the discussion. At the time of writing this, this adds a generatedSourceRoots parameter for tasks, as you've described, and uses it to compute the source directories for the bloop configuration, as opposed to actually running the allSources task. A similar change would need to be achieved in GenIdea and BSP.

So now, the problem I see is as follows : in order for this to be useful, it needs to be applied to all source generators provided OOTB by the mill codebase, but also third party providers of source-generator (like smithy4s), as the ability to jump to the sources is paramount for the user experience in IDEs.

Applying this change is a fair bit of work in this codebase, and a big ask for the larger ecosystem, and although I think SBT's approach of treating all source generators lazily is preferable to mill's current approach of running all source generators eagerly, I don't think I can weigh the benefits of changing the approach against the cost induced by breaking IDE behaviour for existing plugins.

With that in mind here's what I think would be a good mitigation :

// Rather than traversing the transitive inputs of `allSources`, we define this setting as a way to register lazy 
// source generators. We force the contract to return `Unit` to prevent the assumption that a returned `PathRef` would be
// added to the sources. 
def lazySourceGenerators = Seq[T[Unit]] 
 
 // Runs the lazy source generators, returning path refs computed from their `generatedSourceRoots`. 
 // When the tasks don't have `generatedSourceRoots` set, we assume that the `T.dest` is to be added as source root. 
 final def runLazySourceGenerators : T[Seq[PathRef]] = ???
 
// Does not run the lazy source generators, but rather computes the source roots they'll generate code in. 
// Wrapped in T because the evaluator is needed to get the T.dest of the lazy source generators. I presume that is 
// something we could thread in through the ctx (like `def destOf(task: T[_]) : Path`)  
final def lazyGeneratedSourceRoots: T[Seq[Path]] = ???
 
 // Keeps the current behaviour intact 
def allSources = T {
    sources() ++ generatedSources() ++ runLazySourceGenerators()
} 

The Bloop, GenIdea and BSP then get amended to call sources(), generatedSources() and lazyGeneratedSourceRoots(), but avoids runLazySourceGenerators.

This approach would allow to alleviate the pain point raised in the discussion above, whilst retaining the current behaviour of the ecosystem. Mill plugin providers could be encouraged to use runLazySourceGenerators instead of generatedSources, but users would have a recourse to turn eager source generators into lazy ones if they need it to be (provided a withGeneratedSourceRoots is provided to them, that is)

@lihaoyi
Copy link
Member

lihaoyi commented Feb 25, 2025

@Baccata one new development worth mentioning is that #4524 targeting 0.13.x, we've moved the computation of JavaModule source paths in general to static def sourcesFolders: Seq[os.SubPath] members computer before evaluation. This was done for reasons unrelated to codegen, but perhaps now that we're doing it a similar approach could be taken for generated sources?

Following the naming convention would just define a def JavaModule#generatedSourcesFolders: T[Seq[os.SubPath]] and use that for IDE codegen, which I think is similar to what you are proposing here, just to a slightly different name

@Baccata
Copy link
Contributor Author

Baccata commented Feb 25, 2025

@lihaoyi, with the difference that the subPaths need to be tied to the tasks, as they end up being relative to their T.dest (as opposed to the module's path), hence why my initial proposal in the discussion was to have (roughly) :

case class SourceGenerator(task: T[Unit], sourceRoots: Seq[os.SubPath]) 

trait JavaModule {
  def sourceGenerators : Seq[SourceGenerator]
}

But, putting aside the details, I agree that source generators are better captured statically like the sourceFolders (and like SBT does it). The question remains regarding "what do you want to do with the current generatedSources task ?" : do you want to keep it around to avoid changing everything in mill and its ecosystem and introduce a static alternative (def sourceGenerators as above), or do you want to deprecate it completely in favour of the static version ?

@Baccata
Copy link
Contributor Author

Baccata commented Feb 25, 2025

See c262ea6 for a showcase of what I meant in #4621 (comment)

val dest = mill
.eval
.EvaluatorPaths
.resolveDestPaths(workspace / "out", task)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't know whether there's a better way to do it, to be honest

/**
* The folders containing all source files fed into the compiler
*/
def allSources: T[Seq[PathRef]] = Task { sources() ++ generatedSources() }
def allSources: T[Seq[PathRef]] =
Task { sources() ++ generatedSources() ++ deferredGeneratedSources() }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeps the current (0.12.x) behaviour of generatedSources to avoid breaking the ecosystem

@@ -133,14 +137,19 @@ class BloopImpl(evs: () => Seq[Evaluator], wd: os.Path) extends ExternalModule {
!module.asInstanceOf[outer.Module].skipBloop
else true

private def deferredAllSources(m: JavaModule): T[Seq[PathRef]] = Task {
m.sources() ++ m.generatedSources() ++ m.allDeferredSourceRoots()
Copy link
Contributor Author

@Baccata Baccata Feb 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeps the current (0.12.x) behaviour of generatedSources being run to produce IDE config, in order to avoid breaking the ecosystem. In addition, adds roots that will be produced by deferredSourceGenerators when allSources/compile is run.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants