-
-
Notifications
You must be signed in to change notification settings - Fork 407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create an LSP for yuck #1241
base: master
Are you sure you want to change the base?
Create an LSP for yuck #1241
Conversation
Current implementation detailsFor my current implementation, I am creating a config using I needed to do really hacky stuff to avoid mutating Please let me know what direction I should go in for handling yuck files so I may begin working on the actual LSP. Future implementation detailsFor implementing things like syntax highlighting and hover docs, I may need more information than just the start and end of individual widgets.I will also need to know the tokens of each ast node: |
Upon further inspection of the codebase, I have noticed that there exists some global state to hold the file database. A few questions:
imo it makes sense to have the daemon and LSP share the same files & config, since this reduces memory usage & improves performance. I am just wondering how I should go about sharing this state. Another thing of note is that currently I have the LSP as a separate executable. Should the LSP instead be started with something like |
This is something I'm also curious about for the formatter. I have written up a basic implementation (for which I will also open a draft PR soon), but this is one part I have yet to touch. It seems like almost everything I need regarding file handling is already implemented, just not exposed publicly. |
This is something that it would make sense for us to coordinate closer on, since the formatter will also require adjustments to the AST. |
I'll need to look into this a bit further, but just as a general heads up already: In most cases, a good LSP will still work when there are parsing errors (otherwise, autocomplete etc might be quite hard, and highlighting more than one error is very difficult as well). The eww parser is currently somewhat unfit for this, I'd fear. A lot of information is lost, and it definitely isn't super error-tolerant. Thus, and especially given the relative simplicity of yuck, I'd recommend trying to work a bit further disjunct from the main eww codebase for this. I could see some sort of connection to a running eww daemon be quite valuable for use cases such as showing errors from the daemon, but even that would require a good bit of work to get the required info. Writing a simple-ish CST-parser (maybe using something like tree-sitter, or basing on Rowan, which many LSPs use) would probably be a lot more flexible in the long run, even if it's of course a lot of initial effort. |
I don't have much experience in this area, so take everything I am saying with a grain of salt. But it seems to me that, if I am understanding you correctly, you think it would be best if both the LSP and the formatter implement their own parsers. Why?It seems to me that it would be a lot of additional work to try and maintain 2 or 3 different parsers. Would it not make more sense to collaborate on improving the parsers used in eww itself, so that eww, the LSP and the formatter all use the same one? This is what most languages seem to do, at least for the most part. I can't speak to the LSP side of things, but for the formatter, the eww AST seems to mostly fulfill my requirements. Perhaps later down the line there would need to be some more significant changes made to facilitate some more complicated changes, but as it stands now I have a formatter that can handle about 80% of yuck code properly, with no changes to the AST. It seems to be that if we were all to go out and implement our own parser solutions we would end up with 3 parsers, that all need to be maintained, which I foresee causing headaches down the line. |
I can understand where @elkowar is coming from. His implementation lacks error recovery, a huge part of what's needed for LSPs to work. Right now if there is any error at all, I believe the parser will just quit. I ran into a similar issue when I was working on another LSP for a language my friends and I made. We had to add a bit of extra code to the parser to make it so it could try its best to get all the tokens out if there was an error. That being said, I initially was writing some comment that was against using the current yuck parser. However, I realized that for a language like this, it would be very difficult to have correct autocomplete and syntax highlighting working when there are syntax errors. Take for example this code
it would be impossible to tell if that something else is its own thing or if its inside foo, making whatever autocomplete the LSP has useless. Therefore, I don't think we need to worry very much about errors. Perhaps I am missing something obvious here, and I haven't dug too deep into the parser's code, but could it be as simple as returning everything that was parsed up until the syntax error? That way we could have syntax highlighting for everything up until the syntax error. This is essentially what was done for the language my friends and I made, and it worked fine. |
I think using tree-sitter as a parser can be a good approach - It already exists and needs to exist for good editor support (e.g. zed and neovim), it handles error recovery and support incremental updates. I found this discussion showing how this was done in asm-lsp |
My guess is that it would require a lot of rewrites on eww's side on an already stable product for a feature which is not sacrosanct to it's functionality at least for now. |
Yea, making the eww AST verbose enough to be convenient for a proper LSP would require quite a bit of internal changes, for reasons of error recovery as well as the fact that currently, we don't preserve all information (i.e., if theres two different ways to write something, we don't necessarily retain all information needed to reconstruct the original code, we don't retain all comments, etc) The eww grammar has not changed in over two years, so there's not that much "maintenance" that'd be necessary here -- just write the grammar once, and that's probably it. The goals of an LSP/formatter parser and the main eww parser are just very different. I do think doing something with treesitter is probably a good first idea here. Having your fully own AST would allow you to retain any additional metadata you might want to store, as well. From all I gather, there isn't really a one-size-fits-all strategy for this across other languages. many languages have multi-step parsers and grammars, where things go through a tokenizer, then a CST, then an AST, etc etc. some languages might re-use part of the original parser for their LSPs. Rust-analyzer, from what I gather, has a completely separate CST and AST structure based on rowan for its parser, which also, again, just has very different requirements and restrictions from the parsing logic used in the compiler. |
However, if you're able to make the eww-parser and AST usable for these usecases, without having to re-structure the entire codebase, I'd definitely be willing to consider this as an option as well. I do, however, think that error recovery is very important for an LSP, and even a formatter. There's more "hints" for them than you might think: |
Thanks, @elkowar, for the detailed reply. As for the formatter: I will finish experimenting around with using the eww AST and then go ahead and write another implementation using tree sitter, as suggested. At that point, I will go forward with whatever feels more well suited / capable. Hopefully I will then have something presentable as a 0.0.1 in the next few weeks. |
Description
Closes #1232.
Features of the LSP (will) include
(literal)
expressions.Additional Notes
I am far from finished for now.
The goals of this draft PR is
Check the top comment for further discussion
Checklist
Please make sure you can check all the boxes that apply to this PR.
docs/content/main
directory has been adjusted to reflect my changes.cargo fmt
to automatically format all code before committing