Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More efficient breakString #177

Open
wants to merge 6 commits into
base: develop
Choose a base branch
from
Open

More efficient breakString #177

wants to merge 6 commits into from

Conversation

michael-schwarz
Copy link
Member

Closes #169.

See also goblint/analyzer#1513 that also has some runtime comparisons.

@michael-schwarz michael-schwarz changed the title More efficient breakString (#169) More efficient breakString Jan 14, 2025
@michael-schwarz michael-schwarz marked this pull request as draft January 14, 2025 16:48
@michael-schwarz michael-schwarz marked this pull request as ready for review January 14, 2025 20:36
@michael-schwarz
Copy link
Member Author

This was a bit of a pain because I did not consider that building the result from right to left means that the accumulator cannot be used to pass in a prefix, I'll squash merge then.

I am confident this is correct now, it both passes all printing tests in this repo, as well as all cram tests in the goblint repo which also exercise this function.

@sim642
Copy link
Member

sim642 commented Jan 16, 2025

This now traverses the string right-to-left (because the Stdlib splitting does so because that's needed to construct a list), but the old version went left-to-right.
Also, it looks like the document tree from this will be completely different, no? Leaning in a different direction, Lines in different places, no CText (which I assume is an optimization to keep trees smaller).

It looks like Pretty significantly prefers some shape over another and has flatten to improve things. So we need to be quite careful with these details: otherwise we might make breakString faster but later outputting it inefficient.
Even if the default flattenBeforePrint would take care of it, constructing the correct tree directly in the first place would be good.

@michael-schwarz
Copy link
Member Author

Judging by the comments a few lines below, this right-heavy tree actually seems preferable over the one they had previously:

cil/src/ocamlutil/pretty.ml

Lines 116 to 119 in f5ee39b

(* Note that the ++ operator in Ocaml are left-associative. This means
that if you have a long list of ++ then the whole thing is very unbalanced
towards the left side. This is the worst possible case since scanning the
left side of a Concat is the non-tail recursive case. *)

@michael-schwarz
Copy link
Member Author

cil/src/ocamlutil/pretty.ml

Lines 227 to 231 in f5ee39b

(* When we construct documents, most of the time they are heavily unbalanced
towards the left. This is due to the left-associativity of ++ and also to
the fact that constructors such as docList construct from the let of a
sequence. We would prefer to shift the imbalance to the right to avoid
consuming a lot of stack when we traverse the document *)

It seems like the new function I give constructs more or less the tree you would get after calling flatten?

src/ocamlutil/pretty.ml Show resolved Hide resolved
src/ocamlutil/pretty.ml Show resolved Hide resolved
src/ocamlutil/pretty.ml Outdated Show resolved Hide resolved
@sim642
Copy link
Member

sim642 commented Jan 17, 2025

It seems like the new function I give constructs more or less the tree you would get after calling flatten?

That's true, which maybe is a good thing.
Although it does also have an interesting effect: since flatten is written to do tail-recursion on the left, it does non-tail-recursion on the right. So when flattening a right-leaning tree, it actually goes into the very same deep recursion that the whole flattening process is trying to avoid later when doing the printing.

So it's not clear to me what of the old behavior should be retained (for the sake of just optimizing) and what's actually inconsequential (given how breakString wasn't really thought through for efficiency). Maybe it's not worth preserving all the old behavior after all.

@michael-schwarz
Copy link
Member Author

Given we did not encounter a stack overflow for goblint/analyzer#1513 where I also used an non tail-recursive helper, I would assume that this is also not problematic here.

If we run into any trouble, we can always go back to a more faithful replacement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pretty.text very slow and uses a lot of memory
2 participants