Use flambda compiler optimization #928

mrstanb · 2022-11-24T05:48:45Z

As the title says, we've been working on adding flambda optimization.
For this, changes were made to the following files:

make.sh
goblint.opam.locked
README.md

Please note that flambda is only supported by ocamlopt, so all compiler optimization flags are passed only as ocamlopt_flags.

mrstanb · 2022-11-24T05:53:48Z

Quick note about tests and checking if semantics were compromised with flambda:

I ran make test once with the original setup (i.e., after running make setup) -> everything went through smoothly
I ran make test once with the flambda setup (i.e., after running make setup-flambda) -> everything went through smoothly here as well

Hence, I presume that if make test works in both cases, then we shouldn't have messed up anything semantics-wise.

sim642 · 2022-11-24T08:09:08Z

Thanks for the PR! There are currently some merge conflicts so the diff also shows some unrelated things that are actually already on master.

Is this just to add the flambda setup or is there already also evidence that flambda indeed provides some speedup?

michael-schwarz · 2022-11-24T08:22:40Z

is there already also evidence that flambda indeed provides some speedup?

Iirc there was a gain of about 1/4 for the SQLite amalgamation when testing locally, but I am sure @mrstanb et al will have the exact numbers.

mrstanb · 2022-11-24T09:06:10Z

Thanks for the PR! There are currently some merge conflicts so the diff also shows some unrelated things that are actually already on master.

Is this just to add the flambda setup or is there already also evidence that flambda indeed provides some speedup?

There are indeed some performance improvements. My teammates and I will make sure to post a follow-up with some stats.

mrstanb · 2022-11-24T09:06:47Z

Iirc there was a gain of about 1/4 for the SQLite amalgamation when testing locally, but I am sure @mrstanb et al will have the exact numbers.

Indeed, we'll make sure to post the different results that we had as a follow-up.

mrstanb · 2022-11-24T09:13:04Z

From my side the stats are as follows:

Running Goblint with the base configuration, no optimizations and with the SQLite amalgamation, I got a walltime of 8073.680s
Running Goblint with the base configuration, with flambda flag -O3 and with the SQLite amalgamation, I got a walltime of 6877.092s
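For reference, the relative improvement implied by these two walltimes can be worked out with a short Python sketch. The variable names are illustrative and not part of Goblint; only the two measurements are taken from the comment above.

```python
# Walltimes in seconds, copied from the two measurements above.
baseline_s = 8073.680    # base configuration, no optimizations
flambda_o3_s = 6877.092  # base configuration, flambda with -O3

# Fraction of the baseline walltime saved by the flambda build.
reduction = (baseline_s - flambda_o3_s) / baseline_s

# How many times faster the flambda build ran.
speedup = baseline_s / flambda_o3_s

print(f"walltime reduction: {reduction:.1%}")
print(f"speedup factor: {speedup:.2f}x")
```

On these numbers the reduction is a little under 15% of the baseline walltime, which is somewhat less than the "about 1/4" recalled earlier in the thread.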

jerhard

Thanks for the PR! It's interesting to see that activating flambda already has quite some impact on performance.

Now, with this additional compiler option, it would be great to extend our GitHub workflows with one that runs all the regression tests nightly with the flambda switch. To add such a workflow, you may add a file in .github/workflows that contains the regression job, which is already present in both the locked.yml (this uses a fixed version of compiler and dependencies) and unlocked.yml files.

sim642 · 2022-11-25T15:48:57Z

To add such a workflow, you may add a file in .github/workflows that contains the regression job, which is already present in both the locked.yml (this uses a fixed version of compiler and dependencies) and unlocked.yml files.

There's no need to make a new workflow altogether. It suffices to just add one extra compiler version to the unlocked matrix.
A locked version isn't possible with flambda anyway, because the lock file is locked into a non-flambda compiler.

mrstanb · 2022-12-04T15:05:14Z

I updated the .github/workflows/unlocked.yml file to include the OCaml compiler version 4.14.x+options. However, I'm not sure whether this change is sufficient, since I saw that the Downgrade dependencies step of the lower-bounds-downgrade job uses ocaml-base-compiler, which is not the same as the ocaml-variants compiler that we use with flambda.
Hence, my question: should there be some further changes to the unlocked.yml file? Also, should we specify the ocaml-option-flambda setting somewhere, as we do in the goblint.opam.locked file? Otherwise, flambda wouldn't be enabled in the compiler and thus we might remain in the same position as originally.

sim642 · 2022-12-04T16:39:11Z

I think just +options does nothing on its own. You need to actually specify the options like the setup-ocaml documentation shows: https://github.com/ocaml/setup-ocaml#inputs.

The lower bounds jobs are completely separate and need not be modified.

nathanschmidt · 2022-12-11T17:33:45Z

TLDR: After testing the effects of different Flambda flags, we opted for -O3 without any further options. This gives a speed-up of approx. 15%. Specifying further inlining options gave mixed results, with run-times alternately improving or worsening depending on the system, which is why my teammates and I decided against them.

We started out by running Goblint with different flags over the SQLite Amalgamation. On our private machines, the flag combination -O3 -inline-toplevel=400 -inline-max-depth=1 -inline-max-unroll=0 seemed to be the most promising choice.

To confirm those results, we then proceeded with benchmarking on the CoolMUC-2 HPC of the LRZ in order to achieve reproducible results. Using this configuration, the following measurements were made (format: hours:minutes:seconds):

No optimizations: 02:37:47
Flambda with -O2: 02:22:28
Flambda with -O3: 02:14:31
Flambda with -O3 -inline-toplevel=400 -inline-max-depth=1 -inline-max-unroll=0: 02:20:23

The runs on the HPC-cluster confirmed our assumption of a noticeable time benefit of using Flambda. However, using just -O3 was the fastest configuration.
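As a rough check of the "approx. 15%" figure, the HPC measurements above can be converted to seconds and compared against the unoptimized run. This is only a sketch: the helper `to_seconds` and the dictionary keys are illustrative names, and the times are copied from the list above.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'hours:minutes:seconds' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

# SQLite amalgamation run-times on the HPC, as listed above.
sqlite_hpc = {
    "no optimizations": "02:37:47",
    "-O2": "02:22:28",
    "-O3": "02:14:31",
    "-O3 + inlining options": "02:20:23",
}

baseline = to_seconds(sqlite_hpc["no optimizations"])

# Run-time reduction of each configuration relative to the baseline.
reductions = {
    name: (baseline - to_seconds(hms)) / baseline
    for name, hms in sqlite_hpc.items()
}

for name, value in reductions.items():
    print(f"{name}: {value:.1%}")
```

With these inputs, -O3 alone saves a little under 15% of the baseline run-time, while -O2 and the inlining variant each save less.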

To make a final decision on which flags to use, we continued benchmarking with the coreutils programs from the bench repository. We activated some ana.int.* domains to see whether the ordering of the combinations listed above changed significantly.

On our local systems, we recorded these results: coreutils_benchmarks.md. The fastest configuration, again contradicting what we had observed on the HPC, was -O3 -inline-toplevel=400 -inline-max-depth=1 -inline-max-unroll=0.

We then ran further batch jobs on the LRZ cluster with ana.int.interval, ana.int.def_exc, ana.int.enums and ana.int.congruence enabled. Only one CPU was used to obtain longer run-times, which makes it easier to distinguish between the measurements. The cumulative run-times over all coreutils programs were:

No optimizations: 00:03:00
Flambda with -O2: 00:02:36
Flambda with -O3: 00:02:24
Flambda with -O3 -inline-toplevel=400 -inline-max-depth=1 -inline-max-unroll=0: 00:02:33

Once more, on the HPC, the fastest choice was just -O3. Therefore, as no clear advantage of activating the additional inlining options can be established, we decided that it would be wisest to stick with just -O3, which is now reflected in our PR.
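One way to see that the two HPC benchmarks agree is to rank the configurations by run-time in each and compare the orderings. The sketch below does this with the numbers reported in this comment; the helper names `to_seconds` and `ranking` are made up for illustration.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'hours:minutes:seconds' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

# HPC run-times for the SQLite amalgamation.
sqlite = {
    "no optimizations": "02:37:47",
    "-O2": "02:22:28",
    "-O3": "02:14:31",
    "-O3 + inlining options": "02:20:23",
}

# Cumulative HPC run-times over all coreutils programs.
coreutils = {
    "no optimizations": "00:03:00",
    "-O2": "00:02:36",
    "-O3": "00:02:24",
    "-O3 + inlining options": "00:02:33",
}

def ranking(times: dict) -> list:
    """Return configuration names ordered from fastest to slowest."""
    return sorted(times, key=lambda name: to_seconds(times[name]))

sqlite_ranking = ranking(sqlite)
coreutils_ranking = ranking(coreutils)

print("SQLite:   ", sqlite_ranking)
print("coreutils:", coreutils_ranking)
print("same ordering:", sqlite_ranking == coreutils_ranking)
```

Both rankings put plain -O3 first and the unoptimized build last, so on the HPC the ordering does not depend on which of the two benchmarks is used.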

jerhard · 2022-12-14T14:27:41Z

What are the build times on your machines with the flambda compiler, both with the regular make and with the release profile (make release), compared to the non-flambda compiler switch? That information would help us decide whether we should only keep the flambda switch, or keep it as a separate option.

nathanschmidt · 2022-12-21T11:13:43Z

I've benchmarked make release both with Goblint as in the master branch and with flambda -O3 as in this PR. Here are the results for 5 builds each:

No flambda: AVERAGE = 9,777s

9,681s
9,792s
9,812s
9,731s
9,869s

Flambda with -O3: AVERAGE = 26,075s

26,296s
26,001s
25,992s
26,183s
25,903s
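The reported averages and the resulting slowdown for make release can be recomputed as follows. This sketch assumes that the comma in values such as 9,681s is a decimal separator; `parse_seconds` is a hypothetical helper written for this example.

```python
from statistics import mean

def parse_seconds(text: str) -> float:
    """Parse a value such as '9,681s', assuming a decimal comma."""
    return float(text.rstrip("s").replace(",", "."))

# Build times for `make release`, copied from the lists above.
no_flambda = [parse_seconds(t) for t in
              ["9,681s", "9,792s", "9,812s", "9,731s", "9,869s"]]
flambda_o3 = [parse_seconds(t) for t in
              ["26,296s", "26,001s", "25,992s", "26,183s", "25,903s"]]

avg_no_flambda = mean(no_flambda)
avg_flambda_o3 = mean(flambda_o3)

# Factor by which the flambda -O3 release build is slower.
slowdown = avg_flambda_o3 / avg_no_flambda

print(f"no flambda: {avg_no_flambda:.3f}s")
print(f"flambda -O3: {avg_flambda_o3:.3f}s")
print(f"slowdown: {slowdown:.2f}x")
```

Under that reading of the numbers, the recomputed averages match the reported ones and the release build is a bit more than 2.5 times slower with flambda -O3.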

sim642 · 2022-12-21T12:59:55Z

Thanks! Could you also get the numbers for development build (make)? Hopefully there the slowdown isn't as significant, because it would be slightly annoying.

nathanschmidt · 2022-12-21T14:05:59Z

So for make:

No flambda: AVERAGE = 9,175s

9,148s
9,245s
9,092s
9,180s
9,208s

Flambda with -O3: AVERAGE = 13,829s

13,845s
13,765s
13,864s
13,844s
13,828s

michael-schwarz · 2022-12-21T15:10:56Z

I think this sort of slowdown for make is acceptable here.

michael-schwarz · 2022-12-21T15:12:28Z

I would suggest given these numbers that we should default to flambda. @sim642 @stilscher @jerhard 👍 or 👎 ?

sim642 · 2022-12-21T15:48:06Z

Sure, and actually there's probably no need to have -O3 in the dev profile anyway, because the dev profile disables cross-module optimizations. Dropping the flag there probably brings the development mode overhead down even further.

jerhard · 2022-12-21T17:53:58Z

Yes, I would also be in favor of making flambda the default and removing the -O3 flag for the dev profile.

nathanschmidt · 2022-12-29T18:31:20Z

In my latest commit, I removed the -O3 flag for make, which led to slightly improved compilation times:

Flambda with no options: AVERAGE = 12,200s

12,487s
11,876s
12,164s
12,482s
11,990s
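To put the development build numbers side by side, the overhead of each flambda variant relative to the non-flambda build can be computed from the reported averages. The sketch below assumes a decimal comma in the reported values (9,175s, 13,829s and 12,200s); the dictionary keys are illustrative labels.

```python
# Reported average `make` build times in seconds (decimal comma assumed).
dev_build_avg = {
    "no flambda": 9.175,
    "flambda, -O3": 13.829,
    "flambda, no options": 12.200,
}

baseline = dev_build_avg["no flambda"]

# Extra build time of each flambda variant relative to the baseline.
overhead = {
    name: avg / baseline - 1
    for name, avg in dev_build_avg.items()
    if name != "no flambda"
}

for name, value in overhead.items():
    print(f"{name}: +{value:.1%}")
```

On these averages, dropping -O3 from the dev profile reduces the overhead from roughly half of the baseline build time to roughly a third.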

Furthermore, I made the ocaml-variants.4.14.0+options compiler with flambda enabled the default (make setup). To still be able to use ocaml-base-compiler if desired, I added the make setup-base command.

sim642 self-requested a review November 24, 2022 08:04

sim642 added student-job performance Analysis time, memory usage setup Dependencies, CI, releasing labels Nov 24, 2022

sxprz and others added 6 commits November 24, 2022 11:29

Update README.md

52c7671

Update README.md

40b919b

Update make.sh to also install ocaml with flambda

acfcba0

Update make.sh for the right flambda switch + update goblint.opam.locked

cb20ac4

Remove eval from opam_setup_flambda

74c4df2

Update README with flambda installation info

bb6fcf1

adelavais force-pushed the use-flambda-compiler branch from 685e112 to bb6fcf1 Compare November 24, 2022 09:53

jerhard reviewed Nov 25, 2022

goblint.opam (outdated, resolved)

mrstanb and others added 3 commits November 26, 2022 22:30

Remove v2.9 for dune from goblint.opam

1f9674d

add flambda flags

4b81638

Add the flambda compiler version to unlocked.yml

fcda025

nathanschmidt added 2 commits December 5, 2022 16:19

Add ocaml-option-flambda to 4.14.0+options compiler in unlocked.yml

5f3f798

Update flambda flags to just -O3

cfc45e4

sim642 requested changes Dec 12, 2022

goblint.opam.locked (resolved)

README.md (outdated, resolved)

sim642 mentioned this pull request Dec 15, 2022

Add OCaml and library versions to --version #948

Merged

1 task

mrstanb added 2 commits December 20, 2022 09:32

Fix conflict with upstream README

cf0d5c4

Merge branch 'master' into use-flambda-compiler

00ac11c

adelavais mentioned this pull request Dec 21, 2022

Optimization: monomorphization #956

Merged

Added -O3 flag to dev env

4671b70

Removed -O3 from dev env, made flambda default, added make-base for setup with base compiler

fb0b72e

sim642 reviewed Jan 2, 2023

src/dune (outdated, resolved)

README.md (outdated, resolved)

README.md (outdated, resolved)

make.sh (outdated, resolved)

make.sh (outdated, resolved)

montrie mentioned this pull request Jan 9, 2023

Iter optimization #964

Closed

mrstanb and others added 6 commits January 10, 2023 10:06

Add -unused-functor-parameter warning in src/dune

a299dfd

Co-authored-by: Simmo Saan <[email protected]>

Update README.md

3e222e0

Co-authored-by: Simmo Saan <[email protected]>

Fix autoformatted README

61ad751

Remove setup-base and use only flambda with -locked

7013805

Merge branch 'use-flambda-compiler' of github.com:adelavais/analyzer into use-flambda-compiler

3c75d3e

Fix whitespace in README

0412216

sim642 self-requested a review January 11, 2023 06:50

sim642 added 3 commits January 14, 2023 16:32

Fix flambda conflict in locked workflows

4f9acc1

Fix flambda compiler name for setup-ocaml

b7e6794

Use flambda option in gobview

331ff8c

sim642 approved these changes Jan 16, 2023

View reviewed changes

sim642 added this to the v2.2.0 milestone Jan 16, 2023

sim642 merged commit 9ba891c into goblint:master Jan 16, 2023

montrie mentioned this pull request Jan 27, 2023

Iter optimization clean up #974

Closed
