[docs] document how to use the Threaded parallel scheme (#821)
odow authored Jan 22, 2025
1 parent aec55e2 commit 1730279
Showing 1 changed file with 8 additions and 119 deletions.
127 changes: 8 additions & 119 deletions docs/src/guides/improve_computational_performance.md
@@ -58,38 +58,16 @@ SDDP.train(model; cut_type = SDDP.MULTI_CUT)

## Parallelism

SDDP.jl can take advantage of the parallel nature of modern computers to solve problems
across multiple cores.
SDDP.jl can take advantage of the parallel nature of modern computers to solve
problems across multiple threads.

!!! info
We highly recommend that you read the Julia manual's section on [parallel computing](https://docs.julialang.org/en/v1/manual/parallel-computing/).

You can start Julia from a command line with `N` processors using the `-p` flag:
Start Julia from a command line with `N` threads using the `--threads` flag:
```julia
julia -p N
julia --threads N
```
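
As a quick sanity check (a minimal sketch, not part of the original guide), you can confirm from inside the session how many threads Julia actually started with. It uses only `Threads.nthreads()` from Julia's standard `Base.Threads` module; the warning text is illustrative.

```julia
# Query the number of threads available to this Julia session.
n = Threads.nthreads()
println("Julia is running with ", n, " thread(s)")

# If only one thread is available, the `--threads` flag was probably not set.
if n == 1
    @warn "Only one thread is available; restart Julia with `--threads N`."
end
```

If this reports a single thread, the `--threads` flag was likely omitted when Julia was launched.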

Alternatively, you can use the `Distributed.jl` package:
```julia
using Distributed
Distributed.addprocs(N)
```

!!! warning
Workers **DON'T** inherit their parent's Pkg environment. Therefore, if you started
Julia with `--project=/path/to/environment` (or if you activated an environment from the
REPL), you will need to put the following at the top of your script:
```julia
using Distributed
@everywhere begin
import Pkg
Pkg.activate("/path/to/environment")
end
```

Currently SDDP.jl supports two parallel schemes, [`SDDP.Serial`](@ref) and
[`SDDP.Asynchronous`](@ref). Instances of these parallel schemes should be passed to the
`parallel_scheme` argument of [`SDDP.train`](@ref) and [`SDDP.simulate`](@ref).
Then, pass an instance of [`SDDP.Threaded`](@ref) to the `parallel_scheme`
argument of [`SDDP.train`](@ref) and [`SDDP.simulate`](@ref).

```julia
using SDDP, HiGHS
@@ -99,95 +77,6 @@ model = SDDP.LinearPolicyGraph(
@variable(sp, x >= 0, SDDP.State, initial_value = 1)
@stageobjective(sp, x.out)
end
SDDP.train(model; iteration_limit = 10, parallel_scheme = SDDP.Asynchronous())
SDDP.simulate(model, 10; parallel_scheme = SDDP.Asynchronous())
```

There is a large overhead for using the asynchronous solver. Even if you choose asynchronous
mode, SDDP.jl will start in serial mode while the initialization takes place. Therefore, in
the log you will see that the initial iterations take place on the master thread
(`Proc. ID = 1`), and it is only after a while that the solve switches to full parallelism.

!!! info
Because of the large data communication requirements (all cuts have to be shared with
all other cores), the solution time will not scale linearly with the number of cores.

!!! info
Given the same number of iterations, the policy obtained from asynchronous mode will be
_worse_ than the policy obtained from serial mode. However, the asynchronous solver can
take significantly less time to compute the same number of iterations.

### Data movement

By default, data defined on the master process is not made available to the workers.
Therefore, a model like the following:
```julia
data = 1
model = SDDP.LinearPolicyGraph(stages = 2, lower_bound = 0) do sp, t
@variable(sp, x >= 0, SDDP.State, initial_value = data)
@stageobjective(sp, x.out)
end
```
will result in an `UndefVarError` error like `UndefVarError: data not defined`.

There are three solutions for this problem.

#### Option 1: declare data inside the build function

```julia
model = SDDP.LinearPolicyGraph(stages = 2) do sp, t
data = 1
@variable(sp, x >= 0, SDDP.State, initial_value = 1)
@stageobjective(sp, x)
end
```

#### Option 2: use `@everywhere`

```julia
@everywhere begin
data = 1
end
model = SDDP.LinearPolicyGraph(stages = 2) do sp, t
@variable(sp, x >= 0, SDDP.State, initial_value = 1)
@stageobjective(sp, x)
end
```

#### Option 3: build the model in a function

```julia
function build_model()
data = 1
return SDDP.LinearPolicyGraph(stages = 2) do sp, t
@variable(sp, x >= 0, SDDP.State, initial_value = 1)
@stageobjective(sp, x)
end
end

model = build_model()
```

### Initialization hooks

!!! warning
This is important if you use Gurobi!

[`SDDP.Asynchronous`](@ref) accepts a pre-processing hook that is run on each
worker process _before_ the model is solved. The most useful situation is for
solvers that need an initialization step. A good example is Gurobi, which can
share an environment amongst all models on a worker. Notably, this environment
**cannot** be shared amongst workers, so defining one environment at the top of
a script will fail!

To initialize a new environment on each worker, use the following:

```julia
SDDP.train(
model;
parallel_scheme = SDDP.Asynchronous() do m::SDDP.PolicyGraph
env = Gurobi.Env()
set_optimizer(m, () -> Gurobi.Optimizer(env))
end,
)
SDDP.train(model; iteration_limit = 10, parallel_scheme = SDDP.Threaded())
SDDP.simulate(model, 10; parallel_scheme = SDDP.Threaded())
```
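
Because the hunk above elides part of the model definition, here is a self-contained sketch of the same workflow. The constructor arguments (`stages = 2`, `lower_bound = 0.0`, `optimizer = HiGHS.Optimizer`) are assumptions chosen to make the example complete, and falling back to `SDDP.Serial()` when only one thread is available is an illustrative design choice, not something the commit prescribes.

```julia
using SDDP, HiGHS

# A toy two-stage model. The keyword arguments are assumptions made for
# illustration; they are not taken from the elided lines of the diff.
model = SDDP.LinearPolicyGraph(;
    stages = 2,
    lower_bound = 0.0,
    optimizer = HiGHS.Optimizer,
) do sp, t
    @variable(sp, x >= 0, SDDP.State, initial_value = 1)
    @stageobjective(sp, x.out)
end

# Illustrative choice: use the threaded scheme only if Julia was started
# with more than one thread; otherwise fall back to the serial scheme.
scheme = Threads.nthreads() > 1 ? SDDP.Threaded() : SDDP.Serial()

SDDP.train(model; iteration_limit = 10, parallel_scheme = scheme)
simulations = SDDP.simulate(model, 10; parallel_scheme = scheme)
```

Run the script with, for example, `julia --threads 4 script.jl` (the file name is hypothetical) to exercise the threaded branch.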
