Merge pull request #110 from ecmwf-ifs/naml-remove-claw
CLOUDSC: Purge Loki-CLAW and GPU-CLAW variants from dwarf
reuterbal authored Jan 19, 2025
2 parents b707227 + 8780ec1 commit 767d409
Showing 9 changed files with 3 additions and 2,106 deletions.
9 changes: 0 additions & 9 deletions .github/scripts/verify-targets.sh
@@ -18,10 +18,6 @@ if [[ "$build_flags" == *"--with-gpu"* ]]
then
targets+=(dwarf-cloudsc-gpu-scc dwarf-cloudsc-gpu-scc-hoist dwarf-cloudsc-gpu-scc-k-caching)
targets+=(dwarf-cloudsc-gpu-omp-scc-hoist)
if [[ "$build_flags" == *"--with-claw"* ]]
then
targets+=(dwarf-cloudsc-gpu-claw)
fi
if [[ "$build_flags" == *"--with-field"* ]]
then
targets+=(dwarf-cloudsc-gpu-scc-field)
@@ -35,17 +31,12 @@ fi

if [[ "$build_flags" == *"--with-loki"* ]]
then
targets+=(dwarf-cloudsc-loki-idem dwarf-cloudsc-loki-sca)
targets+=(dwarf-cloudsc-loki-scc dwarf-cloudsc-loki-scc-hoist)
targets+=(dwarf-cloudsc-loki-idem-stack dwarf-cloudsc-loki-scc-stack)
if [[ "$build_flags" != *"--single-precision"* ]]
then
targets+=(dwarf-cloudsc-loki-c)
fi
if [[ "$build_flags" == *"--with-claw"* ]]
then
targets+=(dwarf-cloudsc-loki-claw-cpu dwarf-cloudsc-loki-claw-gpu)
fi
if [[ "$build_flags" == *"--with-cuda"* ]]
then
targets+=(dwarf-cloudsc-loki-scc-cuf-hoist dwarf-cloudsc-loki-scc-cuf-parametrise)
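The hunks above drop every CLAW target from the expected-target list. A minimal sketch of how one might guard against a stale CLAW entry creeping back in is shown below. The function name `has_claw_target` and the example array are assumptions for illustration and are not part of `verify-targets.sh`; only the two target names are taken from the diff context.

```sh
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository: succeed (return 0) if any
# of the given target names still refers to a CLAW variant.
has_claw_target() {
  local t
  for t in "$@"; do
    [[ "$t" == *claw* ]] && return 0
  done
  return 1
}

# Illustrative list using two targets that remain in the diff context above.
targets=(dwarf-cloudsc-gpu-scc dwarf-cloudsc-gpu-omp-scc-hoist)

if has_claw_target "${targets[@]}"; then
  echo "stale CLAW target found"
else
  echo "no CLAW targets"
fi
```

A substring match on `claw` is deliberately coarse; it would flag both the former `dwarf-cloudsc-gpu-claw` and the `dwarf-cloudsc-loki-claw-*` names.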
26 changes: 2 additions & 24 deletions README.md
@@ -37,12 +37,6 @@ Balthasar Reuter ([email protected])
- **dwarf-cloudsc-gpu-kernels**: GPU-enabled version of the CLOUDSC dwarf
that uses OpenACC and relies on the `!$acc kernels` directive to offload
the computational kernel.
- **dwarf-cloudsc-gpu-claw** (deprecated!): GPU-enabled and optimized version of
CLOUDSC that is based on an auto-generated version of CLOUDSC based on the CLAW
tool. The kernel in this demonstrator has been further optimized with gang-level
loop blocking to demonstrate potential performance gains. This variant is defunct
on current Nvidia GPUs and therefore deactivated by default, requiring explicit
`--with-claw` flag to build.
- **dwarf-cloudsc-gpu-scc**: GPU-enabled and optimized version of
CLOUDSC that utilises the native blocked IFS memory layout via a
"single-column coalesced" (SCC) loop layout. Here the outer NPROMA
@@ -191,7 +185,7 @@ device. This can be achieved using the `CUDA_VISIBLE_DEVICES` environment
variable:

```sh
mpirun -np 2 bash -c "CUDA_VISIBLE_DEVICES=\${OMPI_COMM_WORLD_RANK} bin/dwarf-cloudsc-gpu-claw 1 163840 8192"
mpirun -np 2 bash -c "CUDA_VISIBLE_DEVICES=\${OMPI_COMM_WORLD_RANK} bin/dwarf-cloudsc-gpu-scc-stack 1 163840 128"
```
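The command above assigns one device per rank by reusing the Open MPI rank as the device index. When there are more ranks than GPUs on a node, the index would need to wrap around. A minimal sketch of that mapping follows; `pick_device` is a hypothetical helper name and is not provided by the repository.

```sh
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository: map an MPI rank onto one
# of num_gpus devices, wrapping around when ranks outnumber GPUs.
pick_device() {
  local rank=$1 num_gpus=$2
  echo $(( rank % num_gpus ))
}

# Show the mapping for four ranks sharing two GPUs.
for rank in 0 1 2 3; do
  echo "rank ${rank} -> device $(pick_device "${rank}" 2)"
done
```

In a launch wrapper, the result could be exported as `CUDA_VISIBLE_DEVICES` before starting the dwarf binary, with the rank taken from `OMPI_COMM_WORLD_RANK` as in the command above.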

### Choosing between HDF5 and Serialbox input file format
@@ -349,8 +343,7 @@ source-to-source translation tool that allows us to create bespoke
transformations for the IFS to target and experiment with emerging HPC
architectures and programming models. We use the CLOUDSC dwarf as a demonstrator
for targeted transformation capabilities of physics and grid point computations
kernels, including conversion to C and GPU, directly or via downstream tools
like CLAW.
kernels, including conversion to C and GPU.

The following build flags enable the demonstrator build targets on the
ECMWF Atos HPC facility's GPU partition:
@@ -368,21 +361,10 @@ The following Loki modes are included in the dwarf, each with a bespoke demonstr
- **cloudsc-loki-sca**: Pure single-column mode that strips all horizontal
vector loops from the kernel and introduces an outer "column-loop"
at the driver level.
- **cloudsc-loki-claw-cpu** (deprecated): Same as SCA, but also adds the
necessary CLAW annotations. The resulting cloudsc.claw.F90 file is then
processed by CLAW to re-insert vector loops for optimal CPU execution.
- **cloudsc-loki-claw-gpu** (deprecated): Creates the same CLAW-ready kernel
file, but triggers the GPU-specific optimizations in the CLAW compiler to insert
OpenACC-offload instructions in the driver and an OpenACC parallel loop inside
the kernel for each block. This needs to be run with large block sizes (eg.
NPROMA=1024-8192).
- **cloudsc-loki-c**: A prototype C transpilation pipeline that converts
the kernel to C and calls it via iso_c_bindings interfaces from the
driver.

To enable the deprecated and, on GPU, defunct CLAW variants, the build-flag
`--with-claw` needs to be specified explicitly.

## Python-driven CLOUDSC variants
The following partly or fully Python-based CLOUDSC are available:
- **cloudsc-python**: GT4PY based Python-only implementation. Refer to `src/cloudsc_python`
@@ -402,10 +384,6 @@ Loki currently supports three frontends to parse the Fortran source code:
- [FParser](https://github.com/stfc/fparser) (`loki-frontend=fp`):
The preferred default; developed by STFC for PsyClone.
- [OMNI](https://github.com/omni-compiler/omni-compiler) frontend (`loki-frontend=omni`):
Generates the same AST as used by CLAW.
- [OFP](https://github.com/OpenFortranProject/open-fortran-parser),
a Python wrapper around the ROSE frontend (`loki-frontend=ofp`):
Supported, but bugged in some places and slow; use with care.

For completeness, all three frontends are tested in our CI, which
means we require the `.xmod` module description files for utility
16 changes: 0 additions & 16 deletions bundle.yml
@@ -156,18 +156,10 @@ options :
ENABLE_CLOUDSC_LOKI=ON
LOKI_ENABLE_NO_INSTALL=OFF
- with-claw :
help : Enable deprecated (and defunct) CLAW-generated variants
cmake : >
ENABLE_CLOUDSC_GPU_CLAW=ON
ENABLE_CLOUDSC_LOKI_CLAW=ON
LOKI_ENABLE_CLAW=ON
- without-loki-install :
help : Skip installation of Loki (Requires Loki to be on the PATH)
cmake : >
LOKI_ENABLE_NO_INSTALL=ON
LOKI_ENABLE_CLAW=OFF
- loki-frontend :
help : Frontend parser to use for Loki transformations
@@ -213,18 +205,10 @@ options :
help : Build the C version of CLOUDSC [ON|OFF]
cmake : ENABLE_CLOUDSC_C={{value}}

- cloudsc-gpu-claw :
help : Build the deprecated CLAW-based GPU version CLOUDSC [ON|OFF]
cmake : ENABLE_CLOUDSC_GPU_CLAW={{value}}

- cloudsc-loki :
help : Build the optimized Loki-based GPU version CLOUDSC [ON|OFF]
cmake : ENABLE_CLOUDSC_LOKI={{value}}

- cloudsc-loki-claw :
help : Build the deprecated Loki+CLAW-based GPU version CLOUDSC [ON|OFF]
cmake : ENABLE_CLOUDSC_LOKI_CLAW={{value}}

- cloudsc-python-f2py :
help : Enable dedicated pure Python variant of CLOUDSC [ON|OFF]
cmake : ENABLE_CLOUDSC_PYTHON_F2PY={{value}}
28 changes: 0 additions & 28 deletions src/cloudsc_gpu/CMakeLists.txt
@@ -12,11 +12,6 @@ ecbuild_add_option( FEATURE CLOUDSC_GPU_KERNELS
CONDITION Serialbox_FOUND OR HDF5_FOUND
)

# Define the CLAW-based GPU dwarf variant as an ECBuild feature
ecbuild_add_option( FEATURE CLOUDSC_GPU_CLAW
DESCRIPTION "Build optimized GPU version of CLOUDSC derived from CLAW using OpenACC" DEFAULT OFF
CONDITION Serialbox_FOUND OR HDF5_FOUND
)

ecbuild_add_option( FEATURE CLOUDSC_GPU_SCC
DESCRIPTION "Build optimized GPU version of CLOUDSC using SCC layout and OpenACC" DEFAULT OFF
@@ -74,29 +69,6 @@ ecbuild_add_option( FEATURE CLOUDSC_GPU_SCC_FIELD
)


if( HAVE_CLOUDSC_GPU_CLAW )
ecbuild_add_executable(
TARGET dwarf-cloudsc-gpu-claw
SOURCES
dwarf_cloudsc_gpu.F90
cloudsc_driver_gpu_claw_mod.F90
cloudsc.claw.gpu.F90
LIBS
cloudsc-common-lib
DEFINITIONS ${CLOUDSC_DEFINITIONS} CLOUDSC_GPU_CLAW
)

ecbuild_add_test(
TARGET dwarf-cloudsc-gpu-claw-serial
COMMAND bin/dwarf-cloudsc-gpu-claw
ARGS 1 1280 128
WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/../../..
OMP 1
ENABLED OFF # CLAW variant is currently broken
)
endif()


if( HAVE_CLOUDSC_GPU_SCC )
ecbuild_add_executable(
TARGET dwarf-cloudsc-gpu-scc