Experiment with dreamerv3 on polyburn
Showing 101 changed files with 14,892 additions and 16 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
*.py[cod] | ||
__pycache__/ | ||
dist |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
.pytest_cache | ||
dist | ||
__pycache__/ | ||
*.py[cod] | ||
*.egg-info | ||
MUJOCO_LOG.TXT | ||
; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
Copyright (c) 2023 Danijar Hafner | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in all | ||
copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
SOFTWARE. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
include dreamerv3/requirements.txt |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,127 @@ | ||
# Mastering Diverse Domains through World Models | ||
|
||
A reimplementation of [DreamerV3][paper], a scalable and general reinforcement | ||
learning algorithm that masters a wide range of applications with fixed | ||
hyperparameters. | ||
|
||
![DreamerV3 Tasks](https://user-images.githubusercontent.com/2111293/217647148-cbc522e2-61ad-4553-8e14-1ecdc8d9438b.gif) | ||
|
||
If you find this code useful, please cite it in your paper: | ||
|
||
``` | ||
@article{hafner2023dreamerv3, | ||
title={Mastering Diverse Domains through World Models}, | ||
author={Hafner, Danijar and Pasukonis, Jurgis and Ba, Jimmy and Lillicrap, Timothy}, | ||
journal={arXiv preprint arXiv:2301.04104}, | ||
year={2023} | ||
} | ||
``` | ||
|
||
To learn more: | ||
|
||
- [Research paper][paper] | ||
- [Project website][website] | ||
- [Twitter summary][tweet] | ||
|
||
## DreamerV3 | ||
|
||
DreamerV3 learns a world model from experience and uses it to train an | ||
actor-critic policy from imagined trajectories. The world model encodes sensory | ||
inputs into categorical representations and predicts future representations and | ||
rewards given actions. | ||
|
||
![DreamerV3 Method Diagram](https://user-images.githubusercontent.com/2111293/217355673-4abc0ce5-1a4b-4366-a08d-64754289d659.png) | ||
|
||
DreamerV3 masters a wide range of domains with a fixed set of hyperparameters, | ||
outperforming specialized methods. Removing the need for tuning reduces the | ||
amount of expert knowledge and computational resources needed to apply | ||
reinforcement learning. | ||
|
||
![DreamerV3 Benchmark Scores](https://github.com/danijar/dreamerv3/assets/2111293/0fe8f1cf-6970-41ea-9efc-e2e2477e7861) | ||
|
||
Due to its robustness, DreamerV3 shows favorable scaling properties. Notably, | ||
using larger models consistently increases not only its final performance but | ||
also its data-efficiency. Increasing the number of gradient steps further | ||
increases data efficiency. | ||
|
||
![DreamerV3 Scaling Behavior](https://user-images.githubusercontent.com/2111293/217356063-0cf06b17-89f0-4d5f-85a9-b583438c98dd.png) | ||
|
||
# Instructions | ||
|
||
The code has been tested on Linux and Mac and requires Python 3.11+. | ||
|
||
## Docker | ||
|
||
You can either use the provided `Dockerfile`, which lists setup instructions in | ||
its header comments, or follow the manual instructions below. | ||
|
||
## Manual | ||
|
||
Install [JAX][jax] and then the other dependencies: | ||
|
||
```sh | ||
pip install -U -r embodied/requirements.txt | ||
pip install -U -r dreamerv3/requirements.txt \ | ||
-f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html | ||
``` | ||
|
||
Simple training script: | ||
|
||
```sh | ||
python example.py | ||
``` | ||
|
||
Flexible training script: | ||
|
||
```sh | ||
python dreamerv3/main.py \ | ||
--logdir ~/logdir/{timestamp} \ | ||
--configs crafter \ | ||
--run.train_ratio 32 | ||
``` | ||
|
||
To reproduce results, train on the desired task using the corresponding config, | ||
such as `--configs atari --task atari_pong`. | ||
|
||
# Tips | ||
|
||
- All config options are listed in `configs.yaml` and you can override them | ||
as flags from the command line. | ||
- The `debug` config block reduces the network size, batch size, duration | ||
between logs, and so on for fast debugging (but does not learn a good model). | ||
- By default, the code tries to run on GPU. You can switch to CPU or TPU using | ||
the `--jax.platform` flag, for example `--jax.platform cpu`. | ||
- You can use multiple config blocks that will override defaults in the | ||
order they are specified, for example `--configs crafter size50m`. | ||
- By default, metrics are printed to the terminal, appended to a JSON lines | ||
file, and written as TensorBoard summaries. Other outputs like WandB can be | ||
enabled in the training script. | ||
- If you get a `Too many leaves for PyTreeDef` error, it means you're | ||
reloading a checkpoint that is not compatible with the current config. This | ||
often happens when reusing an old logdir by accident. | ||
- If you are getting CUDA errors, scroll up because the cause is often just an | ||
error that happened earlier, such as out of memory or incompatible JAX and | ||
CUDA versions. Try `--batch_size 1` to rule out an out of memory error. | ||
- Many environments are included, some of which require installing additional | ||
packages. See the `Dockerfile` for reference. | ||
- When running on custom environments, make sure to specify the observation | ||
keys the agent should be using via the `enc.spaces` and `dec.spaces` regex | ||
patterns. | ||
- To log metrics from environments without showing them to the agent or storing | ||
them in the replay buffer, return them as observation keys with a `log_` prefix | ||
and enable logging via the `run.log_keys_...` options. | ||
- To continue stopped training runs, simply run the same command line again and | ||
make sure that the `--logdir` points to the same directory. | ||
|
||
# Disclaimer | ||
|
||
This repository contains a reimplementation of DreamerV3 based on the open | ||
source DreamerV2 code base. It is unrelated to Google or DeepMind. The | ||
implementation has been tested to reproduce the official results on a range of | ||
environments. | ||
|
||
[jax]: https://github.com/google/jax#pip-installation-gpu-cuda | ||
[paper]: https://arxiv.org/pdf/2301.04104v1.pdf | ||
[website]: https://danijar.com/dreamerv3 | ||
[tweet]: https://twitter.com/danijarh/status/1613161946223677441 | ||
[example]: https://github.com/danijar/dreamerv3/blob/main/example.py |
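Two of the README tips above (config blocks overriding defaults in the order given, and regex patterns selecting observation keys while `log_` keys stay hidden from the agent) can be illustrated with a minimal Python sketch. This is not the repository's implementation: the helper names `merge_config_blocks` and `select_keys`, the flat dictionary layout, the use of full-string regex matching, and every config value below are assumptions made for illustration only.

```python
import re


def merge_config_blocks(blocks, names, overrides=None):
    """Apply named config blocks in order, then explicit overrides.

    Hypothetical helper: assumes a flat dict per block, whereas the real
    configs.yaml may be nested.
    """
    config = dict(blocks["defaults"])
    for name in names:
        config.update(blocks[name])  # later blocks win over earlier ones
    config.update(overrides or {})   # explicit flags win over every block
    return config


def select_keys(obs_keys, pattern):
    """Return observation keys matching a regex, skipping `log_` keys.

    Hypothetical helper: assumes the pattern must match the whole key.
    """
    regex = re.compile(pattern)
    return [
        key for key in obs_keys
        if not key.startswith("log_") and regex.fullmatch(key)
    ]


# Illustrative values only; they are not taken from configs.yaml.
blocks = {
    "defaults": {"task": "dummy", "batch_size": 16, "units": 1024},
    "crafter": {"task": "crafter_reward"},
    "size50m": {"units": 512},
}

# Mirrors `--configs crafter size50m --batch_size 1`.
config = merge_config_blocks(blocks, ["crafter", "size50m"], {"batch_size": 1})

# Keys the agent would see under the assumed pattern.
agent_keys = select_keys(["image", "vector", "log_achievements"], r"image|vector")
```

In this sketch, swapping the order of the block names only matters when two blocks set the same key, which is the situation the ordering rule in the tips is meant to resolve.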
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,75 @@ | ||
# Instructions | ||
# | ||
# 1) Test setup: | ||
# | ||
# docker run -it --rm --gpus all --privileged <base image> \ | ||
# sh -c 'ldconfig; nvidia-smi' | ||
# | ||
# 2) Start training: | ||
# | ||
# docker build -f dreamerv3/Dockerfile -t img . && \ | ||
# docker run -it --rm --gpus all -v ~/logdir/docker:/logdir img \ | ||
# sh -c 'ldconfig; sh embodied/scripts/xvfb_run.sh python dreamerv3/main.py \ | ||
# --logdir "/logdir/{timestamp}" --configs atari --task atari_pong' | ||
# | ||
# 3) See results: | ||
# | ||
# tensorboard --logdir ~/logdir/docker | ||
# | ||
|
||
# System | ||
FROM nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04 | ||
ENV DEBIAN_FRONTEND=noninteractive | ||
ENV TZ=America/Los_Angeles | ||
ENV PYTHONUNBUFFERED=1 | ||
ENV PIP_NO_CACHE_DIR=1 | ||
ENV PIP_ROOT_USER_ACTION=ignore | ||
RUN apt-get update && apt-get install -y \ | ||
ffmpeg git vim curl software-properties-common \ | ||
libglew-dev x11-xserver-utils xvfb \ | ||
&& apt-get clean | ||
|
||
# Workdir | ||
RUN mkdir /app | ||
WORKDIR /app | ||
|
||
# Python | ||
RUN add-apt-repository ppa:deadsnakes/ppa | ||
RUN apt-get update && apt-get install -y python3.11-dev python3.11-venv && apt-get clean | ||
RUN python3.11 -m venv ./venv --upgrade-deps | ||
ENV PATH="/app/venv/bin:$PATH" | ||
RUN pip install --upgrade pip setuptools | ||
|
||
# Envs | ||
COPY embodied/scripts/install-minecraft.sh . | ||
RUN sh install-minecraft.sh | ||
COPY embodied/scripts/install-dmlab.sh . | ||
RUN sh install-dmlab.sh | ||
RUN pip install ale_py autorom[accept-rom-license] | ||
RUN pip install procgen_mirror | ||
RUN pip install crafter | ||
RUN pip install dm_control | ||
RUN pip install memory_maze | ||
ENV MUJOCO_GL=egl | ||
ENV NUMBA_CACHE_DIR=/tmp | ||
|
||
# Agent | ||
COPY dreamerv3/requirements.txt agent-requirements.txt | ||
RUN pip install -r agent-requirements.txt \ | ||
-f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html | ||
ENV XLA_PYTHON_CLIENT_MEM_FRACTION=0.8 | ||
|
||
# Embodied | ||
COPY embodied/requirements.txt embodied-requirements.txt | ||
RUN pip install -r embodied-requirements.txt | ||
|
||
# Source | ||
COPY . . | ||
|
||
# Cloud | ||
ENV GCS_RESOLVE_REFRESH_SECS=60 | ||
ENV GCS_REQUEST_CONNECTION_TIMEOUT_SECS=300 | ||
ENV GCS_METADATA_REQUEST_TIMEOUT_SECS=300 | ||
ENV GCS_READ_REQUEST_TIMEOUT_SECS=300 | ||
ENV GCS_WRITE_REQUEST_TIMEOUT_SECS=600 | ||
RUN chown 1000:root . && chmod 775 . |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
from .agent import Agent | ||
from .main import wrap_env |