
Configuration not syncing for Wandb Offline Mode #20571

Open
brendo-k opened this issue Feb 3, 2025 · 0 comments
Labels
bug (Something isn't working) · needs triage (Waiting to be triaged by maintainers) · ver: 2.5.x

Comments

brendo-k commented Feb 3, 2025

Bug description

When I train using wandb in offline mode, pytorch-lightning doesn't upload my job's configuration when the run is synced; the metrics and the job summary are uploaded correctly. A workaround is to call wandb.init(config=model.hparams) before the pytorch-lightning WandbLogger is initialized, as sketched below.
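
A minimal sketch of that workaround, reusing the project name from the reproduction script below (model is assumed to be constructed already; mode='offline' mirrors the logger's offline=True setting):

import wandb
from pytorch_lightning.loggers import WandbLogger

# Workaround: start the offline run manually and pass the hyperparameters as
# the config; the WandbLogger created afterwards attaches to this existing run.
wandb.init(project='minimal-example', mode='offline', config=model.hparams)
wandblogger = WandbLogger(project='minimal-example', offline=True, save_dir='.')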

What version are you seeing the problem on?

v2.5

How to reproduce the bug

import pytorch_lightning as pl
from pytorch_lightning.loggers import WandbLogger
import torch
from torch import nn
from torch.utils.data import DataLoader, random_split, TensorDataset
import wandb

# Dummy dataset
data = torch.randn(1000, 10)
targets = torch.randint(0, 2, (1000,))
dataset = TensorDataset(data, targets)

# Split dataset
train_size = int(0.8 * len(dataset))
val_size = len(dataset) - train_size
train_dataset, val_dataset = random_split(dataset, [train_size, val_size])

# DataLoader
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=32)

# Define a simple model
class SimpleModel(pl.LightningModule):
    def __init__(self, hidden_size):
        super().__init__()
        self.save_hyperparameters()
        self.layer_1 = nn.Linear(10, hidden_size)
        self.layer_2 = nn.Linear(hidden_size, hidden_size)
        self.layer_3 = nn.Linear(hidden_size, 1)  # output layer; avoid hard-coding 64
        self.loss = nn.BCEWithLogitsLoss()

    def forward(self, x):
        x = torch.relu(self.layer_1(x))
        x = torch.relu(self.layer_2(x))
        x = self.layer_3(x)
        return x

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x).squeeze()
        loss = self.loss(y_hat, y.float())
        self.log('train_loss', loss)
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x).squeeze()
        loss = self.loss(y_hat, y.float())
        self.log('val_loss', loss)
        return loss

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
        return optimizer

# Initialize model
model = SimpleModel(64)

# Initialize trainer
wandblogger = WandbLogger(project='minimal-example', offline=True, save_dir='.')
trainer = pl.Trainer(max_epochs=10, logger=wandblogger)

# Train the model
trainer.fit(model, train_loader, val_loader)
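
After training, the offline run directory written under ./wandb (named like offline-run-<timestamp>-<id>) can be uploaded with the wandb CLI via wandb sync <run-dir>; the synced run shows the metrics and the summary but is missing the configuration, as described above.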

Error messages and logs

# No errors

Environment

Current environment
  • CUDA:
    - GPU: None
    - available: False
    - version: None
  • Lightning:
    - lightning-utilities: 0.12.0
    - pytorch-lightning: 2.5.0.post0
    - torch: 2.5.1
    - torchmetrics: 1.6.1
  • Packages:
    - annotated-types: 0.7.0
    - appdirs: 1.4.4
    - autocommand: 2.2.2
    - backports.tarfile: 1.2.0
    - brotli: 1.1.0
    - certifi: 2024.12.14
    - cffi: 1.17.1
    - charset-normalizer: 3.4.1
    - click: 8.1.8
    - colorama: 0.4.6
    - docker-pycreds: 0.4.0
    - eval-type-backport: 0.2.2
    - filelock: 3.17.0
    - fsspec: 2025.2.0
    - gitdb: 4.0.12
    - gitpython: 3.1.44
    - h2: 4.2.0
    - hpack: 4.1.0
    - hyperframe: 6.1.0
    - idna: 3.10
    - importlib-metadata: 8.0.0
    - inflect: 7.3.1
    - jaraco.collections: 5.1.0
    - jaraco.context: 5.3.0
    - jaraco.functools: 4.0.1
    - jaraco.text: 3.12.1
    - jinja2: 3.1.5
    - lightning-utilities: 0.12.0
    - markupsafe: 3.0.2
    - more-itertools: 10.3.0
    - mpmath: 1.3.0
    - networkx: 3.4.2
    - numpy: 2.2.2
    - packaging: 24.2
    - pip: 25.0
    - platformdirs: 4.3.6
    - protobuf: 5.28.3
    - psutil: 6.1.1
    - pybind11: 2.13.6
    - pybind11-global: 2.13.6
    - pycparser: 2.22
    - pydantic: 2.10.6
    - pydantic-core: 2.27.2
    - pysocks: 1.7.1
    - pytorch-lightning: 2.5.0.post0
    - pyyaml: 6.0.2
    - requests: 2.32.3
    - sentry-sdk: 2.20.0
    - setproctitle: 1.3.4
    - setuptools: 75.8.0
    - six: 1.17.0
    - smmap: 5.0.0
    - sympy: 1.13.3
    - tomli: 2.0.1
    - torch: 2.5.1
    - torchmetrics: 1.6.1
    - tqdm: 4.67.1
    - typeguard: 4.3.0
    - typing-extensions: 4.12.2
    - urllib3: 2.3.0
    - wandb: 0.19.5
    - wheel: 0.45.1
    - win-inet-pton: 1.1.0
    - zipp: 3.19.2
    - zstandard: 0.23.0
  • System:
    - OS: Windows
    - architecture:
    - 64bit
    - WindowsPE
    - processor: Intel64 Family 6 Model 154 Stepping 4, GenuineIntel
    - python: 3.11.0
    - release: 10
    - version: 10.0.26100

More info

No response

brendo-k added the bug (Something isn't working) and needs triage (Waiting to be triaged by maintainers) labels on Feb 3, 2025