[GraphBolt][CUDA] MLPerf training script (WIP). #7807

Draft: wants to merge 4 commits into base: master


mfbalin
Collaborator

mfbalin commented Sep 22, 2024

Description

@TristonC This will be the MLPerf training script that we plan to make a submission with. While we are not able to make a formal submission this round (8 weeks' prior notice is required, unless NVIDIA makes a submission), we can make an official submission in the next round.

We still need to change the code a bit to bring it in line with the MLPerf submission requirements; I will make those changes in upcoming PRs.

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
  • I've leveraged the tools to beautify the Python and C++ code.
  • The PR is complete and small; read the Google eng practice (a CL is equivalent to a PR) to understand more about small PRs. In DGL, we consider PRs with fewer than 200 lines of core code change to be small (examples, tests, and documentation can be exempted).
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
  • Related issue is referenced in this PR
  • If the PR is for a new model/paper, I've updated the example index here.

Changes

@dgl-bot
Collaborator

dgl-bot commented Sep 22, 2024

To trigger regression tests:

  • @dgl-bot run [instance-type] [which tests] [compare-with-branch];
    For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

@dgl-bot
Collaborator

dgl-bot commented Sep 22, 2024

Commit ID: ef75b6b

Build ID: 1

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

mfbalin marked this pull request as draft September 22, 2024 22:08

@dgl-bot
Collaborator

dgl-bot commented Sep 22, 2024

Commit ID: 4a1f45a

Build ID: 2

Status: ❌ CI test failed in Stage [CPU Build].

Report path: link

Full logs path: link

@mfbalin
Collaborator Author

mfbalin commented Sep 22, 2024

@dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Sep 22, 2024

Commit ID: 4a1f45a

Build ID: 3

Status: ❌ CI test failed in Stage [CPU Build].

Report path: link

Full logs path: link

mfbalin closed this Sep 24, 2024
mfbalin reopened this Sep 24, 2024

@mfbalin
Collaborator Author

mfbalin commented Sep 24, 2024

@dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Sep 24, 2024

Commit ID: 0358e8622d17282fe32bb2b71399c3298b8dd293

Build ID: 4

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Sep 24, 2024

Commit ID: 75c391bfdc5c61c764c214e454719d881890565b

Build ID: 5

Status: ❌ CI test failed in Stage [CPU Build].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Oct 10, 2024

Commit ID: 507455c

Build ID: 6

Status: ❌ CI test failed in Stage [CPU Build].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Oct 11, 2024

Commit ID: 9b6aff812b7c664c78406abab1fa8b0e4732b87c

Build ID: 7

Status: ❌ CI test failed in Stage [CPU Build].

Report path: link

Full logs path: link


2 participants