[STF] Factorize large event lists in CUDA graphs #3756

Merged
merged 16 commits into from
Feb 11, 2025

Conversation

caugonnet
Contributor

Description

To reduce the cost of having many events in a CUDA graph, this PR tries to automatically factorize large event lists in the graph backend as well.
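
The idea of factorizing an event list can be illustrated with a small, self-contained model. This is only a sketch and not the CUDASTF implementation: `DepGraph`, `factorize`, `count_edges`, and the threshold value are hypothetical names and parameters chosen for illustration. The model assumes that when several nodes wait on the same long list of events, the list can be replaced by a single join node, so that later nodes depend on one node instead of on every event.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical model: nodes are integer ids, and an edge (a, b) means
// "b must wait for a". None of these names come from CUDASTF.
struct DepGraph {
    int next_id = 0;
    std::vector<std::pair<int, int>> edges;

    // Add a node that depends on every node listed in `deps`.
    int add_node(const std::vector<int>& deps) {
        int id = next_id++;
        for (int d : deps) {
            edges.emplace_back(d, id);
        }
        return id;
    }
};

// Replace a long dependency list with a single join node that depends on all
// of its entries. Lists at or below `threshold` are returned unchanged.
std::vector<int> factorize(DepGraph& g, const std::vector<int>& deps,
                           std::size_t threshold) {
    if (deps.size() <= threshold) {
        return deps;
    }
    int join = g.add_node(deps);
    return {join};
}

// Count the edges created when `consumers` nodes all wait on the same
// `producers` events, with or without factorization of the event list.
std::size_t count_edges(int producers, int consumers, bool factorized,
                        std::size_t threshold) {
    assert(producers >= 0 && consumers >= 0);
    DepGraph g;
    std::vector<int> events;
    for (int i = 0; i < producers; ++i) {
        events.push_back(g.add_node({}));
    }
    for (int i = 0; i < consumers; ++i) {
        if (factorized) {
            events = factorize(g, events, threshold);
        }
        g.add_node(events);
    }
    return g.edges.size();
}
```

With 100 events, 10 consumers, and a threshold of 16, `count_edges(100, 10, false, 16)` yields 1000 edges, while `count_edges(100, 10, true, 16)` yields 110: 100 edges into the join node plus one edge per consumer. In a real CUDA graph, such a join could be expressed with an empty node created by `cudaGraphAddEmptyNode`; whether this PR does so is not stated in the description.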

closes

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

copy-pr-bot bot commented Feb 8, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@caugonnet
Contributor Author

/ok to test

@caugonnet
Contributor Author

/ok to test

@caugonnet
Contributor Author

/ok to test

Contributor

🟩 CI finished in 2h 05m: Pass: 100%/20 | Total: 3h 42m | Avg: 11m 08s | Max: 16m 32s | Hits: 72%/10080
  • 🟩 cudax: Pass: 100%/20 | Total: 3h 42m | Avg: 11m 08s | Max: 16m 32s | Hits: 72%/10080

    🟩 cpu
      🟩 amd64              Pass: 100%/16  | Total:  2h 59m | Avg: 11m 13s | Max: 16m 32s | Hits:  73%/7868  
      🟩 arm64              Pass: 100%/4   | Total: 43m 12s | Avg: 10m 48s | Max: 12m 33s | Hits:  66%/2212  
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total:  9m 50s | Avg:  9m 50s | Max:  9m 50s | Hits:  61%/261   
      🟩 12.5               Pass: 100%/2   | Total: 11m 31s | Avg:  5m 45s | Max:  5m 47s | Hits:  94%/706   
      🟩 12.8               Pass: 100%/17  | Total:  3h 21m | Avg: 11m 51s | Max: 16m 32s | Hits:  70%/9113  
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total:  9m 50s | Avg:  9m 50s | Max:  9m 50s | Hits:  61%/261   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 11m 31s | Avg:  5m 45s | Max:  5m 47s | Hits:  94%/706   
      🟩 nvcc12.8           Pass: 100%/17  | Total:  3h 21m | Avg: 11m 51s | Max: 16m 32s | Hits:  70%/9113  
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/20  | Total:  3h 42m | Avg: 11m 08s | Max: 16m 32s | Hits:  72%/10080 
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total: 10m 33s | Avg: 10m 33s | Max: 10m 33s | Hits:  74%/555   
      🟩 Clang15            Pass: 100%/1   | Total: 11m 37s | Avg: 11m 37s | Max: 11m 37s | Hits:  74%/553   
      🟩 Clang16            Pass: 100%/1   | Total: 11m 50s | Avg: 11m 50s | Max: 11m 50s | Hits:  74%/553   
      🟩 Clang17            Pass: 100%/1   | Total: 11m 58s | Avg: 11m 58s | Max: 11m 58s | Hits:  74%/553   
      🟩 Clang18            Pass: 100%/4   | Total: 42m 03s | Avg: 10m 30s | Max: 11m 45s | Hits:  80%/2212  
      🟩 GCC10              Pass: 100%/1   | Total: 13m 21s | Avg: 13m 21s | Max: 13m 21s | Hits:  59%/555   
      🟩 GCC11              Pass: 100%/1   | Total: 14m 43s | Avg: 14m 43s | Max: 14m 43s | Hits:  59%/553   
      🟩 GCC12              Pass: 100%/2   | Total: 28m 30s | Avg: 14m 15s | Max: 16m 32s | Hits:  79%/1106  
      🟩 GCC13              Pass: 100%/4   | Total: 45m 50s | Avg: 11m 27s | Max: 12m 33s | Hits:  59%/2212  
      🟩 MSVC14.39          Pass: 100%/1   | Total:  9m 50s | Avg:  9m 50s | Max:  9m 50s | Hits:  61%/261   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 11m 05s | Avg: 11m 05s | Max: 11m 05s | Hits:  61%/261   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 11m 31s | Avg:  5m 45s | Max:  5m 47s | Hits:  94%/706   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total:  1h 28m | Avg: 11m 00s | Max: 11m 58s | Hits:  77%/4426  
      🟩 GCC                Pass: 100%/8   | Total:  1h 42m | Avg: 12m 48s | Max: 16m 32s | Hits:  64%/4426  
      🟩 MSVC               Pass: 100%/2   | Total: 20m 55s | Avg: 10m 27s | Max: 11m 05s | Hits:  61%/522   
      🟩 NVHPC              Pass: 100%/2   | Total: 11m 31s | Avg:  5m 45s | Max:  5m 47s | Hits:  94%/706   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/20  | Total:  3h 42m | Avg: 11m 08s | Max: 16m 32s | Hits:  72%/10080 
    🟩 jobs
      🟩 Build              Pass: 100%/18  | Total:  3h 19m | Avg: 11m 03s | Max: 16m 32s | Hits:  68%/8974  
      🟩 Test               Pass: 100%/2   | Total: 23m 43s | Avg: 11m 51s | Max: 11m 58s | Hits:  99%/1106  
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total: 10m 42s | Avg: 10m 42s | Max: 10m 42s | Hits:  59%/553   
      🟩 90a                Pass: 100%/1   | Total: 10m 52s | Avg: 10m 52s | Max: 10m 52s | Hits:  59%/553   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 37m 21s | Avg:  9m 20s | Max: 11m 43s | Hits:  69%/2012  
      🟩 20                 Pass: 100%/16  | Total:  3h 05m | Avg: 11m 35s | Max: 16m 32s | Hits:  72%/8068  
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

🏃‍ Runner counts (total jobs: 20)

# Runner
12 linux-amd64-cpu16
4 linux-arm64-cpu16
2 windows-amd64-cpu16
2 linux-amd64-gpu-rtx2080-latest-1

caugonnet self-assigned this on Feb 10, 2025
caugonnet added the stf (Sequential Task Flow programming model) label on Feb 10, 2025
@caugonnet
Contributor Author

/ok to test

Contributor

🟩 CI finished in 33m 10s: Pass: 100%/20 | Total: 3h 56m | Avg: 11m 48s | Max: 14m 47s | Hits: 68%/10080
  • 🟩 cudax: Pass: 100%/20 | Total: 3h 56m | Avg: 11m 48s | Max: 14m 47s | Hits: 68%/10080

    🟩 cpu
      🟩 amd64              Pass: 100%/16  | Total:  3h 08m | Avg: 11m 46s | Max: 14m 47s | Hits:  70%/7868  
      🟩 arm64              Pass: 100%/4   | Total: 47m 41s | Avg: 11m 55s | Max: 13m 31s | Hits:  62%/2212  
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total:  9m 57s | Avg:  9m 57s | Max:  9m 57s | Hits:  61%/261   
      🟩 12.5               Pass: 100%/2   | Total: 12m 39s | Avg:  6m 19s | Max:  6m 22s | Hits:  88%/706   
      🟩 12.8               Pass: 100%/17  | Total:  3h 33m | Avg: 12m 33s | Max: 14m 47s | Hits:  67%/9113  
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total:  9m 57s | Avg:  9m 57s | Max:  9m 57s | Hits:  61%/261   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 12m 39s | Avg:  6m 19s | Max:  6m 22s | Hits:  88%/706   
      🟩 nvcc12.8           Pass: 100%/17  | Total:  3h 33m | Avg: 12m 33s | Max: 14m 47s | Hits:  67%/9113  
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/20  | Total:  3h 56m | Avg: 11m 48s | Max: 14m 47s | Hits:  68%/10080 
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total: 12m 34s | Avg: 12m 34s | Max: 12m 34s | Hits:  63%/555   
      🟩 Clang15            Pass: 100%/1   | Total: 13m 51s | Avg: 13m 51s | Max: 13m 51s | Hits:  62%/553   
      🟩 Clang16            Pass: 100%/1   | Total: 13m 16s | Avg: 13m 16s | Max: 13m 16s | Hits:  62%/553   
      🟩 Clang17            Pass: 100%/1   | Total: 14m 13s | Avg: 14m 13s | Max: 14m 13s | Hits:  62%/553   
      🟩 Clang18            Pass: 100%/4   | Total: 47m 28s | Avg: 11m 52s | Max: 13m 19s | Hits:  72%/2212  
      🟩 GCC10              Pass: 100%/1   | Total: 14m 47s | Avg: 14m 47s | Max: 14m 47s | Hits:  62%/555   
      🟩 GCC11              Pass: 100%/1   | Total: 14m 07s | Avg: 14m 07s | Max: 14m 07s | Hits:  62%/553   
      🟩 GCC12              Pass: 100%/2   | Total: 28m 40s | Avg: 14m 20s | Max: 14m 41s | Hits:  80%/1106  
      🟩 GCC13              Pass: 100%/4   | Total: 44m 22s | Avg: 11m 05s | Max: 13m 31s | Hits:  62%/2212  
      🟩 MSVC14.39          Pass: 100%/1   | Total:  9m 57s | Avg:  9m 57s | Max:  9m 57s | Hits:  61%/261   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 10m 16s | Avg: 10m 16s | Max: 10m 16s | Hits:  61%/261   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 12m 39s | Avg:  6m 19s | Max:  6m 22s | Hits:  88%/706   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total:  1h 41m | Avg: 12m 40s | Max: 14m 13s | Hits:  67%/4426  
      🟩 GCC                Pass: 100%/8   | Total:  1h 41m | Avg: 12m 44s | Max: 14m 47s | Hits:  67%/4426  
      🟩 MSVC               Pass: 100%/2   | Total: 20m 13s | Avg: 10m 06s | Max: 10m 16s | Hits:  61%/522   
      🟩 NVHPC              Pass: 100%/2   | Total: 12m 39s | Avg:  6m 19s | Max:  6m 22s | Hits:  88%/706   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/20  | Total:  3h 56m | Avg: 11m 48s | Max: 14m 47s | Hits:  68%/10080 
    🟩 jobs
      🟩 Build              Pass: 100%/18  | Total:  3h 30m | Avg: 11m 42s | Max: 14m 47s | Hits:  64%/8974  
      🟩 Test               Pass: 100%/2   | Total: 25m 29s | Avg: 12m 44s | Max: 13m 59s | Hits:  99%/1106  
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  9m 09s | Avg:  9m 09s | Max:  9m 09s | Hits:  62%/553   
      🟩 90a                Pass: 100%/1   | Total: 10m 11s | Avg: 10m 11s | Max: 10m 11s | Hits:  62%/553   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 38m 06s | Avg:  9m 31s | Max: 11m 31s | Hits:  67%/2012  
      🟩 20                 Pass: 100%/16  | Total:  3h 18m | Avg: 12m 22s | Max: 14m 47s | Hits:  68%/8068  
    

@caugonnet
Contributor Author

/ok to test

@caugonnet
Contributor Author

/ok to test

caugonnet marked this pull request as ready for review on February 11, 2025 13:40
caugonnet requested a review from a team as a code owner on February 11, 2025 13:40
caugonnet requested a review from pciolkosz on February 11, 2025 13:40
Contributor

🟩 CI finished in 44m 20s: Pass: 100%/20 | Total: 4h 04m | Avg: 12m 12s | Max: 17m 08s | Hits: 49%/10080
  • 🟩 cudax: Pass: 100%/20 | Total: 4h 04m | Avg: 12m 12s | Max: 17m 08s | Hits: 49%/10080

    🟩 cpu
      🟩 amd64              Pass: 100%/16  | Total:  3h 16m | Avg: 12m 18s | Max: 17m 08s | Hits:  52%/7868  
      🟩 arm64              Pass: 100%/4   | Total: 47m 04s | Avg: 11m 46s | Max: 12m 54s | Hits:  41%/2212  
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total: 10m 16s | Avg: 10m 16s | Max: 10m 16s | Hits:  52%/261   
      🟩 12.5               Pass: 100%/2   | Total: 11m 58s | Avg:  5m 59s | Max:  6m 00s | Hits:  89%/706   
      🟩 12.8               Pass: 100%/17  | Total:  3h 41m | Avg: 13m 02s | Max: 17m 08s | Hits:  46%/9113  
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total: 10m 16s | Avg: 10m 16s | Max: 10m 16s | Hits:  52%/261   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 11m 58s | Avg:  5m 59s | Max:  6m 00s | Hits:  89%/706   
      🟩 nvcc12.8           Pass: 100%/17  | Total:  3h 41m | Avg: 13m 02s | Max: 17m 08s | Hits:  46%/9113  
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/20  | Total:  4h 04m | Avg: 12m 12s | Max: 17m 08s | Hits:  49%/10080 
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total: 12m 31s | Avg: 12m 31s | Max: 12m 31s | Hits:  39%/555   
      🟩 Clang15            Pass: 100%/1   | Total: 14m 31s | Avg: 14m 31s | Max: 14m 31s | Hits:  36%/553   
      🟩 Clang16            Pass: 100%/1   | Total: 14m 54s | Avg: 14m 54s | Max: 14m 54s | Hits:  36%/553   
      🟩 Clang17            Pass: 100%/1   | Total: 14m 15s | Avg: 14m 15s | Max: 14m 15s | Hits:  36%/553   
      🟩 Clang18            Pass: 100%/4   | Total: 50m 08s | Avg: 12m 32s | Max: 15m 51s | Hits:  55%/2212  
      🟩 GCC10              Pass: 100%/1   | Total: 15m 27s | Avg: 15m 27s | Max: 15m 27s | Hits:  36%/555   
      🟩 GCC11              Pass: 100%/1   | Total: 14m 18s | Avg: 14m 18s | Max: 14m 18s | Hits:  36%/553   
      🟩 GCC12              Pass: 100%/2   | Total: 29m 01s | Avg: 14m 30s | Max: 17m 08s | Hits:  67%/1106  
      🟩 GCC13              Pass: 100%/4   | Total: 45m 16s | Avg: 11m 19s | Max: 12m 54s | Hits:  41%/2212  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 10m 16s | Avg: 10m 16s | Max: 10m 16s | Hits:  52%/261   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 11m 26s | Avg: 11m 26s | Max: 11m 26s | Hits:  52%/261   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 11m 58s | Avg:  5m 59s | Max:  6m 00s | Hits:  89%/706   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total:  1h 46m | Avg: 13m 17s | Max: 15m 51s | Hits:  46%/4426  
      🟩 GCC                Pass: 100%/8   | Total:  1h 44m | Avg: 13m 00s | Max: 17m 08s | Hits:  46%/4426  
      🟩 MSVC               Pass: 100%/2   | Total: 21m 42s | Avg: 10m 51s | Max: 11m 26s | Hits:  52%/522   
      🟩 NVHPC              Pass: 100%/2   | Total: 11m 58s | Avg:  5m 59s | Max:  6m 00s | Hits:  89%/706   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/20  | Total:  4h 04m | Avg: 12m 12s | Max: 17m 08s | Hits:  49%/10080 
    🟩 jobs
      🟩 Build              Pass: 100%/18  | Total:  3h 40m | Avg: 12m 14s | Max: 17m 08s | Hits:  43%/8974  
      🟩 Test               Pass: 100%/2   | Total: 23m 33s | Avg: 11m 46s | Max: 11m 53s | Hits:  99%/1106  
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total: 10m 08s | Avg: 10m 08s | Max: 10m 08s | Hits:  42%/553   
      🟩 90a                Pass: 100%/1   | Total: 10m 41s | Avg: 10m 41s | Max: 10m 41s | Hits:  42%/553   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 38m 32s | Avg:  9m 38s | Max: 11m 33s | Hits:  50%/2012  
      🟩 20                 Pass: 100%/16  | Total:  3h 25m | Avg: 12m 50s | Max: 17m 08s | Hits:  49%/8068  
    

caugonnet enabled auto-merge (squash) on February 11, 2025 20:49
@caugonnet
Contributor Author

/ok to test

Contributor

🟩 CI finished in 59m 06s: Pass: 100%/20 | Total: 4h 09m | Avg: 12m 27s | Max: 18m 52s | Hits: 65%/10080
  • 🟩 cudax: Pass: 100%/20 | Total: 4h 09m | Avg: 12m 27s | Max: 18m 52s | Hits: 65%/10080

    🟩 cpu
      🟩 amd64              Pass: 100%/16  | Total:  3h 19m | Avg: 12m 28s | Max: 18m 52s | Hits:  67%/7868  
      🟩 arm64              Pass: 100%/4   | Total: 49m 19s | Avg: 12m 19s | Max: 13m 26s | Hits:  59%/2212  
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total:  9m 39s | Avg:  9m 39s | Max:  9m 39s | Hits:  61%/261   
      🟩 12.5               Pass: 100%/2   | Total: 14m 07s | Avg:  7m 03s | Max:  7m 05s | Hits:  82%/706   
      🟩 12.8               Pass: 100%/17  | Total:  3h 45m | Avg: 13m 15s | Max: 18m 52s | Hits:  64%/9113  
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total:  9m 39s | Avg:  9m 39s | Max:  9m 39s | Hits:  61%/261   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 14m 07s | Avg:  7m 03s | Max:  7m 05s | Hits:  82%/706   
      🟩 nvcc12.8           Pass: 100%/17  | Total:  3h 45m | Avg: 13m 15s | Max: 18m 52s | Hits:  64%/9113  
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/20  | Total:  4h 09m | Avg: 12m 27s | Max: 18m 52s | Hits:  65%/10080 
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total: 13m 01s | Avg: 13m 01s | Max: 13m 01s | Hits:  59%/555   
      🟩 Clang15            Pass: 100%/1   | Total: 13m 16s | Avg: 13m 16s | Max: 13m 16s | Hits:  59%/553   
      🟩 Clang16            Pass: 100%/1   | Total: 15m 02s | Avg: 15m 02s | Max: 15m 02s | Hits:  59%/553   
      🟩 Clang17            Pass: 100%/1   | Total: 14m 44s | Avg: 14m 44s | Max: 14m 44s | Hits:  59%/553   
      🟩 Clang18            Pass: 100%/4   | Total: 57m 48s | Avg: 14m 27s | Max: 18m 52s | Hits:  69%/2212  
      🟩 GCC10              Pass: 100%/1   | Total: 14m 35s | Avg: 14m 35s | Max: 14m 35s | Hits:  59%/555   
      🟩 GCC11              Pass: 100%/1   | Total: 13m 27s | Avg: 13m 27s | Max: 13m 27s | Hits:  58%/553   
      🟩 GCC12              Pass: 100%/2   | Total: 28m 10s | Avg: 14m 05s | Max: 15m 18s | Hits:  79%/1106  
      🟩 GCC13              Pass: 100%/4   | Total: 46m 02s | Avg: 11m 30s | Max: 13m 26s | Hits:  58%/2212  
      🟩 MSVC14.39          Pass: 100%/1   | Total:  9m 39s | Avg:  9m 39s | Max:  9m 39s | Hits:  61%/261   
      🟩 MSVC14.42          Pass: 100%/1   | Total:  9m 10s | Avg:  9m 10s | Max:  9m 10s | Hits:  61%/261   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 14m 07s | Avg:  7m 03s | Max:  7m 05s | Hits:  82%/706   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total:  1h 53m | Avg: 14m 13s | Max: 18m 52s | Hits:  64%/4426  
      🟩 GCC                Pass: 100%/8   | Total:  1h 42m | Avg: 12m 46s | Max: 15m 18s | Hits:  64%/4426  
      🟩 MSVC               Pass: 100%/2   | Total: 18m 49s | Avg:  9m 24s | Max:  9m 39s | Hits:  61%/522   
      🟩 NVHPC              Pass: 100%/2   | Total: 14m 07s | Avg:  7m 03s | Max:  7m 05s | Hits:  82%/706   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/20  | Total:  4h 09m | Avg: 12m 27s | Max: 18m 52s | Hits:  65%/10080 
    🟩 jobs
      🟩 Build              Pass: 100%/18  | Total:  3h 37m | Avg: 12m 04s | Max: 15m 18s | Hits:  61%/8974  
      🟩 Test               Pass: 100%/2   | Total: 31m 44s | Avg: 15m 52s | Max: 18m 52s | Hits:  99%/1106  
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  9m 58s | Avg:  9m 58s | Max:  9m 58s | Hits:  58%/553   
      🟩 90a                Pass: 100%/1   | Total: 10m 44s | Avg: 10m 44s | Max: 10m 44s | Hits:  58%/553   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 40m 42s | Avg: 10m 10s | Max: 11m 54s | Hits:  63%/2012  
      🟩 20                 Pass: 100%/16  | Total:  3h 28m | Avg: 13m 01s | Max: 18m 52s | Hits:  65%/8068  
    

caugonnet merged commit 737a604 into NVIDIA:main on Feb 11, 2025
32 of 35 checks passed
Labels
stf Sequential Task Flow programming model
Projects
Status: Done
3 participants