-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use libcu++ limits/trait in tests/benchmarks #3822
Conversation
b39a3ff
to
5c8f54e
Compare
5c8f54e
to
96ed90b
Compare
🟨 CI finished in 1h 27m: Pass: 97%/93 | Total: 1d 20h | Avg: 28m 27s | Max: 1h 11m | Hits: 80%/132111
|
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
+/- | CUB |
Thrust | |
CUDA Experimental | |
python | |
CCCL C Parallel Library | |
+/- | Catch2Helper |
Modifications in project or dependencies?
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
+/- | CUB |
+/- | Thrust |
CUDA Experimental | |
+/- | python |
+/- | CCCL C Parallel Library |
+/- | Catch2Helper |
🏃 Runner counts (total jobs: 93)
# | Runner |
---|---|
66 | linux-amd64-cpu16 |
9 | windows-amd64-cpu16 |
6 | linux-amd64-gpu-rtxa6000-latest-1 |
4 | linux-arm64-cpu16 |
3 | linux-amd64-gpu-h100-latest-1 |
3 | linux-amd64-gpu-rtx4090-latest-1 |
2 | linux-amd64-gpu-rtx2080-latest-1 |
96ed90b
to
4d90b5d
Compare
c2h/include/c2h/custom_type.h
Outdated
template <template <typename> class... Policies> | ||
class numeric_limits<c2h::custom_type_t<Policies...>> | ||
class __numeric_limits_impl<c2h::custom_type_t<Policies...>, __numeric_limits_type::__other> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldnt this also work if we just specialized numeric_limits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't and I need your guidance here. The problem is that CUB sometimes queries numeric_limits<const T>
and that does not find a specialization if I only specialize:
class numeric_limits<c2h::custom_type_t<Policies...>> { ... };
Specializing __numeric_limits_impl
works, because numeric_limits
strips CV qualifiers and passes the type to __numeric_limits_impl
. What's the best practice? Should be specialize numeric_limits
multiple times for T
, const T
, and maybe also for referneces of those? Or should we try harder inside CUB to call like numeric_limits<decay_t<T>>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a bug in our implementation then: https://eel.is/c++draft/numeric.limits#general-5
The value of each member of a specialization of numeric_limits on a cv-qualified type cv T shall be equal to the value of the corresponding member of the specialization on the unqualified type T.
// replace with | ||
// std::numeric_limits<T>::max() when | ||
// C++11 support is more prevalent | ||
// TODO(bgruber): replace with ::cuda::std::numeric_limits<T>::max() (breaking change) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we were breaking things?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to separate non-breaking from breaking changes, so we can more easily roll if needed, and have a smaller diff to inspect with more attention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through it, no change requested.
That said I found a few places where we are unconsistent about fully qualifying cuda::std
Feel free to ignore
ab531ba
to
bab6e89
Compare
🟨 CI finished in 1h 25m: Pass: 97%/93 | Total: 1d 22h | Avg: 30m 11s | Max: 1h 13m | Hits: 86%/131751
|
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
+/- | CUB |
Thrust | |
CUDA Experimental | |
python | |
CCCL C Parallel Library | |
+/- | Catch2Helper |
Modifications in project or dependencies?
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
+/- | CUB |
+/- | Thrust |
CUDA Experimental | |
+/- | python |
+/- | CCCL C Parallel Library |
+/- | Catch2Helper |
🏃 Runner counts (total jobs: 93)
# | Runner |
---|---|
66 | linux-amd64-cpu16 |
9 | windows-amd64-cpu16 |
6 | linux-amd64-gpu-rtxa6000-latest-1 |
4 | linux-arm64-cpu16 |
3 | linux-amd64-gpu-h100-latest-1 |
3 | linux-amd64-gpu-rtx4090-latest-1 |
2 | linux-amd64-gpu-rtx2080-latest-1 |
I specialized |
ok, this is a fun one. The test is broken on |
I reported the bug here: #3835. And will disable the test now in this PR. |
Co-authored-by: Michael Schellenberger Costa <[email protected]>
bab6e89
to
fb4cf88
Compare
// FIXME(bgruber): uchar3 fails the test, see #3835 | ||
using vec_types = c2h::type_list<ulonglong4, /*uchar3,*/ short2>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am disabling the failing test here. We should update #3835 once this PR is merged.
🟩 CI finished in 1h 27m: Pass: 100%/93 | Total: 1d 23h | Avg: 30m 41s | Max: 1h 14m | Hits: 86%/134193
|
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
+/- | CUB |
Thrust | |
CUDA Experimental | |
python | |
CCCL C Parallel Library | |
+/- | Catch2Helper |
Modifications in project or dependencies?
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
+/- | CUB |
+/- | Thrust |
CUDA Experimental | |
+/- | python |
+/- | CCCL C Parallel Library |
+/- | Catch2Helper |
🏃 Runner counts (total jobs: 93)
# | Runner |
---|---|
66 | linux-amd64-cpu16 |
9 | windows-amd64-cpu16 |
6 | linux-amd64-gpu-rtxa6000-latest-1 |
4 | linux-arm64-cpu16 |
3 | linux-amd64-gpu-h100-latest-1 |
3 | linux-amd64-gpu-rtx4090-latest-1 |
2 | linux-amd64-gpu-rtx2080-latest-1 |
This pulls out the non-breaking part of #3384, which:
This PR does NOT replace any use of CUB traits in the CUB/Thrust headers, because that would break any user opting in with a custom type by specializing CUB traits.