Pull requests: pytorch/FBGEMM
Add cublas FP8 tensorwise GEMM in fbgemm quantize bench
cla signed
fb-exported
#3693
opened Feb 14, 2025 by
jiawenliu64
avoid extra copy in PackedGemmMatrixB constructor
cla signed
fb-exported
#3691
opened Feb 13, 2025 by
helloguo
[fbgemm_gpu] Increase timeout for ARM nova jobs
cla signed
#3690
opened Feb 13, 2025 by
q10
adding an option to skip zeroing output tensor for f8f8bf16_rowwise_grouped_dynamic
cla signed
fb-exported
#3685
opened Feb 13, 2025 by
mxz297
Add D_folded support for jagged_to_padded_dense_backward meta function
cla signed
fb-exported
#3670
opened Feb 8, 2025 by
brad-mengchi
Adding Missing includes and explicitly declaring Tensor in aten namespace.
cla signed
fb-exported
#3638
opened Jan 30, 2025 by
pradeepfn
Partial revert of D66986498 (Optimized backward pass for ROCm devices, pt 1), 2nd attempt
ciflow/rocm
cla signed
fb-exported
module: rocm
#3637
opened Jan 29, 2025 by
q10
avoid using warning tensor in cpu tbe op
cla signed
fb-exported
#3631
opened Jan 29, 2025 by
842974287
Update bf16i4 gemm with new cutlass version
cla signed
fb-exported
#3630
opened Jan 29, 2025 by
jwfromm
finish #1808 cherry-pick, adjust interface
cla signed
fb-exported
#3627
opened Jan 28, 2025 by
coconutruben
Re-land D67407935 (Optimized backward pass for ROCm devices, pt 2)
ciflow/rocm
cla signed
fb-exported
module: rocm
#3619
opened Jan 27, 2025 by
q10
Performance Optimization: Optimized TileShape Configuration for f8
cla signed
#3617
opened Jan 27, 2025 by
MatrixAssembler
Replace runners prefix amz2023. (#2895)
cla signed
fb-exported
module: rocm
#3612
opened Jan 24, 2025 by
q10