Activity

[AMD] Update smem size for cdna4

sjw36 created sjw/cdna4_smem_size • 6e2fa40 • 1 hour ago

* moved Membar to allocate-shared-memory

Force push
sjw36 force pushed to sjw/pipeline-fa • 2f4520c…8b258c5 • 2 hours ago

[AMD] Use warp shuffle for fp8 MFMA to dot operand layout conversion (t…

vgokhale created improve_fa_decode_3.0.0_wip • 7fdc865 • 5 hours ago

Force warpsPerCTA={1, numWarps} when BLOCK_M=mDim

vgokhale created improve_fa_decode_3.0.0_backup • 5a34085 • 5 hours ago

Deleted branch

antiagainst deleted robust_sinkSecondLoad • 9 hours ago

Skip scalar and 1D tensor load for sinkSecondLoad

Force push
zhanglx13 force pushed to robust_sinkSecondLoad • a8a0467…64d9cfe • 11 hours ago

Skip scalar and 1D tensor load for sinkSecondLoad

zhanglx13 created robust_sinkSecondLoad • a8a0467 • 11 hours ago

Merge pull request #721 from ROCm/dtanner/dev-refine-ops

Pull request merge
guacamoleo pushed 6 commits to refine-ops-pass • 6a4329b…5d52dfe • 13 hours ago

[AMD] enhanced dep-graph printing

ravil-mobile pushed 1 commit to ravil/dev-refine-ops-pass • 80f417c…535c117 • 15 hours ago

Add workaround for pytorch device selection issue (#711)

Chi-Chu319 created jukorhon/persistent-MLA-compiler-fix • eb7e015 • 15 hours ago

[AMD] re-worked scheduling loop in the machine model

ravil-mobile pushed 1 commit to ravil/dev-refine-ops-pass • 045d5c5…80f417c • 15 hours ago

[AMD] Addressed some comments from PR #724 (fork)

ravil-mobile pushed 1 commit to ravil/dev-refine-ops-pass • b053a5b…045d5c5 • 19 hours ago

fix

juuso-oskari pushed 1 commit to jukorhon/persistent-MLA_improved_fa_decoding_3.0.0 • 1a19a5f…3e81afd • 20 hours ago

fix

juuso-oskari pushed 2 commits to jukorhon/persistent-MLA_improved_fa_decoding_3.0.0 • dd41460…1a19a5f • 20 hours ago

persistent approach brings benefit with larger batch sizes with this …

Deleted branch

juuso-oskari deleted jukorhon/fix-bias-test • 21 hours ago

Deleted branch

juuso-oskari deleted jukorhon/assembly-atomic-add • 21 hours ago

Deleted branch

juuso-oskari deleted jukorhon/rope-fuse-MLA • 21 hours ago

Deleted branch

antiagainst deleted sjw/global-local-prefetch • yesterday

Addressing review improvements.

guacamoleo pushed 1 commit to dtanner/dev-refine-ops • 19b9017…3b50cce • yesterday

Merge remote-tracking branch 'origin/main' into sjw/global-local-pref…

Force push
antiagainst force pushed to sjw/global-local-prefetch • 9411b65…744357d • yesterday