Triton Fusion Emitter for Int4 Test Failed #2822

Open
amd-jianli12 opened this issue Jan 21, 2025 · 6 comments

Comments

@amd-jianli12
Collaborator

amd-jianli12 commented Jan 21, 2025

[  FAILED  ] TritonTest.DotWithI4WeightsOnLhsWithBitcastTo3dTensor
[  FAILED  ] TritonTest.DotWithI4WeightsOnLhsWithNonStandardLayoutAndMultplyInEpilogue
[  FAILED  ] TritonTest.LHSWithMinorDimEqualTo1
[  FAILED  ] TritonTest.RHSWithMinorDimEqualTo1
[  FAILED  ] TritonTest.LHSNonMinorContractingDim
[  FAILED  ] TritonTest.LHSNonMinorContractingDimWithBatchDim0
[  FAILED  ] TritonTest.LHSMinorContractingDim
[  FAILED  ] TritonTest.ConvertPlusNegate
[  FAILED  ] TritonTest.LHSMinorContractingDimWithBatchDim0
[  FAILED  ] TritonTest.RHSTestWithNotMinorContractingDim
[  FAILED  ] TritonTest.RHSTestWithMinorContractingDim
[  FAILED  ] TritonTest.RHSTestWithMinorContractingDimWithBatchDim
[  FAILED  ] TritonTest.RHSTestWithNotMinorContractingDimWithBatchDim0
amd-jianli12 changed the title from "Triton Fusion Emitter for Int4 failed" to "Triton Emitter for Int4 failed" on Jan 21, 2025
@amd-jianli12
Collaborator Author

#2821

@amd-jianli12
Collaborator Author

[2025-01-21T08:08:50.815Z] [ RUN      ] TritonTest.DotWithI4WeightsOnLhsWithBitcastTo3dTensor
[2025-01-21T08:08:50.815Z] 2025-01-21 08:08:22.660053: W external/local_xla/xla/service/compiler.h:209] Ignoring the buffer assignment proto provided.
[2025-01-21T08:08:50.815Z] WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
[2025-01-21T08:08:50.815Z] W0000 00:00:1737446902.988020   94721 fusion_emitter_legacy_matmul.cc:1946] Using fallback triton GEMM config for op dot.1
[2025-01-21T08:08:50.815Z] 2025-01-21 08:08:23.127395: W external/local_xla/xla/backends/gpu/codegen/triton/fusion.cc:158] Using fallback triton GEMM config for op dot.1
[2025-01-21T08:08:50.815Z] 2025-01-21 08:08:23.131201: I external/local_xla/xla/service/llvm_ir/llvm_command_line_options.cc:50] XLA (re)initializing LLVM with options fingerprint: 17422529330458003092
[2025-01-21T08:08:50.815Z] 2025-01-21 08:08:24.662952: W external/local_xla/xla/service/compiler.h:209] Ignoring the buffer assignment proto provided.
[2025-01-21T08:08:50.815Z] 2025-01-21 08:08:24.669897: I external/local_xla/xla/tests/literal_test_util.cc:56] expected: bf16[4,16,32] [TRUNCATED, Literal with more than 1000 values]
[2025-01-21T08:08:50.815Z] 2025-01-21 08:08:24.669911: I external/local_xla/xla/tests/literal_test_util.cc:58] actual:   bf16[4,16,32] [TRUNCATED, Literal with more than 1000 values]
[2025-01-21T08:08:50.815Z] 2025-01-21 08:08:24.669916: I external/local_xla/xla/tests/literal_test_util.cc:60] Dumping literals to temp files...
[2025-01-21T08:08:50.815Z] 2025-01-21 08:08:24.670435: E external/local_xla/xla/tests/literal_test_util.cc:46] wrote Literal to expected file: /root/.cache/bazel/_bazel_root/fbac33eb30dbfb6b11b15a7ff5ac830d/execroot/org_tensorflow/bazel-out/k8-opt/testlogs/external/local_xla/xla/backends/gpu/codegen/triton/fusion_emitter_int4_device_test_gpu_amd_any/test.outputs/tempfile-1737446904669921-expected.{pb,txt}
[2025-01-21T08:08:50.815Z] 2025-01-21 08:08:24.670858: E external/local_xla/xla/tests/literal_test_util.cc:46] wrote Literal to actual file: /root/.cache/bazel/_bazel_root/fbac33eb30dbfb6b11b15a7ff5ac830d/execroot/org_tensorflow/bazel-out/k8-opt/testlogs/external/local_xla/xla/backends/gpu/codegen/triton/fusion_emitter_int4_device_test_gpu_amd_any/test.outputs/tempfile-1737446904670441-actual.{pb,txt}
[2025-01-21T08:08:50.815Z] 2025-01-21 08:08:24.671023: E external/local_xla/xla/tests/literal_test_util.cc:46] wrote Literal to mismatches file: /root/.cache/bazel/_bazel_root/fbac33eb30dbfb6b11b15a7ff5ac830d/execroot/org_tensorflow/bazel-out/k8-opt/testlogs/external/local_xla/xla/backends/gpu/codegen/triton/fusion_emitter_int4_device_test_gpu_amd_any/test.outputs/tempfile-1737446904670863-mismatches.{pb,txt}
[2025-01-21T08:08:50.815Z] external/local_xla/xla/backends/gpu/codegen/triton/fusion_emitter_int4_device_test.cc:114: Failure
[2025-01-21T08:08:50.815Z] Value of: RunAndCompareNoHloPasses( kHloText, ErrorSpec{ 1e-5, 1e-5})
[2025-01-21T08:08:50.815Z]   Actual: false (
[2025-01-21T08:08:50.815Z] Mismatch count 2045 (99.8535%) in shape bf16[4,16,32] (2048 elements), abs bound 1e-05, rel bound 1e-05
[2025-01-21T08:08:50.815Z] Top relative error mismatches:
[2025-01-21T08:08:50.815Z]   actual       -9.188, expected    0.0005493, index {0,8,21}, rel error 1.67e+04, abs error     9.19
[2025-01-21T08:08:50.815Z]   actual       -16.38, expected     0.001839, index {3,9,4}, rel error 8.91e+03, abs error     16.4
[2025-01-21T08:08:50.815Z]   actual       -18.25, expected     0.002411, index {3,7,28}, rel error 7.57e+03, abs error     18.2
[2025-01-21T08:08:50.815Z]   actual        -18.5, expected     0.004028, index {3,7,3}, rel error 4.59e+03, abs error     18.5
[2025-01-21T08:08:50.815Z]   actual        -18.5, expected    -0.007935, index {3,3,3}, rel error 2.33e+03, abs error     18.5
[2025-01-21T08:08:50.815Z] Absolute magnitude breakdown of actual values:
[2025-01-21T08:08:50.815Z]   0      <= x < 0.0001 :       0 (  0.0000%)
[2025-01-21T08:08:50.815Z]   0.0001 <= x < 0.001  :       3 (  0.1465%), mismatches 3
[2025-01-21T08:08:50.815Z]   0.001  <= x < 0.01   :       1 (  0.0488%), mismatches 1
[2025-01-21T08:08:50.815Z]   0.01   <= x < 0.1    :      14 (  0.6836%), mismatches 14
[2025-01-21T08:08:50.815Z]   0.1    <= x < 1      :     168 (  8.2031%), mismatches 167
[2025-01-21T08:08:50.815Z]   1      <= x < inf    :    1862 ( 90.9180%), mismatches 1860
[2025-01-21T08:08:50.815Z] Elements exceeding abs error bound 1e-05: 2045 (99.8535%)
[2025-01-21T08:08:50.815Z] Relative error breakdown of elements exceeding abs error bound:
[2025-01-21T08:08:50.815Z]   <  0.0001 :       0 (0.0000%)
[2025-01-21T08:08:50.815Z]   >= 0.0001 :    2045 (100.0000%)
[2025-01-21T08:08:50.815Z]   >= 0.001  :    2045 (100.0000%)
[2025-01-21T08:08:50.815Z]   >= 0.01   :    2044 (99.9511%)
[2025-01-21T08:08:50.815Z]   >= 0.1    :    1974 (96.5281%)
[2025-01-21T08:08:50.815Z]   >= 1      :    1411 (68.9976%)
[2025-01-21T08:08:50.815Z] Elements exceeding rel error bound 1e-05: 2045 (99.8535%)
[2025-01-21T08:08:50.815Z] Absolute error breakdown of elements exceeding rel error bound:
[2025-01-21T08:08:50.815Z]   <  0.0001 :       0 (0.0000%)
[2025-01-21T08:08:50.815Z]   >= 0.0001 :    2045 (100.0000%)
[2025-01-21T08:08:50.815Z]   >= 0.001  :    2045 (100.0000%)
[2025-01-21T08:08:50.815Z]   >= 0.01   :    2045 (100.0000%)
[2025-01-21T08:08:50.815Z]   >= 0.1    :    2027 (99.1198%)
[2025-01-21T08:08:50.815Z]   >= 1      :    1834 (89.6822%)
[2025-01-21T08:08:50.815Z] 
[2025-01-21T08:08:50.815Z] 
[2025-01-21T08:08:50.815Z] Expected literal:
[2025-01-21T08:08:50.815Z] [TRUNCATED, Literal with more than 1000 values]
[2025-01-21T08:08:50.815Z] 
[2025-01-21T08:08:50.815Z] Actual literal:
[2025-01-21T08:08:50.815Z] [TRUNCATED, Literal with more than 1000 values])
[2025-01-21T08:08:50.815Z] Expected: true
[2025-01-21T08:08:50.815Z] [  FAILED  ] TritonTest.DotWithI4WeightsOnLhsWithBitcastTo3dTensor (2414 ms)
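For anyone reading the summary in the log above, here is a minimal Python sketch that recomputes the headline numbers from the values the log prints. It is not XLA's comparison code. The helper name `abs_and_rel_error` is hypothetical, and the definition of relative error as `|actual - expected| / |expected|` is an assumption that happens to reproduce the figures shown for the first mismatch row.

```python
# Values copied from the log above (the log prints rounded values).
mismatches, total = 2045, 2048      # "Mismatch count 2045 ... (2048 elements)"
actual, expected = -9.188, 0.0005493  # first "Top relative error" row
abs_bound = rel_bound = 1e-5        # ErrorSpec{1e-5, 1e-5}


def abs_and_rel_error(actual, expected):
    """Hypothetical helper: absolute error and error relative to |expected|."""
    abs_err = abs(actual - expected)
    rel_err = abs_err / abs(expected)
    return abs_err, rel_err


abs_err, rel_err = abs_and_rel_error(actual, expected)
mismatch_pct = 100.0 * mismatches / total
shape_elems = 4 * 16 * 32           # bf16[4,16,32]

print(f"elements in shape: {shape_elems}")
print(f"mismatch percentage: {mismatch_pct:.4f}%")
print(f"abs error: {abs_err:.3g}, rel error: {rel_err:.2e}")
print(f"exceeds both bounds: {abs_err > abs_bound and rel_err > rel_bound}")
```

With errors this far above both bounds on nearly every element, the output looks wrong outright rather than slightly imprecise.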

amd-jianli12 changed the title from "Triton Emitter for Int4 failed" to "Triton Fusion Emitter for Int4 failed" on Jan 21, 2025
amd-jianli12 changed the title from "Triton Fusion Emitter for Int4 failed" to "Triton Fusion Emitter for Int4 Test Failed" on Jan 21, 2025
@amd-jianli12
Collaborator Author

@local_xla//xla/backends/gpu/codegen/triton:fusion_emitter_int4_device_test_gpu_amd_any

@amd-jianli12
Collaborator Author

PR-2821.txt

@amd-jianli12
Collaborator Author

The fix is under review upstream at openxla/xla#21845.

The corresponding change for tensorflow-upstream is #2821.

@amd-jianli12
Collaborator Author

A further fix for TritonTest.NonstandardLayoutWithManyNonContractingDims* is in #2826.
