The inference results on the GPU differ significantly from those on the NPU. We used the same code and set temperature=0 to ensure reproducibility. Additionally, inference on the NPU is significantly slower than on the A800, and even slower than on the 4090. Is this normal?
Hi, vllm-ascend is still a work in progress. Some PRs still need to be merged into vllm and vllm-ascend. If you hit the error in a multi-card environment, it's a known issue. See #16.
If you hit a different error, please provide more details.
If you are facing the performance problem, we're working on it. Please be patient. Thanks.
- vllm: 0.7.2
- vllm-ascend: latest
- GPU: A800, 4090
- NPU: 910b3

A800:![Image](https://private-user-images.githubusercontent.com/66808901/411815138-a9f92434-3d0b-47a7-99cd-f45449b691de.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk5NDE3MjEsIm5iZiI6MTczOTk0MTQyMSwicGF0aCI6Ii82NjgwODkwMS80MTE4MTUxMzgtYTlmOTI0MzQtM2QwYi00N2E3LTk5Y2QtZjQ1NDQ5YjY5MWRlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE5VDA1MDM0MVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTNlODkzNjNhNmU3ZTkzYTlhNzMxY2FiOGNhNmI5MDQyMTNlNTAyMDBkNjQyYjcwNmZiYzliZGUxY2EzMzhmZmMmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.-wMI79uTS8y_0uZPCMVqps1VgP0ALrW3dpHiCmrUp1Q)
910b3:![Image](https://private-user-images.githubusercontent.com/66808901/411815200-ca7e8aec-86ac-4fdb-b45e-789b3d070d61.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk5NDE3MjEsIm5iZiI6MTczOTk0MTQyMSwicGF0aCI6Ii82NjgwODkwMS80MTE4MTUyMDAtY2E3ZThhZWMtODZhYy00ZmRiLWI0NWUtNzg5YjNkMDcwZDYxLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE5VDA1MDM0MVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWM0YjNhZDZjMTVlYWVhODk0NTA0ZTM5YzIwNjM4MGM1Yzg5ZTA4YmNhMzQwODFiYmE4ZTZmYTVjNjE0OGJkMWMmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.qBnlo9ZkAeN5A41Kqs5CBHuEyxqkEcC0QQ7Nxuf_7Ew)
One of the inference results on the 910b3 contained repeated output; this never happened on the other devices.
![Image](https://private-user-images.githubusercontent.com/66808901/411815275-63d1aaa3-e6c3-40bd-980b-d7919f3f0370.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk5NDE3MjEsIm5iZiI6MTczOTk0MTQyMSwicGF0aCI6Ii82NjgwODkwMS80MTE4MTUyNzUtNjNkMWFhYTMtZTZjMy00MGJkLTk4MGItZDc5MTlmM2YwMzcwLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE5VDA1MDM0MVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTdhMzY1NDExNjgwNmI4NzU1NzQxNmZlMTU2YjZjNzM2MDczZDNlYjJhZTY1NGQwOWI3YTc4YjRhYTJmZDQ0YmUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.CC2OJo13OowePUiSlOHcWvu_xUho5HQjDHxpcCFhYFQ)
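To make the difference easier to report than with screenshots alone, here is a minimal sketch of how the two outputs could be compared as token lists. It is not part of the original report: the helper names `first_divergence` and `has_repeated_ngram`, the whitespace tokenization, and the sample strings are all assumptions made up for illustration, and they do not use any vllm or vllm-ascend API.

```python
from typing import Optional, Sequence


def first_divergence(a: Sequence[str], b: Sequence[str]) -> Optional[int]:
    """Return the index of the first differing token, or None if identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # One sequence is a strict prefix of the other.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


def has_repeated_ngram(tokens: Sequence[str], n: int = 4, min_repeats: int = 3) -> bool:
    """True if some n-gram occurs at least min_repeats times back to back."""
    for start in range(len(tokens) - n * min_repeats + 1):
        gram = tuple(tokens[start:start + n])
        if all(
            tuple(tokens[start + k * n:start + (k + 1) * n]) == gram
            for k in range(1, min_repeats)
        ):
            return True
    return False


# Made-up sample outputs (hypothetical, not taken from the screenshots).
gpu_out = "The capital of France is Paris .".split()
npu_out = "The capital of France is is is is is is".split()

print(first_divergence(gpu_out, npu_out))
print(has_repeated_ngram(npu_out, n=1, min_repeats=3))
```

Reporting the index where the outputs first diverge, and whether the NPU output loops, would probably make it easier to tell a small numerical difference from a real decoding bug.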