[CSIT-1948] NICs do not consistently distribute tunnels over RXQs depending on model or plugin #4030
Comments
Still present around rls2410; [9] below is a simpler trending link.
Still present on rls2406.
Currently I believe it is better to control RSS behavior from the VPP side using existing APIs (instead of relying on preparation before VPP starts). We already have suites that do that (see lines 121-126 in [5]); we just need to make sure those steps work (or at least do not fail) on all tested NIC+driver combinations.
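The "work or at least do not fail" requirement could be handled with a soft-fail wrapper around the RSS setup step. A hypothetical sketch, not actual CSIT code: `configure_rss` and the combination list are stand-ins for the real suite keywords and testbed inventory.

```python
# Hypothetical sketch: apply an RSS-setup step on every NIC+driver
# combination, recording unsupported combinations instead of failing.

def configure_rss(nic: str, driver: str) -> None:
    # Stand-in for the real setup step; here we pretend the avf driver
    # rejects explicit RSS flow rules, purely for illustration.
    if driver == "avf":
        raise NotImplementedError(f"{nic}/{driver}: flow rule not supported")

def apply_rss_everywhere(combos):
    """Try the setup step on each combo; never abort the whole run."""
    results = {}
    for nic, driver in combos:
        try:
            configure_rss(nic, driver)
            results[(nic, driver)] = "configured"
        except NotImplementedError as exc:
            # Record the unsupported combo and continue with the rest.
            results[(nic, driver)] = f"skipped ({exc})"
    return results

combos = [("x710", "dpdk"), ("x710", "avf"), ("cx6dx", "mlx5")]
for combo, status in apply_rss_everywhere(combos).items():
    print(combo, "->", status)
```

The point of the pattern is that a NIC+driver combination which cannot apply the RSS rule degrades to its current (possibly uneven) default instead of failing the suite.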
Description
For most ipsec tests on 3n-icx, AVF puts all tunnels into a single RXQ, but the dpdk plugin distributes them fairly across queues.
Telemetry shows the 40tnl test on AVF [0], compared with Cx6dx using the mlx5 driver (dpdk plugin) [1].
Current trending [2] shows this issue.
(There is also a two-band structure for 4c AVF tests, caused by which testbed got reserved; that is probably unrelated to this RSS issue.)
[0] https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-icx/433/log.html.gz#s1-s1-s1-s1-s20-t1-k2-k13-k9-k14-k1-k1-k1-k1
[1] https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-icx/433/log.html.gz#s1-s1-s1-s1-s8-t1-k2-k13-k9-k14-k1-k1-k1-k1
[2] https://csit.fd.io/trending/#eNrtlMGKwjAQhp-me5GBpjbWiwe177HE6aiBpo1JVluffmN3YSqLsOjCHvSSQP7J_H8-hvjQOnr3VC8SuUqKVZIVuopLMl1O4na0FqYNaOxApOmOMiuwm1UdmLqTgK63oQUhxXwDAoHCXttcW08Yq9PQ1P4E8WSjPIFuAijymZzt0IBx7uKSrS8u1Ue4smTF7ntWbgbheuVI8YWvfKwG8iOj36flDlunDHl9Jm4zvJ4rMPIciXjtHno7Ur8hFOVQ8Qj8_AX_J_z8j-HTXKR4AHXc_vPkc5Dnmfxb8PMX_PsnX5ZvTevM8PfL8hMROrZ7
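For background on why a NIC can collapse many tunnels into one RXQ: most NICs compute a Toeplitz hash over selected header fields and index an RX queue with it, so if the selected fields do not vary between tunnels (for example when the NIC cannot hash ESP and falls back to a constant input), every tunnel lands in the same queue. A minimal illustrative sketch, not CSIT code: the key is the widely published RSS verification key, and the tunnel addresses are made up.

```python
import ipaddress

# Published RSS verification key (also DPDK's default RSS key).
DEFAULT_KEY = bytes([
    0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
    0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
    0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
    0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
    0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
])

def toeplitz_hash(key: bytes, data: bytes) -> int:
    """32-bit Toeplitz hash: for every set bit of `data` (MSB first),
    XOR in the 32-bit window of `key` starting at that bit offset."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i, byte in enumerate(data):
        for bit in range(8):
            if byte & (0x80 >> bit):
                shift = key_bits - 32 - (i * 8 + bit)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result

# Sanity check against the published IPv4 verification vector
# (src 66.9.149.187, dst 161.142.100.80 -> 0x323e8fc2):
assert toeplitz_hash(
    DEFAULT_KEY, bytes([66, 9, 149, 187, 161, 142, 100, 80])) == 0x323E8FC2

# 40 tunnels whose outer src IP varies: the hash spreads them over queues.
# If the hashed fields were identical for every tunnel, all 40 would
# share one hash value and therefore one queue -- the single-RXQ symptom.
NUM_QUEUES = 4
queues = [0] * NUM_QUEUES
dst = ipaddress.IPv4Address("20.0.0.1").packed
for tnl in range(40):
    src = (ipaddress.IPv4Address("10.0.0.1") + tnl).packed
    h = toeplitz_hash(DEFAULT_KEY, src + dst)
    queues[h % NUM_QUEUES] += 1  # real NICs use an indirection table here
print(queues)
```

The exact split depends on the key and the indirection table, but with varying outer IPs in the hash input, traffic does not all collapse onto one queue.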
Assignee
Unassigned
Reporter
Vratko Polak
Comments
[9] https://csit.fd.io/trending/#eNrVUstuwjAQ_Jr0Uq0UB9xw6aGQ_0COsxCreSy7BhS-vo5byeFQCW7txfZ6dnbGI4sfGfeC3Xumt1m5zYrSNWHJVh-vYbsQwWqoQYhB5fkRC1K4Ubk9gbkcwPJEfgSl1aYGZQF962jtSNCu_dDJFUJZG0FwgweDUui3o-2hZ541it2s0Zz9nWBCqJ0S8ruNRDCMJjG-3SXUoyyUHvSa6Ac2PYq7YZoRH546bMhyAdp7aT_RAv2JoKxixzPBN9R8_oXkZx__PHpdvQwj9_Hvh7O04xU8O9NJvArkWMzduvoC9nnofw
The only other bad combination I found (in rls2310 iterative results) is 3n-tsh Intel-X520 with the dpdk plugin on SRv6: [8].
[8] https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-iterative-2310-3n-tsh/22/log.html.gz#s1-s1-s1-s6-s2-t3-k2-k9-k22-k14-k1-k1-k1-k1
The pattern seems to be the opposite for SRv6 tests.
Here, mlx5 suites show bad worker distribution [6], but avf suites are good [7].
Not sure yet if it is possible to hardcode a default RSS hash function in VPP that would work for all encap/decap tests at once.
[6] https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-icx/433/log.html.gz#s1-s1-s1-s6-s2-t3-k2-k9-k9-k14-k1-k1-k1-k1
[7] https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-icx/433/log.html.gz#s1-s1-s1-s6-s8-t3-k2-k9-k9-k14-k1-k1-k1-k1
[5] https://gerrit.fd.io/r/c/csit/+/36119/8/tests/vpp/perf/crypto/10ge2p1x710-ethip4ipsec1000tnlsw-fixtnlip-ip4base-policy-flow-rss-aes256gcm-ndrpdr.robot
> two-band structure for 4c AVF tests caused by which testbed got reserved

That seems to be an unrelated issue. The telemetry from the worse run [3] shows vpp_wk_0 (on either DUT) did not get 256 packets when reading from the TG side (it did when reading from the DUT side), but in the better run [4] it did (and other workers also read from the TG side in larger chunks).
Still, both issues may be related to some NIC configuration, so may get fixed together.
[3] https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-icx/433/log.html.gz#s1-s1-s1-s1-s20-t3-k2-k13-k9-k14-k1-k1-k1-k1
[4] https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-icx/432/log.html.gz#s1-s1-s1-s1-s20-t3-k2-k13-k9-k14-k1-k1-k1-k1
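Comparing runs like [3] and [4] by eye amounts to checking whether per-worker rx counts are roughly even. A small sketch of that check, with made-up counts shaped like the two cases in this issue; parsing the counts out of the telemetry pages is left aside.

```python
# Hypothetical helper (not CSIT code): flag uneven packet distribution
# across VPP workers, given per-worker rx packet counts scraped from
# telemetry such as "show runtime".

def distribution_report(rx_counts: dict, tolerance: float = 0.5) -> str:
    """Return 'fair' if every worker is within `tolerance` of the mean
    share of traffic, otherwise name the outlier workers."""
    total = sum(rx_counts.values())
    if total == 0:
        return "no traffic"
    mean = total / len(rx_counts)
    outliers = [w for w, n in rx_counts.items()
                if abs(n - mean) > tolerance * mean]
    return "fair" if not outliers else "uneven: " + ", ".join(sorted(outliers))

# Counts below are invented, shaped like the two behaviors in this issue:
avf_like = {"vpp_wk_0": 4_000_000, "vpp_wk_1": 0,
            "vpp_wk_2": 0, "vpp_wk_3": 0}           # all traffic on one worker
dpdk_like = {"vpp_wk_0": 1_050_000, "vpp_wk_1": 980_000,
             "vpp_wk_2": 1_010_000, "vpp_wk_3": 960_000}
print(distribution_report(avf_like))   # uneven (every worker is an outlier)
print(distribution_report(dpdk_like))  # fair
```

Such a check could also turn the "bad worker distribution" observations above into a pass/fail telemetry assertion instead of a manual log inspection.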
Original issue: https://jira.fd.io/browse/CSIT-1948