DAOS-17207 daos: upgrade to SPDK 24.09 #16013

Open. wangdi1 wants to merge 6 commits into base: master.

wangdi1 (Contributor) commented Mar 4, 2025

Upgrade to SPDK 24.09.

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

Upgrade to SPDK 24.09

Signed-off-by: Di Wang <[email protected]>
Signed-off-by: Jeff Olivier <[email protected]>

wangdi1 requested review from a team as code owners March 4, 2025 04:03

github-actions bot commented Mar 4, 2025

Ticket title is 'Update SPDK to 24'
Status is 'In Progress'
Errors are Unknown component
https://daosio.atlassian.net/browse/DAOS-17207

daosbuild1 (Collaborator) commented:

Test stage Build RPM on EL 9 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/1/execution/node/332/log

fix style

Signed-off-by: Di Wang <[email protected]>

daosbuild1 (Collaborator) commented:

Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/1/execution/node/307/log

daosbuild1 (Collaborator) commented:

Test stage Build RPM on EL 9 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/2/execution/node/333/log

daosbuild1 (Collaborator) commented:

Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/2/execution/node/334/log

daosbuild1 (Collaborator) commented:

Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/2/execution/node/328/log

daosbuild1 (Collaborator) commented:

Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/2/execution/node/337/log

wangdi1 added 2 commits March 4, 2025 06:37
Fix style and add those binaries in app.

Signed-off-by: Di Wang <[email protected]>
fix style issue.

Signed-off-by: Di Wang <[email protected]>

daosbuild1 (Collaborator) commented:

Test stage Build RPM on EL 9 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/4/execution/node/337/log

daosbuild1 (Collaborator) commented:

Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/4/execution/node/284/log

daosbuild1 (Collaborator) commented:

Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/4/execution/node/281/log

daosbuild1 (Collaborator) commented:

Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/4/execution/node/306/log

tanabarr (Contributor) left a comment

I think there are some missing specfile and Debian packaging changes that will be needed for this update.

-['cp', 'build/examples/nvme_manage', '$SPDK_PREFIX/bin/spdk_nvme_manage'],
-['cp', 'build/examples/identify', '$SPDK_PREFIX/bin/spdk_nvme_identify'],
-['cp', 'build/examples/perf', '$SPDK_PREFIX/bin/spdk_nvme_perf']],
+['cp', 'build/examples/nvme_manage', '$SPDK_PREFIX/bin/spdk_nvme_manage']

Contributor:

Why remove the perf and identify apps?

Contributor (Author):

perf and identify have been moved under app. Since I removed --disable-apps, they should be included.

@@ -26,7 +26,6 @@ protobufc=https://github.com/protobuf-c/protobuf-c.git
ucx=https://github.com/openucx/ucx.git

[patch_versions]
spdk=https://github.com/spdk/spdk/commit/b0aba3fcd5aceceea530a702922153bc75664978.diff,https://github.com/spdk/spdk/commit/445a4c808badbad3942696ecf16fa60e8129a747.diff

Contributor:

Are we sure both of these commits are in v24.09?

Contributor (Author):

Yes, I checked.
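
A claim like this can be double-checked locally. The sketch below is only one possible approach, not what the author ran: it pulls the commit hashes out of the `patch_versions` value quoted above and asks git whether each one is reachable from a ref. The local clone path (`spdk`) in the usage comment and the helper names are hypothetical; the `v24.09` tag name is taken from the review comment.

```python
import re
import subprocess


def patch_commits(value):
    """Extract commit SHAs from a comma-separated list of GitHub .diff URLs."""
    return re.findall(r"/commit/([0-9a-f]{40})\.diff", value)


def is_ancestor(repo, commit, ref):
    """Return True if `commit` is reachable from `ref` in the clone at `repo`.

    Uses `git merge-base --is-ancestor`, which exits 0 when the commit is an
    ancestor, 1 when it is not, and another code on error.
    """
    result = subprocess.run(
        ["git", "-C", repo, "merge-base", "--is-ancestor", commit, ref],
        capture_output=True,
        text=True,
    )
    if result.returncode in (0, 1):
        return result.returncode == 0
    raise RuntimeError(result.stderr.strip())


# The patch list as it appears in the diff hunk above.
PATCHES = (
    "https://github.com/spdk/spdk/commit/"
    "b0aba3fcd5aceceea530a702922153bc75664978.diff,"
    "https://github.com/spdk/spdk/commit/"
    "445a4c808badbad3942696ecf16fa60e8129a747.diff"
)

# Example use, assuming a clone at ./spdk (hypothetical path) with tags fetched:
#   for sha in patch_commits(PATCHES):
#       print(sha, is_ancestor("spdk", sha, "v24.09"))
```

A `False` result is not conclusive on its own: a fix that was cherry-picked onto a release branch gets a new hash, so the change can be present even though the original commit is not an ancestor of the tag.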

@@ -1617,13 +1702,14 @@ bio_xsctxt_alloc(struct bio_xs_context **pctxt, int tgt_id, bool self_polling)

/* Initialize all registered subsystems: bdev, vmd, copy. */
common_prep_arg(&cp_arg);
spdk_subsystem_init_from_json_config(nvme_glb.bd_nvme_conf,
SPDK_DEFAULT_RPC_ADDR,

Contributor:

Will the same default RPC address be used with these changes?

Contributor (Author):

tanabarr (Contributor) commented Mar 4, 2025

The normal procedure is to update the daos/spdk packaging branch first, IIRC.

subsystem_init_done(int rc, void *arg)
{
	subsystem_init_arg_fini(arg, rc);
}

Contributor:

Why do we need this callback?

Contributor (Author):

To free json_data basically.

};

static void
subsystem_init_arg_fini(void *arg, int rc)

Contributor:

Hmm, why do we need this callback? Can't we just modify the existing subsys_init_cb()?

Contributor (Author):

Yeah, indeed, we can use that callback. Thanks.

init_arg.json_data = json_data;
init_arg.json_data_size = json_data_size;
spdk_subsystem_load_config(json_data, (ssize_t)json_data_size, load_config_cb, &init_arg,
true);

Contributor:

I'm not sure I follow this. Can't we call spdk_subsystem_init() directly?

Contributor (Author):

Oh, the RPC state has to be set to RUN_TIME before the aio_create commands in the JSON config file can run. So the first load_config() is for running those methods (bdev_aio_create()), then the RPC state is set to RUN_TIME explicitly, then the second load_config() is done. In the previous version (22.01), this could be done with a single API call, but that was removed in the current version, which makes the process a bit strange indeed.

"subsystems": [
{
"subsystem": "bdev",
"config": [
{
"params": {
"bdev_io_pool_size": 65536,
"bdev_io_cache_size": 256
},
"method": "bdev_set_options"
},
{
"params": {
"retry_count": 4,
"timeout_us": 0,
"nvme_adminq_poll_period_us": 100000,
"action_on_timeout": "none",
"nvme_ioq_poll_period_us": 0
},
"method": "bdev_nvme_set_options"
},
{
"params": {
"enable": false,
"period_us": 0
},
"method": "bdev_nvme_set_hotplug"
},
{
"params": {
"block_size": 4096,
"name": "AIO_1",
"filename": "/tmp/aio_file"
},
"method": "bdev_aio_create"
}
]
}
]
}
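
The two-pass loading discussed above can be modeled without SPDK at all. The sketch below is a plain-Python illustration of one reading of that sequence, not the PR's implementation: each config method is assumed to be allowed in a startup state, a runtime state, or both, and each pass runs only what its state permits. The state flags assigned to each method here are assumptions for illustration; the real values depend on how each RPC is registered in the SPDK source. The helper name `methods_for_pass` is hypothetical.

```python
import json

STARTUP, RUNTIME = 1, 2

# Assumed state masks, for illustration only. Verify against the SPDK source.
ASSUMED_STATE_MASK = {
    "bdev_set_options": STARTUP,
    "bdev_nvme_set_options": STARTUP | RUNTIME,
    "bdev_nvme_set_hotplug": RUNTIME,
    "bdev_aio_create": RUNTIME,
}


def methods_for_pass(config, state):
    """Return the config methods that a pass in the given RPC state would run.

    Raises KeyError for a method that has no entry in ASSUMED_STATE_MASK.
    """
    selected = []
    for subsystem in config.get("subsystems", []):
        for entry in subsystem.get("config") or []:
            method = entry["method"]
            mask = ASSUMED_STATE_MASK[method]
            if not mask & state:
                continue  # method is not allowed in this state, skip it
            if state == RUNTIME and mask & STARTUP:
                continue  # already handled by the startup pass
            selected.append(method)
    return selected


# A trimmed copy of the config shown above (most params omitted for brevity).
CONFIG = json.loads("""
{
  "subsystems": [
    {
      "subsystem": "bdev",
      "config": [
        {"params": {"bdev_io_pool_size": 65536}, "method": "bdev_set_options"},
        {"params": {"retry_count": 4}, "method": "bdev_nvme_set_options"},
        {"params": {"enable": false}, "method": "bdev_nvme_set_hotplug"},
        {"params": {"name": "AIO_1", "filename": "/tmp/aio_file"},
         "method": "bdev_aio_create"}
      ]
    }
  ]
}
""")

first_pass = methods_for_pass(CONFIG, STARTUP)
second_pass = methods_for_pass(CONFIG, RUNTIME)
print("startup pass:", first_pass)
print("runtime pass:", second_pass)
```

Under these assumed flags, `bdev_aio_create` only appears in the second pass, which matches the point that the RPC state has to be switched before the AIO device can be created.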

Use subsys_init_cb directly.

Signed-off-by: Di Wang <[email protected]>

wangdi1 requested a review from tanabarr March 6, 2025 18:07
wangdi1 requested a review from NiuYawei March 6, 2025 18:07

daosbuild1 (Collaborator) commented:

Test stage Build RPM on EL 9 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/5/execution/node/345/log

daosbuild1 (Collaborator) commented:

Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/5/execution/node/310/log

daosbuild1 (Collaborator) commented:

Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/5/execution/node/323/log

daosbuild1 (Collaborator) commented:

Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/5/execution/node/307/log

Update the function and fix more JSON files.

Signed-off-by: Di Wang <[email protected]>

daosbuild1 (Collaborator) commented:

Test stage Build RPM on EL 9 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/6/execution/node/337/log

daosbuild1 (Collaborator) commented:

Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/6/execution/node/342/log

daosbuild1 (Collaborator) commented:

Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/6/execution/node/345/log

daosbuild1 (Collaborator) commented:

Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-16013/6/execution/node/338/log

4 participants