Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Moving Data Between Fields in Dataverse 6.4 #11251

Open
Gerafp opened this issue Feb 12, 2025 · 4 comments
Open

Moving Data Between Fields in Dataverse 6.4 #11251

Gerafp opened this issue Feb 12, 2025 · 4 comments

Comments

@Gerafp
Copy link
Contributor

Gerafp commented Feb 12, 2025

Hello everyone,

We recently updated our Dataverse from version 5.10 to 6.4. The update added new fields to the "New Dataset" form that we want to take advantage of. We are looking to move the values from the "Controlled Vocabulary URL" field to the new field "Term URI" field.

Image

Image

We have no issues with recently created datasets, as our main interest is to move the values in older datasets (before the update).

Do you have any experience or recommendations for performing this task? We are looking for a way that does not involve creating new versions of the datasets.

Thank you in advance for your help!

Regards

@pdurbin
Copy link
Member

pdurbin commented Feb 12, 2025

@Gerafp are you aware of "Make Metadata Updates Without Changing Dataset Version"? Please see https://guides.dataverse.org/en/6.4/admin/dataverses-datasets.html#make-metadata-updates-without-changing-dataset-version

@qqmyers
Copy link
Member

qqmyers commented Feb 12, 2025

Alternately, I think you could use a variant of the sql update query in the release notes. The one there was moving entries from the keywordValue field to the new keywordTermURI field. To move the keywordVocabularyURI instead, you'd just change that part, e.g. to:

UPDATE datasetfield df
SET datasetfieldtype_id = (SELECT id FROM datasetfieldtype WHERE name = 'keywordTermURI')
FROM datasetfieldvalue dfv
WHERE dfv.datasetfield_id = df.id
AND df.datasetfieldtype_id = (SELECT id FROM datasetfieldtype WHERE name = 'keywordVocabularyURI')
AND dfv.value ILIKE 'http%';

Note that will move everything (that starts with 'http') for all datasets, so if you have ones where that isn't appropriate, using the Update Current Version mechanism might be easier, or you could adapt the query to be more specific, e.g. with the dfv.value ILIKE 'http://aims.fao.org/aos/agrovac%', etc.

@Gerafp
Copy link
Contributor Author

Gerafp commented Feb 12, 2025

Oh, thanks.

It seems that the second option is simpler due to the amount of data. I will also do the step from http to https in the process.

I have another question: does this change affect the templates?

@qqmyers
Copy link
Member

qqmyers commented Feb 12, 2025

Good question! I think the query as is would change templates, assuming the ILIKE query matches for them. I think you could distinguish if you want by checking to see if the datasetfield.datasetversion_id or template_id are null.

And just to repeat our general advice - when working in the db, make a backup and try things on a test machine first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants