---
title: "Sharing and Archiving Qualitative Data"
editor:
markdown:
wrap: 72
---
## Why Share Qualitative Data?
Sharing qualitative data benefits both the scholarly community and
researchers in several ways:
1. Fostering Public Trust: Transparency enhances public confidence in
research outcomes, vital for securing funding and support for future
projects. It allows for verification of claims, reinforcing trust in
the research.
2. Dynamic Research Environment: While qualitative research invites
diverse interpretations, sharing data fosters improved research
quality through collaborative critique and examination.
3. Enabling New Research: Access to shared data inspires innovative
analyses, maximizing the scientific value of existing studies.
4. More Effective Use of Resources: Data sharing reduces costs related
to new data collection, promoting efficient resource utilization,
and minimizing the burden on frequently targeted communities.
5. Skill Development for Trainees: It offers students valuable
opportunities to learn coding and analysis techniques, enhancing
their educational experience.
6. Receiving Credit: Sharing data ensures proper attribution, allowing
researchers to gain recognition for their work.
7. Opportunities for Collaboration: Open data fosters partnerships
among researchers, leading to new insights and advancements.
## Sharing with Caring
When sharing data, researchers should make their best effort to provide
complete and good quality documentation to support reuse.
Before we dive into what researchers should share and where, let's
explore something together.
::: {.callout-note collapse="true" icon="false"}
# 💭 **Discussion:** Comparing Data Deposits
Please open the links to the two data deposits below:
Taherzadeh, O., 2016, "Interview Transcripts", *Interview Transcripts*,
<https://doi.org/10.7910/DVN/4C9KFK/XRREIY>, Harvard Dataverse, V1
Klein, M., 2022. *Interview transcripts of addiction therapists and
recovering drug service users.* Bath: University of Bath Research Data
Archive. Available from: <https://doi.org/10.15125/BATH-01096>{target='_blank'}.
Can you spot any differences? Supposing those were both topics related
to your research, how likely would you be to reuse one dataset versus
another? Why?
::: {collapse="true"}
**Context and Documentation**
- Taherzadeh (2016): This deposit lacks detailed contextual
information about the study, such as the sample, interview
questions, study goals, or informed consent details. It is a
standalone collection of transcripts.
- Klein (2022): This deposit provides clearer context, the objectives
of the research and questions asked, and links to the associated
dissertation.
**Reuse Value**
Taherzadeh (2016) Dataset: Low
- The absence of context and supporting documentation makes it
challenging to assess the dataset’s validity, reliability, and
relevance to other research. Without knowing the background or how
the data was collected, it's difficult to justify its use in further
studies.
Klein (2022): Higher
- The dataset seems to come with comprehensive documentation,
including context about the participants and the study goals. This
information facilitates a better understanding of how to apply the
data effectively in new research, making it much more reusable.
:::
:::
## Considerations on What to Share
Remember when we discussed the importance of outlining data-sharing
plans in Data Management Plans (DMPs)? At this stage, Sarah could
greatly benefit from having a clear strategy for archiving and storing
her data. As we discussed earlier, understanding the available options
and having at least a rough plan for what will be shared, along with
strategies to facilitate the process, is very important. We provided
Sarah with recommendations on what to document, and we hope this
guidance will empower her to share her research deliverables confidently
while adhering to key principles of open practices.
It is also important to balance the value of
open sharing against the risks of harm associated with the
identification of participants, communities, and research sites. The
good news is that there are more options in between data being closed
and open!
Depending on your project needs and what was agreed in the informed
consent, we recommend evaluating access control options. They will help
you determine which data repository is most suitable
for storing and preserving your project data.
### Access Control Questions
Access controls fall into three main categories:
- *Who* can access your data? Access may be limited to qualified
researchers, often requiring proof of interest through a research
proposal, or it may require pre-approval from an Institutional
Review Board (IRB) for general requests.
- *How* can others access your data? Secure internet connections,
along with agreements regarding data storage and destruction, might
be required for downloading data. Researchers may sometimes need to
access data in person on a secure, offline computer. Hybrid
solutions, like ICPSR’s “virtual enclave,” allow remote viewing
without data leaving the server.
- *When* can others access your data? Embargoes can temporarily
restrict access to protect human participants, often allowing
researchers to publish findings before broader access. These
embargoes can also facilitate long-term data availability, with set
dates for lifting restrictions, as seen in historical archives.
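
One way to keep these three decisions explicit in project documentation is to record them in a small structure. The sketch below is illustrative only: the `AccessPolicy` class, its field names, and the example values are assumptions made for this example, not part of any repository's system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AccessPolicy:
    """Hypothetical record of the who / how / when decisions for one dataset."""
    who: str                              # e.g. "anyone" or "qualified researchers"
    how: str                              # e.g. "download", "virtual enclave", "on-site"
    embargo_until: Optional[date] = None  # None means no embargo

    def is_under_embargo(self, today: date) -> bool:
        """Return True while the embargo date has not yet been reached."""
        return self.embargo_until is not None and today < self.embargo_until


# Example values are made up for illustration.
policy = AccessPolicy(
    who="qualified researchers",
    how="virtual enclave",
    embargo_until=date(2026, 1, 1),
)
print(policy.is_under_embargo(date(2025, 6, 1)))  # embargo still in force
print(policy.is_under_embargo(date(2026, 1, 1)))  # embargo lifted on this date
```

Writing the policy down this way does not enforce anything by itself; the repository applies the actual controls. It simply gives the team one place to check what was agreed.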
### Sharing Levels
- Openly available: data (typically de-identified) shared with no
restrictions.
Example: Cunningham, Una; De Brún, Aoife; Mayumi, Willgerodt et al.
(2021). Appendices interview formats \[Dataset\]. Dryad.
<https://doi.org/10.5061/dryad.q83bk3jg8>{target='_blank'}
- Subject to Embargo: a temporary restriction on sharing or publishing
data. It means that the data can’t be made public for a set period,
usually to protect sensitive information, allow for further analysis,
or wait for a specific event, such as a formal publication.
Example: Ibitoye, Mobolaji; OlaOlorun, Funmilola; Casterline, John B.
2025. "Demand for Modern Contraception in Sub-Saharan Africa: New
Methods, New Evidence". Qualitative Data Repository.
<https://doi.org/10.5064/F600CMLO>{target='_blank'}. QDR Main Collection. V1
- Closed Access/Metadata Record Only (sensitive data/no consent): a
summary and description of a dataset, without the actual data, that
provides essential information about the dataset's provenance,
structure, and context.
Depending on the research case, access can be provided through a Data
Use Agreement (DUA) and involve a data enclave for safe access. These
requirements will also depend on IRB and consent form agreements.
- Data Use Agreement (DUA) required: a contract that outlines the
terms and conditions for a recipient to use data from a data owner.
It's specific to a project or study and can include limitations on
use, data safeguarding obligations, and privacy rights. Some
supplementary files (e.g., codebooks, data collection instruments, or
selected processed data needed to reproduce specific figures or
support some findings) may still be shared openly.
Example: Steeves, Vicky; Peltzman, Shira; Kim, Julia; Griesinger, Peggy;
Blumenthal, Karl-Rainer. 2020. "Data for: 'What’s Wrong with Digital
Stewardship: Evaluating the Organization of Digital Preservation
Programs from Practitioners’ Perspectives'". Qualitative Data Repository.
<https://doi.org/10.5064/F6DJRPLK>{target='_blank'}.
::: {.callout-note collapse="true" icon="false"}
# 💭 **Discussion:** What is the value of sharing a metadata record only?
A metadata-only record for research data that isn't openly available
enables readers to quickly evaluate whether they want to request access.
While a well-crafted Data Availability Statement in journal papers
serves a similar purpose, a metadata-only record in a suitable
repository offers the benefit of being discoverable through data-focused
searches, along with the ability to provide more detailed descriptions
through rich, linked, and interoperable metadata.
:::
### A Note About DAS
Data Availability Statements (DAS) are crucial for the credibility of
manuscripts and other published research. They provide interested
readers—and sometimes automated algorithms—access to the underlying data
supporting your claims, allowing them to verify those assertions or use
the data for further research. We suggest following some best practices
for crafting statements that are both effective and clear while also
complying with funder and journal policy requirements.
<iframe width="50%" height="800" src="https://rcd.ucsb.edu/sites/default/files/2024-02/DLS-202402-DataAvailability_navy.pdf">
</iframe>
Source: UCSB Library Data Literacy Series
([perma.cc/3ZHR-6JAG](https://perma.cc/3ZHR-6JAG){target='_blank'})
### Applying Access Controls
Implementing access controls involves a trade-off: while stricter
controls reduce misuse risk, they can hinder beneficial access. Though
powerful, they should not unnecessarily complicate access to low-risk
data. As the principal steward of your data, you ultimately decide on
access controls. However, it’s advisable to involve repository staff in
this process, as they can highlight potential challenges, ensuring that
your data remains accessible and ethically shared in the long run.
- Do consider sharing de-identified transcripts openly while placing
recordings under more stringent access controls.
- Do keep a list of de-identification rules for yourself and your team
should you collaborate. This list serves as necessary documentation
when you share your data. See, for example, the protocol Thad
Dunning and Edward Camp used to de-identify data deposited with the
Qualitative Data Repository. This document is separate from the key
that links de-identified entries to the individuals or entities
interviewed, which should not be included when sharing your data.
- Do check the document properties of files, which may contain
identifiers such as original file names identifying interview
respondents.
- Finally, do try to strike a balance between keeping your
participants’ information confidential and unnecessarily reducing
the analytic value of the data by removing too much information. If
you are having difficulties striking that balance, you could ask
another subject-matter expert for assistance; some repository
personnel or data librarians can also provide abstract rules that
you can follow.
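
A written list of de-identification rules can double as a script, so the same rules are applied consistently and the documentation matches what was actually done. The sketch below is a minimal illustration, not the protocol used by any cited project: the rule list, the placeholder labels, and the sample sentence are all hypothetical. Pattern matching only catches direct identifiers with a predictable shape, so a manual read-through is still needed for indirect identifiers.

```python
import re

# Hypothetical rule list: (regex pattern, replacement, note for the documentation).
DEID_RULES = [
    (r"\b\d{3}-\d{3}-\d{4}\b", "[PHONE]", "US-style phone numbers"),
    (r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b", "[EMAIL]", "email addresses"),
    (r"\bDr\. [A-Z][a-z]+\b", "[CLINICIAN]", "clinician names with a title"),
]


def deidentify(text, rules=DEID_RULES):
    """Apply each rule in order; return the new text and a count per rule."""
    counts = {}
    for pattern, replacement, note in rules:
        text, n = re.subn(pattern, replacement, text)
        counts[note] = n
    return text, counts


# Made-up sample sentence, not taken from any real transcript.
sample = "Dr. Lopez said to call 805-555-0100 or write to lopez@example.org."
clean, counts = deidentify(sample)
print(clean)
print(counts)
```

The per-rule counts give a simple audit trail that can be summarized in the README. The rule list itself can be shared, while the key linking placeholders to real individuals stays out of the deposit.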
### What data?
ICPSR's Guide for Sharing Qualitative Data outlines examples of
qualitative data sources that may be archived for secondary analysis:
- Interview methods, including those captured through notes, audio, and
  video
    - In-depth and/or unstructured interviews
    - Semi-structured interviews
    - Focus group interviews
- Diary studies that are unstructured or use semi-structured writing
  prompts
- Observational studies that generate field notes and other text and
  information
    - Naturalistic observation of real-world environments (e.g.,
      classrooms, workplaces, healthcare facilities, courtrooms, public
      spaces)
    - Participant observation, where the researcher becomes an active part
      of the setting to collect information (e.g., online gaming,
      community policing, nightclub culture)
    - Structured observation, where the researcher has predefined
      objectives and a systematic approach to collecting information. This
      would include case studies.
- Text from available sources
    - Meeting minutes
    - Official records
    - Medical records
    - News sources and social media
    - Excerpts of copyrighted materials (e.g., literature, film, music)
- Survey methods or questionnaires with substantial open-ended comments
### Open formats
Why should we prioritize open file formats in our research? Imagine
sharing your groundbreaking findings and ensuring that anyone, anywhere,
can access and build upon your work without running into compatibility
issues. Open formats offer exactly that: freedom from proprietary
software constraints. By choosing open formats, you enhance
collaboration and transparency and make your research more sustainable
for others and your future self.
There is a diversity of open formats available across different types of
media that can be of great use to qualitative data researchers,
including audio, video, image, and text. Refer to the handout below for
some examples:
<iframe width="50%" height="800" src="https://rcd.ucsb.edu/sites/default/files/2023-03/dls-n07-2021-openformats-navy.pdf">
</iframe>
Source: UCSB Library Data Literacy Series
([perma.cc/W4FL-JDFT](https://perma.cc/W4FL-JDFT){target='_blank'})
### Where Should You Share Your Project Data?
The decision of where to archive data is crucial for ensuring its
accessibility, integrity, and long-term preservation. Selecting a
stable, certified repository not only safeguards the data against loss
or corruption but also enhances its credibility and usability within the
research community. Unlike sharing via email, personal communication, or
unsecured websites—methods that can lead to data loss, miscommunication,
and lack of traceability—certified repositories provide a structured and
secure environment for data management.
Such repositories adhere to rigorous standards for data storage and
access, ensuring that shared data remains discoverable, citable, and
protected over time. By thoughtfully choosing the right repository,
researchers can maximize the impact of their work, facilitate
reproducibility, and contribute to the advancement of knowledge across
various fields.
Beyond support for access controls when required, the choice of a
repository to archive QHS data should take into account several factors
laid out in the handout below:
<iframe width="50%" height="800" src="https://rcd.ucsb.edu/sites/default/files/2023-03/dls-n05-2021-dr-navy.pdf">
</iframe>
Source: UCSB Library Data Literacy Series
([perma.cc/WLF7-WTUC](https://perma.cc/WLF7-WTUC){target='_blank'}).
### Preparing Your Data for Submission
A few required and recommended files should be included in your project
submission package.
Required:
- Processed de-identified data (e.g., transcripts);
- Coded Data (supporting excerpts);
- README File: an overview of your project, including data sources,
their relationships and a brief description of the methods. Here is
a [customizable README
template](https://zenodo.org/records/10828379){target='_blank'};
- Data Collection Instruments: A sample of instruments used for data
collection, such as surveys or interview guides;
- Codebook: the coding framework used, including definitions of codes
and categories.
Recommended:
- Informed consent statement(s), if applicable;
- IRB protocol, if applicable;
- Study protocol or procedures manual, if applicable.
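
Before uploading, it can help to check the deposit folder against this list. The sketch below is one possible way to do that, under stated assumptions: every file and folder name in it is hypothetical and should be adapted to your project and your repository's conventions.

```python
import tempfile
from pathlib import Path

# Hypothetical names; adapt them to your own project layout.
REQUIRED_FILES = ["README.md", "codebook.pdf", "interview_guide.pdf"]
REQUIRED_DIRS = ["transcripts_deidentified", "coded_excerpts"]
RECOMMENDED_FILES = ["consent_form.pdf", "irb_protocol.pdf", "study_protocol.pdf"]


def check_deposit(folder):
    """Return the required and recommended items missing from a deposit folder."""
    root = Path(folder)
    missing_required = [n for n in REQUIRED_FILES if not (root / n).is_file()]
    missing_required += [n for n in REQUIRED_DIRS if not (root / n).is_dir()]
    missing_recommended = [n for n in RECOMMENDED_FILES if not (root / n).is_file()]
    return {"required": missing_required, "recommended": missing_recommended}


# Demo on a temporary folder that contains only a README and the transcripts folder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "README.md").write_text("Project overview")
    (Path(tmp) / "transcripts_deidentified").mkdir()
    report = check_deposit(tmp)

print("Missing required items:", report["required"])
print("Missing recommended items:", report["recommended"])
```

The check only confirms that the items exist. It says nothing about their quality, so the README and codebook still need a human review before submission.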
<iframe width="50%" height="800" src="https://rcd.ucsb.edu/sites/default/files/2024-07/DLS-202407-QualDataSharing.pdf">
</iframe>
Source: UCSB Library Data Literacy Series
([perma.cc/E7BA-BBYE](https://perma.cc/E7BA-BBYE){target='_blank'}).
### Licensing Your Data
Research data itself is generally not copyrightable because it consists
of facts, figures, and raw information that cannot be considered
original creative expression. Copyright protects the unique expression
of ideas, such as written texts, artwork, and music, rather than the
underlying data or factual content.
Most data repositories adhere to open licenses such as CC0 (Creative
Commons Zero) or CC BY (Creative Commons Attribution) to encourage broad
accessibility and reuse of data. These licenses promote the free sharing
of knowledge, allowing researchers and practitioners to utilize, modify,
and redistribute data without significant restrictions, ultimately
fostering collaboration and innovation within the scientific community.
However, researchers may choose to assign different licenses to other
creative deliverables and supplementary materials associated with their
projects, such as reports, presentations, or multimedia content. For
example, Sarah might opt for a CC BY-NC (Attribution-NonCommercial)
license for an infographic she created to represent ethical
approaches in the social media influencing market, restricting its use
for commercial purposes. This flexibility allows Sarah and the research
community at large to balance openness with the need to protect specific
aspects of their intellectual property while still contributing to the
collective body of knowledge.
The handout below provides more insights about licenses, including the
Creative Commons family:
<iframe width="50%" height="800" src="https://rcd.ucsb.edu/sites/default/files/2023-03/dls-n10-2021-licensing-navy_0.pdf">
</iframe>
Source: UCSB Library Data Literacy Series
([perma.cc/ET6F-N84X](https://perma.cc/ET6F-N84X){target='_blank'}).
------------------------------------------------------------------------
**Recommended/Cited Sources:**
Campbell R, Javorka M, Engleton J, Fishwick K, Gregory K,
Goodman-Williams R. Open-Science Guidance for Qualitative Research: An
Empirically Validated Approach for De-Identifying Sensitive Narrative
Data. *Advances in Methods and Practices in Psychological Science*.
2023;6(4).
doi:[10.1177/25152459231205832](https://doi.org/10.1177/25152459231205832){target='_blank'}
Myers CA, Long SE, Polasek FO. Protecting participant privacy while
maintaining content and context: Challenges in qualitative data
de-identification and sharing. *Proc Assoc Inf Sci Technol*. 2020;57:e415.
<https://doi.org/10.1002/pra2.415>{target='_blank'}
DuBois, J. M., Strait, M., & Walsh, H. (2018). Is it time to share
qualitative research data? *Qualitative Psychology, 5*(3), 380–393.
<https://doi.org/10.1037/qup0000076>{target='_blank'}