A data donation study design consists of two elements:
- Participant recruitment: How are you going to find and invite participants for your study?
- Participant study flow: What do you want the participants to do, and in what order?
Consider consulting a methodologist, who will have expertise in participant recruitment and in designing a study flow.
Data donation can be implemented in both qualitative and quantitative studies. Researchers can invite participants themselves or collaborate with a recruitment or panel company. We advise you to consult the literature on sampling designs and related challenges to find a fitting sampling strategy. See for example Frankel (1983), Hibberts et al. (2012), Leeuw et al. (2008), Lohr (2021), and Weisberg (2009).
In data donation studies, it is important to know that the people you invite to participate may not align with the accounts on the platform you are interested in. For example, a person might not have an account on that platform, or conversely might have multiple accounts. Figure 3 illustrates what this misalignment can look like for a single platform to answer RQ2. To account for the misalignment in this example, you can allow participants to donate multiple Instagram DDPs if they have them. Without such an option, the collected data may be incomplete.
An important consideration is the use of incentives. Monetary incentives are considered best for maintaining acceptable response rates in survey studies in general (Toepoel, 2012). Similar results are found for several versions of data donation studies (Kmetty et al., 2023; Silber et al., 2022) and for comparable data collection through smartphones (Haas et al., 2020). However, the appropriate amount depends on the study context and is still subject to research (Struminskaya, 2022).
Once the population and sample of interest are decided on, it is important to gain insight into the general characteristics of this population. Think of their age distribution and their proficiency with technology, but also of aspects such as their opinion on research. Based on this, you might tailor characteristics of the participant flow to better accommodate this group. For example, you might need to make your instructions on how to perform a data access request available in multiple languages.
The exact elements of a participant flow and their order can differ per study. However, a common participant flow consists of the following elements:
- An invitation and information letter, explaining the purpose of the study and the data donation procedure for the participant.
- A consent form. Only a participant who signs the consent form can continue their participation.
- Potentially, additional data collection methods can be administered, such as a questionnaire.
- Instructions should be provided on how to perform a data access request and how to download the DDP once it’s ready at the platform of interest.
- Participants sometimes need to wait a couple of days before the DDP is ready to be downloaded, and the waiting time differs substantially per platform. In some cases, we therefore advise sending reminder messages in between.
- Once the participant has downloaded and stored the DDP on their own device, the local processing step can take place.
- Next, the participant can inspect the extracted data. Here, you can potentially give options to edit or delete (parts of) this data.
- Finally, the participant provides active consent to donate the extracted data (or declines).
- Potentially, a short questionnaire can be used to investigate how participants evaluate their data donation experience.
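The common flow listed above can be sketched as an ordered sequence of steps. The following is a minimal Python sketch, not a prescribed implementation: the step names, the `build_flow` and `next_step` helpers, and the choice of which steps count as optional are all assumptions made for illustration. The only rule taken directly from the list is that a participant who does not sign the consent form cannot continue.

```python
from enum import Enum, auto


class Step(Enum):
    """Hypothetical step names, in the order of the common flow above."""
    INVITATION = auto()
    CONSENT = auto()
    QUESTIONNAIRE = auto()          # optional additional data collection
    REQUEST_INSTRUCTIONS = auto()   # how to request and download the DDP
    WAIT_AND_REMIND = auto()        # optional reminders while the DDP is prepared
    LOCAL_PROCESSING = auto()
    INSPECT_DATA = auto()
    DONATION_DECISION = auto()      # active consent to donate, or decline
    EVALUATION = auto()             # optional questionnaire on the experience


# Assumption: these are the elements the list describes as optional.
OPTIONAL_STEPS = {Step.QUESTIONNAIRE, Step.WAIT_AND_REMIND, Step.EVALUATION}


def build_flow(include_optional=frozenset()):
    """Return the steps in order, keeping only the requested optional ones."""
    return [s for s in Step if s not in OPTIONAL_STEPS or s in include_optional]


def next_step(flow, current, consent_signed):
    """Return the step after `current`, or None when the flow ends.

    A participant who has not signed the consent form cannot continue.
    """
    if current is Step.CONSENT and not consent_signed:
        return None
    position = flow.index(current)
    return flow[position + 1] if position + 1 < len(flow) else None


flow = build_flow(include_optional={Step.WAIT_AND_REMIND})
```

Writing the flow down this explicitly can make it easier to check, per study, which elements are included and where participants can leave the flow.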
We have five recommendations for constructing a participant flow that will help to maximize the quality of the donated data:
Make sure the entire participant flow is easy for participants to follow, and prevent excessive burden. Participants might drop out of the study if the requested steps are perceived as too demanding or difficult. One way to keep the perceived burden low is to tailor the elements of the participant flow to the population of interest. For example, if literacy in the target population is low, instructions and explanations for every step of the participant flow should use simple and accessible language.
Guarantee the privacy of the participant and communicate this clearly during the entire participant flow. The invitation and consent form should state the rights of the participant and the privacy assurances. In addition, it can help to stress these explicitly at multiple moments during the participant flow. Participants who do not perceive their privacy as guaranteed are less likely to donate (Struminskaya, 2022; Struminskaya et al., 2020) or to participate in data collection in general (Keusch et al., 2019; Sakshaug & Struminskaya, 2023). For instance, some participants in the WhatsApp pilot (example RQ3) and the GSLH study (example RQ1) indicated that they felt their privacy was not guaranteed during the data processing. For some participants, this was a reason not to participate in the study. Therefore, investing in communication about the process and about privacy during the study can be valuable.
Create clear and precise instructions on how to request and download the DDP. When creating the instructions for these steps, note that the exact actions needed to obtain a DDP can differ across devices and operating systems. Look for potential variations in these steps and account for them in the instructions you provide to the participant. To illustrate, Figure 4 below shows how the steps for obtaining WhatsApp chat DDPs differ between Android and Apple devices.
Account for the time it takes for a platform to provide a DDP, and be conscious of expiration times of a DDP. Platforms are obliged to comply with the data access request within 30 days (European Union, 2016). However, platforms differ strongly in how fast they provide the DDPs to their users, ranging from minutes to multiple days. For example, WhatsApp account information takes three days to be ready, while Google DDPs can take between an hour and three days. Communication about the waiting time should be clear and correct. Well-timed reminders can help keep participants involved in the study.
Additionally, expiration times of DDPs should be taken into account. Once a requested DDP is ready, platforms make it available for only a limited amount of time. These expiration times again differ across platforms. For example, Facebook and TikTok DDPs are only available for four days once ready, while WhatsApp account information can be downloaded for 30 days once ready. If these expiration times are not considered properly in the participant flow, participants might fail to obtain their DDPs and have to request them again, which might cause them to drop out of the study.
Consider the device the participant uses during the participant flow. Depending on the platform, DDPs can only be requested and/or downloaded on specific devices (e.g. WhatsApp chat files can only be obtained on a smartphone, while the Netflix DDP can only be obtained on a computer). It should be clearly communicated to the participant which device(s) they are expected to use during the study. If participants are allowed to use multiple devices, instructions should be available for each device.
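The waiting and expiration times in the fourth recommendation can be turned into concrete dates for reminders. The sketch below is only an illustration: the `PlatformTiming` class, the `plan_reminders` function, and the reminder rule (one reminder on the expected ready date, one a day before the download window closes) are assumptions, not part of any platform's interface. The WhatsApp values are the ones mentioned above; timings for other platforms would need to be verified at the time of the study.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class PlatformTiming:
    """Waiting and availability windows for one platform's DDP."""
    name: str
    max_wait_days: int    # longest expected wait until the DDP is ready
    available_days: int   # days the DDP stays downloadable once ready


def plan_reminders(requested_on: date, timing: PlatformTiming) -> dict[str, date]:
    """Derive key dates for one participant from the date of their request."""
    ready_on = requested_on + timedelta(days=timing.max_wait_days)
    expires_on = ready_on + timedelta(days=timing.available_days)
    return {
        "ready_on": ready_on,
        # Hypothetical rule: remind as soon as the DDP should be ready...
        "download_reminder": ready_on,
        # ...and once more one day before the download window closes.
        "final_reminder": expires_on - timedelta(days=1),
        "expires_on": expires_on,
    }


# Values for WhatsApp account information as stated in the text above.
whatsapp = PlatformTiming("WhatsApp account information", max_wait_days=3, available_days=30)
plan = plan_reminders(date(2024, 3, 1), whatsapp)
```

For platforms with a short download window, the final reminder falls only a few days after the first one, so the reminder rule itself may need to differ per platform.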
Combining data donation with other approaches for data collection
Since data donation is a user-centric approach, it combines well with other methods for data collection, such as questionnaires, diaries, etc. Including other methods for data collection in your study can have multiple benefits:
- Other methods can help you to obtain data on the constructs that cannot be measured through data donation. For example, measuring opinions or psychological characteristics through data donation can be difficult, but you can measure these through validated questionnaires.
- By administering a questionnaire prior to the data donation procedure, you can obtain more contextual information on who is willing to participate in such studies, and who is not, or who is able to successfully complete the participant flow and who is not.
- By measuring the same construct using multiple methods for data collection, you have the opportunity to combine these multiple measurements in a measurement model. This can provide you with estimates of the quality of your data as well as improved measurements.
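As an illustration of the last point, one simple form such a measurement model could take is a single-factor model. This is only a sketch under stated assumptions: the symbols are introduced here for illustration, and the model that fits a given study may well be different. Suppose participant $i$ is measured with $J$ methods (for example a donated DDP and a questionnaire), and each observed measurement $y_{ij}$ reflects the same latent construct $\eta_i$ plus method-specific error $\varepsilon_{ij}$ that is uncorrelated with $\eta_i$:

```latex
\begin{align}
  y_{ij} &= \tau_j + \lambda_j \eta_i + \varepsilon_{ij}, \qquad j = 1, \dots, J, \\
  \rho_j &= \frac{\lambda_j^2 \operatorname{Var}(\eta)}
                 {\lambda_j^2 \operatorname{Var}(\eta) + \operatorname{Var}(\varepsilon_j)}.
\end{align}
```

Here $\tau_j$ is an intercept, $\lambda_j$ is the loading of method $j$ on the construct, and $\rho_j$ is the share of variance in method $j$ that is attributable to the construct, which can serve as an estimate of that measurement's quality. With only two methods, a model of this kind is not identified without additional constraints (for example equal loadings), so three or more measurements of the same construct are generally preferable.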