Research Knowledge Base
De-Identified Publicly Available Datasets Guidance
Version Date: August 27, 2015
Research projects that involve only the analysis of publicly available secondary data that either never contained identifiers or have been stripped of them do not require prior IRB approval, because such use does not constitute research involving human subjects under the federal Common Rule (45 CFR Part 46) and UW-Madison policies.
However, research projects that merge more than one dataset in such a way that individuals may be identified DO require prior IRB approval.
Research projects involving analysis of secondary data will NOT require prior IRB approval in the following situations:
- The data set(s) are published and publicly available without restriction (e.g., the data are published by a reputable source in a publicly available journal, textbook, or website), and neither the UW researcher nor any collaborating researcher on the project has access to links that would connect the data to the individuals from whom they were derived.
- The data set(s) are publicly available to researchers and others, but the data holder requires a “responsible use statement” or similar attestation to ensure appropriate use and protection of the data. Such an agreement or attestation may be automated. In this case, neither the UW researcher nor any collaborating researcher on the project can have access to any links that would connect the data to the individuals from whom they were derived, nor may any researcher on the project attempt to re-identify any person from whom the data were derived.
- The researcher will obtain a data set available from a Federal or State agency and will enter into an agreement with the data provider that includes language stating that a) the data provided to the researcher do not contain any identifiers, including those specified under the HIPAA Privacy Rule; b) if the data are coded, the data provider will not release a link to the code to the researcher; and c) the researcher receiving the data set must agree not to attempt to re-identify any person from whom the data were derived.
For examples of the types of secondary data sets that do not require prior IRB approval, see the List of Approved De-identified Publicly Available Datasets. Any questions about whether a data set meets these requirements should be referred to the IRB Office that would be expected to review the use of the data if the project qualified as research involving human subjects.