Last Updated: 03/29/16

Collecting and Managing Clinical Data

Appropriate annotation of biospecimens is crucial to the overall usefulness of the biospecimen resource for scientific research [63].

Biospecimen resources store collected biospecimens using multiple methodologies and procedures. Researchers rely on banked biospecimens for a wide variety of purposes, using different platforms and technologies. The data recorded by investigators and biospecimen resources depend on the types of biospecimens collected and the studies’ objectives.

B.5.1. Regulatory Compliance

Data collection activities should conform to U.S. Food and Drug Administration (FDA) requirements [64] (21 CFR Part 11), if and where applicable, so that the data may be cited and/or used in Investigational New Drug and Investigational Device Exemption applications.

B.5.2. Collecting Clinical Data

B.5.2.1. Privacy regulations

As appropriate for the purpose and nature of the biospecimen resource, relevant clinical data associated with a biospecimen should be collected in accordance with applicable privacy statutes and regulations, and with human subject protection regulations governing the acquisition of biospecimens and associated clinical data (see Sections C.2, Informed Consent, and C.3, Privacy and Confidentiality Protections, for additional information and references). Clinical data associated with the biospecimens should be used and disclosed only for research and development in compliance, as applicable, with HIPAA and HITECH, with U.S. Department of Health and Human Services (HHS) and FDA human subject protection regulations, and with applicable State and local laws.

B.5.2.2. Collection requirement updates

When a biospecimen resource is the point of access for biospecimens and associated clinical data, it should track researchers’ requests for biospecimens with specific clinical data and, as appropriate for its intended purpose, use those requests to guide the refinement of clinical data collection. The resource should routinely summarize this information and provide it to the entity that maintains and/or collects the clinical data so that collection can be improved.
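As an illustration of the routine summary described above, the following minimal sketch tallies how often each clinical data element is requested and flags elements that are requested but not currently collected. The request log layout, the element names, and the function `summarize_requests` are all hypothetical; an actual resource would draw this information from its own tracking system.

```python
from collections import Counter


def summarize_requests(requests, collected_elements):
    """Count requests per clinical data element and flag uncollected ones."""
    counts = Counter()
    for request in requests:
        # Count each element once per request, even if it is listed twice.
        counts.update(set(request["requested_elements"]))
    uncollected = {
        name: n for name, n in counts.items() if name not in collected_elements
    }
    return {"request_counts": dict(counts), "uncollected": uncollected}


# Hypothetical request log and list of currently collected elements.
requests = [
    {"request_id": "R-001", "requested_elements": ["tumor_stage", "smoking_status"]},
    {"request_id": "R-002", "requested_elements": ["tumor_stage", "treatment_regimen"]},
    {"request_id": "R-003", "requested_elements": ["smoking_status", "tumor_stage"]},
]
collected = {"tumor_stage", "treatment_regimen"}

summary = summarize_requests(requests, collected)
print(summary["uncollected"])  # {'smoking_status': 2}
```

A summary of this kind could be passed periodically to the entity that collects the clinical data, which can then decide whether frequently requested but uncollected elements are feasible to add.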

B.5.3. Longitudinal Clinical Data

B.5.3.1. Data types

If the study design and objectives require, biospecimen resources should collect and store longitudinal data following applicable informed consent and authorization requirements. Based on these requirements, information linked to biospecimens may include demographic data, lifestyle factors, environmental and occupational exposures, cancer history, structured pathology data, additional diagnostic studies, information on the initial staging procedure, treatment data, and any other data relevant to tracking a research participant’s clinical outcome (see the Minimal Clinical Data Set in Appendix 1 for a recommended set of Common Data Elements (CDEs) that may be included). Different biospecimen resources may require more or less detailed annotation based on the primary intended use of the biospecimens. The dataset for clinical annotation should be based on the needs of the biospecimen resource users, as well as overall feasibility, particularly for biospecimens collected from clinical trials.

B.5.3.2. Database and data access

Databases developed for longitudinal studies should use coded data associated with a biospecimen but should maintain a secure link to the research participant’s identity so that additional longitudinal data can be obtained, if permitted by law and by the research participant’s consent/authorization. Policies and protocols should be in place to facilitate access to uniform longitudinal data (e.g., treatment and outcome information, as appropriate) while protecting research participants’ privacy and confidentiality.
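One way to picture the separation between coded data and the secure link is sketched below. This is an illustrative sketch only, not a prescribed design: the class `LinkRegistry`, the identifiers, and the `consent_permits_followup` flag are hypothetical, and a real system would keep the link table in a separate, access-controlled store and apply the resource's own consent and authorization checks.

```python
import secrets


class LinkRegistry:
    """Hypothetical key table linking random codes to participant identifiers.

    Assumption: in practice this table would live in a separate,
    access-controlled store, apart from the research database.
    """

    def __init__(self):
        self._code_to_participant = {}
        self._participant_to_code = {}

    def code_for(self, participant_id):
        """Return the participant's code, creating one on first use."""
        if participant_id not in self._participant_to_code:
            code = secrets.token_hex(8)  # random, not derived from the identifier
            while code in self._code_to_participant:  # regenerate on a collision
                code = secrets.token_hex(8)
            self._participant_to_code[participant_id] = code
            self._code_to_participant[code] = participant_id
        return self._participant_to_code[participant_id]

    def reidentify(self, code, consent_permits_followup):
        """Resolve a code to the participant only when consent allows it."""
        if not consent_permits_followup:
            raise PermissionError("consent/authorization does not permit follow-up")
        return self._code_to_participant[code]


registry = LinkRegistry()
research_db = []  # holds coded records only, never the direct identifier

code = registry.code_for("MRN-0001")  # hypothetical identifier
research_db.append({"participant_code": code, "biospecimen_id": "BSP-42"})

# Later: obtain additional longitudinal data for the same participant.
participant = registry.reidentify(code, consent_permits_followup=True)
research_db.append(
    {"participant_code": registry.code_for(participant), "event_type": "follow_up"}
)
```

Because the code is random rather than derived from the identifier, records in the research database cannot be traced back to a participant without access to the link table.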


To collect high-quality longitudinal information, biospecimen resources should ensure that dedicated, trained personnel curate longitudinal clinical data and that the collection process is validated and subject to quality assurance/quality control (QA/QC).
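Automated checks can support, though not replace, curation by trained personnel. The sketch below shows the kind of basic QC pass a resource might run over a participant's longitudinal entries: it flags missing required fields, dates later than a reference date, and duplicate entries. The field names, the `REQUIRED_FIELDS` tuple, and the function `find_problems` are assumptions made for illustration and do not come from any standard.

```python
from datetime import date

# Hypothetical required fields for one longitudinal entry.
REQUIRED_FIELDS = ("participant_code", "event_type", "event_date")


def find_problems(events, as_of):
    """Return (index, message) pairs for entries that fail basic QC checks."""
    problems = []
    seen = set()
    for index, event in enumerate(events):
        missing = [name for name in REQUIRED_FIELDS if not event.get(name)]
        if missing:
            problems.append((index, "missing: " + ", ".join(missing)))
            continue
        if event["event_date"] > as_of:
            problems.append((index, "date is in the future"))
        key = (event["participant_code"], event["event_type"], event["event_date"])
        if key in seen:
            problems.append((index, "duplicate entry"))
        seen.add(key)
    return problems


# Hypothetical entries for one coded participant.
events = [
    {"participant_code": "a1", "event_type": "diagnosis", "event_date": date(2015, 3, 2)},
    {"participant_code": "a1", "event_type": "treatment_start", "event_date": date(2015, 4, 10)},
    {"participant_code": "a1", "event_type": "treatment_start", "event_date": date(2015, 4, 10)},
    {"participant_code": "a1", "event_type": "follow_up", "event_date": None},
    {"participant_code": "a1", "event_type": "follow_up", "event_date": date(2017, 1, 1)},
]

for index, message in find_problems(events, as_of=date(2016, 3, 29)):
    print(index, message)
```

Entries flagged this way would be routed to a curator for review rather than corrected automatically.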