Protecting the Sensitive Information in My Data
Sensitive information is a broad term for data that would harm individual research subjects or the University if the data were lost or obtained by unauthorized individuals. This includes any data that is not already publicly available. For example, personally identifiable information (PII) from research on human subjects, protected health information (PHI) in the clinical setting, data subject to data use agreements (DUAs), and information subject to many other state and federal policies are all considered sensitive.
Research data is often subject to multiple contract requirements that flow down from funding agencies and other entities that issue contracts related to research data.
For example, organizations that the Health Insurance Portability and Accountability Act (HIPAA) treats as covered entities or business associates must protect individually identifiable health information (known as Protected Health Information or PHI) according to the specific requirements of the HIPAA Privacy Rule. Other health information about individuals may be covered under state regulations such as Illinois’s Biometric Information Privacy Act (BIPA). Some PII may be subject to both HIPAA and BIPA.
Additionally, data produced as a part of a federal contract (Federal Contract Information or FCI) is covered by a federal rule that defines requirements and procedures for working with FCI (FAR 52.204-21, "Basic Safeguarding of Covered Contractor Information Systems"). Other federal information is covered under Executive Order 13556 "Controlled Unclassified Information" and 32 CFR Part 2002 "Controlled Unclassified Information", which specify additional controls.
Storing Data Securely
Research data storage and analysis platforms have varying levels of security, and not every platform is suitable for every data type. The Northwestern University data classification policy outlines four levels for University-owned data based on the impact that loss or disclosure would have on the University. This policy also applies to research data. The Feinberg School of Medicine (FSM) data storage policy makes similar distinctions but uses different language to describe the levels.
Level | Impact Level | Examples |
---|---|---|
1 | None or low on the University and affiliates | Already publicly available data |
2 | Publicly noticeable impact on the University and affiliates | Unpublished research data |
3 | Serious or severe impact on the University – risk of civil or criminal penalties | Protected Health Information (PHI), Controlled Unclassified Information (CUI), data subject to some Data Use Agreements (DUAs) |
4 | Severe or catastrophic impact on the University or national security - inherent risk of significant fines or penalties, regulatory action, or civil or criminal violations | Government classified data, export-controlled data |
Before choosing where to store or analyze your data, determine which level your data falls into. The Choosing Appropriate Data Storage page discusses which storage services support each level.
Some Level 3 data have specific requirements that are not addressed in this guide. If you are working with legally or contractually restricted data, including personally identifiable information about human subjects, please email researchdata@northwestern.edu, and we will direct you to specific resources.
Data Security Requirements
Regulations and data use agreements (DUAs) may specify hardware and software components (technical controls) or rules and procedures that researchers should follow (administrative controls) to keep the data safe. The following sections discuss common technical and administrative controls.
Technical Controls
Data use agreements and regulations will spell out specific technical controls to protect your data. Here are some common examples:
Administrative Controls
Technical controls alone cannot protect your sensitive data from loss or exposure. The people working with the data must also be diligent about following good security practices. Some common administrative controls are:
- Limiting access to only people who need the data. For example, granting access to the one research group member analyzing the data rather than the whole research group.
- Only allowing access from certain computers. For example, only accessing the data from encrypted or University-managed computers. Although this cannot always be enforced technically, having a written policy is important.
- Regularly monitoring audit logs for unusual activity. For example, looking at the logs to make sure only the expected user names have accessed the data.
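The audit log check in the last item can be partly automated. The following is a minimal sketch, not a recommended tool: the log format (timestamp, user name, action, path separated by whitespace), the user names, and the function name `unexpected_access` are all assumptions made for illustration. Real audit logs differ by storage system, so the parsing step would need to be adapted.

```python
from collections import Counter

# Hypothetical allowlist: the user names expected to access the data.
EXPECTED_USERS = {"abc123", "def456"}


def unexpected_access(log_lines, expected_users):
    """Count accesses by user names that are not on the allowlist.

    Assumes each log line is whitespace-separated as:
        timestamp user action path
    This format is made up for the example.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        user = fields[1]
        if user not in expected_users:
            counts[user] += 1
    return dict(counts)


# Made-up sample entries in the assumed format.
sample_log = [
    "2024-01-05T09:12:00 abc123 READ /project/data.csv",
    "2024-01-05T09:40:00 xyz999 READ /project/data.csv",
    "",
    "2024-01-06T14:03:00 def456 WRITE /project/results.csv",
    "2024-01-06T23:55:00 xyz999 READ /project/data.csv",
]

print(unexpected_access(sample_log, EXPECTED_USERS))  # {'xyz999': 2}
```

A script like this only flags names for a person to review; it does not replace reading the logs or following up on what it finds.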
Deidentification
Many data use agreements and regulations allow you to use the data under less restrictive conditions if personally identifiable information is removed (deidentified). For example, the US Department of Health and Human Services offers guidance on deidentifying HIPAA-regulated data. Deidentification can be useful if you are dealing with large datasets that need significant computational resources to analyze, such as medical imaging data. Deidentification would allow you to analyze this data on Quest, on which medical information cannot be stored. For more information on deidentifying data, see External Research Data Management Resources.
Email researchdata@northwestern.edu if you are unsure about whether deidentifying your data is appropriate for your project.
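As a rough illustration of one deidentification step, the sketch below drops direct identifier columns from tabular records. The column names, the sample records, and the function name `drop_identifier_columns` are hypothetical. Removing columns is only a first step and does not by itself make a dataset deidentified under HIPAA or a data use agreement: dates, free-text fields, file metadata, and combinations of indirect identifiers can still identify a person.

```python
import csv
import io

# Hypothetical column names; the identifiers in a real dataset will differ.
DIRECT_IDENTIFIERS = {"name", "mrn", "email", "date_of_birth"}


def drop_identifier_columns(rows, identifiers):
    """Return copies of the rows without the listed identifier columns."""
    return [
        {key: value for key, value in row.items() if key not in identifiers}
        for row in rows
    ]


# Made-up records standing in for a real file.
sample_csv = """name,mrn,email,date_of_birth,scan_id,diagnosis_code
Jane Doe,000123,jane@example.org,1980-02-14,S-001,A10
John Roe,000456,john@example.org,1975-11-30,S-002,B20
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
deidentified = drop_identifier_columns(rows, DIRECT_IDENTIFIERS)

for row in deidentified:
    print(row)
```

The function returns new dictionaries rather than modifying the input, so the original records stay intact in the restricted environment while only the reduced copy is moved elsewhere.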
Data Security Plans
All of this information can be recorded in a data security plan, which can be evaluated by Northwestern’s Information Security Office or, for Feinberg researchers, by Feinberg School of Medicine IT. If you need help writing a data security plan for your research group, please contact researchdata@northwestern.edu to talk to a data management specialist, or see Feinberg School of Medicine’s Data Security Plan Policy for more information on writing your plan and having it reviewed.
See Feinberg IT’s information security policies for more information.