Protecting the Sensitive Information in My Data
Sensitive information is a broad term for data that would harm individual research subjects or the University if it were lost or obtained by unauthorized individuals. This includes any data that is not already publicly available. For example, personally identifiable information (PII) from research on human subjects, protected health information (PHI) in the clinical setting, data subject to data use agreements (DUAs), and information subject to many other state and federal policies are all considered sensitive.
Research data is often subject to multiple contract requirements that flow down from funding agencies and other entities that issue contracts related to research data.
For example, organizations considered covered entities or business associates under the Health Insurance Portability and Accountability Act (HIPAA) must protect individually identifiable health information (known as Protected Health Information or PHI) according to the specific requirements of the HIPAA Privacy Rule. Other health information about individuals may be covered under state regulations such as Illinois’s Biometric Information Privacy Act (BIPA). Some PII may be subject to both HIPAA and BIPA.
Additionally, data produced as part of a federal contract (Federal Contract Information or FCI) is covered by a federal rule that defines requirements and procedures for working with FCI (Federal Acquisition Regulation clause 52.204-21, "Basic Safeguarding of Covered Contractor Information Systems"). Other federal information is covered under Executive Order 13556 "Controlled Unclassified Information" and 32 CFR Part 2002 "Controlled Unclassified Information", which specify additional controls.
Storing Data Securely
Research data storage and analysis platforms have varying levels of security and are not all suitable for all data types. The Northwestern University data classification policy outlines four levels for University-owned data based on the impact that loss or disclosure would have on the University. This policy also applies to research data. The Feinberg School of Medicine (FSM) data storage policy makes similar distinctions but uses different language to describe the levels.
- Level 1: None or low impact on the University and affiliates. Examples: already publicly available data.
- Level 2: Publicly noticeable impact on the University and affiliates. Examples: unpublished research data.
- Level 3: Serious or severe impact on the University, with risk of civil or criminal penalties. Examples: Protected Health Information (PHI), Controlled Unclassified Information (CUI), data subject to some Data Use Agreements (DUAs).
- Level 4: Severe or catastrophic impact on the University or national security, with inherent risk of significant fines or penalties, regulatory action, or civil or criminal violations. Examples: government classified data, export-controlled data.
Before choosing where to store or analyze your data, determine which level your data falls under. The Choosing Appropriate Data Storage page discusses which storage services support each level.
Some Level 3 data have specific requirements that are not addressed in this guide. If you are working with legally or contractually restricted data, including personally identifiable information from human subjects, please email firstname.lastname@example.org, and we will direct you to specific resources.
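As a rough illustration of how a research group might record which level a project falls under, here is a minimal Python sketch. It assumes the four levels are numbered 1 to 4 in the order listed above, and that a project holding several kinds of data should be treated at the highest level among them. Both are assumptions made for this example, not statements of the policy, and the names `EXAMPLE_LEVELS` and `highest_level` are hypothetical.

```python
# Hypothetical mapping from the example data types named in this guide
# to a classification level (1 = lowest impact, 4 = highest impact).
EXAMPLE_LEVELS = {
    "publicly available data": 1,
    "unpublished research data": 2,
    "protected health information": 3,
    "controlled unclassified information": 3,
    "government classified data": 4,
    "export-controlled data": 4,
}


def highest_level(data_types):
    """Return the most restrictive level among the given data types.

    Assumption: a project is handled at the highest level of any data
    it contains. Unknown data types raise an error rather than being
    guessed at, so that they get reviewed by a person.
    """
    if not data_types:
        raise ValueError("no data types given")
    levels = []
    for data_type in data_types:
        key = data_type.strip().lower()
        if key not in EXAMPLE_LEVELS:
            raise KeyError(f"unknown data type, ask a specialist: {data_type!r}")
        levels.append(EXAMPLE_LEVELS[key])
    return max(levels)


project_level = highest_level(
    ["Unpublished research data", "Protected Health Information"]
)
print(f"Treat this project as Level {project_level}")
```

A lookup like this is only a starting point for a conversation; the policy text and a data management specialist remain the authority on how a specific dataset is classified.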
Data Security Requirements
Regulations and data use agreements (DUAs) may specify hardware and software components (technical controls) or rules and procedures that researchers should follow (administrative controls) to keep the data safe. The following sections discuss common technical and administrative controls.
Data use agreements and regulations will spell out specific technical controls to protect your data. Here are some common examples:
Controlled access means keeping the data in a place where only specific people can reach it, using a user name and password together with the other access controls described below. This can be as simple as storing your data on a computer that only you have the password for. However, many data providers and regulators recommend keeping your data on University-run systems like the Research Data Storage Service (RDSS) because they are actively maintained and monitored, and they have protections that are not feasible to implement on desktop and laptop computers.
Encryption is the process of converting your files into a “code” that is not readable unless it is decoded with the correct “key.” Data can be encrypted in transit or at rest.
- Encryption in transit is encoding the data while it moves through a computer network. Free Wi-Fi networks that are not password protected are generally not encrypted. Virtual Private Network (VPN) clients like GlobalProtect VPN encrypt your data in transit.
- Encryption at rest is encoding either the storage drive or the entire computer that the data are stored on.
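To make encryption in transit more concrete, the following Python sketch shows one way a script could insist on an encrypted, certificate-verified connection before sending data to a server. It uses only Python's standard `ssl` and `socket` modules. The function names and the choice of TLS 1.2 as the minimum version are assumptions for this example, not requirements stated in this guide.

```python
import socket
import ssl
from typing import Optional


def make_strict_context() -> ssl.SSLContext:
    """Build a TLS context that verifies the server's certificate.

    ssl.create_default_context() already turns on certificate and
    host name checking; the minimum version below is an assumption
    chosen for this sketch.
    """
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def negotiated_tls_version(host: str, port: int = 443,
                           timeout: float = 5.0) -> Optional[str]:
    """Connect to host and report the TLS version that was negotiated.

    The connection fails with an ssl.SSLError if the server cannot
    offer an acceptable encrypted channel.
    """
    ctx = make_strict_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

Calling `negotiated_tls_version` with the host name of a service you use would report the negotiated protocol version if the connection succeeds. This checks only the link between your computer and that server; it says nothing about whether the data are encrypted at rest once they arrive.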
Multi-factor Authentication (MFA) involves combining something you know (usually your user name and/or password) with something you have (like a smartphone, token, or key). Appropriate use and implementation of MFA helps prevent unauthorized use of your account by requiring direct physical approval before you are authenticated to a service provider.
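One common form of the "something you have" factor is a six-digit code generated by an app or hardware token. The sketch below implements the time-based one-time password (TOTP) algorithm from RFC 6238 with Python's standard library, to show why the code can only be produced by a device that holds the shared secret. This guide does not say which MFA method any given service uses, so treat TOTP here as an illustrative assumption. The secret shown is the published RFC test value, not a real credential.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 time-based one-time password (HMAC-SHA1)."""
    # The counter is the number of whole time steps since the Unix epoch.
    counter = unix_time // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low 4 bits of the last byte
    # select a 4-byte window, and the top bit of that window is masked off.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def verify(secret: bytes, submitted: str, unix_time: int) -> bool:
    """Check a submitted code against the current time step only."""
    expected = totp(secret, unix_time, digits=len(submitted))
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(expected, submitted)


# Test secret published in RFC 6238, Appendix B.
RFC_SECRET = b"12345678901234567890"
print(totp(RFC_SECRET, 59, digits=8))  # RFC 6238 lists 94287082 for this input
```

Because the code depends on both the secret and the current time, a stolen password alone is not enough to sign in, and an intercepted code stops working after the time step ends.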
Firewalls limit access to or from connected devices, ensuring that only trusted or known systems or networks can access certain devices or storage. Firewalls can also prevent certain types of cyberattacks by detecting and mitigating distributed denial of service (DDoS) attacks or limiting access to external sites that are known to host malware.
If you are working from off campus, GlobalProtect VPN connects your computer to a virtual private network that is permitted through Northwestern’s firewalls. This gives you access to some of Northwestern’s data storage platforms that are not reachable from the internet at large, and keeping those platforms off the open internet reduces random attacks on them.
Audit logs are computer-generated records of who accessed the data, when, and what they did with it. Collecting these logs allows for early detection of a data breach if they are monitored regularly (see administrative controls). Access to audit logs as well as administrative access to all Northwestern information is governed by Northwestern’s Appropriate Use of Electronic Resources Policy.
If your data use agreement or regulation requires specific technical controls, email email@example.com to discuss with a data management specialist which platforms comply with these requirements.
Technical controls alone cannot protect your sensitive data from loss or exposure. The people working with the data must also be diligent about following good security practices. Some common administrative controls are:
- Limiting access to only people who need the data. For example, granting access to the one research group member analyzing the data rather than the whole research group.
- Only allowing access from certain computers. For example, only accessing the data from encrypted or University-managed computers. Although this may not be technically enforceable, having a written policy is important.
- Regularly monitoring audit logs for unusual activity. For example, looking at the logs to make sure only the expected user names have accessed the data.
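The log review described in the last item can be partly automated. The Python sketch below flags log entries whose user name is not on an expected list. The log layout (a CSV file with `timestamp`, `user`, `action`, and `path` columns), the sample entries, and the user names are all made up for this example; real audit logs will look different depending on the platform.

```python
import csv
import io

# Made-up audit log in an assumed CSV layout.
SAMPLE_LOG = """timestamp,user,action,path
2024-01-05T09:12:00,abc1234,read,/project/survey.csv
2024-01-05T09:40:00,xyz9876,read,/project/survey.csv
2024-01-06T02:03:00,abc1234,delete,/project/old.csv
"""


def unexpected_users(log_text, expected_users):
    """Return the log rows whose user is not in expected_users."""
    reader = csv.DictReader(io.StringIO(log_text))
    return [row for row in reader if row["user"] not in expected_users]


# Only abc1234 is supposed to have access in this example.
flagged = unexpected_users(SAMPLE_LOG, {"abc1234"})
for row in flagged:
    print(row["timestamp"], row["user"], row["action"], row["path"])
```

A check like this catches only one kind of unusual activity. Expected users doing unexpected things, such as the overnight delete in the sample log, still need a person to look at the log.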
Many data use agreements and regulations allow you to use the data in less-restrictive conditions if personally identifiable information is removed (deidentified). For example, the US Department of Health and Human Services offers guidance on deidentifying HIPAA-regulated data. Deidentification can be useful if you are dealing with large datasets that need significant computational resources to analyze, such as medical imaging data. Deidentification would allow you to analyze this data on Quest, on which medical information cannot be stored. For more information on deidentifying data, see External Research Data Management Resources.
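As a small illustration of the mechanics, the Python sketch below removes direct identifier fields from a list of records and replaces the original subject identifier with a random study code. The field names (`mrn`, `name`, `email`, `phone`) and the function name are hypothetical. Dropping a few columns does not by itself make a dataset deidentified under HIPAA or a data use agreement; dates, locations, free text, and images can all carry identifying information, so this is only a starting point.

```python
import secrets

# Hypothetical names of columns that directly identify a person.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "mrn"}


def deidentify(records, id_field="mrn"):
    """Strip direct identifiers and assign a random study code per subject.

    Returns (cleaned_records, key_map). key_map links the original IDs
    to study codes, so it is itself sensitive and must be stored
    separately from the cleaned data, or destroyed.
    """
    key_map = {}
    cleaned = []
    for record in records:
        original_id = record[id_field]
        if original_id not in key_map:
            # Random codes, so nothing about the original ID leaks through.
            key_map[original_id] = secrets.token_hex(8)
        row = {
            field: value
            for field, value in record.items()
            if field not in DIRECT_IDENTIFIERS and field != id_field
        }
        row["study_id"] = key_map[original_id]
        cleaned.append(row)
    return cleaned, key_map


# Made-up records for illustration only.
records = [
    {"mrn": "A1", "name": "Pat Example", "email": "pat@example.org", "age": 40},
    {"mrn": "A1", "name": "Pat Example", "email": "pat@example.org", "age": 41},
    {"mrn": "B2", "name": "Sam Example", "email": "sam@example.org", "age": 35},
]
cleaned, key_map = deidentify(records)
```

Using the same study code for repeat visits keeps the records linkable for analysis without carrying the original identifier along.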
Email firstname.lastname@example.org if you are unsure about whether deidentifying your data is appropriate for your project.
Data Security Plans
All of this information can be recorded in a data security plan that can be evaluated by Northwestern’s Information Security Office or Feinberg School of Medicine IT for Feinberg researchers. If you need help writing a data security plan for your research group, please contact email@example.com to talk to a data management specialist or see Feinberg School of Medicine’s Data Security Plan Policy for more information on writing and having your plan reviewed.
See Feinberg IT’s information security policies for more information.