Choosing Appropriate Data Storage

The Feinberg School of Medicine (FSM) data storage policy uses different terms than this page to describe research data categories and makes specific recommendations about where Feinberg researchers should store their data. Contact fsmhelp@northwestern.edu with questions about this policy and data storage in general.

Many factors go into choosing appropriate storage for your research data. The best location depends on what you will do with the data.

Consider the following questions when choosing a location to store your research data.

What Policies and Regulations Apply to My Data?

Not all data storage services meet the minimum requirements for all research data. All Northwestern research data is subject to Northwestern University’s Research Data policy and Data Classification policy, which describes categories of data (levels 1 to 4) referenced in the table below. Other policies and regulations may apply.

Northwestern University Data Classification Policy Categories

| Service | Good For | Level 1 Data | Level 2 Data* | Level 3 Data* | Level 4 Data |
| --- | --- | --- | --- | --- | --- |
| OneDrive | Storing working data that only you need to access | Yes | Maybe | Maybe | No |
| SharePoint | Storing working data shared with a team | Yes | Maybe | Maybe | No |
| RDSS: non-audited zone (resfiles) | Storing working data shared with a team, especially data with large individual file sizes | Yes | Yes | No | No |
| RDSS: audited zone (resfilesaudit) | Storing working data shared with a team, especially data with large individual file sizes | Yes | Yes | Maybe | No |
| FSMResFiles (for Feinberg School of Medicine) | Storing working data | Yes | Yes | Maybe | No |
| Quest Storage | Storing data being actively analyzed on Quest | Yes | Maybe | Maybe | No |
| Public Cloud Storage | Storing working data and archiving research data | Yes | Maybe** | Maybe** | No |
| Computers managed by IT staff at Northwestern | Storing working data that only you need access to and is backed up on another server | Yes | Maybe | Maybe | No |
| Your personal computer or accounts for services not licensed by Northwestern University | Storing Northwestern research data on personal computers or accounts is not recommended | Yes | Maybe | Maybe | No |

* Refers only to the technical controls required to store this data type. Data at Level 3 and above also require additional policies and procedures to fully comply with the contractual or legal requirements that apply to them.

** Cloud services can usually be configured to satisfy most data storage requirements.

Note: Using personal accounts on unsupported storage services like Dropbox is not allowed for storing Northwestern research data. Google Drive accounts, including those provided by Northwestern (e.g., u.northwestern.edu), are also not permitted for data at Level 2 and above. Please contact researchdata@northwestern.edu with questions and for advice about alternatives.
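As an illustration only, the classification table can be treated as a lookup for shortlisting services by data level. The sketch below transcribes several rows of the table; the names `SUITABILITY` and `services_for_level` are hypothetical, and a "maybe" result still requires the review described in the footnotes. It covers technical suitability only and is not a compliance decision.

```python
# Technical suitability by data classification level (Levels 1 to 4),
# transcribed from several rows of the table above.
# "maybe" mirrors a "Maybe" cell: extra configuration or review is needed.
SUITABILITY = {
    "SharePoint": ("yes", "maybe", "maybe", "no"),
    "RDSS: non-audited zone (resfiles)": ("yes", "yes", "no", "no"),
    "RDSS: audited zone (resfilesaudit)": ("yes", "yes", "maybe", "no"),
    "FSMResFiles": ("yes", "yes", "maybe", "no"),
    "Quest Storage": ("yes", "maybe", "maybe", "no"),
    "Public Cloud Storage": ("yes", "maybe", "maybe", "no"),
}


def services_for_level(level, include_maybe=False):
    """Return services whose table cell for `level` is "yes" (or "maybe" if requested)."""
    if level not in (1, 2, 3, 4):
        raise ValueError("level must be 1, 2, 3, or 4")
    accepted = {"yes", "maybe"} if include_maybe else {"yes"}
    return [
        name
        for name, ratings in SUITABILITY.items()
        if ratings[level - 1] in accepted
    ]


if __name__ == "__main__":
    print(services_for_level(2))
    print(services_for_level(3, include_maybe=True))
```

Any shortlist produced this way is a starting point for a conversation with researchdata@northwestern.edu, not a substitute for it.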

Storage Services

Each service will have its own policies about what it can be used for. For example, Quest users must comply with the Quest Storage and Data policy. Please ensure you can follow the policies laid out by the platform.

Schools and Colleges

Northwestern schools and colleges may have rules about where research data should be stored. For example, Feinberg School of Medicine researchers must comply with the Feinberg School of Medicine Data Storage policy. Be sure to follow the regulations set by your school or college.

Data Use Agreements and Other Contracts

If you signed a data use agreement (DUA) with another organization or institution, verify the storage platform you choose complies with the data provider’s requirements for data storage and security. All DUAs must go through the Office for Research for approval and signing.

For help assessing storage options in the context of your DUA, email researchdata@northwestern.edu, or fsmhelp@northwestern.edu if you are in the Feinberg School of Medicine.

State and Federal Laws and Regulations

Certain types of data are governed by state and federal laws. For more information, read Protecting the Sensitive Information in My Data.

If you need help determining what policies and regulations apply to your data, email researchdata@northwestern.edu.

How Much Storage Do You Need? How Much Will It Cost?

Another factor to consider when evaluating storage options is the amount and type of data files that will be generated or used as part of your project. Tips for considering capacity when choosing data storage include:

  • Estimate on the high end when planning. If you have a mix of very large and very small files, basing the estimate on the very large files is usually sufficient.
  • Use multiple storage services to their strengths. For example, SharePoint is excellent for collaborating with your research group, but it cannot accommodate files larger than 250 GB. Store collaborative documents like manuscripts and notes in SharePoint, and find another platform to store your files larger than 250 GB.
  • The location of your data will change through the course of your project. For example, your data could be produced in a core facility and stored on RDSS, then moved to Quest for analysis. Results can be integrated into a manuscript in SharePoint.
  • Archival storage is often less expensive. Think about putting infrequently accessed data in archival storage, such as Amazon S3 Glacier Deep Archive.
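The capacity tips above can be sketched as a small script that totals an existing data directory and flags files above the 250 GB SharePoint file size limit. This is a minimal sketch under stated assumptions: the function names are hypothetical, the limit is treated as decimal gigabytes, and the 1.5 headroom factor is an arbitrary placeholder for "estimate on the high end".

```python
from pathlib import Path

# 250 GB individual file size limit, assuming decimal gigabytes.
SHAREPOINT_FILE_LIMIT_BYTES = 250 * 1000**3


def summarize_files(root, file_limit_bytes=SHAREPOINT_FILE_LIMIT_BYTES):
    """Return (total size in bytes, list of files larger than the limit) under root."""
    total_bytes = 0
    oversized = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            size = path.stat().st_size
            total_bytes += size
            if size > file_limit_bytes:
                oversized.append(path)
    return total_bytes, oversized


def planning_estimate_tb(total_bytes, headroom=1.5):
    """Scale a measured size by a headroom factor and express it in decimal TB."""
    return total_bytes * headroom / 1000**4


if __name__ == "__main__":
    total, too_big = summarize_files(".")
    print(f"Measured: {total / 1000**4:.3f} TB")
    print(f"Planning estimate: {planning_estimate_tb(total):.3f} TB")
    for path in too_big:
        print(f"Exceeds the per-file limit: {path}")
```

Files that appear in the oversized list are candidates for a platform other than SharePoint or OneDrive.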

Any costs associated with storing your data can be included in your grant budgets as data management costs.

Cost and Capacity of Northwestern Storage Services

| Service | Cost | Capacity |
| --- | --- | --- |
| OneDrive | No additional charge* | 5 TB total max per user; 250 GB individual file size limit |
| SharePoint | No additional charge* | 25 TB max per library; 250 GB individual file size limit |
| RDSS | $100/TB/Year | Minimum of 1 TB purchase |
| FSMResFiles (for Feinberg School of Medicine) | No additional charge | Determined by FSM IT** |
| Quest Storage (for data related to active computing, processing, and analysis on Quest HPC only) | 1 to 2 TB at no additional charge; Buy in: $195/TB for five years | Home - 80 GB; Projects - 1 to 2 TB for general; Scratch - 5 TB and 5,000,000 files |
| Public Cloud Storage | Monthly payment reflecting previous month's use | Pay for storage used per month |

* Microsoft is currently reviewing its policy on charging for storage. We are tracking any changes here that may impact the community.

** Feinberg School of Medicine Only. Contact fsmhelp@northwestern.edu with questions about your quota.
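For grant budgeting, the listed rates can be turned into rough arithmetic. The sketch below uses the RDSS rate ($100/TB/Year, 1 TB minimum) and the Quest buy-in rate ($195/TB for five years) from the table. It assumes, without confirmation from the source, that storage is billed in whole terabytes, that a Quest buy-in is purchased per full five-year term, and that the no-charge Quest allocation is ignored. The function names are hypothetical.

```python
import math

RDSS_DOLLARS_PER_TB_YEAR = 100      # from the table above
RDSS_MINIMUM_TB = 1                 # minimum purchase from the table above
QUEST_BUY_IN_DOLLARS_PER_TB = 195   # per five-year term, from the table above
QUEST_TERM_YEARS = 5


def rdss_cost(tb_needed, years):
    """Estimated RDSS cost in dollars, assuming billing in whole terabytes."""
    billed_tb = max(RDSS_MINIMUM_TB, math.ceil(tb_needed))
    return billed_tb * RDSS_DOLLARS_PER_TB_YEAR * years


def quest_buy_in_cost(tb_needed, years):
    """Estimated Quest buy-in cost, assuming whole terabytes and whole five-year terms."""
    billed_tb = math.ceil(tb_needed)
    terms = math.ceil(years / QUEST_TERM_YEARS)
    return billed_tb * QUEST_BUY_IN_DOLLARS_PER_TB * terms


if __name__ == "__main__":
    print(rdss_cost(2.5, 5))          # 3 TB billed for 5 years
    print(quest_buy_in_cost(3, 5))    # 3 TB for one five-year term
```

The two figures are not interchangeable options: Quest storage is only for data related to active computing on Quest, so the comparison is useful for budgeting each stage of a workflow rather than for picking the cheaper service.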

Is the Data Backed Up?

Backing up your data prevents data loss due to hardware failure, disasters, file corruption, and human error. While many data storage services do this automatically, understanding what you are protected against when using a storage service is important.

Different types of “backup” strategies protect your data from different risks.

Syncing

Syncing involves continuously moving changes from one system to another so that both have the same file versions. For example, OneDrive and SharePoint let you specify which files and folders to synchronize between your computer and the cloud, which protects your data from hardware failure on your computer or theft. However, if a file is corrupted on your computer, the corruption will replicate to the cloud.

Versioning

Versioning involves keeping a record of changes made to your files and being able to go back to older versions. There are different types of versioning. For example, OneDrive and SharePoint create a new version whenever changes are made. In contrast, RDSS takes a snapshot of the entire file system once a day. These snapshots are kept for 28 days. Versioning alone will not help if the storage service itself goes down.

Replication

Replication involves creating a completely separate copy of your data on another server, typically in a distinct geographic location. Replication is critical for disaster recovery or bringing back an entire system from scratch if it goes down due to natural disaster or cyberattack. RDSS combines versioning with replication by copying its snapshots between the Evanston campus and the Chicago campus.

Data Protection Features on Northwestern University Storage Services

Replication of data on another server
| Service | Replication | Versioning |
| --- | --- | --- |
| OneDrive and SharePoint | Managed by Microsoft | Version created every time a file is saved |
| RDSS and FSMResFiles (for Feinberg School of Medicine) | Copies in geographically distinct locations (Chicago and Evanston) | Daily snapshots kept for 28 days |
| Quest Storage | Home directories are copied to an off-campus tape archive | Daily snapshots kept for 28 days |
| Public Cloud Storage | Configurable; storage cost for each copy | Configurable; pay for each version stored; usually includes file integrity checks |
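Because syncing can carry a corrupted file to every copy, it can help to record a checksum while a file is known to be good and compare it later or against a copy. The following is a minimal sketch of such a file integrity check using SHA-256 from the Python standard library. It is an illustration, not a feature of any service listed above, and the function names are hypothetical.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original_path, copy_path):
    """Return True when both files have identical SHA-256 digests."""
    return file_sha256(original_path) == file_sha256(copy_path)
```

A digest saved alongside the data (for example, in a text file) gives you something to compare against before restoring from a snapshot or replica.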

Who Needs Access to Your Data?

Storage services differ in how you can share your data with collaborators or the public.

Each system has different default permissions for new files and folders.

  • Locations that only you have access to by default are great for storing files that no one else should see. Keep in mind that files associated with an individual user account will disappear if that user leaves Northwestern.
  • Locations that grant access to a specific group of people by default are great for collaboration and are not tied to a single user account.

These services may offer a way to share outside this default access group. The following table outlines who has access by default on each service and how to share outside that group.

Default access and sharing by Northwestern University storage service
| Service | Default Access Group | How to Share Outside the Default Access Group* |
| --- | --- | --- |
| OneDrive | You | Anyone with a Microsoft account; no anonymous link sharing; recommended only for limited sharing because permissions are complex |
| SharePoint | Site members and owners | Anyone with a Microsoft account; anonymous access (note that some security rules may restrict enabling anonymous access) |
| RDSS | Authorized users by NetIDs, including affiliate NetIDs for external collaborators | None; access is controlled at the share level |
| FSMResFiles (for Feinberg School of Medicine)** | Managed by FSM IT** | Managed by FSM IT** |
| Quest Storage | Home and scratch: you; Projects: Quest users who are part of the allocation | Home: Cannot be shared with other Quest users; Scratch: Other Quest users; Projects: Other Quest users |
| Public Cloud Storage | Manually configured | Manually configured |

* Sharing here is defined as granting access to people outside the users who have access by default.

** Feinberg School of Medicine Only

Is My Data Accessible to Compute Sources?

Much like real estate, data storage is all about location. Can I access my data from where I need to analyze it? Common compute sources include Quest, your computer, or virtual machines run by Northwestern or public cloud providers.

Northwestern-run storage services can be directly mounted or synchronized to compute sources. Others require you to transfer your data to the compute source for use. To support transfers among storage services, Northwestern University subscribes to Globus, a tool for large data transfers. Review the following table to see which storage services are accessible to Globus, and see our documentation on Globus for details.

Access and data transfer methods for storage services
| Storage Service | Access Method | Data Transfer Options |
| --- | --- | --- |
| OneDrive; SharePoint | Web interface; Sync specified files and folders to your computer or VM | Globus data transfer tool |
| RDSS; FSMResFiles (for Feinberg School of Medicine) | Mount to your computer or VM as a network drive | Globus data transfer tool* |
| Quest Storage | Log in to Quest via ssh; Quest Analytics Nodes | Globus data transfer tool (preferred); FTP clients; Command line tools (sftp, scp, rsync) |
| Public Cloud Storage | Mount storage as a drive; Command line interface; Web interfaces | Command line interface; Globus data transfer tool |

*Data on RDSS can currently only be transferred to Quest. The resfilesaudit zone is not accessible to Globus.

Making Your Decisions

Choosing where to store your data so that it is safe, compliant, and easy to work with is harder than it seems. If you need help deciding where to store your data, email researchdata@northwestern.edu, and our data management consultants will set a time to talk about your workflow and discuss options.