Data Retention and Protection

The following policies pertain to specific systems managed by MSI. The specific elements of this policy and MSI's data policies in general are consistent with the University policy on data management and are therefore applicable to the transfer and storage of data on MSI resources.    

Data Protection

Primary Storage (Tier 1)

MSI takes a number of precautions to prevent the loss of data stored on MSI's Primary Storage systems. These precautions protect against the majority of data storage system failures; however, they may not protect your data from a catastrophic failure of the file system or from damage to the MSI data center in Walter Library. While catastrophic events are unlikely, users are nonetheless encouraged to back up and archive data that are difficult or impossible to regenerate.

Primary Storage Snapshots & Exclusions (Tier-1 Snapshots)

Nightly copies called “snapshots” are made of user home directories (including subdirectories). Snapshots allow MSI and end users to recover lost, modified, or damaged files for up to one month after the calendar day on which a snapshot was taken.
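As a rough illustration of the retention window described above, the following Python sketch checks whether a snapshot taken on a given day would still fall inside the window. The 28-day figure is an assumption taken from the "4 Calendar Weeks" entry in the service table; the function name and constant are hypothetical, and the exact cutoff MSI applies may differ.

```python
from datetime import date, timedelta

# Assumption: the retention window is 4 calendar weeks (28 days).
# The precise cutoff used by MSI is not stated in this policy.
ASSUMED_SNAPSHOT_RETENTION = timedelta(weeks=4)


def snapshot_may_still_exist(snapshot_day: date, today: date) -> bool:
    """Return True if a snapshot from `snapshot_day` is within the assumed window.

    Hypothetical helper for illustration only; it does not query any MSI system.
    """
    age = today - snapshot_day
    # A snapshot dated in the future cannot exist, so negative ages are rejected.
    return timedelta(0) <= age <= ASSUMED_SNAPSHOT_RETENTION
```

A check like this is only a planning aid: files that matter should be copied elsewhere well before the window closes.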

Exclusions

  • MSI's “scratch” filesystem does not have snapshotting or any other form of backup.
  • By extension, links to local or global scratch are not included in any snapshot.

Primary Storage Tape Backups & Exclusions (Tier-1 Tape Backups)

Periodic “tape” backups, for use in disaster recovery, are made of user home directories (including subdirectories) and are stored at a secondary U of M Twin Cities campus data center. These are for MSI staff and administrators to recover data in the rare event that MSI’s Primary Storage (Tier 1) data suffers an event that renders snapshots non-viable.

Tape backups by MSI are not scoped for individual users. Data recovered from tape may not reflect the desired point in time and may not be an ideal or fully complete copy. Therefore, users should take precautions to back up, archive, or in some other way maintain secondary copies of data that are difficult or impossible to regenerate.

Exclusions

Note that due to limitations of tape backup platforms, some common non-data directories are globally excluded from the Tier-1 tape backups. The complete list is as follows:

  • ~/.cache
  • ~/.conda
  • ~/.local
  • ~/.mozilla
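To see whether a given file would be skipped by the tape backups, the exclusion list above can be expressed as a small check. This is a minimal Python sketch, not an MSI tool: the function name is hypothetical, the home directory is passed in explicitly, and the check compares path strings only (it does not resolve symlinks or read the filesystem).

```python
from pathlib import PurePosixPath

# Directories the policy lists as excluded from Tier-1 tape backups,
# written relative to the user's home directory.
TAPE_BACKUP_EXCLUSIONS = (".cache", ".conda", ".local", ".mozilla")


def is_excluded_from_tape_backup(path: str, home: str) -> bool:
    """Return True if `path` lies inside an excluded directory under `home`.

    Hypothetical helper for illustration only.
    """
    try:
        relative = PurePosixPath(path).relative_to(PurePosixPath(home))
    except ValueError:
        # The path is not under the home directory, so the list does not apply.
        return False
    parts = relative.parts
    # Only the first component below the home directory decides exclusion.
    return bool(parts) and parts[0] in TAPE_BACKUP_EXCLUSIONS
```

For example, a conda environment stored under ~/.conda would be reported as excluded, which is a reason to keep environment definition files somewhere that is backed up.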

No Protections: Secondary Storage (Tier 2)

Currently, MSI’s Secondary Storage (Tier 2) system (Ceph) is not protected by snapshots or other backups. Users should therefore make backups of any difficult-to-recover data stored in Tier 2 (Ceph), as MSI cannot recover data that are lost or deleted. It is the responsibility of the PI to ensure that their students and collaborators have transferred ownership of all relevant data to the PI or one of the group administrators before access to MSI is terminated.

No Protections: BlackPearl (Tier 3)

The BlackPearl system is a self-service tape archival platform for making backup and archive copies of important data (for example from Tier 1 or Tier 2 storage). The system duplicates all data onto two separate tapes, for redundancy in the event that one tape is non-viable. No additional backups or copies of data stored on Tier 3 are created beyond this duplication. 

Data Retention

Data on Primary Storage are retained in a Principal Investigator (PI) group directory for each annual allocation period. The PI is considered the owner of all data within the accounts of their group members. If a PI does not renew their affiliation with MSI before the end of an allocation period (December 31), MSI will lock the PI account, which will render their data inaccessible.  

Groups with Data Use Agreements

Groups that store data governed by a third-party Data Use Agreement or Data Use Certification (such as dbGaP, or data governed by an NDA) are responsible for adhering to the terms of their agreement and for choosing the appropriate MSI storage resource for their data. Please contact help@msi.umn.edu for assistance with determining how your agreement impacts the availability of any or all of the data protection services listed here.

Service Breakdown

Availability and retention time for data restoration

                      Snapshots           Tape Backups
  Tier 1 Retention    4 Calendar Weeks    60 Days*
  Tier 2              None                None
  Tier 3              None                None

* Charge applies per request