Preservation & sharing

UNIL, together with the University of Zurich, is actively participating in the SWISSUbase project led by FORS, the Swiss Centre of Expertise in the Social Sciences. The project aims to provide a general, open and sustainable data repository that complies with the FAIR Data Principles, i.e. data that are findable, accessible, interoperable and reusable.

This institutional repository will combine a strong disciplinary approach with a generalist one. It will allow for long-term data sharing and archiving and is expected to be accessible to the linguistics community in the spring of 2021 before opening up to other disciplines.

Backup, storage and security: what are the differences?

Secure storage of your research data and regular backups are essential during your research project.

  • Backup consists of creating additional copies of your current data. It is essential to avoid the risk of data loss due to accidental erasure, hard disk failure, theft or damage to equipment. Files stored on your desktop are not automatically backed up. For more information, see UNIL's Crashplan backup system for your workstation.
  • Data storage refers to where and how you store your data. It involves:
    • selecting the appropriate file formats (for example, choosing between options such as plain text, rich text, or open and non-proprietary formats);
    • selecting the appropriate medium for the physical storage of data (e.g. hard disks, CD-ROMs, network storage and servers).
  • Security is about protecting your data. This means:
    • ensuring that data are not lost or corrupted;
    • controlling access to your data as appropriate. This can be done in a variety of ways, including physical security (e.g. storing data in a locked room), file password protection and encryption.
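One way to check that a backup copy has not been lost or corrupted is to compare checksums of the original files and their copies. The following is a minimal sketch only, not a description of UNIL's backup system: the function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(original: dict[str, str], backup: dict[str, str]) -> dict[str, list[str]]:
    """List files that are missing from the backup or whose content differs."""
    missing = sorted(set(original) - set(backup))
    changed = sorted(
        name for name in original.keys() & backup.keys()
        if original[name] != backup[name]
    )
    return {"missing": missing, "changed": changed}
```

For example, calling compare_manifests(build_manifest(Path("project")), build_manifest(Path("backup/project"))) on two hypothetical folders returns the files that need attention. An empty result for both keys means every original file has an identical copy.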

How to archive your data?

Data archiving should not be confused with storage or backup. Archiving takes place after the end of a research project and aims at permanent preservation. It must be accompanied by management rules that allow the data to be reused over time and to be properly understood and contextualised (metadata). This is why it is important to use open and non-proprietary file formats (see UK Data Service recommendations) and to follow classification and naming rules (see organize your data).

The Data Management Plan is a tool that allows you not only to manage your data during the project, but also to ensure its proper management over time (after the end of the project).

In principle, publication-related data must be deposited for archiving and sharing in a non-commercial repository that complies with the FAIR principles, subject to other requirements formulated by the research funding agency. Data not related to a publication can be stored on the Ci long-term storage infrastructure (contact the Division Calcul et Support à la Recherche, DCSR). According to Directive 4.5, the costs of archiving and of long-term storage are covered by UNIL.

The sorting and destruction of research data is the responsibility of the researchers. In the event that UNIL has an interest in keeping research data whose destruction is desired by a researcher, UNIRIS shall determine, in agreement with the researcher, whether it is appropriate to archive or destroy, in whole or in part, the research data concerned.

What data to keep?

Beyond the question of WHAT data to keep, deciding what should be retained also requires reflecting on WHY the data should be kept and on the ACTORS to involve.

For UNIRIS, as in the Jisc study, the reasons WHY research data should be kept rest on two aspects:

  1. the data support the integrity of the research and its reproducibility;
  2. the data have the potential to be reused.

Questions about WHAT data to retain then focus on the following criteria:

  • those related to the "research" mission, i.e.:
    • funder's requirement
    • statutory requirement
    • publisher's requirement
    • requirement of the researcher's home institution
    • the data support a publication and research results
    • the data have a unique character
    • the data are linked to the notion of intangible cultural heritage
    • originality of the data
    • proven accessibility and usability
  • those related to the nature of the data, i.e.:
    • raw data
    • processed data
    • data that support a publication and research results
    • data that synthesise research
  • those related to the types of data, i.e.:
    • observational data
    • experimental data
    • secondary data
    • negative data
  • those related to the materials that complete the data, i.e.:
    • physical samples
    • metadata and documentation
    • software used

Finally, the committee in a position to decide on conservation/archiving should be composed of the following ACTORS:

  • the researcher who created/collected the data
  • the funder
  • an ethicist
  • an archivist
  • the manager (curator) of the data
  • other researchers using the data
  • the researcher's home institute.

The different research areas and the institutions in which researchers are located should also be consulted.

How to share your data?

In a similar way to scientific publications, data sharing can be carried out via a general repository or a disciplinary repository.

It is strongly recommended to share your data in a FAIR and non-commercial repository. In order to facilitate the transition to FAIR data, the SNSF has defined a set of minimum criteria that data repositories must meet in order to comply with the FAIR principles, and has produced a checklist. The aim is to be able to answer yes to the following questions:

  • Are globally unique and persistent identifiers (e.g. DOI) assigned to the datasets (or, ideally, to the individual files in a dataset)?
  • Does the database allow the upload of intrinsic metadata (e.g. author's name, dataset content, associated publications) and of metadata defined by the person submitting the data (e.g. definition of variables)?
  • Is the user licence under which the data will be accessible clearly stated (CC0 recommended for data and CC BY for publications), or can the user upload/select a licence?
  • Are citations and metadata always publicly accessible (even in the case of restricted datasets)?
  • Does the database provide a submission form requiring that the intrinsic metadata follow a specific format (to ensure their automatic use or interoperability)?
  • Does the database have a long-term preservation plan for the archived data?
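When comparing several candidate repositories, the six questions above can be recorded as a simple yes/no checklist. The sketch below is illustrative only: the key names and the unmet_criteria function are hypothetical and are not part of any SNSF tool.

```python
# Hypothetical short keys for the six checklist questions listed above.
SNSF_CRITERIA = {
    "persistent_identifiers": "Globally unique and persistent identifiers (e.g. DOI) are assigned",
    "metadata_upload": "Intrinsic and submitter-defined metadata can be uploaded",
    "clear_licence": "The licence is clearly stated or can be selected",
    "public_citation_and_metadata": "Citations and metadata are always publicly accessible",
    "structured_submission_form": "The submission form enforces a metadata format",
    "long_term_preservation_plan": "A long-term preservation plan exists",
}


def unmet_criteria(answers: dict[str, bool]) -> list[str]:
    """Return the keys of all criteria not answered with yes.

    A criterion that is missing from `answers` counts as unmet.
    """
    return [key for key in SNSF_CRITERIA if not answers.get(key, False)]
```

A repository meets the minimum criteria only when unmet_criteria returns an empty list. Treating unanswered questions as unmet is a deliberately cautious design choice.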

The website lists most of the databases and their characteristics.

Which data repositories at UNIL?

Pending an institutional repository equivalent to SERVAL for data (see the SWISSUbase project below), UNIL could recommend the use of the general repository ZENODO, developed by CERN and funded by the European Union. Each faculty of the University should be able to run its own community there, thus offering its researchers the possibility of depositing and sharing their data. Contact your research consultant for more information.

FORSBase for social and political sciences

For data in the social and political sciences, UNIL recommends the use of FORSBase, developed by FORS, the Swiss Centre of Expertise in the Social Sciences.

The centre produces data from national and international surveys. It provides tools for information infrastructure and a consultation service for researchers.

SWISSUbase for all research areas (from 2021)

UNIL is currently working with FORS and the University of Zurich (UZH) to develop a repository, both disciplinary and generalist, for managing the research data produced at UNIL. See the SWISSUbase project.

Qualitative data in the humanities and social sciences (ex-PlaTec)

PlaTec no longer exists as of July 31, 2021.

The research support services it offered have been provided since August 1, 2021 by the Division Calcul et Support à la Recherche (DCSR), which the PlaTec team has joined.

If you would like support for a research project (databases and their conceptualization, research data management, digital project management, preferably from the project design stage), please fill in this contact/evaluation form. We will contact you as soon as possible.

Did you know?

29% of UNIL researchers believe that their data should be kept indefinitely.

UNIL's Research Data Survey, 2015

Archiving formats

  • Web Archive: WARC
  • Containers: TAR, GZIP, ZIP
  • Databases: XML, CSV
  • Tabular data: CSV
  • Films: MOV, MPEG, AVI, MXF
  • Geospatial: SHP, DBF, GeoTIFF, NetCDF
  • Images: TIFF, JPEG 2000, PDF, PNG, GIF, BMP
  • Sounds: WAVE, AIFF, MP3, MXF
  • Statistics: ASCII, DTA, POR, SAS, SAV
  • Text: XML, PDF/A, HTML, ASCII, UTF-8

Source: Stanford Libraries
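Before depositing files, it can help to check them against the list above. The following is a rough sketch under stated assumptions: the mapping from format names to file extensions (for example GZIP to "gz", JPEG 2000 to "jp2", NetCDF to "nc", ASCII and UTF-8 text to "txt") is an assumption added for illustration, and an extension alone cannot confirm that a file is valid, for instance that a PDF is really PDF/A.

```python
from pathlib import Path

# Categories follow the list above; the extensions are illustrative assumptions.
ARCHIVING_FORMATS = {
    "Web archive": {"warc"},
    "Containers": {"tar", "gz", "zip"},
    "Databases": {"xml", "csv"},
    "Tabular data": {"csv"},
    "Films": {"mov", "mpg", "mpeg", "avi", "mxf"},
    "Geospatial": {"shp", "dbf", "tif", "tiff", "nc"},
    "Images": {"tif", "tiff", "jp2", "pdf", "png", "gif", "bmp"},
    "Sounds": {"wav", "aif", "aiff", "mp3", "mxf"},
    "Statistics": {"txt", "dta", "por", "sas", "sav"},
    "Text": {"xml", "pdf", "html", "txt"},
}


def archival_categories(filename: str) -> list[str]:
    """Return the categories whose extension list contains the file's extension."""
    extension = Path(filename).suffix.lower().lstrip(".")
    if not extension:
        return []
    return [
        category
        for category, extensions in ARCHIVING_FORMATS.items()
        if extension in extensions
    ]
```

An empty result, as for a hypothetical "notes.docx", suggests that the file should be converted to one of the listed formats before archiving.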

Data life cycle

To better understand the challenges of data storage, archiving and sharing, see the concept of the data life cycle.