What can be submitted to BioStudies?

We welcome submissions of all biological data that do not fit in the other, specialised EBI resources, as well as data packages that link together datasets in other resources (e.g., multi-omics). Usually a dataset is associated with a preprint or a publication.

Please use our data deposition tools, or contact us to discuss setting up a data pipeline if you expect large submission volumes, e.g. from an ongoing project.

Please note: all gene expression and other functional genomics data should be submitted to the ArrayExpress collection using Annotare.

How do I submit?

Submissions are handled via our online submission tool. There are two main steps: uploading the data files, and providing metadata such as contact details and links to related datasets in other resources. Data files can be uploaded through the submission tool, or using the FTP or Aspera protocols. Metadata can be provided by filling in a web form. Data in BioStudies is organised into collections, and different collections have different forms. As an alternative, submissions can be made in the tab-delimited format described here. If you feel that the tool does not fit your requirements, please contact us at biostudies@ebi.ac.uk.

Is there a cost to deposit data in BioStudies?

No, both deposition of data to BioStudies and data access are free of charge.

What is the best way to submit large volumes of information?

We recommend Aspera transfer for medium to large (10 GiB+) volumes of data.
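
As a rough illustration of the 10 GiB guideline, the sketch below adds up the size of a local directory and reports whether it reaches that threshold. The function names and the threshold constant are our own, chosen for illustration; they are not part of any BioStudies tool.

```python
from pathlib import Path

GIB = 1024 ** 3
# Assumption: "10 GiB+" is read as "10 GiB or more".
ASPERA_THRESHOLD_BYTES = 10 * GIB


def total_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files under root, recursively."""
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def aspera_recommended(size_bytes: int) -> bool:
    """True if the volume reaches the size at which Aspera is recommended."""
    return size_bytes >= ASPERA_THRESHOLD_BYTES
```

For example, aspera_recommended(total_size_bytes(Path("my_dataset"))) would check a hypothetical local directory named my_dataset.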

How and when do I receive a BioStudies accession number?

Accession numbers are assigned during the submission process. The system needs to validate and transfer data files before issuing an accession number. For small datasets this can take just minutes, while larger data volumes (hundreds of GiB) may take hours. Note that usually the bottleneck will be the initial transfer of data files into your BioStudies “home” area, prior to filling in the metadata form.

Do you assign Digital Object Identifiers (DOIs)?

Yes, they are automatically assigned after submission.

Can I keep my dataset private (e.g. until publication)?

Yes. When you submit your data, you can choose a release date. Until that date, your data will not be publicly visible. You can choose to share your data with specific people (e.g. editors of the associated manuscript) by clicking the Share button in the data access page and forwarding the URL that is presented.

Can I add a publication or perform other edits to my dataset at a later point?

Yes, you can edit the publication field of your dataset. While a dataset remains private, you can make arbitrary edits, e.g. add more data, change the release date or change metadata. After a dataset has become public, some operations are restricted, e.g. making it private again or deleting data files.

Under what license(s) is a BioStudies dataset available?

New datasets in BioStudies are released into the public domain under the terms of a Creative Commons Zero (CC0) waiver.

How are ORCIDs used in BioStudies?

Data depositors can include their ORCIDs in contact details; these will be searchable in the data access interface. If you are an author of a dataset, you can also claim it through the browse interface.

Which file types can I submit?

BioStudies accepts all file types. However, there are some restrictions on file names. It is safe to use:
  • Any alphanumeric character (a-z | A-Z | 0-9)
  • Any of the following special characters
    • Exclamation point ( ! )
    • Hyphen ( - )
    • Underscore ( _ )
    • Period ( . )
    • Asterisk ( * )
    • Single quote ( ' )
    • Open parenthesis ( ( )
    • Close parenthesis ( ) )
These follow the Amazon S3 object key naming guidelines.

For submissions via PageTab files, and in file lists:
  • when referring to a directory the file path must not end with a slash (it should be e.g. “/mysubmission/mysubdirectory”)
  • please avoid trailing spaces (space character at the end of a file name)
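
The naming rules above can be checked locally before uploading. The following is a minimal sketch, not an official BioStudies validator: the function names are hypothetical, and it assumes that "/" is the path separator and that any character outside the safe list is worth flagging (such characters are not necessarily rejected by the system).

```python
import re

# Characters listed as safe: a-z, A-Z, 0-9 and ! - _ . * ' ( )
SAFE_CHAR = re.compile(r"[A-Za-z0-9!\-_.*'()]")


def unsafe_characters(file_name: str) -> list[str]:
    """Return the distinct characters outside the safe list, in order of first appearance."""
    seen = []
    for ch in file_name:
        if not SAFE_CHAR.fullmatch(ch) and ch not in seen:
            seen.append(ch)
    return seen


def check_path(path: str) -> list[str]:
    """Return the problems found in a path used in a PageTab file or file list."""
    problems = []
    if path.endswith("/"):
        problems.append("path ends with a slash")
    if path != path.rstrip(" "):
        problems.append("path has trailing spaces")
    # Assumption: "/" separates path components and is itself allowed.
    for part in path.split("/"):
        bad = unsafe_characters(part)
        if bad:
            problems.append(f"characters not on the safe list in {part!r}: {bad}")
    return problems
```

An empty list from check_path means the path passed all three checks. Note that a trailing space is reported twice, once as a trailing space and once as a character that is not on the safe list.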