Overview of Data & Metadata Submission to the DCC (via FTP Pipeline)
- Things to Know Before Submission
  - Step 0: Getting an FTP Account on the Genboree FTP Server
- Small RNA-seq Data Submission Pipeline
  - Files Needed for Data Submission
  - Step 1: Preparing Your Data Archive
  - Step 2: Preparing Your Metadata Archive
  - Step 3: Preparing Your Manifest File
  - Step 4: Uploading Your Submission to the FTP Server for Processing
  - Step 5: Processing Your Files
- qPCR Data Submission
  - Files Needed for Data Submission
  - Step 1: Preparing Your Data Archive
  - Step 2: Preparing Your Metadata Archive
  - Step 3: Preparing Your Manifest File
  - Step 4: Uploading Your Submission to the FTP Server for Processing
  - Step 5: Processing Your Files
- Miscellaneous Tips and Tricks
  - Creating an Archive
  - Learning How to Use the Terminal
This Wiki page includes instructions on how to submit your data (with accompanying metadata) to the Data Coordination Center (DCC)
using the Genboree FTP Data Submission Pipeline.
If you're submitting small RNA-seq data, you should follow the steps in the "Small RNA-seq Data Submission Pipeline" section.
If you're submitting qPCR data, you should follow the steps in the "qPCR Data Submission" section.
Support for submitting long RNA-seq data is coming soon!
Things to Know Before Submission
This tutorial will walk you through the entire process of creating an FTP account, formatting and submitting your data and metadata properly,
and then seeing your dataset on the Atlas.
Step 0: Getting an FTP Account on the Genboree FTP Server
Small RNA-seq Data Submission Pipeline
All submitted samples will be processed through the exceRpt Small RNA-seq Pipeline for exRNA Profiling
and exceRpt Small RNA-seq Post-processing tools.
Files Needed for Data Submission
Your submission will consist of three different files:
- a data archive: The data archive will contain all of your data files (FASTQ / SRA), as well as an optional spike-in file (FASTA) for those inputs.
- a metadata archive: The metadata archive will contain various metadata documents relating to your data submission.
- a manifest file: The manifest file will link together your data and metadata files, and it will also provide other valuable information for verifying that your submission is complete.
IMPORTANT NOTE
All three files must share the same base name; the data archive name additionally ends in _data and the metadata archive name ends in _metadata.
This will be explained again in Step 4 below, but your files will look something like this:
- samples_data.zip
- samples_metadata.zip
- samples.manifest.json
Here, I've chosen the name "samples" for my submission. This is just an example - you should give a more descriptive name in your actual submission ("gastricCancerOct2015_data.zip", for example).
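The naming rule above can be sketched in a few lines of Python. This is only an illustration: the function name `submission_file_names` is hypothetical, and the `.zip` and `.manifest.json` extensions are taken from the example file names above, not from a full specification of what the pipeline accepts.

```python
def submission_file_names(base_name):
    """Return the three file names expected for a small RNA-seq submission.

    Assumption: archives use the .zip extension, as in the examples above.
    """
    return {
        "data_archive": f"{base_name}_data.zip",
        "metadata_archive": f"{base_name}_metadata.zip",
        "manifest": f"{base_name}.manifest.json",
    }


if __name__ == "__main__":
    # Print the file names for a descriptively named submission.
    for role, file_name in submission_file_names("gastricCancerOct2015").items():
        print(f"{role}: {file_name}")
```

Generating all three names from a single base name makes it harder to end up with a mismatch between the archives and the manifest.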
Step 1: Preparing Your Data Archive
Step 2: Preparing Your Metadata Archive
Step 3: Preparing Your Manifest File
Step 4: Uploading Your Submission to the FTP Server for Processing
Upload Submission to the DCC using FTP Server
Step 5: Processing Your Files
qPCR Data Submission
Files Needed for Data Submission
Your submission will consist of two or three different files:
- a data archive: The data archive is OPTIONAL. If included, it will contain all of your data files (RDML format or any other custom format provided by the qPCR instrument).
- a metadata archive: The metadata archive will contain various metadata documents relating to your data submission.
- a manifest file: The manifest file will provide valuable information about your submission.
IMPORTANT NOTE
All files in the submission must share the same base name; the data archive name (if included) additionally ends in _qPCR_data and the metadata archive name ends in _qPCR_metadata.
This will be explained again in Step 3 below, but your files will look something like this:
- samples_qPCR_data.zip
- samples_qPCR_metadata.zip
- samples_qPCR.manifest.json
Here, I've chosen the name "samples" for my submission. This is just an example - you should give a more descriptive name in your actual submission ("gastricCancerOct2015_qPCR_data.zip", for example).
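Before uploading, it can help to check that the file names in a qPCR submission are consistent. The sketch below is one way to do that, under stated assumptions: the function name `check_qpcr_file_names` is hypothetical, the suffixes (including `.zip`) are copied from the example file names above, and the data archive is treated as optional as described in this section.

```python
def check_qpcr_file_names(file_names):
    """Return the shared base name, or raise ValueError if the names are inconsistent.

    Assumption: suffixes follow the example file names shown above.
    """
    suffixes = {
        "_qPCR_data.zip": "data archive",  # optional
        "_qPCR_metadata.zip": "metadata archive",
        "_qPCR.manifest.json": "manifest file",
    }
    found = {}
    for name in file_names:
        for suffix, role in suffixes.items():
            if name.endswith(suffix):
                # Everything before the suffix is the submission's base name.
                found[role] = name[: -len(suffix)]
                break
        else:
            raise ValueError(f"Unrecognized file name: {name}")
    # The metadata archive and the manifest are always required.
    for required in ("metadata archive", "manifest file"):
        if required not in found:
            raise ValueError(f"Missing {required}")
    bases = set(found.values())
    if len(bases) != 1:
        raise ValueError(f"Base names differ: {sorted(bases)}")
    return bases.pop()
```

For example, calling it with `["samples_qPCR_metadata.zip", "samples_qPCR.manifest.json"]` returns the base name `"samples"`, while mixing two different base names raises a `ValueError`.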
Step 1: Preparing Your Data Archive
Prepare Your qPCR Data Archive
Step 2: Preparing Your Metadata Archive
Prepare Your qPCR Metadata Archive
Step 3: Preparing Your Manifest File
Prepare Your qPCR Manifest File
Step 4: Uploading Your Submission to the FTP Server for Processing
Upload qPCR Submission to the DCC using FTP Server
Step 5: Processing Your Files
Miscellaneous Tips and Tricks
Below, you'll find some useful tips and tricks for creating your submission for the FTP Pipeline.
Creating an Archive
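As a minimal sketch, a zip archive like the ones named in the examples on this page can be built with Python's standard `zipfile` module. The function name `create_archive` is hypothetical, and storing every file at the top level of the archive is an assumption made for illustration; check the step-specific instructions for the layout the pipeline actually expects.

```python
import tempfile
import zipfile
from pathlib import Path


def create_archive(archive_path, input_files):
    """Write input_files into a zip archive, storing each file under its base name.

    Assumption: a flat archive with no subdirectories is wanted.
    """
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for input_file in input_files:
            input_file = Path(input_file)
            # arcname drops the directory part so the archive stays flat.
            archive.write(input_file, arcname=input_file.name)
    return Path(archive_path)


if __name__ == "__main__":
    # Demonstration with a throwaway file in a temporary directory.
    with tempfile.TemporaryDirectory() as work_dir:
        sample = Path(work_dir) / "sample1.fastq"
        sample.write_text("@read1\nACGT\n+\nIIII\n")
        archive = create_archive(Path(work_dir) / "samples_data.zip", [sample])
        with zipfile.ZipFile(archive) as check:
            print(check.namelist())
```

The same function can be used for the metadata archive by passing the metadata documents and a name ending in `_metadata.zip`.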
Learning How to Use the Terminal
If you need help navigating the terminal (and want to learn some basic Linux/OSX commands), the following link will be useful: