Introduction

The NanoFlow Repository is a public data repository for extracellular vesicles and particles (EVPs), developed to support transparent publishing and to remain compatible with a future, more inclusive atlas. Initially, the NanoFlow Repository will fully cater to EV characterization platforms for which proposed field standards, best practices, and data processing tools have been developed, including single EV flow cytometry, multiplex flow cytometry, and microfluidic resistive pulse sensing. While uncalibrated data uploads are supported, calibration is recommended: its utility has been demonstrated, and it allows cross-platform data integration.

The NanoFlow Repository has been built with field standards and local data processing software tools in mind, and also for eventual integration with the exRNA Atlas into an atlas of EVPs and their cargo. For example, calibrating single EV flow cytometry data with FCMPASS creates encrypted .exRNA files that the repository recognizes, which helps ensure data integrity. The NanoFlow Repository organizes uploaded data as one or more datasets owned by a user-selected or user-created team. The core of a dataset consists of one or more FCS files generated from the same flow cytometer and a MIFlowCyt-EV report. Additionally, files generated by the calibration tools FCMPASS, MPAPASS, and RPSPASS are fully supported. Once all necessary data have been uploaded and validated, the team can publish its datasets to allow public access and download.
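The dataset structure described above can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the class names, field names, and the check itself are hypothetical and are not taken from the NanoFlow Repository or NanoAPI. It shows one way to test that a dataset has at least one FCS file, that all FCS files come from the same flow cytometer, and that a MIFlowCyt-EV report is attached.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FcsFile:
    name: str
    cytometer: str  # hypothetical instrument identifier, for illustration only


@dataclass
class Dataset:
    team: str  # the owning team
    fcs_files: List[FcsFile] = field(default_factory=list)
    miflowcyt_ev_report: Optional[str] = None  # hypothetical: name of the report file

    def problems(self) -> List[str]:
        """Return the reasons this dataset does not yet have its core contents."""
        issues = []
        if not self.fcs_files:
            issues.append("at least one FCS file is required")
        if len({f.cytometer for f in self.fcs_files}) > 1:
            issues.append("all FCS files must come from the same flow cytometer")
        if self.miflowcyt_ev_report is None:
            issues.append("a MIFlowCyt-EV report is required")
        return issues

    def has_core_contents(self) -> bool:
        return not self.problems()


if __name__ == "__main__":
    ds = Dataset(
        team="example-team",
        fcs_files=[FcsFile("sample1.fcs", "cytometer-1")],
        miflowcyt_ev_report="report",
    )
    print(ds.has_core_contents(), ds.problems())
```

The actual validation performed by the repository may involve further checks that this sketch does not attempt to model.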

Those who want to contribute and publish their datasets can do so by logging in, if already a Genboree user, or by signing up here. Data submission and access are made possible by a Linked Data Hub (LDH) extension named NanoAPI. The figure below illustrates the data submission steps for a user and how the submitted data is modeled.

[A] Data submission steps. Scientists can share and publish their nanoparticle datasets in a few steps. The data submitter (the user) creates a team. Acting on behalf of the team, the user creates a resource to which multiple datasets can be uploaded. A dataset consists of one or more FCS files generated from the same flow cytometer. Uploaded files are processed for metadata extraction and saved to the NanoFlow database.
[B] Graph entity data model. Each rounded rectangle represents a type of entity, e.g., Team, DataSet, Sample; each entity is assigned one or both of the following roles: subject and linked data. Arrows represent edges that link entities to one another. The model is based on LDH's subject and linked data entity roles, which are used to model data sources. Subject entities are related to linked data entities. For example, an entity of type Team is a subject entity that has User linked data entities (its members). An entity of type DataSet, by contrast, is both a subject and a linked data entity: it is a linked data entity to a Team entity (the owner of the dataset) and a subject entity to a SampleSource entity.
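
The subject and linked data roles described in [B] can be sketched as a small directed graph. This is a minimal illustration, not the LDH or NanoAPI implementation: the EntityGraph class, its methods, and the entity identifiers are all hypothetical. Only the entity types (Team, User, DataSet, SampleSource) and the relationships between them come from the description above.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Entity:
    entity_type: str  # e.g. "Team", "User", "DataSet", "SampleSource"
    entity_id: str    # hypothetical identifier, for illustration only


class EntityGraph:
    """Directed edges from a subject entity to its linked data entities."""

    def __init__(self) -> None:
        self._edges: Dict[Entity, List[Entity]] = {}
        self._linked: Set[Entity] = set()

    def link(self, subject: Entity, linked: Entity) -> None:
        """Record that `linked` is a linked data entity of `subject`."""
        self._edges.setdefault(subject, []).append(linked)
        self._linked.add(linked)

    def linked_data(self, subject: Entity) -> List[Entity]:
        return list(self._edges.get(subject, []))

    def roles(self, entity: Entity) -> Set[str]:
        """An entity can hold one or both roles, depending on its edges."""
        found = set()
        if entity in self._edges:
            found.add("subject")
        if entity in self._linked:
            found.add("linked data")
        return found


# Example mirroring the caption: a Team with a User member owns a DataSet,
# and the DataSet is in turn the subject of a SampleSource.
team = Entity("Team", "team-1")
user = Entity("User", "user-1")
dataset = Entity("DataSet", "dataset-1")
sample_source = Entity("SampleSource", "source-1")

graph = EntityGraph()
graph.link(team, user)
graph.link(team, dataset)
graph.link(dataset, sample_source)

if __name__ == "__main__":
    for entity in (team, user, dataset, sample_source):
        print(entity.entity_type, sorted(graph.roles(entity)))
```

In this sketch the DataSet entity ends up with both roles because it appears at the head of one edge and the tail of another, which is the situation the caption describes.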