Study-Specific File

Phenotypic File

Every phenotypic file should have ind_id as the first column. Additional columns must be defined in a corresponding Study-Specific Phenotypic Data Dictionary file (_phen_dd), and must follow whatever format and value constraints are defined therein.

Column Name Type Format / Value Checks Additional Comments
ind_id string

Format:

<study_id>-<site_id>-<fam_id>-<subject_id>

Checks:

This field is required and cannot be left empty.

Definition: A unique ID for each individual, in the specified format.

Sample value: 22-101-00132-00001

If your study does not include families, please submit <study_id>-<site_id>-<subject_id>-<subject_id>.

Comments: IMPORTANT: This ID is used to verify subject / cell_id pairs at RUCDR, so the ID you submit here must exactly match the subject-code submitted to RUCDR. If IDs submitted to RUCDR do not match this format, NRGR will reformat them and record the ID updates prior to release.
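The ind_id format above can be checked before submission. The following is a minimal sketch, not an official NRGR validator: the function names are hypothetical, and it assumes every segment is numeric (as in the sample value) without enforcing segment lengths, which this page does not specify.

```python
import re

# Assumption: each of the four segments is made of digits, as in the sample
# value 22-101-00132-00001. Segment lengths are not stated, so none are enforced.
IND_ID_PATTERN = re.compile(r"[0-9]+-[0-9]+-[0-9]+-[0-9]+")


def is_valid_ind_id(value: str) -> bool:
    """Return True if value looks like <study_id>-<site_id>-<fam_id>-<subject_id>."""
    # An empty value fails the "required field" check.
    if not value:
        return False
    return IND_ID_PATTERN.fullmatch(value) is not None


def ind_id_without_family(study_id: str, site_id: str, subject_id: str) -> str:
    """Build an ind_id for a study without families by repeating subject_id."""
    return f"{study_id}-{site_id}-{subject_id}-{subject_id}"
```

For example, is_valid_ind_id("22-101-00132-00001") accepts the sample value, while an empty string or a three-segment ID is rejected.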

Timepoint File

For timepoint submissions (submissions with multiple samples per subject, taken at clinically meaningful timepoints), all phenotypic files must have the following as their first four columns:

Column Name Type Format / Value Checks Additional Comments
ind_id string

Format:

<study_id>-<site_id>-<fam_id>-<subject_id>

Checks:

This field is required and cannot be left empty.

Values in the (ind_id, timecode) fields should be unique.

Values in the (ind_id, cell_id, timecode) fields should be unique.

Definition: A unique ID for each individual, in the specified format.

Sample value: 22-101-00132-00001

If your study does not include families, please submit <study_id>-<site_id>-<subject_id>-<subject_id>.

IMPORTANT: This ID is used to verify subject / cell_id pairs at RUCDR, so the ID you submit here must exactly match the subject-code submitted to RUCDR. If IDs submitted to RUCDR do not match this format, NRGR will reformat them and record the ID updates prior to release.

cell_id string(14)

Values in the cell_id field should be unique.

Values in the (ind_id, cell_id, timecode) fields should be unique.

Definition: RUCDR sample id, "RUID"

Sample value: 03C19340

An empty string is an invalid value. Use the value NULL in place of an empty string.

timecode string

Checks:

This field is required and cannot be left empty.

Values in the (ind_id, timecode) fields should be unique.

Values in the (ind_id, cell_id, timecode) fields should be unique.

Definition: A unique, short code used to represent a timepoint.

Sample value: T0
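The uniqueness checks listed for ind_id, cell_id, and timecode can be tested on rows that have already been read into dictionaries keyed by column name. This is a rough sketch with hypothetical function names. It assumes that NULL marks a missing cell_id and is exempt from the cell_id uniqueness check, which this page does not state.

```python
from collections import Counter


def find_duplicates(rows, columns):
    """Return the value tuples that occur more than once for the given columns."""
    counts = Counter(tuple(row[col] for col in columns) for row in rows)
    return sorted(key for key, count in counts.items() if count > 1)


def check_timepoint_rows(rows):
    """Collect readable uniqueness violations for a list of timepoint rows."""
    problems = []
    for dup in find_duplicates(rows, ["ind_id", "timecode"]):
        problems.append(f"duplicate (ind_id, timecode): {dup}")
    for dup in find_duplicates(rows, ["ind_id", "cell_id", "timecode"]):
        problems.append(f"duplicate (ind_id, cell_id, timecode): {dup}")
    # Assumption: rows whose cell_id is NULL are skipped for the cell_id check.
    with_cell_id = [row for row in rows if row["cell_id"] != "NULL"]
    for dup in find_duplicates(with_cell_id, ["cell_id"]):
        problems.append(f"duplicate cell_id: {dup[0]}")
    return problems
```

An empty list from check_timepoint_rows means no uniqueness violation was found. Each message otherwise names the repeated value or combination of values.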

timepoint_date date

This field cannot be left empty.

Record dates in YYYY-MM-DD format (ISO 8601 dates).

Definition: The date corresponding to the timepoint for an individual

Sample value: 2016-01-01

Please note: Excel defaults date fields to MM-DD-YYYY format, so the dates may appear in a different format when the file is opened in Excel. Before submitting, open the file in Notepad or another text editor that shows the stored values, and confirm that the dates are in the correct format.
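One way to catch dates that a spreadsheet has silently reformatted is to check each timepoint_date value as plain text. The sketch below uses a hypothetical function name and only the Python standard library. It checks the exact YYYY-MM-DD shape first, because datetime.strptime on its own also accepts unpadded values such as 2016-1-1.

```python
import re
from datetime import datetime

# Exactly four digits, two digits, two digits, separated by hyphens.
ISO_DATE_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def is_valid_timepoint_date(value: str) -> bool:
    """Return True if value is a real calendar date written as YYYY-MM-DD."""
    if not ISO_DATE_PATTERN.fullmatch(value):
        return False
    try:
        # Rejects impossible dates such as 2016-02-30.
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True
```

The sample value 2016-01-01 passes this check, while 01-01-2016, 2016-1-1, and an empty string do not.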