Biosample

From Original GA4GH schema

Status: proposed

Provenance

Used by

Authors

Schema source: YAML file

Properties of the Biosample class

| Property | Type | Format | Description |
|---|---|---|---|
| age_at_collection | | | The age of the individual at the time of biosample collection, as an Age object. |
| biocharacteristics | array | | A wrapper list of "Phenotype" objects with properly prefixed term ids, describing features of the biosample. Examples would be phenotypes, disease codes or other ontology classes specific to this biosample. In a complete data model (variants - (callsets) - biosamples - individuals), characteristics applying to the individual (e.g. sex, most phenotypes) should be annotated on the individual instead. |
| created | timestamp | | The creation time of this record, in ISO8601 format. |
| data_use_conditions | | | Data use conditions applying to data from this biosample, as an ontology object (e.g. DUO). |
| description | string | | A free text description of the biosample. This should not contain any structured data. |
| external_references | array | | A list of reference_class objects with properly prefixed (e.g. identifiers.org) external identifiers and a term describing the relationship. |
| geo_provenance | | | A geo_class object that ideally describes the geographic location where the sample was extracted. In practice, the value often reflects either the location of the laboratory where the analysis was performed or the institution of the corresponding author. |
| id | string | | The locally unique identifier of this biosample (referenced as "biosample_id"). It is unique in the context of the local (e.g. server, resource) instance. |
| individual_id | string | | In a complete data model, "individual_id" points to the "id" of the individual ("donor", "subject"...) this biosample was derived from. In a local context this could be the id attribute in a corresponding "individuals" collection. |
| info | | | A wrapper for objects without further specification in the schema. |
| name | string | | A short descriptive name for the sample, which should be sufficient to distinguish it from other samples in the project or collection. This is a label or symbolic identifier for the biosample. |
| project_id | string | | The id attribute of the project that this biosample was collected in. |
| updated | timestamp | | The time of the last edit of this record, in ISO8601 format. |

Description

A Biosample refers to a unit of biological material from which the substrate molecules (e.g. genomic DNA, RNA, proteins) for molecular analyses (e.g. sequencing, array hybridisation, mass spectrometry) are extracted. Examples would be a tissue biopsy, a single cell from a culture for single-cell genome sequencing, or a protein fraction from a gradient centrifugation. Several instances (e.g. technical replicates) or types of experiments (e.g. genomic array as well as RNA-seq experiments) may refer to the same Biosample. FHIR mapping: Specimen (http://www.hl7.org/fhir/specimen.html).

Examples

{
   "age_at_collection" : {
      "age" : "P56Y",
      "age_class" : {
         "id" : "HP:0003621",
         "label" : "Juvenile onset"
      }
   },
   "biocharacteristics" : [
      {
         "description" : "Lobular Breast Carcinoma In Situ, study sample",
         "type" : {
            "id" : "ncit:C4018",
            "label" : "Lobular Breast Carcinoma In Situ"
         }
      }
   ],
   "created" : "2017-10-25T07:06:03Z",
   "data_use_conditions" : {
      "id" : "DUO:0000004",
      "label" : "no restriction"
   },
   "description" : "Burkitt lymphoma, cell line Namalwa",
   "external_references" : [
      {
         "description" : "Cellosaurus cell line identifier",
         "relation" : "provenance",
         "type" : {
            "id" : "cellosaurus:CVCL_0312",
            "label" : "HOS"
         }
      }
   ],
   "geo_provenance" : {
      "altitude" : 94,
      "city" : "Timisoara",
      "country" : "Romania",
      "label" : "Str Marasesti 5, 300077 Timisoara, Romania",
      "latitude" : 45.75,
      "longitude" : 21.23
   },
   "id" : "AM_BS__NCBISKYCGH-1993",
   "individual_id" : "ind-cnhl-1293347-004",
   "info" : {
      "death" : 1,
      "followup_time" : "P14M"
   },
   "name" : "Sample BRCA-00429, 2nd replicate",
   "project_id" : "ind-cnhl-1293347-004",
   "updated" : "2017-10-25T07:06:03Z"
}
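
As a minimal sketch of how such a record might be consumed, the following Python snippet loads an abbreviated copy of the example and parses its ISO 8601 timestamps. The helper name `parse_timestamp` is hypothetical and not part of the schema. The trailing "Z" is rewritten because `datetime.fromisoformat` only accepts it from Python 3.11 onwards.

```python
import json
from datetime import datetime

# Abbreviated copy of the example record above.
RECORD_JSON = """
{
  "id": "AM_BS__NCBISKYCGH-1993",
  "individual_id": "ind-cnhl-1293347-004",
  "created": "2017-10-25T07:06:03Z",
  "updated": "2017-10-25T07:06:03Z"
}
"""


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as '2017-10-25T07:06:03Z'."""
    # Rewrite the 'Z' suffix so that older Python versions can parse it.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


biosample = json.loads(RECORD_JSON)
created = parse_timestamp(biosample["created"])
updated = parse_timestamp(biosample["updated"])

# A record should never have been updated before it was created.
is_consistent = updated >= created
print(biosample["id"], created.isoformat(), is_consistent)
```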

Notes and examples on the Biosample properties

age_at_collection
"age_at_collection" : {
   "age" : "P56Y",
   "age_class" : {
      "id" : "HP:0003621",
      "label" : "Juvenile onset"
   }
}
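
The "age" value is an ISO 8601 period string. A rough sketch of converting such a value to a number of years could look like the following; the function `age_in_years` is a hypothetical helper, it only handles the year, month and day designators, and the month and day conversions are approximations.

```python
import re

# Date part of an ISO 8601 period: optional years, months and days.
_PERIOD = re.compile(r"^P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?$")


def age_in_years(period: str) -> float:
    """Convert a period such as 'P56Y' or 'P14M' to approximate years."""
    match = _PERIOD.match(period)
    # Reject strings that do not match or that contain no component at all.
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported ISO 8601 period: {period!r}")
    years, months, days = (int(g) if g else 0 for g in match.groups())
    # Approximation: 12 months per year, 365.25 days per year.
    return years + months / 12 + days / 365.25


print(age_in_years("P56Y"))  # 56.0
```
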
biocharacteristics
"biocharacteristics" : [
   {
      "description" : "Adenocarcinoma of the pancreas",
      "type" : {
         "id" : "pgx:81403",
         "label" : "Adenocarcinoma, NOS"
      }
   },
   {
      "description" : "Pancreatic Adenocarcinoma",
      "type" : {
         "id" : "ncit:C8294",
         "label" : "Pancreatic Adenocarcinoma"
      }
   }
]
db.biosamples.find( { "biocharacteristics.type.id" : "ncit:C8294" } )

The following call to the distinct function returns all biocharacteristics type ids for samples that have at least one ncit id; to retrieve only the ncit ids, the result has to be passed through a regex filter (/^ncit/) afterwards.

db.biosamples.distinct( "biocharacteristics.type.id", { "biocharacteristics.type.id" : { $regex : /ncit/ } } )
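
The behaviour described above can be illustrated without a database. The sketch below mimics the distinct call over a few in-memory records and then applies the /^ncit/ filter; the sample ids (`bs-1` etc.) and the function `distinct_type_ids` are hypothetical, and the result is sorted only to make the output stable.

```python
import re

# Hypothetical in-memory stand-in for the "biosamples" collection.
biosamples = [
    {"id": "bs-1", "biocharacteristics": [
        {"type": {"id": "pgx:81403", "label": "Adenocarcinoma, NOS"}},
        {"type": {"id": "ncit:C8294", "label": "Pancreatic Adenocarcinoma"}},
    ]},
    {"id": "bs-2", "biocharacteristics": [
        {"type": {"id": "ncit:C4018", "label": "Lobular Breast Carcinoma In Situ"}},
    ]},
    {"id": "bs-3", "biocharacteristics": [
        {"type": {"id": "pgx:81403", "label": "Adenocarcinoma, NOS"}},
    ]},
]


def distinct_type_ids(samples, pattern):
    """Collect all type ids from samples having at least one matching id."""
    regex = re.compile(pattern)
    ids = set()
    for sample in samples:
        sample_ids = [b["type"]["id"] for b in sample.get("biocharacteristics", [])]
        # The query selects whole samples, so all of their ids are returned.
        if any(regex.search(type_id) for type_id in sample_ids):
            ids.update(sample_ids)
    return sorted(ids)


all_ids = distinct_type_ids(biosamples, r"ncit")
# Second step: keep only the ids that start with the ncit prefix.
ncit_ids = [type_id for type_id in all_ids if re.match(r"^ncit", type_id)]

print(all_ids)   # ['ncit:C4018', 'ncit:C8294', 'pgx:81403']
print(ncit_ids)  # ['ncit:C4018', 'ncit:C8294']
```

Note that `pgx:81403` appears in the first result because it belongs to a sample that also carries an ncit id.
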
created
"created" : "2017-10-25T07:06:03Z"
data_use_conditions
"data_use_conditions" : {
   "id" : "DUO:0000004",
   "label" : "no restriction"
}
description
"description" : "Burkitt lymphoma, cell line Namalwa"
external_references
"external_references" : [
   {
      "description" : "Cellosaurus cell line identifier",
      "relation" : "provenance",
      "type" : {
         "id" : "cellosaurus:CVCL_0312",
         "label" : "HOS"
      }
   },
   {
      "description" : "PubMed reference",
      "relation" : "report",
      "type" : {
         "id" : "pubmed:2823272",
         "label" : "Rearrangement of the p53 gene in human osteogenic sarcomas."
      }
   }
]
db.biosamples.find( { "external_references.type.id" : "pubmed:17440070" } )
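
Because the external identifiers are prefixed, they can be split into a prefix and a local id and, for prefixes registered at identifiers.org, turned into a resolver link. The following is a sketch under that assumption; `split_prefixed_id` and `resolver_url` are hypothetical helper names, and whether a given prefix actually resolves is not checked.

```python
from typing import Tuple

# The two references from the example above, reduced to the relevant fields.
external_references = [
    {"relation": "provenance",
     "type": {"id": "cellosaurus:CVCL_0312", "label": "HOS"}},
    {"relation": "report",
     "type": {"id": "pubmed:2823272",
              "label": "Rearrangement of the p53 gene in human osteogenic sarcomas."}},
]


def split_prefixed_id(prefixed_id: str) -> Tuple[str, str]:
    """Split 'prefix:local_id' at the first colon."""
    prefix, sep, local_id = prefixed_id.partition(":")
    if not (sep and prefix and local_id):
        raise ValueError(f"not a prefixed identifier: {prefixed_id!r}")
    return prefix, local_id


def resolver_url(prefixed_id: str) -> str:
    """Build an identifiers.org link (assumes the prefix is registered there)."""
    prefix, local_id = split_prefixed_id(prefixed_id)
    return f"https://identifiers.org/{prefix}:{local_id}"


# Select the ids of all references with the relation "report".
reports = [ref["type"]["id"] for ref in external_references
           if ref["relation"] == "report"]
for ref_id in reports:
    print(ref_id, "->", resolver_url(ref_id))
```
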
geo_provenance
"geo_provenance" : {
   "altitude" : 94,
   "city" : "Timisoara",
   "country" : "Romania",
   "label" : "Str Marasesti 5, 300077 Timisoara, Romania",
   "latitude" : 45.75,
   "longitude" : 21.23,
   "precision" : "address"
}
id
"id" : "AM_BS__NCBISKYCGH-1993"
individual_id
"individual_id" : "ind-cnhl-1293347-004"
info
"info" : {
   "death" : 1,
   "followup_time" : "P14M"
}
db.biosamples.find( { "info.followup_time" : { $regex : /^P/ }, "info.death" : 1 } )
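
Since "info" has no fixed structure, code reading it should not rely on any key being present. The sketch below selects samples with a recorded follow-up period and a death flag, assuming that "info" holds plain values as in the example above ("followup_time" as an ISO 8601 period string, "death" as 0 or 1). The sample ids and the function name `deceased_with_followup` are hypothetical.

```python
# Hypothetical in-memory records; "info" may be incomplete or missing.
biosamples = [
    {"id": "bs-1", "info": {"death": 1, "followup_time": "P14M"}},
    {"id": "bs-2", "info": {"death": 0, "followup_time": "P30M"}},
    {"id": "bs-3", "info": {"death": 1}},
    {"id": "bs-4"},
]


def deceased_with_followup(samples):
    """Return ids of samples with a follow-up period and death == 1."""
    selected = []
    for sample in samples:
        info = sample.get("info") or {}
        followup = info.get("followup_time")
        # ISO 8601 periods start with "P"; anything else is ignored.
        has_followup = isinstance(followup, str) and followup.startswith("P")
        if has_followup and info.get("death") == 1:
            selected.append(sample["id"])
    return selected


print(deceased_with_followup(biosamples))  # ['bs-1']
```
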
name
"name" : "Sample BRCA-00429, 2nd replicate"
project_id
"project_id" : "ind-cnhl-1293347-004"
updated
"updated" : "2022-11-11T09:45:13Z"