Curate AnnData based on the CELLxGENE schema
This guide shows how to curate an AnnData object with the help of laminlabs/cellxgene
and the CELLxGENE schema v5.1.0.
Initialize a LaminDB instance in which to register the curated AnnData:
# !pip install 'lamindb[bionty,jupyter]' cellxgene-lamin cellxgene-schema
!lamin init --storage ./test-cellxgene-curate --schema bionty
💡 connected lamindb: testuser1/test-cellxgene-curate
import lamindb as ln
import cellxgene_lamin as cxg
💡 connected lamindb: testuser1/test-cellxgene-curate
❗ Full backed capabilities are not available for this version of anndata, please install anndata>=0.9.1.
Let's start with an AnnData object that we'd like to inspect and curate:
adata = cxg.datasets.anndata_human_immune_cells(populate_registries=True)
adata.write_h5ad("anndata_human_immune_cells.h5ad")
adata
AnnData object with n_obs × n_vars = 1626 × 36503
obs: 'donor', 'tissue', 'cell_type', 'assay', 'sex_ontology_term_id', 'organism'
var: 'feature_is_filtered'
uns: 'default_embedding'
obsm: 'X_umap'
!cellxgene-schema validate anndata_human_immune_cells.h5ad
Loading dependencies
Loading validator modules
Starting validation...
WARNING: Validation of raw layer was not performed due to current errors, try again after fixing current errors.
ERROR: Add labels error: Column 'cell_type' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: Add labels error: Column 'assay' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: Add labels error: Column 'organism' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: Add labels error: Column 'tissue' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: 'title' in 'uns' is not present.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'raw.var'.
ERROR: Dataframe 'obs' is missing column 'cell_type_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'assay_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'disease_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'organism_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'tissue_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'self_reported_ethnicity_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'development_stage_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'is_primary_data'.
ERROR: Dataframe 'obs' is missing column 'donor_id'.
ERROR: Dataframe 'obs' is missing column 'suspension_type'.
ERROR: Dataframe 'obs' is missing column 'tissue_type'.
Validation complete in 0:00:00.314460 with status is_valid=False
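The validator log above is long mainly because every unrecognized Ensembl ID is reported once for 'var' and once for 'raw.var'. To see the distinct classes of problems, the log can be tallied by message kind. The following is a minimal sketch and not part of cellxgene-schema: the function name `summarize_errors`, the `SLOTS` set, and the `'<X>'` placeholder are illustrative assumptions, and the sample lines are copied from the output above.

```python
import re
from collections import Counter

# AnnData slot names that should stay visible in the summarized message.
SLOTS = {"obs", "var", "raw.var", "uns"}


def _mask(match: re.Match) -> str:
    """Keep quoted slot names, replace any other quoted token with a placeholder."""
    token = match.group(1)
    return match.group(0) if token in SLOTS else "'<X>'"


def summarize_errors(log_text: str) -> Counter:
    """Count ERROR lines by kind, ignoring the specific column name or gene ID."""
    counts = Counter()
    for line in log_text.splitlines():
        if not line.startswith("ERROR:"):
            continue
        message = line[len("ERROR:"):].strip()
        counts[re.sub(r"'([^']*)'", _mask, message)] += 1
    return counts


sample_log = """\
ERROR: Add labels error: Column 'cell_type' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: 'title' in 'uns' is not present.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
ERROR: Dataframe 'obs' is missing column 'donor_id'.
"""

summary = summarize_errors(sample_log)
for kind, n in summary.most_common():
    print(f"{n:>3}  {kind}")
```

Applied to the full log, this kind of tally shows a handful of problem classes (reserved column names, a missing 'title', invalid feature IDs, missing 'obs' columns), which is what the curation steps below address.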
Validate and curate metadata
Validate the AnnData object:
try:
    curate = cxg.Curate(adata)
except Exception as e:
    print(e)
Columns {'self_reported_ethnicity', 'disease', 'development_stage', 'donor_id', 'tissue_type', 'suspension_type'} are not found in the data object!
Let's fix the "donor_id" column name by renaming the existing "donor" column:
adata.obs.rename(columns={"donor": "donor_id"}, inplace=True)
For the remaining missing columns, we can pass the default values suggested by CELLxGENE:
cxg.CellxGeneFields.OBS_FIELD_DEFAULTS
{'disease': 'normal',
'development_stage': 'unknown',
'self_reported_ethnicity': 'unknown',
'suspension_type': 'cell',
'donor_id': 'na',
'tissue_type': 'tissue',
'cell_type': 'native_cell',
'sex': 'unknown',
'organism': 'unknown'}
curate = cxg.Curate(adata, defaults=cxg.CellxGeneFields.OBS_FIELD_DEFAULTS, organism="human")
💡 added defaults to the AnnData object: {'disease': 'normal', 'development_stage': 'unknown', 'self_reported_ethnicity': 'unknown', 'suspension_type': 'cell', 'tissue_type': 'tissue'}
✅ added 1 record with Feature.name for columns: 'sex_ontology_term_id'
✅ added 10 records from laminlabs/cellxgene with Feature.name for columns: 'assay', 'cell_type', 'development_stage', 'disease', 'donor_id', 'self_reported_ethnicity', 'tissue', 'organism', 'tissue_type', 'suspension_type'
✅ added 230 records from laminlabs/cellxgene with Gene.ensembl_gene_id for var_index: 'ENSG00000112096', 'ENSG00000182230', 'ENSG00000203441', 'ENSG00000203812', 'ENSG00000204092', 'ENSG00000214970', 'ENSG00000215067', 'ENSG00000215271', 'ENSG00000221995', 'ENSG00000223458', 'ENSG00000223797', 'ENSG00000224167', 'ENSG00000224247', 'ENSG00000224739', 'ENSG00000224745', 'ENSG00000225205', 'ENSG00000225932', 'ENSG00000226277', 'ENSG00000226362', 'ENSG00000226377', ...
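Note that only five of the nine defaults were added: fields that already exist in `adata.obs` are left alone. The sketch below reproduces that selection in plain Python. It is an illustration, not the cellxgene_lamin implementation: the helper `defaults_to_apply` is hypothetical, and the rule that a field counts as present when either `<field>` or `<field>_ontology_term_id` is a column is an assumption inferred from the log message above.

```python
def defaults_to_apply(obs_columns, defaults):
    """Return the defaults whose field is absent from obs_columns.

    Assumption: a field counts as present if either `<field>` or
    `<field>_ontology_term_id` is among the columns.
    """
    present = set(obs_columns)
    return {
        field: value
        for field, value in defaults.items()
        if field not in present and f"{field}_ontology_term_id" not in present
    }


# Columns of adata.obs after renaming 'donor' to 'donor_id'.
obs_columns = [
    "donor_id", "tissue", "cell_type", "assay", "sex_ontology_term_id", "organism",
]

# Copied from the printed value of cxg.CellxGeneFields.OBS_FIELD_DEFAULTS above.
defaults = {
    "disease": "normal",
    "development_stage": "unknown",
    "self_reported_ethnicity": "unknown",
    "suspension_type": "cell",
    "donor_id": "na",
    "tissue_type": "tissue",
    "cell_type": "native_cell",
    "sex": "unknown",
    "organism": "unknown",
}

applied = defaults_to_apply(obs_columns, defaults)
print(applied)
```

Under these assumptions, `applied` contains the same five fields that the log reports as added.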
curate.categoricals
{'assay': FieldAttr(ExperimentalFactor.name),
'cell_type': FieldAttr(CellType.name),
'development_stage': FieldAttr(DevelopmentalStage.name),
'disease': FieldAttr(Disease.name),
'donor_id': FieldAttr(ULabel.name),
'self_reported_ethnicity': FieldAttr(Ethnicity.name),
'sex_ontology_term_id': FieldAttr(Phenotype.ontology_id),
'suspension_type': FieldAttr(ULabel.name),
'tissue': FieldAttr(Tissue.name),
'tissue_type': FieldAttr(ULabel.name),
'organism': FieldAttr(Organism.name)}
validated = curate.validate(organism="human")
validated
💡 validating metadata using registries of instance laminlabs/cellxgene
✅ var_index is validated against Gene.ensembl_gene_id
💡 mapping assay on ExperimentalFactor.name
❗ found 3 terms validated terms: ["10x 3' v3", "10x 5' v2", "10x 5' v1"]
→ save terms via .add_validated_from('assay')
✅ assay is validated against ExperimentalFactor.name
✅ cell_type is validated against CellType.name
💡 mapping development_stage on DevelopmentalStage.name
❗ found 1 terms validated terms: ['unknown']
→ save terms via .add_validated_from('development_stage')
✅ development_stage is validated against DevelopmentalStage.name
💡 mapping disease on Disease.name
❗ found 1 terms validated terms: ['normal']
→ save terms via .add_validated_from('disease')
✅ disease is validated against Disease.name
💡 mapping donor_id on ULabel.name
❗ 12 terms are not validated: 'D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', 'A31-1', '582C-1'
→ save terms via .add_new_from('donor_id')
💡 mapping self_reported_ethnicity on Ethnicity.name
❗ found 1 terms validated terms: ['unknown']
→ save terms via .add_validated_from('self_reported_ethnicity')
✅ self_reported_ethnicity is validated against Ethnicity.name
💡 mapping sex_ontology_term_id on Phenotype.ontology_id
❗ found 1 terms validated terms: ['PATO:0000384']
→ save terms via .add_validated_from('sex_ontology_term_id')
✅ sex_ontology_term_id is validated against Phenotype.ontology_id
💡 mapping suspension_type on ULabel.name
❗ found 1 terms validated terms: ['cell']
→ save terms via .add_validated_from('suspension_type')
✅ suspension_type is validated against ULabel.name
💡 mapping tissue on Tissue.name
❗ found 16 terms validated terms: ['blood', 'thoracic lymph node', 'spleen', 'mesenteric lymph node', 'lamina propria', 'liver', 'jejunal epithelium', 'omentum', 'bone marrow', 'ileum', 'caecum', 'thymus', 'skeletal muscle tissue', 'duodenum', 'sigmoid colon', 'transverse colon']
→ save terms via .add_validated_from('tissue')
❗ 1 terms is not validated: 'lungg'
→ save terms via .add_new_from('tissue')
💡 mapping tissue_type on ULabel.name
❗ found 1 terms validated terms: ['tissue']
→ save terms via .add_validated_from('tissue_type')
✅ tissue_type is validated against ULabel.name
✅ organism is validated against Organism.name
False
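Conceptually, each "mapping ... on ..." step in the log checks the unique values of a column against the names in a registry, and the overall result is `False` as long as any value remains unmatched. The sketch below illustrates that idea with plain Python. It is not how lamindb implements validation: `split_validated` is a hypothetical helper, and the short `registry` list is a stand-in for the Tissue registry.

```python
def split_validated(values, registry_names):
    """Partition unique values into (validated, non_validated), keeping first-seen order."""
    known = set(registry_names)
    validated, non_validated = [], []
    for value in dict.fromkeys(values):  # de-duplicate while preserving order
        (validated if value in known else non_validated).append(value)
    return validated, non_validated


# Hypothetical stand-in for the names stored in a Tissue registry.
registry = ["blood", "spleen", "liver", "lung", "thymus"]

# Values as they might appear in adata.obs["tissue"], including the typo.
observed = ["blood", "lungg", "spleen", "blood", "liver"]

validated, non_validated = split_validated(observed, registry)
is_valid = not non_validated

print("validated:", validated)
print("not validated:", non_validated)
print("is_valid:", is_valid)
```

In this picture, a single unmatched term such as 'lungg' is enough to make the whole object fail validation, which matches the `False` returned above.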
Register new metadata labels
Following the suggestions above, register the genes and labels that aren't yet present in the current instance:
(Note that our instance is rather empty. Once the registries are populated, registering new labels is rarely needed.)
curate.add_validated_from("all")
💡 saving labels for 'assay'
✅ added 3 records from laminlabs/cellxgene with ExperimentalFactor.name for assay: '10x 5' v1', '10x 5' v2', '10x 3' v3'
💡 saving labels for 'cell_type'
💡 saving labels for 'development_stage'
✅ added 1 record from laminlabs/cellxgene with DevelopmentalStage.name for development_stage: 'unknown'
💡 saving labels for 'disease'
✅ added 1 record from laminlabs/cellxgene with Disease.name for disease: 'normal'
💡 saving labels for 'donor_id'
❗ 12 non-validated categories are not saved in ULabel.name: ['D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', 'A31-1', '582C-1']!
→ to lookup categories, use lookup().donor_id
→ to save, run .add_new_from('donor_id')
💡 saving labels for 'self_reported_ethnicity'
✅ added 1 record from laminlabs/cellxgene with Ethnicity.name for self_reported_ethnicity: 'unknown'
💡 saving labels for 'sex_ontology_term_id'
✅ added 1 record from laminlabs/cellxgene with Phenotype.ontology_id for sex_ontology_term_id: 'PATO:0000384'
💡 saving labels for 'suspension_type'
✅ added 1 record from laminlabs/cellxgene with ULabel.name for suspension_type: 'cell'
💡 saving labels for 'tissue'
❗ 1 non-validated categories are not saved in Tissue.name: ['lungg']!
→ to lookup categories, use lookup().tissue
→ to save, run .add_new_from('tissue')
✅ added 16 records from laminlabs/cellxgene with Tissue.name for tissue: 'spleen', 'sigmoid colon', 'jejunal epithelium', 'bone marrow', 'skeletal muscle tissue', 'transverse colon', 'thymus', 'liver', 'duodenum', 'lamina propria', 'mesenteric lymph node', 'caecum', 'omentum', 'blood', 'ileum', 'thoracic lymph node'
💡 saving labels for 'tissue_type'
✅ added 1 record from laminlabs/cellxgene with ULabel.name for tissue_type: 'tissue'
💡 saving labels for 'organism'
For donors, we register the new labels:
curate.add_new_from("donor_id")
✅ added 12 records with ULabel.name for donor_id: 'D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', 'A31-1', '582C-1'
The tissue label "lungg" was reported as not validated. It is a typo and should be "lung". Let's fix it:
tissues = curate.lookup().tissue
# using a lookup object to find the correct term
tissues.lung
Tissue(uid='7Tt4iEKc', name='lung', ontology_id='UBERON:0002048', synonyms='pulmo', description='Respiration Organ That Develops As An Outpocketing Of The Esophagus.', created_by_id=1, source_id=47, updated_at='2024-01-08 15:22:49 UTC')
adata.obs["tissue"] = adata.obs["tissue"].cat.rename_categories(
    {"lungg": tissues.lung.name}
)
curate.add_validated_from("tissue")
✅ added 1 record from laminlabs/cellxgene with Tissue.name for tissue: 'lung'
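The lookup used here works well when the intended term is obvious. When it is not, a fuzzy string match can propose candidates for a non-validated value. The sketch below uses `difflib.get_close_matches` from the Python standard library; the helper `suggest_terms` and the hard-coded `tissue_names` list are illustrative assumptions and stand in for names that would come from the registry.

```python
from difflib import get_close_matches


def suggest_terms(value, registry_names, n=3, cutoff=0.6):
    """Return up to n registry names that are similar to value, best match first."""
    return get_close_matches(value, registry_names, n=n, cutoff=cutoff)


# Hypothetical subset of validated tissue names.
tissue_names = ["blood", "spleen", "liver", "lung", "thymus", "ileum", "caecum"]

suggestions = suggest_terms("lungg", tissue_names)
print(suggestions)
```

A suggestion is only a hint: the final mapping should still be confirmed by a person before categories are renamed.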
Let's validate the object again:
validated = curate.validate()
validated
💡 validating metadata using registries of instance laminlabs/cellxgene
✅ var_index is validated against Gene.ensembl_gene_id
✅ assay is validated against ExperimentalFactor.name
✅ cell_type is validated against CellType.name
✅ development_stage is validated against DevelopmentalStage.name
✅ disease is validated against Disease.name
✅ donor_id is validated against ULabel.name
✅ self_reported_ethnicity is validated against Ethnicity.name
✅ sex_ontology_term_id is validated against Phenotype.ontology_id
✅ suspension_type is validated against ULabel.name
✅ tissue is validated against Tissue.name
✅ tissue_type is validated against ULabel.name
✅ organism is validated against Organism.name
True
adata.obs.head()
| | donor_id | tissue | cell_type | assay | sex_ontology_term_id | organism | disease | development_stage | self_reported_ethnicity | suspension_type | tissue_type |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CZINY-0109_CTGGTCTAGTCTGTAC | D496-1 | blood | classical monocyte | 10x 3' v3 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
| CZI-IA10244332+CZI-IA10244434_CCTTCGACATACTCTT | 621B-1 | thoracic lymph node | T follicular helper cell | 10x 5' v2 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
| Pan_T7935491_CTGGTCTGTACATGTC | A29-1 | spleen | memory B cell | 10x 5' v1 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
| Pan_T7980367_GGGCATCCAGGTGGAT | A36-1 | lung | alveolar macrophage | 10x 5' v1 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
| Pan_T7935494_ATCATGGTCTACCTGC | A29-1 | mesenteric lymph node | naive thymus-derived CD4-positive, alpha-beta ... | 10x 5' v1 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
Save artifact
artifact = curate.save_artifact(description="test h5ad file")
❗ no run & transform get linked, consider calling ln.track()
💡 path content will be copied to default storage upon `save()` with key `None` ('.lamindb/wMDQPhwdtZ9Bnt6oPMls.h5ad')
✅ storing artifact 'wMDQPhwdtZ9Bnt6oPMls' at '/home/runner/work/cellxgene-lamin/cellxgene-lamin/docs/test-cellxgene-curate/.lamindb/wMDQPhwdtZ9Bnt6oPMls.h5ad'
💡 you can auto-track these data as a run input by calling `ln.track()`
💡 parsing feature names of X stored in slot 'var'
✅ 36503 terms (100.00%) are validated for ensembl_gene_id
✅ linked: FeatureSet(uid='qcJKmNXydMBUjLmxFdsT', n=36503, dtype='float', registry='bionty.Gene', hash='xtVNbbhs3ty63qs-rwKZ', created_by_id=1)
💡 parsing feature names of slot 'obs'
✅ 11 terms (100.00%) are validated for name
✅ linked: FeatureSet(uid='agF8FgjckEZRYrGIAZTw', n=11, registry='Feature', hash='uQw586KYRgQdtZZqvXny', created_by_id=1)
✅ saved 2 feature sets for slots: 'var','obs'
artifact.describe()
Artifact(uid='wMDQPhwdtZ9Bnt6oPMls', description='test h5ad file', suffix='.h5ad', type='dataset', accessor='AnnData', size=54727155, hash='5esmrdu-DFv9nKyK4ZFA0G', hash_type='sha1-fl', n_observations=1626, visibility=1, key_is_virtual=True, updated_at='2024-07-29 14:16:01 UTC')
Provenance
.created_by = 'testuser1'
.storage = '/home/runner/work/cellxgene-lamin/cellxgene-lamin/docs/test-cellxgene-curate'
Labels
.organisms = 'human'
.tissues = 'spleen', 'sigmoid colon', 'jejunal epithelium', 'bone marrow', 'skeletal muscle tissue', 'transverse colon', 'thymus', 'liver', 'duodenum', 'lamina propria', ...
.cell_types = 'classical monocyte', 'T follicular helper cell', 'memory B cell', 'alveolar macrophage', 'naive thymus-derived CD4-positive, alpha-beta T cell', 'effector memory CD8-positive, alpha-beta T cell, terminally differentiated', 'alpha-beta T cell', 'CD4-positive helper T cell', 'naive thymus-derived CD8-positive, alpha-beta T cell', 'macrophage', ...
.diseases = 'normal'
.phenotypes = 'male'
.experimental_factors = '10x 5' v1', '10x 5' v2', '10x 3' v3'
.developmental_stages = 'unknown'
.ethnicities = 'unknown'
.ulabels = 'cell', 'tissue', 'D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', ...
Features
'assay' = '10x 5' v1', '10x 5' v2', '10x 3' v3'
'cell_type' = 'classical monocyte', 'T follicular helper cell', 'memory B cell', 'alveolar macrophage', 'naive thymus-derived CD4-positive, alpha-beta T cell', 'effector memory CD8-positive, alpha-beta T cell, terminally differentiated', 'alpha-beta T cell', 'CD4-positive helper T cell', 'naive thymus-derived CD8-positive, alpha-beta T cell', 'macrophage', ...
'development_stage' = 'unknown'
'disease' = 'normal'
'donor_id' = 'D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', ...
'organism' = 'human'
'self_reported_ethnicity' = 'unknown'
'sex_ontology_term_id' = 'male'
'suspension_type' = 'cell'
'tissue' = 'spleen', 'sigmoid colon', 'jejunal epithelium', 'bone marrow', 'skeletal muscle tissue', 'transverse colon', 'thymus', 'liver', 'duodenum', 'lamina propria', ...
'tissue_type' = 'tissue'
Feature sets
'var' = 'MIR1302-2HG', 'FAM138A', 'OR4F5', 'None', 'OR4F29', 'OR4F16', 'LINC01409', 'FAM87B', 'LINC01128', 'LINC00115', 'FAM41C'
'obs' = 'assay', 'cell_type', 'development_stage', 'disease', 'donor_id', 'self_reported_ethnicity', 'tissue', 'organism', 'tissue_type', 'suspension_type', 'sex_ontology_term_id'
The following step is optional: it mimics the way CELLxGENE creates collections of AnnData
objects to link them to studies.
# register a new collection
collection = curate.save_collection(
    [artifact],  # registered artifact above, can also pass a list of artifacts
    name=(  # title of the publication
        "Cross-tissue immune cell analysis reveals tissue-specific features in humans"
        " (for test demo only)"
    ),
    description="10.1126/science.abl5197",  # DOI of the publication
    reference="E-MTAB-11536",  # accession number (e.g. GSE#, E-MTAB#, etc.)
    reference_type="ArrayExpress",  # source type (e.g. GEO, ArrayExpress, SRA, etc.)
)
❗ no run & transform get linked, consider calling ln.track()
💡 you can auto-track these data as a run input by calling `ln.track()`
Return an input h5ad file for cellxgene-schema
adata_cxg = curate.to_cellxgene(is_primary_data=True)
adata_cxg
AnnData object with n_obs × n_vars = 1626 × 36503
obs: 'donor_id', 'sex_ontology_term_id', 'suspension_type', 'tissue_type', 'tissue_ontology_term_id', 'cell_type_ontology_term_id', 'assay_ontology_term_id', 'organism_ontology_term_id', 'disease_ontology_term_id', 'development_stage_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'is_primary_data'
var: 'feature_is_filtered'
uns: 'default_embedding', 'title', 'cxg_lamin_schema_reference', 'cxg_lamin_schema_version'
obsm: 'X_umap'
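Comparing the `obs` columns of the curated object with those of `adata_cxg` shows the shape of the conversion: name-based columns that are backed by an ontology are replaced by `<name>_ontology_term_id` columns, and `is_primary_data` is added. The sketch below reproduces only this column-level change. It is not the cellxgene_lamin implementation: `cellxgene_obs_columns` is a hypothetical helper, and the `ONTOLOGY_BACKED` set is inferred from the two printed outputs.

```python
# Fields assumed to be ontology-backed, inferred by comparing the obs columns
# before and after curate.to_cellxgene(); not taken from the library itself.
ONTOLOGY_BACKED = {
    "assay",
    "cell_type",
    "development_stage",
    "disease",
    "organism",
    "self_reported_ethnicity",
    "tissue",
}


def cellxgene_obs_columns(curated_columns):
    """Map curated obs column names to the column names expected after conversion."""
    renamed = [
        f"{col}_ontology_term_id" if col in ONTOLOGY_BACKED else col
        for col in curated_columns
    ]
    return renamed + ["is_primary_data"]


# Columns of adata.obs after curation, as shown in the table earlier.
curated_columns = [
    "donor_id", "tissue", "cell_type", "assay", "sex_ontology_term_id", "organism",
    "disease", "development_stage", "self_reported_ethnicity", "suspension_type",
    "tissue_type",
]

converted = cellxgene_obs_columns(curated_columns)
print(converted)
```

The resulting set of names matches the `obs` columns printed for `adata_cxg`, although the column order differs. The actual conversion also replaces the values with ontology term IDs, which this sketch does not attempt.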
adata_cxg.write_h5ad("anndata_human_immune_cells_cxg.h5ad")
!cellxgene-schema validate anndata_human_immune_cells_cxg.h5ad
Loading dependencies
Loading validator modules
Starting validation...
WARNING: Validation of raw layer was not performed due to current errors, try again after fixing current errors.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'raw.var'.
Validation complete in 0:00:03.144628 with status is_valid=False
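The remaining errors all follow one pattern, and the same feature IDs appear to be reported for both 'var' and 'raw.var'. A minimal sketch for summarizing such a log is shown below, assuming the validator output has been captured as text; `collect_invalid_feature_ids` is a hypothetical helper, and the sample log is a short excerpt of the output above.

```python
import re
from collections import defaultdict

# Matches the error lines shown in the validator output above.
_INVALID_ID = re.compile(r"ERROR: '([^']+)' is not a valid feature ID in '([^']+)'\.")


def collect_invalid_feature_ids(log_text):
    """Group invalid feature IDs by the slot ('var', 'raw.var') that reported them."""
    invalid = defaultdict(list)
    for line in log_text.splitlines():
        match = _INVALID_ID.search(line)
        if match:
            feature_id, slot = match.groups()
            invalid[slot].append(feature_id)
    return dict(invalid)


# Short excerpt of the validator output, used here as sample input.
sample_log = """\
Starting validation...
WARNING: Validation of raw layer was not performed due to current errors, try again after fixing current errors.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
Validation complete in 0:00:03.144628 with status is_valid=False
"""

invalid = collect_invalid_feature_ids(sample_log)
for slot, ids in invalid.items():
    print(f"{slot}: {len(ids)} invalid feature IDs, first is {ids[0]}")
```

The collected IDs give a compact list of genes to review. Whether to drop or remap them before re-running the validator depends on the dataset.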
Note
The Curate class is designed to validate all metadata for adherence to ontologies. It does not reimplement every rule of the CELLxGENE schema, so we recommend also running the cellxgene-schema validator when full adherence beyond metadata is required.