05:06:2024 20:55
sequence
1.2
Evan Christensen
definition
term replaced by
amino acid modification
Alliance of Genome Resources
Alliance of Genome Resources Gene Biotype Slim
biosapiens
database of genomic structural variation
RNA modification
SO feature annotation
variant annotation term
amino acid 1 letter code
amino acid 3 letter code
biosapiens protein feature ontology
dbsnp variant terms
DBVAR
ensembl variant terms
subset_property
synonym_type_property
consider
has_alternative_id
has_broad_synonym
database_cross_reference
has_exact_synonym
has_narrow_synonym
has_obo_format_version
has_obo_namespace
has_related_synonym
has_scope
has_synonym_type
in_subset
A geometric operator, specified in Egenhofer 1989. Two features meet if they share a junction on the sequence. X adjacent_to Y iff X and Y share a boundary but do not overlap.
sequence
adjacent_to
adjacent_to
A geometric operator, specified in Egenhofer 1989. Two features meet if they share a junction on the sequence. X adjacent_to Y iff X and Y share a boundary but do not overlap.
PMID:20226267
SO:ke
sequence
associated_with
This relationship is vague and up for discussion.
associated_with
B is complete_evidence_for_feature A if the extent (5' and 3' boundaries) and internal boundaries of B fully support the extent and internal boundaries of A.
sequence
complete_evidence_for_feature
If A is a feature with multiple regions such as a multi exon transcript, the supporting EST evidence is complete if each of the regions is supported by an equivalent region in B. Also there must be no extra regions in B that are not represented in A. This relationship was requested by jeltje on the SO term tracker. The thread for the discussion can be accessed via tracker ID:1917222.
complete_evidence_for_feature
B is complete_evidence_for_feature A if the extent (5' and 3' boundaries) and internal boundaries of B fully support the extent and internal boundaries of A.
SO:ke
X connects_on Y, Z, R iff whenever Z is on a R, X is adjacent to a Y and adjacent to a Z.
kareneilbeck
2010-10-14T01:38:51Z
sequence
connects_on
Example: A splice_junction connects_on exon, exon, mature_transcript.
connects_on
X connects_on Y, Z, R iff whenever Z is on a R, X is adjacent to a Y and adjacent to a Z.
PMID:20226267
X contained_by Y iff X starts after start of Y and X ends before end of Y.
kareneilbeck
2010-10-14T01:26:16Z
sequence
contained_by
The inverse is contains. Example: intein contained_by immature_peptide_region.
contained_by
X contained_by Y iff X starts after start of Y and X ends before end of Y.
PMID:20226267
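As a rough illustration, contained_by and its inverse contains can be checked on interval coordinates. The sketch below assumes features are half-open integer intervals (start inclusive, end exclusive) on a single sequence; the Feature type and the function names are hypothetical helpers, not part of SO.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """Hypothetical feature: half-open interval [start, end) on one sequence."""
    start: int
    end: int


def contained_by(x: Feature, y: Feature) -> bool:
    # Literal reading of the definition: X starts after the start of Y
    # and X ends before the end of Y (both comparisons are strict).
    return x.start > y.start and x.end < y.end


def contains(y: Feature, x: Feature) -> bool:
    # The inverse relation: Y contains X iff X contained_by Y.
    return contained_by(x, y)


# Made-up coordinates for the example "intein contained_by immature_peptide_region".
intein = Feature(120, 480)
immature_peptide_region = Feature(0, 900)
print(contained_by(intein, immature_peptide_region))  # True
```

Under this strict reading, a feature that shares a boundary with Y is not contained_by Y.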
The inverse of contained_by.
kareneilbeck
2010-10-14T01:32:15Z
sequence
contains
Example: pre_miRNA contains miRNA_loop.
contains
The inverse of contained_by.
PMID:20226267
sequence
derives_from
derives_from
X is disconnected_from Y iff it is not the case that X overlaps Y.
kareneilbeck
2010-10-14T01:42:10Z
sequence
disconnected_from
disconnected_from
X is disconnected_from Y iff it is not the case that X overlaps Y.
PMID:20226267
kareneilbeck
2009-08-19T02:19:45Z
sequence
edited_from
edited_from
kareneilbeck
2009-08-19T02:19:11Z
sequence
edited_to
edited_to
B is evidence_for_feature A, if an instance of B supports the existence of A.
sequence
evidence_for_feature
This relationship was requested by nlw on the SO term tracker. The thread for the discussion can be accessed via tracker ID:1917222.
evidence_for_feature
B is evidence_for_feature A, if an instance of B supports the existence of A.
SO:ke
X is exemplar of Y if X is the best evidence for Y.
sequence
exemplar_of
Tracker id: 2594157.
exemplar_of
X is exemplar of Y if X is the best evidence for Y.
SO:ke
X is finished_by Y if Y is part_of X, and X and Y share a 3' boundary.
kareneilbeck
2010-10-14T01:45:45Z
sequence
finished_by
Example: CDS finished_by stop_codon.
finished_by
X is finished_by Y if Y is part_of X, and X and Y share a 3' boundary.
PMID:20226267
X finishes Y if X is part_of Y, and X and Y share a 3' or C-terminal boundary.
kareneilbeck
2010-10-14T02:17:53Z
sequence
finishes
Example: stop_codon finishes CDS.
finishes
X finishes Y if X is part_of Y, and X and Y share a 3' or C-terminal boundary.
PMID:20226267
X gained Y if X is a variant_of X' and Y is part_of X but not part_of X'.
kareneilbeck
2011-06-28T12:51:10Z
sequence
gained
A relation with which to annotate the changes in a variant sequence with respect to a reference.
For example a variant transcript may gain a stop codon not present in the reference sequence.
gained
X gained Y if X is a variant_of X' and Y is part_of X but not part_of X'.
SO:ke
sequence
genome_of
genome_of
kareneilbeck
2009-08-19T02:27:04Z
sequence
guided_by
guided_by
kareneilbeck
2009-08-19T02:27:24Z
sequence
guides
guides
X has_integral_part Y if and only if: X has_part Y and Y part_of X.
kareneilbeck
2009-08-19T12:01:46Z
sequence
has_integral_part
Example: mRNA has_integral_part CDS.
has_integral_part
X has_integral_part Y if and only if: X has_part Y and Y part_of X.
http://precedings.nature.com/documents/3495/version/1
sequence
has_origin
has_origin
Inverse of part_of.
sequence
has_part
Example: operon has_part gene.
has_part
Inverse of part_of.
http://precedings.nature.com/documents/3495/version/1
sequence
has_quality
The relationship between a feature and an attribute.
has_quality
sequence
homologous_to
homologous_to
X integral_part_of Y if and only if: X part_of Y and Y has_part X.
kareneilbeck
2009-08-19T12:03:28Z
sequence
integral_part_of
Example: exon integral_part_of transcript.
integral_part_of
X integral_part_of Y if and only if: X part_of Y and Y has_part X.
http://precedings.nature.com/documents/3495/version/1
R is_consecutive_sequence_of U iff every instance of R is equivalent to a collection of instances of U: u1, u2, ..., un, such that no pair ux, uy overlaps and every ux is adjacent to ux-1 and ux+1, with the exception of the initial and terminal u1 and un (which may be identical).
kareneilbeck
2010-10-14T02:19:48Z
sequence
is_consecutive_sequence_of
Example: region is_consecutive_sequence_of base.
is_consecutive_sequence_of
R is_consecutive_sequence_of U iff every instance of R is equivalent to a collection of instances of U: u1, u2, ..., un, such that no pair ux, uy overlaps and every ux is adjacent to ux-1 and ux+1, with the exception of the initial and terminal u1 and un (which may be identical).
PMID:20226267
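One way to read the is_consecutive_sequence_of definition is as a tiling check: the units must cover the region end to end with no gaps and no overlaps. The sketch below assumes half-open integer intervals given as (start, end) tuples; the Interval alias and the function name are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # hypothetical half-open interval (start, end)


def is_consecutive_sequence_of(region: Interval, units: List[Interval]) -> bool:
    # Reject an empty collection and any zero-length or reversed unit.
    if not units or any(start >= end for start, end in units):
        return False
    ordered = sorted(units)
    # The initial and terminal units must line up with the region boundaries.
    if ordered[0][0] != region[0] or ordered[-1][1] != region[1]:
        return False
    # Each unit must end exactly where the next begins: adjacent, never overlapping.
    return all(a[1] == b[0] for a, b in zip(ordered, ordered[1:]))


# A 4-base region tiled by single bases, as in "region is_consecutive_sequence_of base".
bases = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(is_consecutive_sequence_of((0, 4), bases))  # True
```

A single unit spanning the whole region also passes, which matches the case where the initial and terminal units are identical.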
X lost Y if X is a variant_of X' and Y is part_of X' but not part_of X.
kareneilbeck
2011-06-28T12:53:16Z
sequence
lost
A relation with which to annotate the changes in a variant sequence with respect to a reference.
For example a variant transcript may have lost a stop codon present in the reference sequence.
lost
X lost Y if X is a variant_of X' and Y is part_of X' but not part_of X.
SO:ke
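The gained and lost relations amount to set differences between the parts of a variant and the parts of its reference. The sketch below is a deliberately simplified model that represents each sequence as a set of part names; the function name and the string representation of parts are assumptions made for illustration only.

```python
from typing import Set, Tuple


def gained_and_lost(reference_parts: Set[str], variant_parts: Set[str]) -> Tuple[Set[str], Set[str]]:
    # gained: parts present in the variant X but not in the reference X'.
    gained = variant_parts - reference_parts
    # lost: parts present in the reference X' but not in the variant X.
    lost = reference_parts - variant_parts
    return gained, lost


# Hypothetical transcript whose variant lost its stop codon.
reference = {"start_codon", "stop_codon"}
variant = {"start_codon"}
gained, lost = gained_and_lost(reference, variant)
print(gained, lost)  # set() {'stop_codon'}
```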
A maximally_overlaps X and Y iff all parts of A (including A itself) overlap both X and Y.
kareneilbeck
2010-10-14T01:34:48Z
sequence
maximally_overlaps
Example: non_coding_region_of_exon maximally_overlaps the intersections of exon and UTR.
maximally_overlaps
A maximally_overlaps X and Y iff all parts of A (including A itself) overlap both X and Y.
PMID:20226267
sequence
member_of
A subtype of part_of. Inverse is collection_of. Winston, M, Chaffin, R, Herrmann: A taxonomy of part-whole relations. Cognitive Science 1987, 11:417-444.
member_of
A relationship between a pseudogenic feature and its functional ancestor.
sequence
non_functional_homolog_of
non_functional_homolog_of
A relationship between a pseudogenic feature and its functional ancestor.
SO:ke
sequence
orthologous_to
orthologous_to
X overlaps Y iff there exists some Z such that Z contained_by X and Z contained_by Y.
kareneilbeck
2010-10-14T01:33:15Z
sequence
overlaps
Example: coding_exon overlaps CDS.
overlaps
X overlaps Y iff there exists some Z such that Z contained_by X and Z contained_by Y.
PMID:20226267
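The geometric relations overlaps, disconnected_from and adjacent_to can be sketched on interval coordinates. This example assumes half-open integer intervals and simplifies overlaps to "the two features share at least one position" instead of searching for a third feature Z; the Feature type and function names are hypothetical.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """Hypothetical feature: half-open interval [start, end) on one sequence."""
    start: int
    end: int


def overlaps(x: Feature, y: Feature) -> bool:
    # Simplification: two half-open intervals overlap when they share a position.
    return x.start < y.end and y.start < x.end


def disconnected_from(x: Feature, y: Feature) -> bool:
    # Defined as the negation of overlaps.
    return not overlaps(x, y)


def adjacent_to(x: Feature, y: Feature) -> bool:
    # The features share a boundary but no positions.
    return x.end == y.start or y.end == x.start


# Made-up coordinates for the example "coding_exon overlaps CDS".
coding_exon = Feature(100, 200)
cds = Feature(150, 400)
print(overlaps(coding_exon, cds))  # True
```

With these definitions, adjacent features are also disconnected_from each other, since sharing a boundary is not an overlap.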
sequence
paralogous_to
paralogous_to
X part_of Y if X is a subregion of Y.
sequence
part_of
Example: amino_acid part_of polypeptide.
part_of
X part_of Y if X is a subregion of Y.
http://precedings.nature.com/documents/3495/version/1
B is partial_evidence_for_feature A if the extent of B supports part of, but not all of, A.
sequence
partial_evidence_for_feature
partial_evidence_for_feature
B is partial_evidence_for_feature A if the extent of B supports part of, but not all of, A.
SO:ke
sequence
position_of
position_of
Inverse of processed_into.
kareneilbeck
2009-08-19T12:14:00Z
sequence
processed_from
Example: miRNA processed_from miRNA_primary_transcript.
processed_from
Inverse of processed_into.
http://precedings.nature.com/documents/3495/version/1
X is processed_into Y if a region X is modified to create Y.
kareneilbeck
2009-08-19T12:15:02Z
sequence
processed_into
Example: miRNA_primary_transcript processed_into miRNA.
processed_into
X is processed_into Y if a region X is modified to create Y.
http://precedings.nature.com/documents/3495/version/1
kareneilbeck
2009-08-19T02:21:03Z
sequence
recombined_from
recombined_from
kareneilbeck
2009-08-19T02:20:07Z
sequence
recombined_to
recombined_to
sequence
sequence_of
sequence_of
sequence
similar_to
similar_to
X is started_by Y if Y is part_of X and X and Y share a 5' boundary.
kareneilbeck
2010-10-14T01:43:55Z
sequence
started_by
Example: CDS started_by start_codon.
started_by
X is started_by Y if Y is part_of X and X and Y share a 5' boundary.
PMID:20226267
X starts Y if X is part_of Y, and X and Y share a 5' or N-terminal boundary.
kareneilbeck
2010-10-14T01:47:53Z
sequence
starts
Example: start_codon starts CDS.
starts
X starts Y if X is part_of Y, and X and Y share a 5' or N-terminal boundary.
PMID:20226267
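The boundary-sharing relations starts, finishes, started_by and finished_by can be sketched as follows. The example assumes forward-strand features stored as half-open integer intervals, so the 5' boundary is the start coordinate and the 3' boundary is the end coordinate; the Feature type, the part_of test on coordinates and the function names are hypothetical.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """Hypothetical forward-strand feature: half-open interval [start, end)."""
    start: int
    end: int


def part_of(x: Feature, y: Feature) -> bool:
    # X is a subregion of Y (shared boundaries are allowed).
    return y.start <= x.start and x.end <= y.end


def starts(x: Feature, y: Feature) -> bool:
    # X is part_of Y and both share the 5' boundary.
    return part_of(x, y) and x.start == y.start


def finishes(x: Feature, y: Feature) -> bool:
    # X is part_of Y and both share the 3' boundary.
    return part_of(x, y) and x.end == y.end


def started_by(x: Feature, y: Feature) -> bool:
    return starts(y, x)


def finished_by(x: Feature, y: Feature) -> bool:
    return finishes(y, x)


# Made-up coordinates for a 30-base CDS with its start and stop codons.
cds = Feature(0, 30)
start_codon = Feature(0, 3)
stop_codon = Feature(27, 30)
print(starts(start_codon, cds), finished_by(cds, stop_codon))  # True True
```

For reverse-strand features the roles of the two coordinates would swap, which this sketch does not handle.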
kareneilbeck
2009-08-19T02:22:14Z
sequence
trans_spliced_from
trans_spliced_from
kareneilbeck
2009-08-19T02:22:00Z
sequence
trans_spliced_to
trans_spliced_to
X is transcribed_from Y if X is synthesized from template Y.
kareneilbeck
2009-08-19T12:05:39Z
sequence
transcribed_from
Example: primary_transcript transcribed_from gene.
transcribed_from
X is transcribed_from Y if X is synthesized from template Y.
http://precedings.nature.com/documents/3495/version/1
Inverse of transcribed_from.
kareneilbeck
2009-08-19T12:08:24Z
sequence
transcribed_to
Example: gene transcribed_to primary_transcript.
transcribed_to
Inverse of transcribed_from.
http://precedings.nature.com/documents/3495/version/1
Inverse of translation_of.
kareneilbeck
2009-08-19T12:11:53Z
sequence
translates_to
Example: codon translates_to amino_acid.
translates_to
Inverse of translation_of.
http://precedings.nature.com/documents/3495/version/1
X is translation_of Y if Y is translated by a ribosome to create X.
kareneilbeck
2009-08-19T12:09:59Z
sequence
translation_of
Example: Polypeptide translation_of CDS.
translation_of
X is translation_of Y if Y is translated by a ribosome to create X.
http://precedings.nature.com/documents/3495/version/1
A' is a variant (mutation) of A if and only if every instance of A' is either an immediate mutation of some instance of A, or is linked to some instance of A by a chain of immediate mutation processes.
sequence
variant_of
Added to SO during the immunology workshop, June 2007. This relationship was approved by Barry Smith.
variant_of
A' is a variant (mutation) of A if and only if every instance of A' is either an immediate mutation of some instance of A, or is linked to some instance of A by a chain of immediate mutation processes.
SO:immuno_workshop
sequence
SO:0000000
Sequence_Ontology
true
A 5' UTR variant within an upstream open reading frame.
evan
2024-04-10T17:49:03Z
sequence
SO:00000000002382
Added 10 Apr 2024 at the request of Sarah Hunt (EBI). See GitHub Issue #647.
5_prime_UTR_uORF_variant
A 5' UTR variant within an upstream open reading frame.
PMID:32461616
PMID:32926138
A sequence_feature with an extent greater than zero. A nucleotide region is composed of bases and a polypeptide region is composed of amino acids.
sequence
sequence
SO:0000001
region
A sequence_feature with an extent greater than zero. A nucleotide region is composed of bases and a polypeptide region is composed of amino acids.
SO:ke
A 5' UTR variant where a stop codon in an upstream open reading frame is introduced, moved or lost.
evan
2024-04-10T17:56:17Z
sequence
SO:00000010002382
Added 10 Apr 2024 at the request of Sarah Hunt (EBI). See GitHub Issue #622.
5_prime_UTR_uORF_stop_codon_variant
A 5' UTR variant where a stop codon in an upstream open reading frame is introduced, moved or lost.
PMID:32461616
PMID:32926138
A folded sequence.
INSDC_feature:misc_structure
sequence secondary structure
sequence
SO:0000002
sequence_secondary_structure
A folded sequence.
SO:ke
A 5' UTR variant which disrupts the translation of an upstream open reading frame because the number of nucleotides inserted or deleted is not a multiple of three.
evan
2024-04-10T17:58:40Z
uFrameshift (UTRannotator)
sequence
SO:00000020002382
Added 10 Apr 2024 at the request of Sarah Hunt (EBI). See GitHub Issue #621.
5_prime_UTR_uORF_frameshift_variant
A 5' UTR variant which disrupts the translation of an upstream open reading frame because the number of nucleotides inserted or deleted is not a multiple of three.
PMID:32461616
PMID:32926138
G-quartets are unusual nucleic acid structures consisting of a planar arrangement where each guanine is hydrogen bonded by Hoogsteen pairing to another guanine in the quartet.
http://en.wikipedia.org/wiki/G-quadruplex
G quartet
G tetrad
G-quadruplex
G-quartet
G-tetrad
G_quadruplex
guanine tetrad
sequence
SO:0000003
G_quartet
G-quartets are unusual nucleic acid structures consisting of a planar arrangement where each guanine is hydrogen bonded by Hoogsteen pairing to another guanine in the quartet.
http://www.ncbi.nlm.nih.gov/pubmed/7919797?dopt=Abstract
http://en.wikipedia.org/wiki/G-quadruplex
wiki
A 5' UTR variant where a premature stop codon is gained in an upstream open reading frame.
evan
2024-04-10T18:01:42Z
uSTOP_gained
sequence
SO:00000030002382
Added 10 Apr 2024 at the request of Sarah Hunt (EBI). See GitHub Issue #624.
5_prime_UTR_uORF_stop_codon_gain_variant
A 5' UTR variant where a premature stop codon is gained in an upstream open reading frame.
PMID:32461616
PMID:32926138
uSTOP_gained
UTRannotator
A coding exon that is not the most 3-prime or the most 5-prime in a given transcript.
interior coding exon
sequence
SO:0000004
interior_coding_exon
A 5' UTR variant where the stop codon of an upstream open reading frame is lost.
evan
2024-04-10T18:05:50Z
uSTOP_lost
sequence
SO:00000040002382
Added 10 Apr 2024 at the request of Sarah Hunt (EBI). See GitHub Issue #623.
5_prime_UTR_uORF_stop_codon_loss_variant
A 5' UTR variant where the stop codon of an upstream open reading frame is lost.
PMID:32461616
PMID:32926138
uSTOP_lost
UTRannotator
The many tandem repeats (identical or related) of a short basic repeating unit; many have a base composition or other property different from the genome average that allows them to be separated from the bulk (main band) genomic DNA.
INSDC_feature:repeat_region
http://en.wikipedia.org/wiki/Satellite_DNA
INSDC_qualifier:satellite
satellite DNA
sequence
SO:0000005
satellite_DNA
The many tandem repeats (identical or related) of a short basic repeating unit; many have a base composition or other property different from the genome average that allows them to be separated from the bulk (main band) genomic DNA.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Satellite_DNA
wiki
A region amplified by a PCR reaction.
http://en.wikipedia.org/wiki/RAPD
PCR product
sequence
amplicon
SO:0000006
This term is mapped to MGED. This term is now located in OBI, with the following ID OBI_0000406.
PCR_product
A region amplified by a PCR reaction.
SO:ke
http://en.wikipedia.org/wiki/RAPD
wiki
One of a pair of sequencing reads in which the two members of the pair are related by originating at either end of a clone insert.
mate pair
read-pair
sequence
SO:0000007
read_pair
One of a pair of sequencing reads in which the two members of the pair are related by originating at either end of a clone insert.
SO:ls
sequence
SO:0000008
gene_sensu_your_favorite_organism
true
sequence
SO:0000009
gene_class
true
A gene which, when transcribed, can be translated into a protein.
protein-coding
sequence
SO:0000010
protein_coding
A gene which can be transcribed, but will not be translated into a protein.
non protein-coding
sequence
SO:0000011
non_protein_coding
The primary transcript of any one of several small cytoplasmic RNA molecules present in the cytoplasm and sometimes nucleus of a Eukaryote.
scRNA primary transcript
scRNA transcript
small cytoplasmic RNA transcript
sequence
small cytoplasmic RNA
small_cytoplasmic_RNA
SO:0000012
scRNA_primary_transcript
The primary transcript of any one of several small cytoplasmic RNA molecules present in the cytoplasm and sometimes nucleus of a Eukaryote.
http://www.ebi.ac.uk/embl/WebFeat/align/scRNA_s.html
A small non coding RNA sequence, present in the cytoplasm.
INSDC_feature:ncRNA
INSDC_qualifier:scRNA
small cytoplasmic RNA
sequence
SO:0000013
scRNA
A small non coding RNA sequence, present in the cytoplasm.
SO:ke
A sequence element characteristic of some RNA polymerase II promoters required for the correct positioning of the polymerase for the start of transcription. Overlaps the TSS. The mammalian consensus sequence is YYAN(T|A)YY; the Drosophila consensus sequence is TCA(G|T)t(T|C). In each the A is at position +1 with respect to the TSS. Functionally similar to the TATA box element.
INR motif
initiator
initiator motif
sequence
DMp2
SO:0000014
Binds TAF1, TAF2.
INR_motif
A sequence element characteristic of some RNA polymerase II promoters required for the correct positioning of the polymerase for the start of transcription. Overlaps the TSS. The mammalian consensus sequence is YYAN(T|A)YY; the Drosophila consensus sequence is TCA(G|T)t(T|C). In each the A is at position +1 with respect to the TSS. Functionally similar to the TATA box element.
PMID:12651739
PMID:16858867
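The mammalian consensus YYAN(T|A)YY can be turned into a regular expression to locate candidate INR motifs and the A at position +1. This is only a sketch: it expands the IUPAC codes Y (C or T) and N (any base) by hand, uses a lookahead so overlapping candidates are reported, and returns 0-based indices; the function name is hypothetical and a consensus match alone does not establish a functional initiator.

```python
import re
from typing import List

# Mammalian consensus YYAN(T|A)YY with IUPAC Y = C or T and N = any base.
# The lookahead lets finditer report overlapping candidate sites.
INR_MAMMALIAN = re.compile(r"(?=([CT][CT]A[ACGT][TA][CT][CT]))")


def inr_plus_one_positions(sequence: str) -> List[int]:
    """Return 0-based indices of the A at +1 for every consensus match."""
    # The A is the third base of the motif, hence the offset of 2.
    return [m.start() + 2 for m in INR_MAMMALIAN.finditer(sequence.upper())]


print(inr_plus_one_positions("GGTCAGTCTGG"))  # [4]
```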
A sequence element characteristic of some RNA polymerase II promoters; Positioned from +28 to +32 with respect to the TSS (+1). Experimental results suggest that the DPE acts in conjunction with the INR_motif to provide a binding site for TFIID in the absence of a TATA box to mediate transcription of TATA-less promoters. Consensus sequence (A|G)G(A|T)(C|T)(G|A|C).
DPE motif
downstream core promoter element
CRWMGCGWKCGCTTS
sequence
SO:0000015
Binds TAF6, TAF9.
DPE_motif
A sequence element characteristic of some RNA polymerase II promoters; Positioned from +28 to +32 with respect to the TSS (+1). Experimental results suggest that the DPE acts in conjunction with the INR_motif to provide a binding site for TFIID in the absence of a TATA box to mediate transcription of TATA-less promoters. Consensus sequence (A|G)G(A|T)(C|T)(G|A|C).
PMID:12515390
PMID:12537576
PMID:12651739
PMID:16858867
A sequence element characteristic of some RNA polymerase II promoters, located immediately upstream of some TATA box elements at -37 to -32 with respect to the TSS (+1). Consensus sequence is (G|C)(G|C)(G|A)CGCC. Binds TFIIB.
B-recognition element
BRE motif
BREu motif
transcription factor B-recognition element
sequence
BREu
TFIIB recognition element
SO:0000016
Binds TFIIB.
BREu_motif
A sequence element characteristic of some RNA polymerase II promoters, located immediately upstream of some TATA box elements at -37 to -32 with respect to the TSS (+1). Consensus sequence is (G|C)(G|C)(G|A)CGCC. Binds TFIIB.
PMID:12651739
PMID:16858867
A sequence element characteristic of the promoters of snRNA genes transcribed by RNA polymerase II or by RNA polymerase III. Located between -45 and -60 relative to the TSS. The human PSE_motif consensus sequence is TCACCNTNA(C|G)TNAAAAG(T|G). The basal transcription factor, snRNA-activating protein complex (SNAPc), binds the PSE_motif and is required for the transcription of both RNA polymerase II and III transcribed small-nuclear RNA genes.
PSE motif
proximal sequence element
sequence
SO:0000017
PSE_motif
A sequence element characteristic of the promoters of snRNA genes transcribed by RNA polymerase II or by RNA polymerase III. Located between -45 and -60 relative to the TSS. The human PSE_motif consensus sequence is TCACCNTNA(C|G)TNAAAAG(T|G). The basal transcription factor, snRNA-activating protein complex (SNAPc), binds the PSE_motif and is required for the transcription of both RNA polymerase II and III transcribed small-nuclear RNA genes.
PMID:11390411
PMID:12621023
PMID:12651739
PMID:23166507
PMID:8339931
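The consensus strings quoted in the promoter motif definitions, such as (G|C)(G|C)(G|A)CGCC for BREu_motif and TCACCNTNA(C|G)TNAAAAG(T|G) for PSE_motif, are already close to regular-expression syntax. A minimal sketch, assuming only that ambiguity letters follow the standard IUPAC nucleotide codes, converts them into compiled patterns; the helper name and the test sequences are made up for illustration.

```python
import re

# Standard IUPAC ambiguity codes used in the consensus strings.
IUPAC = {
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]", "W": "[AT]",
    "S": "[CG]", "K": "[GT]", "M": "[AC]",
}


def consensus_to_regex(consensus: str) -> re.Pattern:
    # Parentheses and '|' are passed through unchanged; only ambiguity
    # letters are expanded. Upper-casing handles mixed-case consensus text.
    return re.compile("".join(IUPAC.get(ch, ch) for ch in consensus.upper()))


BREU = consensus_to_regex("(G|C)(G|C)(G|A)CGCC")
PSE = consensus_to_regex("TCACCNTNA(C|G)TNAAAAG(T|G)")

print(bool(BREU.fullmatch("GGGCGCC")))            # True
print(bool(PSE.fullmatch("TCACCATGACTCAAAAGT")))  # True
```

Such a pattern only tests sequence similarity to the consensus; the positional constraints in the definitions (for example -37 to -32 relative to the TSS for BREu_motif) would need to be checked separately.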
A group of loci that can be grouped in a linear order representing the different degrees of linkage among the genes concerned.
http://en.wikipedia.org/wiki/Linkage_group
linkage group
sequence
SO:0000018
linkage_group
A group of loci that can be grouped in a linear order representing the different degrees of linkage among the genes concerned.
ISBN:038752046
http://en.wikipedia.org/wiki/Linkage_group
wiki
true
A region of double stranded RNA where the bases do not conform to WC base pairing. The loop is closed on both sides by canonical base pairing. If the interruption to base pairing occurs on one strand only, it is known as a bulge.
RNA internal loop
sequence
SO:0000020
RNA_internal_loop
A region of double stranded RNA where the bases do not conform to WC base pairing. The loop is closed on both sides by canonical base pairing. If the interruption to base pairing occurs on one strand only, it is known as a bulge.
SO:ke
An internal RNA loop where one of the strands includes more bases than the corresponding region on the other strand.
asymmetric RNA internal loop
sequence
SO:0000021
asymmetric_RNA_internal_loop
An internal RNA loop where one of the strands includes more bases than the corresponding region on the other strand.
SO:ke
A region forming a motif, composed of adenines, where the minor groove edges are inserted into the minor groove of another helix.
A minor RNA motif
sequence
SO:0000022
A_minor_RNA_motif
A region forming a motif, composed of adenines, where the minor groove edges are inserted into the minor groove of another helix.
SO:ke
The kink turn (K-turn) is an RNA structural motif that creates a sharp (~120 degree) bend between two continuous helices.
http://en.wikipedia.org/wiki/K-turn
K turn RNA motif
K-turn
kink turn
kink-turn motif
sequence
SO:0000023
K_turn_RNA_motif
The kink turn (K-turn) is an RNA structural motif that creates a sharp (~120 degree) bend between two continuous helices.
SO:ke
http://en.wikipedia.org/wiki/K-turn
wiki
A loop in ribosomal RNA containing the sites of attack for ricin and sarcin.
sarcin like RNA motif
sarcin/ricin RNA domain
sarcin/ricin domain
sarcin/ricin loop
sequence
SO:0000024
sarcin_like_RNA_motif
A loop in ribosomal RNA containing the sites of attack for ricin and sarcin.
http://www.ncbi.nlm.nih.gov/pubmed/7897662
An internal RNA loop where the extent of the loop on both strands is the same size.
A-minor RNA motif
sequence
SO:0000025
symmetric_RNA_internal_loop
An internal RNA loop where the extent of the loop on both strands is the same size.
SO:ke
RNA junction loop
sequence
SO:0000026
RNA_junction_loop
RNA hook turn
hook-turn motif
sequence
hook turn
SO:0000027
RNA_hook_turn
Two bases paired opposite each other by hydrogen bonds creating a secondary structure.
http://en.wikipedia.org/wiki/Base_pair
base pair
sequence
SO:0000028
base_pair
http://en.wikipedia.org/wiki/Base_pair
wiki
The canonical base pair, where two bases interact via WC edges, with glycosidic bonds oriented cis relative to the axis of orientation.
WC base pair
Watson Crick base pair
Watson-Crick pair
canonical base pair
sequence
Watson-Crick base pair
SO:0000029
WC_base_pair
The canonical base pair, where two bases interact via WC edges, with glycosidic bonds oriented cis relative to the axis of orientation.
PMID:12177293
A type of non-canonical base-pairing.
sugar edge base pair
sequence
SO:0000030
sugar_edge_base_pair
A type of non-canonical base-pairing.
PMID:12177293
DNA or RNA molecules that have been selected from random pools based on their ability to bind other molecules.
http://en.wikipedia.org/wiki/Aptamer
sequence
SO:0000031
aptamer
DNA or RNA molecules that have been selected from random pools based on their ability to bind other molecules.
http://aptamer.icmb.utexas.edu
http://en.wikipedia.org/wiki/Aptamer
wiki
DNA molecules that have been selected from random pools based on their ability to bind other molecules.
DNA aptamer
sequence
SO:0000032
DNA_aptamer
DNA molecules that have been selected from random pools based on their ability to bind other molecules.
http://aptamer.icmb.utexas.edu
RNA molecules that have been selected from random pools based on their ability to bind other molecules.
RNA aptamer
sequence
SO:0000033
RNA_aptamer
RNA molecules that have been selected from random pools based on their ability to bind other molecules.
http://aptamer.icmb.utexas.edu
Morpholino oligos are synthesized from four different Morpholino subunits, each of which contains one of the four genetic bases (A, C, G, T) linked to a 6-membered morpholine ring. Eighteen to 25 subunits of these four subunit types are joined in a specific order by non-ionic phosphorodiamidate intersubunit linkages to give a Morpholino.
morphant
morpholino
morpholino oligo
sequence
SO:0000034
morpholino_oligo
Morpholino oligos are synthesized from four different Morpholino subunits, each of which contains one of the four genetic bases (A, C, G, T) linked to a 6-membered morpholine ring. Eighteen to 25 subunits of these four subunit types are joined in a specific order by non-ionic phosphorodiamidate intersubunit linkages to give a Morpholino.
http://www.gene-tools.com/
A riboswitch is a part of an mRNA that can act as a direct sensor of small molecules to control its own expression. A riboswitch is a cis element in the 5' end of an mRNA that acts as a direct sensor of metabolites.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Riboswitch
INSDC_qualifier:riboswitch
riboswitch RNA
sequence
SO:0000035
riboswitch
A riboswitch is a part of an mRNA that can act as a direct sensor of small molecules to control its own expression. A riboswitch is a cis element in the 5' end of an mRNA that acts as a direct sensor of metabolites.
PMID:2820954
http://en.wikipedia.org/wiki/Riboswitch
wiki
A DNA region that is required for the binding of chromatin to the nuclear matrix.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Matrix_attachment_site
INSDC_qualifier:matrix_attachment_region
MAR
S/MAR
SMAR
matrix association region
matrix attachment region
matrix attachment site
nuclear matrix association region
nuclear matrix attachment site
scaffold attachment site
scaffold matrix attachment region
sequence
S/MAR element
SO:0000036
matrix_attachment_site
A DNA region that is required for the binding of chromatin to the nuclear matrix.
SO:ma
http://en.wikipedia.org/wiki/Matrix_attachment_site
wiki
A DNA region that includes DNAse hypersensitive sites located near a gene that confers the high-level, position-independent, and copy number-dependent expression to that gene.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Locus_control_region
INSDC_qualifier:locus_control_region
LCR
locus control region
sequence
locus control element
SO:0000037
Definition updated Nov 10 2020, Colin Logie from GREEKC helped us realize that LCRs can also be located 3' to a gene.
locus_control_region
A DNA region that includes DNAse hypersensitive sites located near a gene that confers the high-level, position-independent, and copy number-dependent expression to that gene.
SO:ma
http://en.wikipedia.org/wiki/Locus_control_region
wiki
A collection of match parts.
sequence
SO:0000038
match_set
true
A collection of match parts.
SO:ke
A part of a match, for example an hsp from blast is a match_part.
match part
sequence
SO:0000039
match_part
A part of a match, for example an hsp from blast is a match_part.
SO:ke
A clone of a DNA region of a genome.
genomic clone
sequence
SO:0000040
genomic_clone
A clone of a DNA region of a genome.
SO:ma
An operation that can be applied to a sequence, that results in a change.
sequence operation
sequence
SO:0000041
sequence_operation
true
An operation that can be applied to a sequence, that results in a change.
SO:ke
An attribute of a pseudogene (SO:0000336).
pseudogene attribute
sequence
SO:0000042
pseudogene_attribute
true
An attribute of a pseudogene (SO:0000336).
SO:ma
A pseudogene created via retrotransposition of the mRNA of a functional protein-coding parent gene, followed by accumulation of deleterious mutations. It lacks introns and promoters and often includes a polyA tail.
INSDC_feature:gene
INSDC_qualifier:processed
processed pseudogene
retropseudogene
sequence
R psi G
pseudogene by reverse transcription
SO:0000043
Please note the synonym R psi G uses the spelled out form of the Greek letter.
processed_pseudogene
A pseudogene created via retrotransposition of the mRNA of a functional protein-coding parent gene, followed by accumulation of deleterious mutations. It lacks introns and promoters and often includes a polyA tail.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
A pseudogene caused by unequal crossing over at recombination.
pseudogene by unequal crossing over
sequence
SO:0000044
pseudogene_by_unequal_crossing_over
A pseudogene caused by unequal crossing over at recombination.
SO:ke
To remove a subsection of sequence.
sequence
SO:0000045
delete
true
To remove a subsection of sequence.
SO:ke
To insert a subsection of sequence.
sequence
SO:0000046
insert
true
To insert a subsection of sequence.
SO:ke
To invert a subsection of sequence.
sequence
SO:0000047
invert
true
To invert a subsection of sequence.
SO:ke
To substitute a subsection of sequence for another.
sequence
SO:0000048
substitute
true
To substitute a subsection of sequence for another.
SO:ke
To translocate a subsection of sequence.
sequence
SO:0000049
translocate
true
To translocate a subsection of sequence.
SO:ke
A part of a gene, that has no other route in the ontology back to region. This concept is necessary for logical inference as these parts must have the properties of region. It also allows us to associate all the parts of genes with a gene.
sequence
SO:0000050
gene_part
true
A part of a gene, that has no other route in the ontology back to region. This concept is necessary for logical inference as these parts must have the properties of region. It also allows us to associate all the parts of genes with a gene.
SO:ke
A DNA sequence used experimentally to detect the presence or absence of a complementary nucleic acid.
http://en.wikipedia.org/wiki/Hybridization_probe
sequence
SO:0000051
probe
A DNA sequence used experimentally to detect the presence or absence of a complementary nucleic acid.
SO:ma
http://en.wikipedia.org/wiki/Hybridization_probe
wiki
sequence
assortment-derived_deficiency
SO:0000052
assortment_derived_deficiency
true
A sequence_variant_effect which changes the regulatory region of a gene.
SO:0001556
sequence variant affecting regulatory region
sequence
mutation affecting regulatory region
SO:0000053
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_regulatory_region
true
A sequence_variant_effect which changes the regulatory region of a gene.
SO:ke
A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number.
http://en.wikipedia.org/wiki/Aneuploid
sequence
SO:0000054
aneuploid
A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number.
SO:ke
http://en.wikipedia.org/wiki/Aneuploid
wiki
A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number as extra chromosomes are present.
http://en.wikipedia.org/wiki/Hyperploid
sequence
SO:0000055
hyperploid
A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number as extra chromosomes are present.
SO:ke
http://en.wikipedia.org/wiki/Hyperploid
wiki
A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number as some chromosomes are missing.
http://en.wikipedia.org/wiki/Hypoploid
sequence
SO:0000056
hypoploid
A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number as some chromosomes are missing.
SO:ke
http://en.wikipedia.org/wiki/Hypoploid
wiki
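The aneuploid, hyperploid and hypoploid definitions above can be illustrated with a minimal sketch. The function name `classify_ploidy` and the `expected_ploidy` parameter are hypothetical and not part of SO; the sketch assumes that "extra" and "missing" are judged against an expected euploid count (ploidy level times haploid number).

```python
def classify_ploidy(chromosome_count: int, haploid_number: int,
                    expected_ploidy: int = 2) -> str:
    """Classify a chromosome complement relative to the haploid number.

    Assumption (not from SO): hyperploid/hypoploid are decided by comparing
    the observed count with expected_ploidy * haploid_number.
    """
    if chromosome_count <= 0 or haploid_number <= 0 or expected_ploidy <= 0:
        raise ValueError("counts must be positive integers")

    # An exact multiple of the haploid number is not aneuploid.
    if chromosome_count % haploid_number == 0:
        return "euploid"

    expected_count = expected_ploidy * haploid_number
    # Not an exact multiple: extra chromosomes -> hyperploid, missing -> hypoploid.
    return "hyperploid" if chromosome_count > expected_count else "hypoploid"


# Human example with haploid number 23 and a diploid expectation.
print(classify_ploidy(47, 23))  # hyperploid
print(classify_ploidy(45, 23))  # hypoploid
print(classify_ploidy(46, 23))  # euploid
```

Both hyperploid and hypoploid results are aneuploid in the sense of SO:0000054; the sketch only separates the two directions.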
A regulatory element of an operon to which activators or repressors bind, thereby affecting transcription of genes in that operon.
http://en.wikipedia.org/wiki/Operator_(biology)#Operator
operator segment
sequence
SO:0000057
Moved to transcriptional_cis_regulatory_region (SO:0001055) from gene_group_regulatory_region (SO:0000752) on 11 Feb 2021 when SO:0000752 was merged into SO:0001055. See GitHub Issue #529.
operator
A regulatory element of an operon to which activators or repressors bind, thereby affecting transcription of genes in that operon.
SO:ma
http://en.wikipedia.org/wiki/Operator_(biology)#Operator
wiki
sequence
assortment-derived_aneuploid
SO:0000058
assortment_derived_aneuploid
true
A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues of a nuclease.
nuclease binding site
sequence
SO:0000059
nuclease_binding_site
A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues of a nuclease.
SO:cb
One arm of a compound chromosome.
compound chromosome arm
sequence
SO:0000060
FLAG - this term should probably be a part_of rather than an is_a.
compound_chromosome_arm
A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues of a restriction enzyme.
restriction endonuclease binding site
restriction enzyme binding site
sequence
SO:0000061
A region of a molecule that binds to a restriction enzyme.
restriction_enzyme_binding_site
A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues of a restriction enzyme.
SO:cb
An intrachromosomal transposition in which one of the four broken ends loses a segment before re-joining.
deficient intrachromosomal transposition
sequence
SO:0000062
deficient_intrachromosomal_transposition
An intrachromosomal transposition in which one of the four broken ends loses a segment before re-joining.
FB:reference_manual
An interchromosomal transposition in which one of the four broken ends loses a segment before re-joining.
deficient interchromosomal transposition
sequence
SO:0000063
deficient_interchromosomal_transposition
An interchromosomal transposition in which one of the four broken ends loses a segment before re-joining.
SO:ke
sequence
SO:0000064
This class of attributes was added by MA to allow the broad description of genes based on qualities of the transcript(s). A product of SO meeting 2004.
gene_by_transcript_attribute
true
A chromosome structure variation whereby an arm exists as an individual chromosome element.
free chromosome arm
sequence
SO:0000065
free_chromosome_arm
A chromosome structure variation whereby an arm exists as an individual chromosome element.
SO:ke
sequence
SO:0000066
gene_by_polyadenylation_attribute
true
gene to gene feature
sequence
SO:0000067
gene_to_gene_feature
An attribute describing a gene that has a sequence that overlaps the sequence of another gene.
sequence
SO:0000068
overlapping
An attribute describing a gene that has a sequence that overlaps the sequence of another gene.
SO:ke
An attribute to describe a gene when it is located within the intron of another gene.
inside intron
sequence
SO:0000069
inside_intron
An attribute to describe a gene when it is located within the intron of another gene.
SO:ke
An attribute to describe a gene when it is located within the intron of another gene and on the opposite strand.
inside intron antiparallel
sequence
SO:0000070
inside_intron_antiparallel
An attribute to describe a gene when it is located within the intron of another gene and on the opposite strand.
SO:ke
An attribute to describe a gene when it is located within the intron of another gene and on the same strand.
inside intron parallel
sequence
SO:0000071
inside_intron_parallel
An attribute to describe a gene when it is located within the intron of another gene and on the same strand.
SO:ke
sequence
SO:0000072
end_overlapping_gene
true
An attribute to describe a gene when the five prime region overlaps with another gene's 3' region.
five prime-three prime overlap
sequence
SO:0000073
five_prime_three_prime_overlap
An attribute to describe a gene when the five prime region overlaps with another gene's 3' region.
SO:ke
An attribute to describe a gene when the five prime region overlaps with another gene's five prime region.
five prime-five prime overlap
sequence
SO:0000074
five_prime_five_prime_overlap
An attribute to describe a gene when the five prime region overlaps with another gene's five prime region.
SO:ke
An attribute to describe a gene when the 3' region overlaps with another gene's 3' region.
three prime-three prime overlap
sequence
SO:0000075
three_prime_three_prime_overlap
An attribute to describe a gene when the 3' region overlaps with another gene's 3' region.
SO:ke
An attribute to describe a gene when the 3' region overlaps with another gene's 5' region.
5' 3' overlap
three prime five prime overlap
sequence
SO:0000076
three_prime_five_prime_overlap
An attribute to describe a gene when the 3' region overlaps with another gene's 5' region.
SO:ke
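The four end-overlap attributes above (SO:0000073 to SO:0000076) can be derived from coordinates and strand. The following is a minimal sketch under stated assumptions: the `Gene` class and both helper functions are hypothetical, coordinates are 1-based and inclusive, and a gene end counts as "overlapping" when its terminal coordinate lies inside the other gene. Nested genes and non-overlapping genes return `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Gene:
    start: int   # leftmost coordinate, 1-based inclusive
    end: int     # rightmost coordinate, 1-based inclusive
    strand: str  # "+" or "-"

    def five_prime(self) -> int:
        return self.start if self.strand == "+" else self.end

    def three_prime(self) -> int:
        return self.end if self.strand == "+" else self.start


def overlapping_end(gene: Gene, other: Gene) -> Optional[str]:
    """Return which single end of `gene` lies inside `other`, if exactly one does."""
    has_five = other.start <= gene.five_prime() <= other.end
    has_three = other.start <= gene.three_prime() <= other.end
    if has_five == has_three:
        return None  # no overlap, or one gene is nested in the other
    return "five_prime" if has_five else "three_prime"


def overlap_attribute(gene: Gene, other: Gene) -> Optional[str]:
    """Name the SO end-overlap attribute describing `gene` relative to `other`."""
    gene_end = overlapping_end(gene, other)
    other_end = overlapping_end(other, gene)
    if gene_end is None or other_end is None:
        return None
    return f"{gene_end}_{other_end}_overlap"


a = Gene(100, 200, "+")
print(overlap_attribute(a, Gene(150, 300, "+")))  # three_prime_five_prime_overlap
print(overlap_attribute(a, Gene(150, 300, "-")))  # three_prime_three_prime_overlap
```

The returned strings match the term names in this section; whether genes on opposite strands should be treated as overlapping is a modelling choice the sketch simply assumes.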
A region of sequence that is complementary to a sequence of messenger RNA.
http://en.wikipedia.org/wiki/Antisense
sequence
SO:0000077
antisense
A region of sequence that is complementary to a sequence of messenger RNA.
SO:ke
http://en.wikipedia.org/wiki/Antisense
wiki
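To make the notion of complementarity in the antisense definition concrete, here is a minimal sketch that computes the strand complementary to an mRNA, written 5' to 3'. The function name `antisense_rna` and the lookup table are illustrative assumptions, not part of SO.

```python
# Watson-Crick pairing for RNA bases (illustrative table, not part of SO).
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna: str) -> str:
    """Return the RNA strand complementary to `mrna`, written 5' to 3'."""
    # Reverse first so the result reads 5' to 3', then complement each base.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(mrna.upper()))


print(antisense_rna("AUGGCC"))  # GGCCAU
```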
A transcript that is polycistronic.
polycistronic transcript
sequence
SO:0000078
polycistronic_transcript
A transcript that is polycistronic.
SO:xp
A transcript that is dicistronic.
dicistronic transcript
sequence
SO:0000079
dicistronic_transcript
A transcript that is dicistronic.
SO:ke
A gene that is a member of an operon, which is a set of genes transcribed together as a unit.
operon member
sequence
SO:0000080
operon_member
gene array member
sequence
SO:0000081
gene_array_member
sequence
SO:0000082
processed_transcript_attribute
true
DNA belonging to the macronuclei of ciliates.
macronuclear sequence
sequence
SO:0000083
macronuclear_sequence
DNA belonging to the micronuclei of a cell.
micronuclear sequence
sequence
SO:0000084
micronuclear_sequence
sequence
SO:0000085
gene_by_genome_location
true
sequence
SO:0000086
gene_by_organelle_of_genome
true
A gene from nuclear sequence.
http://en.wikipedia.org/wiki/Nuclear_gene
nuclear gene
sequence
SO:0000087
nuclear_gene
A gene from nuclear sequence.
SO:xp
http://en.wikipedia.org/wiki/Nuclear_gene
wiki
A gene located in mitochondrial sequence.
http://en.wikipedia.org/wiki/Mitochondrial_gene
mitochondrial gene
mt gene
sequence
SO:0000088
mt_gene
A gene located in mitochondrial sequence.
SO:xp
http://en.wikipedia.org/wiki/Mitochondrial_gene
wiki
A gene located in kinetoplast sequence.
kinetoplast gene
sequence
SO:0000089
kinetoplast_gene
A gene located in kinetoplast sequence.
SO:xp
A gene from plastid sequence.
plastid gene
sequence
SO:0000090
plastid_gene
A gene from plastid sequence.
SO:xp
A gene from apicoplast sequence.
apicoplast gene
sequence
SO:0000091
apicoplast_gene
A gene from apicoplast sequence.
SO:xp
A gene from chloroplast sequence.
chloroplast gene
ct gene
sequence
SO:0000092
ct_gene
A gene from chloroplast sequence.
SO:xp
A gene from chromoplast sequence.
chromoplast gene
sequence
SO:0000093
chromoplast_gene
A gene from chromoplast sequence.
SO:xp
A gene from cyanelle sequence.
cyanelle gene
sequence
SO:0000094
cyanelle_gene
A gene from cyanelle sequence.
SO:xp
A plastid gene from leucoplast sequence.
leucoplast gene
sequence
SO:0000095
leucoplast_gene
A plastid gene from leucoplast sequence.
SO:xp
A gene from proplastid sequence.
proplastid gene
sequence
SO:0000096
proplastid_gene
A gene from proplastid sequence.
SO:ke
A gene from nucleomorph sequence.
nucleomorph gene
sequence
SO:0000097
nucleomorph_gene
A gene from nucleomorph sequence.
SO:xp
A gene from plasmid sequence.
plasmid gene
sequence
SO:0000098
plasmid_gene
A gene from plasmid sequence.
SO:xp
A gene from proviral sequence.
proviral gene
sequence
SO:0000099
proviral_gene
A gene from proviral sequence.
SO:xp
A proviral gene originating from an endogenous retrovirus.
endogenous retroviral gene
sequence
SO:0000100
endogenous_retroviral_gene
A proviral gene originating from an endogenous retrovirus.
SO:xp
A transposon or insertion sequence. An element that can insert in a variety of DNA sequences.
http://en.wikipedia.org/wiki/Transposable_element
transposable element
transposon
sequence
SO:0000101
transposable_element
A transposon or insertion sequence. An element that can insert in a variety of DNA sequences.
http://www.sci.sdsu.edu/~smaloy/Glossary/T.html
http://en.wikipedia.org/wiki/Transposable_element
wiki
A match to an EST or cDNA sequence.
expressed sequence match
sequence
SO:0000102
expressed_sequence_match
A match to an EST or cDNA sequence.
SO:ke
The end of the clone insert.
clone insert end
sequence
SO:0000103
clone_insert_end
The end of the clone insert.
SO:ke
A sequence of amino acids linked by peptide bonds which may lack appreciable tertiary structure and may not be liable to irreversible denaturation.
SO:0000358
http://en.wikipedia.org/wiki/Polypeptide
protein
sequence
SO:0000104
This term is mapped to MGED. Do not obsolete without consulting MGED ontology. The term 'protein' was merged with 'polypeptide'. Although 'protein' was a sequence_attribute and therefore meant to describe the quality rather than an actual feature, it was being used erroneously. It is replaced by 'peptidyl' as the polymer attribute.
polypeptide
A sequence of amino acids linked by peptide bonds which may lack appreciable tertiary structure and may not be liable to irreversible denaturation.
SO:ma
http://en.wikipedia.org/wiki/Polypeptide
wiki
A region of the chromosome between the centromere and the telomere. Human chromosomes have two arms, the p arm (short) and the q arm (long) which are separated from each other by the centromere.
chromosome arm
sequence
SO:0000105
chromosome_arm
A region of the chromosome between the centromere and the telomere. Human chromosomes have two arms, the p arm (short) and the q arm (long) which are separated from each other by the centromere.
http://www.medterms.com/script/main/art.asp?articlekey=5152
sequence
SO:0000106
non_capped_primary_transcript
true
A single stranded oligo used for polymerase chain reaction.
sequencing primer
sequence
SO:0000107
sequencing_primer
An mRNA with a frameshift.
frameshifted mRNA
mRNA with frameshift
sequence
SO:0000108
mRNA_with_frameshift
An mRNA with a frameshift.
SO:xp
A sequence_variant is a non exact copy of a sequence_feature or genome exhibiting one or more sequence_alteration.
sequence
mutation
SO:0000109
sequence_variant_obs
true
A sequence_variant is a non exact copy of a sequence_feature or genome exhibiting one or more sequence_alteration.
SO:ke
Any extent of continuous biological sequence.
INSDC_feature:misc_feature
INSDC_note:other
INSDC_note:sequence_feature
located_sequence_feature
sequence feature
sequence
located sequence feature
SO:0000110
sequence_feature
Any extent of continuous biological sequence.
LAMHDI:mb
SO:ke
A gene encoded within a transposable element. For example gag, int, env and pol are the transposable element genes of the TY element in yeast.
transposable element gene
sequence
SO:0000111
transposable_element_gene
A gene encoded within a transposable element. For example gag, int, env and pol are the transposable element genes of the TY element in yeast.
SO:ke
An oligo to which new deoxyribonucleotides can be added by DNA polymerase.
http://en.wikipedia.org/wiki/Primer_(molecular_biology)
DNA primer
primer oligonucleotide
primer polynucleotide
primer sequence
sequence
SO:0000112
primer
An oligo to which new deoxyribonucleotides can be added by DNA polymerase.
SO:ke
http://en.wikipedia.org/wiki/Primer_(molecular_biology)
wiki
A viral sequence which has integrated into a host genome.
proviral region
sequence
proviral sequence
SO:0000113
proviral_region
A viral sequence which has integrated into a host genome.
SO:ke
A methylated deoxy-cytosine.
methylated C
methylated cytosine
methylated cytosine base
methylated cytosine residue
methylated_C
sequence
SO:0000114
methylated_cytosine
A methylated deoxy-cytosine.
SO:ke
sequence
SO:0000115
transcript_feature
true
An attribute describing a sequence that is modified by editing.
sequence
SO:0000116
edited
An attribute describing a sequence that is modified by editing.
SO:ke
sequence
SO:0000117
transcript_with_readthrough_stop_codon
true
A transcript with a translational frameshift.
transcript with translational frameshift
sequence
SO:0000118
transcript_with_translational_frameshift
A transcript with a translational frameshift.
SO:xp
An attribute to describe a sequence that is regulated.
sequence
SO:0000119
regulated
An attribute to describe a sequence that is regulated.
SO:ke
A primary transcript that, at least in part, encodes one or more proteins.
protein coding primary transcript
sequence
pre mRNA
SO:0000120
May contain introns.
protein_coding_primary_transcript
A primary transcript that, at least in part, encodes one or more proteins.
SO:ke
A single stranded oligo used for polymerase chain reaction.
DNA forward primer
forward DNA primer
forward primer
forward primer oligo
forward primer oligonucleotide
forward primer polynucleotide
forward primer sequence
sequence
SO:0000121
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
forward_primer
A single stranded oligo used for polymerase chain reaction.
http://mged.sourceforge.net/ontologies/MGEDontology.php
A folded RNA sequence.
RNA sequence secondary structure
sequence
SO:0000122
RNA_sequence_secondary_structure
A folded RNA sequence.
SO:ke
An attribute describing a gene that is regulated at transcription.
transcriptionally regulated
sequence
SO:0000123
By:<protein_id>.
transcriptionally_regulated
An attribute describing a gene that is regulated at transcription.
SO:ma
Expressed in relatively constant amounts without regard to cellular environmental conditions such as the concentration of a particular substrate.
transcriptionally constitutive
sequence
SO:0000124
transcriptionally_constitutive
Expressed in relatively constant amounts without regard to cellular environmental conditions such as the concentration of a particular substrate.
SO:ke
An inducer molecule is required for transcription to occur.
transcriptionally induced
sequence
SO:0000125
transcriptionally_induced
An inducer molecule is required for transcription to occur.
SO:ke
A repressor molecule is required for transcription to stop.
transcriptionally repressed
sequence
SO:0000126
transcriptionally_repressed
A repressor molecule is required for transcription to stop.
SO:ke
A gene that is silenced.
silenced gene
sequence
SO:0000127
silenced_gene
A gene that is silenced.
SO:xp
A gene that is silenced by DNA modification.
gene silenced by DNA modification
sequence
SO:0000128
gene_silenced_by_DNA_modification
A gene that is silenced by DNA modification.
SO:xp
A gene that is silenced by DNA methylation.
gene silenced by DNA methylation
methylation-silenced gene
sequence
SO:0000129
gene_silenced_by_DNA_methylation
A gene that is silenced by DNA methylation.
SO:xp
An attribute describing a gene that is regulated after it has been translated.
post translationally regulated
post-translationally regulated
sequence
SO:0000130
post_translationally_regulated
An attribute describing a gene that is regulated after it has been translated.
SO:ke
An attribute describing a gene that is regulated as it is translated.
translationally regulated
sequence
SO:0000131
translationally_regulated
An attribute describing a gene that is regulated as it is translated.
SO:ke
A single stranded oligo used for polymerase chain reaction.
DNA reverse primer
reverse DNA primer
reverse primer
reverse primer oligo
reverse primer oligonucleotide
reverse primer sequence
sequence
SO:0000132
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
reverse_primer
A single stranded oligo used for polymerase chain reaction.
http://mged.sourceforge.net/ontologies/MGEDontology.php
This attribute describes a gene where heritable changes other than those in the DNA sequence occur. These changes include: modification to the DNA (such as DNA methylation, the covalent modification of cytosine), and post-translational modification of histones.
epigenetically modified
sequence
SO:0000133
epigenetically_modified
This attribute describes a gene where heritable changes other than those in the DNA sequence occur. These changes include: modification to the DNA (such as DNA methylation, the covalent modification of cytosine), and post-translational modification of histones.
SO:ke
Imprinted genes are epigenetically modified genes that are expressed monoallelically according to their parent of origin.
imprinted
http://en.wikipedia.org/wiki/Genomic_imprinting
genomically imprinted
sequence
SO:0000134
genomically_imprinted
Imprinted genes are epigenetically modified genes that are expressed monoallelically according to their parent of origin.
SO:ke
http://en.wikipedia.org/wiki/Genomic_imprinting
wiki
The maternal copy of the gene is modified, rendering it transcriptionally silent.
maternally imprinted
sequence
SO:0000135
maternally_imprinted
The maternal copy of the gene is modified, rendering it transcriptionally silent.
SO:ke
The paternal copy of the gene is modified, rendering it transcriptionally silent.
paternally imprinted
sequence
SO:0000136
paternally_imprinted
The paternal copy of the gene is modified, rendering it transcriptionally silent.
SO:ke
Allelic exclusion is a process occurring in diploid organisms, where a gene is inactivated and not expressed in that cell.
allelically excluded
sequence
SO:0000137
Examples are x-inactivation and immunoglobulin formation.
allelically_excluded
Allelic exclusion is a process occurring in diploid organisms, where a gene is inactivated and not expressed in that cell.
SO:ke
An epigenetically modified gene, rearranged at the DNA level.
gene rearranged at DNA level
sequence
SO:0000138
gene_rearranged_at_DNA_level
An epigenetically modified gene, rearranged at the DNA level.
SO:xp
Region in mRNA where ribosome assembles.
INSDC_feature:regulatory
INSDC_qualifier:ribosome_binding_site
ribosome entry site
sequence
SO:0000139
ribosome_entry_site
Region in mRNA where ribosome assembles.
SO:ke
A sequence segment located within the five prime end of an mRNA that causes premature termination of translation.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Attenuator
INSDC_qualifier:attenuator
attenuator sequence
sequence
SO:0000140
attenuator
A sequence segment located within the five prime end of an mRNA that causes premature termination of translation.
SO:as
http://en.wikipedia.org/wiki/Attenuator
wiki
The sequence of DNA located at the end of the transcript that causes RNA polymerase to terminate transcription.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Terminator_(genetics)
INSDC_qualifier:terminator
terminator sequence
sequence
SO:0000141
Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527.
terminator
The sequence of DNA located at the end of the transcript that causes RNA polymerase to terminate transcription.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Terminator_(genetics)
wiki
A folded DNA sequence.
DNA sequence secondary structure
sequence
SO:0000142
DNA_sequence_secondary_structure
A folded DNA sequence.
SO:ke
A region of known length which may be used to manufacture a longer region.
assembly component
sequence
SO:0000143
assembly_component
A region of known length which may be used to manufacture a longer region.
SO:ke
sequence
SO:0000144
primary_transcript_attribute
true
A codon that has been redefined at translation. The redefinition may be as a result of translational bypass, translational frameshifting or stop codon readthrough.
recoded codon
sequence
SO:0000145
recoded_codon
A codon that has been redefined at translation. The redefinition may be as a result of translational bypass, translational frameshifting or stop codon readthrough.
SO:xp
An attribute describing when a sequence, usually an mRNA, is capped by the addition of a modified guanine nucleotide at the 5' end.
sequence
SO:0000146
capped
An attribute describing when a sequence, usually an mRNA, is capped by the addition of a modified guanine nucleotide at the 5' end.
SO:ke
A region of the transcript sequence within a gene which is not removed from the primary RNA transcript by RNA splicing.
http://en.wikipedia.org/wiki/Exon
INSDC_feature:exon
sequence
SO:0000147
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
exon
A region of the transcript sequence within a gene which is not removed from the primary RNA transcript by RNA splicing.
SO:ke
http://en.wikipedia.org/wiki/Exon
wiki
One or more contigs that have been ordered and oriented using end-read information. Contains gaps that are filled with N's.
sequence
scaffold
SO:0000148
supercontig
One or more contigs that have been ordered and oriented using end-read information. Contains gaps that are filled with N's.
SO:ls
A contiguous sequence derived from sequence assembly. Has no gaps, but may contain N's from unavailable bases.
http://en.wikipedia.org/wiki/Contig
sequence
SO:0000149
contig
A contiguous sequence derived from sequence assembly. Has no gaps, but may contain N's from unavailable bases.
SO:ls
http://en.wikipedia.org/wiki/Contig
wiki
A sequence obtained from a single sequencing experiment. Typically a read is produced when a base calling program interprets information from a chromatogram trace file produced from a sequencing machine.
sequence
SO:0000150
read
A sequence obtained from a single sequencing experiment. Typically a read is produced when a base calling program interprets information from a chromatogram trace file produced from a sequencing machine.
SO:rd
A piece of DNA that has been inserted in a vector so that it can be propagated in a host bacterium or some other organism.
http://en.wikipedia.org/wiki/Clone_(genetics)
sequence
SO:0000151
clone
A piece of DNA that has been inserted in a vector so that it can be propagated in a host bacterium or some other organism.
SO:ke
http://en.wikipedia.org/wiki/Clone_(genetics)
wiki
Yeast Artificial Chromosome, a vector constructed from the telomeric, centromeric, and replication origin sequences needed for replication in yeast cells.
yeast artificial chromosome
sequence
SO:0000152
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
YAC
Yeast Artificial Chromosome, a vector constructed from the telomeric, centromeric, and replication origin sequences needed for replication in yeast cells.
SO:ma
Bacterial Artificial Chromosome, a cloning vector that can be propagated as mini-chromosomes in a bacterial host.
bacterial artificial chromosome
sequence
SO:0000153
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
BAC
Bacterial Artificial Chromosome, a cloning vector that can be propagated as mini-chromosomes in a bacterial host.
SO:ma
P1-derived artificial chromosomes are DNA constructs that are derived from the DNA of P1 bacteriophage. They can carry large amounts (about 100-300 kilobases) of other sequences for a variety of bioengineering purposes. They are one type of vector used to clone DNA fragments (100- to 300-kb insert size; average, 150 kb) in Escherichia coli cells.
http://en.wikipedia.org/wiki/P1-derived_artificial_chromosome
P1
P1 artificial chromosome
sequence
SO:0000154
This term is mapped to MGED. Do not obsolete without consulting MGED ontology. Drosophila melanogaster PACs carry an average insert size of 80 kb. The library represents a 6-fold coverage of the genome.
PAC
P1-derived artificial chromosomes are DNA constructs that are derived from the DNA of P1 bacteriophage. They can carry large amounts (about 100-300 kilobases) of other sequences for a variety of bioengineering purposes. They are one type of vector used to clone DNA fragments (100- to 300-kb insert size; average, 150 kb) in Escherichia coli cells.
http://en.wikipedia.org/wiki/P1-derived_artificial_chromosome
http://en.wikipedia.org/wiki/P1-derived_artificial_chromosome
wiki
A self-replicating, often circular nucleic acid molecule that uses the host's cellular machinery and is distinct from a chromosome in the organism.
plasmid sequence
sequence
SO:0000155
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
plasmid
A self-replicating, often circular nucleic acid molecule that uses the host's cellular machinery and is distinct from a chromosome in the organism.
SO:ma
A cloning vector that is a hybrid of lambda phage and a plasmid; it can be propagated as a plasmid or packaged as a phage, since it retains the lambda cos sites.
http://en.wikipedia.org/wiki/Cosmid
cosmid vector
sequence
SO:0000156
Paper: Evans GA et al. High efficiency vectors for cosmid microcloning and genomic analysis. Gene 1989; 79(1):9-20. This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
cosmid
A cloning vector that is a hybrid of lambda phage and a plasmid; it can be propagated as a plasmid or packaged as a phage, since it retains the lambda cos sites.
SO:ma
http://en.wikipedia.org/wiki/Cosmid
wiki
A plasmid which carries within its sequence a bacteriophage replication origin. When the host bacterium is infected with "helper" phage, a phagemid is replicated along with the phage DNA and packaged into phage capsids.
http://en.wikipedia.org/wiki/Phagemid
sequence
phagemid vector
SO:0000157
phagemid
A plasmid which carries within its sequence a bacteriophage replication origin. When the host bacterium is infected with "helper" phage, a phagemid is replicated along with the phage DNA and packaged into phage capsids.
SO:ma
http://en.wikipedia.org/wiki/Phagemid
wiki
A cloning vector that utilizes the E. coli F factor.
http://en.wikipedia.org/wiki/Fosmid
sequence
fosmid vector
SO:0000158
Birren BW et al. A human chromosome 22 fosmid resource: mapping and analysis of 96 clones. Genomics 1996.
fosmid
A cloning vector that utilizes the E. coli F factor.
SO:ma
http://en.wikipedia.org/wiki/Fosmid
wiki
The point at which one or more contiguous nucleotides were excised.
SO:1000033
http://en.wikipedia.org/wiki/Nucleotide_deletion
loinc:LA6692-3
deleted_sequence
nucleotide deletion
nucleotide_deletion
sequence
SO:0000159
deletion
The point at which one or more contiguous nucleotides were excised.
SO:ke
http://en.wikipedia.org/wiki/Nucleotide_deletion
wiki
loinc:LA6692-3
Deletion
A linear clone derived from lambda bacteriophage. The genes involved in the lysogenic pathway are removed from the viral DNA. Up to 25 kb of foreign DNA can then be inserted into the lambda genome.
sequence
SO:0000160
lambda_clone
true
A linear clone derived from lambda bacteriophage. The genes involved in the lysogenic pathway are removed from the viral DNA. Up to 25 kb of foreign DNA can then be inserted into the lambda genome.
ISBN:0-1767-2380-8
A modified base in which adenine has been methylated.
methylated A
methylated adenine
methylated adenine base
methylated adenine residue
methylated_A
sequence
SO:0000161
methylated_adenine
A modified base in which adenine has been methylated.
SO:ke
Consensus region of primary transcript bordering junction of splicing. A region that overlaps exactly 2 bases and is adjacent_to splice_junction.
http://en.wikipedia.org/wiki/Splice_site
splice site
sequence
SO:0000162
With spliceosomal introns, the splice sites bind the spliceosomal machinery.
splice_site
Consensus region of primary transcript bordering junction of splicing. A region that overlaps exactly 2 bases and is adjacent_to splice_junction.
SO:cjm
SO:ke
http://en.wikipedia.org/wiki/Splice_site
wiki
Intronic 2 bp region bordering the exon, at the 5' edge of the intron. A splice_site that is downstream_adjacent_to exon and starts intron.
5' splice site
donor splice site
five prime splice site
splice donor site
sequence
donor
SO:0000163
five_prime_cis_splice_site
Intronic 2 bp region bordering the exon, at the 5' edge of the intron. A splice_site that is downstream_adjacent_to exon and starts intron.
SO:cjm
SO:ke
http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html
Intronic 2 bp region bordering the exon, at the 3' edge of the intron. A splice_site that is upstream_adjacent_to exon and finishes intron.
acceptor splice site
splice acceptor site
three prime splice site
sequence
3' splice site
acceptor
SO:0000164
three_prime_cis_splice_site
Intronic 2 bp region bordering the exon, at the 3' edge of the intron. A splice_site that is upstream_adjacent_to exon and finishes intron.
SO:cjm
SO:ke
http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html
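The two cis splice site definitions above describe 2 bp intronic regions at each edge of the intron. A minimal sketch of extracting them from a sequence follows; the function name `cis_splice_sites`, the 0-based half-open coordinate convention, and the toy sequence are assumptions made for illustration only.

```python
def cis_splice_sites(sequence: str, intron_start: int, intron_end: int) -> tuple[str, str]:
    """Return the (five_prime, three_prime) 2 bp splice sites of one intron.

    Assumption: the intron occupies sequence[intron_start:intron_end]
    (0-based, end-exclusive) and is read 5' to 3'.
    """
    if not 0 <= intron_start < intron_end <= len(sequence):
        raise ValueError("intron coordinates fall outside the sequence")
    if intron_end - intron_start < 4:
        raise ValueError("intron too short to hold two distinct 2 bp sites")

    donor = sequence[intron_start:intron_start + 2]   # 5' edge of the intron
    acceptor = sequence[intron_end - 2:intron_end]    # 3' edge of the intron
    return donor, acceptor


# Toy example: two exons flanking an intron with the common GT...AG boundaries.
exon1, intron, exon2 = "ATGGCC", "GTAAGTTTTCAG", "GGATAA"
seq = exon1 + intron + exon2
print(cis_splice_sites(seq, len(exon1), len(exon1) + len(intron)))  # ('GT', 'AG')
```

Both returned regions lie inside the intron, consistent with the "intronic 2 bp region" wording of the definitions.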
A cis-acting sequence that increases the utilization of (some) eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Enhancer_(genetics)
INSDC_qualifier:enhancer
sequence
SO:0000165
An enhancer may participate in an enhanceosome (GO:0034206), a protein-DNA complex formed by the association of a distinct set of general and specific transcription factors with a region of enhancer DNA. The cooperative assembly of an enhanceosome confers specificity of transcriptional regulation. This comment is a placeholder in case we start to make cross products with GO.
enhancer
A cis-acting sequence that increases the utilization of (some) eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Enhancer_(genetics)
wiki
An enhancer bound by a factor.
enhancer bound by factor
sequence
SO:0000166
enhancer_bound_by_factor
An enhancer bound by a factor.
SO:xp
A regulatory_region composed of the TSS(s) and binding sites for TF_complexes of the core transcription machinery. A region (DNA) to which RNA polymerase binds, to begin transcription.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Promoter
INSDC_qualifier:promoter
promoter sequence
sequence
SO:0000167
This term is mapped to MGED. Do not obsolete without consulting MGED ontology. The region on a DNA molecule involved in RNA polymerase binding to initiate transcription. Moved from is_a: SO:0001055 transcriptional_cis_regulatory_region as per request from GREEKC initiative in August 2020. Merged with RNA_polymerase_promoter (SO:0001203) Aug 2020. Moved up one level from is_a CRM (SO:0000727) to is_a transcriptional_cis_regulatory_region (SO:0001055) as part of the GREEKC work January 2021. Pascale Gaudet from Gene Ontology pointed out that CRM can be located upstream of the promoter and therefore cannot include the promoter.
promoter
A regulatory_region composed of the TSS(s) and binding sites for TF_complexes of the core transcription machinery. A region (DNA) to which RNA polymerase binds, to begin transcription.
SO:regcreative
http://en.wikipedia.org/wiki/Promoter
wiki
A specific nucleotide sequence of DNA at or near which a particular restriction enzyme cuts the DNA.
sequence
SO:0000168
restriction_enzyme_cut_site
true
A specific nucleotide sequence of DNA at or near which a particular restriction enzyme cuts the DNA.
SO:ma
A DNA sequence in eukaryotic DNA to which RNA polymerase I binds, to begin transcription.
RNA polymerase A promoter
RNApol I promoter
pol I promoter
polymerase I promoter
sequence
SO:0000169
parent term RNA_polymerase_promoter SO:0001203 was obsoleted in Aug 2020, so term has been moved to eukaryotic_promoter SO:0002221.
RNApol_I_promoter
A DNA sequence in eukaryotic DNA to which RNA polymerase I binds, to begin transcription.
SO:ke
A DNA sequence in eukaryotic DNA to which RNA polymerase II binds, to begin transcription.
RNA polymerase B promoter
RNApol II promoter
polymerase II promoter
sequence
pol II promoter
SO:0000170
parent term RNA_polymerase_promoter SO:0001203 was obsoleted in Aug 2020, so term has been moved to eukaryotic_promoter SO:0002221.
RNApol_II_promoter
A DNA sequence in eukaryotic DNA to which RNA polymerase II binds, to begin transcription.
SO:ke
A DNA sequence in eukaryotic DNA to which RNA polymerase III binds, to begin transcription.
RNA polymerase C promoter
RNApol III promoter
pol III promoter
polymerase III promoter
sequence
SO:0000171
parent term RNA_polymerase_promoter SO:0001203 was obsoleted in Aug 2020, so term has been moved to eukaryotic_promoter SO:0002221.
RNApol_III_promoter
A DNA sequence in eukaryotic DNA to which RNA polymerase III binds, to begin transcription.
SO:ke
Part of a conserved sequence located about 75-bp upstream of the start point of eukaryotic transcription units which may be involved in RNA polymerase binding; consensus=GG(C|T)CAATCT.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/CAAT_box
CAAT box
CAAT signal
CAAT-box
INSDC_qualifier:CAAT_signal
sequence
SO:0000172
CAAT_signal
Part of a conserved sequence located about 75-bp upstream of the start point of eukaryotic transcription units which may be involved in RNA polymerase binding; consensus=GG(C|T)CAATCT.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/CAAT_box
wiki
A conserved GC-rich region located upstream of the start point of eukaryotic transcription units which may occur in multiple copies or in either orientation; consensus=GGGCGG.
INSDC_feature:regulatory
GC rich promoter region
GC-rich region
INSDC_qualifier:GC_rich_promoter_region
sequence
SO:0000173
GC_rich_promoter_region
A conserved GC-rich region located upstream of the start point of eukaryotic transcription units which may occur in multiple copies or in either orientation; consensus=GGGCGG.
http://www.insdc.org/files/feature_table.html
A conserved AT-rich heptamer found about 25-bp before the start point of many eukaryotic RNA polymerase II transcript units; may be involved in positioning the enzyme for correct initiation; consensus=TATA(A|T)A(A|T).
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/TATA_box
Goldstein-Hogness box
INSDC_qualifier:TATA_box
TATA box
sequence
SO:0000174
Binds TBP.
TATA_box
A conserved AT-rich heptamer found about 25-bp before the start point of many eukaryotic RNA polymerase II transcript units; may be involved in positioning the enzyme for correct initiation; consensus=TATA(A|T)A(A|T).
PMID:16858867
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/TATA_box
wiki
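The consensus given in the TATA_box definition can be read as a regular expression. The following is a minimal sketch in Python, assuming each (A|T) alternation maps to the character class [AT] and that only the given strand is scanned. The function name find_tata_candidates and the example sequence are illustrative and are not part of SO.

```python
import re

# Consensus from the TATA_box definition: TATA(A|T)A(A|T).
# Assumption: (A|T) is rendered as the character class [AT].
# The lookahead lets overlapping candidates be reported as well.
TATA_CONSENSUS = re.compile(r"(?=(TATA[AT]A[AT]))")


def find_tata_candidates(seq):
    """Return (0-based start, matched heptamer) for every consensus match."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in TATA_CONSENSUS.finditer(seq)]


# Hypothetical example sequence, made up for illustration.
example = "GGCGTATAAAAGGCTATATATGC"
hits = find_tata_candidates(example)
```

A match only marks a candidate heptamer. The positional part of the definition (about 25 bp before the start point) would need a known transcription start site, which this sketch does not model.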
A conserved region about 10-bp upstream of the start point of bacterial transcription units which may be involved in binding RNA polymerase; consensus=TAtAaT. This region is associated with sigma factor 70.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Pribnow_box
-10 signal
INSDC_qualifier:minus_10_signal
Pribnow Schaller box
Pribnow box
Pribnow-Schaller box
minus 10 signal
sequence
SO:0000175
Changed from is_a SO:0000713 DNA_motif to is_a SO:0002312 core_prokaryotic_promoter_element in response to GREEKC Initiative Dave Sant Aug 2020. Changed from is_a SO:0002312 core_prokaryotic_promoter_element back to is_a SO:0000713 DNA_motif to be consistent with minus_12_signal and minus_24_signal on 12 July 2021.
minus_10_signal
A conserved region about 10-bp upstream of the start point of bacterial transcription units which may be involved in binding RNA polymerase; consensus=TAtAaT. This region is associated with sigma factor 70.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Pribnow_box
wiki
A conserved hexamer about 35-bp upstream of the start point of bacterial transcription units; consensus=TTGACa or TGTTGACA. This region is associated with sigma factor 70.
INSDC_feature:regulatory
-35 signal
INSDC_qualifier:minus_35_signal
minus 35 signal
sequence
SO:0000176
Changed from is_a SO:0000713 DNA_motif to is_a SO:0002312 core_prokaryotic_promoter_element in response to GREEKC Initiative Dave Sant Aug 2020. Changed from is_a SO:0002312 core_prokaryotic_promoter_element back to is_a SO:0000713 DNA_motif to be consistent with minus_12_signal and minus_24_signal on 12 July 2021.
minus_35_signal
A conserved hexamer about 35-bp upstream of the start point of bacterial transcription units; consensus=TTGACa or TGTTGACA. This region is associated with sigma factor 70.
http://www.insdc.org/files/feature_table.html
A nucleotide match against a sequence from another organism.
cross genome match
sequence
SO:0000177
cross_genome_match
A nucleotide match against a sequence from another organism.
SO:ma
The DNA region of a group of adjacent genes whose transcription is coordinated on one or several mutually overlapping transcription units transcribed in the same direction and sharing at least one gene.
http://en.wikipedia.org/wiki/Operon
INSDC_feature:operon
sequence
SO:0000178
This term is mapped to MGED. Do not obsolete without consulting MGED ontology. Definition updated per Mejia-Almonte et al., Redefining fundamental concepts of transcription initiation in prokaryotes, Aug 5 2020.
operon
The DNA region of a group of adjacent genes whose transcription is coordinated on one or several mutually overlapping transcription units transcribed in the same direction and sharing at least one gene.
SO:ma
http://en.wikipedia.org/wiki/Operon
wiki
The start of the clone insert.
clone insert start
sequence
SO:0000179
clone_insert_start
The start of the clone insert.
SO:ke
A transposable element that is incorporated into a chromosome by a mechanism that requires reverse transcriptase.
http://en.wikipedia.org/wiki/Retrotransposon
class I transposon
retrotransposon element
sequence
class I
SO:0000180
retrotransposon
A transposable element that is incorporated into a chromosome by a mechanism that requires reverse transcriptase.
http://www.dddmag.com/Glossary.aspx#r
http://en.wikipedia.org/wiki/Retrotransposon
wiki
A match against a translated sequence.
translated nucleotide match
sequence
SO:0000181
translated_nucleotide_match
A match against a translated sequence.
SO:ke
A transposon where the mechanism of transposition is via a DNA intermediate.
DNA transposon
class II transposon
sequence
class II
SO:0000182
DNA_transposon
A transposon where the mechanism of transposition is via a DNA intermediate.
SO:ke
A region of the gene which is not transcribed.
non transcribed region
non-transcribed sequence
nontranscribed region
nontranscribed sequence
sequence
SO:0000183
non_transcribed_region
A region of the gene which is not transcribed.
SO:ke
A major type of spliceosomal intron spliced by the U2 spliceosome, that includes U1, U2, U4/U6 and U5 snRNAs.
U2 intron
sequence
SO:0000184
May have either GT-AG or AT-AC 5' and 3' boundaries.
U2_intron
A major type of spliceosomal intron spliced by the U2 spliceosome, that includes U1, U2, U4/U6 and U5 snRNAs.
PMID:9428511
A transcript that in its initial state requires modification to be functional.
http://en.wikipedia.org/wiki/Primary_transcript
INSDC_feature:precursor_RNA
INSDC_feature:prim_transcript
precursor RNA
primary transcript
sequence
SO:0000185
primary_transcript
A transcript that in its initial state requires modification to be functional.
SO:ma
http://en.wikipedia.org/wiki/Primary_transcript
wiki
A retrotransposon flanked by long terminal repeat sequences.
LTR retrotransposon
long terminal repeat retrotransposon
sequence
SO:0000186
LTR_retrotransposon
A retrotransposon flanked by long terminal repeat sequences.
SO:ke
A group of characterized repeat sequences.
sequence
SO:0000187
repeat_family
true
A group of characterized repeat sequences.
SO:ke
A region of a primary transcript that is transcribed, but removed from within the transcript by splicing together the sequences (exons) on either side of it.
http://en.wikipedia.org/wiki/Intron
INSDC_feature:intron
sequence
SO:0000188
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
intron
A region of a primary transcript that is transcribed, but removed from within the transcript by splicing together the sequences (exons) on either side of it.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Intron
wiki
A retrotransposon without long terminal repeat sequences.
non LTR retrotransposon
sequence
SO:0000189
non_LTR_retrotransposon
A retrotransposon without long terminal repeat sequences.
SO:ke
An intron that is the most 5-prime in a given transcript.
5' intron
5' intron sequence
five prime intron
sequence
SO:0000190
five_prime_intron
An intron that is not the most 3-prime or the most 5-prime in a given transcript.
interior intron
sequence
SO:0000191
interior_intron
An intron that is the most 3-prime in a given transcript.
3' intron
three prime intron
sequence
3' intron sequence
SO:0000192
three_prime_intron
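The three positional intron terms above (five_prime_intron, interior_intron, three_prime_intron) can be assigned from an intron's rank within a transcript. The following is a minimal sketch in Python, assuming introns are numbered 5' to 3'. The function name classify_introns is hypothetical, and giving a lone intron both terminal labels is an interpretation of the definitions rather than something the ontology states.

```python
def classify_introns(n_introns):
    """Return, for each intron in 5'->3' order, the list of positional terms.

    Assumption: a transcript with a single intron gets both terminal
    labels, since that intron is at once the most 5' and the most 3'.
    """
    labels = []
    for idx in range(n_introns):
        terms = []
        if idx == 0:
            terms.append("five_prime_intron")
        if idx == n_introns - 1:
            terms.append("three_prime_intron")
        if not terms:
            # Neither the most 5' nor the most 3' intron.
            terms.append("interior_intron")
        labels.append(terms)
    return labels


three = classify_introns(3)
single = classify_introns(1)
```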
A DNA fragment used as a reagent to detect the polymorphic genomic loci by hybridizing against the genomic DNA digested with a given restriction enzyme.
http://en.wikipedia.org/wiki/Restriction_fragment_length_polymorphism
RFLP
RFLP fragment
restriction fragment length polymorphism
sequence
SO:0000193
RFLP_fragment
A DNA fragment used as a reagent to detect the polymorphic genomic loci by hybridizing against the genomic DNA digested with a given restriction enzyme.
GOC:pj
http://en.wikipedia.org/wiki/Restriction_fragment_length_polymorphism
wiki
A dispersed repeat family with many copies, each from 1 to 6 kb long. New elements are generated by retroposition of a transcribed copy. Typically the LINE contains two ORFs, one of which encodes reverse transcriptase, and 3' and 5' direct repeats.
LINE
LINE element
Long interspersed element
Long interspersed nuclear element
sequence
SO:0000194
LINE_element
A dispersed repeat family with many copies, each from 1 to 6 kb long. New elements are generated by retroposition of a transcribed copy. Typically the LINE contains two ORFs, one of which encodes reverse transcriptase, and 3' and 5' direct repeats.
http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html
An exon whereby at least one base is part of a codon (here, 'codon' is inclusive of the stop_codon).
coding exon
sequence
SO:0000195
coding_exon
An exon whereby at least one base is part of a codon (here, 'codon' is inclusive of the stop_codon).
SO:ke
The sequence of the five_prime_coding_exon that codes for protein.
five prime exon coding region
sequence
SO:0000196
five_prime_coding_exon_coding_region
The sequence of the five_prime_coding_exon that codes for protein.
SO:cjm
The sequence of the three_prime_coding_exon that codes for protein.
three prime exon coding region
sequence
SO:0000197
three_prime_coding_exon_coding_region
The sequence of the three_prime_coding_exon that codes for protein.
SO:cjm
An exon that does not contain any codons.
noncoding exon
sequence
SO:0000198
noncoding_exon
An exon that does not contain any codons.
SO:ke
A region of nucleotide sequence that has translocated to a new position. The observed adjacency of two previously separated regions.
translocated sequence
sequence
transchr
SO:0000199
translocation
A region of nucleotide sequence that has translocated to a new position. The observed adjacency of two previously separated regions.
NCBI:th
SO:ke
transchr
http://www.ncbi.nlm.nih.gov/dbvar/
The 5' most coding exon.
5' coding exon
five prime coding exon
sequence
SO:0000200
five_prime_coding_exon
The 5' most coding exon.
SO:ke
An exon that is bounded by 5' and 3' splice sites.
interior exon
sequence
SO:0000201
interior_exon
An exon that is bounded by 5' and 3' splice sites.
PMID:10373547
The coding exon that is most 3-prime on a given transcript.
three prime coding exon
sequence
3' coding exon
SO:0000202
three_prime_coding_exon
The coding exon that is most 3-prime on a given transcript.
SO:ma
Messenger RNA sequences that are untranslated and lie five prime or three prime to sequences which are translated.
untranslated region
sequence
SO:0000203
UTR
Messenger RNA sequences that are untranslated and lie five prime or three prime to sequences which are translated.
SO:ke
A region at the 5' end of a mature transcript (preceding the initiation codon) that is not translated into a protein.
http://en.wikipedia.org/wiki/5'_UTR
5' UTR
INSDC_feature:5'UTR
five prime UTR
five_prime_untranslated_region
sequence
SO:0000204
five_prime_UTR
A region at the 5' end of a mature transcript (preceding the initiation codon) that is not translated into a protein.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/5'_UTR
wiki
A region at the 3' end of a mature transcript (following the stop codon) that is not translated into a protein.
http://en.wikipedia.org/wiki/Three_prime_untranslated_region
INSDC_feature:3'UTR
three prime UTR
three prime untranslated region
sequence
SO:0000205
three_prime_UTR
A region at the 3' end of a mature transcript (following the stop codon) that is not translated into a protein.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Three_prime_untranslated_region
wiki
A repetitive element, a few hundred base pairs long, that is dispersed throughout the genome. A common human SINE is the Alu element.
http://en.wikipedia.org/wiki/Short_interspersed_nuclear_element
SINE element
Short interspersed element
Short interspersed nuclear element
sequence
SO:0000206
SINE_element
A repetitive element, a few hundred base pairs long, that is dispersed throughout the genome. A common human SINE is the Alu element.
SO:ke
http://en.wikipedia.org/wiki/Short_interspersed_nuclear_element
wiki
An SSLP is a kind of sequence alteration in which the number of repeated sequences in intergenic regions may differ.
http://en.wikipedia.org/wiki/Simple_sequence_length_polymorphism
simple sequence length variation
sequence
SSLP
simple sequence length polymorphism
SO:0000207
simple_sequence_length_variation
An SSLP is a kind of sequence alteration in which the number of repeated sequences in intergenic regions may differ.
SO:ke
http://en.wikipedia.org/wiki/Simple_sequence_length_polymorphism
wiki
A DNA transposable element defined as having termini with perfect, or nearly perfect, short inverted repeats, generally 10-40 nucleotides long.
TIR element
terminal inverted repeat element
sequence
SO:0000208
terminal_inverted_repeat_element
A DNA transposable element defined as having termini with perfect, or nearly perfect, short inverted repeats, generally 10-40 nucleotides long.
http://www.genetics.org/cgi/reprint/156/4/1983.pdf
A primary transcript encoding a ribosomal RNA.
rRNA primary transcript
ribosomal RNA primary transcript
sequence
SO:0000209
rRNA_primary_transcript
A primary transcript encoding a ribosomal RNA.
SO:ke
A primary transcript encoding a transfer RNA (SO:0000253).
tRNA primary transcript
sequence
SO:0000210
tRNA_primary_transcript
A primary transcript encoding a transfer RNA (SO:0000253).
SO:ke
A primary transcript encoding alanyl tRNA.
alanine tRNA primary transcript
sequence
SO:0000211
alanine_tRNA_primary_transcript
A primary transcript encoding alanyl tRNA.
SO:ke
A primary transcript encoding arginyl tRNA (SO:0001036).
arginine tRNA primary transcript
sequence
SO:0000212
arginine_tRNA_primary_transcript
A primary transcript encoding arginyl tRNA (SO:0001036).
SO:ke
A primary transcript encoding asparaginyl tRNA (SO:0000256).
asparagine tRNA primary transcript
sequence
SO:0000213
asparagine_tRNA_primary_transcript
A primary transcript encoding asparaginyl tRNA (SO:0000256).
SO:ke
A primary transcript encoding aspartyl tRNA (SO:0000257).
aspartic acid tRNA primary transcript
sequence
SO:0000214
aspartic_acid_tRNA_primary_transcript
A primary transcript encoding aspartyl tRNA (SO:0000257).
SO:ke
A primary transcript encoding cysteinyl tRNA (SO:0000258).
cysteine tRNA primary transcript
sequence
SO:0000215
cysteine_tRNA_primary_transcript
A primary transcript encoding cysteinyl tRNA (SO:0000258).
SO:ke
A primary transcript encoding glutamyl tRNA (SO:0000260).
glutamic acid tRNA primary transcript
sequence
SO:0000216
glutamic_acid_tRNA_primary_transcript
A primary transcript encoding glutamyl tRNA (SO:0000260).
SO:ke
A primary transcript encoding glutaminyl tRNA (SO:0000259).
glutamine tRNA primary transcript
sequence
SO:0000217
glutamine_tRNA_primary_transcript
A primary transcript encoding glutaminyl tRNA (SO:0000259).
SO:ke
A primary transcript encoding glycyl tRNA (SO:0000261).
glycine tRNA primary transcript
sequence
SO:0000218
glycine_tRNA_primary_transcript
A primary transcript encoding glycyl tRNA (SO:0000261).
SO:ke
A primary transcript encoding histidyl tRNA (SO:0000262).
histidine tRNA primary transcript
sequence
SO:0000219
histidine_tRNA_primary_transcript
A primary transcript encoding histidyl tRNA (SO:0000262).
SO:ke
A primary transcript encoding isoleucyl tRNA (SO:0000263).
isoleucine tRNA primary transcript
sequence
SO:0000220
isoleucine_tRNA_primary_transcript
A primary transcript encoding isoleucyl tRNA (SO:0000263).
SO:ke
A primary transcript encoding leucyl tRNA (SO:0000264).
leucine tRNA primary transcript
sequence
SO:0000221
leucine_tRNA_primary_transcript
A primary transcript encoding leucyl tRNA (SO:0000264).
SO:ke
A primary transcript encoding lysyl tRNA (SO:0000265).
lysine tRNA primary transcript
sequence
SO:0000222
lysine_tRNA_primary_transcript
A primary transcript encoding lysyl tRNA (SO:0000265).
SO:ke
A primary transcript encoding methionyl tRNA (SO:0000266).
methionine tRNA primary transcript
sequence
SO:0000223
methionine_tRNA_primary_transcript
A primary transcript encoding methionyl tRNA (SO:0000266).
SO:ke
A primary transcript encoding phenylalanyl tRNA (SO:0000267).
phenylalanine tRNA primary transcript
sequence
SO:0000224
phenylalanine_tRNA_primary_transcript
A primary transcript encoding phenylalanyl tRNA (SO:0000267).
SO:ke
A primary transcript encoding prolyl tRNA (SO:0000268).
proline tRNA primary transcript
sequence
SO:0000225
proline_tRNA_primary_transcript
A primary transcript encoding prolyl tRNA (SO:0000268).
SO:ke
A primary transcript encoding seryl tRNA (SO:0000269).
serine tRNA primary transcript
sequence
SO:0000226
serine_tRNA_primary_transcript
A primary transcript encoding seryl tRNA (SO:0000269).
SO:ke
A primary transcript encoding threonyl tRNA (SO:0000270).
threonine tRNA primary transcript
sequence
SO:0000227
threonine_tRNA_primary_transcript
A primary transcript encoding threonyl tRNA (SO:0000270).
SO:ke
A primary transcript encoding tryptophanyl tRNA (SO:0000271).
tryptophan tRNA primary transcript
sequence
SO:0000228
tryptophan_tRNA_primary_transcript
A primary transcript encoding tryptophanyl tRNA (SO:0000271).
SO:ke
A primary transcript encoding tyrosyl tRNA (SO:0000272).
tyrosine tRNA primary transcript
sequence
SO:0000229
tyrosine_tRNA_primary_transcript
A primary transcript encoding tyrosyl tRNA (SO:0000272).
SO:ke
A primary transcript encoding valyl tRNA (SO:0000273).
valine tRNA primary transcript
sequence
SO:0000230
valine_tRNA_primary_transcript
A primary transcript encoding valyl tRNA (SO:0000273).
SO:ke
A primary transcript encoding a small nuclear RNA (SO:0000274).
snRNA primary transcript
sequence
SO:0000231
snRNA_primary_transcript
A primary transcript encoding a small nuclear RNA (SO:0000274).
SO:ke
A primary transcript encoding one or more small nucleolar RNAs (SO:0000275).
snoRNA primary transcript
sequence
SO:0000232
This definition was broadened 26 Jan 2021 to reflect that a single transcript can encode one or more snoRNAs. Brought to our attention by FlyBase. GitHub Issue #520 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/520).
snoRNA_primary_transcript
A primary transcript encoding one or more small nucleolar RNAs (SO:0000275).
SO:ke
A transcript which has undergone the necessary modifications, if any, for its function. In eukaryotes this includes, for example, processing of introns, cleavage, base modification, and modifications to the 5' and/or the 3' ends, other than addition of bases. In bacteria functional mRNAs are usually not modified.
http://en.wikipedia.org/wiki/Mature_transcript
mature transcript
sequence
SO:0000233
A processed transcript cannot contain introns.
mature_transcript
A transcript which has undergone the necessary modifications, if any, for its function. In eukaryotes this includes, for example, processing of introns, cleavage, base modification, and modifications to the 5' and/or the 3' ends, other than addition of bases. In bacteria functional mRNAs are usually not modified.
SO:ke
http://en.wikipedia.org/wiki/Mature_transcript
wiki
Messenger RNA is the intermediate molecule between DNA and protein. It includes UTR and coding sequences. It does not contain introns.
http://en.wikipedia.org/wiki/MRNA
http://www.gencodegenes.org/gencode_biotypes.html
INSDC_feature:mRNA
messenger RNA
protein_coding_transcript
sequence
SO:0000234
An mRNA does not contain introns as it is a processed_transcript. The equivalent kind of primary_transcript is protein_coding_primary_transcript (SO:0000120) which may contain introns. This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
mRNA
Messenger RNA is the intermediate molecule between DNA and protein. It includes UTR and coding sequences. It does not contain introns.
SO:ma
http://en.wikipedia.org/wiki/MRNA
wiki
http://www.gencodegenes.org/gencode_biotypes.html
GENCODE
A DNA site where a transcription factor binds.
TF binding site
transcription factor binding site
sequence
SO:0000235
Definition updated along with definitions in Mejia-Almonte et al. PMID:32665585. Added relationship part_of SO:0000727 CRM in place of previous CRM relationship has_part TF_binding_site August 2020 in response to requests from GREEKC initiative. Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527.
TF_binding_site
A DNA site where a transcription factor binds.
SO:ke
The in-frame interval between the stop codons of a reading frame which, when read as sequential triplets, has the potential of encoding a sequential string of amino acids. TER(NNN)nTER.
open reading frame
sequence
SO:0000236
The definition was modified by Rama. ORF is defined by the sequence, whereas the CDS is defined according to whether a polypeptide is made. This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
ORF
The in-frame interval between the stop codons of a reading frame which, when read as sequential triplets, has the potential of encoding a sequential string of amino acids. TER(NNN)nTER.
SGD:rb
SO:ma
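The TER(NNN)nTER pattern in the ORF definition describes a run of in-frame triplets bounded by two stop codons. The following is a minimal sketch in Python, assuming the standard genetic code stop codons (TAA, TAG, TGA) and scanning only the forward strand. The function name orfs_between_stops and the min_codons parameter are hypothetical and are not part of SO.

```python
# Assumption: standard genetic code; other translation tables differ.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orfs_between_stops(seq, min_codons=1):
    """Return (frame, start, end) for in-frame intervals between stops.

    start and end are 0-based and half-open, and exclude both flanking
    stop codons. Only the three forward-strand frames are scanned.
    """
    seq = seq.upper()
    results = []
    for frame in range(3):
        prev_stop_end = None  # end of the most recent in-frame stop codon
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                if (prev_stop_end is not None
                        and (i - prev_stop_end) // 3 >= min_codons):
                    results.append((frame, prev_stop_end, i))
                prev_stop_end = i + 3
    return results


# Hypothetical example: TAA ATG GCC TGA in frame 0.
example = "TAAATGGCCTGA"
found = orfs_between_stops(example)
```

As the comment on this term notes, an ORF is defined by the sequence alone, so the sketch does not require a start codon or any evidence that a polypeptide is made.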
An attribute describing a transcript.
transcript attribute
sequence
SO:0000237
transcript_attribute
A transposable element with extensive secondary structure, characterized by large modular imperfect long inverted repeats.
foldback element
sequence
LVR element
long inverted repeat element
SO:0000238
foldback_element
A transposable element with extensive secondary structure, characterized by large modular imperfect long inverted repeats.
http://www.genetics.org/cgi/reprint/156/4/1983.pdf
The sequences extending on either side of a specific region.
flanking region
sequence
SO:0000239
flanking_region
The sequences extending on either side of a specific region.
SO:ke
A deviation in chromosome structure or number.
chromosome variation
sequence
SO:0000240
chromosome_variation
A UTR bordered by the terminal and initial codons of two CDSs in a polycistronic transcript. Every UTR is either 5', 3' or internal.
internal UTR
sequence
SO:0000241
internal_UTR
A UTR bordered by the terminal and initial codons of two CDSs in a polycistronic transcript. Every UTR is either 5', 3' or internal.
SO:cjm
The untranslated sequence separating the 'cistrons' of multicistronic mRNA.
untranslated region polycistronic mRNA
sequence
SO:0000242
untranslated_region_polycistronic_mRNA
The untranslated sequence separating the 'cistrons' of multicistronic mRNA.
SO:ke
Sequence element that recruits a ribosomal subunit to internal mRNA for translation initiation.
http://en.wikipedia.org/wiki/Internal_ribosome_entry_site
IRES
internal ribosomal entry sequence
internal ribosomal entry site
internal ribosome entry site
sequence
internal ribosome entry sequence
SO:0000243
internal_ribosome_entry_site
Sequence element that recruits a ribosomal subunit to internal mRNA for translation initiation.
SO:ke
http://en.wikipedia.org/wiki/Internal_ribosome_entry_site
wiki
sequence
4-cutter_restriction_site
four-cutter_restriction_site
SO:0000244
four_cutter_restriction_site
true
sequence
SO:0000245
mRNA_by_polyadenylation_status
true
An attribute describing the addition of a poly A tail to the 3' end of an mRNA molecule.
sequence
SO:0000246
polyadenylated
An attribute describing the addition of a poly A tail to the 3' end of an mRNA molecule.
SO:ke
sequence
SO:0000247
mRNA_not_polyadenylated
true
A kind of sequence alteration where the number of copies of a region present varies across a population.
sequence length alteration
sequence
SO:0000248
sequence_length_alteration
A kind of sequence alteration where the number of copies of a region present varies across a population.
SO:ke
sequence
6-cutter_restriction_site
six-cutter_restriction_site
SO:0000249
six_cutter_restriction_site
true
A post-transcriptionally modified base.
modified RNA base feature
sequence
SO:0000250
modified_RNA_base_feature
A post-transcriptionally modified base.
SO:ke
sequence
8-cutter_restriction_site
eight-cutter_restriction_site
SO:0000251
eight_cutter_restriction_site
true
rRNA is an RNA component of a ribosome that can provide both structural scaffolding and catalytic activity.
INSDC_qualifier:unknown
http://en.wikipedia.org/wiki/RRNA
INSDC_feature:rRNA
ribosomal RNA
ribosomal ribonucleic acid
sequence
SO:0000252
Definition updated 10 June 2021 as part of restructuring rRNA terms and reforming definitions to have similar structures. Request from EBI. See GitHub Issue #493
rRNA
rRNA is an RNA component of a ribosome that can provide both structural scaffolding and catalytic activity.
ISBN:0198506732
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/RRNA
wiki
Transfer RNA (tRNA) molecules are approximately 80 nucleotides in length. Their secondary structure includes four short double-helical elements and three loops (D, anti-codon, and T loops). Further hydrogen bonds mediate the characteristic L-shaped molecular structure. Transfer RNAs have two regions of fundamental functional importance: the anti-codon, which is responsible for specific mRNA codon recognition, and the 3' end, to which the tRNA's corresponding amino acid is attached (by aminoacyl-tRNA synthetases). Transfer RNAs cope with the degeneracy of the genetic code in two manners: having more than one tRNA (with a specific anti-codon) for a particular amino acid; and 'wobble' base-pairing, i.e. permitting non-standard base-pairing at the 3rd anti-codon position.
INSDC_qualifier:unknown
http://en.wikipedia.org/wiki/TRNA
INSDC_feature:tRNA
sequence
transfer RNA
transfer ribonucleic acid
SO:0000253
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
tRNA
Transfer RNA (tRNA) molecules are approximately 80 nucleotides in length. Their secondary structure includes four short double-helical elements and three loops (D, anti-codon, and T loops). Further hydrogen bonds mediate the characteristic L-shaped molecular structure. Transfer RNAs have two regions of fundamental functional importance: the anti-codon, which is responsible for specific mRNA codon recognition, and the 3' end, to which the tRNA's corresponding amino acid is attached (by aminoacyl-tRNA synthetases). Transfer RNAs cope with the degeneracy of the genetic code in two manners: having more than one tRNA (with a specific anti-codon) for a particular amino acid; and 'wobble' base-pairing, i.e. permitting non-standard base-pairing at the 3rd anti-codon position.
ISBN:0198506732
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00005
http://en.wikipedia.org/wiki/TRNA
wiki
A tRNA sequence that has an alanine anticodon, and a 3' alanine binding region.
alanyl tRNA
alanyl-transfer RNA
alanyl-transfer ribonucleic acid
sequence
SO:0000254
alanyl_tRNA
A tRNA sequence that has an alanine anticodon, and a 3' alanine binding region.
SO:ke
A primary transcript encoding a small ribosomal subunit RNA.
rRNA small subunit primary transcript
sequence
SO:0000255
rRNA_small_subunit_primary_transcript
A primary transcript encoding a small ribosomal subunit RNA.
SO:ke
A tRNA sequence that has an asparagine anticodon, and a 3' asparagine binding region.
asparaginyl tRNA
asparaginyl-transfer RNA
asparaginyl-transfer ribonucleic acid
sequence
SO:0000256
asparaginyl_tRNA
A tRNA sequence that has an asparagine anticodon, and a 3' asparagine binding region.
SO:ke
A tRNA sequence that has an aspartic acid anticodon, and a 3' aspartic acid binding region.
aspartyl tRNA
aspartyl-transfer RNA
aspartyl-transfer ribonucleic acid
sequence
SO:0000257
aspartyl_tRNA
A tRNA sequence that has an aspartic acid anticodon, and a 3' aspartic acid binding region.
SO:ke
A tRNA sequence that has a cysteine anticodon, and a 3' cysteine binding region.
cysteinyl tRNA
cysteinyl-transfer RNA
cysteinyl-transfer ribonucleic acid
sequence
SO:0000258
cysteinyl_tRNA
A tRNA sequence that has a cysteine anticodon, and a 3' cysteine binding region.
SO:ke
A tRNA sequence that has a glutamine anticodon, and a 3' glutamine binding region.
glutaminyl tRNA
glutaminyl-transfer RNA
glutaminyl-transfer ribonucleic acid
sequence
SO:0000259
glutaminyl_tRNA
A tRNA sequence that has a glutamine anticodon, and a 3' glutamine binding region.
SO:ke
A tRNA sequence that has a glutamic acid anticodon, and a 3' glutamic acid binding region.
glutamyl tRNA
glutamyl-transfer ribonucleic acid
sequence
glutamyl-transfer RNA
SO:0000260
glutamyl_tRNA
A tRNA sequence that has a glutamic acid anticodon, and a 3' glutamic acid binding region.
SO:ke
A tRNA sequence that has a glycine anticodon, and a 3' glycine binding region.
glycyl tRNA
sequence
glycyl-transfer RNA
glycyl-transfer ribonucleic acid
SO:0000261
glycyl_tRNA
A tRNA sequence that has a glycine anticodon, and a 3' glycine binding region.
SO:ke
A tRNA sequence that has a histidine anticodon, and a 3' histidine binding region.
histidyl tRNA
histidyl-transfer RNA
histidyl-transfer ribonucleic acid
sequence
SO:0000262
histidyl_tRNA
A tRNA sequence that has a histidine anticodon, and a 3' histidine binding region.
SO:ke
A tRNA sequence that has an isoleucine anticodon, and a 3' isoleucine binding region.
isoleucyl tRNA
isoleucyl-transfer RNA
isoleucyl-transfer ribonucleic acid
sequence
SO:0000263
isoleucyl_tRNA
A tRNA sequence that has an isoleucine anticodon, and a 3' isoleucine binding region.
SO:ke
A tRNA sequence that has a leucine anticodon, and a 3' leucine binding region.
leucyl tRNA
leucyl-transfer RNA
leucyl-transfer ribonucleic acid
sequence
SO:0000264
leucyl_tRNA
A tRNA sequence that has a leucine anticodon, and a 3' leucine binding region.
SO:ke
A tRNA sequence that has a lysine anticodon, and a 3' lysine binding region.
lysyl tRNA
lysyl-transfer RNA
lysyl-transfer ribonucleic acid
sequence
SO:0000265
lysyl_tRNA
A tRNA sequence that has a lysine anticodon, and a 3' lysine binding region.
SO:ke
A tRNA sequence that has a methionine anticodon, and a 3' methionine binding region.
methionyl tRNA
methionyl-transfer RNA
methionyl-transfer ribonucleic acid
sequence
SO:0000266
methionyl_tRNA
A tRNA sequence that has a methionine anticodon, and a 3' methionine binding region.
SO:ke
A tRNA sequence that has a phenylalanine anticodon, and a 3' phenylalanine binding region.
phenylalanyl tRNA
phenylalanyl-transfer RNA
phenylalanyl-transfer ribonucleic acid
sequence
SO:0000267
phenylalanyl_tRNA
A tRNA sequence that has a phenylalanine anticodon, and a 3' phenylalanine binding region.
SO:ke
A tRNA sequence that has a proline anticodon, and a 3' proline binding region.
prolyl tRNA
prolyl-transfer RNA
prolyl-transfer ribonucleic acid
sequence
SO:0000268
prolyl_tRNA
A tRNA sequence that has a proline anticodon, and a 3' proline binding region.
SO:ke
A tRNA sequence that has a serine anticodon, and a 3' serine binding region.
seryl tRNA
seryl-transfer RNA
sequence
seryl-transfer ribonucleic acid
SO:0000269
seryl_tRNA
A tRNA sequence that has a serine anticodon, and a 3' serine binding region.
SO:ke
A tRNA sequence that has a threonine anticodon, and a 3' threonine binding region.
threonyl tRNA
threonyl-transfer ribonucleic acid
sequence
threonyl-transfer RNA
SO:0000270
threonyl_tRNA
A tRNA sequence that has a threonine anticodon, and a 3' threonine binding region.
SO:ke
A tRNA sequence that has a tryptophan anticodon, and a 3' tryptophan binding region.
tryptophanyl tRNA
tryptophanyl-transfer RNA
tryptophanyl-transfer ribonucleic acid
sequence
SO:0000271
tryptophanyl_tRNA
A tRNA sequence that has a tryptophan anticodon, and a 3' tryptophan binding region.
SO:ke
A tRNA sequence that has a tyrosine anticodon, and a 3' tyrosine binding region.
tyrosyl tRNA
tyrosyl-transfer ribonucleic acid
sequence
tyrosyl-transfer RNA
SO:0000272
tyrosyl_tRNA
A tRNA sequence that has a tyrosine anticodon, and a 3' tyrosine binding region.
SO:ke
A tRNA sequence that has a valine anticodon, and a 3' valine binding region.
valyl tRNA
valyl-transfer ribonucleic acid
sequence
valyl-transfer RNA
SO:0000273
valyl_tRNA
A tRNA sequence that has a valine anticodon, and a 3' valine binding region.
SO:ke
A small nuclear RNA molecule involved in pre-mRNA splicing and processing.
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/SnRNA
INSDC_qualifier:snRNA
small nuclear RNA
sequence
SO:0000274
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
snRNA
A small nuclear RNA molecule involved in pre-mRNA splicing and processing.
PMID:11733745
WB:ems
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/SnRNA
wiki
Small nucleolar RNAs (snoRNAs) are short non-coding RNAs enriched in the nucleolus as components of small nucleolar ribonucleoproteins. They guide ribose methylation and pseudouridylation of rRNAs and snRNAs, and a subgroup regulate excision of rRNAs from rRNA precursor transcripts. snoRNAs may also guide rRNA acetylation and tRNA methylation, and regulate mRNA abundance and alternative splicing.
INSDC_feature:ncRNA
INSDC_qualifier:snoRNA
small nucleolar RNA
sequence
SO:0000275
Updated the definition of snoRNA (SO:0000275) from "A snoRNA (small nucleolar RNA) is any one of a class of small RNAs that are associated with the eukaryotic nucleus as components of small nucleolar ribonucleoproteins. They participate in the processing or modifications of many RNAs, mostly ribosomal RNAs (rRNAs) though snoRNAs are also known to target other classes of RNA, including spliceosomal RNAs, tRNAs, and mRNAs via a stretch of sequence that is complementary to a sequence in the targeted RNA." to "Small nucleolar RNAs (snoRNAs) are short non-coding RNAs enriched in the nucleolus as components of small nucleolar ribonucleoproteins. They guide ribose methylation and pseudouridylation of rRNAs and snRNAs, and a subgroup regulate excision of rRNAs from rRNA precursor transcripts. snoRNAs may also guide rRNA acetylation and tRNA methylation, and regulate mRNA abundance and alternative splicing." to acknowledge that some snoRNAs functionally localize to other compartments (cytoplasm or even secreted). See GitHub Issue #578.
snoRNA
Small nucleolar RNAs (snoRNAs) are short non-coding RNAs enriched in the nucleolus as components of small nucleolar ribonucleoproteins. They guide ribose methylation and pseudouridylation of rRNAs and snRNAs, and a subgroup regulate excision of rRNAs from rRNA precursor transcripts. snoRNAs may also guide rRNA acetylation and tRNA methylation, and regulate mRNA abundance and alternative splicing.
GOC:kgc
PMID:31828325
Small, ~22-nt, RNA molecule that is the endogenous transcript of a miRNA gene (or the product of other non coding RNA genes). Micro RNAs are produced from precursor molecules (SO:0001244) that can form local hairpin structures, which ordinarily are processed (usually via the Dicer pathway) such that a single miRNA molecule accumulates from one arm of a hairpin precursor molecule. Micro RNAs may trigger the cleavage of their target molecules or act as translational repressors.
SO:0000649
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/MiRNA
http://en.wikipedia.org/wiki/StRNA
INSDC_qualifier:miRNA
micro RNA
microRNA
small temporal RNA
stRNA
sequence
SO:0000276
miRNA
Small, ~22-nt, RNA molecule that is the endogenous transcript of a miRNA gene (or the product of other non coding RNA genes). Micro RNAs are produced from precursor molecules (SO:0001244) that can form local hairpin structures, which ordinarily are processed (usually via the Dicer pathway) such that a single miRNA molecule accumulates from one arm of a hairpin precursor molecule. Micro RNAs may trigger the cleavage of their target molecules or act as translational repressors.
PMID:11081512
PMID:12592000
http://en.wikipedia.org/wiki/MiRNA
wiki
http://en.wikipedia.org/wiki/StRNA
wiki
An attribute describing a sequence that is bound by another molecule.
bound by factor
sequence
SO:0000277
Formerly called transcript_by_bound_factor.
bound_by_factor
An attribute describing a sequence that is bound by another molecule.
SO:ke
A transcript that is bound by a nucleic acid.
transcript bound by nucleic acid
sequence
SO:0000278
Formerly called transcript_by_bound_nucleic_acid.
transcript_bound_by_nucleic_acid
A transcript that is bound by a nucleic acid.
SO:xp
A transcript that is bound by a protein.
transcript bound by protein
sequence
SO:0000279
Formerly called transcript_by_bound_protein.
transcript_bound_by_protein
A transcript that is bound by a protein.
SO:xp
A gene that is engineered.
engineered gene
sequence
SO:0000280
engineered_gene
A gene that is engineered.
SO:xp
A gene that is engineered and foreign.
engineered foreign gene
sequence
SO:0000281
engineered_foreign_gene
A gene that is engineered and foreign.
SO:xp
An mRNA with a minus 1 frameshift.
mRNA with minus 1 frameshift
sequence
SO:0000282
mRNA_with_minus_1_frameshift
An mRNA with a minus 1 frameshift.
SO:xp
A transposable_element that is engineered and foreign.
engineered foreign transposable element gene
sequence
SO:0000283
engineered_foreign_transposable_element_gene
A transposable_element that is engineered and foreign.
SO:xp
The recognition site is bipartite and interrupted.
sequence
SO:0000284
type_I_enzyme_restriction_site
true
The recognition site is bipartite and interrupted.
http://www.promega.com
A gene that is foreign.
foreign gene
sequence
SO:0000285
foreign_gene
A gene that is foreign.
SO:xp
A sequence directly repeated at both ends of a defined sequence, of the sort typically found in retroviruses.
INSDC_feature:repeat_region
http://en.wikipedia.org/wiki/Long_terminal_repeat
INSDC_qualifier:long_terminal_repeat
LTR
long terminal repeat
sequence
direct terminal repeat
SO:0000286
long_terminal_repeat
A sequence directly repeated at both ends of a defined sequence, of the sort typically found in retroviruses.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Long_terminal_repeat
wiki
A gene that is a fusion.
http://en.wikipedia.org/wiki/Fusion_gene
fusion gene
sequence
SO:0000287
fusion_gene
A gene that is a fusion.
SO:xp
http://en.wikipedia.org/wiki/Fusion_gene
wiki
A fusion gene that is engineered.
engineered fusion gene
sequence
SO:0000288
engineered_fusion_gene
A fusion gene that is engineered.
SO:xp
A repeat_region containing repeat_units of 2 to 10 bp repeated in tandem.
INSDC_feature:repeat_region
http://en.wikipedia.org/wiki/Microsatellite
INSDC_qualifier:microsatellite
STR
microsatellite locus
microsatellite marker
short tandem repeat
sequence
SO:0000289
microsatellite
A repeat_region containing repeat_units of 2 to 10 bp repeated in tandem.
NCBI:th
http://www.informatics.jax.org/silver/glossary.shtml
http://en.wikipedia.org/wiki/Microsatellite
wiki
STR
http://www.ncbi.nlm.nih.gov/books/NBK21126/def-item/A9651/
A region of a repeating dinucleotide sequence (two bases).
dinucleotide repeat microsatellite
dinucleotide repeat microsatellite feature
dinucleotide repeat microsatellite locus
dinucleotide repeat microsatellite marker
sequence
SO:0000290
dinucleotide_repeat_microsatellite_feature
A region of a repeating trinucleotide sequence (three bases).
trinucleotide repeat microsatellite
trinucleotide repeat microsatellite feature
trinucleotide repeat microsatellite locus
sequence
trinucleotide repeat microsatellite marker
SO:0000291
trinucleotide_repeat_microsatellite_feature
sequence
SO:0000292
repetitive_element
true
A repetitive element that is engineered and foreign.
engineered foreign repetitive element
sequence
SO:0000293
engineered_foreign_repetitive_element
A repetitive element that is engineered and foreign.
SO:xp
The sequence is complementarily repeated on the opposite strand. It is a palindrome, and it may, or may not be hyphenated. Examples: GCTGATCAGC, or GCTGA-----TCAGC.
INSDC_feature:repeat_region
http://en.wikipedia.org/wiki/Inverted_repeat
INSDC_qualifier:inverted
inverted repeat
inverted repeat sequence
sequence
SO:0000294
inverted_repeat
The sequence is complementarily repeated on the opposite strand. It is a palindrome, and it may, or may not be hyphenated. Examples: GCTGATCAGC, or GCTGA-----TCAGC.
SO:ke
http://en.wikipedia.org/wiki/Inverted_repeat
wiki
A type of spliceosomal intron spliced by the U12 spliceosome, that includes U11, U12, U4atac/U6atac and U5 snRNAs.
U12 intron
U12-dependent intron
sequence
SO:0000295
May have either GT-AC or AT-AC 5' and 3' boundaries.
U12_intron
A type of spliceosomal intron spliced by the U12 spliceosome, that includes U11, U12, U4atac/U6atac and U5 snRNAs.
PMID:9428511
A region of nucleic acid from which replication initiates; includes sequences that are recognized by replication proteins, the site from which the first separation of complementary strands occurs, and specific replication start sites.
http://en.wikipedia.org/wiki/Origin_of_replication
INSDC_feature:rep_origin
ori
origin of replication
sequence
SO:0000296
origin_of_replication
A region of nucleic acid from which replication initiates; includes sequences that are recognized by replication proteins, the site from which the first separation of complementary strands occurs, and specific replication start sites.
NCBI:cf
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Origin_of_replication
wiki
Displacement loop; a region within mitochondrial DNA in which a short stretch of RNA is paired with one strand of DNA, displacing the original partner DNA strand in this region; also used to describe the displacement of a region of one strand of duplex DNA by a single stranded invader in the reaction catalyzed by RecA protein.
http://en.wikipedia.org/wiki/D_loop
D-loop
INSDC_feature:D-loop
sequence
displacement loop
SO:0000297
Moved from is_a: SO:0000296 origin_of_replication to is_a: SO:0001411 biological_region after Terrence Murphy (INSDC) pointed out that the D loop can also refer to a loop in DNA repair, which is not an origin of replication. See GitHub Issue #417 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/417)
D_loop
Displacement loop; a region within mitochondrial DNA in which a short stretch of RNA is paired with one strand of DNA, displacing the original partner DNA strand in this region; also used to describe the displacement of a region of one strand of duplex DNA by a single stranded invader in the reaction catalyzed by RecA protein.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/D_loop
wiki
A feature where there has been an exchange of genetic material during mitosis or meiosis.
INSDC_feature:misc_recomb
INSDC_qualifier:other
recombination feature
sequence
SO:0000298
recombination_feature
A location where recombination occurs during mitosis or meiosis.
specific recombination site
sequence
SO:0000299
specific_recombination_site
A location where a gene is rearranged due to recombination during mitosis or meiosis.
recombination feature of rearranged gene
sequence
SO:0000300
recombination_feature_of_rearranged_gene
A feature where recombination has occurred for the purpose of generating a diversity in the immune system.
vertebrate immune system gene recombination feature
sequence
SO:0000301
vertebrate_immune_system_gene_recombination_feature
Recombination signal including J-heptamer, J-spacer and J-nonamer in 5' of J-region of a J-gene or J-sequence.
J gene recombination feature
J-RS
sequence
SO:0000302
J_gene_recombination_feature
Recombination signal including J-heptamer, J-spacer and J-nonamer in 5' of J-region of a J-gene or J-sequence.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Part of the primary transcript that is clipped off during processing.
sequence
SO:0000303
clip
Part of the primary transcript that is clipped off during processing.
SO:ke
The recognition site is either palindromic, partially palindromic or an interrupted palindrome. Cleavage occurs within the recognition site.
sequence
SO:0000304
type_II_enzyme_restriction_site
true
The recognition site is either palindromic, partially palindromic or an interrupted palindrome. Cleavage occurs within the recognition site.
http://www.promega.com
A modified nucleotide, i.e. a nucleotide other than A, T, C, G.
INSDC_feature:modified_base
modified base site
sequence
SO:0000305
Modified base:<modified_base>.
modified_DNA_base
A modified nucleotide, i.e. a nucleotide other than A, T, C, G.
http://www.insdc.org/files/feature_table.html
A nucleotide modified by methylation.
methylated base feature
sequence
SO:0000306
methylated_DNA_base_feature
A nucleotide modified by methylation.
SO:ke
Regions of a few hundred to a few thousand bases in vertebrate genomes that are relatively GC and CpG rich; they are typically unmethylated and often found near the 5' ends of genes.
http://en.wikipedia.org/wiki/CpG_island
CG island
CpG island
sequence
SO:0000307
CpG_island
Regions of a few hundred to a few thousand bases in vertebrate genomes that are relatively GC and CpG rich; they are typically unmethylated and often found near the 5' ends of genes.
SO:rd
http://en.wikipedia.org/wiki/CpG_island
wiki
sequence
SO:0000308
sequence_feature_locating_method
true
sequence
SO:0000309
computed_feature
true
sequence
SO:0000310
predicted_ab_initio_computation
true
.
sequence
SO:0000311
similar to:<sequence_id>
computed_feature_by_similarity
true
.
SO:ma
Attribute to describe a feature that has been experimentally verified.
experimentally determined
sequence
SO:0000312
experimentally_determined
Attribute to describe a feature that has been experimentally verified.
SO:ke
A double-helical region of nucleic acid formed by base-pairing between adjacent (inverted) complementary sequences.
SO:0000019
http://en.wikipedia.org/wiki/Stem_loop
INSDC_feature:stem_loop
RNA_hairpin_loop
stem loop
stem-loop
sequence
SO:0000313
stem_loop
A double-helical region of nucleic acid formed by base-pairing between adjacent (inverted) complementary sequences.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Stem_loop
wiki
A repeat where the same sequence is repeated in the same direction. Example: GCTGA-followed by-GCTGA.
INSDC_feature:repeat_region
http://en.wikipedia.org/wiki/Direct_repeat
INSDC_qualifier:direct
direct repeat
sequence
SO:0000314
direct_repeat
A repeat where the same sequence is repeated in the same direction. Example: GCTGA-followed by-GCTGA.
SO:ke
http://en.wikipedia.org/wiki/Direct_repeat
wiki
The first base where RNA polymerase begins to synthesize the RNA transcript.
INSDC_feature:misc_feature
INSDC_note:transcription_start_site
transcription start site
transcription_start_site
sequence
SO:0000315
Added relationship is_a SO:0002309 core_promoter_element with the creation of core_promoter_element as part of GREEKC initiative August 2020 - Dave Sant.
TSS
The first base where RNA polymerase begins to synthesize the RNA transcript.
SO:ke
A contiguous sequence which begins with, and includes, a start codon and ends with, and includes, a stop codon.
INSDC_feature:CDS
coding sequence
coding_sequence
sequence
SO:0000316
CDS
A contiguous sequence which begins with, and includes, a start codon and ends with, and includes, a stop codon.
SO:ma
Complementary DNA; A piece of DNA copied from an mRNA and spliced into a vector for propagation in a suitable host.
cDNA clone
sequence
SO:0000317
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
cDNA_clone
Complementary DNA; A piece of DNA copied from an mRNA and spliced into a vector for propagation in a suitable host.
http://seqcore.brcf.med.umich.edu/doc/educ/dnapr/mbglossary/mbgloss.html
First codon to be translated by a ribosome.
http://en.wikipedia.org/wiki/Start_codon
initiation codon
start codon
sequence
SO:0000318
start_codon
First codon to be translated by a ribosome.
SO:ke
http://en.wikipedia.org/wiki/Start_codon
wiki
In mRNA, a set of three nucleotides that indicates the end of information for protein synthesis.
http://en.wikipedia.org/wiki/Stop_codon
stop codon
sequence
SO:0000319
stop_codon
In mRNA, a set of three nucleotides that indicates the end of information for protein synthesis.
SO:ke
http://en.wikipedia.org/wiki/Stop_codon
wiki
Sequences within the intron that modulate splice site selection for some introns.
intronic splice enhancer
sequence
SO:0000320
intronic_splice_enhancer
Sequences within the intron that modulate splice site selection for some introns.
SO:ke
An mRNA with a plus 1 frameshift.
mRNA with plus 1 frameshift
sequence
SO:0000321
mRNA_with_plus_1_frameshift
An mRNA with a plus 1 frameshift.
SO:ke
A region of nucleotide sequence targeted by a nuclease enzyme that is found cleaved more than would be expected by chance.
nuclease hypersensitive site
sequence
SO:0000322
Relationship to accessible_DNA_region added 11 Feb 2021. GREEKC pointed out that this is an assay based term, but we need a biological term for the accessible DNA. See GitHub Issue #531.
nuclease_hypersensitive_site
The first base to be translated into protein.
coding start
translation initiation site
sequence
translation start
SO:0000323
coding_start
The first base to be translated into protein.
SO:ke
A nucleotide sequence that may be used to identify a larger sequence.
sequence
SO:0000324
tag
A nucleotide sequence that may be used to identify a larger sequence.
SO:ke
A primary transcript encoding a large ribosomal subunit RNA.
35S rRNA primary transcript
rRNA large subunit primary transcript
sequence
SO:0000325
rRNA_large_subunit_primary_transcript
A primary transcript encoding a large ribosomal subunit RNA.
SO:ke
A short diagnostic sequence tag, serial analysis of gene expression (SAGE), that allows the quantitative and simultaneous analysis of a large number of transcripts.
SAGE tag
sequence
SO:0000326
SAGE_tag
A short diagnostic sequence tag, serial analysis of gene expression (SAGE), that allows the quantitative and simultaneous analysis of a large number of transcripts.
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=7570003&dopt=Abstract
The last base to be translated into protein. It does not include the stop codon.
coding end
translation termination site
translation_end
sequence
SO:0000327
coding_end
The last base to be translated into protein. It does not include the stop codon.
SO:ke
A DNA sequence used experimentally to detect the presence or absence of a complementary nucleic acid.
microarray oligo
microarray oligonucleotide
sequence
SO:0000328
microarray_oligo
An mRNA with a plus 2 frameshift.
mRNA with plus 2 frameshift
sequence
SO:0000329
mRNA_with_plus_2_frameshift
An mRNA with a plus 2 frameshift.
SO:xp
Region of sequence similarity by descent from a common ancestor.
INSDC_feature:misc_feature
http://en.wikipedia.org/wiki/Conserved_region
INSDC_note:conserved_region
conserved region
sequence
SO:0000330
conserved_region
Region of sequence similarity by descent from a common ancestor.
SO:ke
http://en.wikipedia.org/wiki/Conserved_region
wiki
Short (typically a few hundred base pairs) DNA sequence that has a single occurrence in a genome and whose location and base sequence are known.
INSDC_feature:STS
sequence tag site
sequence
SO:0000331
STS
Short (typically a few hundred base pairs) DNA sequence that has a single occurrence in a genome and whose location and base sequence are known.
http://www.biospace.com
Coding region of sequence similarity by descent from a common ancestor.
coding conserved region
sequence
SO:0000332
coding_conserved_region
Coding region of sequence similarity by descent from a common ancestor.
SO:ke
The boundary between two exons in a processed transcript.
exon junction
sequence
SO:0000333
exon_junction
The boundary between two exons in a processed transcript.
SO:ke
Non-coding region of sequence similarity by descent from a common ancestor.
conserved non-coding element
conserved non-coding sequence
nc conserved region
noncoding conserved region
sequence
SO:0000334
nc_conserved_region
Non-coding region of sequence similarity by descent from a common ancestor.
SO:ke
An mRNA with a minus 2 frameshift.
mRNA with minus 2 frameshift
sequence
SO:0000335
mRNA_with_minus_2_frameshift
An mRNA with a minus 2 frameshift.
SO:ke
A sequence that closely resembles a known functional gene, at another locus within a genome, that is non-functional as a consequence of (usually several) mutations that prevent either its transcription or translation (or both). In general, pseudogenes result from either reverse transcription of a transcript of their "normal" paralog (SO:0000043) (in which case the pseudogene typically lacks introns and includes a poly(A) tail) or from recombination (SO:0000044) (in which case the pseudogene is typically a tandem duplication of its "normal" paralog).
INSDC_feature:gene
http://en.wikipedia.org/wiki/Pseudogene
INSDC_qualifier:pseudo
INSDC_qualifier:unknown
sequence
SO:0000336
pseudogene
A sequence that closely resembles a known functional gene, at another locus within a genome, that is non-functional as a consequence of (usually several) mutations that prevent either its transcription or translation (or both). In general, pseudogenes result from either reverse transcription of a transcript of their "normal" paralog (SO:0000043) (in which case the pseudogene typically lacks introns and includes a poly(A) tail) or from recombination (SO:0000044) (in which case the pseudogene is typically a tandem duplication of its "normal" paralog).
http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html
http://en.wikipedia.org/wiki/Pseudogene
wiki
A double stranded RNA duplex, at least 20bp long, used experimentally to inhibit gene function by RNA interference.
RNAi reagent
sequence
SO:0000337
RNAi_reagent
A double stranded RNA duplex, at least 20bp long, used experimentally to inhibit gene function by RNA interference.
SO:rd
A highly repetitive and short (100-500 base pair) transposable element with terminal inverted repeats (TIR) and target site duplication (TSD). MITEs do not encode proteins.
miniature inverted repeat transposable element
sequence
SO:0000338
MITE
A highly repetitive and short (100-500 base pair) transposable element with terminal inverted repeats (TIR) and target site duplication (TSD). MITEs do not encode proteins.
http://www.pnas.org/cgi/content/full/97/18/10083
A region in a genome which promotes recombination.
http://en.wikipedia.org/wiki/Recombination_hotspot
recombination hotspot
sequence
SO:0000339
recombination_hotspot
A region in a genome which promotes recombination.
SO:rd
http://en.wikipedia.org/wiki/Recombination_hotspot
wiki
Structural unit composed of a nucleic acid molecule which controls its own replication through the interaction of specific proteins at one or more origins of replication.
http://en.wikipedia.org/wiki/Chromosome
sequence
SO:0000340
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
chromosome
Structural unit composed of a nucleic acid molecule which controls its own replication through the interaction of specific proteins at one or more origins of replication.
SO:ma
http://en.wikipedia.org/wiki/Chromosome
wiki
A cytologically distinguishable feature of a chromosome, often made visible by staining, and usually alternating light and dark.
http://en.wikipedia.org/wiki/Cytological_band
chromosome band
cytoband
cytological band
sequence
SO:0000341
chromosome_band
A cytologically distinguishable feature of a chromosome, often made visible by staining, and usually alternating light and dark.
SO:ma
http://en.wikipedia.org/wiki/Cytological_band
wiki
A region specifically recognised by a recombinase where recombination can occur during mitosis or meiosis.
site specific recombination target region
sequence
SO:0000342
site_specific_recombination_target_region
A region of sequence, aligned to another sequence with some statistical significance, using an algorithm such as BLAST or SIM4.
sequence
SO:0000343
match
A region of sequence, aligned to another sequence with some statistical significance, using an algorithm such as BLAST or SIM4.
SO:ke
Region of a transcript that regulates splicing.
splice enhancer
sequence
SO:0000344
splice_enhancer
Region of a transcript that regulates splicing.
SO:ke
A tag produced from a single sequencing read from a cDNA clone or PCR product; typically a few hundred base pairs long.
expressed sequence tag
sequence
SO:0000345
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
EST
A tag produced from a single sequencing read from a cDNA clone or PCR product; typically a few hundred base pairs long.
SO:ke
Cre-Recombination target sequence.
loxP site
sequence
Cre-recombination target region
SO:0000346
loxP_site
A match against a nucleotide sequence.
nucleotide match
sequence
SO:0000347
nucleotide_match
A match against a nucleotide sequence.
SO:ke
An attribute describing a sequence consisting of nucleobases bound to repeating units. The forms found in nature are deoxyribonucleic acid (DNA), where the repeating units are 2-deoxy-D-ribose rings connected to a phosphate backbone, and ribonucleic acid (RNA), where the repeating units are D-ribose rings connected to a phosphate backbone.
http://en.wikipedia.org/wiki/Nucleic_acid
nucleic acid
sequence
SO:0000348
nucleic_acid
An attribute describing a sequence consisting of nucleobases bound to repeating units. The forms found in nature are deoxyribonucleic acid (DNA), where the repeating units are 2-deoxy-D-ribose rings connected to a phosphate backbone, and ribonucleic acid (RNA), where the repeating units are D-ribose rings connected to a phosphate backbone.
CHEBI:33696
RSC:cb
http://en.wikipedia.org/wiki/Nucleic_acid
wiki
A match against a protein sequence.
protein match
sequence
SO:0000349
protein_match
A match against a protein sequence.
SO:ke
An inversion site found on the Saccharomyces cerevisiae 2 micron plasmid.
FLP recombination target region
FRT site
sequence
SO:0000350
FRT_site
An inversion site found on the Saccharomyces cerevisiae 2 micron plasmid.
SO:ma
An attribute to describe a sequence of nucleotides, nucleotide analogs, or amino acids that has been designed by an experimenter and which may, or may not, correspond with any natural sequence.
synthetic sequence
sequence
SO:0000351
synthetic_sequence
An attribute to describe a sequence of nucleotides, nucleotide analogs, or amino acids that has been designed by an experimenter and which may, or may not, correspond with any natural sequence.
SO:ma
An attribute describing a sequence consisting of nucleobases bound to a repeating unit made of a 2-deoxy-D-ribose ring connected to a phosphate backbone.
sequence
SO:0000352
DNA
An attribute describing a sequence consisting of nucleobases bound to a repeating unit made of a 2-deoxy-D-ribose ring connected to a phosphate backbone.
RSC:cb
A sequence of nucleotides that has been algorithmically derived from an alignment of two or more different sequences.
http://en.wikipedia.org/wiki/Sequence_assembly
sequence assembly
sequence
SO:0000353
sequence_assembly
A sequence of nucleotides that has been algorithmically derived from an alignment of two or more different sequences.
SO:ma
http://en.wikipedia.org/wiki/Sequence_assembly
wiki
A region of intronic nucleotide sequence targeted by a nuclease enzyme.
group 1 intron homing endonuclease target region
sequence
SO:0000354
group_1_intron_homing_endonuclease_target_region
A region of intronic nucleotide sequence targeted by a nuclease enzyme.
SO:ke
A region of the genome which is co-inherited as the result of the lack of historic recombination within it.
haplotype block
sequence
SO:0000355
haplotype_block
A region of the genome which is co-inherited as the result of the lack of historic recombination within it.
SO:ma
An attribute describing a sequence consisting of nucleobases bound to a repeating unit made of a D-ribose ring connected to a phosphate backbone.
sequence
SO:0000356
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
RNA
An attribute describing a sequence consisting of nucleobases bound to a repeating unit made of a D-ribose ring connected to a phosphate backbone.
RSC:cb
An attribute describing a region that is bounded on either side by a particular kind of region.
sequence
SO:0000357
flanked
An attribute describing a region that is bounded on either side by a particular kind of region.
SO:ke
true
An attribute describing sequence that is flanked by Lox-P sites.
http://en.wikipedia.org/wiki/Floxed
sequence
SO:0000359
floxed
An attribute describing sequence that is flanked by Lox-P sites.
SO:ke
http://en.wikipedia.org/wiki/Floxed
wiki
A set of (usually) three nucleotide bases in a DNA or RNA sequence, which together code for a unique amino acid or the termination of translation and are contained within the CDS.
http://en.wikipedia.org/wiki/Codon
sequence
SO:0000360
codon
A set of (usually) three nucleotide bases in a DNA or RNA sequence, which together code for a unique amino acid or the termination of translation and are contained within the CDS.
SO:ke
http://en.wikipedia.org/wiki/Codon
wiki
An attribute to describe sequence that is flanked by the FLP recombinase recognition site, FRT.
FRT flanked
sequence
SO:0000361
FRT_flanked
An attribute to describe sequence that is flanked by the FLP recombinase recognition site, FRT.
SO:ke
A cDNA clone constructed from more than one mRNA. Usually an experimental artifact.
invalidated by chimeric cDNA
sequence
SO:0000362
invalidated_by_chimeric_cDNA
A cDNA clone constructed from more than one mRNA. Usually an experimental artifact.
SO:ma
A transgene that is floxed.
floxed gene
sequence
SO:0000363
floxed_gene
A transgene that is floxed.
SO:xp
The region of sequence surrounding a transposable element.
transposable element flanking region
sequence
SO:0000364
transposable_element_flanking_region
The region of sequence surrounding a transposable element.
SO:ke
A region encoding an integrase which acts at a site adjacent to it (attI_site) to insert DNA which must include but is not limited to an attC_site.
http://en.wikipedia.org/wiki/Integron
sequence
SO:0000365
integron
A region encoding an integrase which acts at a site adjacent to it (attI_site) to insert DNA which must include but is not limited to an attC_site.
SO:as
http://en.wikipedia.org/wiki/Integron
wiki
The junction where an insertion occurred.
insertion site
sequence
SO:0000366
insertion_site
The junction where an insertion occurred.
SO:ke
A region within an integron, adjacent to an integrase, at which site specific recombination involving an attC_site takes place.
attI site
sequence
SO:0000367
attI_site
A region within an integron, adjacent to an integrase, at which site specific recombination involving an attC_site takes place.
SO:as
The junction in a genome where a transposable_element has inserted.
transposable element insertion site
sequence
SO:0000368
transposable_element_insertion_site
The junction in a genome where a transposable_element has inserted.
SO:ke
sequence
SO:0000369
integrase_coding_region
true
A non-coding RNA less than 200 nucleotides long, usually with a specific secondary structure, that acts to regulate gene expression. These include short ncRNAs such as piRNA, miRNA and siRNAs (among others).
small regulatory ncRNA
sequence
SO:0000370
small_regulatory_ncRNA
A non-coding RNA less than 200 nucleotides long, usually with a specific secondary structure, that acts to regulate gene expression. These include short ncRNAs such as piRNA, miRNA and siRNAs (among others).
PMID:28541282
PomBase:al
SO:ma
A transposon that encodes functions required for conjugation.
conjugative transposon
sequence
SO:0000371
conjugative_transposon
A transposon that encodes functions required for conjugation.
http://www.sci.sdsu.edu/~smaloy/Glossary/C.html
An RNA sequence that has catalytic activity with or without an associated ribonucleoprotein.
enzymatic RNA
sequence
SO:0000372
This was moved to be a child of transcript (SO:0000673) because some enzymatic RNA regions are part of primary transcripts and some are part of processed transcripts. Moved under ncRNA on 18 Nov 2021. See GitHub Issue #533.
enzymatic_RNA
An RNA sequence that has catalytic activity with or without an associated ribonucleoprotein.
RSC:cb
A recombinationally rearranged gene by inversion.
recombinationally inverted gene
sequence
SO:0000373
recombinationally_inverted_gene
A recombinationally rearranged gene by inversion.
SO:xp
An RNA with catalytic activity.
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/Ribozyme
INSDC_qualifier:ribozyme
sequence
SO:0000374
ribozyme
An RNA with catalytic activity.
SO:ma
http://en.wikipedia.org/wiki/Ribozyme
wiki
Cytosolic 5.8S rRNA is an RNA component of the large subunit of cytosolic ribosomes in eukaryotes.
http://en.wikipedia.org/wiki/5.8S_ribosomal_RNA
cytosolic 5.8S LSU rRNA
cytosolic 5.8S rRNA
cytosolic 5.8S ribosomal RNA
cytosolic rRNA 5 8S
sequence
SO:0000375
Dave Sant removed '5_8S rRNA is also found in archaea.' from definition due to lack of references mentioning this on 1 Feb 2021. See GitHub Issue #505. Renamed from rRNA_5_8S to cytosolic_5_8S_rRNA on 10 June 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493.
cytosolic_5_8S_rRNA
Cytosolic 5.8S rRNA is an RNA component of the large subunit of cytosolic ribosomes in eukaryotes.
https://rfam.xfam.org/family/RF00002
http://en.wikipedia.org/wiki/5.8S_ribosomal_RNA
wiki
A small (184-nt in E. coli) RNA that forms a hairpin type structure. 6S RNA associates with RNA polymerase in a highly specific manner. 6S RNA represses expression from a sigma70-dependent promoter during stationary phase.
http://en.wikipedia.org/wiki/6S_RNA
6S RNA
RNA 6S
sequence
SO:0000376
RNA_6S
A small (184-nt in E. coli) RNA that forms a hairpin type structure. 6S RNA associates with RNA polymerase in a highly specific manner. 6S RNA represses expression from a sigma70-dependent promoter during stationary phase.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00013
http://en.wikipedia.org/wiki/6S_RNA
wiki
An enterobacterial RNA that binds the CsrA protein. The CsrB RNAs contain a conserved motif CAGGXXG that is found in up to 18 copies and has been suggested to bind CsrA. The Csr regulatory system has a strong negative regulatory effect on glycogen biosynthesis, gluconeogenesis and glycogen catabolism and a positive regulatory effect on glycolysis. In other bacteria such as Erwinia carotovora the RsmA protein has been shown to regulate the production of virulence determinants, such as extracellular enzymes. RsmA binds to RsmB regulatory RNA which is also a member of this family.
CsrB RsmB RNA
CsrB-RsmB RNA
sequence
SO:0000377
CsrB_RsmB_RNA
An enterobacterial RNA that binds the CsrA protein. The CsrB RNAs contain a conserved motif CAGGXXG that is found in up to 18 copies and has been suggested to bind CsrA. The Csr regulatory system has a strong negative regulatory effect on glycogen biosynthesis, gluconeogenesis and glycogen catabolism and a positive regulatory effect on glycolysis. In other bacteria such as Erwinia carotovora the RsmA protein has been shown to regulate the production of virulence determinants, such as extracellular enzymes. RsmA binds to RsmB regulatory RNA which is also a member of this family.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00018
DsrA RNA regulates both transcription, by overcoming transcriptional silencing by the nucleoid-associated H-NS protein, and translation, by promoting efficient translation of the stress sigma factor, RpoS. These two activities of DsrA can be separated by mutation: the first of three stem-loops of the 85 nucleotide RNA is necessary for RpoS translation but not for anti-H-NS action, while the second stem-loop is essential for antisilencing and less critical for RpoS translation. The third stem-loop, which behaves as a transcription terminator, can be substituted by the trp transcription terminator without loss of either DsrA function. The sequence of the first stem-loop of DsrA is complementary with the upstream leader portion of RpoS messenger RNA, suggesting that pairing of DsrA with the RpoS message might be important for translational regulation.
http://en.wikipedia.org/wiki/DsrA_RNA
DsrA RNA
sequence
SO:0000378
DsrA_RNA
DsrA RNA regulates both transcription, by overcoming transcriptional silencing by the nucleoid-associated H-NS protein, and translation, by promoting efficient translation of the stress sigma factor, RpoS. These two activities of DsrA can be separated by mutation: the first of three stem-loops of the 85 nucleotide RNA is necessary for RpoS translation but not for anti-H-NS action, while the second stem-loop is essential for antisilencing and less critical for RpoS translation. The third stem-loop, which behaves as a transcription terminator, can be substituted by the trp transcription terminator without loss of either DsrA function. The sequence of the first stem-loop of DsrA is complementary with the upstream leader portion of RpoS messenger RNA, suggesting that pairing of DsrA with the RpoS message might be important for translational regulation.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00014
http://en.wikipedia.org/wiki/DsrA_RNA
wiki
A small untranslated RNA involved in expression of the dipeptide and oligopeptide transport systems in Escherichia coli.
http://en.wikipedia.org/wiki/GcvB_RNA
GcvB RNA
sequence
SO:0000379
GcvB_RNA
A small untranslated RNA involved in expression of the dipeptide and oligopeptide transport systems in Escherichia coli.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00022
http://en.wikipedia.org/wiki/GcvB_RNA
wiki
A small catalytic RNA motif that catalyzes a self-cleavage reaction. Its name comes from its secondary structure which resembles a carpenter's hammer. The hammerhead ribozyme is involved in the replication of some viroids and some satellite RNAs.
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/Hammerhead_ribozyme
INSDC_qualifier:hammerhead_ribozyme
hammerhead ribozyme
sequence
SO:0000380
hammerhead_ribozyme
A small catalytic RNA motif that catalyzes a self-cleavage reaction. Its name comes from its secondary structure which resembles a carpenter's hammer. The hammerhead ribozyme is involved in the replication of some viroids and some satellite RNAs.
PMID:2436805
http://en.wikipedia.org/wiki/Hammerhead_ribozyme
wiki
A group II intron that recognizes IBS1/EBS1 and IBS2/EBS2 for the 5-prime exon and gamma/gamma-prime for the 3-prime exon.
group IIA intron
sequence
SO:0000381
group_IIA_intron
A group II intron that recognizes IBS1/EBS1 and IBS2/EBS2 for the 5-prime exon and gamma/gamma-prime for the 3-prime exon.
PMID:20463000
A group II intron that recognizes IBS1/EBS1 and IBS2/EBS2 for the 5-prime exon and IBS3/EBS3 for the 3-prime exon.
group IIB intron
sequence
SO:0000382
group_IIB_intron
A group II intron that recognizes IBS1/EBS1 and IBS2/EBS2 for the 5-prime exon and IBS3/EBS3 for the 3-prime exon.
PMID:20463000
A non-translated 93 nt antisense RNA that binds its target ompF mRNA and regulates ompF expression by inhibiting translation and inducing degradation of the message.
http://en.wikipedia.org/wiki/MicF_RNA
MicF RNA
sequence
SO:0000383
MicF_RNA
A non-translated 93 nt antisense RNA that binds its target ompF mRNA and regulates ompF expression by inhibiting translation and inducing degradation of the message.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00033
http://en.wikipedia.org/wiki/MicF_RNA
wiki
A small untranslated RNA which is induced in response to oxidative stress in Escherichia coli. Acts as a global regulator to activate or repress the expression of as many as 40 genes, including the fhlA-encoded transcriptional activator and the rpoS-encoded sigma(s) subunit of RNA polymerase. OxyS is bound by the Hfq protein, which increases the OxyS RNA interaction with its target messages.
http://en.wikipedia.org/wiki/OxyS_RNA
OxyS RNA
sequence
SO:0000384
OxyS_RNA
A small untranslated RNA which is induced in response to oxidative stress in Escherichia coli. Acts as a global regulator to activate or repress the expression of as many as 40 genes, including the fhlA-encoded transcriptional activator and the rpoS-encoded sigma(s) subunit of RNA polymerase. OxyS is bound by the Hfq protein, which increases the OxyS RNA interaction with its target messages.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00035
http://en.wikipedia.org/wiki/OxyS_RNA
wiki
The RNA molecule essential for the catalytic activity of RNase MRP, an enzymatically active ribonucleoprotein with two distinct roles in eukaryotes. In mitochondria it plays a direct role in the initiation of mitochondrial DNA replication. In the nucleus it is involved in precursor rRNA processing, where it cleaves the internal transcribed spacer 1 between 18S and 5.8S rRNAs.
INSDC_feature:ncRNA
INSDC_qualifier:RNase_MRP_RNA
RNase MRP RNA
sequence
SO:0000385
Moved under enzymatic_RNA on 18 Nov 2021. See GitHub Issue #533.
RNase_MRP_RNA
The RNA molecule essential for the catalytic activity of RNase MRP, an enzymatically active ribonucleoprotein with two distinct roles in eukaryotes. In mitochondria it plays a direct role in the initiation of mitochondrial DNA replication. In the nucleus it is involved in precursor rRNA processing, where it cleaves the internal transcribed spacer 1 between 18S and 5.8S rRNAs.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00030
The RNA component of Ribonuclease P (RNase P), a ubiquitous endoribonuclease, found in archaea, bacteria and eukarya as well as chloroplasts and mitochondria. Its best characterized activity is the generation of mature 5 prime ends of tRNAs by cleaving the 5 prime leader elements of precursor-tRNAs. Cellular RNase Ps are ribonucleoproteins. RNA from bacterial RNase Ps retains its catalytic activity in the absence of the protein subunit, i.e. it is a ribozyme. Isolated eukaryotic and archaeal RNase P RNA has not been shown to retain its catalytic function, but is still essential for the catalytic activity of the holoenzyme. Although the archaeal and eukaryotic holoenzymes have a much greater protein content than the bacterial ones, the RNA cores from all the three lineages are homologous. Helices corresponding to P1, P2, P3, P4, and P10/11 are common to all cellular RNase P RNAs. Yet, there is considerable sequence variation, particularly among the eukaryotic RNAs.
INSDC_feature:ncRNA
INSDC_qualifier:RNase_P_RNA
RNase P RNA
sequence
SO:0000386
Moved under enzymatic_RNA on 18 Nov 2021. See GitHub Issue #533.
RNase_P_RNA
The RNA component of Ribonuclease P (RNase P), a ubiquitous endoribonuclease, found in archaea, bacteria and eukarya as well as chloroplasts and mitochondria. Its best characterized activity is the generation of mature 5 prime ends of tRNAs by cleaving the 5 prime leader elements of precursor-tRNAs. Cellular RNase Ps are ribonucleoproteins. RNA from bacterial RNase Ps retains its catalytic activity in the absence of the protein subunit, i.e. it is a ribozyme. Isolated eukaryotic and archaeal RNase P RNA has not been shown to retain its catalytic function, but is still essential for the catalytic activity of the holoenzyme. Although the archaeal and eukaryotic holoenzymes have a much greater protein content than the bacterial ones, the RNA cores from all the three lineages are homologous. Helices corresponding to P1, P2, P3, P4, and P10/11 are common to all cellular RNase P RNAs. Yet, there is considerable sequence variation, particularly among the eukaryotic RNAs.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00010
Translational regulation of the stationary phase sigma factor RpoS is mediated by the formation of a double-stranded RNA stem-loop structure in the upstream region of the rpoS messenger RNA, occluding the translation initiation site. Clones carrying rprA (RpoS regulator RNA) increased the translation of RpoS. The rprA gene encodes a 106 nucleotide regulatory RNA. As with DsrA (Rfam:RF00014), RprA is predicted to form three stem-loops. Thus, at least two small RNAs, DsrA and RprA, participate in the positive regulation of RpoS translation. Unlike DsrA, RprA does not have an extensive region of complementarity to the RpoS leader, leaving its mechanism of action unclear. RprA is non-essential.
http://en.wikipedia.org/wiki/RprA_RNA
RprA RNA
sequence
SO:0000387
RprA_RNA
Translational regulation of the stationary phase sigma factor RpoS is mediated by the formation of a double-stranded RNA stem-loop structure in the upstream region of the rpoS messenger RNA, occluding the translation initiation site. Clones carrying rprA (RpoS regulator RNA) increased the translation of RpoS. The rprA gene encodes a 106 nucleotide regulatory RNA. As with DsrA (Rfam:RF00014), RprA is predicted to form three stem-loops. Thus, at least two small RNAs, DsrA and RprA, participate in the positive regulation of RpoS translation. Unlike DsrA, RprA does not have an extensive region of complementarity to the RpoS leader, leaving its mechanism of action unclear. RprA is non-essential.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00034
http://en.wikipedia.org/wiki/RprA_RNA
wiki
The Rev response element (RRE) is encoded within the HIV-env gene. Rev is an essential regulatory protein of HIV that binds an internal loop of the RRE, encouraging further Rev-RRE binding. This RNP complex is critical for mRNA export and hence for expression of the HIV structural proteins.
RRE RNA
sequence
SO:0000388
RRE_RNA
The Rev response element (RRE) is encoded within the HIV-env gene. Rev is an essential regulatory protein of HIV that binds an internal loop of the RRE, encouraging further Rev-RRE binding. This RNP complex is critical for mRNA export and hence for expression of the HIV structural proteins.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00036
A 109-nucleotide RNA of E. coli that seems to have a regulatory role on the galactose operon. Changes in Spot 42 levels are implicated in affecting DNA polymerase I levels.
http://en.wikipedia.org/wiki/Spot_42_RNA
spot-42 RNA
sequence
SO:0000389
spot_42_RNA
A 109-nucleotide RNA of E. coli that seems to have a regulatory role on the galactose operon. Changes in Spot 42 levels are implicated in affecting DNA polymerase I levels.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00021
http://en.wikipedia.org/wiki/Spot_42_RNA
wiki
The RNA component of telomerase, a reverse transcriptase that synthesizes telomeric DNA.
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/Telomerase_RNA
INSDC_qualifier:telomerase_RNA
telomerase RNA
sequence
SO:0000390
telomerase_RNA
The RNA component of telomerase, a reverse transcriptase that synthesizes telomeric DNA.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00025
http://en.wikipedia.org/wiki/Telomerase_RNA
wiki
U1 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). Its 5' end forms complementary base pairs with the 5' splice junction, thus defining the 5' donor site of an intron. There are significant differences in sequence and secondary structure between metazoan and yeast U1 snRNAs, the latter being much longer (568 nucleotides as compared to 164 nucleotides in human). Nevertheless, secondary structure predictions suggest that all U1 snRNAs share a 'common core' consisting of helices I, II, the proximal region of III, and IV.
http://en.wikipedia.org/wiki/U1_snRNA
U1 small nuclear RNA
U1 snRNA
small nuclear RNA U1
snRNA U1
sequence
SO:0000391
U1_snRNA
U1 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). Its 5' end forms complementary base pairs with the 5' splice junction, thus defining the 5' donor site of an intron. There are significant differences in sequence and secondary structure between metazoan and yeast U1 snRNAs, the latter being much longer (568 nucleotides as compared to 164 nucleotides in human). Nevertheless, secondary structure predictions suggest that all U1 snRNAs share a 'common core' consisting of helices I, II, the proximal region of III, and IV.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00003
http://en.wikipedia.org/wiki/U1_snRNA
wiki
U1 small nuclear RNA
RSC:cb
small nuclear RNA U1
RSC:cb
snRNA U1
RSC:cb
U2 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). Complementary binding between U2 snRNA (in an area lying towards the 5' end but 3' to hairpin I) and the branchpoint sequence (BPS) of the intron results in the bulging out of an unpaired adenine, on the BPS, which initiates a nucleophilic attack at the intronic 5' splice site, thus starting the first of two transesterification reactions that mediate splicing.
http://en.wikipedia.org/wiki/U2_snRNA
U2 small nuclear RNA
U2 snRNA
small nuclear RNA U2
snRNA U2
sequence
SO:0000392
U2_snRNA
U2 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). Complementary binding between U2 snRNA (in an area lying towards the 5' end but 3' to hairpin I) and the branchpoint sequence (BPS) of the intron results in the bulging out of an unpaired adenine, on the BPS, which initiates a nucleophilic attack at the intronic 5' splice site, thus starting the first of two transesterification reactions that mediate splicing.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00004
http://en.wikipedia.org/wiki/U2_snRNA
wiki
U2 small nuclear RNA
RSC:CB
small nuclear RNA U2
RSC:CB
snRNA U2
RSC:CB
U4 small nuclear RNA (U4 snRNA) is a component of the major U2-dependent spliceosome. It forms a duplex with U6, and with each splicing round, it is displaced from U6 (and the spliceosome) in an ATP-dependent manner, allowing U6 to refold and create the active site for splicing catalysis. A recycling process involving protein Prp24 re-anneals U4 and U6.
http://en.wikipedia.org/wiki/U4_snRNA
U4 small nuclear RNA
U4 snRNA
small nuclear RNA U4
snRNA U4
sequence
SO:0000393
U4_snRNA
U4 small nuclear RNA (U4 snRNA) is a component of the major U2-dependent spliceosome. It forms a duplex with U6, and with each splicing round, it is displaced from U6 (and the spliceosome) in an ATP-dependent manner, allowing U6 to refold and create the active site for splicing catalysis. A recycling process involving protein Prp24 re-anneals U4 and U6.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00015
http://en.wikipedia.org/wiki/U4_snRNA
wiki
U4 small nuclear RNA
RSC:cb
small nuclear RNA U4
RSC:cb
snRNA U4
RSC:cb
An snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U6atac_snRNA (SO:0000397).
U4atac small nuclear RNA
U4atac snRNA
small nuclear RNA U4atac
snRNA U4atac
sequence
SO:0000394
U4atac_snRNA
An snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U6atac_snRNA (SO:0000397).
PMID:12409455
U4atac small nuclear RNA
RSC:cb
small nuclear RNA U4atac
RSC:cb
snRNA U4atac
RSC:cb
U5 RNA is a component of both types of known spliceosome. The precise function of this molecule is unknown, though it is known that the 5' loop is required for splice site selection and p220 binding, and that both the 3' stem-loop and the Sm site are important for Sm protein binding and cap methylation.
http://en.wikipedia.org/wiki/U5_snRNA
U5 small nuclear RNA
U5 snRNA
small nuclear RNA U5
snRNA U5
sequence
SO:0000395
U5_snRNA
U5 RNA is a component of both types of known spliceosome. The precise function of this molecule is unknown, though it is known that the 5' loop is required for splice site selection and p220 binding, and that both the 3' stem-loop and the Sm site are important for Sm protein binding and cap methylation.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00020
http://en.wikipedia.org/wiki/U5_snRNA
wiki
U5 small nuclear RNA
RSC:cb
small nuclear RNA U5
RSC:cb
snRNA U5
RSC:cb
U6 snRNA is a component of the spliceosome which is involved in splicing pre-mRNA. The putative secondary structure consensus base pairing is confined to a short 5' stem loop, but U6 snRNA is thought to form extensive base-pair interactions with U4 snRNA.
http://en.wikipedia.org/wiki/U6_snRNA
U6 small nuclear RNA
U6 snRNA
small nuclear RNA U6
snRNA U6
sequence
SO:0000396
U6_snRNA
U6 snRNA is a component of the spliceosome which is involved in splicing pre-mRNA. The putative secondary structure consensus base pairing is confined to a short 5' stem loop, but U6 snRNA is thought to form extensive base-pair interactions with U4 snRNA.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00026
http://en.wikipedia.org/wiki/U6_snRNA
wiki
U6 small nuclear RNA
RSC:cb
small nuclear RNA U6
RSC:cb
snRNA U6
RSC:cb
U6atac_snRNA is an snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U4atac_snRNA (SO:0000394).
U6atac small nuclear RNA
U6atac snRNA
snRNA U6atac
sequence
SO:0000397
U6atac_snRNA
U6atac_snRNA is an snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U4atac_snRNA (SO:0000394).
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=retrieve&db=pubmed&list_uids=12409455&dopt=Abstract
U6atac small nuclear RNA
RSC:cb
U6atac snRNA
RSC:cb
snRNA U6atac
RSC:cb
U11 snRNA plays a role in splicing of the minor U12-dependent class of eukaryotic nuclear introns. Similar to U1 snRNA in the major class spliceosome, it base pairs to the conserved 5' splice site sequence.
http://en.wikipedia.org/wiki/U11_snRNA
U11 small nuclear RNA
U11 snRNA
small nuclear RNA U11
snRNA U11
sequence
SO:0000398
U11_snRNA
U11 snRNA plays a role in splicing of the minor U12-dependent class of eukaryotic nuclear introns. Similar to U1 snRNA in the major class spliceosome, it base pairs to the conserved 5' splice site sequence.
PMID:9622129
http://en.wikipedia.org/wiki/U11_snRNA
wiki
U11 small nuclear RNA
RSC:cb
small nuclear RNA U11
RSC:cb
snRNA U11
RSC:cb
The U12 small nuclear RNA (snRNA), together with U4atac/U6atac, U5, and U11 snRNAs and associated proteins, forms a spliceosome that cleaves a divergent class of low-abundance pre-mRNA introns.
http://en.wikipedia.org/wiki/U12_snRNA
U12 small nuclear RNA
U12 snRNA
small nuclear RNA U12
snRNA U12
sequence
SO:0000399
U12_snRNA
The U12 small nuclear RNA (snRNA), together with U4atac/U6atac, U5, and U11 snRNAs and associated proteins, forms a spliceosome that cleaves a divergent class of low-abundance pre-mRNA introns.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00007
http://en.wikipedia.org/wiki/U12_snRNA
wiki
U12 small nuclear RNA
RSC:cb
small nuclear RNA U12
RSC:cb
snRNA U12
RSC:cb
An attribute describes a quality of sequence.
sequence attribute
sequence
SO:0000400
sequence_attribute
An attribute describes a quality of sequence.
SO:ke
An attribute describing a gene.
gene attribute
sequence
SO:0000401
gene_attribute
sequence
SO:0000402
enhancer_attribute
true
U14 small nucleolar RNA (U14 snoRNA) is required for early cleavages of eukaryotic precursor rRNAs. In yeasts, this molecule possesses a stem-loop region (known as the Y-domain) which is essential for function. A similar structure, but with a different consensus sequence, is found in plants, but is absent in vertebrates.
SO:0005839
U14 small nucleolar RNA
U14 snoRNA
small nucleolar RNA U14
snoRNA U14
sequence
SO:0000403
An evolutionarily conserved eukaryotic low molecular weight RNA capable of intermolecular hybridization with both homologous and heterologous 18S rRNA.
U14_snoRNA
U14 small nucleolar RNA (U14 snoRNA) is required for early cleavages of eukaryotic precursor rRNAs. In yeasts, this molecule possesses a stem-loop region (known as the Y-domain) which is essential for function. A similar structure, but with a different consensus sequence, is found in plants, but is absent in vertebrates.
PMID:2551119
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00016
A family of RNAs found as part of the enigmatic vault ribonucleoprotein complex. The complex consists of a major vault protein (MVP), two minor vault proteins (VPARP and TEP1), and several small untranslated RNA molecules. It has been suggested that the vault complex is involved in drug resistance.
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/Vault_RNA
INSDC_qualifier:vault_RNA
vault RNA
sequence
SO:0000404
vault_RNA
A family of RNAs found as part of the enigmatic vault ribonucleoprotein complex. The complex consists of a major vault protein (MVP), two minor vault proteins (VPARP and TEP1), and several small untranslated RNA molecules. It has been suggested that the vault complex is involved in drug resistance.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00006
http://en.wikipedia.org/wiki/Vault_RNA
wiki
Y RNAs are components of the Ro ribonucleoprotein particle (Ro RNP), in association with Ro60 and La proteins. The Y RNAs and Ro60 and La proteins are well conserved, but the function of the Ro RNP is not known. In humans the RNA component can be one of four small RNAs: hY1, hY3, hY4 and hY5. These small RNAs are predicted to fold into a conserved secondary structure containing three stem structures. The largest of the four, hY1, contains an additional hairpin.
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/Y_RNA
INSDC_qualifier:Y_RNA
Y RNA
sequence
SO:0000405
Y_RNA
Y RNAs are components of the Ro ribonucleoprotein particle (Ro RNP), in association with Ro60 and La proteins. The Y RNAs and Ro60 and La proteins are well conserved, but the function of the Ro RNP is not known. In humans the RNA component can be one of four small RNAs: hY1, hY3, hY4 and hY5. These small RNAs are predicted to fold into a conserved secondary structure containing three stem structures. The largest of the four, hY1, contains an additional hairpin.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00019
http://en.wikipedia.org/wiki/Y_RNA
wiki
An intron within an intron. Twintrons are group II or III introns, into which another group II or III intron has been transposed.
http://en.wikipedia.org/wiki/Twintron
sequence
SO:0000406
twintron
An intron within an intron. Twintrons are group II or III introns, into which another group II or III intron has been transposed.
PMID:1899376
PMID:7823908
http://en.wikipedia.org/wiki/Twintron
wiki
Cytosolic 18S rRNA is an RNA component of the small subunit of cytosolic ribosomes in eukaryotes.
http://en.wikipedia.org/wiki/18S_ribosomal_RNA
cytosolic 18S rRNA
cytosolic 18S ribosomal RNA
cytosolic rRNA 18S
sequence
SO:0000407
Renamed to cytosolic_18S_rRNA from rRNA_18S on 10 June 2021 as per restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Request from EBI. See GitHub Issue #493.
cytosolic_18S_rRNA
Cytosolic 18S rRNA is an RNA component of the small subunit of cytosolic ribosomes in eukaryotes.
SO:ke
http://en.wikipedia.org/wiki/18S_ribosomal_RNA
wiki
The interbase position where something (eg an aberration) occurred.
sequence
SO:0000408
site
true
The interbase position where something (eg an aberration) occurred.
SO:ke
A biological_region of sequence that, in the molecule, interacts selectively and non-covalently with other molecules. A region on the surface of a molecule that may interact with another molecule. When applied to polypeptides: Amino acids involved in binding or interactions. It can also apply to an amino acid bond which is represented by the positions of the two flanking amino acids.
BS:00033
http://en.wikipedia.org/wiki/Binding_site
INSDC_feature:misc_binding
binding site
binding_or_interaction_site
sequence
site
SO:0000409
See GO:0005488 : binding.
binding_site
A biological_region of sequence that, in the molecule, interacts selectively and non-covalently with other molecules. A region on the surface of a molecule that may interact with another molecule. When applied to polypeptides: Amino acids involved in binding or interactions. It can also apply to an amino acid bond which is represented by the positions of the two flanking amino acids.
EBIBS:GAR
SO:ke
http://en.wikipedia.org/wiki/Binding_site
wiki
A binding site that, in the molecule, interacts selectively and non-covalently with polypeptide molecules.
INSDC_feature:protein_bind
protein binding site
sequence
SO:0000410
See GO:0042277 : peptide binding.
protein_binding_site
A binding site that, in the molecule, interacts selectively and non-covalently with polypeptide molecules.
SO:ke
A region that rescues.
rescue fragment
rescue region
sequence
rescue segment
SO:0000411
rescue_region
A region that rescues.
SO:xp
A region of polynucleotide sequence produced by digestion with a restriction endonuclease.
http://en.wikipedia.org/wiki/Restriction_fragment
restriction fragment
sequence
SO:0000412
restriction_fragment
A region of polynucleotide sequence produced by digestion with a restriction endonuclease.
SO:ke
http://en.wikipedia.org/wiki/Restriction_fragment
wiki
A region where the sequence differs from that of a specified sequence.
INSDC_feature:misc_difference
sequence difference
sequence
SO:0000413
sequence_difference
A region where the sequence differs from that of a specified sequence.
SO:ke
An attribute to describe a feature that is invalidated due to genomic contamination.
invalidated by genomic contamination
sequence
SO:0000414
invalidated_by_genomic_contamination
An attribute to describe a feature that is invalidated due to genomic contamination.
SO:ke
An attribute to describe a feature that is invalidated due to polyA priming.
invalidated by genomic polyA primed cDNA
sequence
SO:0000415
invalidated_by_genomic_polyA_primed_cDNA
An attribute to describe a feature that is invalidated due to polyA priming.
SO:ke
An attribute to describe a feature that is invalidated due to partial processing.
invalidated by partial processing
sequence
SO:0000416
invalidated_by_partial_processing
An attribute to describe a feature that is invalidated due to partial processing.
SO:ke
A structurally or functionally defined protein region. In proteins with multiple domains, the combination of the domains determines the function of the protein. A region which has been shown to recur throughout evolution.
BS:00012
BS:00134
SO:0001069
domain
structural domain
polypeptide domain
polypeptide_structural_domain
sequence
SO:0000417
Range. Old definition from before biosapiens: A region of a single polypeptide chain that folds into an independent unit and exhibits biological activity. A polypeptide chain may have multiple domains.
polypeptide_domain
A structurally or functionally defined protein region. In proteins with multiple domains, the combination of the domains determines the function of the protein. A region which has been shown to recur throughout evolution.
EBIBS:GAR
domain
uniprot:feature_type
structural domain
polypeptide_structural_domain
The signal_peptide is a short region of the peptide located at the N-terminus that directs the protein to be secreted or to become part of membrane components.
BS:00159
http://en.wikipedia.org/wiki/Signal_peptide
INSDC_feature:sig_peptide
signal peptide
signal peptide coding sequence
sequence
signal
SO:0000418
Old def before biosapiens: The sequence for an N-terminal domain of a secreted protein; this domain is involved in attaching nascent polypeptide to the membrane leader sequence.
signal_peptide
The signal_peptide is a short region of the peptide located at the N-terminus that directs the protein to be secreted or to become part of membrane components.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Signal_peptide
wiki
signal
uniprot:feature_type
The polypeptide sequence that remains when the cleaved peptide regions have been cleaved from the immature peptide.
BS:00149
INSDC_feature:mat_peptide
mature protein region
sequence
chain
mature peptide
SO:0000419
The term mature peptide was merged with the biosapiens term mature protein region, which was taken as the new name. Old def: The coding sequence for the mature or final peptide or protein product following post-translational modification.
mature_protein_region
The polypeptide sequence that remains when the cleaved peptide regions have been cleaved from the immature peptide.
EBIBS:GAR
SO:cb
http://www.insdc.org/files/feature_table.html
chain
uniprot:feature_type
An inverted repeat (SO:0000294) occurring at the 5-prime termini of a DNA transposon.
5' TIR
five prime terminal inverted repeat
sequence
SO:0000420
five_prime_terminal_inverted_repeat
An inverted repeat (SO:0000294) occurring at the 3-prime termini of a DNA transposon.
3' TIR
three prime terminal inverted repeat
sequence
SO:0000421
three_prime_terminal_inverted_repeat
The U5 segment of the long terminal repeats.
U5 LTR region
U5 long terminal repeat region
sequence
SO:0000422
U5_LTR_region
The R segment of the long terminal repeats.
R LTR region
R long terminal repeat region
sequence
SO:0000423
R_LTR_region
The U3 segment of the long terminal repeats.
U3 LTR region
U3 long terminal repeat region
sequence
SO:0000424
U3_LTR_region
The long terminal repeat found at the five-prime end of the sequence to be inserted into the host genome.
5' LTR
5' long terminal repeat
five prime LTR
sequence
SO:0000425
five_prime_LTR
The long terminal repeat found at the three-prime end of the sequence to be inserted into the host genome.
3' LTR
3' long terminal repeat
three prime LTR
sequence
SO:0000426
three_prime_LTR
The R segment of the five-prime long terminal repeat.
R 5' long terminal repeat region
R five prime LTR region
sequence
SO:0000427
R_five_prime_LTR_region
The U5 segment of the five-prime long terminal repeat.
U5 5' long terminal repeat region
U5 five prime LTR region
sequence
SO:0000428
U5_five_prime_LTR_region
The U3 segment of the five-prime long terminal repeat.
U3 5' long terminal repeat region
U3 five prime LTR region
sequence
SO:0000429
U3_five_prime_LTR_region
The R segment of the three-prime long terminal repeat.
R 3' long terminal repeat region
R three prime LTR region
sequence
SO:0000430
R_three_prime_LTR_region
The U3 segment of the three-prime long terminal repeat.
U3 3' long terminal repeat region
U3 three prime LTR region
sequence
SO:0000431
U3_three_prime_LTR_region
The U5 segment of the three-prime long terminal repeat.
U5 3' long terminal repeat region
U5 three prime LTR region
sequence
SO:0000432
U5_three_prime_LTR_region
A polymeric tract, such as poly(dA), within a non_LTR_retrotransposon.
INSDC_feature:repeat_region
INSDC_qualifier:non_ltr_retrotransposon_polymeric_tract
non LTR retrotransposon polymeric tract
sequence
SO:0000433
non_LTR_retrotransposon_polymeric_tract
A polymeric tract, such as poly(dA), within a non_LTR_retrotransposon.
SO:ke
A sequence of the target DNA that is duplicated when a transposable element or phage inserts; usually found at each end of the insertion.
target site duplication
sequence
SO:0000434
target_site_duplication
A sequence of the target DNA that is duplicated when a transposable element or phage inserts; usually found at each end of the insertion.
http://www.koko.gov.my/CocoaBioTech/Glossaryt.html
A polypurine tract within an LTR_retrotransposon.
RR tract
sequence
LTR retrotransposon poly purine tract
SO:0000435
RR_tract
A polypurine tract within an LTR_retrotransposon.
SO:ke
A sequence that can autonomously replicate, as a plasmid, when transformed into a bacterial host.
autonomously replicating sequence
sequence
SO:0000436
ARS
A sequence that can autonomously replicate, as a plasmid, when transformed into a bacterial host.
SO:ma
sequence
SO:0000437
assortment_derived_duplication
true
sequence
SO:0000438
gene_not_polyadenylated
true
A ring chromosome is a chromosome whose arms have fused together to form a ring in an inverted fashion, often with the loss of the ends of the chromosome.
inverted ring chromosome
sequence
SO:0000439
inverted_ring_chromosome
A replicon that has been modified to act as a vector for foreign sequence.
http://en.wikipedia.org/wiki/Vector_(molecular_biology)
vector
vector replicon
sequence
SO:0000440
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
vector_replicon
A replicon that has been modified to act as a vector for foreign sequence.
SO:ma
http://en.wikipedia.org/wiki/Vector_(molecular_biology)
wiki
A single stranded oligonucleotide.
single strand oligo
single strand oligonucleotide
single stranded oligonucleotide
ss oligo
ss oligonucleotide
sequence
SO:0000441
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
ss_oligo
A single stranded oligonucleotide.
SO:ke
A double stranded oligonucleotide.
double stranded oligonucleotide
ds oligo
ds-oligonucleotide
sequence
SO:0000442
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
ds_oligo
A double stranded oligonucleotide.
SO:ke
An attribute to describe the kind of biological sequence.
polymer attribute
sequence
SO:0000443
polymer_attribute
An attribute to describe the kind of biological sequence.
SO:ke
Non-coding exon in the 3' UTR.
three prime noncoding exon
sequence
SO:0000444
three_prime_noncoding_exon
Non-coding exon in the 3' UTR.
SO:ke
Non-coding exon in the 5' UTR.
5' nc exon
5' non coding exon
five prime noncoding exon
sequence
SO:0000445
five_prime_noncoding_exon
Non-coding exon in the 5' UTR.
SO:ke
Intron located in the untranslated region.
UTR intron
sequence
SO:0000446
UTR_intron
Intron located in the untranslated region.
SO:ke
An intron located in the 5' UTR.
five prime UTR intron
sequence
SO:0000447
five_prime_UTR_intron
An intron located in the 5' UTR.
SO:ke
An intron located in the 3' UTR.
three prime UTR intron
sequence
SO:0000448
three_prime_UTR_intron
An intron located in the 3' UTR.
SO:ke
A sequence of nucleotides or amino acids which, by design, has a "random" order of components, given a predetermined input frequency of these components.
random sequence
sequence
SO:0000449
random_sequence
A sequence of nucleotides or amino acids which, by design, has a "random" order of components, given a predetermined input frequency of these components.
SO:ma
A light region between two darkly staining bands in a polytene chromosome.
sequence
chromosome interband
SO:0000450
interband
A light region between two darkly staining bands in a polytene chromosome.
SO:ma
A gene that encodes a polyadenylated mRNA.
gene with polyadenylated mRNA
sequence
SO:0000451
gene_with_polyadenylated_mRNA
A gene that encodes a polyadenylated mRNA.
SO:xp
sequence
SO:0000452
transgene_attribute
true
A chromosome structure variant whereby a region of a chromosome has been transferred to another position. Among interchromosomal rearrangements, the term transposition is reserved for that class in which the telomeres of the chromosomes involved are coupled (that is to say, form the two ends of a single DNA molecule) as in wild-type.
chromosomal transposition
transposition
sequence
SO:0000453
chromosomal_transposition
A chromosome structure variant whereby a region of a chromosome has been transferred to another position. Among interchromosomal rearrangements, the term transposition is reserved for that class in which the telomeres of the chromosomes involved are coupled (that is to say, form the two ends of a single DNA molecule) as in wild-type.
FB:reference_manual
SO:ke
A 17-28-nt, small interfering RNA derived from transcripts of repetitive elements.
INSDC_feature:ncRNA
INSDC_qualifier:rasiRNA
repeat associated small interfering RNA
sequence
SO:0000454
Changed parent term from ncRNA (SO:0000655) to piRNA (SO:0001035). See GitHub Issue #573.
rasiRNA
A 17-28-nt, small interfering RNA derived from transcripts of repetitive elements.
PMID:18032451
http://www.developmentalcell.com/content/article/abstract?uid=PIIS1534580703002284
A gene that encodes an mRNA with a frameshift.
gene with mRNA with frameshift
sequence
SO:0000455
gene_with_mRNA_with_frameshift
A gene that encodes an mRNA with a frameshift.
SO:xp
A gene that is recombinationally rearranged.
recombinationally rearranged gene
sequence
SO:0000456
recombinationally_rearranged_gene
A gene that is recombinationally rearranged.
SO:ke
A chromosome duplication involving an insertion from another chromosome.
interchromosomal duplication
sequence
SO:0000457
interchromosomal_duplication
A chromosome duplication involving an insertion from another chromosome.
SO:ke
Germline genomic DNA including D-region with 5' UTR and 3' UTR, also designated as D-segment.
D gene
D-GENE
INSDC_feature:D_segment
sequence
SO:0000458
D_gene_segment
Germline genomic DNA including D-region with 5' UTR and 3' UTR, also designated as D-segment.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
A gene with a transcript that is trans-spliced.
gene with trans spliced transcript
sequence
SO:0000459
gene_with_trans_spliced_transcript
A gene with a transcript that is trans-spliced.
SO:xp
Germline genomic DNA with the sequence for a V, D, C, or J portion of an immunoglobulin/T-cell receptor.
vertebrate immunoglobulin T cell receptor segment
vertebrate_immunoglobulin/T-cell receptor gene
sequence
SO:0000460
I am using the term segment instead of gene here to avoid confusion with the region 'gene'.
vertebrate_immunoglobulin_T_cell_receptor_segment
A chromosomal deletion whereby a chromosome is generated by recombination between two inversions; there is a deficiency at each end of the inversion.
inversion derived bipartite deficiency
sequence
SO:0000461
inversion_derived_bipartite_deficiency
A chromosomal deletion whereby a chromosome is generated by recombination between two inversions; there is a deficiency at each end of the inversion.
FB:km
A non-functional descendant of a functional entity.
pseudogenic region
sequence
SO:0000462
pseudogenic_region
A non-functional descendant of a functional entity.
SO:cjm
A gene that encodes more than one transcript.
encodes alternately spliced transcripts
sequence
SO:0000463
encodes_alternately_spliced_transcripts
A gene that encodes more than one transcript.
SO:ke
A non-functional descendant of an exon.
decayed exon
sequence
SO:0000464
Does not have to be part of a pseudogene.
decayed_exon
A non-functional descendant of an exon.
SO:ke
A chromosome deletion whereby a chromosome is generated by recombination between two inversions; there is a deficiency at one end of the inversion and a duplication at the other end of the inversion.
inversion derived deficiency plus duplication
sequence
SO:0000465
inversion_derived_deficiency_plus_duplication
A chromosome deletion whereby a chromosome is generated by recombination between two inversions; there is a deficiency at one end of the inversion and a duplication at the other end of the inversion.
FB:km
Germline genomic DNA including L-part1, V-intron and V-exon, with the 5' UTR and 3' UTR.
INSDC_feature:V_segment
V gene
V gene segment
V-GENE
variable_gene
sequence
SO:0000466
V_gene_segment
Germline genomic DNA including L-part1, V-intron and V-exon, with the 5' UTR and 3' UTR.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
An attribute describing a gene sequence where the resulting protein is regulated by the stability of the resulting protein.
post translationally regulated by protein stability
post-translationally regulated by protein stability
sequence
SO:0000467
post_translationally_regulated_by_protein_stability
An attribute describing a gene sequence where the resulting protein is regulated by the stability of the resulting protein.
SO:ke
One of the pieces of sequence that make up a golden path.
golden path fragment
sequence
SO:0000468
golden_path_fragment
One of the pieces of sequence that make up a golden path.
SO:rd
An attribute describing a gene sequence where the resulting protein is modified to regulate it.
post translationally regulated by protein modification
post-translationally regulated by protein modification
sequence
SO:0000469
post_translationally_regulated_by_protein_modification
An attribute describing a gene sequence where the resulting protein is modified to regulate it.
SO:ke
Germline genomic DNA of an immunoglobulin/T-cell receptor gene including J-region with 5' UTR (SO:0000204) and 3' UTR (SO:0000205), also designated as J-segment.
INSDC_feature:J_segment
J gene
J-GENE
sequence
SO:0000470
J_gene_segment
Germline genomic DNA of an immunoglobulin/T-cell receptor gene including J-region with 5' UTR (SO:0000204) and 3' UTR (SO:0000205), also designated as J-segment.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
The gene product is involved in its own transcriptional regulation.
sequence
SO:0000471
autoregulated
The gene product is involved in its own transcriptional regulation.
SO:ke
A set of regions which overlap with minimal polymorphism to form a linear sequence.
tiling path
sequence
SO:0000472
tiling_path
A set of regions which overlap with minimal polymorphism to form a linear sequence.
SO:cjm
The gene product is involved in its own transcriptional regulation where it decreases transcription.
negatively autoregulated
sequence
SO:0000473
negatively_autoregulated
The gene product is involved in its own transcriptional regulation where it decreases transcription.
SO:ke
A piece of sequence that makes up a tiling_path (SO:0000472).
tiling path fragment
sequence
SO:0000474
tiling_path_fragment
A piece of sequence that makes up a tiling_path (SO:0000472).
SO:ke
The gene product is involved in its own transcriptional regulation, where it increases transcription.
positively autoregulated
sequence
SO:0000475
positively_autoregulated
The gene product is involved in its own transcriptional regulation, where it increases transcription.
SO:ke
A DNA sequencer read which is part of a contig.
contig read
sequence
SO:0000476
contig_read
A DNA sequencer read which is part of a contig.
SO:ke
A gene that is polycistronic.
sequence
SO:0000477
polycistronic_gene
true
A gene that is polycistronic.
SO:ke
Genomic DNA of immunoglobulin/T-cell receptor gene including C-region (and introns if present) with 5' UTR (SO:0000204) and 3' UTR (SO:0000205).
C gene
C_GENE
INSDC_feature:C_region
constant gene
sequence
SO:0000478
C_gene_segment
Genomic DNA of immunoglobulin/T-cell receptor gene including C-region (and introns if present) with 5' UTR (SO:0000204) and 3' UTR (SO:0000205).
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
A transcript that is trans-spliced.
INSDC_feature:tRNA
INSDC_qualifier:trans_splicing
trans spliced transcript
trans-spliced transcript
sequence
SO:0000479
trans_spliced_transcript
A transcript that is trans-spliced.
SO:xp
A clone which is part of a tiling path. A tiling path is a set of sequencing substrates, typically clones, which have been selected in order to efficiently cover a region of the genome in preparation for sequencing and assembly.
tiling path clone
sequence
SO:0000480
tiling_path_clone
A clone which is part of a tiling path. A tiling path is a set of sequencing substrates, typically clones, which have been selected in order to efficiently cover a region of the genome in preparation for sequencing and assembly.
SO:ke
An inverted repeat (SO:0000294) occurring at the termini of a DNA transposon.
TIR
terminal inverted repeat
sequence
SO:0000481
terminal_inverted_repeat
An inverted repeat (SO:0000294) occurring at the termini of a DNA transposon.
SO:ke
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration.
vertebrate immunoglobulin T cell receptor gene cluster
vertebrate_immunoglobulin/T-cell receptor gene cluster
sequence
SO:0000482
vertebrate_immunoglobulin_T_cell_receptor_gene_cluster
A primary transcript that is never translated into a protein.
nc primary transcript
noncoding primary transcript
sequence
SO:0000483
nc_primary_transcript
A primary transcript that is never translated into a protein.
SO:ke
The sequence of the 3' exon that is not coding.
three prime coding exon noncoding region
three_prime_exon_noncoding_region
sequence
SO:0000484
three_prime_coding_exon_noncoding_region
The sequence of the 3' exon that is not coding.
SO:ke
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one DJ-gene, and one J-gene.
(DJ)-J-CLUSTER
DJ J cluster
sequence
SO:0000485
DJ_J_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one DJ-gene, and one J-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
The sequence of the 5' exon preceding the start codon.
five prime coding exon noncoding region
five_prime_exon_noncoding_region
sequence
SO:0000486
five_prime_coding_exon_noncoding_region
The sequence of the 5' exon preceding the start codon.
SO:ke
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VDJ-gene, one J-gene and one C-gene.
(VDJ)-J-C-CLUSTER
VDJ J C cluster
sequence
SO:0000487
VDJ_J_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VDJ-gene, one J-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VDJ-gene and one J-gene.
(VDJ)-J-CLUSTER
VDJ J cluster
sequence
SO:0000488
VDJ_J_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VDJ-gene and one J-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene and one C-gene.
VJ C cluster
sequence
(VJ)-C-CLUSTER
SO:0000489
VJ_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene, one J-gene and one C-gene.
(VJ)-J-C-CLUSTER
VJ J C cluster
sequence
SO:0000490
VJ_J_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene, one J-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene and one J-gene.
(VJ)-J-CLUSTER
VJ J cluster
sequence
SO:0000491
VJ_J_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene and one J-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Recombination signal including D-heptamer, D-spacer and D-nonamer in 5' of D-region of a D-gene or D-sequence.
D gene recombination feature
sequence
SO:0000492
D_gene_recombination_feature
7 nucleotide recombination site like CACAGTG, part of a 3' D-recombination signal sequence of an immunoglobulin/T-cell receptor gene.
3'D-HEPTAMER
three prime D heptamer
sequence
SO:0000493
three_prime_D_heptamer
7 nucleotide recombination site like CACAGTG, part of a 3' D-recombination signal sequence of an immunoglobulin/T-cell receptor gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
A 9 nucleotide recombination site (e.g. ACAAAAACC), part of a 3' D-recombination signal sequence of an immunoglobulin/T-cell receptor gene.
3'D-NONAMER
three prime D nonamer
sequence
SO:0000494
three_prime_D_nonamer
A 9 nucleotide recombination site (e.g. ACAAAAACC), part of a 3' D-recombination signal sequence of an immunoglobulin/T-cell receptor gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
A 12 or 23 nucleotide spacer between the 3'D-HEPTAMER and 3'D-NONAMER of a 3'D-RS.
3'D-SPACER
three prime D spacer
sequence
SO:0000495
three_prime_D_spacer
A 12 or 23 nucleotide spacer between the 3'D-HEPTAMER and 3'D-NONAMER of a 3'D-RS.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
7 nucleotide recombination site (e.g. CACTGTG), part of a 5' D-recombination signal sequence (SO:0000556) of an immunoglobulin/T-cell receptor gene.
5'D-HEPTAMER
five prime D heptamer
sequence
SO:0000496
five_prime_D_heptamer
7 nucleotide recombination site (e.g. CACTGTG), part of a 5' D-recombination signal sequence (SO:0000556) of an immunoglobulin/T-cell receptor gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
9 nucleotide recombination site (e.g. GGTTTTTGT), part of a five_prime_D-recombination signal sequence (SO:0000556) of an immunoglobulin/T-cell receptor gene.
5'D-NONAMER
five prime D nonamer
sequence
SO:0000497
five_prime_D_nonamer
9 nucleotide recombination site (e.g. GGTTTTTGT), part of a five_prime_D-recombination signal sequence (SO:0000556) of an immunoglobulin/T-cell receptor gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
12 or 23 nucleotide spacer between the 5' D-heptamer (SO:0000496) and 5' D-nonamer (SO:0000497) of a 5' D-recombination signal sequence (SO:0000556) of an immunoglobulin/T-cell receptor gene.
5'-SPACER
five prime D spacer
five prime D-spacer
sequence
SO:0000498
five_prime_D_spacer
12 or 23 nucleotide spacer between the 5' D-heptamer (SO:0000496) and 5' D-nonamer (SO:0000497) of a 5' D-recombination signal sequence (SO:0000556) of an immunoglobulin/T-cell receptor gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
A continuous piece of sequence similar to the 'virtual contig' concept of the Ensembl database.
virtual sequence
sequence
SO:0000499
virtual_sequence
A continuous piece of sequence similar to the 'virtual contig' concept of the Ensembl database.
SO:ke
A type of non-canonical base-pairing. This is less energetically favourable than Watson-Crick base pairing. Hoogsteen GC base pairs only have two hydrogen bonds.
http://en.wikipedia.org/wiki/Hoogsteen_base_pair
Hoogsteen base pair
sequence
SO:0000500
Hoogsteen_base_pair
A type of non-canonical base-pairing. This is less energetically favourable than Watson-Crick base pairing. Hoogsteen GC base pairs only have two hydrogen bonds.
PMID:12177293
http://en.wikipedia.org/wiki/Hoogsteen_base_pair
wiki
A type of non-canonical base-pairing.
reverse Hoogsteen base pair
sequence
SO:0000501
reverse_Hoogsteen_base_pair
A type of non-canonical base-pairing.
SO:ke
A region of sequence that is transcribed. This region may cover the transcript of a gene, it may encompass the sequence covered by all of the transcripts of an alternately spliced gene, or it may cover the region transcribed by a polycistronic transcript. A gene may have 1 or more transcribed regions and a transcribed_region may belong to one or more genes.
sequence
SO:0000502
This concept came about as a direct result of the SO meeting August 2004. The exact nature of the relationship between transcribed_region and gene is still up for discussion. We are going with 'associated_with' for the time being.
transcribed_region
true
A region of sequence that is transcribed. This region may cover the transcript of a gene, it may encompass the sequence covered by all of the transcripts of an alternately spliced gene, or it may cover the region transcribed by a polycistronic transcript. A gene may have 1 or more transcribed regions and a transcribed_region may belong to one or more genes.
SO:ke
sequence
SO:0000503
alternately_spliced_gene_encodeing_one_transcript
true
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one D-gene, one DJ-gene and one C-gene.
D DJ C cluster
D-(DJ)-C-CLUSTER
sequence
SO:0000504
D_DJ_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one D-gene, one DJ-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one D-gene and one DJ-gene.
D DJ cluster
D-(DJ)-CLUSTER
sequence
SO:0000505
D_DJ_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one D-gene and one DJ-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one D-gene, one DJ-gene, one J-gene and one C-gene.
D DJ J C cluster
D-(DJ)-J-C-CLUSTER
sequence
SO:0000506
D_DJ_J_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one D-gene, one DJ-gene, one J-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
A non functional descendant of an exon, part of a pseudogene.
pseudogenic exon
sequence
SO:0000507
This is the analog of the exon of a functional gene. The term was requested by Rama - SGD to allow the annotation of the parts of a pseudogene. Non-functional is defined as either its transcription or translation (or both) are prevented due to one or more mutations.
pseudogenic_exon
A non functional descendant of an exon, part of a pseudogene.
SO:ke
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one D-gene, one DJ-gene, and one J-gene.
D DJ J cluster
D-(DJ)-J-CLUSTER
sequence
SO:0000508
D_DJ_J_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one D-gene, one DJ-gene, and one J-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one D-gene, one J-gene and one C-gene.
D J C cluster
D-J-C-CLUSTER
sequence
SO:0000509
D_J_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one D-gene, one J-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in partially rearranged genomic DNA including L-part1, V-intron and V-D-exon, with the 5' UTR (SO:0000204) and 3' UTR (SO:0000205).
VD gene
V_D_GENE
sequence
SO:0000510
VD_gene_segment
Genomic DNA of immunoglobulin/T-cell receptor gene in partially rearranged genomic DNA including L-part1, V-intron and V-D-exon, with the 5' UTR (SO:0000204) and 3' UTR (SO:0000205).
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one J-gene and one C-gene.
J C cluster
J-C-CLUSTER
sequence
SO:0000511
J_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one J-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
A chromosomal deletion whereby a chromosome is generated by recombination between two inversions; there is a deficiency at one end and a presumed deficiency or duplication at the other end of the inversion.
inversion derived deficiency plus aneuploid
sequence
SO:0000512
inversion_derived_deficiency_plus_aneuploid
A chromosomal deletion whereby a chromosome is generated by recombination between two inversions; there is a deficiency at one end and a presumed deficiency or duplication at the other end of the inversion.
FB:km
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including more than one J-gene.
J cluster
J-CLUSTER
sequence
SO:0000513
J_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including more than one J-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
9 nucleotide recombination site (e.g. GGTTTTTGT), part of a J-gene recombination feature of an immunoglobulin/T-cell receptor gene.
J nonamer
J-NONAMER
sequence
SO:0000514
J_nonamer
9 nucleotide recombination site (e.g. GGTTTTTGT), part of a J-gene recombination feature of an immunoglobulin/T-cell receptor gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
7 nucleotide recombination site (e.g. CACAGTG), part of a J-gene recombination feature of an immunoglobulin/T-cell receptor gene.
J heptamer
J-HEPTAMER
sequence
SO:0000515
J_heptamer
7 nucleotide recombination site (e.g. CACAGTG), part of a J-gene recombination feature of an immunoglobulin/T-cell receptor gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
A non functional descendant of a transcript, part of a pseudogene.
INSDC_feature:misc_RNA
INSDC_qualifier:pseudo
pseudogenic transcript
sequence
SO:0000516
This is the analog of the transcript of a functional gene. The term was requested by Rama - SGD to allow the annotation of the parts of a pseudogene. Non-functional is defined as either its transcription or translation (or both) are prevented due to one or more mutations.
pseudogenic_transcript
A non functional descendant of a transcript, part of a pseudogene.
SO:ke
12 or 23 nucleotide spacer between the J-nonamer and the J-heptamer of a J-gene recombination feature of an immunoglobulin/T-cell receptor gene.
J spacer
J-SPACER
sequence
SO:0000517
J_spacer
12 or 23 nucleotide spacer between the J-nonamer and the J-heptamer of a J-gene recombination feature of an immunoglobulin/T-cell receptor gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene and one DJ-gene.
V DJ cluster
V-(DJ)-CLUSTER
sequence
SO:0000518
V_DJ_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene and one DJ-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one DJ-gene and one J-gene.
V DJ J cluster
sequence
V-(DJ)-J-CLUSTER
SO:0000519
V_DJ_J_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one DJ-gene and one J-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VDJ-gene and one C-gene.
V VDJ C cluster
V-(VDJ)-C-CLUSTER
sequence
SO:0000520
V_VDJ_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VDJ-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene and one VDJ-gene.
V VDJ cluster
V-(VDJ)-CLUSTER
sequence
SO:0000521
V_VDJ_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene and one VDJ-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VDJ-gene and one J-gene.
V VDJ J cluster
sequence
V-(VDJ)-J-CLUSTER
SO:0000522
V_VDJ_J_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VDJ-gene and one J-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VJ-gene and one C-gene.
V VJ C cluster
V-(VJ)-C-CLUSTER
sequence
SO:0000523
V_VJ_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VJ-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene and one VJ-gene.
V VJ cluster
V-(VJ)-CLUSTER
sequence
SO:0000524
V_VJ_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene and one VJ-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VJ-gene and one J-gene.
V VJ J cluster
V-(VJ)-J-CLUSTER
sequence
SO:0000525
V_VJ_J_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VJ-gene and one J-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including more than one V-gene.
V cluster
V-CLUSTER
sequence
SO:0000526
V_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including more than one V-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one D-gene, one DJ-gene and one C-gene.
V D DJ C cluster
V-D-(DJ)-C-CLUSTER
sequence
SO:0000527
V_D_DJ_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one D-gene, one DJ-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one D-gene and one DJ-gene.
V D DJ cluster
V-D-(DJ)-CLUSTER
sequence
SO:0000528
V_D_DJ_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one D-gene and one DJ-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one D-gene, one DJ-gene, one J-gene and one C-gene.
V D DJ J C cluster
V-D-(DJ)-J-C-CLUSTER
sequence
SO:0000529
V_D_DJ_J_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one D-gene, one DJ-gene, one J-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one D-gene, one DJ-gene and one J-gene.
V D DJ J cluster
V-D-(DJ)-J-CLUSTER
sequence
SO:0000530
V_D_DJ_J_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one D-gene, one DJ-gene and one J-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one V-gene, one D-gene, one J-gene and one C-gene.
V D J C cluster
V-D-J-C-CLUSTER
sequence
SO:0000531
V_D_J_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one V-gene, one D-gene, one J-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one V-gene, one D-gene and one J-gene.
V D J cluster
V-D-J-CLUSTER
sequence
SO:0000532
V_D_J_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one V-gene, one D-gene and one J-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
7 nucleotide recombination site (e.g. CACAGTG), part of a V-gene recombination feature of an immunoglobulin/T-cell receptor gene.
V heptamer
V-HEPTAMER
sequence
SO:0000533
V_heptamer
7 nucleotide recombination site (e.g. CACAGTG), part of a V-gene recombination feature of an immunoglobulin/T-cell receptor gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one V-gene and one J-gene.
V J cluster
V-J-CLUSTER
sequence
SO:0000534
V_J_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one V-gene and one J-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one V-gene, one J-gene and one C-gene.
V J C cluster
V-J-C-CLUSTER
sequence
SO:0000535
V_J_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one V-gene, one J-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
9 nucleotide recombination site (e.g. ACAAAAACC), part of a V-gene recombination feature of an immunoglobulin/T-cell receptor gene.
V nonamer
V-NONAMER
sequence
SO:0000536
V_nonamer
9 nucleotide recombination site (e.g. ACAAAAACC), part of a V-gene recombination feature of an immunoglobulin/T-cell receptor gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
12 or 23 nucleotide spacer between the V-heptamer and the V-nonamer of a V-gene recombination feature of an immunoglobulin/T-cell receptor gene.
V spacer
V-SPACER
sequence
SO:0000537
V_spacer
12 or 23 nucleotide spacer between the V-heptamer and the V-nonamer of a V-gene recombination feature of an immunoglobulin/T-cell receptor gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Recombination signal including V-heptamer, V-spacer and V-nonamer in 3' of V-region of a V-gene or V-sequence of an immunoglobulin/T-cell receptor gene.
V gene recombination feature
V-RS
sequence
SO:0000538
V_gene_recombination_feature
Recombination signal including V-heptamer, V-spacer and V-nonamer in 3' of V-region of a V-gene or V-sequence of an immunoglobulin/T-cell receptor gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
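The V recombination signal described above (V-heptamer, then a 12 or 23 nucleotide V-spacer, then V-nonamer) can be illustrated with a small pattern search. This is only a sketch: it matches the two example sequences quoted in the definitions (CACAGTG and ACAAAAACC) exactly, whereas real signals vary, and the function name `find_v_rss` is hypothetical.

```python
import re

# Example heptamer and nonamer taken from the definitions above; real
# recombination signals are degenerate, so exact matching is an assumption.
V_RSS = re.compile(r"CACAGTG(?:[ACGT]{12}|[ACGT]{23})ACAAAAACC")

HEPTAMER_LEN = 7
NONAMER_LEN = 9


def find_v_rss(seq):
    """Return (0-based start, spacer length) for each heptamer-spacer-nonamer hit."""
    hits = []
    for match in V_RSS.finditer(seq.upper()):
        spacer_len = len(match.group(0)) - HEPTAMER_LEN - NONAMER_LEN
        hits.append((match.start(), spacer_len))
    return hits
```

A spacer of any length other than 12 or 23 yields no hit, which mirrors the spacer definition.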
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one DJ-gene and one C-gene.
(DJ)-C-CLUSTER
DJ C cluster
sequence
SO:0000539
DJ_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one DJ-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one DJ-gene, one J-gene and one C-gene.
(DJ)-J-C-CLUSTER
DJ J C cluster
sequence
SO:0000540
DJ_J_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one DJ-gene, one J-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VDJ-gene and one C-gene.
(VDJ)-C-CLUSTER
VDJ C cluster
sequence
SO:0000541
VDJ_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VDJ-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one DJ-gene and one C-gene.
V DJ C cluster
V-(DJ)-C-CLUSTER
sequence
SO:0000542
V_DJ_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one DJ-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
sequence
SO:0000543
alternately_spliced_gene_encoding_greater_than_one_transcript
true
A rolling circle transposon. Autonomous helitrons encode a 5'-to-3' DNA helicase and nuclease/ligase similar to those encoded by known rolling-circle replicons.
http://en.wikipedia.org/wiki/Helitron
sequence
ISCR
SO:0000544
helitron
A rolling circle transposon. Autonomous helitrons encode a 5'-to-3' DNA helicase and nuclease/ligase similar to those encoded by known rolling-circle replicons.
http://www.pnas.org/cgi/content/full/100/11/6569
http://en.wikipedia.org/wiki/Helitron
wiki
The pseudoknots involved in recoding are unique in that, as they play their role as a structure, they are immediately unfolded and their now linear sequence serves as a template for decoding.
recoding pseudoknot
sequence
SO:0000545
recoding_pseudoknot
The pseudoknots involved in recoding are unique in that, as they play their role as a structure, they are immediately unfolded and their now linear sequence serves as a template for decoding.
http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=33937
An oligonucleotide sequence, designed by an experimenter, that may or may not correspond to any natural sequence.
designed sequence
sequence
SO:0000546
designed_sequence
A chromosome generated by recombination between two inversions; there is a duplication at each end of the inversion.
inversion derived bipartite duplication
sequence
SO:0000547
inversion_derived_bipartite_duplication
A chromosome generated by recombination between two inversions; there is a duplication at each end of the inversion.
FB:km
A gene that encodes a transcript that is edited.
gene with edited transcript
sequence
SO:0000548
gene_with_edited_transcript
A gene that encodes a transcript that is edited.
SO:xp
A chromosome generated by recombination between two inversions; has a duplication at one end and presumed to have a deficiency or duplication at the other end of the inversion.
inversion derived duplication plus aneuploid
sequence
SO:0000549
inversion_derived_duplication_plus_aneuploid
A chromosome generated by recombination between two inversions; has a duplication at one end and presumed to have a deficiency or duplication at the other end of the inversion.
FB:km
A chromosome structural variation whereby a chromosome is either present in addition to the normal chromosome complement or is lacking.
aneuploid chromosome
sequence
SO:0000550
Examples are Nullo-4, Haplo-4 and triplo-4 in Drosophila.
aneuploid_chromosome
A chromosome structural variation whereby a chromosome is either present in addition to the normal chromosome complement or is lacking.
SO:ke
The recognition sequence necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation; consensus=AATAAA.
INSDC_feature:regulatory
INSDC_qualifier:polyA_signal_sequence
poly(A) signal
polyA signal sequence
polyadenylation termination signal
sequence
SO:0000551
Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527.
polyA_signal_sequence
The recognition sequence necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation; consensus=AATAAA.
http://www.insdc.org/files/feature_table.html
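As an illustration of the consensus quoted in the polyA_signal_sequence definition (AATAAA), the sketch below lists every exact occurrence in a sequence. The function name `find_polya_signals` is hypothetical, and exact matching is an assumption: the definition gives only the consensus, not the variants seen in practice.

```python
def find_polya_signals(seq, consensus="AATAAA"):
    """Return 0-based start offsets of every exact consensus hit (overlaps included)."""
    # Accept RNA input by mapping U to T before searching.
    dna = seq.upper().replace("U", "T")
    hits = []
    start = dna.find(consensus)
    while start != -1:
        hits.append(start)
        start = dna.find(consensus, start + 1)
    return hits
```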
A region in the 5' UTR that pairs with the 16S rRNA during formation of the preinitiation complex.
http://en.wikipedia.org/wiki/Shine-Dalgarno_sequence
Shine Dalgarno sequence
Shine-Dalgarno sequence
five prime ribosome binding site
sequence
RBS
SO:0000552
Not found in Eukaryotic sequence.
Shine_Dalgarno_sequence
A region in the 5' UTR that pairs with the 16S rRNA during formation of the preinitiation complex.
SO:jh
http://en.wikipedia.org/wiki/Shine-Dalgarno_sequence
wiki
The site on an RNA transcript to which adenine residues will be added by post-transcriptional polyadenylation. The boundary between the UTR and the polyA sequence.
SO:0001430
INSDC_feature:polyA_site
polyA cleavage site
polyA junction
polyA site
polyA_junction
sequence
polyadenylation site
SO:0000553
polyA_site
The site on an RNA transcript to which adenine residues will be added by post-transcriptional polyadenylation. The boundary between the UTR and the polyA sequence.
http://www.insdc.org/files/feature_table.html
sequence
SO:0000554
assortment_derived_deficiency_plus_duplication
true
5' most region of a precursor transcript that is clipped off during processing.
five prime clip
sequence
5' clip
SO:0000555
five_prime_clip
5' most region of a precursor transcript that is clipped off during processing.
http://www.insdc.org/files/feature_table.html
Recombination signal of an immunoglobulin/T-cell receptor gene, including the 5' D-nonamer (SO:0000497), 5' D-spacer (SO:0000498), and 5' D-heptamer (SO:0000396) in 5' of the D-region of a D-gene, or in 5' of the D-region of DJ-gene.
5'RS
five prime D recombination signal sequence
five prime D-recombination signal sequence
sequence
SO:0000556
five_prime_D_recombination_signal_sequence
Recombination signal of an immunoglobulin/T-cell receptor gene, including the 5' D-nonamer (SO:0000497), 5' D-spacer (SO:0000498), and 5' D-heptamer (SO:0000396) in 5' of the D-region of a D-gene, or in 5' of the D-region of DJ-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
3'-most region of a precursor transcript that is clipped off during processing.
3'-clip
three prime clip
sequence
SO:0000557
three_prime_clip
3'-most region of a precursor transcript that is clipped off during processing.
http://www.insdc.org/files/feature_table.html
Genomic DNA of immunoglobulin/T-cell receptor gene including more than one C-gene.
C cluster
C-CLUSTER
sequence
SO:0000558
C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene including more than one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including more than one D-gene.
D cluster
D-CLUSTER
sequence
SO:0000559
D_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including more than one D-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one D-gene and one J-gene.
D J cluster
D-J-CLUSTER
sequence
SO:0000560
D_J_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one D-gene and one J-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Seven nucleotide recombination site (e.g. CACAGTG), part of V-gene, D-gene or J-gene recombination feature of an immunoglobulin or T-cell receptor gene.
heptamer of recombination feature of vertebrate immune system gene
sequence
HEPTAMER
SO:0000561
heptamer_of_recombination_feature_of_vertebrate_immune_system_gene
Seven nucleotide recombination site (e.g. CACAGTG), part of V-gene, D-gene or J-gene recombination feature of an immunoglobulin or T-cell receptor gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Nine nucleotide recombination site, part of V-gene, D-gene or J-gene recombination feature of an immunoglobulin or T-cell receptor gene.
nonamer of recombination feature of vertebrate immune system gene
sequence
SO:0000562
nonamer_of_recombination_feature_of_vertebrate_immune_system_gene
A 12 or 23 nucleotide spacer between two regions of an immunoglobulin/T-cell receptor gene that may be rearranged by recombinase.
vertebrate immune system gene recombination spacer
sequence
SO:0000563
vertebrate_immune_system_gene_recombination_spacer
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one DJ-gene, one J-gene and one C-gene.
V DJ J C cluster
V-(DJ)-J-C-CLUSTER
sequence
SO:0000564
V_DJ_J_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one DJ-gene, one J-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VDJ-gene, one J-gene and one C-gene.
V VDJ J C cluster
V-(VDJ)-J-C-CLUSTER
sequence
SO:0000565
V_VDJ_J_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VDJ-gene, one J-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VJ-gene, one J-gene and one C-gene.
V VJ J C cluster
V-(VJ)-J-C-CLUSTER
sequence
SO:0000566
V_VJ_J_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VJ-gene, one J-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
A chromosome generated by recombination between two inversions; presumed to have a deficiency or duplication at each end of the inversion.
inversion derived aneuploid chromosome
sequence
SO:0000567
inversion_derived_aneuploid_chromosome
A chromosome generated by recombination between two inversions; presumed to have a deficiency or duplication at each end of the inversion.
FB:km
A promoter that can allow for transcription in both directions.
bidirectional promoter
sequence
SO:0000568
Definition updated in Aug 2020 by Dave Sant.
bidirectional_promoter
A promoter that can allow for transcription in both directions.
PMID:21601935
SO:ke
An attribute of a feature that occurred as the product of a reverse transcriptase mediated event.
SO:0100042
http://en.wikipedia.org/wiki/Retrotransposed
sequence
SO:0000569
GO:0003964 RNA-directed DNA polymerase activity.
retrotransposed
An attribute of a feature that occurred as the product of a reverse transcriptase mediated event.
SO:ke
http://en.wikipedia.org/wiki/Retrotransposed
wiki
Recombination signal of an immunoglobulin/T-cell receptor gene, including the 3' D-heptamer (SO:0000493), 3' D-spacer, and 3' D-nonamer (SO:0000494) in 3' of the D-region of a D-gene.
3'D-RS
three prime D recombination signal sequence
three_prime_D-recombination_signal_sequence
sequence
SO:0000570
three_prime_D_recombination_signal_sequence
Recombination signal of an immunoglobulin/T-cell receptor gene, including the 3' D-heptamer (SO:0000493), 3' D-spacer, and 3' D-nonamer (SO:0000494) in 3' of the D-region of a D-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
A region that can be transcribed into a microRNA (miRNA).
miRNA encoding
sequence
SO:0000571
miRNA_encoding
Genomic DNA of immunoglobulin/T-cell receptor gene in partially rearranged genomic DNA including D-J-region with 5' UTR and 3' UTR, also designated as D-J-segment.
D-J-GENE
DJ gene
sequence
SO:0000572
DJ_gene_segment
Genomic DNA of immunoglobulin/T-cell receptor gene in partially rearranged genomic DNA including D-J-region with 5' UTR and 3' UTR, also designated as D-J-segment.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
A region that can be transcribed into a ribosomal RNA (rRNA).
rRNA encoding
sequence
SO:0000573
rRNA_encoding
Rearranged genomic DNA of immunoglobulin/T-cell receptor gene including L-part1, V-intron and V-D-J-exon, with the 5'UTR (SO:0000204) and 3'UTR (SO:0000205).
V-D-J-GENE
VDJ gene
sequence
SO:0000574
VDJ_gene_segment
Rearranged genomic DNA of immunoglobulin/T-cell receptor gene including L-part1, V-intron and V-D-J-exon, with the 5'UTR (SO:0000204) and 3'UTR (SO:0000205).
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
A region that can be transcribed into a small cytoplasmic RNA (scRNA).
scRNA encoding
sequence
SO:0000575
scRNA_encoding
Rearranged genomic DNA of immunoglobulin/T-cell receptor gene including L-part1, V-intron and V-J-exon, with the 5'UTR (SO:0000204) and 3'UTR (SO:0000205).
V-J-GENE
VJ gene
sequence
SO:0000576
VJ_gene_segment
Rearranged genomic DNA of immunoglobulin/T-cell receptor gene including L-part1, V-intron and V-J-exon, with the 5'UTR (SO:0000204) and 3'UTR (SO:0000205).
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
A region of chromosome where the spindle fibers attach during mitosis and meiosis.
http://en.wikipedia.org/wiki/Centromere
INSDC_feature:centromere
sequence
SO:0000577
centromere
A region of chromosome where the spindle fibers attach during mitosis and meiosis.
SO:ke
http://en.wikipedia.org/wiki/Centromere
wiki
A region that can be transcribed into a small nucleolar RNA (snoRNA).
snoRNA encoding
sequence
SO:0000578
snoRNA_encoding
A locatable feature on a transcript that is edited.
edited transcript feature
sequence
SO:0000579
edited_transcript_feature
A locatable feature on a transcript that is edited.
SO:ma
A primary transcript encoding a methylation guide small nucleolar RNA.
methylation guide snoRNA primary transcript
sequence
SO:0000580
methylation_guide_snoRNA_primary_transcript
A primary transcript encoding a methylation guide small nucleolar RNA.
SO:ke
A structure consisting of a 7-methylguanosine in 5'-5' triphosphate linkage with the first nucleotide of an mRNA. It is added post-transcriptionally, and is not encoded in the DNA.
http://en.wikipedia.org/wiki/5%27_cap
sequence
SO:0000581
cap
A structure consisting of a 7-methylguanosine in 5'-5' triphosphate linkage with the first nucleotide of an mRNA. It is added post-transcriptionally, and is not encoded in the DNA.
http://seqcore.brcf.med.umich.edu/doc/educ/dnapr/mbglossary/mbgloss.html
http://en.wikipedia.org/wiki/5%27_cap
wiki
A primary transcript encoding an rRNA cleavage snoRNA.
rRNA cleavage snoRNA primary transcript
sequence
SO:0000582
rRNA_cleavage_snoRNA_primary_transcript
A primary transcript encoding an rRNA cleavage snoRNA.
SO:ke
The region of a transcript that will be edited.
pre edited region
pre-edited region
sequence
SO:0000583
pre_edited_region
The region of a transcript that will be edited.
http://dna.kdna.ucla.edu/rna/index.aspx
A tmRNA liberates an mRNA from a stalled ribosome. To accomplish this, part of the tmRNA is used as a reading frame that ends in a translation stop signal. The broken mRNA is replaced in the ribosome by the tmRNA, and translation of the tmRNA leads to addition of a proteolysis tag to the incomplete protein, enabling recognition by a protease. Recently, a number of permuted tmRNA genes have been found encoded in two parts. TmRNAs have been identified in eubacteria and some chloroplasts but are absent from archaeal and eukaryotic nuclear genomes.
http://en.wikipedia.org/wiki/TmRNA
INSDC_feature:tmRNA
sequence
10Sa RNA
ssrA
SO:0000584
tmRNA
A tmRNA liberates an mRNA from a stalled ribosome. To accomplish this, part of the tmRNA is used as a reading frame that ends in a translation stop signal. The broken mRNA is replaced in the ribosome by the tmRNA, and translation of the tmRNA leads to addition of a proteolysis tag to the incomplete protein, enabling recognition by a protease. Recently, a number of permuted tmRNA genes have been found encoded in two parts. TmRNAs have been identified in eubacteria and some chloroplasts but are absent from archaeal and eukaryotic nuclear genomes.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00023
http://en.wikipedia.org/wiki/TmRNA
wiki
snoRNA that is associated with guiding methylation of nucleotides. It contains two short conserved sequence motifs: C (RUGAUGA) near the 5-prime end and D (CUGA) near the 3-prime end.
C/D box snoRNA encoding
sequence
SO:0000585
C_D_box_snoRNA_encoding
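The two conserved motifs named in the definition above, C (RUGAUGA, with R read as a purine, A or G) near the 5' end and D (CUGA) near the 3' end, can be checked with a short sketch. The definition does not say how close "near" is, so the 20 nt `window` default is an assumption, as is the function name `has_cd_box_motifs`.

```python
import re

C_BOX = re.compile(r"[AG]UGAUGA")  # R = purine (A or G)
D_BOX = re.compile(r"CUGA")


def has_cd_box_motifs(seq, window=20):
    """True if a C box lies in the first `window` nt and a D box in the last `window` nt."""
    # Accept DNA input by mapping T to U before searching.
    rna = seq.upper().replace("T", "U")
    has_c = C_BOX.search(rna[:window]) is not None
    has_d = D_BOX.search(rna[-window:]) is not None
    return has_c and has_d
```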
A primary transcript encoding a tmRNA (SO:0000584).
tmRNA primary transcript
sequence
10Sa RNA primary transcript
ssrA RNA primary transcript
SO:0000586
tmRNA_primary_transcript
A primary transcript encoding a tmRNA (SO:0000584).
SO:ke
Group I catalytic introns are large self-splicing ribozymes. They catalyze their own excision from mRNA, tRNA and rRNA precursors in a wide range of organisms. The core secondary structure consists of 9 paired regions (P1-P9). These fold to essentially two domains, the P4-P6 domain (formed from the stacking of P5, P4, P6 and P6a helices) and the P3-P9 domain (formed from the P8, P3, P7 and P9 helices). Group I catalytic introns often have long ORFs inserted in loop regions.
http://en.wikipedia.org/wiki/Group_I_intron
group I intron
sequence
SO:0000587
GO:0000372.
group_I_intron
Group I catalytic introns are large self-splicing ribozymes. They catalyze their own excision from mRNA, tRNA and rRNA precursors in a wide range of organisms. The core secondary structure consists of 9 paired regions (P1-P9). These fold to essentially two domains, the P4-P6 domain (formed from the stacking of P5, P4, P6 and P6a helices) and the P3-P9 domain (formed from the P8, P3, P7 and P9 helices). Group I catalytic introns often have long ORFs inserted in loop regions.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00028
http://en.wikipedia.org/wiki/Group_I_intron
wiki
A self spliced intron.
INSDC_feature:ncRNA
INSDC_qualifier:autocatalytically_spliced_intron
autocatalytically spliced intron
sequence
SO:0000588
autocatalytically_spliced_intron
A self spliced intron.
SO:ke
A primary transcript encoding a signal recognition particle RNA.
SRP RNA primary transcript
sequence
SO:0000589
SRP_RNA_primary_transcript
A primary transcript encoding a signal recognition particle RNA.
SO:ke
The signal recognition particle (SRP) is a universally conserved ribonucleoprotein. It is involved in the co-translational targeting of proteins to membranes. The eukaryotic SRP consists of a 300-nucleotide 7S RNA and six proteins: SRPs 72, 68, 54, 19, 14, and 9. Archaeal SRP consists of a 7S RNA and homologues of the eukaryotic SRP19 and SRP54 proteins. In most eubacteria, the SRP consists of a 4.5S RNA and the Ffh protein (a homologue of the eukaryotic SRP54 protein). Eukaryotic and archaeal 7S RNAs have very similar secondary structures, with eight helical elements. These fold into the Alu and S domains, separated by a long linker region. Eubacterial SRP is generally a simpler structure, with the M domain of Ffh bound to a region of the 4.5S RNA that corresponds to helix 8 of the eukaryotic and archaeal SRP S domain. Some Gram-positive bacteria (e.g. Bacillus subtilis), however, have a larger SRP RNA that also has an Alu domain. The Alu domain is thought to mediate the peptide chain elongation retardation function of the SRP. The universally conserved helix which interacts with the SRP54/Ffh M domain mediates signal sequence recognition. In eukaryotes and archaea, the SRP19-helix 6 complex is thought to be involved in SRP assembly and stabilizes helix 8 for SRP54 binding.
INSDC_feature:ncRNA
INSDC_qualifier:SRP_RNA
SRP RNA
sequence
7S RNA
signal recognition particle RNA
SO:0000590
SRP_RNA
The signal recognition particle (SRP) is a universally conserved ribonucleoprotein. It is involved in the co-translational targeting of proteins to membranes. The eukaryotic SRP consists of a 300-nucleotide 7S RNA and six proteins: SRPs 72, 68, 54, 19, 14, and 9. Archaeal SRP consists of a 7S RNA and homologues of the eukaryotic SRP19 and SRP54 proteins. In most eubacteria, the SRP consists of a 4.5S RNA and the Ffh protein (a homologue of the eukaryotic SRP54 protein). Eukaryotic and archaeal 7S RNAs have very similar secondary structures, with eight helical elements. These fold into the Alu and S domains, separated by a long linker region. Eubacterial SRP is generally a simpler structure, with the M domain of Ffh bound to a region of the 4.5S RNA that corresponds to helix 8 of the eukaryotic and archaeal SRP S domain. Some Gram-positive bacteria (e.g. Bacillus subtilis), however, have a larger SRP RNA that also has an Alu domain. The Alu domain is thought to mediate the peptide chain elongation retardation function of the SRP. The universally conserved helix which interacts with the SRP54/Ffh M domain mediates signal sequence recognition. In eukaryotes and archaea, the SRP19-helix 6 complex is thought to be involved in SRP assembly and stabilizes helix 8 for SRP54 binding.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00017
A tertiary structure in RNA where nucleotides in a loop form base pairs with a region of RNA downstream of the loop.
http://en.wikipedia.org/wiki/Pseudoknot
sequence
SO:0000591
pseudoknot
A tertiary structure in RNA where nucleotides in a loop form base pairs with a region of RNA downstream of the loop.
RSC:cb
http://en.wikipedia.org/wiki/Pseudoknot
wiki
A pseudoknot which contains two stems and at least two loops.
H pseudoknot
H-pseudoknot
H-type pseudoknot
classical pseudoknot
hairpin-type pseudoknot
sequence
SO:0000592
H_pseudoknot
A pseudoknot which contains two stems and at least two loops.
http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=10334330&dopt=Abstract
Most box C/D snoRNAs also contain long (>10 nt) sequences complementary to rRNA. Boxes C and D, as well as boxes C' and D', are usually located in close proximity, and form a structure known as the box C/D motif. This motif is important for snoRNA stability, processing, nucleolar targeting and function. A small number of box C/D snoRNAs are involved in rRNA processing; most, however, are known or predicted to serve as guide RNAs in ribose methylation of rRNA. Targeting involves direct base pairing of the snoRNA at the rRNA site to be modified and selection of an rRNA nucleotide a fixed distance from box D or D'.
C D box snoRNA
C/D box snoRNA
SNORD
box C/D snoRNA
sequence
SO:0000593
Added 'SNORD' as a synonym of C_D_box_snoRNA (SO:0000593) and 'SNORA' as a synonym of H_ACA_box_snoRNA (SO:0000594). See GitHub Issue #577.
C_D_box_snoRNA
Most box C/D snoRNAs also contain long (>10 nt) sequences complementary to rRNA. Boxes C and D, as well as boxes C' and D', are usually located in close proximity, and form a structure known as the box C/D motif. This motif is important for snoRNA stability, processing, nucleolar targeting and function. A small number of box C/D snoRNAs are involved in rRNA processing; most, however, are known or predicted to serve as guide RNAs in ribose methylation of rRNA. Targeting involves direct base pairing of the snoRNA at the rRNA site to be modified and selection of an rRNA nucleotide a fixed distance from box D or D'.
http://www.bio.umass.edu/biochem/rna-sequence/Yeast_snoRNA_Database/snoRNA_DataBase.html
SNORD
PMID:31828325
Members of the box H/ACA family contain an ACA triplet exactly 3 nt upstream from the 3' end, and an H-box in a hinge region that links two structurally similar functional domains of the molecule. Both boxes are important for snoRNA biosynthesis and function. A few box H/ACA snoRNAs are involved in rRNA processing; most others are known or predicted to participate in selection of uridine nucleosides in rRNA to be converted to pseudouridines. Site selection is mediated by direct base pairing of the snoRNA with rRNA through one or both targeting domains.
H ACA box snoRNA
H/ACA box snoRNA
SNORA
box H/ACA snoRNA
sequence
SO:0000594
Added 'SNORD' as a synonym of C_D_box_snoRNA (SO:0000593) and 'SNORA' as a synonym of H_ACA_box_snoRNA (SO:0000594). See GitHub Issue #577.
H_ACA_box_snoRNA
Members of the box H/ACA family contain an ACA triplet exactly 3 nt upstream from the 3' end, and an H-box in a hinge region that links two structurally similar functional domains of the molecule. Both boxes are important for snoRNA biosynthesis and function. A few box H/ACA snoRNAs are involved in rRNA processing; most others are known or predicted to participate in selection of uridine nucleosides in rRNA to be converted to pseudouridines. Site selection is mediated by direct base pairing of the snoRNA with rRNA through one or both targeting domains.
http://www.bio.umass.edu/biochem/rna-sequence/Yeast_snoRNA_Database/snoRNA_DataBase.html
SNORA
PMID:31828325
A primary transcript encoding a small nucleolar RNA of the box C/D family.
C/D box snoRNA primary transcript
sequence
SO:0000595
C_D_box_snoRNA_primary_transcript
A primary transcript encoding a small nucleolar RNA of the box C/D family.
SO:ke
A primary transcript encoding a small nucleolar RNA of the box H/ACA family.
H ACA box snoRNA primary transcript
sequence
SO:0000596
H_ACA_box_snoRNA_primary_transcript
A primary transcript encoding a small nucleolar RNA of the box H/ACA family.
SO:ke
The insertion and deletion of uridine (U) residues, usually within coding regions of mRNA transcripts of cryptogenes in the mitochondrial genome of kinetoplastid protozoa.
sequence
SO:0000597
transcript_edited_by_U_insertion/deletion
true
The insertion and deletion of uridine (U) residues, usually within coding regions of mRNA transcripts of cryptogenes in the mitochondrial genome of kinetoplastid protozoa.
http://www.rna.ucla.edu/index.html
sequence
transcript_edited_by_C-insertion_and_dinucleotide_insertion
SO:0000598
edited_by_C_insertion_and_dinucleotide_insertion
true
sequence
SO:0000599
edited_by_C_to_U_substitution
true
sequence
SO:0000600
edited_by_A_to_I_substitution
true
sequence
SO:0000601
edited_by_G_addition
true
A short 3'-uridylated RNA that can form a duplex (except for its post-transcriptionally added oligo_U tail (SO:0000609)) with a stretch of mature edited mRNA.
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/Guide_RNA
INSDC_qualifier:guide_RNA
gRNA
guide RNA
sequence
SO:0000602
guide_RNA
A short 3'-uridylated RNA that can form a duplex (except for its post-transcriptionally added oligo_U tail (SO:0000609)) with a stretch of mature edited mRNA.
http://www.rna.ucla.edu/index.html
http://en.wikipedia.org/wiki/Guide_RNA
wiki
Group II introns are found in rRNA, tRNA and mRNA of organelles in fungi, plants and protists, and also in mRNA in bacteria. They are large self-splicing ribozymes and have 6 structural domains (usually designated dI to dVI). A subset of group II introns also encodes essential splicing proteins in intronic ORFs. The length of these introns can therefore be up to 3 kb. Splicing occurs in almost identical fashion to nuclear pre-mRNA splicing with two transesterification steps. The 2' hydroxyl of a bulged adenosine in domain VI attacks the 5' splice site, followed by nucleophilic attack on the 3' splice site by the 3' OH of the upstream exon. Protein machinery is required for splicing in vivo, and long range intron to intron and intron-exon interactions are important for splice site positioning. Group II introns are further sub-classified into groups IIA and IIB which differ in splice site consensus, distance of bulged A from 3' splice site, some tertiary interactions, and intronic ORF phylogeny.
http://en.wikipedia.org/wiki/Group_II_intron
group II intron
sequence
SO:0000603
GO:0000373.
group_II_intron
Group II introns are found in rRNA, tRNA and mRNA of organelles in fungi, plants and protists, and also in mRNA in bacteria. They are large self-splicing ribozymes and have 6 structural domains (usually designated dI to dVI). A subset of group II introns also encodes essential splicing proteins in intronic ORFs. The length of these introns can therefore be up to 3 kb. Splicing occurs in almost identical fashion to nuclear pre-mRNA splicing with two transesterification steps. The 2' hydroxyl of a bulged adenosine in domain VI attacks the 5' splice site, followed by nucleophilic attack on the 3' splice site by the 3' OH of the upstream exon. Protein machinery is required for splicing in vivo, and long range intron to intron and intron-exon interactions are important for splice site positioning. Group II introns are further sub-classified into groups IIA and IIB which differ in splice site consensus, distance of bulged A from 3' splice site, some tertiary interactions, and intronic ORF phylogeny.
http://www.sanger.ac.uk/Software/Rfam/browse/index.shtml
http://en.wikipedia.org/wiki/Group_II_intron
wiki
Edited mRNA sequence mediated by a single guide RNA (SO:0000602).
editing block
sequence
SO:0000604
editing_block
Edited mRNA sequence mediated by a single guide RNA (SO:0000602).
http://dna.kdna.ucla.edu/rna/index.aspx
A region containing or overlapping no genes that is bounded on either side by a gene, or bounded by a gene and the end of the chromosome.
http://en.wikipedia.org/wiki/Intergenic_region
intergenic region
sequence
SO:0000605
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
intergenic_region
A region containing or overlapping no genes that is bounded on either side by a gene, or bounded by a gene and the end of the chromosome.
SO:cjm
http://en.wikipedia.org/wiki/Intergenic_region
wiki
Edited mRNA sequence mediated by two or more overlapping guide RNAs (SO:0000602).
editing domain
sequence
SO:0000606
editing_domain
Edited mRNA sequence mediated by two or more overlapping guide RNAs (SO:0000602).
http://dna.kdna.ucla.edu/rna/index.aspx
The region of an edited transcript that will not be edited.
unedited region
sequence
SO:0000607
unedited_region
The region of an edited transcript that will not be edited.
http://dna.kdna.ucla.edu/rna/index.aspx
snoRNA that is associated with guiding pseudouridylation. It contains two short conserved sequence motifs: H box (ANANNA) and ACA (ACA).
H ACA box snoRNA encoding
sequence
SO:0000608
H_ACA_box_snoRNA_encoding
The string of non-encoded U's at the 3' end of a guide RNA (SO:0000602).
oligo U tail
sequence
SO:0000609
oligo_U_tail
The string of non-encoded U's at the 3' end of a guide RNA (SO:0000602).
http://www.rna.ucla.edu/
Sequence of about 100 nucleotides of A added to the 3' end of most eukaryotic mRNAs.
polyA sequence
sequence
SO:0000610
polyA_sequence
Sequence of about 100 nucleotides of A added to the 3' end of most eukaryotic mRNAs.
SO:ke
A pyrimidine rich sequence near the 3' end of an intron to which the 5' end becomes covalently bound during nuclear splicing. The resulting structure resembles a lariat.
branch point
branch site
branch_point
sequence
SO:0000611
branch_site
A pyrimidine rich sequence near the 3' end of an intron to which the 5' end becomes covalently bound during nuclear splicing. The resulting structure resembles a lariat.
SO:ke
The polypyrimidine tract is one of the cis-acting sequence elements directing intron removal in pre-mRNA splicing.
http://en.wikipedia.org/wiki/Polypyrimidine_tract
polypyrimidine tract
sequence
SO:0000612
polypyrimidine_tract
The polypyrimidine tract is one of the cis-acting sequence elements directing intron removal in pre-mRNA splicing.
http://nar.oupjournals.org/cgi/content/full/25/4/888
http://en.wikipedia.org/wiki/Polypyrimidine_tract
wiki
A DNA sequence to which bacterial RNA polymerase binds, to begin transcription.
bacterial RNApol promoter
sequence
SO:0000613
Former parent RNA_polymerase_promoter SO:0001203 was merged with promoter SO:0000167 in Aug 2020 as part of GREEKC.
bacterial_RNApol_promoter
A DNA sequence to which bacterial RNA polymerase binds, to begin transcription.
SO:ke
A terminator signal for bacterial transcription.
bacterial terminator
sequence
SO:0000614
Moved to transcriptional_cis_regulatory_region (SO:0001055) from gene_group_regulatory_region (SO:0000752) on 11 Feb 2021 when SO:0000752 was merged into SO:0001055. See GitHub Issue #529.
bacterial_terminator
A terminator signal for bacterial transcription.
SO:ke
A terminator signal for RNA polymerase III transcription.
terminator of type 2 RNApol III promoter
sequence
SO:0000615
terminator_of_type_2_RNApol_III_promoter
A terminator signal for RNA polymerase III transcription.
SO:ke
The base where transcription ends.
transcription end site
sequence
SO:0000616
transcription_end_site
The base where transcription ends.
SO:ke
This type of promoter recruits RNA pol III. This promoter is intragenic and includes an A box, an intermediate element, and a C box. This is well conserved in the 5S rRNA promoters across species.
RNApol III promoter type 1
sequence
SO:0000617
RNApol_III_promoter_type_1
This type of promoter recruits RNA pol III. This promoter is intragenic and includes an A box, an intermediate element, and a C box. This is well conserved in the 5S rRNA promoters across species.
PMID:12381659
This type of promoter recruits RNA pol III to transcribe genes mainly for tRNA. This promoter is intragenic and includes an A box and a B box.
RNApol III promoter type 2
sequence
tRNA promoter
SO:0000618
RNApol_III_promoter_type_2
This type of promoter recruits RNA pol III to transcribe genes mainly for tRNA. This promoter is intragenic and includes an A box and a B box.
PMID:12381659
A variably distant linear promoter region recognized by TFIIIC, with consensus sequence TGGCnnAGTGG.
http://en.wikipedia.org/wiki/A-box
A-box
sequence
SO:0000619
Binds TFIIIC.
A_box
A variably distant linear promoter region recognized by TFIIIC, with consensus sequence TGGCnnAGTGG.
SO:ke
http://en.wikipedia.org/wiki/A-box
wiki
A variably distant linear promoter region recognized by TFIIIC, with consensus sequence AGGTTCCAnnCC.
B-box
sequence
SO:0000620
Binds TFIIIC.
B_box
A variably distant linear promoter region recognized by TFIIIC, with consensus sequence AGGTTCCAnnCC.
SO:ke
This type of promoter recruits RNA pol III to transcribe predominantly noncoding RNAs. This promoter contains a proximal sequence element (PSE) and a TATA box upstream of the gene that it regulates. Transcription can also be activated by a distal sequence element (DSE), which is located further upstream.
RNApol III promoter type 3
sequence
SO:0000621
RNApol_III_promoter_type_3
This type of promoter recruits RNA pol III to transcribe predominantly noncoding RNAs. This promoter contains a proximal sequence element (PSE) and a TATA box upstream of the gene that it regulates. Transcription can also be activated by a distal sequence element (DSE), which is located further upstream.
PMID:12381659
An RNA polymerase III type 1 promoter with consensus sequence CAnnCCn.
C-box
sequence
SO:0000622
C_box
An RNA polymerase III type 1 promoter with consensus sequence CAnnCCn.
SO:ke
A region that can be transcribed into a small nuclear RNA (snRNA).
snRNA encoding
sequence
SO:0000623
snRNA_encoding
A specific structure at the end of a linear chromosome, required for the integrity and maintenance of the end.
http://en.wikipedia.org/wiki/Telomere
INSDC_feature:telomere
telomeric DNA
telomeric sequence
sequence
SO:0000624
telomere
A specific structure at the end of a linear chromosome, required for the integrity and maintenance of the end.
SO:ma
http://en.wikipedia.org/wiki/Telomere
wiki
A regulatory region which, upon binding of transcription factors, suppresses the transcription of the gene or genes it controls.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Silencer_(DNA)
INSDC_qualifier:silencer
sequence
SO:0000625
silencer
A regulatory region which, upon binding of transcription factors, suppresses the transcription of the gene or genes it controls.
SO:ke
http://en.wikipedia.org/wiki/Silencer_(DNA)
wiki
Regions of the chromosome that are important for regulating binding of chromosomes to the nuclear matrix.
chromosomal regulatory element
sequence
SO:0000626
chromosomal_regulatory_element
A regulatory region that 1) when located between a CRM and a gene's promoter prevents the CRM from modulating that gene's expression and 2) acts as a chromatin boundary element or barrier that can block the encroachment of condensed chromatin from an adjacent region.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Insulator_(genetics)
INSDC_qualifier:insulator
insulator element
sequence
SO:0000627
moved from is_a: SO:0001055 transcriptional_cis_regulatory_region as per request from GREEKC initiative in August 2020.
insulator
A regulatory region that 1) when located between a CRM and a gene's promoter prevents the CRM from modulating that gene's expression and 2) acts as a chromatin boundary element or barrier that can block the encroachment of condensed chromatin from an adjacent region.
NCBI:cf
PMID:12154228
SO:regcreative
http://en.wikipedia.org/wiki/Insulator_(genetics)
wiki
Regions of the chromosome that are important for structural elements.
chromosomal structural element
sequence
SO:0000628
chromosomal_structural_element
An open reading frame found within the 5' UTR that can be translated and stall the translation of the downstream open reading frame.
five prime open reading frame
sequence
SO:0000629
five_prime_open_reading_frame
An open reading frame found within the 5' UTR that can be translated and stall the translation of the downstream open reading frame.
PMID:12890013
A start codon upstream of the ORF.
upstream AUG codon
sequence
SO:0000630
upstream_AUG_codon
A start codon upstream of the ORF.
SO:ke
A primary transcript encoding more than one gene product.
polycistronic primary transcript
sequence
SO:0000631
polycistronic_primary_transcript
A primary transcript encoding more than one gene product.
SO:ke
A primary transcript encoding one gene product.
monocistronic primary transcript
sequence
SO:0000632
monocistronic_primary_transcript
A primary transcript encoding one gene product.
SO:ke
An mRNA with either a single protein product, or for which the regions encoding all its protein products overlap.
http://en.wikipedia.org/wiki/Monocistronic_mRNA
monocistronic mRNA
monocistronic processed transcript
sequence
SO:0000633
monocistronic_mRNA
An mRNA with either a single protein product, or for which the regions encoding all its protein products overlap.
SO:rd
http://en.wikipedia.org/wiki/Monocistronic_mRNA
wiki
An mRNA that encodes multiple proteins from at least two non-overlapping regions.
http://en.wikipedia.org/wiki/Polycistronic_mRNA
polycistronic mRNA
sequence
polycistronic processed transcript
SO:0000634
polycistronic_mRNA
An mRNA that encodes multiple proteins from at least two non-overlapping regions.
SO:rd
http://en.wikipedia.org/wiki/Polycistronic_mRNA
wiki
A primary transcript that donates the spliced leader to other mRNA.
mini exon donor RNA
mini-exon donor RNA
sequence
SO:0000635
mini_exon_donor_RNA
A primary transcript that donates the spliced leader to other mRNA.
SO:ke
Small nuclear RNAs that are incorporated into the pre-mRNAs to replace the 5' end in some eukaryotes.
spliced leader RNA
sequence
mini-exon
SO:0000636
spliced_leader_RNA
Small nuclear RNAs that are incorporated into the pre-mRNAs to replace the 5' end in some eukaryotes.
PMID:24130571
A plasmid that is engineered.
engineered plasmid
sequence
engineered plasmid gene
SO:0000637
engineered_plasmid
A plasmid that is engineered.
SO:xp
Part of an rRNA transcription unit that is transcribed but discarded during maturation, not giving rise to any part of rRNA.
transcribed spacer region
sequence
SO:0000638
transcribed_spacer_region
Part of an rRNA transcription unit that is transcribed but discarded during maturation, not giving rise to any part of rRNA.
http://oregonstate.edu/instruction/bb492/general/glossary.html
Non-coding regions of DNA sequence that separate genes coding for the 28S, 5.8S, and 18S ribosomal RNAs.
internal transcribed spacer region
sequence
SO:0000639
internal_transcribed_spacer_region
Non-coding regions of DNA sequence that separate genes coding for the 28S, 5.8S, and 18S ribosomal RNAs.
SO:ke
Non-coding regions of DNA that precede the sequence that codes for the ribosomal RNA.
external transcribed spacer region
sequence
SO:0000640
external_transcribed_spacer_region
Non-coding regions of DNA that precede the sequence that codes for the ribosomal RNA.
SO:ke
A region of a repeating tetranucleotide sequence (four bases).
tetranucleotide repeat microsatellite feature
sequence
SO:0000641
tetranucleotide_repeat_microsatellite_feature
A region that can be transcribed into a signal recognition particle RNA (SRP RNA).
SRP RNA encoding
sequence
SO:0000642
SRP_RNA_encoding
A repeat region containing tandemly repeated sequences having a unit length of 10 to 40 bp.
INSDC_feature:repeat_region
http://en.wikipedia.org/wiki/Minisatellite
INSDC_qualifier:minisatellite
VNTR
sequence
SO:0000643
minisatellite
A repeat region containing tandemly repeated sequences having a unit length of 10 to 40 bp.
http://www.informatics.jax.org/silver/glossary.shtml
http://en.wikipedia.org/wiki/Minisatellite
wiki
VNTR
http://www.ncbi.nlm.nih.gov/books/NBK21126/def-item/A9655/
Antisense RNA is RNA that is transcribed from the coding, rather than the template, strand of DNA. It is therefore complementary to mRNA.
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/Antisense_RNA
INSDC_qualifier:antisense_RNA
antisense RNA
sequence
SO:0000644
antisense_RNA
Antisense RNA is RNA that is transcribed from the coding, rather than the template, strand of DNA. It is therefore complementary to mRNA.
SO:ke
http://en.wikipedia.org/wiki/Antisense_RNA
wiki
The reverse complement of the primary transcript.
antisense primary transcript
sequence
SO:0000645
antisense_primary_transcript
The reverse complement of the primary transcript.
SO:ke
A small RNA molecule that is the product of a longer exogenous or endogenous dsRNA, which is either a bimolecular duplex or very long hairpin, processed (via the Dicer pathway) such that numerous siRNAs accumulate from both strands of the dsRNA. siRNAs trigger the cleavage of their target molecules.
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/SiRNA
INSDC_qualifier:siRNA
small interfering RNA
sequence
SO:0000646
siRNA
A small RNA molecule that is the product of a longer exogenous or endogenous dsRNA, which is either a bimolecular duplex or very long hairpin, processed (via the Dicer pathway) such that numerous siRNAs accumulate from both strands of the dsRNA. siRNAs trigger the cleavage of their target molecules.
PMID:12592000
http://en.wikipedia.org/wiki/SiRNA
wiki
A primary transcript encoding a micro RNA.
SO:0000648
miRNA primary transcript
micro RNA primary transcript
small temporal RNA primary transcript
stRNA primary transcript
stRNA_primary_transcript
sequence
SO:0000647
miRNA_primary_transcript
A primary transcript encoding a micro RNA.
SO:ke
true
true
Cytosolic SSU rRNA is an RNA component of the small subunit of cytosolic ribosomes.
cytosolic SSU rRNA
cytosolic SSU ribosomal RNA
cytosolic small subunit rRNA
sequence
SO:0000650
Renamed to cytosolic_SSU_rRNA from small_subunit_rRNA on 10 June 2021 as per restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Request from EBI. See GitHub Issue #493.
cytosolic_SSU_rRNA
Cytosolic SSU rRNA is an RNA component of the small subunit of cytosolic ribosomes.
SO:ke
Cytosolic LSU rRNA is an RNA component of the large subunit of cytosolic ribosomes.
cytosolic LSU RNA
cytosolic LSU rRNA
cytosolic large subunit rRNA
sequence
SO:0000651
Renamed to cytosolic_LSU_rRNA from large_subunit_rRNA on 10 June 2021 as per restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Request from EBI. See GitHub Issue #493.
cytosolic_LSU_rRNA
Cytosolic LSU rRNA is an RNA component of the large subunit of cytosolic ribosomes.
SO:ke
Cytosolic 5S rRNA is an RNA component of the large subunit of cytosolic ribosomes in both prokaryotes and eukaryotes.
http://en.wikipedia.org/wiki/5S_ribosomal_RNA
cytosolic 5S LSU rRNA
cytosolic 5S rRNA
cytosolic 5S ribosomal RNA
cytosolic rRNA 5S
sequence
SO:0000652
Renamed from rRNA_5S to cytosolic_5S_rRNA on 27 May 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493.
cytosolic_5S_rRNA
Cytosolic 5S rRNA is an RNA component of the large subunit of cytosolic ribosomes in both prokaryotes and eukaryotes.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00001
http://en.wikipedia.org/wiki/5S_ribosomal_RNA
wiki
Cytosolic 28S rRNA is an RNA component of the large subunit of cytosolic ribosomes in metazoan eukaryotes.
http://en.wikipedia.org/wiki/28S_ribosomal_RNA
cytosolic 28S LSU rRNA
cytosolic 28S rRNA
cytosolic 28S ribosomal RNA
cytosolic rRNA 28S
sequence
SO:0000653
Renamed from rRNA_28S to cytosolic_28S_rRNA on 27 May 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493.
cytosolic_28S_rRNA
Cytosolic 28S rRNA is an RNA component of the large subunit of cytosolic ribosomes in metazoan eukaryotes.
SO:ke
http://en.wikipedia.org/wiki/28S_ribosomal_RNA
wiki
A mitochondrial gene located in a maxicircle.
maxi-circle gene
maxicircle gene
sequence
SO:0000654
maxicircle_gene
A mitochondrial gene located in a maxicircle.
SO:xp
An RNA transcript that does not encode a protein; rather, the RNA molecule is the gene product.
INSDC_qualifier:other
http://en.wikipedia.org/wiki/NcRNA
http://www.gencodegenes.org/gencode_biotypes.html
known_ncrna
noncoding RNA
sequence
SO:0000655
A ncRNA is a processed_transcript, so it may not contain parts such as transcribed_spacer_regions that are removed in the act of processing. For the corresponding primary_transcripts, please see term SO:0000483 nc_primary_transcript.
ncRNA
An RNA transcript that does not encode a protein; rather, the RNA molecule is the gene product.
SO:ke
http://en.wikipedia.org/wiki/NcRNA
wiki
http://www.gencodegenes.org/gencode_biotypes.html
GENCODE
A region that can be transcribed into a small temporal RNA (stRNA). Found in roundworm development.
stRNA encoding
sequence
SO:0000656
stRNA_encoding
A region of sequence containing one or more repeat units.
INSDC_feature:repeat_region
INSDC_qualifier:other
repeat region
sequence
SO:0000657
repeat_region
A region of sequence containing one or more repeat units.
SO:ke
A repeat that is located at dispersed sites in the genome.
INSDC_feature:repeat_region
http://en.wikipedia.org/wiki/Interspersed_repeat
INSDC_qualifier:dispersed
dispersed repeat
interspersed repeat
sequence
SO:0000658
dispersed_repeat
A repeat that is located at dispersed sites in the genome.
SO:ke
http://en.wikipedia.org/wiki/Interspersed_repeat
wiki
A region that can be transcribed into a transfer-messenger RNA (tmRNA).
tmRNA encoding
sequence
SO:0000659
tmRNA_encoding
sequence
SO:0000660
DNA_invertase_target_sequence
true
sequence
SO:0000661
intron_attribute
true
An intron which is spliced by the spliceosome.
spliceosomal intron
sequence
SO:0000662
GO:0000398.
spliceosomal_intron
An intron which is spliced by the spliceosome.
SO:ke
A region that can be transcribed into a transfer RNA (tRNA).
tRNA encoding
sequence
SO:0000663
tRNA_encoding
A region of a chromosome that has been introduced by backcrossing with a separate species.
introgressed chromosome region
sequence
SO:0000664
introgressed_chromosome_region
A region of a chromosome that has been introduced by backcrossing with a separate species.
PMID:11454782
A transcript that is monocistronic.
monocistronic transcript
sequence
SO:0000665
monocistronic_transcript
A transcript that is monocistronic.
SO:xp
An intron (mitochondrial, chloroplast, nuclear or prokaryotic) that encodes a double strand sequence specific endonuclease allowing for mobility.
mobile intron
sequence
SO:0000666
mobile_intron
An intron (mitochondrial, chloroplast, nuclear or prokaryotic) that encodes a double strand sequence specific endonuclease allowing for mobility.
SO:ke
The sequence of one or more nucleotides added between two adjacent nucleotides in the sequence.
SO:1000034
loinc:LA6687-3
insertion
nucleotide insertion
nucleotide_insertion
sequence
SO:0000667
insertion
The sequence of one or more nucleotides added between two adjacent nucleotides in the sequence.
SO:ke
loinc:LA6687-3
Insertion
insertion
http://www.ncbi.nlm.nih.gov/dbvar/
A match against an EST sequence.
EST match
sequence
SO:0000668
EST_match
A match against an EST sequence.
SO:ke
A feature where a segment of DNA has been rearranged from what it was in the parent cell.
sequence rearrangement feature
sequence
SO:0000669
sequence_rearrangement_feature
A sequence within the micronuclear DNA of ciliates at which chromosome breakage and telomere addition occurs during nuclear differentiation.
chromosome breakage sequence
sequence
SO:0000670
chromosome_breakage_sequence
A sequence within the micronuclear DNA of ciliates at which chromosome breakage and telomere addition occurs during nuclear differentiation.
SO:ma
A sequence eliminated from the genome of ciliates during nuclear differentiation.
internal eliminated sequence
sequence
SO:0000671
internal_eliminated_sequence
A sequence eliminated from the genome of ciliates during nuclear differentiation.
SO:ma
A sequence that is conserved, although rearranged relative to the micronucleus, in the macronucleus of a ciliate genome.
macronucleus destined segment
sequence
SO:0000672
macronucleus_destined_segment
A sequence that is conserved, although rearranged relative to the micronucleus, in the macronucleus of a ciliate genome.
SO:ma
An RNA synthesized on a DNA or RNA template by an RNA polymerase.
INSDC_feature:misc_RNA
http://en.wikipedia.org/wiki/RNA
sequence
SO:0000673
Added relationship overlaps SO:0002300 unit_of_gene_expression with Mejia-Almonte et al. PMID:32665585 Aug 5, 2020.
transcript
An RNA synthesized on a DNA or RNA template by an RNA polymerase.
SO:ma
http://en.wikipedia.org/wiki/RNA
wiki
A splice site where the donor and acceptor sites differ from the canonical form.
SO:0000678
SO:0000679
non canonical splice site
non-canonical splice site
sequence
SO:0000674
non_canonical_splice_site
true
A splice site where the donor and acceptor sites differ from the canonical form.
SO:ke
The major class of splice site with dinucleotides GT and AG for donor and acceptor sites, respectively.
SO:0000676
SO:0000677
canonical splice site
sequence
SO:0000675
canonical_splice_site
true
The major class of splice site with dinucleotides GT and AG for donor and acceptor sites, respectively.
SO:ke
The canonical 3' splice site has the sequence "AG".
canonical 3' splice site
canonical three prime splice site
sequence
SO:0000676
canonical_three_prime_splice_site
The canonical 3' splice site has the sequence "AG".
SO:ke
The canonical 5' splice site has the sequence "GT".
canonical 5' splice site
canonical five prime splice site
sequence
SO:0000677
canonical_five_prime_splice_site
The canonical 5' splice site has the sequence "GT".
SO:ke
A 3' splice site that does not have the sequence "AG".
non canonical three prime splice site
non-canonical three prime splice site
sequence
non canonical 3' splice site
SO:0000678
non_canonical_three_prime_splice_site
A 3' splice site that does not have the sequence "AG".
SO:ke
A 5' splice site which does not have the sequence "GT".
non canonical 5' splice site
non canonical five prime splice site
non-canonical five prime splice site
sequence
SO:0000679
non_canonical_five_prime_splice_site
A 5' splice site which does not have the sequence "GT".
SO:ke
A start codon that is not the usual AUG sequence.
non ATG start codon
non canonical start codon
non-canonical start codon
sequence
SO:0000680
non_canonical_start_codon
A start codon that is not the usual AUG sequence.
SO:ke
A transcript that has been processed "incorrectly", for example by the failure of splicing of one or more exons.
aberrant processed transcript
sequence
SO:0000681
aberrant_processed_transcript
A transcript that has been processed "incorrectly", for example by the failure of splicing of one or more exons.
SO:ke
sequence
SO:0000682
splicing_feature
true
Exonic splicing enhancers (ESEs) facilitate exon definition by assisting in the recruitment of splicing factors to the adjacent intron.
exonic splice enhancer
sequence
SO:0000683
exonic_splice_enhancer
Exonic splicing enhancers (ESEs) facilitate exon definition by assisting in the recruitment of splicing factors to the adjacent intron.
http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=12403462&dopt=Abstract
A region of nucleotide sequence targeted by a nuclease enzyme.
nuclease sensitive site
sequence
SO:0000684
nuclease_sensitive_site
A region of nucleotide sequence targeted by a nuclease enzyme.
SO:ma
DNA region representing open chromatin structure that is hypersensitive to digestion by DNase I.
INSDC_feature:regulatory
DHS
DNaseI hypersensitive site
INSDC_qualifier:DNase_I_hypersensitive_site
sequence
SO:0000685
DNaseI_hypersensitive_site
A chromosomal translocation whereby the chromosomes carrying non-homologous centromeres may be recovered independently. These chromosomes are described as translocation elements. This occurs for some translocations, particularly but not exclusively, reciprocal translocations.
translocation element
sequence
SO:0000686
translocation_element
A chromosomal translocation whereby the chromosomes carrying non-homologous centromeres may be recovered independently. These chromosomes are described as translocation elements. This occurs for some translocations, particularly but not exclusively, reciprocal translocations.
SO:ma
The space between two bases in a sequence which marks the position where a deletion has occurred.
deletion junction
sequence
SO:0000687
deletion_junction
The space between two bases in a sequence which marks the position where a deletion has occurred.
SO:ke
A set of subregions selected from sequence contigs which when concatenated form a nonredundant linear sequence.
golden path
sequence
SO:0000688
golden_path
A set of subregions selected from sequence contigs which when concatenated form a nonredundant linear sequence.
SO:ls
A match against cDNA sequence.
cDNA match
sequence
SO:0000689
cDNA_match
A match against cDNA sequence.
SO:ke
A gene that encodes a polycistronic transcript.
gene with polycistronic transcript
sequence
SO:0000690
gene_with_polycistronic_transcript
A gene that encodes a polycistronic transcript.
SO:xp
The initiator methionine that has been cleaved from a mature polypeptide sequence.
BS:00067
cleaved initiator methionine
sequence
init_met
initiator methionine
SO:0000691
cleaved_initiator_methionine
The initiator methionine that has been cleaved from a mature polypeptide sequence.
EBIBS:GAR
init_met
uniprot:feature_type
A gene that encodes a dicistronic transcript.
gene with dicistronic transcript
sequence
SO:0000692
gene_with_dicistronic_transcript
A gene that encodes a dicistronic transcript.
SO:xp
A gene that encodes an mRNA that is recoded.
gene with recoded mRNA
sequence
SO:0000693
gene_with_recoded_mRNA
A gene that encodes an mRNA that is recoded.
SO:xp
SNPs are single base pair positions in genomic DNA at which different sequence alternatives exist in normal individuals in some population(s), wherein the least frequent variant has an abundance of 1% or greater.
single nucleotide polymorphism
sequence
SO:0000694
SNP
SNPs are single base pair positions in genomic DNA at which different sequence alternatives exist in normal individuals in some population(s), wherein the least frequent variant has an abundance of 1% or greater.
SO:cb
A sequence used in an experiment.
sequence
SO:0000695
Requested by Lynn Crosby, Jan 2006.
reagent
A sequence used in an experiment.
SO:ke
A short oligonucleotide sequence, of length on the order of tens of bases; either single or double stranded.
http://en.wikipedia.org/wiki/Oligonucleotide
oligonucleotide
sequence
SO:0000696
oligo
A short oligonucleotide sequence, of length on the order of tens of bases; either single or double stranded.
SO:ma
http://en.wikipedia.org/wiki/Oligonucleotide
wiki
A gene that encodes a transcript with stop codon readthrough.
gene with stop codon read through
sequence
SO:0000697
gene_with_stop_codon_read_through
A gene that encodes a transcript with stop codon readthrough.
SO:xp
A gene encoding an mRNA that has the stop codon redefined as pyrrolysine.
gene with stop codon redefined as pyrrolysine
sequence
SO:0000698
gene_with_stop_codon_redefined_as_pyrrolysine
A gene encoding an mRNA that has the stop codon redefined as pyrrolysine.
SO:xp
A sequence_feature with an extent of zero.
boundary
breakpoint
sequence
SO:0000699
A junction is a boundary between regions. A boundary has an extent of zero.
junction
A sequence_feature with an extent of zero.
SO:ke
A comment about the sequence.
sequence
SO:0000700
remark
A comment about the sequence.
SO:ke
A region of sequence where the validity of the base calling is questionable.
possible base call error
sequence
SO:0000701
possible_base_call_error
A region of sequence where the validity of the base calling is questionable.
SO:ke
A region of sequence where there may have been an error in the assembly.
possible assembly error
sequence
SO:0000702
possible_assembly_error
A region of sequence where there may have been an error in the assembly.
SO:ke
A region of sequence implicated in an experimental result.
experimental result region
sequence
SO:0000703
experimental_result_region
A region of sequence implicated in an experimental result.
SO:ke
A region (or regions) that includes all of the sequence elements necessary to encode a functional transcript. A gene may include regulatory regions, transcribed regions and/or other functional sequence regions.
http://en.wikipedia.org/wiki/Gene
INSDC_feature:gene
sequence
SO:0000704
This term is mapped to MGED. Do not obsolete without consulting MGED ontology. A gene may be considered as a unit of inheritance.
gene
A region (or regions) that includes all of the sequence elements necessary to encode a functional transcript. A gene may include regulatory regions, transcribed regions and/or other functional sequence regions.
SO:immuno_workshop
http://en.wikipedia.org/wiki/Gene
wiki
Two or more adjacent copies of a region (of length greater than 1).
INSDC_feature:repeat_region
http://en.wikipedia.org/wiki/Tandem_repeat
http://www.sci.sdsu.edu/~smaloy/Glossary/T.html
INSDC_qualifier:tandem
tandem repeat
sequence
SO:0000705
tandem_repeat
Two or more adjacent copies of a region (of length greater than 1).
SO:ke
http://en.wikipedia.org/wiki/Tandem_repeat
wiki
The 3' splice site of the acceptor primary transcript.
trans splice acceptor site
sequence
3' trans splice site
SO:0000706
This region contains a polypyrimidine tract and AG dinucleotide in some organisms and is UUUCAG in C. elegans.
trans_splice_acceptor_site
The 3' splice site of the acceptor primary transcript.
SO:ke
The 5' splice site region of the donor RNA.
trans splice donor site
trans-splice donor site
sequence
5 prime trans splice site
SO:0000707
SL RNA contains a donor site.
trans_splice_donor_site
The 5' splice site region of the donor RNA.
SO:ke
A trans_splicing_acceptor_site which appends the 22nt SL1 RNA leader sequence to the 5' end of most mRNAs.
SL1 acceptor site
sequence
SO:0000708
SL1_acceptor_site
A trans_splicing_acceptor_site which appends the 22nt SL1 RNA leader sequence to the 5' end of most mRNAs.
SO:nlw
A trans_splicing_acceptor_site which appends the 22nt SL2 RNA leader sequence to the 5' end of mRNAs. SL2 acceptor sites occur in genes in internal segments of polycistronic transcripts.
SL2 acceptor site
sequence
SO:0000709
SL2_acceptor_site
A trans_splicing_acceptor_site which appends the 22nt SL2 RNA leader sequence to the 5' end of mRNAs. SL2 acceptor sites occur in genes in internal segments of polycistronic transcripts.
SO:nlw
A gene encoding an mRNA that has the stop codon redefined as selenocysteine.
gene with stop codon redefined as selenocysteine
sequence
SO:0000710
gene_with_stop_codon_redefined_as_selenocysteine
A gene encoding an mRNA that has the stop codon redefined as selenocysteine.
SO:xp
A gene with mRNA recoded by translational bypass.
gene with mRNA recoded by translational bypass
sequence
SO:0000711
gene_with_mRNA_recoded_by_translational_bypass
A gene with mRNA recoded by translational bypass.
SO:xp
A gene encoding a transcript that has a translational frameshift.
gene with transcript with translational frameshift
sequence
SO:0000712
gene_with_transcript_with_translational_frameshift
A gene encoding a transcript that has a translational frameshift.
SO:xp
A motif that is active in the DNA form of the sequence.
http://en.wikipedia.org/wiki/DNA_motif
DNA motif
sequence
SO:0000713
DNA_motif
A motif that is active in the DNA form of the sequence.
SO:ke
http://en.wikipedia.org/wiki/DNA_motif
wiki
A region of nucleotide sequence corresponding to a known motif.
INSDC_feature:misc_feature
INSDC_note:nucleotide_motif
nucleotide motif
sequence
SO:0000714
nucleotide_motif
A region of nucleotide sequence corresponding to a known motif.
SO:ke
A motif that is active in RNA sequence.
RNA motif
sequence
SO:0000715
RNA_motif
A motif that is active in RNA sequence.
SO:ke
An mRNA that has the quality dicistronic.
dicistronic mRNA
sequence
dicistronic processed transcript
SO:0000716
dicistronic_mRNA
An mRNA that has the quality dicistronic.
SO:ke
A nucleic acid sequence that, when read as sequential triplets, has the potential of encoding a sequential string of amino acids. It need not contain the start or stop codon.
http://en.wikipedia.org/wiki/Reading_frame
reading frame
sequence
SO:0000717
This term was added after a request by SGD. August 2004. Modified after SO meeting in Cambridge to not include start or stop.
reading_frame
A nucleic acid sequence that, when read as sequential triplets, has the potential of encoding a sequential string of amino acids. It need not contain the start or stop codon.
SGD:rb
http://en.wikipedia.org/wiki/Reading_frame
wiki
A reading_frame that is interrupted by one or more stop codons; usually identified through inter-genomic sequence comparisons.
blocked reading frame
sequence
SO:0000718
Term requested by Rama from SGD.
blocked_reading_frame
A reading_frame that is interrupted by one or more stop codons; usually identified through inter-genomic sequence comparisons.
SGD:rb
An ordered and oriented set of scaffolds based on somewhat weaker sets of inferential evidence such as one set of mate pair reads together with supporting evidence from ESTs or location of markers from SNP or microsatellite maps, or cytogenetic localization of contained markers.
pseudochromosome
sequence
superscaffold
SO:0000719
ultracontig
An ordered and oriented set of scaffolds based on somewhat weaker sets of inferential evidence such as one set of mate pair reads together with supporting evidence from ESTs or location of markers from SNP or microsatellite maps, or cytogenetic localization of contained markers.
FB:WG
A transposable element that is foreign.
foreign transposable element
sequence
SO:0000720
Requested by Michael on 19 Nov 2004.
foreign_transposable_element
A transposable element that is foreign.
SO:ke
A gene that encodes a dicistronic primary transcript.
gene with dicistronic primary transcript
sequence
SO:0000721
Requested by Michael, 19 nov 2004.
gene_with_dicistronic_primary_transcript
A gene that encodes a dicistronic primary transcript.
SO:xp
A gene that encodes a dicistronic mRNA.
gene with dicistronic mRNA
gene with dicistronic processed transcript
sequence
SO:0000722
Requested by MA nov 19 2004.
gene_with_dicistronic_mRNA
A gene that encodes a dicistronic mRNA.
SO:xp
Genomic sequence removed from the genome, as a normal event, by a process of recombination.
INSDC_feature:iDNA
intervening DNA
sequence
SO:0000723
iDNA
Genomic sequence removed from the genome, as a normal event, by a process of recombination.
SO:ma
A region of a DNA molecule where transfer is initiated during the process of conjugation or mobilization.
http://en.wikipedia.org/wiki/Origin_of_transfer
INSDC_feature:oriT
origin of transfer
sequence
SO:0000724
oriT
A region of a DNA molecule where transfer is initiated during the process of conjugation or mobilization.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Origin_of_transfer
wiki
The transit_peptide is a short region at the N-terminus of the peptide that directs the protein to an organelle (chloroplast, mitochondrion, microbody or cyanelle).
BS:00055
INSDC_feature:transit_peptide
transit peptide
sequence
signal
transit
SO:0000725
Added to bring SO in line with the EMBL, DDBJ, GenBank feature table. Old definition before biosapiens: The coding sequence for an N-terminal domain of a nuclear-encoded organellar protein. This domain is involved in post-translational import of the protein into the organelle.
transit_peptide
The transit_peptide is a short region at the N-terminus of the peptide that directs the protein to an organelle (chloroplast, mitochondrion, microbody or cyanelle).
http://www.insdc.org/files/feature_table.html
transit
uniprot:feature_type
The simplest repeated component of a repeat region. A single repeat.
http://www.insdc.org/files/feature_table.html
repeat unit
sequence
SO:0000726
Added to comply with the feature table. A single repeat.
repeat_unit
The simplest repeated component of a repeat region. A single repeat.
SO:ke
A regulatory region where transcription factor binding sites are clustered to regulate various aspects of transcription activities. (CRMs can be located a few kb to hundreds of kb upstream of the core promoter, in the coding sequence, within introns, or in untranslated region (UTR) sequences, and even on a different chromosome). A single gene can be regulated by multiple CRMs to give precise control of its spatial and temporal expression. CRMs function as nodes in large, intertwined regulatory networks. CRM DNA accessibility is subject to regulation by dbTFs and transcription co-TFs.
CRM
TF module
cis regulatory module
transcription factor module
sequence
SO:0000727
Requested by Stephen Grossmann Dec 2004. Changed relationship from has_part SO:0000235 TF_binding site to TF_binding_site is part_of SO:0000727 CRM in response to requests from GREEKC initiative in Aug 2020. Removed 3' from definition because 5' UTRs are included as well, notified by Colin Logie of GREEKC. Nov 9 2020. DS Updated name from 'CRM' to 'cis_regulatory_module' on 08 Feb 2021. See GitHub Issue #526. DS Added final sentence to definition as part of GREEKC Feb 16, 2021. See GitHub Issue #534.
cis_regulatory_module
A regulatory region where transcription factor binding sites are clustered to regulate various aspects of transcription activities. (CRMs can be located a few kb to hundreds of kb upstream of the core promoter, in the coding sequence, within introns, or in untranslated region (UTR) sequences, and even on a different chromosome). A single gene can be regulated by multiple CRMs to give precise control of its spatial and temporal expression. CRMs function as nodes in large, intertwined regulatory networks. CRM DNA accessibility is subject to regulation by dbTFs and transcription co-TFs.
PMID:19660565
SO:SG
A region of a peptide that is able to excise itself and rejoin the remaining portions with a peptide bond.
http://en.wikipedia.org/wiki/Intein
sequence
protein intron
SO:0000728
Intein-mediated protein splicing occurs after mRNA has been translated into a protein.
intein
A region of a peptide that is able to excise itself and rejoin the remaining portions with a peptide bond.
SO:ke
http://en.wikipedia.org/wiki/Intein
wiki
An attribute of protein-coding genes where the initial protein product contains an intein.
intein containing
sequence
SO:0000729
intein_containing
An attribute of protein-coding genes where the initial protein product contains an intein.
SO:ke
A gap in the sequence of known length. The unknown bases are filled in with N's.
INSDC_feature:gap
INSDC_feature:assembly_gap
sequence
SO:0000730
gap
A gap in the sequence of known length. The unknown bases are filled in with N's.
SO:ke
An attribute to describe a feature that is incomplete.
fragment
sequence
SO:0000731
Term added because of request by MO people.
fragmentary
An attribute to describe a feature that is incomplete.
SO:ke
An attribute describing an unverified region.
http://en.wikipedia.org/wiki/Predicted
sequence
SO:0000732
predicted
An attribute describing an unverified region.
SO:ke
http://en.wikipedia.org/wiki/Predicted
wiki
An attribute describing a located_sequence_feature.
feature attribute
sequence
SO:0000733
feature_attribute
An attribute describing a located_sequence_feature.
SO:ke
An exemplar is a representative cDNA sequence for each gene. The exemplar approach is a method that usually involves some initial clustering into gene groups and the subsequent selection of a representative from each gene group.
exemplar mRNA
sequence
SO:0000734
Added for the MO people.
exemplar_mRNA
An exemplar is a representative cDNA sequence for each gene. The exemplar approach is a method that usually involves some initial clustering into gene groups and the subsequent selection of a representative from each gene group.
http://mged.sourceforge.net/ontologies/MGEDontology.php
The location of a sequence.
sequence location
sequence
SO:0000735
sequence_location
A sequence of DNA that originates from an organelle.
organelle sequence
sequence
SO:0000736
organelle_sequence
DNA belonging to the genome of a mitochondrion.
mitochondrial sequence
sequence
SO:0000737
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
mitochondrial_sequence
DNA belonging to the nuclear genome of a cell.
nuclear sequence
sequence
SO:0000738
Moved from is_a SO:0000736 (organelle_sequence) when brought to our attention by GitHub issue #489.
nuclear_sequence
DNA belonging to the genome of a plastid such as a chloroplast. The nucleomorph is the nucleus of the plastid.
nucleomorphic sequence
sequence
SO:0000739
nucleomorphic_sequence
DNA belonging to the genome of a plastid such as a chloroplast.
plastid sequence
sequence
SO:0000740
plastid_sequence
A kinetoplast is an interlocked network of thousands of minicircles and tens of maxicircles, located near the base of the flagellum of some protozoan species.
SO:0000826
http://en.wikipedia.org/wiki/Kinetoplast
kinetoplast_chromosome
sequence
SO:0000741
kinetoplast
A kinetoplast is an interlocked network of thousands of minicircles and tens of maxicircles, located near the base of the flagellum of some protozoan species.
PMID:8395055
http://en.wikipedia.org/wiki/Kinetoplast
wiki
A maxicircle is a replicon, part of a kinetoplast, that contains open reading frames and replicates via a rolling circle method.
SO:0000827
maxicircle_chromosome
sequence
SO:0000742
maxicircle
A maxicircle is a replicon, part of a kinetoplast, that contains open reading frames and replicates via a rolling circle method.
PMID:8395055
DNA belonging to the genome of an apicoplast, a non-photosynthetic plastid.
apicoplast sequence
sequence
SO:0000743
apicoplast_sequence
DNA belonging to the genome of a chromoplast, a colored plastid for synthesis and storage of pigments.
chromoplast sequence
sequence
SO:0000744
chromoplast_sequence
DNA belonging to the genome of a chloroplast, a green plastid for photosynthesis.
chloroplast sequence
sequence
SO:0000745
chloroplast_sequence
DNA belonging to the genome of a cyanelle, a photosynthetic plastid found in algae.
cyanelle sequence
sequence
SO:0000746
cyanelle_sequence
DNA belonging to the genome of a leucoplast, a colorless plastid generally containing starch or oil.
leucoplast sequence
sequence
SO:0000747
leucoplast_sequence
DNA belonging to the genome of a proplastid such as an immature chloroplast.
proplastid sequence
sequence
SO:0000748
proplastid_sequence
The location of DNA that has come from a plasmid sequence.
plasmid location
sequence
SO:0000749
plasmid_location
An origin_of_replication that is used for the amplification of a chromosomal nucleic acid sequence.
amplification origin
sequence
SO:0000750
amplification_origin
An origin_of_replication that is used for the amplification of a chromosomal nucleic acid sequence.
SO:ma
The location of DNA that has come from a viral origin.
proviral location
sequence
SO:0000751
proviral_location
A region that is involved in the regulation of transcription of a group of regulated genes.
SO:0001055
gene group regulatory region
sequence
SO:0000752
Merged into transcriptional_cis_regulatory_region (SO:0001055) on 11 Feb 2021 as part of GREEKC reducing redundancy as we prepare to submit several terms to Ensembl. See GitHub Issue #529.
gene_group_regulatory_region
true
The region of sequence that has been inserted and is being propagated by the clone.
clone insert
sequence
SO:0000753
clone_insert
The region of sequence that has been inserted and is being propagated by the clone.
SO:ke
The lambda bacteriophage is the vector for the linear lambda clone. The genes involved in the lysogenic pathway are removed from the viral DNA. Up to 25 kb of foreign DNA can then be inserted into the lambda genome.
lambda vector
sequence
SO:0000754
lambda_vector
The lambda bacteriophage is the vector for the linear lambda clone. The genes involved in the lysogenic pathway are removed from the viral DNA. Up to 25 kb of foreign DNA can then be inserted into the lambda genome.
ISBN:0-1767-2380-8
A plasmid that has been generated to act as a vector for foreign sequence.
http://en.wikipedia.org/wiki/Plasmid_vector#Vectors
plasmid vector
sequence
SO:0000755
plasmid_vector
http://en.wikipedia.org/wiki/Plasmid_vector#Vectors
wiki
DNA synthesized by reverse transcriptase using RNA as a template.
http://en.wikipedia.org/wiki/CDNA
complementary DNA
sequence
SO:0000756
cDNA
DNA synthesized by reverse transcriptase using RNA as a template.
SO:ma
http://en.wikipedia.org/wiki/CDNA
wiki
DNA synthesized from RNA by reverse transcriptase, single stranded.
single strand cDNA
single stranded cDNA
sequence
single-strand cDNA
SO:0000757
single_stranded_cDNA
DNA synthesized from RNA by reverse transcriptase that has been copied by PCR to make it double stranded.
double stranded cDNA
sequence
double strand cDNA
double-strand cDNA
SO:0000758
double_stranded_cDNA
sequence
SO:0000759
plasmid_clone
true
sequence
SO:0000760
YAC_clone
true
sequence
SO:0000761
phagemid_clone
true
sequence
P1_clone
SO:0000762
PAC_clone
true
sequence
SO:0000763
fosmid_clone
true
sequence
SO:0000764
BAC_clone
true
sequence
SO:0000765
cosmid_clone
true
A tRNA sequence that has a pyrrolysine anticodon, and a 3' pyrrolysine binding region.
pyrrolysyl tRNA
pyrrolysyl-transfer RNA
pyrrolysyl-transfer ribonucleic acid
sequence
SO:0000766
pyrrolysyl_tRNA
A tRNA sequence that has a pyrrolysine anticodon, and a 3' pyrrolysine binding region.
SO:ke
sequence
SO:0000767
clone_insert_start
true
A plasmid that may integrate with a chromosome.
sequence
SO:0000768
episome
A plasmid that may integrate with a chromosome.
SO:ma
The region of a two-piece tmRNA that bears the reading frame encoding the proteolysis tag. The tmRNA gene undergoes circular permutation in some groups of bacteria. Processing of the transcripts from such a gene leaves the mature tmRNA in two pieces, base-paired together.
tmRNA coding piece
sequence
SO:0000769
Added in response to comment from Kelly Williams from Indiana. Nov 2005.
tmRNA_coding_piece
The region of a two-piece tmRNA that bears the reading frame encoding the proteolysis tag. The tmRNA gene undergoes circular permutation in some groups of bacteria. Processing of the transcripts from such a gene leaves the mature tmRNA in two pieces, base-paired together.
Indiana:kw
doi:10.1093/nar/gkh795
issn:1362-4962
The acceptor region of a two-piece tmRNA that when mature is charged at its 3' end with alanine. The tmRNA gene undergoes circular permutation in some groups of bacteria; processing of the transcripts from such a gene leaves the mature tmRNA in two pieces, base-paired together.
tmRNA acceptor piece
sequence
SO:0000770
Added in response to Kelly Williams from Indiana. Date: Nov 2005.
tmRNA_acceptor_piece
The acceptor region of a two-piece tmRNA that when mature is charged at its 3' end with alanine. The tmRNA gene undergoes circular permutation in some groups of bacteria; processing of the transcripts from such a gene leaves the mature tmRNA in two pieces, base-paired together.
Indiana:kw
doi:10.1093/nar/gkh795
A quantitative trait locus (QTL) is a polymorphic locus which contains alleles that differentially affect the expression of a continuously distributed phenotypic trait. Usually it is a marker described by statistical association to quantitative variation in the particular phenotypic trait that is thought to be controlled by the cumulative action of alleles at multiple loci.
quantitative trait locus
sequence
SO:0000771
Added in response to request by Simon Twigger November 14th 2005.
QTL
A quantitative trait locus (QTL) is a polymorphic locus which contains alleles that differentially affect the expression of a continuously distributed phenotypic trait. Usually it is a marker described by statistical association to quantitative variation in the particular phenotypic trait that is thought to be controlled by the cumulative action of alleles at multiple loci.
http://rgd.mcw.edu/tu/qtls/
A genomic island is an integrated mobile genetic element, characterized by size (over 10 Kb). It has features that suggest a foreign origin. These can include nucleotide distribution (oligonucleotide signature, CG content etc.) that differs from the bulk of the chromosome and/or genes suggesting DNA mobility.
http://en.wikipedia.org/wiki/Genomic_island
genomic island
sequence
SO:0000772
Genomic islands are transmissible elements characterized by large size (>10kb).
genomic_island
A genomic island is an integrated mobile genetic element, characterized by size (over 10 Kb). It has features that suggest a foreign origin. These can include nucleotide distribution (oligonucleotide signature, CG content etc.) that differs from the bulk of the chromosome and/or genes suggesting DNA mobility.
Phigo:at
SO:ke
http://en.wikipedia.org/wiki/Genomic_island
wiki
Mobile genetic elements that contribute to rapid changes in virulence potential. They are present on the genomes of pathogenic strains but absent from the genomes of non pathogenic members of the same or related species.
pathogenic island
sequence
SO:0000773
Nature Reviews Microbiology 2, 414-424 (2004); doi:10.1038/nrmicro884 GENOMIC ISLANDS IN PATHOGENIC AND ENVIRONMENTAL MICROORGANISMS Ulrich Dobrindt, Bianca Hochhut, Ute Hentschel & Jorg Hacker.
pathogenic_island
Mobile genetic elements that contribute to rapid changes in virulence potential. They are present on the genomes of pathogenic strains but absent from the genomes of non pathogenic members of the same or related species.
SO:ke
A transmissible element containing genes involved in metabolism, analogous to the pathogenicity islands of gram negative bacteria.
metabolic island
sequence
SO:0000774
Genes for phenolic compound degradation in Pseudomonas putida are found on metabolic islands.
metabolic_island
A transmissible element containing genes involved in metabolism, analogous to the pathogenicity islands of gram negative bacteria.
SO:ke
An adaptive island is a genomic island that provides an adaptive advantage to the host.
adaptive island
sequence
SO:0000775
The iron-uptake ability of many pathogens is conveyed by adaptive islands. Nature Reviews Microbiology 2, 414-424 (2004); doi:10.1038/nrmicro884 GENOMIC ISLANDS IN PATHOGENIC AND ENVIRONMENTAL MICROORGANISMS Ulrich Dobrindt, Bianca Hochhut, Ute Hentschel & Jorg Hacker.
adaptive_island
An adaptive island is a genomic island that provides an adaptive advantage to the host.
SO:ke
A transmissible element containing genes involved in symbiosis, analogous to the pathogenicity islands of gram negative bacteria.
symbiosis island
sequence
SO:0000776
Nitrogen fixation in Rhizobiaceae species is encoded by symbiosis islands. Evolution of rhizobia by acquisition of a 500-kb symbiosis island that integrates into a phe-tRNA gene. John T. Sullivan and Clive W. Ronson, PNAS 1998 Apr 28 95 (9) 5145-5149.
symbiosis_island
A transmissible element containing genes involved in symbiosis, analogous to the pathogenicity islands of gram negative bacteria.
SO:ke
A non functional descendant of an rRNA.
INSDC_feature:rRNA
INSDC_qualifier:pseudo
pseudogenic rRNA
sequence
SO:0000777
Added Jan 2006 to allow the annotation of the pseudogenic rRNA by FlyBase. Non-functional is defined as its transcription is prevented due to one or more mutations.
pseudogenic_rRNA
A non functional descendant of an rRNA.
SO:ke
A non functional descendant of a tRNA.
INSDC_feature:tRNA
INSDC_qualifier:pseudo
pseudogenic tRNA
sequence
SO:0000778
Added Jan 2006 to allow the annotation of the pseudogenic tRNA by FlyBase. Non-functional is defined as its transcription is prevented due to one or more mutations.
pseudogenic_tRNA
A non functional descendant of a tRNA.
SO:ke
An episome that is engineered.
engineered episome
sequence
SO:0000779
Requested by Lynn Crosby Jan 2006.
engineered_episome
An episome that is engineered.
SO:xp
sequence
SO:0000780
Added by KE Jan 2006 to capture the kinds of attributes of TEs
transposable_element_attribute
true
Attribute describing sequence that has been integrated with foreign sequence.
sequence
SO:0000781
transgenic
Attribute describing sequence that has been integrated with foreign sequence.
SO:ke
An attribute describing a feature that occurs in nature.
sequence
SO:0000782
natural
An attribute describing a feature that occurs in nature.
SO:ke
An attribute to describe a region that was modified in vitro.
sequence
SO:0000783
engineered
An attribute to describe a region that was modified in vitro.
SO:ke
An attribute to describe a region from another species.
sequence
SO:0000784
foreign
An attribute to describe a region from another species.
SO:ke
The region of sequence that has been inserted and is being propagated by the clone.
cloned region
cloned segment
sequence
SO:0000785
Added in response to Lynn Crosby. A clone insert may be composed of many cloned regions.
cloned_region
reagent attribute
sequence
SO:0000786
Added jan 2006 by KE.
reagent_attribute
true
sequence
SO:0000787
clone_attribute
true
sequence
SO:0000788
cloned
true
An attribute to describe a feature that has been proven.
sequence
SO:0000789
validated
An attribute to describe a feature that has been proven.
SO:ke
An attribute describing a feature that is invalidated.
sequence
SO:0000790
invalidated
An attribute describing a feature that is invalidated.
SO:ke
sequence
SO:0000791
cloned_genomic
true
sequence
SO:0000792
cloned_cDNA
true
sequence
SO:0000793
engineered_DNA
true
A rescue region that is engineered.
engineered rescue fragment
engineered rescue region
engineered rescue segment
sequence
SO:0000794
engineered_rescue_region
A rescue region that is engineered.
SO:xp
A mini_gene that rescues.
rescue mini gene
rescue mini-gene
sequence
SO:0000795
rescue_mini_gene
A mini_gene that rescues.
SO:xp
TE that has been modified in vitro, including insertion of DNA derived from a source other than the originating TE.
transgenic transposable element
sequence
SO:0000796
Modified as requested by Lynn - FB. May 2007.
transgenic_transposable_element
TE that has been modified in vitro, including insertion of DNA derived from a source other than the originating TE.
FB:mc
TE that exists (or existed) in nature.
natural transposable element
sequence
SO:0000797
natural_transposable_element
TE that exists (or existed) in nature.
FB:mc
TE that has been modified by manipulations in vitro.
engineered transposable element
sequence
SO:0000798
engineered_transposable_element
TE that has been modified by manipulations in vitro.
FB:mc
A transposable_element that is engineered and foreign.
engineered foreign transposable element
sequence
SO:0000799
engineered_foreign_transposable_element
A transposable_element that is engineered and foreign.
FB:mc
A multi-chromosome duplication aberration generated by reassortment of other aberration components.
assortment derived duplication
sequence
SO:0000800
assortment_derived_duplication
A multi-chromosome duplication aberration generated by reassortment of other aberration components.
FB:gm
A multi-chromosome aberration generated by reassortment of other aberration components; presumed to have a deficiency and a duplication.
assortment derived deficiency plus duplication
sequence
SO:0000801
assortment_derived_deficiency_plus_duplication
A multi-chromosome aberration generated by reassortment of other aberration components; presumed to have a deficiency and a duplication.
FB:gm
A multi-chromosome deficiency aberration generated by reassortment of other aberration components.
assortment-derived deficiency
sequence
SO:0000802
assortment_derived_deficiency
A multi-chromosome deficiency aberration generated by reassortment of other aberration components.
FB:gm
A multi-chromosome aberration generated by reassortment of other aberration components; presumed to have a deficiency or a duplication.
assortment derived aneuploid
sequence
SO:0000803
assortment_derived_aneuploid
A multi-chromosome aberration generated by reassortment of other aberration components; presumed to have a deficiency or a duplication.
FB:gm
A region that is engineered.
construct
engineered region
engineered sequence
sequence
SO:0000804
engineered_region
A region that is engineered.
SO:xp
A region that is engineered and foreign.
engineered foreign region
sequence
SO:0000805
engineered_foreign_region
A region that is engineered and foreign.
SO:xp
When two regions of DNA are joined together that are not normally together.
sequence
SO:0000806
fusion
A tag that is engineered.
engineered tag
sequence
SO:0000807
engineered_tag
A tag that is engineered.
SO:xp
A cDNA clone that has been validated.
validated cDNA clone
sequence
SO:0000808
validated_cDNA_clone
A cDNA clone that has been validated.
SO:xp
A cDNA clone that is invalid.
invalidated cDNA clone
sequence
SO:0000809
invalidated_cDNA_clone
A cDNA clone that is invalid.
SO:xp
A cDNA clone invalidated because it is chimeric.
chimeric cDNA clone
sequence
SO:0000810
chimeric_cDNA_clone
A cDNA clone invalidated because it is chimeric.
SO:xp
A cDNA clone invalidated by genomic contamination.
genomically contaminated cDNA clone
sequence
SO:0000811
genomically_contaminated_cDNA_clone
A cDNA clone invalidated by genomic contamination.
SO:xp
A cDNA clone invalidated by polyA priming.
polyA primed cDNA clone
sequence
SO:0000812
polyA_primed_cDNA_clone
A cDNA clone invalidated by polyA priming.
SO:xp
A cDNA clone invalidated by partial processing.
partially processed cDNA clone
sequence
SO:0000813
partially_processed_cDNA_clone
A cDNA clone invalidated by partial processing.
SO:xp
An attribute describing a region's ability, when introduced to a mutant organism, to re-establish (rescue) a phenotype.
sequence
SO:0000814
rescue
An attribute describing a region's ability, when introduced to a mutant organism, to re-establish (rescue) a phenotype.
SO:ke
By definition, minigenes are short open-reading frames (ORFs), usually encoding approximately 9 to 20 amino acids, which are expressed in vivo (as distinct from being synthesized as peptide or protein ex vivo and subsequently injected). The in vivo synthesis confers a distinct advantage: the expressed sequences can enter both antigen presentation pathways, MHC I (inducing CD8+ T-cells, which are usually cytotoxic T-lymphocytes (CTL)) and MHC II (inducing CD4+ T-cells, usually 'T-helpers' (Th)); and can encounter B-cells, inducing antibody responses. Three main vector approaches have been used to deliver minigenes: viral vectors, bacterial vectors and plasmid DNA.
mini gene
sequence
SO:0000815
mini_gene
By definition, minigenes are short open-reading frames (ORFs), usually encoding approximately 9 to 20 amino acids, which are expressed in vivo (as distinct from being synthesized as peptide or protein ex vivo and subsequently injected). The in vivo synthesis confers a distinct advantage: the expressed sequences can enter both antigen presentation pathways, MHC I (inducing CD8+ T-cells, which are usually cytotoxic T-lymphocytes (CTL)) and MHC II (inducing CD4+ T-cells, usually 'T-helpers' (Th)); and can encounter B-cells, inducing antibody responses. Three main vector approaches have been used to deliver minigenes: viral vectors, bacterial vectors and plasmid DNA.
PMID:15992143
A gene that rescues.
rescue gene
sequence
SO:0000816
rescue_gene
A gene that rescues.
SO:xp
An attribute describing sequence with the genotype found in nature and/or standard laboratory stock.
http://en.wikipedia.org/wiki/Wild_type
loinc:LA9658-1
wild type
sequence
SO:0000817
wild_type
An attribute describing sequence with the genotype found in nature and/or standard laboratory stock.
SO:ke
http://en.wikipedia.org/wiki/Wild_type
wiki
loinc:LA9658-1
wild type
A gene that rescues.
wild type rescue gene
sequence
SO:0000818
wild_type_rescue_gene
A gene that rescues.
SO:xp
A chromosome originating in a mitochondrion.
mitochondrial chromosome
sequence
SO:0000819
mitochondrial_chromosome
A chromosome originating in a mitochondrion.
SO:xp
A chromosome originating in a chloroplast.
chloroplast chromosome
sequence
SO:0000820
chloroplast_chromosome
A chromosome originating in a chloroplast.
SO:xp
A chromosome originating in a chromoplast.
chromoplast chromosome
sequence
SO:0000821
chromoplast_chromosome
A chromosome originating in a chromoplast.
SO:xp
A chromosome originating in a cyanelle.
cyanelle chromosome
sequence
SO:0000822
cyanelle_chromosome
A chromosome originating in a cyanelle.
SO:xp
A chromosome with origin in a leucoplast.
leucoplast chromosome
sequence
SO:0000823
leucoplast_chromosome
A chromosome with origin in a leucoplast.
SO:xp
A chromosome originating in a macronucleus.
macronuclear chromosome
sequence
SO:0000824
macronuclear_chromosome
A chromosome originating in a macronucleus.
SO:xp
A chromosome originating in a micronucleus.
micronuclear chromosome
sequence
SO:0000825
micronuclear_chromosome
A chromosome originating in a micronucleus.
SO:xp
true
true
A chromosome originating in a nucleus.
nuclear chromosome
sequence
SO:0000828
nuclear_chromosome
A chromosome originating in a nucleus.
SO:xp
A chromosome originating in a nucleomorph.
nucleomorphic chromosome
sequence
SO:0000829
nucleomorphic_chromosome
A chromosome originating in a nucleomorph.
SO:xp
A region of a chromosome.
chromosomal region
chromosomal_region
chromosome part
sequence
SO:0000830
This is a manufactured term that serves to allow the parts of a chromosome to have an is_a path to the root.
chromosome_part
A region of a chromosome.
SO:ke
A region of a gene.
gene member region
sequence
SO:0000831
A manufactured term used to allow the parts of a gene to have an is_a path to the root.
gene_member_region
A region of a gene.
SO:ke
A region of sequence which is part of a promoter.
sequence
SO:0000832
This is a manufactured term to allow the parts of promoter to have an is_a path back to the root.
promoter_region
true
A region of sequence which is part of a promoter.
SO:ke
A region of a transcript.
transcript region
sequence
SO:0000833
This term was added to provide a grouping term for the region parts of transcript, thus giving them an is_a path back to the root.
transcript_region
A region of a transcript.
SO:ke
A region of a mature transcript.
mature transcript region
sequence
SO:0000834
A manufactured term to collect together the parts of a mature transcript and give them an is_a path to the root.
mature_transcript_region
A region of a mature transcript.
SO:ke
A part of a primary transcript.
primary transcript region
sequence
SO:0000835
This term was added to provide a grouping term for the region parts of primary_transcript, thus giving them an is_a path back to the root.
primary_transcript_region
A part of a primary transcript.
SO:ke
A region of an mRNA.
mRNA region
sequence
SO:0000836
This term was added to provide a grouping term for the region parts of mRNA, thus giving them an is_a path back to the root.
mRNA_region
A region of an mRNA.
SO:cb
A region of UTR.
UTR region
sequence
SO:0000837
A region of UTR. This term is a grouping term to allow the parts of UTR to have an is_a path to the root.
UTR_region
A region of UTR.
SO:ke
A region of an rRNA primary transcript.
rRNA primary transcript region
sequence
SO:0000838
To allow transcribed_spacer_region to have a path to the root.
rRNA_primary_transcript_region
A region of an rRNA primary transcript.
SO:ke
Biological sequence region that can be assigned to a specific subsequence of a polypeptide.
BS:00124
BS:00331
region
site
sequence
positional
positional polypeptide feature
region or site annotation
SO:0000839
Added to allow the polypeptide regions to have is_a paths back to the root.
polypeptide_region
Biological sequence region that can be assigned to a specific subsequence of a polypeptide.
SO:GAR
SO:ke
region
uniprot:feature_type
site
uniprot:feature_type
A region of a repeated sequence.
repeat component
sequence
SO:0000840
A manufactured term to group the parts of repeats and give them an is_a path back to the root.
repeat_component
A region of a repeated sequence.
SO:ke
A region within an intron.
spliceosomal intron region
sequence
SO:0000841
A term added to allow the parts of introns to have is_a paths to the root.
spliceosomal_intron_region
A region within an intron.
SO:ke
A region of a gene that has a specific function.
gene component region
sequence
SO:0000842
gene_component_region
A region which is part of a bacterial RNA polymerase promoter.
sequence
SO:0000843
This is a manufactured term to allow the parts of bacterial_RNApol_promoter to have an is_a path back to the root.
bacterial_RNApol_promoter_region
true
A region which is part of a bacterial RNA polymerase promoter.
SO:ke
A region of sequence which is a promoter for RNA polymerase II.
sequence
SO:0000844
This is a manufactured term to allow the parts of RNApol_II_promoter to have an is_a path back to the root.
RNApol_II_promoter_region
true
A region of sequence which is a promoter for RNA polymerase II.
SO:ke
A region of sequence which is a promoter for RNA polymerase III type 1.
sequence
SO:0000845
This is a manufactured term to allow the parts of RNApol_III_promoter_type_1 to have an is_a path back to the root.
RNApol_III_promoter_type_1_region
true
A region of sequence which is a promoter for RNA polymerase III type 1.
SO:ke
A region of sequence which is a promoter for RNA polymerase III type 2.
sequence
SO:0000846
This is a manufactured term to allow the parts of RNApol_III_promoter_type_2 to have an is_a path back to the root.
RNApol_III_promoter_type_2_region
true
A region of sequence which is a promoter for RNA polymerase III type 2.
SO:ke
A region of a tmRNA.
tmRNA region
sequence
SO:0000847
This term was added to provide a grouping term for the region parts of tmRNA, thus giving them an is_a path back to the root.
tmRNA_region
A region of a tmRNA.
SO:cb
The long terminal repeat found at the ends of the sequence to be inserted into the host genome.
LTR component
long terminal repeat component
sequence
SO:0000848
LTR_component
A component of the three-prime long terminal repeat.
3' long terminal repeat component
three prime LTR component
sequence
SO:0000849
three_prime_LTR_component
A component of the three-prime long terminal repeat.
PMID:8649407
A component of the five-prime long terminal repeat.
5' long terminal repeat component
five prime LTR component
sequence
SO:0000850
five_prime_LTR_component
A component of the five-prime long terminal repeat.
PMID:8649407
A region of a CDS.
CDS region
sequence
SO:0000851
CDS_region
A region of a CDS.
SO:cb
A region of an exon.
exon region
sequence
SO:0000852
exon_region
A region of an exon.
RSC:cb
A region that is homologous to another region.
http://en.wikipedia.org/wiki/Homology_(biology)
homolog
homologous region
homologue
sequence
SO:0000853
homologous_region
A region that is homologous to another region.
SO:ke
http://en.wikipedia.org/wiki/Homology_(biology)
wiki
A homologous_region that is paralogous to another region.
http://en.wikipedia.org/wiki/Paralog#Paralogy
paralog
paralogous region
paralogue
sequence
SO:0000854
A term to be used in conjunction with the paralogous_to relationship.
paralogous_region
A homologous_region that is paralogous to another region.
SO:ke
http://en.wikipedia.org/wiki/Paralog#Paralogy
wiki
A homologous_region that is orthologous to another region.
http://en.wikipedia.org/wiki/Ortholog#Orthology
ortholog
orthologous region
orthologue
sequence
SO:0000855
This term should be used in conjunction with the similarity relationships defined in SO.
orthologous_region
A homologous_region that is orthologous to another region.
SO:ke
http://en.wikipedia.org/wiki/Ortholog#Orthology
wiki
A region that is similar or identical across more than one species.
sequence
SO:0000856
conserved
Similarity due to common ancestry.
sequence
SO:0000857
homologous
Similarity due to common ancestry.
SO:ke
An attribute describing a kind of homology where divergence occurred after a speciation event.
sequence
SO:0000858
orthologous
An attribute describing a kind of homology where divergence occurred after a speciation event.
SO:ke
An attribute describing a kind of homology where divergence occurred after a duplication event.
sequence
SO:0000859
paralogous
An attribute describing a kind of homology where divergence occurred after a duplication event.
SO:ke
An attribute describing sequence regions occurring in the same order on chromosomes of different species.
http://en.wikipedia.org/wiki/Syntenic
sequence
SO:0000860
syntenic
An attribute describing sequence regions occurring in the same order on chromosomes of different species.
SO:ke
http://en.wikipedia.org/wiki/Syntenic
wiki
A primary transcript that is capped.
capped primary transcript
sequence
SO:0000861
capped_primary_transcript
A primary transcript that is capped.
SO:xp
An mRNA that is capped.
capped mRNA
sequence
SO:0000862
capped_mRNA
An mRNA that is capped.
SO:xp
An attribute describing an mRNA feature.
mRNA attribute
sequence
SO:0000863
mRNA_attribute
An attribute describing an mRNA feature.
SO:ke
An attribute describing a sequence that is representative of a class of similar sequences.
sequence
SO:0000864
exemplar
An attribute describing a sequence that is representative of a class of similar sequences.
SO:ke
An attribute describing a sequence that contains a mutation involving the deletion or insertion of one or more bases, where this number is not divisible by 3.
http://en.wikipedia.org/wiki/Frameshift
sequence
SO:0000865
frameshift
An attribute describing a sequence that contains a mutation involving the deletion or insertion of one or more bases, where this number is not divisible by 3.
SO:ke
http://en.wikipedia.org/wiki/Frameshift
wiki
A frameshift caused by deleting one base.
minus 1 frameshift
sequence
SO:0000866
minus_1_frameshift
A frameshift caused by deleting one base.
SO:ke
A frameshift caused by deleting two bases.
minus 2 frameshift
sequence
SO:0000867
minus_2_frameshift
A frameshift caused by deleting two bases.
SO:ke
A frameshift caused by inserting one base.
plus 1 frameshift
sequence
SO:0000868
plus_1_frameshift
A frameshift caused by inserting one base.
SO:ke
A frameshift caused by inserting two bases.
plus 2 framshift
sequence
SO:0000869
plus_2_framshift
A frameshift caused by inserting two bases.
SO:ke
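The frameshift records above turn on simple arithmetic: a net insertion or deletion shifts the reading frame only when its length is not divisible by 3. A minimal Python sketch of that rule follows. The function name `frameshift_label` and the signed `net_change` convention (positive for insertion, negative for deletion) are assumptions made for illustration and are not part of the ontology; the returned strings are the labels exactly as they appear in the records above.

```python
from typing import Optional

# Labels copied verbatim from the term records above, including the
# spelling of "plus_2_framshift" as it appears in this file.
EXACT_LABELS = {
    -1: "minus_1_frameshift",
    -2: "minus_2_frameshift",
    1: "plus_1_frameshift",
    2: "plus_2_framshift",
}


def frameshift_label(net_change: int) -> Optional[str]:
    """Return a label for a net insertion (+) or deletion (-) of bases.

    Returns None when the change is divisible by 3, because the reading
    frame is then preserved. Changes other than +/-1 and +/-2 that still
    shift the frame fall back to the general "frameshift" attribute.
    """
    if net_change % 3 == 0:
        return None
    return EXACT_LABELS.get(net_change, "frameshift")
```

Under these assumptions, `frameshift_label(-1)` selects the minus_1_frameshift label, while a three-base deletion yields `None` because it leaves the frame intact.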
An attribute describing transcript sequence that is created by splicing exons from different genes.
trans-spliced
sequence
SO:0000870
trans_spliced
An attribute describing transcript sequence that is created by splicing exons from different genes.
SO:ke
An mRNA that is polyadenylated.
polyadenylated mRNA
sequence
SO:0000871
polyadenylated_mRNA
An mRNA that is polyadenylated.
SO:xp
An mRNA that is trans-spliced.
trans-spliced mRNA
sequence
SO:0000872
trans_spliced_mRNA
An mRNA that is trans-spliced.
SO:xp
A transcript that is edited.
edited transcript
sequence
SO:0000873
edited_transcript
A transcript that is edited.
SO:ke
A transcript that has been edited by A to I substitution.
edited transcript by A to I substitution
sequence
SO:0000874
edited_transcript_by_A_to_I_substitution
A transcript that has been edited by A to I substitution.
SO:ke
An attribute describing a sequence that is bound by a protein.
bound by protein
sequence
SO:0000875
bound_by_protein
An attribute describing a sequence that is bound by a protein.
SO:ke
An attribute describing a sequence that is bound by a nucleic acid.
bound by nucleic acid
sequence
SO:0000876
bound_by_nucleic_acid
An attribute describing a sequence that is bound by a nucleic acid.
SO:ke
An attribute describing a situation where a gene may encode more than one transcript.
alternatively spliced
sequence
SO:0000877
alternatively_spliced
An attribute describing a situation where a gene may encode more than one transcript.
SO:ke
An attribute describing a sequence that contains the code for one gene product.
sequence
SO:0000878
monocistronic
An attribute describing a sequence that contains the code for one gene product.
SO:ke
An attribute describing a sequence that contains the code for two gene products.
sequence
SO:0000879
dicistronic
An attribute describing a sequence that contains the code for two gene products.
SO:ke
An attribute describing a sequence that contains the code for more than one gene product.
sequence
SO:0000880
polycistronic
An attribute describing a sequence that contains the code for more than one gene product.
SO:ke
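The three definitions above differ only in the number of gene products a sequence codes for, and by those definitions a sequence coding for exactly two products satisfies both dicistronic and polycistronic. A minimal Python sketch of that reading follows; the function name `cistronic_labels` and the idea of passing a plain count are assumptions for illustration, not something the ontology prescribes.

```python
from typing import List


def cistronic_labels(n_gene_products: int) -> List[str]:
    """Return every attribute label whose definition matches the count.

    one product       -> monocistronic
    two products      -> dicistronic (and polycistronic, i.e. more than one)
    more than one     -> polycistronic
    fewer than one    -> no label applies
    """
    if n_gene_products < 1:
        return []
    if n_gene_products == 1:
        return ["monocistronic"]
    labels = ["polycistronic"]
    if n_gene_products == 2:
        # Exactly two products also satisfies the narrower definition.
        labels.insert(0, "dicistronic")
    return labels
```

Returning a list rather than a single label keeps the overlap between the dicistronic and polycistronic definitions visible instead of silently picking one.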
An attribute describing an mRNA sequence that has been reprogrammed at translation, causing localized alterations.
sequence
SO:0000881
recoded
An attribute describing an mRNA sequence that has been reprogrammed at translation, causing localized alterations.
SO:ke
An attribute describing the alteration of codon meaning.
codon redefined
sequence
SO:0000882
codon_redefined
An attribute describing the alteration of codon meaning.
SO:ke
A stop codon redefined to be a new amino acid.
stop codon read through
sequence
stop codon readthrough
SO:0000883
stop_codon_read_through
A stop codon redefined to be a new amino acid.
SO:ke
A stop codon redefined to be the new amino acid, pyrrolysine.
stop codon redefined as pyrrolysine
sequence
SO:0000884
stop_codon_redefined_as_pyrrolysine
A stop codon redefined to be the new amino acid, pyrrolysine.
SO:ke
A stop codon redefined to be the new amino acid, selenocysteine.
stop codon redefined as selenocysteine
sequence
SO:0000885
stop_codon_redefined_as_selenocysteine
A stop codon redefined to be the new amino acid, selenocysteine.
SO:ke
Recoded mRNA where a block of nucleotides is not translated.
recoded by translational bypass
sequence
SO:0000886
recoded_by_translational_bypass
Recoded mRNA where a block of nucleotides is not translated.
SO:ke
Recoding by frameshifting a particular site.
translationally frameshifted
sequence
SO:0000887
translationally_frameshifted
Recoding by frameshifting a particular site.
SO:ke
A gene that is maternally_imprinted.
maternally imprinted gene
sequence
SO:0000888
maternally_imprinted_gene
A gene that is maternally_imprinted.
SO:xp
A gene that is paternally imprinted.
paternally imprinted gene
sequence
SO:0000889
paternally_imprinted_gene
A gene that is paternally imprinted.
SO:xp
A gene that is post translationally regulated.
post translationally regulated gene
sequence
SO:0000890
post_translationally_regulated_gene
A gene that is post translationally regulated.
SO:xp
A gene that is negatively autoregulated.
negatively autoregulated gene
sequence
SO:0000891
negatively_autoregulated_gene
A gene that is negatively autoregulated.
SO:xp
A gene that is positively autoregulated.
positively autoregulated gene
sequence
SO:0000892
positively_autoregulated_gene
A gene that is positively autoregulated.
SO:xp
An attribute describing an epigenetic process where a gene is inactivated at transcriptional or translational level.
http://en.wikipedia.org/wiki/Silenced
sequence
SO:0000893
silenced
An attribute describing an epigenetic process where a gene is inactivated at transcriptional or translational level.
SO:ke
http://en.wikipedia.org/wiki/Silenced
wiki
An attribute describing an epigenetic process where a gene is inactivated by DNA modifications, resulting in repression of transcription.
silenced by DNA modification
sequence
SO:0000894
silenced_by_DNA_modification
An attribute describing an epigenetic process where a gene is inactivated by DNA modifications, resulting in repression of transcription.
SO:ke
An attribute describing an epigenetic process where a gene is inactivated by DNA methylation, resulting in repression of transcription.
silenced by DNA methylation
sequence
SO:0000895
silenced_by_DNA_methylation
An attribute describing an epigenetic process where a gene is inactivated by DNA methylation, resulting in repression of transcription.
SO:ke
A gene that is translationally regulated.
translationally regulated gene
sequence
SO:0000896
translationally_regulated_gene
A gene that is translationally regulated.
SO:xp
A gene that is allelically_excluded.
allelically excluded gene
sequence
SO:0000897
allelically_excluded_gene
A gene that is allelically_excluded.
SO:xp
A gene that is epigenetically modified.
epigenetically modified gene
sequence
SO:0000898
epigenetically_modified_gene
A gene that is epigenetically modified.
SO:ke
An attribute describing a nuclear pseudogene of a mitochondrial gene.
nuclear mitochondrial
sequence
SO:0000899
nuclear_mitochondrial
true
An attribute describing a nuclear pseudogene of a mitochondrial gene.
SO:ke
An attribute describing a pseudogene whereby an mRNA was retrotransposed. The mRNA sequence is transcribed back into the genome, lacking introns and promoters, but often including a polyA tail.
sequence
SO:0000900
processed
true
An attribute describing a pseudogene whereby an mRNA was retrotransposed. The mRNA sequence is transcribed back into the genome, lacking introns and promoters, but often including a polyA tail.
SO:ke
An attribute describing a pseudogene that was created by tandem duplication and unequal crossing over during recombination.
unequally crossed over
sequence
SO:0000901
unequally_crossed_over
true
An attribute describing a pseudogene that was created by tandem duplication and unequal crossing over during recombination.
SO:ke
A transgene is a gene that has been transferred naturally or by any of a number of genetic engineering techniques from one organism to another.
http://en.wikipedia.org/wiki/Transgene
sequence
SO:0000902
transgene
A transgene is a gene that has been transferred naturally or by any of a number of genetic engineering techniques from one organism to another.
SO:xp
http://en.wikipedia.org/wiki/Transgene
wiki
Endogenous DNA sequences that are likely to have arisen from retroviruses.
endogenous retroviral sequence
sequence
SO:0000903
endogenous_retroviral_sequence
An attribute to describe the sequence of a feature, where the DNA is rearranged.
rearranged at DNA level
sequence
SO:0000904
rearranged_at_DNA_level
An attribute to describe the sequence of a feature, where the DNA is rearranged.
SO:ke
An attribute describing the status of a feature, based on the available evidence.
sequence
SO:0000905
This term is the hypernym of attributes and should not be annotated to.
status
An attribute describing the status of a feature, based on the available evidence.
SO:ke
Attribute to describe a feature that is independently known - not predicted.
independently known
sequence
SO:0000906
independently_known
Attribute to describe a feature that is independently known - not predicted.
SO:ke
An attribute to describe a feature that has been predicted using sequence similarity techniques.
supported by sequence similarity
sequence
SO:0000907
supported_by_sequence_similarity
An attribute to describe a feature that has been predicted using sequence similarity techniques.
SO:ke
An attribute to describe a feature that has been predicted using sequence similarity of a known domain.
supported by domain match
sequence
SO:0000908
supported_by_domain_match
An attribute to describe a feature that has been predicted using sequence similarity of a known domain.
SO:ke
An attribute to describe a feature that has been predicted using sequence similarity to EST or cDNA data.
supported by EST or cDNA
sequence
SO:0000909
supported_by_EST_or_cDNA
An attribute to describe a feature that has been predicted using sequence similarity to EST or cDNA data.
SO:ke
A gene whose predicted amino acid sequence is unsupported by any experimental evidence or by any match with any other known sequence.
sequence
SO:0000910
orphan
An attribute describing a feature that is predicted by a computer program that did not rely on sequence similarity.
predicted by ab initio computation
sequence
SO:0000911
predicted_by_ab_initio_computation
An attribute describing a feature that is predicted by a computer program that did not rely on sequence similarity.
SO:ke
A motif of three consecutive residues and one H-bond in which: residue(i) is Aspartate or Asparagine (Asx), the side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2).
BS:00203
asx turn
sequence
SO:0000912
asx_turn
A motif of three consecutive residues and one H-bond in which: residue(i) is Aspartate or Asparagine (Asx), the side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2).
http://www.ebi.ac.uk/msd-srv/msdmotif/
A clone insert made from cDNA.
cloned cDNA insert
sequence
SO:0000913
cloned_cDNA_insert
A clone insert made from cDNA.
SO:xp
A clone insert made from genomic DNA.
cloned genomic insert
sequence
SO:0000914
cloned_genomic_insert
A clone insert made from genomic DNA.
SO:xp
A clone insert that is engineered.
engineered insert
sequence
SO:0000915
engineered_insert
A clone insert that is engineered.
SO:xp
edit operation
sequence
SO:0000916
edit_operation
true
An edit to insert a U.
insert U
sequence
SO:0000917
The insertion and deletion of uridine (U) residues, usually within coding regions of mRNA transcripts of cryptogenes in the mitochondrial genome of kinetoplastid protozoa.
insert_U
true
An edit to insert a U.
SO:ke
An edit to delete a uridine.
delete U
sequence
SO:0000918
The insertion and deletion of uridine (U) residues, usually within coding regions of mRNA transcripts of cryptogenes in the mitochondrial genome of kinetoplastid protozoa.
delete_U
true
An edit to delete a uridine.
SO:ke
An edit to substitute an I for an A.
substitute A to I
sequence
SO:0000919
substitute_A_to_I
true
An edit to substitute an I for an A.
SO:ke
An edit to insert a cytidine.
insert C
sequence
SO:0000920
insert_C
true
An edit to insert a cytidine.
SO:ke
An edit to insert a dinucleotide.
insert dinucleotide
sequence
SO:0000921
insert_dinucleotide
true
An edit to insert a dinucleotide.
SO:ke
An edit to substitute a U for a C.
substitute C to U
sequence
SO:0000922
substitute_C_to_U
true
An edit to substitute a U for a C.
SO:ke
An edit to insert a G.
insert G
sequence
SO:0000923
insert_G
true
An edit to insert a G.
SO:ke
An edit to insert a GC dinucleotide.
insert GC
sequence
SO:0000924
The type of RNA editing found in the mitochondria of Myxomycota, characterized by the insertion of mono- and dinucleotides in RNAs relative to their mtDNA template and in addition, C to U base conversion. The most common mononucleotide insertion is cytidine, although a number of uridine mononucleotides are inserted at specific sites. Adenine and guanine have not been observed in mononucleotide insertions. Five different dinucleotide insertions have been observed, GC, GU, CU, AU and AA. Both mono- and dinucleotide insertions create open reading frames in mRNA and contribute to highly conserved structural features of rRNAs and tRNAs.
insert_GC
true
An edit to insert a GC dinucleotide.
SO:ke
An edit to insert a GU dinucleotide.
insert GU
sequence
SO:0000925
The type of RNA editing found in the mitochondria of Myxomycota, characterized by the insertion of mono- and dinucleotides in RNAs relative to their mtDNA template and in addition, C to U base conversion. The most common mononucleotide insertion is cytidine, although a number of uridine mononucleotides are inserted at specific sites. Adenine and guanine have not been observed in mononucleotide insertions. Five different dinucleotide insertions have been observed, GC, GU, CU, AU and AA. Both mono- and dinucleotide insertions create open reading frames in mRNA and contribute to highly conserved structural features of rRNAs and tRNAs.
insert_GU
true
An edit to insert a GU dinucleotide.
SO:ke
An edit to insert a CU dinucleotide.
insert CU
sequence
SO:0000926
The type of RNA editing found in the mitochondria of Myxomycota, characterized by the insertion of mono- and dinucleotides in RNAs relative to their mtDNA template and in addition, C to U base conversion. The most common mononucleotide insertion is cytidine, although a number of uridine mononucleotides are inserted at specific sites. Adenine and guanine have not been observed in mononucleotide insertions. Five different dinucleotide insertions have been observed, GC, GU, CU, AU and AA. Both mono- and dinucleotide insertions create open reading frames in mRNA and contribute to highly conserved structural features of rRNAs and tRNAs.
insert_CU
true
An edit to insert a CU dinucleotide.
SO:ke
An edit to insert an AU dinucleotide.
insert AU
sequence
SO:0000927
The type of RNA editing found in the mitochondria of Myxomycota, characterized by the insertion of mono- and dinucleotides in RNAs relative to their mtDNA template and in addition, C to U base conversion. The most common mononucleotide insertion is cytidine, although a number of uridine mononucleotides are inserted at specific sites. Adenine and guanine have not been observed in mononucleotide insertions. Five different dinucleotide insertions have been observed, GC, GU, CU, AU and AA. Both mono- and dinucleotide insertions create open reading frames in mRNA and contribute to highly conserved structural features of rRNAs and tRNAs.
insert_AU
true
An edit to insert an AU dinucleotide.
SO:ke
An edit to insert an AA dinucleotide.
insert AA
sequence
SO:0000928
The type of RNA editing found in the mitochondria of Myxomycota, characterized by the insertion of mono- and dinucleotides in RNAs relative to their mtDNA template and in addition, C to U base conversion. The most common mononucleotide insertion is cytidine, although a number of uridine mononucleotides are inserted at specific sites. Adenine and guanine have not been observed in mononucleotide insertions. Five different dinucleotide insertions have been observed, GC, GU, CU, AU and AA. Both mono- and dinucleotide insertions create open reading frames in mRNA and contribute to highly conserved structural features of rRNAs and tRNAs.
insert_AA
true
An edit to insert an AA dinucleotide.
SO:ke
An mRNA that is edited.
edited mRNA
sequence
SO:0000929
edited_mRNA
An mRNA that is edited.
SO:xp
A region of guide RNA.
guide RNA region
sequence
SO:0000930
guide_RNA_region
A region of guide RNA.
SO:ma
A region of a guide_RNA that base-pairs to a target mRNA.
anchor region
sequence
SO:0000931
anchor_region
A region of a guide_RNA that base-pairs to a target mRNA.
SO:jk
A primary transcript that, at least in part, encodes one or more proteins that has not been edited.
pre-edited mRNA
sequence
SO:0000932
pre_edited_mRNA
An attribute to describe a feature between stages of processing.
sequence
SO:0000933
intermediate
An attribute to describe a feature between stages of processing.
SO:ke
A miRNA target site is a binding site where the molecule is a micro RNA.
miRNA target site
sequence
SO:0000934
miRNA_target_site
A miRNA target site is a binding site where the molecule is a micro RNA.
FB:cds
A CDS that is edited.
edited CDS
sequence
SO:0000935
edited_CDS
A CDS that is edited.
SO:xp
Genomic DNA of immunoglobulin/T-cell receptor gene in partially rearranged genomic DNA.
vertebrate immunoglobulin T cell receptor rearranged segment
sequence
SO:0000936
vertebrate_immunoglobulin_T_cell_receptor_rearranged_segment
sequence
SO:0000937
vertebrate_immune_system_feature
true
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration.
vertebrate immunoglobulin T cell receptor rearranged gene cluster
sequence
SO:0000938
vertebrate_immunoglobulin_T_cell_receptor_rearranged_gene_cluster
Feature used for the recombination of genomic material for the purpose of generating diversity of the immune system.
vertebrate immune system gene recombination signal feature
sequence
SO:0000939
vertebrate_immune_system_gene_recombination_signal_feature
A gene that is recombinationally rearranged.
recombinationally rearranged
sequence
SO:0000940
recombinationally_rearranged
A recombinationally rearranged gene of the vertebrate immune system.
recombinationally rearranged vertebrate immune system gene
sequence
SO:0000941
recombinationally_rearranged_vertebrate_immune_system_gene
A recombinationally rearranged gene of the vertebrate immune system.
SO:xp
An integration/excision site of a phage chromosome at which a recombinase acts to insert the phage DNA at a cognate integration/excision site on a bacterial chromosome.
attP site
sequence
SO:0000942
attP_site
An integration/excision site of a phage chromosome at which a recombinase acts to insert the phage DNA at a cognate integration/excision site on a bacterial chromosome.
SO:as
An integration/excision site of a bacterial chromosome at which a recombinase acts to insert foreign DNA containing a cognate integration/excision site.
attB site
sequence
SO:0000943
attB_site
An integration/excision site of a bacterial chromosome at which a recombinase acts to insert foreign DNA containing a cognate integration/excision site.
SO:as
A region that results from recombination between attP_site and attB_site, composed of the 5' portion of attB_site and the 3' portion of attP_site.
sequence
attBP'
attL site
SO:0000944
attL_site
A region that results from recombination between attP_site and attB_site, composed of the 5' portion of attB_site and the 3' portion of attP_site.
SO:as
A region that results from recombination between attP_site and attB_site, composed of the 5' portion of attP_site and the 3' portion of attB_site.
attR site
sequence
attPB'
SO:0000945
attR_site
A region that results from recombination between attP_site and attB_site, composed of the 5' portion of attP_site and the 3' portion of attB_site.
SO:as
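The attL_site and attR_site definitions above describe how the two recombinant sites are assembled from portions of attB_site and attP_site. A minimal Python sketch of that composition follows. It is a toy model under stated assumptions: the function name `recombine_att_sites`, the string representation of the sites, and the cut indices `b_cut` and `p_cut` are all hypothetical, and the sketch does not model the recombinase or any shared core sequence at the crossover point.

```python
from typing import Tuple


def recombine_att_sites(
    att_b: str, att_p: str, b_cut: int, p_cut: int
) -> Tuple[str, str]:
    """Compose attL and attR from attB and attP, written 5' to 3'.

    b_cut and p_cut are hypothetical crossover positions: everything
    before the index is treated as the 5' portion of that site and
    everything from the index onward as its 3' portion.
    """
    # attL: 5' portion of attB followed by the 3' portion of attP.
    att_l = att_b[:b_cut] + att_p[p_cut:]
    # attR: 5' portion of attP followed by the 3' portion of attB.
    att_r = att_p[:p_cut] + att_b[b_cut:]
    return att_l, att_r
```

With placeholder strings such as `"BBBbbb"` cut at 3 and `"PPPPpppp"` cut at 4, the first returned value starts with the attB-derived prefix and the second with the attP-derived prefix, matching the two definitions.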
A region specifically recognised by a recombinase, which inserts or removes another region marked by a distinct cognate integration/excision site.
integration excision site
sequence
attachment site
SO:0000946
integration_excision_site
A region specifically recognised by a recombinase, which inserts or removes another region marked by a distinct cognate integration/excision site.
SO:as
A region specifically recognized by a recombinase, which separates a physically contiguous circle of DNA into two physically separate circles.
res site
resolution site
sequence
SO:0000947
resolution_site
A region specifically recognized by a recombinase, which separates a physically contiguous circle of DNA into two physically separate circles.
SO:as
A region specifically recognised by a recombinase, which inverts the region flanked by a pair of sites.
inversion site
sequence
SO:0000948
A target region for site-specific inversion of a DNA region and which carries binding sites for a site-specific recombinase and accessory proteins as well as the site for specific cleavage by the recombinase.
inversion_site
A region specifically recognised by a recombinase, which inverts the region flanked by a pair of sites.
SO:ma
A site at which replicated bacterial circular chromosomes are decatenated by site specific resolvase.
dif site
sequence
SO:0000949
dif_site
A site at which replicated bacterial circular chromosomes are decatenated by site specific resolvase.
SO:as
An attC site is a sequence required for the integration of a DNA of an integron.
attC site
sequence
SO:0000950
attC_site
An attC site is a sequence required for the integration of a DNA of an integron.
SO:as
A signal for RNA polymerase to terminate transcription.
eukaryotic terminator
sequence
SO:0000951
eukaryotic_terminator
An origin of vegetative replication in plasmids and phages.
origin of vegetative replication
sequence
SO:0000952
oriV
An origin of vegetative replication in plasmids and phages.
SO:as
An origin of bacterial chromosome replication.
origin of bacterial chromosome replication
sequence
SO:0000953
oriC
An origin of bacterial chromosome replication.
SO:as
Structural unit composed of a self-replicating DNA molecule.
DNA chromosome
sequence
SO:0000954
DNA_chromosome
Structural unit composed of a self-replicating DNA molecule.
SO:ma
Structural unit composed of a self-replicating, double-stranded DNA molecule.
double stranded DNA chromosome
sequence
SO:0000955
double_stranded_DNA_chromosome
Structural unit composed of a self-replicating, double-stranded DNA molecule.
SO:ma
Structural unit composed of a self-replicating, single-stranded DNA molecule.
single stranded DNA chromosome
sequence
SO:0000956
single_stranded_DNA_chromosome
Structural unit composed of a self-replicating, single-stranded DNA molecule.
SO:ma
Structural unit composed of a self-replicating, double-stranded, linear DNA molecule.
linear double stranded DNA chromosome
sequence
SO:0000957
linear_double_stranded_DNA_chromosome
Structural unit composed of a self-replicating, double-stranded, linear DNA molecule.
SO:ma
Structural unit composed of a self-replicating, double-stranded, circular DNA molecule.
circular double stranded DNA chromosome
sequence
SO:0000958
circular_double_stranded_DNA_chromosome
Structural unit composed of a self-replicating, double-stranded, circular DNA molecule.
SO:ma
Structural unit composed of a self-replicating, single-stranded, linear DNA molecule.
linear single stranded DNA chromosome
sequence
SO:0000959
linear_single_stranded_DNA_chromosome
Structural unit composed of a self-replicating, single-stranded, linear DNA molecule.
SO:ma
Structural unit composed of a self-replicating, single-stranded, circular DNA molecule.
circular single stranded DNA chromosome
sequence
SO:0000960
circular_single_stranded_DNA_chromosome
Structural unit composed of a self-replicating, single-stranded, circular DNA molecule.
SO:ma
Structural unit composed of a self-replicating RNA molecule.
RNA chromosome
sequence
SO:0000961
RNA_chromosome
Structural unit composed of a self-replicating RNA molecule.
SO:ma
Structural unit composed of a self-replicating, single-stranded RNA molecule.
single stranded RNA chromosome
sequence
SO:0000962
single_stranded_RNA_chromosome
Structural unit composed of a self-replicating, single-stranded RNA molecule.
SO:ma
Structural unit composed of a self-replicating, single-stranded, linear RNA molecule.
linear single stranded RNA chromosome
sequence
SO:0000963
linear_single_stranded_RNA_chromosome
Structural unit composed of a self-replicating, single-stranded, linear RNA molecule.
SO:ma
Structural unit composed of a self-replicating, double-stranded, linear RNA molecule.
linear double stranded RNA chromosome
sequence
SO:0000964
linear_double_stranded_RNA_chromosome
Structural unit composed of a self-replicating, double-stranded, linear RNA molecule.
SO:ma
Structural unit composed of a self-replicating, double-stranded RNA molecule.
double stranded RNA chromosome
sequence
SO:0000965
double_stranded_RNA_chromosome
Structural unit composed of a self-replicating, double-stranded RNA molecule.
SO:ma
Structural unit composed of a self-replicating, single-stranded, circular RNA molecule.
circular single stranded RNA chromosome
sequence
SO:0000966
circular_single_stranded_RNA_chromosome
Structural unit composed of a self-replicating, single-stranded, circular RNA molecule.
SO:ma
Structural unit composed of a self-replicating, double-stranded, circular RNA molecule.
circular double stranded RNA chromosome
sequence
SO:0000967
circular_double_stranded_RNA_chromosome
Structural unit composed of a self-replicating, double-stranded, circular RNA molecule.
SO:ma
sequence replication mode
sequence
SO:0000968
This has been obsoleted as it represents a process. replaced_by: GO:0034961.
sequence_replication_mode
true
http://en.wikipedia.org/wiki/Rolling_circle
rolling circle
sequence
SO:0000969
This has been obsoleted as it represents a process. replaced_by: GO:0070581.
rolling_circle
true
http://en.wikipedia.org/wiki/Rolling_circle
wiki
theta replication
sequence
SO:0000970
This has been obsoleted as it represents a process. replaced_by: GO:0070582.
theta_replication
true
DNA replication mode
sequence
SO:0000971
This has been obsoleted as it represents a process. replaced_by: GO:0006260.
DNA_replication_mode
true
RNA replication mode
sequence
SO:0000972
This has been obsoleted as it represents a process. replaced_by: GO:0034961.
RNA_replication_mode
true
A terminal_inverted_repeat_element that is bacterial and only encodes the functions required for its transposition between these inverted repeats.
http://en.wikipedia.org/wiki/Insertion_sequence
insertion sequence
sequence
IS
SO:0000973
insertion_sequence
A terminal_inverted_repeat_element that is bacterial and only encodes the functions required for its transposition between these inverted repeats.
SO:as
http://en.wikipedia.org/wiki/Insertion_sequence
wiki
true
A gene found within a minicircle.
minicircle gene
sequence
SO:0000975
minicircle_gene
A feature_attribute describing a feature that is not manifest under normal conditions.
sequence
SO:0000976
cryptic
A feature_attribute describing a feature that is not manifest under normal conditions.
SO:ke
anchor binding site
sequence
SO:0000977
Part of an edited transcript only.
anchor_binding_site
A region of a guide_RNA that specifies the insertions and deletions of bases in the editing of a target mRNA.
information region
template region
sequence
SO:0000978
template_region
A region of a guide_RNA that specifies the insertions and deletions of bases in the editing of a target mRNA.
SO:jk
A non-protein_coding gene that encodes a guide_RNA.
gRNA encoding
sequence
SO:0000979
gRNA_encoding
A non-protein_coding gene that encodes a guide_RNA.
SO:ma
A minicircle is a replicon, part of a kinetoplast, that encodes for guide RNAs.
SO:0000974
http://en.wikipedia.org/wiki/Minicircle
minicircle_chromosome
sequence
SO:0000980
minicircle
A minicircle is a replicon, part of a kinetoplast, that encodes for guide RNAs.
PMID:8395055
http://en.wikipedia.org/wiki/Minicircle
wiki
A transcription terminator that is dependent upon Rho.
rho dependent bacterial terminator
sequence
SO:0000981
rho_dependent_bacterial_terminator
A transcription terminator that is not dependent upon Rho. Rather, the mRNA contains a sequence that allows it to base-pair with itself and make a stem-loop structure.
rho independent bacterial terminator
sequence
SO:0000982
rho_independent_bacterial_terminator
The attribute of how many strands are present in a nucleotide polymer.
strand attribute
sequence
SO:0000983
Attributes added to describe the different kinds of replicon. SO workshop, September 2006.
strand_attribute
When a nucleotide polymer has only one strand.
sequence
SO:0000984
Attributes added to describe the different kinds of replicon. SO workshop, September 2006.
single
When a nucleotide polymer has two strands that are reverse-complement to one another and pair together.
sequence
SO:0000985
Attributes added to describe the different kinds of replicon. SO workshop, September 2006.
double
The attribute of whether a nucleotide polymer is linear or circular.
topology attribute
sequence
SO:0000986
Attributes added to describe the different kinds of replicon. SO workshop, September 2006.
topology_attribute
A quality of a nucleotide polymer that has a 3'-terminal residue and a 5'-terminal residue.
sequence
two-ended
SO:0000987
Attributes added to describe the different kinds of replicon. SO workshop, September 2006.
linear
A quality of a nucleotide polymer that has a 3'-terminal residue and a 5'-terminal residue.
SO:cb
A quality of a nucleotide polymer that has no terminal nucleotide residues.
sequence
zero-ended
SO:0000988
Attributes added to describe the different kinds of replicon. SO workshop, September 2006.
circular
A quality of a nucleotide polymer that has no terminal nucleotide residues.
SO:cb
Small non-coding RNA (59-60 nt long) containing 5' and 3' ends that are predicted to come together to form a stem structure. Identified in the social amoeba Dictyostelium discoideum and localized in the cytoplasm.
class II RNA
sequence
SO:0000989
class_II_RNA
Small non-coding RNA (59-60 nt long) containing 5' and 3' ends that are predicted to come together to form a stem structure. Identified in the social amoeba Dictyostelium discoideum and localized in the cytoplasm.
PMID:15333696
Small non-coding RNA (55-65 nt long) containing highly conserved 5' and 3' ends (16 and 8 nt, respectively) that are predicted to come together to form a stem structure. Identified in the social amoeba Dictyostelium discoideum and localized in the cytoplasm.
class I RNA
sequence
SO:0000990
Requested by Karen Pilcher - Dictybase. song-Term Tracker-1574577.
class_I_RNA
Small non-coding RNA (55-65 nt long) containing highly conserved 5' and 3' ends (16 and 8 nt, respectively) that are predicted to come together to form a stem structure. Identified in the social amoeba Dictyostelium discoideum and localized in the cytoplasm.
PMID:15333696
DNA located in the genome and able to be transmitted to the offspring.
gDNA
genomic DNA
sequence
SO:0000991
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
genomic_DNA
DNA located in the genome and able to be transmitted to the offspring.
BCS:etrwz
A region of DNA that has been inserted into the bacterial genome using a bacterial artificial chromosome.
BAC cloned genomic insert
sequence
SO:0000992
Requested by Andy Schroder - Flybase Harvard, Nov 2006.
BAC_cloned_genomic_insert
A sequence produced from an alignment algorithm that uses multiple sequences as input.
sequence
SO:0000993
Term added Dec 06 to comply with mapping to MGED terms. It should be used to generate consensus regions. The specific cross product terms they require are consensus_region and consensus_mRNA.
consensus
A region that has a known consensus sequence.
consensus region
sequence
SO:0000994
Do not obsolete without considering MGED mapping.
consensus_region
An mRNA sequence produced from an alignment algorithm that uses multiple sequences as input.
consensus mRNA
sequence
SO:0000995
Do not obsolete without considering MGED mapping.
consensus_mRNA
A region of the genome that has been predicted to be a gene but has not been confirmed by laboratory experiments.
predicted gene
sequence
SO:0000996
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
predicted_gene
A portion of a gene that is not the complete gene.
gene fragment
sequence
SO:0000997
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
gene_fragment
A recursive splice site is a splice site which subdivides a large intron. Recursive splicing is a mechanism that splices large introns by subdividing the intron at non-exonic elements and alternate exons.
recursive splice site
sequence
SO:0000998
recursive_splice_site
A recursive splice site is a splice site which subdivides a large intron. Recursive splicing is a mechanism that splices large introns by subdividing the intron at non-exonic elements and alternate exons.
http://www.genetics.org/cgi/content/full/170/2/661
A region of sequence from the end of a BAC clone that may provide a highly specific marker.
BAC end
BAC end sequence
BES
sequence
SO:0000999
Requested by Keith Boroevich December, 2006.
BAC_end
A region of sequence from the end of a BAC clone that may provide a highly specific marker.
SO:ke
Cytosolic 16S rRNA is an RNA component of the small subunit of cytosolic ribosomes in prokaryotes.
http://en.wikipedia.org/wiki/16S_ribosomal_RNA
cytosolic 16S SSU RNA
cytosolic 16S ribosomal RNA
cytosolic rRNA 16S
sequence
cytosolic 16S rRNA
SO:0001000
Renamed to cytosolic_16S_rRNA from rRNA_16S on 10 June 2021 as per restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Request from EBI. See GitHub Issue #493.
cytosolic_16S_rRNA
Cytosolic 16S rRNA is an RNA component of the small subunit of cytosolic ribosomes in prokaryotes.
SO:ke
http://en.wikipedia.org/wiki/16S_ribosomal_RNA
wiki
Cytosolic 23S rRNA is an RNA component of the large subunit of cytosolic ribosomes in prokaryotes.
cytosolic 23S LSU rRNA
cytosolic 23S rRNA
cytosolic rRNA 23S
sequence
cytosolic 23S ribosomal RNA
SO:0001001
Renamed from rRNA_23S to cytosolic_23S_rRNA on 27 May 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493.
cytosolic_23S_rRNA
Cytosolic 23S rRNA is an RNA component of the large subunit of cytosolic ribosomes in prokaryotes.
SO:ke
Cytosolic 25S rRNA is an RNA component of the large subunit of cytosolic ribosomes in most eukaryotes.
cytosolic 25S LSU rRNA
cytosolic 25S rRNA
cytosolic 25S ribosomal RNA
cytosolic rRNA 25S
sequence
SO:0001002
Renamed from rRNA_25S to cytosolic_25S_rRNA on 27 May 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493.
cytosolic_25S_rRNA
Cytosolic 25S rRNA is an RNA component of the large subunit of cytosolic ribosomes in most eukaryotes.
PMID:15493135
PMID:2100998
RSC:cb
A recombination product between the two LTRs of the same element.
solo LTR
sequence
SO:0001003
Requested by Hadi Quesneville January 2007.
solo_LTR
A recombination product between the two LTRs of the same element.
SO:ke
When a sequence does not contain an equal distribution of all four possible nucleotide bases or does not contain all nucleotide bases.
low complexity
sequence
SO:0001004
low_complexity
A region where the DNA does not contain an equal distribution of all four possible nucleotides or does not contain all four nucleotides.
low complexity region
sequence
SO:0001005
low_complexity_region
A phage genome after it has established in the host genome in a latent/immune state either as a plasmid or as an integrated "island".
http://en.wikipedia.org/wiki/Prophage
sequence
SO:0001006
prophage
A phage genome after it has established in the host genome in a latent/immune state either as a plasmid or as an integrated "island".
GOC:jl
http://en.wikipedia.org/wiki/Prophage
wiki
A remnant of an integrated prophage in the host genome or an "island" in the host genome that includes phage-like genes.
http://ecoliwiki.net/colipedia/index.php/Category:Cryptic_Prophage.w
cryptic prophage
sequence
SO:0001007
This is not cryptic in the same sense as a cryptic gene or cryptic splice site.
cryptic_prophage
A remnant of an integrated prophage in the host genome or an "island" in the host genome that includes phage-like genes.
GOC:jl
A base-paired stem with loop of 4 non-hydrogen bonded nucleotides.
http://en.wikipedia.org/wiki/Tetraloop
sequence
SO:0001008
tetraloop
A base-paired stem with loop of 4 non-hydrogen bonded nucleotides.
SO:ke
http://en.wikipedia.org/wiki/Tetraloop
wiki
A double-stranded DNA used to control macromolecular structure and function.
DNA constraint
DNA constraint sequence
sequence
SO:0001009
DNA_constraint_sequence
A double-stranded DNA used to control macromolecular structure and function.
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=search&db=pubmed&term=SILVERMAN+SK[au]&dispmax=50
A cytosine rich domain whereby strands associate both inter- and intramolecularly at moderately acidic pH.
i motif
short intercalated motif
sequence
SO:0001010
i_motif
A cytosine rich domain whereby strands associate both inter- and intramolecularly at moderately acidic pH.
PMID:9753739
Peptide nucleic acid is a chemical not known to occur naturally but is artificially synthesized and used in some biological research and medical treatments. The PNA backbone is composed of repeating N-(2-aminoethyl)-glycine units linked by peptide bonds. The purine and pyrimidine bases are linked to the backbone by methylene carbonyl bonds.
http://en.wikipedia.org/wiki/Peptide_nucleic_acid
PNA oligo
peptide nucleic acid
sequence
SO:0001011
PNA_oligo
Peptide nucleic acid is a chemical not known to occur naturally but is artificially synthesized and used in some biological research and medical treatments. The PNA backbone is composed of repeating N-(2-aminoethyl)-glycine units linked by peptide bonds. The purine and pyrimidine bases are linked to the backbone by methylene carbonyl bonds.
SO:ke
http://en.wikipedia.org/wiki/Peptide_nucleic_acid
wiki
A DNA sequence with catalytic activity.
DNA enzyme
catalytic DNA
sequence
deoxyribozyme
SO:0001012
Added by request from Colin Batchelor.
DNAzyme
A DNA sequence with catalytic activity.
SO:cb
A multiple nucleotide polymorphism with alleles of common length > 1, for example AAA/TTT.
sequence
multiple nucleotide polymorphism
SO:0001013
MNP
A multiple nucleotide polymorphism with alleles of common length > 1, for example AAA/TTT.
http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=rs2067431
An intronic region that has an attribute.
intron domain
sequence
SO:0001014
Requested by Colin Batchelor, Feb 2007.
intron_domain
A type of non-canonical base pairing, most commonly between G and U, which is important for the secondary structure of RNAs. It has similar thermodynamic stability to the Watson-Crick pairing. Wobble base pairs only have two hydrogen bonds. Other wobble base pair possibilities are I-A, I-U and I-C.
http://en.wikipedia.org/wiki/Wobble_base_pair
wobble base pair
wobble pair
sequence
SO:0001015
wobble_base_pair
A type of non-canonical base pairing, most commonly between G and U, which is important for the secondary structure of RNAs. It has similar thermodynamic stability to the Watson-Crick pairing. Wobble base pairs only have two hydrogen bonds. Other wobble base pair possibilities are I-A, I-U and I-C.
PMID:11256617
http://en.wikipedia.org/wiki/Wobble_base_pair
wiki
A purine-rich sequence in the group I introns which determines the locations of the splice sites in group I intron splicing and has catalytic activity.
IGS
internal guide sequence
sequence
SO:0001016
internal_guide_sequence
A purine-rich sequence in the group I introns which determines the locations of the splice sites in group I intron splicing and has catalytic activity.
SO:cb
A sequence variant that does not affect protein function. Silent mutations may occur in genic (CDS, UTR, intron, etc.) and intergenic regions. Silent mutations may have effects on processes such as splicing and regulation.
http://en.wikipedia.org/wiki/Silent_mutation
loinc:LA6700-4
silent mutation
sequence
SO:0001017
Added in March 2007 after a meeting with PharmGKB. Although this term is in common usage, it is better to annotate with the most specific term possible, such as synonymous codon, intron variant, etc.
silent_mutation
A sequence variant that does not affect protein function. Silent mutations may occur in genic (CDS, UTR, intron, etc.) and intergenic regions. Silent mutations may have effects on processes such as splicing and regulation.
SO:ke
http://en.wikipedia.org/wiki/Silent_mutation
wiki
loinc:LA6700-4
Silent
A binding site that, in the molecule, interacts selectively and non-covalently with antibodies, B cells or T cells.
http://en.wikipedia.org/wiki/Epitope
sequence
SO:0001018
Requested by Trish Whetzel.
epitope
A binding site that, in the molecule, interacts selectively and non-covalently with antibodies, B cells or T cells.
SO:cb
http://en.wikipedia.org/wiki/Epitope
http://en.wikipedia.org/wiki/Epitope
wiki
A variation that increases or decreases the copy number of a given region.
http://en.wikipedia.org/wiki/Copy_number_variation
CNP
CNV
copy number polymorphism
copy number variation
sequence
SO:0001019
copy_number_variation
A variation that increases or decreases the copy number of a given region.
SO:ke
http://en.wikipedia.org/wiki/Copy_number_variation
wiki
SO:0001563
mutation affecting copy number
sequence variant affecting copy number
sequence
SO:0001020
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_copy_number
true
A chromosomal region that may sustain a double-strand break, resulting in a recombination event.
SO:0001242
INSDC_feature:misc_recomb
INSDC_qualifier:chromosome_breakpoint
aberration breakpoint
aberration_junction
chromosome breakpoint
sequence
SO:0001021
chromosome_breakpoint
The point within a chromosome where an inversion begins or ends.
inversion breakpoint
sequence
SO:0001022
inversion_breakpoint
The point within a chromosome where an inversion begins or ends.
SO:cb
An allele is one of a set of coexisting sequence variants of a gene.
http://en.wikipedia.org/wiki/Allele
allelomorph
sequence
SO:0001023
allele
An allele is one of a set of coexisting sequence variants of a gene.
SO:immuno_workshop
http://en.wikipedia.org/wiki/Allele
wiki
A haplotype is one of a set of coexisting sequence variants of a haplotype block.
http://en.wikipedia.org/wiki/Haplotype
sequence
SO:0001024
haplotype
A haplotype is one of a set of coexisting sequence variants of a haplotype block.
SO:immuno_workshop
http://en.wikipedia.org/wiki/Haplotype
wiki
A sequence variant that is segregating in one or more natural populations of a species.
polymorphic sequence variant
sequence
SO:0001025
polymorphic_sequence_variant
A sequence variant that is segregating in one or more natural populations of a species.
SO:immuno_workshop
A genome is the sum of genetic material within a cell or virion.
http://en.wikipedia.org/wiki/Genome
sequence
SO:0001026
genome
A genome is the sum of genetic material within a cell or virion.
SO:immuno_workshop
http://en.wikipedia.org/wiki/Genome
wiki
A genotype is a variant genome, complete or incomplete.
http://en.wikipedia.org/wiki/Genotype
sequence
SO:0001027
genotype
A genotype is a variant genome, complete or incomplete.
SO:immuno_workshop
http://en.wikipedia.org/wiki/Genotype
wiki
A diplotype is a pair of haplotypes from a given individual. It is a genotype where the phase is known.
sequence
SO:0001028
diplotype
A diplotype is a pair of haplotypes from a given individual. It is a genotype where the phase is known.
SO:immuno_workshop
The attribute of whether the sequence is the same direction as a feature (forward) or the opposite direction as a feature (reverse).
direction attribute
sequence
SO:0001029
direction_attribute
Forward is an attribute of the feature, where the feature is in the 5' to 3' direction.
sequence
SO:0001030
forward
Forward is an attribute of the feature, where the feature is in the 5' to 3' direction.
SO:ke
Reverse is an attribute of the feature, where the feature is in the 3' to 5' direction. Again could be applied to primer.
sequence
SO:0001031
reverse
Reverse is an attribute of the feature, where the feature is in the 3' to 5' direction. Again could be applied to primer.
SO:ke
DNA belonging to the genome of a mitochondrion.
http://en.wikipedia.org/wiki/Mitochondrial_DNA
mitochondrial DNA
mtDNA
sequence
SO:0001032
This term is used by MO.
mitochondrial_DNA
http://en.wikipedia.org/wiki/Mitochondrial_DNA
wiki
DNA belonging to the genome of a chloroplast, a photosynthetic plastid.
chloroplast DNA
sequence
SO:0001033
This term is used by MO.
chloroplast_DNA
A de-branched intron which mimics the structure of pre-miRNA and enters the miRNA processing pathway without Drosha mediated cleavage.
sequence
SO:0001034
Ruby et al. Nature 448:83 describe a new class of miRNAs that are derived from de-branched introns.
miRtron
A de-branched intron which mimics the structure of pre-miRNA and enters the miRNA processing pathway without Drosha mediated cleavage.
PMID:17589500
SO:ma
A small non coding RNA, part of a silencing system that prevents the spreading of selfish genetic elements.
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/PiRNA
INSDC_qualifier:piRNA
piwi-associated RNA
sequence
SO:0001035
piRNA
A small non coding RNA, part of a silencing system that prevents the spreading of selfish genetic elements.
SO:ke
http://en.wikipedia.org/wiki/PiRNA
wiki
A tRNA sequence that has an arginine anticodon, and a 3' arginine binding region.
arginyl tRNA
sequence
SO:0001036
arginyl_tRNA
A tRNA sequence that has an arginine anticodon, and a 3' arginine binding region.
SO:ke
A nucleotide region with either intra-genome or intracellular mobility, of varying length, which often carries the information necessary for transfer and recombination with the host genome.
http://en.wikipedia.org/wiki/Mobile_genetic_element
INSDC_feature:mobile_element
MGE
mobile genetic element
sequence
SO:0001037
mobile_genetic_element
A nucleotide region with either intra-genome or intracellular mobility, of varying length, which often carries the information necessary for transfer and recombination with the host genome.
PMID:14681355
http://en.wikipedia.org/wiki/Mobile_genetic_element
wiki
An MGE that is not integrated into the host chromosome.
extrachromosomal mobile genetic element
sequence
SO:0001038
extrachromosomal_mobile_genetic_element
An MGE that is not integrated into the host chromosome.
SO:ke
An MGE that is integrated into the host chromosome.
integrated mobile genetic element
sequence
SO:0001039
integrated_mobile_genetic_element
An MGE that is integrated into the host chromosome.
SO:ke
A plasmid sequence that is integrated within the host chromosome.
integrated plasmid
sequence
SO:0001040
integrated_plasmid
A plasmid sequence that is integrated within the host chromosome.
SO:ke
The region of nucleotide sequence of a virus, a submicroscopic particle that replicates by infecting a host cell.
viral sequence
virus sequence
sequence
SO:0001041
The definitions of the children of this term were revised December 2007 after discussion on song-devel. The resulting definitions are slightly unwieldy but hopefully more logically correct.
viral_sequence
The region of nucleotide sequence of a virus, a submicroscopic particle that replicates by infecting a host cell.
SO:ke
The nucleotide sequence of a virus that infects bacteria.
http://en.wikipedia.org/wiki/Bacteriophage
bacteriophage
phage
phage sequence
sequence
SO:0001042
phage_sequence
The nucleotide sequence of a virus that infects bacteria.
SO:ke
http://en.wikipedia.org/wiki/Bacteriophage
wiki
An attachment site located on a conjugative transposon and used for site-specific integration of a conjugative transposon.
attCtn site
sequence
SO:0001043
attCtn_site
An attachment site located on a conjugative transposon and used for site-specific integration of a conjugative transposon.
Phigo:at
A nuclear pseudogene of either coding or non-coding mitochondria derived sequence.
http://en.wikipedia.org/wiki/Numt
NUMT
nuclear mitochondrial pseudogene
nuclear mt pseudogene
sequence
SO:0001044
Definition change requested by Val, 3172757.
nuclear_mt_pseudogene
A nuclear pseudogene of either coding or non-coding mitochondria derived sequence.
SO:xp
http://en.wikipedia.org/wiki/Numt
wikipedia
An MGE region consisting of two fused plasmids resulting from a replicative transposition event.
cointegrated plasmid
cointegrated replicon
sequence
SO:0001045
cointegrated_plasmid
An MGE region consisting of two fused plasmids resulting from a replicative transposition event.
phigo:at
Component of the inversion site located at the left of a region susceptible to site-specific inversion.
IRLinv site
sequence
SO:0001046
IRLinv_site
Component of the inversion site located at the left of a region susceptible to site-specific inversion.
Phigo:at
Component of the inversion site located at the right of a region susceptible to site-specific inversion.
IRRinv site
sequence
SO:0001047
IRRinv_site
Component of the inversion site located at the right of a region susceptible to site-specific inversion.
Phigo:at
A region located within an inversion site.
inversion site part
sequence
SO:0001048
A term created to allow the parts of an inversion site to have an is_a path back to the root.
inversion_site_part
A region located within an inversion site.
SO:ke
An island that contains genes for integration/excision and the gene and site for the initiation of intercellular transfer by conjugation. It can be complemented for transfer by a conjugative transposon.
defective conjugative transposon
sequence
SO:0001049
defective_conjugative_transposon
An island that contains genes for integration/excision and the gene and site for the initiation of intercellular transfer by conjugation. It can be complemented for transfer by a conjugative transposon.
Phigo:ariane
A portion of a repeat, interrupted by the insertion of another element.
repeat fragment
sequence
SO:0001050
Requested by Chris Smith, and others at Flybase to help annotate nested repeats.
repeat_fragment
A portion of a repeat, interrupted by the insertion of another element.
SO:ke
sequence
SO:0001051
nested_region
true
sequence
SO:0001052
nested_repeat
true
sequence
SO:0001053
nested_transposon
true
A portion of a transposon, interrupted by the insertion of another element.
transposon fragment
sequence
SO:0001054
transposon_fragment
A portion of a transposon, interrupted by the insertion of another element.
SO:ke
A regulatory_region that modulates the transcription of a gene or genes.
INSDC_feature:regulatory
INSDC_qualifier:transcriptional_cis_regulatory_region
transcription-control region
transcriptional cis regulatory region
sequence
SO:0001055
Previous parent term transcription_regulatory_region (SO:0001067) has been merged with this term on 11 Feb 2021 as part of the GREEKC consortium. See GitHub Issue #527.
transcriptional_cis_regulatory_region
A regulatory_region that modulates the transcription of a gene or genes.
PMID:9679020
SO:regcreative
A regulatory_region that modulates splicing.
splicing regulatory region
sequence
SO:0001056
Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527.
splicing_regulatory_region
A regulatory_region that modulates splicing.
SO:ke
sequence
SO:0001057
enhanceosome
true
A transcriptional_cis_regulatory_region that restricts the activity of a CRM to a single promoter and which functions only when both itself and an insulator are located between the CRM and the promoter.
promoter targeting sequence
sequence
SO:0001058
Obsoleted Jan 21, 2021 by Dave Sant. GREEKC consortium individuals pointed out that this did not fit with the other child terms of transcriptional_cis_regulatory_region (SO:0001055), which are currently promoter, CRM and promoter flanking region. No comments about when this term was created exist, and no references are listed. GREEKC members assume that this was previously under enhanceosome (SO:0001057), which was probably created along with this term but has since been obsoleted. This term can be resurrected as non-obsolete if we can find a reference publication and/or change the name to a term that is commonly used in the field.
promoter_targeting_sequence
true
A transcriptional_cis_regulatory_region that restricts the activity of a CRM to a single promoter and which functions only when both itself and an insulator are located between the CRM and the promoter.
SO:regcreative
A sequence_alteration is a sequence_feature whose extent is the deviation from another sequence.
SO:1000004
SO:1000007
INSDC_feature:misc_feature
INSDC_feature:variation
INSDC_note:sequence_alteration
sequence alteration
partially characterised change in DNA sequence
partially_characterised_change_in_DNA_sequence
uncharacterised_change_in_nucleotide_sequence
sequence
sequence variation
SO:0001059
Merged with partially characterized change in nucleotide sequence.
sequence_alteration
A sequence_alteration is a sequence_feature whose extent is the deviation from another sequence.
SO:ke
A sequence_variant is a non exact copy of a sequence_feature or genome exhibiting one or more sequence_alteration.
Jannovar:sequence_variant
VAAST:sequence_variant
sequence variant
sequence
ANNOVAR:unknown
SO:0001060
sequence_variant
A sequence_variant is a non exact copy of a sequence_feature or genome exhibiting one or more sequence_alteration.
SO:ke
Jannovar:sequence_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VAAST:sequence_variant
ANNOVAR:unknown
http://www.openbioinformatics.org/annovar/annovar_download.html
The propeptide_cleavage_site is the arginine/lysine boundary on a propeptide where cleavage occurs.
BS:00063
propeptide cleavage site
sequence
SO:0001061
Discrete.
propeptide_cleavage_site
The propeptide_cleavage_site is the arginine/lysine boundary on a propeptide where cleavage occurs.
EBIBS:GAR
Part of a peptide chain which is cleaved off during the formation of the mature protein.
BS:00077
http://en.wikipedia.org/wiki/Propeptide
INSDC_feature:propeptide
sequence
propep
SO:0001062
Range.
propeptide
Part of a peptide chain which is cleaved off during the formation of the mature protein.
EBIBS:GAR
http://en.wikipedia.org/wiki/Propeptide
wiki
propep
uniprot:feature_type
An immature_peptide_region is the extent of the peptide after it has been translated and before any processing occurs.
BS:00129
immature peptide region
sequence
SO:0001063
Range.
immature_peptide_region
An immature_peptide_region is the extent of the peptide after it has been translated and before any processing occurs.
EBIBS:GAR
Active peptides are proteins which are biologically active, released from a precursor molecule.
BS:00076
peptide
http://en.wikipedia.org/wiki/Peptide
active peptide
sequence
SO:0001064
Hormones, neuropeptides, and antimicrobial peptides are active peptides. They are typically short (<40 amino acids).
active_peptide
Active peptides are proteins which are biologically active, released from a precursor molecule.
EBIBS:GAR
UniProt:curation_manual
peptide
uniprot:feature_type
http://en.wikipedia.org/wiki/Peptide
wiki
true
Polypeptide region that is rich in a particular amino acid or homopolymeric and greater than three residues in length.
BS:00068
compositionally_biased_region
sequence
compbias
compositional bias
compositionally biased
compositionally biased region of peptide
SO:0001066
Range.
compositionally_biased_region_of_peptide
Polypeptide region that is rich in a particular amino acid or homopolymeric and greater than three residues in length.
EBIBS:GAR
UniProt:curation_manual
compbias
uniprot:feature_type
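The homopolymeric case of compositionally_biased_region_of_peptide can be illustrated with a short sketch. It reads "greater than three residues" as a run of at least four identical residues, and it does not attempt the "rich in a particular amino acid" case, which would need a scoring threshold the definition does not give. The function name and the 0-based, end-exclusive coordinates are assumptions made for illustration only.

```python
from itertools import groupby


def homopolymeric_runs(sequence, min_length=4):
    """Return (start, end, residue) for each run of identical residues.

    Coordinates are 0-based with an exclusive end (an assumption of this
    sketch). min_length=4 encodes "greater than three residues".
    """
    runs = []
    position = 0
    for residue, group in groupby(sequence):
        length = len(list(group))
        if length >= min_length:
            runs.append((position, position + length, residue))
        position += length
    return runs


if __name__ == "__main__":
    # Hypothetical peptide containing a five-residue glutamine run.
    print(homopolymeric_runs("MAQQQQQKLL"))
```

A run of exactly three identical residues is not reported, because the definition requires more than three.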
A sequence motif is a short (up to 20 amino acids) region of biological interest. Such motifs, although they are too short to constitute functional domains, share sequence similarities and are conserved in different proteins. They display a common function (protein-binding, subcellular location etc.).
BS:00032
motif
polypeptide motif
sequence
SO:0001067
Range.
polypeptide_motif
A sequence motif is a short (up to 20 amino acids) region of biological interest. Such motifs, although they are too short to constitute functional domains, share sequence similarities and are conserved in different proteins. They display a common function (protein-binding, subcellular location etc.).
EBIBS:GAR
UniProt:curation_manual
motif
uniprot:feature_type
A polypeptide_repeat is a single copy of an internal sequence repetition.
BS:00070
polypeptide repeat
sequence
repeat
SO:0001068
Range.
polypeptide_repeat
A polypeptide_repeat is a single copy of an internal sequence repetition.
EBIBS:GAR
repeat
uniprot:feature_type
true
Region of polypeptide with a given structural property.
BS:00337
polypeptide structural region
sequence
structural_region
SO:0001070
Range.
polypeptide_structural_region
Region of polypeptide with a given structural property.
EBIBS:GAR
SO:cb
Arrangement of the polypeptide with respect to the lipid bilayer.
BS:00128
membrane structure
sequence
SO:0001071
Range.
membrane_structure
Arrangement of the polypeptide with respect to the lipid bilayer.
EBIBS:GAR
Polypeptide region that is localized outside of a lipid bilayer.
BS:00154
extramembrane polypeptide region
sequence
extramembrane
extramembrane_region
topo_dom
SO:0001072
Range.
extramembrane_polypeptide_region
Polypeptide region that is localized outside of a lipid bilayer.
EBIBS:GAR
SO:cb
extramembrane
extramembrane_region
topo_dom
uniprot:feature_type
Polypeptide region that is localized inside the cytoplasm.
BS:00145
cytoplasm_location
cytoplasmic polypeptide region
sequence
inside
SO:0001073
cytoplasmic_polypeptide_region
Polypeptide region that is localized inside the cytoplasm.
EBIBS:GAR
SO:cb
cytoplasm_location
inside
Polypeptide region that is localized outside of a lipid bilayer and outside of the cytoplasm.
BS:00144
non cytoplasmic polypeptide region
non_cytoplasm_location
sequence
outside
SO:0001074
This could be inside an organelle within the cell.
non_cytoplasmic_polypeptide_region
Polypeptide region that is localized outside of a lipid bilayer and outside of the cytoplasm.
EBIBS:GAR
SO:cb
non_cytoplasm_location
outside
Polypeptide region present in the lipid bilayer.
BS:00156
intramembrane polypeptide region
sequence
intramembrane
SO:0001075
intramembrane_polypeptide_region
Polypeptide region present in the lipid bilayer.
EBIBS:GAR
intramembrane
Polypeptide region localized within the lipid bilayer where both ends traverse the same membrane.
BS:00155
membrane peptide loop
sequence
membrane_loop
SO:0001076
membrane_peptide_loop
Polypeptide region localized within the lipid bilayer where both ends traverse the same membrane.
EBIBS:GAR
SO:cb
membrane_loop
Polypeptide region traversing the lipid bilayer.
BS:00158
transmembrane polypeptide region
sequence
transmem
transmembrane
SO:0001077
transmembrane_polypeptide_region
Polypeptide region traversing the lipid bilayer.
EBIBS:GAR
UniProt:curator_manual
transmem
uniprot:feature_type
transmembrane
A region of peptide with secondary structure has hydrogen bonding along the peptide chain that causes a defined conformation of the chain.
BS:00003
http://en.wikipedia.org/wiki/Secondary_structure
polypeptide secondary structure
sequence
2nary structure
secondary structure
secondary structure region
secondary_structure
SO:0001078
Biosapiens term was secondary_structure.
polypeptide_secondary_structure
A region of peptide with secondary structure has hydrogen bonding along the peptide chain that causes a defined conformation of the chain.
EBIBS:GAR
http://en.wikipedia.org/wiki/Secondary_structure
wiki
2nary structure
secondary structure
secondary structure region
secondary_structure
A structural motif is a three-dimensional structural element within the chain that also appears in a variety of other molecules. Unlike a domain, a motif does not need to form a stable globular unit.
BS:0000338
http://en.wikipedia.org/wiki/Structural_motif
sequence
polypeptide structural motif
structural_motif
SO:0001079
polypeptide_structural_motif
A structural motif is a three-dimensional structural element within the chain that also appears in a variety of other molecules. Unlike a domain, a motif does not need to form a stable globular unit.
EBIBS:GAR
http://en.wikipedia.org/wiki/Structural_motif
wiki
structural_motif
A coiled coil is a structural motif in proteins, in which alpha-helices are coiled together like the strands of a rope.
BS:00041
http://en.wikipedia.org/wiki/Coiled_coil
coiled coil
sequence
coiled
SO:0001080
Range.
coiled_coil
A coiled coil is a structural motif in proteins, in which alpha-helices are coiled together like the strands of a rope.
EBIBS:GAR
UniProt:curation_manual
http://en.wikipedia.org/wiki/Coiled_coil
wiki
coiled
uniprot:feature_type
A motif comprising two helices separated by a turn.
BS:00147
helix turn helix
helix-turn-helix
sequence
HTH
SO:0001081
helix_turn_helix
A motif comprising two helices separated by a turn.
EBIBS:GAR
HTH
Incompatibility in the sequence due to some experimental problem.
BS:00125
sequencing_information
sequence
SO:0001082
Range.
polypeptide_sequencing_information
Incompatibility in the sequence due to some experimental problem.
EBIBS:GAR
Indicates that two consecutive residues in a fragment sequence are not consecutive in the full-length protein and that there are a number of unsequenced residues between them.
BS:00182
non consecutive
non_cons
sequence
SO:0001083
non_adjacent_residues
Indicates that two consecutive residues in a fragment sequence are not consecutive in the full-length protein and that there are a number of unsequenced residues between them.
EBIBS:GAR
UniProt:curation_manual
non_cons
uniprot:feature_type
The residue at an extremity of the sequence is not the terminal residue.
BS:00072
non terminal
non_ter
sequence
SO:0001084
Discrete.
non_terminal_residue
The residue at an extremity of the sequence is not the terminal residue.
EBIBS:GAR
UniProt:curation_manual
non_ter
uniprot:feature_type
Different sources report differing sequences.
BS:00069
conflict
sequence
SO:0001085
Discrete.
sequence_conflict
Different sources report differing sequences.
EBIBS:GAR
UniProt:curation_manual
conflict
uniprot:feature_type
Describes the positions in a sequence where the authors are unsure about the sequence assignment.
BS:00181
INSDC_feature:unsure
unsure
sequence
SO:0001086
sequence_uncertainty
Describes the positions in a sequence where the authors are unsure about the sequence assignment.
EBIBS:GAR
UniProt:curation_manual
unsure
uniprot:feature_type
Posttranslationally formed amino acid bonds.
BS:00178
cross link
sequence
crosslink
SO:0001087
cross_link
true
Posttranslationally formed amino acid bonds.
EBIBS:GAR
UniProt:curation_manual
The covalent bond between sulfur atoms that binds two peptide chains or different parts of one peptide chain and is a structural determinant in many protein molecules.
BS:00028
disulphide
sequence
disulfid
disulfide
disulfide bond
disulphide bond
SO:0001088
2 discrete & joined.
disulfide_bond
true
The covalent bond between sulfur atoms that binds two peptide chains or different parts of one peptide chain and is a structural determinant in many protein molecules.
EBIBS:GAR
UniProt:curation_manual
A region where a transformation occurs in a protein after it has been synthesized. This may regulate, stabilize, crosslink or introduce new chemical functionalities in the protein.
BS:00052
http://en.wikipedia.org/wiki/Post_translational_modification
mod_res
modified residue
post_translational_modification
sequence
SO:0001089
Discrete.
post_translationally_modified_region
A region where a transformation occurs in a protein after it has been synthesized. This may regulate, stabilize, crosslink or introduce new chemical functionalities in the protein.
EBIBS:GAR
UniProt:curation_manual
http://en.wikipedia.org/wiki/Post_translational_modification
wiki
mod_res
uniprot:feature_type
Binding involving a covalent bond.
BS:00246
covalent binding site
sequence
SO:0001090
covalent_binding_site
true
Binding involving a covalent bond.
EBIBS:GAR
Binding site for any chemical group (co-enzyme, prosthetic group, etc.).
BS:00029
non covalent binding site
sequence
binding
binding site
SO:0001091
Discrete.
non_covalent_binding_site
true
Binding site for any chemical group (co-enzyme, prosthetic group, etc.).
EBIBS:GAR
binding
uniprot:curation
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with metal ions.
BS:00027
sequence
metal_binding
SO:0001092
Residue is part of a binding site for a metal ion.
polypeptide_metal_contact
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with metal ions.
EBIBS:GAR
SO:cb
UniProt:curation_manual
A binding site that, in the protein molecule, interacts selectively and non-covalently with polypeptide residues.
BS:00131
http://en.wikipedia.org/wiki/Protein_protein_interaction
protein protein contact
protein protein contact site
sequence
protein_protein_interaction
SO:0001093
protein_protein_contact
A binding site that, in the protein molecule, interacts selectively and non-covalently with polypeptide residues.
EBIBS:GAR
UniProt:Curation_manual
http://en.wikipedia.org/wiki/Protein_protein_interaction
wiki
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with calcium ions.
BS:00186
Ca_contact_site
ca_bind
polypeptide calcium ion contact site
sequence
ca bind
SO:0001094
Residue involved in contact with calcium.
polypeptide_calcium_ion_contact_site
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with calcium ions.
EBIBS:GAR
ca_bind
uniprot:feature_type
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with cobalt ions.
BS:00136
Co_contact_site
polypeptide cobalt ion contact site
sequence
SO:0001095
polypeptide_cobalt_ion_contact_site
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with cobalt ions.
EBIBS:GAR
SO:cb
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with copper ions.
BS:00146
Cu_contact_site
polypeptide copper ion contact site
sequence
SO:0001096
polypeptide_copper_ion_contact_site
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with copper ions.
EBIBS:GAR
SO:cb
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with iron ions.
BS:00137
Fe_contact_site
polypeptide iron ion contact site
sequence
SO:0001097
polypeptide_iron_ion_contact_site
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with iron ions.
EBIBS:GAR
SO:cb
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with magnesium ions.
BS:00187
Mg_contact_site
polypeptide magnesium ion contact site
sequence
SO:0001098
polypeptide_magnesium_ion_contact_site
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with magnesium ions.
EBIBS:GAR
SO:cb
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with manganese ions.
BS:00140
Mn_contact_site
polypeptide manganese ion contact site
sequence
SO:0001099
polypeptide_manganese_ion_contact_site
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with manganese ions.
EBIBS:GAR
SO:cb
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with molybdenum ions.
BS:00141
Mo_contact_site
polypeptide molybdenum ion contact site
sequence
SO:0001100
polypeptide_molybdenum_ion_contact_site
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with molybdenum ions.
EBIBS:GAR
SO:cb
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with nickel ions.
BS:00142
Ni_contact_site
polypeptide nickel ion contact site
sequence
SO:0001101
polypeptide_nickel_ion_contact_site
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with nickel ions.
EBIBS:GAR
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with tungsten ions.
BS:00143
W_contact_site
polypeptide tungsten ion contact site
sequence
SO:0001102
polypeptide_tungsten_ion_contact_site
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with tungsten ions.
EBIBS:GAR
SO:cb
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with zinc ions.
BS:00185
Zn_contact_site
polypeptide zinc ion contact site
sequence
SO:0001103
polypeptide_zinc_ion_contact_site
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with zinc ions.
EBIBS:GAR
SO:cb
Amino acid involved in the activity of an enzyme.
BS:00026
active site residue
catalytic residue
sequence
act_site
SO:0001104
Discrete.
catalytic_residue
Amino acid involved in the activity of an enzyme.
EBIBS:GAR
UniProt:curation_manual
act_site
uniprot:feature_type
Residues which interact with a ligand.
BS:00157
polypeptide ligand contact
sequence
protein-ligand interaction
SO:0001105
polypeptide_ligand_contact
Residues which interact with a ligand.
EBIBS:GAR
A motif of five consecutive residues and two H-bonds in which: Residue(i) is Aspartate or Asparagine (Asx), side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2) or (i+3), main-chain CO of residue(i) is H-bonded to the main-chain NH of residue(i+3) or (i+4).
BS:00202
asx motif
sequence
SO:0001106
asx_motif
A motif of five consecutive residues and two H-bonds in which: Residue(i) is Aspartate or Asparagine (Asx), side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2) or (i+3), main-chain CO of residue(i) is H-bonded to the main-chain NH of residue(i+3) or (i+4).
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
A motif of three residues within a beta-sheet in which the main chains of two consecutive residues are H-bonded to that of the third, and in which the dihedral angles are as follows: Residue(i): -140 degrees < phi(i) < -20 degrees, -90 degrees < psi(i) < 40 degrees. Residue(i+1): -180 degrees < phi < -25 degrees or +120 degrees < phi < +180 degrees, +40 degrees < psi < +180 degrees or -180 degrees < psi < -120 degrees.
BS:00208
http://en.wikipedia.org/wiki/Beta_bulge
beta bulge
sequence
SO:0001107
beta_bulge
A motif of three residues within a beta-sheet in which the main chains of two consecutive residues are H-bonded to that of the third, and in which the dihedral angles are as follows: Residue(i): -140 degrees < phi(i) < -20 degrees, -90 degrees < psi(i) < 40 degrees. Residue(i+1): -180 degrees < phi < -25 degrees or +120 degrees < phi < +180 degrees, +40 degrees < psi < +180 degrees or -180 degrees < psi < -120 degrees.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
http://en.wikipedia.org/wiki/Beta_bulge
wiki
A motif of three residues within a beta-sheet consisting of two H-bonds. Beta bulge loops often occur at the loop ends of beta-hairpins.
BS:00209
beta bulge loop
sequence
SO:0001108
beta_bulge_loop
A motif of three residues within a beta-sheet consisting of two H-bonds. Beta bulge loops often occur at the loop ends of beta-hairpins.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
A motif of three residues within a beta-sheet consisting of two H-bonds in which: the main-chain NH of residue(i) is H-bonded to the main-chain CO of residue(i+4), the main-chain CO of residue i is H-bonded to the main-chain NH of residue(i+3), these loops have an RL nest at residues i+2 and i+3.
BS:00210
beta bulge loop five
sequence
SO:0001109
beta_bulge_loop_five
A motif of three residues within a beta-sheet consisting of two H-bonds in which: the main-chain NH of residue(i) is H-bonded to the main-chain CO of residue(i+4), the main-chain CO of residue i is H-bonded to the main-chain NH of residue(i+3), these loops have an RL nest at residues i+2 and i+3.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
A motif of three residues within a beta-sheet consisting of two H-bonds in which: the main-chain NH of residue(i) is H-bonded to the main-chain CO of residue(i+5), the main-chain CO of residue i is H-bonded to the main-chain NH of residue(i+4), these loops have an RL nest at residues i+3 and i+4.
BS:00211
beta bulge loop six
sequence
SO:0001110
beta_bulge_loop_six
A motif of three residues within a beta-sheet consisting of two H-bonds in which: the main-chain NH of residue(i) is H-bonded to the main-chain CO of residue(i+5), the main-chain CO of residue i is H-bonded to the main-chain NH of residue(i+4), these loops have an RL nest at residues i+3 and i+4.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
A beta strand describes a single length of polypeptide chain that forms part of a beta sheet. It is a single continuous stretch of amino acids adopting an extended conformation, with hydrogen bonds between its N-H groups and the C=O groups of another part of the peptide. This forms a secondary protein structure in which two or more extended polypeptide regions are hydrogen-bonded to one another in a planar array.
BS:00042
http://en.wikipedia.org/wiki/Beta_sheet
sequence
strand
SO:0001111
Range.
beta_strand
A beta strand describes a single length of polypeptide chain that forms part of a beta sheet. It is a single continuous stretch of amino acids adopting an extended conformation, with hydrogen bonds between its N-H groups and the C=O groups of another part of the peptide. This forms a secondary protein structure in which two or more extended polypeptide regions are hydrogen-bonded to one another in a planar array.
EBIBS:GAR
UniProt:curation_manual
http://en.wikipedia.org/wiki/Beta_sheet
wiki
strand
uniprot:feature_type
A peptide region which is hydrogen bonded to another region of peptide running in the opposite direction (one running N-terminal to C-terminal and one running C-terminal to N-terminal). Hydrogen bonding occurs between every other C=O from one strand to every other N-H on the adjacent strand. In this case, if two atoms C-alpha (i) and C-alpha (j) are adjacent in two hydrogen-bonded beta strands, then they form two mutual backbone hydrogen bonds to each other's flanking peptide groups; this is known as a close pair of hydrogen bonds. The peptide backbone dihedral angles (phi, psi) are about (-140 degrees, 135 degrees) in antiparallel sheets.
BS:0000341
antiparallel beta strand
sequence
SO:0001112
Range.
antiparallel_beta_strand
A peptide region which is hydrogen bonded to another region of peptide running in the opposite direction (one running N-terminal to C-terminal and one running C-terminal to N-terminal). Hydrogen bonding occurs between every other C=O from one strand to every other N-H on the adjacent strand. In this case, if two atoms C-alpha (i) and C-alpha (j) are adjacent in two hydrogen-bonded beta strands, then they form two mutual backbone hydrogen bonds to each other's flanking peptide groups; this is known as a close pair of hydrogen bonds. The peptide backbone dihedral angles (phi, psi) are about (-140 degrees, 135 degrees) in antiparallel sheets.
EBIBS:GAR
UniProt:curation_manual
A peptide region which is hydrogen bonded to another region of peptide running in the same direction (both running N-terminal to C-terminal). This orientation is slightly less stable because it introduces nonplanarity in the inter-strand hydrogen bonding pattern. Hydrogen bonding occurs between every other C=O from one strand to every other N-H on the adjacent strand. In this case, if two atoms C-alpha (i) and C-alpha (j) are adjacent in two hydrogen-bonded beta strands, then they do not hydrogen bond to each other; rather, one residue forms hydrogen bonds to the residues that flank the other (but not vice versa). For example, residue i may form hydrogen bonds to residues j - 1 and j + 1; this is known as a wide pair of hydrogen bonds. By contrast, residue j may hydrogen-bond to different residues altogether, or to none at all. The dihedral angles (phi, psi) are about (-120 degrees, 115 degrees) in parallel sheets.
BS:00151
parallel beta strand
sequence
SO:0001113
Range.
parallel_beta_strand
A peptide region which is hydrogen bonded to another region of peptide running in the same direction (both running N-terminal to C-terminal). This orientation is slightly less stable because it introduces nonplanarity in the inter-strand hydrogen bonding pattern. Hydrogen bonding occurs between every other C=O from one strand to every other N-H on the adjacent strand. In this case, if two atoms C-alpha (i) and C-alpha (j) are adjacent in two hydrogen-bonded beta strands, then they do not hydrogen bond to each other; rather, one residue forms hydrogen bonds to the residues that flank the other (but not vice versa). For example, residue i may form hydrogen bonds to residues j - 1 and j + 1; this is known as a wide pair of hydrogen bonds. By contrast, residue j may hydrogen-bond to different residues altogether, or to none at all. The dihedral angles (phi, psi) are about (-120 degrees, 115 degrees) in parallel sheets.
EBIBS:GAR
UniProt:curation_manual
A helix is a secondary_structure conformation where the peptide backbone forms a coil.
BS:00152
sequence
helix
SO:0001114
Range.
peptide_helix
A helix is a secondary_structure conformation where the peptide backbone forms a coil.
EBIBS:GAR
helix
A left handed helix is a region of peptide where the coiled conformation turns in an anticlockwise, left handed screw.
BS:00222
left handed helix
sequence
helix-l
SO:0001115
left_handed_peptide_helix
A left handed helix is a region of peptide where the coiled conformation turns in an anticlockwise, left handed screw.
EBIBS:GAR
A right handed helix is a region of peptide where the coiled conformation turns in a clockwise, right handed screw.
BS:0000339
right handed helix
sequence
helix
SO:0001116
right_handed_peptide_helix
A right handed helix is a region of peptide where the coiled conformation turns in a clockwise, right handed screw.
EBIBS:GAR
helix
The alpha helix has 3.6 residues per turn, with each residue corresponding to a translation of 1.5 angstroms (= 0.15 nm) along the helical axis. Every backbone N-H group donates a hydrogen bond to the backbone C=O group of the amino acid four residues earlier.
BS:00040
http://en.wikipedia.org/wiki/Alpha_helix
sequence
a-helix
helix
SO:0001117
Range.
alpha_helix
The alpha helix has 3.6 residues per turn, with each residue corresponding to a translation of 1.5 angstroms (= 0.15 nm) along the helical axis. Every backbone N-H group donates a hydrogen bond to the backbone C=O group of the amino acid four residues earlier.
EBIBS:GAR
http://en.wikipedia.org/wiki/Alpha_helix
wiki
a-helix
helix
uniprot:feature_type
The pi helix has 4.1 residues per turn and a translation of 1.15 angstroms (= 0.115 nm) along the helical axis. The N-H group of an amino acid forms a hydrogen bond with the C=O group of the amino acid five residues earlier.
BS:00153
http://en.wikipedia.org/wiki/Pi_helix
pi helix
sequence
SO:0001118
Range.
pi_helix
The pi helix has 4.1 residues per turn and a translation of 1.15 angstroms (= 0.115 nm) along the helical axis. The N-H group of an amino acid forms a hydrogen bond with the C=O group of the amino acid five residues earlier.
EBIBS:GAR
http://en.wikipedia.org/wiki/Pi_helix
wiki
The 3-10 helix has 3 residues per turn with a translation of 2.0 angstroms (=0.2 nm) along the helical axis. The N-H group of an amino acid forms a hydrogen bond with the C=O group of the amino acid three residues earlier.
BS:0000340
http://en.wikipedia.org/wiki/310_helix
3(10) helix
3-10 helix
310 helix
three ten helix
sequence
SO:0001119
Range.
three_ten_helix
The 3-10 helix has 3 residues per turn with a translation of 2.0 angstroms (=0.2 nm) along the helical axis. The N-H group of an amino acid forms a hydrogen bond with the C=O group of the amino acid three residues earlier.
EBIBS:GAR
http://en.wikipedia.org/wiki/310_helix
wiki
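The figures quoted for alpha_helix, pi_helix and three_ten_helix can be compared with a small sketch. It assumes that each quoted translation is the rise per residue along the helical axis, so that pitch (rise per full turn) is residues per turn multiplied by that translation. The dictionary layout and function names are hypothetical and exist only for this illustration.

```python
# (residues per turn, translation per residue in angstroms, H-bond offset)
# Values are taken from the term definitions; treating the translation as a
# per-residue rise is an assumption of this sketch.
HELIX_GEOMETRY = {
    "alpha_helix": (3.6, 1.5, 4),
    "pi_helix": (4.1, 1.15, 5),
    "three_ten_helix": (3.0, 2.0, 3),
}


def helix_pitch(name):
    """Return the rise per full turn in angstroms."""
    residues_per_turn, rise_per_residue, _ = HELIX_GEOMETRY[name]
    return residues_per_turn * rise_per_residue


def hbond_acceptor(name, donor_index):
    """Return the index of the residue whose C=O accepts the H-bond
    donated by the N-H group of residue donor_index."""
    _, _, offset = HELIX_GEOMETRY[name]
    return donor_index - offset


if __name__ == "__main__":
    for helix in HELIX_GEOMETRY:
        print(helix, round(helix_pitch(helix), 3), hbond_acceptor(helix, 10))
```

Under that assumption the three helix types differ both in pitch and in how far back along the chain the hydrogen-bond acceptor lies.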
A motif of two consecutive residues with dihedral angles. Nest should not have Proline as any residue. Nests frequently occur as parts of other motifs such as Schellman loops.
BS:00223
nest_motif
sequence
nest
polypeptide nest motif
SO:0001120
polypeptide_nest_motif
A motif of two consecutive residues with dihedral angles. Nest should not have Proline as any residue. Nests frequently occur as parts of other motifs such as Schellman loops.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
nest
A motif of two consecutive residues with dihedral angles: Residue(i): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees. Residue(i+1): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees.
BS:00224
nest_left_right
nest_lr
polypeptide nest left right motif
sequence
SO:0001121
polypeptide_nest_left_right_motif
A motif of two consecutive residues with dihedral angles: Residue(i): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees. Residue(i+1): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
A motif of two consecutive residues with dihedral angles: Residue(i): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees. Residue(i+1): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees.
BS:00225
nest_right_left
nest_rl
polypeptide nest right left motif
sequence
SO:0001122
polypeptide_nest_right_left_motif
A motif of two consecutive residues with dihedral angles: Residue(i): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees. Residue(i+1): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
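The dihedral-angle windows given for polypeptide_nest_left_right_motif and polypeptide_nest_right_left_motif can be expressed as a simple check. This is a minimal sketch: the function names are hypothetical, angles are taken in degrees, and the two windows are named by the sign of phi rather than by handedness, since the definitions themselves do not label them.

```python
def in_positive_phi_window(phi, psi):
    """+20 < phi < +140 and -40 < psi < +90 (degrees)."""
    return 20 < phi < 140 and -40 < psi < 90


def in_negative_phi_window(phi, psi):
    """-140 < phi < -20 and -90 < psi < +40 (degrees)."""
    return -140 < phi < -20 and -90 < psi < 40


def classify_nest(first, second):
    """Classify two consecutive residues given as (phi, psi) tuples.

    Returns the matching term name, or None if neither definition fits.
    """
    if in_positive_phi_window(*first) and in_negative_phi_window(*second):
        return "polypeptide_nest_left_right_motif"
    if in_negative_phi_window(*first) and in_positive_phi_window(*second):
        return "polypeptide_nest_right_left_motif"
    return None


if __name__ == "__main__":
    # Hypothetical angle pairs chosen to fall inside each window.
    print(classify_nest((60, 30), (-90, 0)))
    print(classify_nest((-90, 0), (60, 30)))
```

The sketch checks only the angle windows; the comment on the parent term that a nest should not contain proline is not enforced here.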
A motif of six or seven consecutive residues that contains two H-bonds.
BS:00226
schellmann loop
sequence
paperclip
paperclip loop
SO:0001123
schellmann_loop
A motif of six or seven consecutive residues that contains two H-bonds.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
paperclip
Wild type: A motif of seven consecutive residues that contains two H-bonds in which: the main-chain CO of residue(i) is H-bonded to the main-chain NH of residue(i+6), the main-chain CO of residue(i+1) is H-bonded to the main-chain NH of residue(i+5).
BS:00228
schellmann loop seven
seven-residue schellmann loop
sequence
SO:0001124
schellmann_loop_seven
Wild type: A motif of seven consecutive residues that contains two H-bonds in which: the main-chain CO of residue(i) is H-bonded to the main-chain NH of residue(i+6), the main-chain CO of residue(i+1) is H-bonded to the main-chain NH of residue(i+5).
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
Common Type: A motif of six consecutive residues that contains two H-bonds in which: the main-chain CO of residue(i) is H-bonded to the main-chain NH of residue(i+5), the main-chain CO of residue(i+1) is H-bonded to the main-chain NH of residue(i+4).
BS:00227
schellmann loop six
six-residue schellmann loop
sequence
SO:0001125
schellmann_loop_six
Common Type: A motif of six consecutive residues that contains two H-bonds in which: the main-chain CO of residue(i) is H-bonded to the main-chain NH of residue(i+5), the main-chain CO of residue(i+1) is H-bonded to the main-chain NH of residue(i+4).
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
A motif of five consecutive residues and two hydrogen bonds in which: residue(i) is Serine (S) or Threonine (T), the side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2) or (i+3), the main-chain CO group of residue(i) is H-bonded to the main-chain NH of residue(i+3) or (i+4).
BS:00229
serine/threonine motif
st motif
st_motif
sequence
SO:0001126
serine_threonine_motif
A motif of five consecutive residues and two hydrogen bonds in which: residue(i) is Serine (S) or Threonine (T), the side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2) or (i+3), the main-chain CO group of residue(i) is H-bonded to the main-chain NH of residue(i+3) or (i+4).
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
A motif of four or five consecutive residues and one H-bond in which: residue(i) is Serine (S) or Threonine (T), the side-chain OH of residue(i) is H-bonded to the main-chain CO of residue(i3) or (i4), Phi angles of residues(i1), (i2) and (i3) are negative.
BS:00230
serine threonine staple motif
st_staple
sequence
SO:0001127
serine_threonine_staple_motif
A motif of four or five consecutive residues and one H-bond in which: residue(i) is Serine (S) or Threonine (T), the side-chain OH of residue(i) is H-bonded to the main-chain CO of residue(i3) or (i4), Phi angles of residues(i1), (i2) and (i3) are negative.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
A reversal in the direction of the backbone of a protein that is stabilized by a hydrogen bond between backbone NH and CO groups, involving no more than 4 amino acid residues.
BS:00148
sequence
turn
SO:0001128
Range.
polypeptide_turn_motif
A reversal in the direction of the backbone of a protein that is stabilized by a hydrogen bond between backbone NH and CO groups, involving no more than 4 amino acid residues.
EBIBS:GAR
uniprot:feature_type
turn
Left handed type I (dihedral angles):- Residue(i): -140 degrees < chi (1) -120 degrees < -20 degrees, -90 degrees < psi +120 degrees < +40 degrees. Residue(i+1): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees.
BS:00206
asx turn left handed type one
sequence
asx_turn_il
SO:0001129
asx_turn_left_handed_type_one
Left handed type I (dihedral angles):- Residue(i): -140 degrees < chi (1) -120 degrees < -20 degrees, -90 degrees < psi +120 degrees < +40 degrees. Residue(i+1): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
Left handed type II (dihedral angles):- Residue(i): -140 degrees < chi (1) -120 degrees < -20 degrees, +80 degrees < psi +120 degrees < +180 degrees. Residue(i+1): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees.
BS:00204
asx turn left handed type two
asx_turn_iil
sequence
SO:0001130
asx_turn_left_handed_type_two
Left handed type II (dihedral angles):- Residue(i): -140 degrees < chi (1) -120 degrees < -20 degrees, +80 degrees < psi +120 degrees < +180 degrees. Residue(i+1): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
Right handed type II (dihedral angles):- Residue(i): -140 degrees < chi (1) -120 degrees < -20 degrees, +80 degrees < psi +120 degrees < +180 degrees. Residue(i+1): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees.
BS:00205
asx turn right handed type two
asx_turn_iir
sequence
SO:0001131
asx_turn_right_handed_type_two
Right handed type II (dihedral angles):- Residue(i): -140 degrees < chi (1) -120 degrees < -20 degrees, +80 degrees < psi +120 degrees < +180 degrees. Residue(i+1): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
Right handed type I (dihedral angles):- Residue(i): -140 degrees < chi (1) -120 degrees < -20 degrees, -90 degrees < psi +120 degrees < +40 degrees. Residue(i+1): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees.
BS:00207
asx turn type right handed type one
asx_turn_ir
sequence
SO:0001132
asx_turn_right_handed_type_one
Right handed type I (dihedral angles):- Residue(i): -140 degrees < chi (1) -120 degrees < -20 degrees, -90 degrees < psi +120 degrees < +40 degrees. Residue(i+1): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles of the second and third residues, which are the basis for sub-categorization.
BS:00212
beta turn
sequence
SO:0001133
beta_turn
A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles of the second and third residues, which are the basis for sub-categorization.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
Left handed type I: A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles:- Residue(i+1): -140 degrees > phi > -20 degrees, -90 degrees > psi > +40 degrees. Residue(i+2): -140 degrees > phi > -20 degrees, -90 degrees > psi > +40 degrees.
BS:00215
beta turn left handed type one
beta_turn_il
type I' beta turn
type I' turn
sequence
SO:0001134
beta_turn_left_handed_type_one
Left handed type I: A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles:- Residue(i+1): -140 degrees > phi > -20 degrees, -90 degrees > psi > +40 degrees. Residue(i+2): -140 degrees > phi > -20 degrees, -90 degrees > psi > +40 degrees.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
Left handed type II: A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles: Residue(i+1): -140 degrees > phi > -20 degrees, +80 degrees > psi > +180 degrees. Residue(i+2): +20 degrees > phi > +140 degrees, -40 degrees > psi > +90 degrees.
BS:00213
beta turn left handed type two
beta_turn_iil
type II' beta turn
type II' turn
sequence
SO:0001135
beta_turn_left_handed_type_two
Left handed type II: A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles: Residue(i+1): -140 degrees > phi > -20 degrees, +80 degrees > psi > +180 degrees. Residue(i+2): +20 degrees > phi > +140 degrees, -40 degrees > psi > +90 degrees.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
Right handed type I: A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles: Residue(i+1): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees. Residue(i+2): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees.
BS:00216
beta turn right handed type one
beta_turn_ir
type I beta turn
type I turn
sequence
SO:0001136
beta_turn_right_handed_type_one
Right handed type I: A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles: Residue(i+1): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees. Residue(i+2): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
Right handed type II: A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles: Residue(i+1): -140 degrees < phi < -20 degrees, +80 degrees < psi < +180 degrees. Residue(i+2): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees.
BS:00214
beta turn right handed type two
beta_turn_iir
type II beta turn
type II turn
sequence
SO:0001137
beta_turn_right_handed_type_two
Right handed type II: A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles: Residue(i+1): -140 degrees < phi < -20 degrees, +80 degrees < psi < +180 degrees. Residue(i+2): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
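The dihedral-angle windows given for the two right handed beta turn types can be read as a simple range check. The following is a minimal sketch in Python, not part of the ontology: the function and table names are hypothetical, the angles are assumed to be in degrees, and the optional H-bond is not examined.

```python
# Hypothetical helper: classify a four-residue turn from the phi/psi angles
# (degrees) of residues i+1 and i+2, using the ranges quoted in the
# definitions of SO:0001136 and SO:0001137.

def in_range(value, low, high):
    """True if low < value < high (strict, as written in the definitions)."""
    return low < value < high

# For each term: ((phi range, psi range) of residue i+1,
#                 (phi range, psi range) of residue i+2)
RIGHT_HANDED_BETA_TURNS = {
    "beta_turn_right_handed_type_one": (
        ((-140, -20), (-90, 40)),
        ((-140, -20), (-90, 40)),
    ),
    "beta_turn_right_handed_type_two": (
        ((-140, -20), (80, 180)),
        ((20, 140), (-40, 90)),
    ),
}

def classify_right_handed_beta_turn(phi1, psi1, phi2, psi2):
    """Return the matching term name, or None if neither window fits."""
    for name, (res1, res2) in RIGHT_HANDED_BETA_TURNS.items():
        (phi1_rng, psi1_rng), (phi2_rng, psi2_rng) = res1, res2
        if (in_range(phi1, *phi1_rng) and in_range(psi1, *psi1_rng)
                and in_range(phi2, *phi2_rng) and in_range(psi2, *psi2_rng)):
            return name
    return None
```

For example, classify_right_handed_beta_turn(-60, -30, -90, 0) falls in the type I window, while (-60, 120, 80, 0) falls in the type II window.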
Gamma turns, defined for 3 residues i, (i+1), (i+2) if a hydrogen bond exists between residues i and i+2 and the phi and psi angles of residue i+1 fall within 40 degrees.
BS:00219
gamma turn
sequence
SO:0001138
gamma_turn
Gamma turns, defined for 3 residues i, (i+1), (i+2) if a hydrogen bond exists between residues i and i+2 and the phi and psi angles of residue i+1 fall within 40 degrees.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
Gamma turns, defined for 3 residues i, i+1, i+2 if a hydrogen bond exists between residues i and i+2 and the phi and psi angles of residue i+1 fall within 40 degrees: phi(i+1)=75.0 - psi(i+1)=-64.0.
BS:00220
classic gamma turn
gamma turn classic
sequence
SO:0001139
gamma_turn_classic
Gamma turns, defined for 3 residues i, i+1, i+2 if a hydrogen bond exists between residues i and i+2 and the phi and psi angles of residue i+1 fall within 40 degrees: phi(i+1)=75.0 - psi(i+1)=-64.0.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
Gamma turns, defined for 3 residues i, i+1, i+2 if a hydrogen bond exists between residues i and i+2 and the phi and psi angles of residue i+1 fall within 40 degrees: phi(i+1)=-79.0 - psi(i+1)=69.0.
BS:00221
gamma turn inverse
sequence
SO:0001140
gamma_turn_inverse
Gamma turns, defined for 3 residues i, i+1, i+2 if a hydrogen bond exists between residues i and i+2 and the phi and psi angles of residue i+1 fall within 40 degrees: phi(i+1)=-79.0 - psi(i+1)=69.0.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
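The classic and inverse gamma turn definitions differ only in the reference phi/psi values of residue i+1. A minimal Python sketch of that distinction follows. It assumes that "within 40 degrees" means each angle lies within 40 degrees of its reference value, which the definitions do not state explicitly; the names are hypothetical, and the required i to i+2 hydrogen bond is not checked.

```python
# Hypothetical helper: distinguish classic from inverse gamma turns using the
# reference angles quoted in SO:0001139 and SO:0001140 (degrees).

def angular_difference(a, b):
    """Smallest absolute difference between two angles, allowing wrap-around."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

# Reference (phi, psi) of residue i+1 for each term.
GAMMA_TURN_CENTRES = {
    "gamma_turn_classic": (75.0, -64.0),
    "gamma_turn_inverse": (-79.0, 69.0),
}

def classify_gamma_turn(phi, psi, tolerance=40.0):
    """Return the term whose reference angles are both within the tolerance.

    Assumption: the tolerance applies to phi and psi independently.
    """
    for name, (phi0, psi0) in GAMMA_TURN_CENTRES.items():
        if (angular_difference(phi, phi0) <= tolerance
                and angular_difference(psi, psi0) <= tolerance):
            return name
    return None
```

A residue with phi = 70 and psi = -60 is reported as gamma_turn_classic, and one with phi = -85 and psi = 80 as gamma_turn_inverse.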
A motif of three consecutive residues and one H-bond in which: residue(i) is Serine (S) or Threonine (T), the side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2).
BS:00231
serine/threonine turn
st_turn
sequence
SO:0001141
serine_threonine_turn
A motif of three consecutive residues and one H-bond in which: residue(i) is Serine (S) or Threonine (T), the side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2).
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
The peptide twists in an anticlockwise, left handed manner. The dihedral angles for this turn are: Residue(i): -140 degrees < chi(1) -120 degrees < -20 degrees, -90 degrees < psi +120 degrees < +40 degrees, residue(i+1): -140 degrees < phi < -20 degrees, -90 < psi < +40 degrees.
BS:00234
st turn left handed type one
st_turn_il
sequence
SO:0001142
st_turn_left_handed_type_one
The peptide twists in an anticlockwise, left handed manner. The dihedral angles for this turn are: Residue(i): -140 degrees < chi(1) -120 degrees < -20 degrees, -90 degrees < psi +120 degrees < +40 degrees, residue(i+1): -140 degrees < phi < -20 degrees, -90 < psi < +40 degrees.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
The peptide twists in an anticlockwise, left handed manner. The dihedral angles for this turn are: Residue(i): -140 degrees < chi(1) -120 degrees < -20 degrees, +80 degrees < psi +120 degrees < +180 degrees, residue(i+1): +20 degrees < phi < +140 degrees, -40 < psi < +90 degrees.
BS:00232
st turn left handed type two
st_turn_iil
sequence
SO:0001143
st_turn_left_handed_type_two
The peptide twists in an anticlockwise, left handed manner. The dihedral angles for this turn are: Residue(i): -140 degrees < chi(1) -120 degrees < -20 degrees, +80 degrees < psi +120 degrees < +180 degrees, residue(i+1): +20 degrees < phi < +140 degrees, -40 < psi < +90 degrees.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
The peptide twists in a clockwise, right handed manner. The dihedral angles for this turn are: Residue(i): -140 degrees < chi(1) -120 degrees < -20 degrees, -90 degrees < psi +120 degrees < +40 degrees, residue(i+1): -140 degrees < phi < -20 degrees, -90 < psi < +40 degrees.
BS:00235
st turn right handed type one
st_turn_ir
sequence
SO:0001144
st_turn_right_handed_type_one
The peptide twists in a clockwise, right handed manner. The dihedral angles for this turn are: Residue(i): -140 degrees < chi(1) -120 degrees < -20 degrees, -90 degrees < psi +120 degrees < +40 degrees, residue(i+1): -140 degrees < phi < -20 degrees, -90 < psi < +40 degrees.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
The peptide twists in a clockwise, right handed manner. The dihedral angles for this turn are: Residue(i): -140 degrees < chi(1) -120 degrees < -20 degrees, +80 degrees < psi +120 degrees < +180 degrees, residue(i+1): +20 degrees < phi < +140 degrees, -40 < psi < +90 degrees.
BS:00233
st turn right handed type two
st_turn_iir
sequence
SO:0001145
st_turn_right_handed_type_two
The peptide twists in a clockwise, right handed manner. The dihedral angles for this turn are: Residue(i): -140 degrees < chi(1) -120 degrees < -20 degrees, +80 degrees < psi +120 degrees < +180 degrees, residue(i+1): +20 degrees < phi < +140 degrees, -40 < psi < +90 degrees.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
A site of sequence variation (alteration). Alternative sequence due to naturally occurring events such as polymorphisms and alternative splicing or experimental methods such as site directed mutagenesis.
BS:00336
sequence_variations
sequence
SO:0001146
For example, was a substitution natural or mutated as part of an experiment? This term is added to merge the biosapiens term sequence_variations.
polypeptide_variation_site
A site of sequence variation (alteration). Alternative sequence due to naturally occurring events such as polymorphisms and alternative splicing or experimental methods such as site directed mutagenesis.
EBIBS:GAR
SO:ke
Describes the natural sequence variants due to polymorphisms, disease-associated mutations, RNA editing and variations between strains, isolates or cultivars.
BS:00071
natural_variant
sequence variation
variant
sequence
SO:0001147
Discrete.
natural_variant_site
Describes the natural sequence variants due to polymorphisms, disease-associated mutations, RNA editing and variations between strains, isolates or cultivars.
EBIBS:GAR
UniProt:curation_manual
variant
uniprot:feature_type
Site which has been experimentally altered.
BS:00036
mutagen
mutagenesis
mutated_site
sequence
SO:0001148
Discrete.
mutated_variant_site
Site which has been experimentally altered.
EBIBS:GAR
UniProt:curation_manual
mutagen
uniprot:feature_type
Description of sequence variants produced by alternative splicing, alternative promoter usage, alternative initiation and ribosomal frameshifting.
BS:00073
SO:0001065
alternative_sequence
var_seq
isoform
sequence variation
varsplic
sequence
SO:0001149
Discrete.
alternate_sequence_site
Description of sequence variants produced by alternative splicing, alternative promoter usage, alternative initiation and ribosomal frameshifting.
EBIBS:GAR
UniProt:curation_manual
var_seq
uniprot:feature_type
A motif of four consecutive peptide residues of type VIa or type VIb and where the i+2 residue is cis-proline.
beta turn type six
cis-proline loop
type VI beta turn
type VI turn
sequence
SO:0001150
beta_turn_type_six
A motif of four consecutive peptide residues of type VIa or type VIb and where the i+2 residue is cis-proline.
SO:cb
A motif of four consecutive peptide residues, of which the i+2 residue is proline, and that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth and is characterized by the dihedral angles: Residue(i+1): phi ~ -60 degrees, psi ~ 120 degrees. Residue(i+2): phi ~ -90 degrees, psi ~ 0 degrees.
beta turn type six a
type VIa beta turn
type VIa turn
sequence
SO:0001151
beta_turn_type_six_a
A motif of four consecutive peptide residues, of which the i+2 residue is proline, and that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth and is characterized by the dihedral angles: Residue(i+1): phi ~ -60 degrees, psi ~ 120 degrees. Residue(i+2): phi ~ -90 degrees, psi ~ 0 degrees.
PMID:2371257
SO:cb
A type VIa beta turn with the following phi and psi angles on amino acid residues 2 and 3: phi-2 = -60 degrees, psi-2 = 120 degrees, phi-3 = -90 degrees, psi-3 = 0 degrees.
beta turn type six a one
type VIa1 beta turn
type VIa1 turn
sequence
SO:0001152
beta_turn_type_six_a_one
A type VIa beta turn with the following phi and psi angles on amino acid residues 2 and 3: phi-2 = -60 degrees, psi-2 = 120 degrees, phi-3 = -90 degrees, psi-3 = 0 degrees.
PMID:27428516
A type VIa beta turn with the following phi and psi angles on amino acid residues 2 and 3: phi-2 = -120 degrees, psi-2 = 120 degrees, phi-3 = -60 degrees, psi-3 = 0 degrees.
beta turn type six a two
type VIa2 beta turn
type VIa2 turn
sequence
SO:0001153
beta_turn_type_six_a_two
A type VIa beta turn with the following phi and psi angles on amino acid residues 2 and 3: phi-2 = -120 degrees, psi-2 = 120 degrees, phi-3 = -60 degrees, psi-3 = 0 degrees.
PMID:27428516
A motif of four consecutive peptide residues, of which the i+2 residue is proline, and that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth and is characterized by the dihedral angles: Residue(i+1): phi ~ -120 degrees, psi ~ 120 degrees. Residue(i+2): phi ~ -60 degrees, psi ~ 0 degrees.
beta turn type six b
type VIb beta turn
type VIb turn
sequence
SO:0001154
beta_turn_type_six_b
A motif of four consecutive peptide residues, of which the i+2 residue is proline, and that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth and is characterized by the dihedral angles: Residue(i+1): phi ~ -120 degrees, psi ~ 120 degrees. Residue(i+2): phi ~ -60 degrees, psi ~ 0 degrees.
PMID:2371257
SO:cb
A motif of four consecutive peptide residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth and is characterized by the dihedral angles: Residue(i+1): phi ~ -60 degrees, psi ~ -30 degrees. Residue(i+2): phi ~ -120 degrees, psi ~ 120 degrees.
beta turn type eight
type VIII beta turn
type VIII turn
sequence
SO:0001155
beta_turn_type_eight
A motif of four consecutive peptide residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth and is characterized by the dihedral angles: Residue(i+1): phi ~ -60 degrees, psi ~ -30 degrees. Residue(i+2): phi ~ -120 degrees, psi ~ 120 degrees.
PMID:2371257
SO:cb
A sequence element characteristic of some RNA polymerase II promoters, usually located between -10 and -60 relative to the TSS. Consensus sequence is WATCGATW.
DRE motif
NDM4
WATCGATW_motif
sequence
SO:0001156
This consensus sequence was identified computationally using the MEME algorithm within core promoter sequences from -60 to +40, with an E value of 1.7e-183. Tends to co-occur with Motif 7. Tends to not occur with DPE motif (SO:0000015) or motif 10.
DRE_motif
A sequence element characteristic of some RNA polymerase II promoters, usually located between -10 and -60 relative to the TSS. Consensus sequence is WATCGATW.
PMID:12537576
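The consensus sequences quoted for the promoter motifs in this part of the ontology (for example WATCGATW) use IUPAC nucleotide ambiguity codes. A minimal Python sketch of scanning a DNA sequence for such a consensus is given here. The function name and the example sequence are hypothetical, only the forward strand is searched, and positional preferences relative to the TSS are ignored.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expressed as regex fragments.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def find_motif(consensus, sequence):
    """Return 0-based start positions where the consensus matches.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    body = "".join(IUPAC[base] for base in consensus.upper())
    pattern = re.compile("(?=" + body + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]

# Illustrative input only: a made-up 12-base sequence.
print(find_motif("WATCGATW", "GGTATCGATAGG"))  # [2]
```

The same function accepts any of the other consensus strings listed for these motifs, such as CARCCCT or KTYRGTATWTTT.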
A sequence element characteristic of some RNA polymerase II promoters, located immediately upstream of some TATA box elements with respect to the TSS (+1). Consensus sequence is YGGTCACACTR. Marked spatial preference within core promoter; tend to occur near the TSS, although not as tightly as INR (SO:0000014).
DMv4
DMv4 motif
directional motif v4
motif 1 element
promoter motif 1
YGGTCACATR
sequence
SO:0001157
DMv4_motif
A sequence element characteristic of some RNA polymerase II promoters, located immediately upstream of some TATA box elements with respect to the TSS (+1). Consensus sequence is YGGTCACACTR. Marked spatial preference within core promoter; tend to occur near the TSS, although not as tightly as INR (SO:0000014).
PMID:16827941:12537576
A sequence element characteristic of some RNA polymerase II promoters, usually located between -60 and +1 relative to the TSS. Consensus sequence is AWCAGCTGWT. Tends to co-occur with DMv2 (SO:0001161). Tends to not occur with DPE motif (SO:0000015).
E box motif
generic E box motif
AWCAGCTGWT
sequence
NDM5
SO:0001158
E_box_motif
A sequence element characteristic of some RNA polymerase II promoters, usually located between -60 and +1 relative to the TSS. Consensus sequence is AWCAGCTGWT. Tends to co-occur with DMv2 (SO:0001161). Tends to not occur with DPE motif (SO:0000015).
PMID:12537576:16827941
A sequence element characteristic of some RNA polymerase II promoters, usually located between -50 and -10 relative to the TSS. Consensus sequence is KTYRGTATWTTT. Tends to co-occur with DMv4 (SO:0001157). Tends to not occur with DPE motif (SO:0000015) or MTE (SO:0001162).
DMv5
DMv5 motif
directional motif v5
KTYRGTATWTTT
sequence
promoter motif 6
SO:0001159
DMv5_motif
A sequence element characteristic of some RNA polymerase II promoters, usually located between -50 and -10 relative to the TSS. Consensus sequence is KTYRGTATWTTT. Tends to co-occur with DMv4 (SO:0001157). Tends to not occur with DPE motif (SO:0000015) or MTE (SO:0001162).
PMID:12537576:16827941
A sequence element characteristic of some RNA polymerase II promoters, usually located between -30 and +15 relative to the TSS. Consensus sequence is KNNCAKCNCTRNY. Tends to co-occur with DMv2 (SO:0001161). Tends to not occur with DPE motif (SO:0000015) or MTE (SO:0001162).
DMv3
DMv3 motif
directional motif v3
promoter motif 7
KNNCAKCNCTRNY
sequence
SO:0001160
DMv3_motif
A sequence element characteristic of some RNA polymerase II promoters, usually located between -30 and +15 relative to the TSS. Consensus sequence is KNNCAKCNCTRNY. Tends to co-occur with DMv2 (SO:0001161). Tends to not occur with DPE motif (SO:0000015) or MTE (SO:0001162).
PMID:12537576:16827941
A sequence element characteristic of some RNA polymerase II promoters, usually located between -60 and -45 relative to the TSS. Consensus sequence is MKSYGGCARCGSYSS. Tends to co-occur with DMv3 (SO:0001160). Tends to not occur with DPE motif (SO:0000015) or MTE (SO:0001162).
DMv2
DMv2 motif
directional motif v2
promoter motif 8
MKSYGGCARCGSYSS
sequence
SO:0001161
DMv2_motif
A sequence element characteristic of some RNA polymerase II promoters, usually located between -60 and -45 relative to the TSS. Consensus sequence is MKSYGGCARCGSYSS. Tends to co-occur with DMv3 (SO:0001160). Tends to not occur with DPE motif (SO:0000015) or MTE (SO:0001162).
PMID:12537576:16827941
A sequence element characteristic of some RNA polymerase II promoters, usually located between +20 and +30 relative to the TSS. Consensus sequence is CSARCSSAACGS. Tends to co-occur with INR motif (SO:0000014). Tends to not occur with DPE motif (SO:0000015) or DMv5 (SO:0001159).
motif ten element
motif_ten_element
CSARCSSAACGS
sequence
SO:0001162
MTE
A sequence element characteristic of some RNA polymerase II promoters, usually located between +20 and +30 relative to the TSS. Consensus sequence is CSARCSSAACGS. Tends to co-occur with INR motif (SO:0000014). Tends to not occur with DPE motif (SO:0000015) or DMv5 (SO:0001159).
PMID:12537576:15231738
PMID:16858867
A promoter motif with consensus sequence TCATTCG.
DMp3
INR1 motif
directional motif p3
directional promoter motif 3
sequence
SO:0001163
INR1_motif
A promoter motif with consensus sequence TCATTCG.
PMID:16827941
A promoter motif with consensus sequence CGGACGT.
DMp5
DPE1 motif
directional motif 5
sequence
directional promoter motif 5
SO:0001164
DPE1_motif
A promoter motif with consensus sequence CGGACGT.
PMID:16827941
A promoter motif with consensus sequence CARCCCT.
DMv1 motif
sequence
DMv1
directional promoter motif v1
SO:0001165
DMv1_motif
A promoter motif with consensus sequence CARCCCT.
PMID:16827941
A non directional promoter motif with consensus sequence GAGAGCG.
GAGA
GAGA motif
NDM1
sequence
SO:0001166
GAGA_motif
A non directional promoter motif with consensus sequence GAGAGCG.
PMID:16827941
A non directional promoter motif with consensus CGMYGYCR.
NDM2
NDM2 motif
non directional promoter motif 2
sequence
SO:0001167
NDM2_motif
A non directional promoter motif with consensus CGMYGYCR.
PMID:16827941
A non directional promoter motif with consensus sequence GAAAGCT.
NDM3
NDM3 motif
non directional motif 3
sequence
SO:0001168
NDM3_motif
A non directional promoter motif with consensus sequence GAAAGCT.
PMID:16827941
A ds_RNA_viral_sequence is a viral_sequence that is the sequence of a virus that exists as double stranded RNA.
double stranded RNA virus sequence
ds RNA viral sequence
sequence
SO:0001169
ds_RNA_viral_sequence
A ds_RNA_viral_sequence is a viral_sequence that is the sequence of a virus that exists as double stranded RNA.
SO:ke
A kind of DNA transposon that populates the genomes of protists, fungi, and animals, characterized by a unique set of proteins necessary for their transposition, including a protein-primed DNA polymerase B, retroviral integrase, cysteine protease, and ATPase. Polintons are characterized by 6-bp target site duplications, terminal-inverted repeats that are several hundred nucleotides long, and 5'-AG and TC-3' termini. Polintons exist as autonomous and nonautonomous elements.
sequence
maverick element
SO:0001170
polinton
A kind of DNA transposon that populates the genomes of protists, fungi, and animals, characterized by a unique set of proteins necessary for their transposition, including a protein-primed DNA polymerase B, retroviral integrase, cysteine protease, and ATPase. Polintons are characterized by 6-bp target site duplications, terminal-inverted repeats that are several hundred nucleotides long, and 5'-AG and TC-3' termini. Polintons exist as autonomous and nonautonomous elements.
PMID:16537396
A component of the large ribosomal subunit in mitochondrial rRNA.
SO:0002345
21S LSU rRNA
21S rRNA
21S ribosomal RNA
rRNA 21S
sequence
SO:0001171
This term has been merged into mt_LSU_rRNA (SO:0002345) as part of reorganization of rRNA child terms 10 June 2021. Requested by EBI. See GitHub Issue #493.
rRNA_21S
true
A component of the large ribosomal subunit in mitochondrial rRNA.
RSC:cb
A region of a tRNA.
tRNA region
sequence
SO:0001172
tRNA_region
A region of a tRNA.
RSC:cb
A sequence of seven nucleotide bases in tRNA which contains the anticodon. It has the sequence 5'-pyrimidine-purine-anticodon-modified purine-any base-3'.
anti-codon loop
anticodon loop
sequence
SO:0001173
anticodon_loop
A sequence of seven nucleotide bases in tRNA which contains the anticodon. It has the sequence 5'-pyrimidine-purine-anticodon-modified purine-any base-3'.
ISBN:0716719207
A sequence of three nucleotide bases in tRNA which recognizes a codon in mRNA.
http://en.wikipedia.org/wiki/Anticodon
anti-codon
sequence
SO:0001174
anticodon
A sequence of three nucleotide bases in tRNA which recognizes a codon in mRNA.
RSC:cb
http://en.wikipedia.org/wiki/Anticodon
wiki
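The anticodon definition says the three bases recognize a codon in mRNA. Under strict Watson-Crick pairing, the recognized codon is the reverse complement of the anticodon. A minimal Python sketch follows. The function name is hypothetical, and wobble pairing and modified bases are deliberately ignored, so this is a simplification of real codon recognition.

```python
# Watson-Crick RNA base pairing only (assumption: no wobble, no modified bases).
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def codon_for_anticodon(anticodon):
    """Return the mRNA codon (5'->3') paired by a tRNA anticodon (5'->3').

    The two strands pair antiparallel, so the anticodon is reversed before
    each base is complemented.
    """
    return "".join(RNA_COMPLEMENT[base] for base in reversed(anticodon.upper()))

print(codon_for_anticodon("CAU"))  # AUG
```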
Base sequence at the 3' end of a tRNA. The 3'-hydroxyl group on the terminal adenosine is the attachment point for the amino acid.
CCA sequence
CCA tail
sequence
SO:0001175
CCA_tail
Base sequence at the 3' end of a tRNA. The 3'-hydroxyl group on the terminal adenosine is the attachment point for the amino acid.
ISBN:0716719207
Non-base-paired sequence of nucleotide bases in tRNA. It contains several dihydrouracil residues.
DHU loop
sequence
D loop
SO:0001176
DHU_loop
Non-base-paired sequence of nucleotide bases in tRNA. It contains several dihydrouracil residues.
ISBN:071671920
Non-base-paired sequence of three nucleotide bases in tRNA. It has sequence T-Psi-C.
T loop
TpsiC loop
sequence
SO:0001177
T_loop
Non-base-paired sequence of three nucleotide bases in tRNA. It has sequence T-Psi-C.
ISBN:0716719207
A primary transcript encoding pyrrolysyl tRNA (SO:0000766).
pyrrolysine tRNA primary transcript
sequence
SO:0001178
pyrrolysine_tRNA_primary_transcript
A primary transcript encoding pyrrolysyl tRNA (SO:0000766).
RSC:cb
U3 snoRNA is a member of the box C/D class of small nucleolar RNAs. The U3 snoRNA secondary structure is characterised by a small 5' domain (with boxes A and A'), and a larger 3' domain (with boxes B, C, C', and D), the two domains being linked by a single-stranded hinge. Boxes B and C form the B/C motif, which appears to be exclusive to U3 snoRNAs, and boxes C' and D form the C'/D motif. The latter is functionally similar to the C/D motifs found in other snoRNAs. The 5' domain and the hinge region act as a pre-rRNA-binding domain. The 3' domain has conserved protein-binding sites. Both the box B/C and box C'/D motifs are sufficient for nuclear retention of U3 snoRNA. The box C'/D motif is also necessary for nucleolar localization, stability and hypermethylation of U3 snoRNA. Both box B/C and C'/D motifs are involved in specific protein interactions and are necessary for the rRNA processing functions of U3 snoRNA.
http://en.wikipedia.org/wiki/Small_nucleolar_RNA_U3
U3 small nucleolar RNA
U3 snoRNA
small nucleolar RNA U3
snoRNA U3
sequence
SO:0001179
The definition is most of the old definition for snoRNA (SO:0000275).
U3_snoRNA
U3 snoRNA is a member of the box C/D class of small nucleolar RNAs. The U3 snoRNA secondary structure is characterised by a small 5' domain (with boxes A and A'), and a larger 3' domain (with boxes B, C, C', and D), the two domains being linked by a single-stranded hinge. Boxes B and C form the B/C motif, which appears to be exclusive to U3 snoRNAs, and boxes C' and D form the C'/D motif. The latter is functionally similar to the C/D motifs found in other snoRNAs. The 5' domain and the hinge region act as a pre-rRNA-binding domain. The 3' domain has conserved protein-binding sites. Both the box B/C and box C'/D motifs are sufficient for nuclear retention of U3 snoRNA. The box C'/D motif is also necessary for nucleolar localization, stability and hypermethylation of U3 snoRNA. Both box B/C and C'/D motifs are involved in specific protein interactions and are necessary for the rRNA processing functions of U3 snoRNA.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00012
http://en.wikipedia.org/wiki/Small_nucleolar_RNA_U3
wiki
A cis-acting element found in the 3' UTR of some mRNA which is rich in AUUUA pentamers. Messenger RNAs bearing multiple AU-rich elements are often unstable.
http://en.wikipedia.org/wiki/AU-rich_element
AU rich element
AU-rich element
sequence
ARE
SO:0001180
AU_rich_element
A cis-acting element found in the 3' UTR of some mRNA which is rich in AUUUA pentamers. Messenger RNAs bearing multiple AU-rich elements are often unstable.
PMID:7892223
http://en.wikipedia.org/wiki/AU-rich_element
wiki
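Because AU_rich_element is defined by enrichment in AUUUA pentamers, a first-pass check on a 3' UTR is to count them. A minimal Python sketch is shown here. The function name is hypothetical, and since the definition gives no numeric threshold for "rich", none is applied.

```python
import re

def count_auuua_pentamers(utr):
    """Count AUUUA pentamers in a 3' UTR, including overlapping ones.

    DNA-style input is accepted: T is converted to U before counting.
    """
    rna = utr.upper().replace("T", "U")
    # A zero-width lookahead matches at every start position of AUUUA,
    # so AUUUAUUUA is counted as two overlapping pentamers.
    return len(re.findall(r"(?=AUUUA)", rna))

print(count_auuua_pentamers("AUUUAUUUA"))  # 2
```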
A cis-acting element found in the 3' UTR of some mRNA which is bound by the Drosophila Bruno protein and its homologs.
Bruno response element
sequence
BRE
SO:0001181
Not to be confused with BRE_motif (SO:0000016), which binds transcription factor II B.
Bruno_response_element
A cis-acting element found in the 3' UTR of some mRNA which is bound by the Drosophila Bruno protein and its homologs.
PMID:10893231
A regulatory sequence found in the 5' and 3' UTRs of many mRNAs which encode iron-binding proteins. It has a hairpin structure and is recognized by trans-acting proteins known as iron-regulatory proteins.
http://en.wikipedia.org/wiki/Iron_responsive_element
IRE
iron responsive element
sequence
SO:0001182
iron_responsive_element
A regulatory sequence found in the 5' and 3' UTRs of many mRNAs which encode iron-binding proteins. It has a hairpin structure and is recognized by trans-acting proteins known as iron-regulatory proteins.
PMID:3198610
PMID:8710843
http://en.wikipedia.org/wiki/Iron_responsive_element
wiki
An attribute describing a sequence composed of nucleobases bound to a morpholino backbone. A morpholino backbone consists of morpholine (CHEBI:34856) rings connected by phosphorodiamidate linkages.
http://en.wikipedia.org/wiki/Morpholino
morpholino backbone
sequence
SO:0001183
Do not use this for feature annotation. Use morpholino_oligo (SO:0000034) instead.
morpholino_backbone
An attribute describing a sequence composed of nucleobases bound to a morpholino backbone. A morpholino backbone consists of morpholine (CHEBI:34856) rings connected by phosphorodiamidate linkages.
RSC:cb
http://en.wikipedia.org/wiki/Morpholino
wiki
An attribute describing a sequence composed of peptide nucleic acid (CHEBI:48021), a chemical consisting of nucleobases bound to a backbone composed of repeating N-(2-aminoethyl)-glycine units linked by peptide bonds. The purine and pyrimidine bases are linked to the backbone by methylene carbonyl bonds.
sequence
peptide nucleic acid
SO:0001184
Do not use this term for feature annotation. Use PNA_oligo (SO:0001011) instead.
PNA
An attribute describing a sequence composed of peptide nucleic acid (CHEBI:48021), a chemical consisting of nucleobases bound to a backbone composed of repeating N-(2-aminoethyl)-glycine units linked by peptide bonds. The purine and pyrimidine bases are linked to the backbone by methylene carbonyl bonds.
RSC:cb
An attribute describing the sequence of a transcript that has catalytic activity with or without an associated ribonucleoprotein.
sequence
SO:0001185
Do not use this for feature annotation. Use enzymatic_RNA (SO:0000372) instead.
enzymatic
An attribute describing the sequence of a transcript that has catalytic activity with or without an associated ribonucleoprotein.
RSC:cb
An attribute describing the sequence of a transcript that has catalytic activity even without an associated ribonucleoprotein.
sequence
SO:0001186
Do not use this for feature annotation. Use ribozyme (SO:0000374) instead.
ribozymic
An attribute describing the sequence of a transcript that has catalytic activity even without an associated ribonucleoprotein.
RSC:cb
A snoRNA that specifies the site of pseudouridylation in an RNA molecule by base pairing with a short sequence around the target residue.
pseudouridylation guide snoRNA
sequence
SO:0001187
Has RNA pseudouridylation guide activity (GO:0030558).
pseudouridylation_guide_snoRNA
A snoRNA that specifies the site of pseudouridylation in an RNA molecule by base pairing with a short sequence around the target residue.
GOC:mah
PMID:12457565
An attribute describing a sequence consisting of nucleobases attached to a repeating unit made of 'locked' deoxyribose rings connected to a phosphate backbone. The deoxyribose unit's conformation is 'locked' by a 2'-C,4'-C-oxymethylene link.
sequence
SO:0001188
Do not use this term for feature annotation. Use LNA_oligo (SO:0001189) instead.
LNA
An attribute describing a sequence consisting of nucleobases attached to a repeating unit made of 'locked' deoxyribose rings connected to a phosphate backbone. The deoxyribose unit's conformation is 'locked' by a 2'-C,4'-C-oxymethylene link.
CHEBI:48010
An oligo composed of LNA residues.
http://en.wikipedia.org/wiki/Locked_nucleic_acid
LNA oligo
locked nucleic acid
sequence
SO:0001189
LNA_oligo
An oligo composed of LNA residues.
RSC:cb
http://en.wikipedia.org/wiki/Locked_nucleic_acid
wiki
An attribute describing a sequence consisting of nucleobases attached to a repeating unit made of threose rings connected to a phosphate backbone.
sequence
SO:0001190
Do not use this term for feature annotation. Use TNA_oligo (SO:0001191) instead.
TNA
An attribute describing a sequence consisting of nucleobases attached to a repeating unit made of threose rings connected to a phosphate backbone.
CHEBI:48019
An oligo composed of TNA residues.
http://en.wikipedia.org/wiki/Threose_nucleic_acid
TNA oligo
threose nucleic acid
sequence
SO:0001191
TNA_oligo
An oligo composed of TNA residues.
RSC:cb
http://en.wikipedia.org/wiki/Threose_nucleic_acid
wiki
An attribute describing a sequence consisting of nucleobases attached to a repeating unit made of an acyclic three-carbon propylene glycol connected to a phosphate backbone. It has two enantiomeric forms, (R)-GNA and (S)-GNA.
sequence
SO:0001192
Do not use this term for feature annotation. Use GNA_oligo (SO:0001193) instead.
GNA
An attribute describing a sequence consisting of nucleobases attached to a repeating unit made of an acyclic three-carbon propylene glycol connected to a phosphate backbone. It has two enantiomeric forms, (R)-GNA and (S)-GNA.
CHEBI:48015
An oligo composed of GNA residues.
http://en.wikipedia.org/wiki/Glycerol_nucleic_acid
GNA oligo
glycerol nucleic acid
glycol nucleic acid
sequence
SO:0001193
GNA_oligo
An oligo composed of GNA residues.
RSC:cb
http://en.wikipedia.org/wiki/Glycerol_nucleic_acid
wiki
An attribute describing a GNA sequence in the (R)-GNA enantiomer.
R GNA
sequence
SO:0001194
Do not use this term for feature annotation. Use R_GNA_oligo (SO:0001195) instead.
R_GNA
An attribute describing a GNA sequence in the (R)-GNA enantiomer.
CHEBI:48016
An oligo composed of (R)-GNA residues.
(R)-glycerol nucleic acid
(R)-glycol nucleic acid
R GNA oligo
sequence
SO:0001195
R_GNA_oligo
An oligo composed of (R)-GNA residues.
RSC:cb
An attribute describing a GNA sequence in the (S)-GNA enantiomer.
S GNA
sequence
SO:0001196
Do not use this term for feature annotation. Use S_GNA_oligo (SO:0001197) instead.
S_GNA
An attribute describing a GNA sequence in the (S)-GNA enantiomer.
CHEBI:48017
An oligo composed of (S)-GNA residues.
(S)-glycerol nucleic acid
(S)-glycol nucleic acid
S GNA oligo
sequence
SO:0001197
S_GNA_oligo
An oligo composed of (S)-GNA residues.
RSC:cb
A ds_DNA_viral_sequence is a viral_sequence that is the sequence of a virus that exists as double stranded DNA.
double stranded DNA virus
ds DNA viral sequence
sequence
SO:0001198
ds_DNA_viral_sequence
A ds_DNA_viral_sequence is a viral_sequence that is the sequence of a virus that exists as double stranded DNA.
SO:ke
A ss_RNA_viral_sequence is a viral_sequence that is the sequence of a virus that exists as single stranded RNA.
single strand RNA virus
ss RNA viral sequence
sequence
SO:0001199
ss_RNA_viral_sequence
A ss_RNA_viral_sequence is a viral_sequence that is the sequence of a virus that exists as single stranded RNA.
SO:ke
A negative_sense_ssRNA_viral_sequence is a ss_RNA_viral_sequence that is the sequence of a single stranded RNA virus that is complementary to mRNA and must be converted to positive sense RNA by RNA polymerase before translation.
negative sense ssRNA viral sequence
sequence
negative sense single stranded RNA virus
SO:0001200
negative_sense_ssRNA_viral_sequence
A negative_sense_ssRNA_viral_sequence is a ss_RNA_viral_sequence that is the sequence of a single stranded RNA virus that is complementary to mRNA and must be converted to positive sense RNA by RNA polymerase before translation.
SO:ke
A positive_sense_ssRNA_viral_sequence is a ss_RNA_viral_sequence that is the sequence of a single stranded RNA virus that can be immediately translated by the host.
positive sense ssRNA viral sequence
sequence
positive sense single stranded RNA virus
SO:0001201
positive_sense_ssRNA_viral_sequence
A positive_sense_ssRNA_viral_sequence is a ss_RNA_viral_sequence that is the sequence of a single stranded RNA virus that can be immediately translated by the host.
SO:ke
An ambisense_ssRNA_viral_sequence is a ss_RNA_viral_sequence that is the sequence of a single stranded RNA virus with both messenger and anti-messenger polarity.
ambisense single stranded RNA virus
ambisense ssRNA viral sequence
sequence
SO:0001202
ambisense_ssRNA_viral_sequence
An ambisense_ssRNA_viral_sequence is a ss_RNA_viral_sequence that is the sequence of a single stranded RNA virus with both messenger and anti-messenger polarity.
SO:ke
A region (DNA) to which RNA polymerase binds, to begin transcription.
SO:0000167
RNA polymerase promoter
sequence
SO:0001203
Term merged with promoter SO:0000167 in August 2020 as part of GREEKC initiative. See GitHub Issue 492 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/492)
RNA_polymerase_promoter
true
A region (DNA) to which RNA polymerase binds, to begin transcription.
xenbase:jb
A region (DNA) to which Bacteriophage RNA polymerase binds, to begin transcription.
Phage RNA Polymerase Promoter
sequence
SO:0001204
former parent RNA_polymerase_promoter SO:0001203 was merged with promoter SO:0000167 in Aug 2020 as part of GREEKC.
Phage_RNA_Polymerase_Promoter
A region (DNA) to which Bacteriophage RNA polymerase binds, to begin transcription.
xenbase:jb
A region (DNA) to which the SP6 RNA polymerase binds, to begin transcription.
SP6 RNA Polymerase Promoter
sequence
SO:0001205
SP6_RNA_Polymerase_Promoter
A region (DNA) to which the SP6 RNA polymerase binds, to begin transcription.
xenbase:jb
A DNA sequence to which the T3 RNA polymerase binds, to begin transcription.
T3 RNA Polymerase Promoter
sequence
SO:0001206
T3_RNA_Polymerase_Promoter
A DNA sequence to which the T3 RNA polymerase binds, to begin transcription.
xenbase:jb
A region (DNA) to which the T7 RNA polymerase binds, to begin transcription.
T7 RNA Polymerase Promoter
sequence
SO:0001207
T7_RNA_Polymerase_Promoter
A region (DNA) to which the T7 RNA polymerase binds, to begin transcription.
xenbase:jb
An EST read from the 5' end of a transcript that usually codes for a protein. These regions tend to be conserved across species and do not change much within a gene family.
5' EST
five prime EST
sequence
SO:0001208
five_prime_EST
An EST read from the 5' end of a transcript that usually codes for a protein. These regions tend to be conserved across species and do not change much within a gene family.
http://www.ncbi.nlm.nih.gov/About/primer/est.html
An EST read from the 3' end of a transcript. Such reads are more likely to fall within non-coding or untranslated regions (UTRs).
3' EST
three prime EST
sequence
SO:0001209
three_prime_EST
An EST read from the 3' end of a transcript. Such reads are more likely to fall within non-coding or untranslated regions (UTRs).
http://www.ncbi.nlm.nih.gov/About/primer/est.html
The region of mRNA (not divisible by 3 bases) that is skipped or added during the process of translational frameshifting (GO:0006452), causing the reading frame to be different.
http://en.wikipedia.org/wiki/Translational_frameshift
INSDC_qualifier:ribosomal_slippage
ribosomal frameshift
ribosomal slippage
translational frameshift
sequence
SO:0001210
Added synonym 'ribosomal_slippage' on Feb 1, 2021, a term in INSDC and GenBank. See GitHub Issue #522.
translational_frameshift
The region of mRNA (not divisible by 3 bases) that is skipped or added during the process of translational frameshifting (GO:0006452), causing the reading frame to be different.
SO:ke
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Translational_frameshift
wiki
The region of mRNA 1 base long that is skipped during the process of translational frameshifting (GO:0006452), causing the reading frame to be different.
plus 1 ribosomal frameshift
plus 1 translational frameshift
sequence
SO:0001211
plus_1_translational_frameshift
The region of mRNA 1 base long that is skipped during the process of translational frameshifting (GO:0006452), causing the reading frame to be different.
SO:ke
The region of mRNA 2 bases long that is skipped during the process of translational frameshifting (GO:0006452), causing the reading frame to be different.
plus 2 ribosomal frameshift
plus 2 translational frameshift
sequence
SO:0001212
plus_2_translational_frameshift
The region of mRNA 2 bases long that is skipped during the process of translational frameshifting (GO:0006452), causing the reading frame to be different.
SO:ke
Group III introns are introns found in the mRNA of the plastids of euglenoid protists. They are spliced by a two-step transesterification with a bulged adenosine as the initiating nucleophile.
http://en.wikipedia.org/wiki/Group_III_intron
group III intron
sequence
SO:0001213
GO:0000374.
group_III_intron
Group III introns are introns found in the mRNA of the plastids of euglenoid protists. They are spliced by a two-step transesterification with a bulged adenosine as the initiating nucleophile.
PMID:11377794
http://en.wikipedia.org/wiki/Group_III_intron
wiki
The maximal intersection of exon and UTR.
noncoding region of exon
sequence
SO:0001214
An exon either containing but not starting with a start codon or containing but not ending with a stop codon will be partially coding and partially non coding.
noncoding_region_of_exon
The maximal intersection of exon and UTR.
SO:ke
The region of an exon that encodes protein sequence.
coding region of exon
sequence
SO:0001215
An exon containing either a start or stop codon will be partially coding and partially non coding.
coding_region_of_exon
The region of an exon that encodes protein sequence.
SO:ke
An intron that is spliced via endonucleolytic cleavage and ligation rather than transesterification.
endonuclease spliced intron
sequence
SO:0001216
endonuclease_spliced_intron
An intron that is spliced via endonucleolytic cleavage and ligation rather than transesterification.
SO:ke
A gene that codes for an RNA that can be translated into a protein.
protein coding gene
sequence
SO:0001217
protein_coding_gene
An insertion that derives from another organism, via the use of recombinant DNA technology.
transgenic insertion
sequence
SO:0001218
transgenic_insertion
An insertion that derives from another organism, via the use of recombinant DNA technology.
SO:bm
A gene that has been produced as the product of a reverse transcriptase mediated event.
sequence
SO:0001219
retrogene
An attribute describing an epigenetic process where a gene is inactivated by RNA interference.
silenced by RNA interference
sequence
SO:0001220
RNA interference is GO:0016246.
silenced_by_RNA_interference
An attribute describing an epigenetic process where a gene is inactivated by RNA interference.
RSC:cb
An attribute describing an epigenetic process where a gene is inactivated by histone modification.
silenced by histone modification
sequence
SO:0001221
Histone modification is GO:0016570.
silenced_by_histone_modification
An attribute describing an epigenetic process where a gene is inactivated by histone modification.
RSC:cb
An attribute describing an epigenetic process where a gene is inactivated by histone methylation.
silenced by histone methylation
sequence
SO:0001222
Histone methylation is GO:0016571.
silenced_by_histone_methylation
An attribute describing an epigenetic process where a gene is inactivated by histone methylation.
RSC:cb
An attribute describing an epigenetic process where a gene is inactivated by histone deacetylation.
silenced by histone deacetylation
sequence
SO:0001223
Histone deacetylation is GO:0016573.
silenced_by_histone_deacetylation
An attribute describing an epigenetic process where a gene is inactivated by histone deacetylation.
RSC:cb
A gene that is silenced by RNA interference.
RNA interference silenced gene
RNAi silenced gene
gene silenced by RNA interference
sequence
SO:0001224
gene_silenced_by_RNA_interference
A gene that is silenced by RNA interference.
SO:xp
A gene that is silenced by histone modification.
gene silenced by histone modification
sequence
SO:0001225
gene_silenced_by_histone_modification
A gene that is silenced by histone modification.
SO:xp
A gene that is silenced by histone methylation.
gene silenced by histone methylation
sequence
SO:0001226
gene_silenced_by_histone_methylation
A gene that is silenced by histone methylation.
SO:xp
A gene that is silenced by histone deacetylation.
gene silenced by histone deacetylation
sequence
SO:0001227
gene_silenced_by_histone_deacetylation
A gene that is silenced by histone deacetylation.
SO:xp
A modified RNA base in which the 5,6-dihydrouracil is bound to the ribose ring.
RNAMOD:051
http://en.wikipedia.org/wiki/Dihydrouridine
D
sequence
SO:0001228
dihydrouridine
A modified RNA base in which the 5,6-dihydrouracil is bound to the ribose ring.
RSC:cb
http://en.wikipedia.org/wiki/Dihydrouridine
wiki
D
A modified RNA base in which the 5-position of the uracil is bound to the ribose ring instead of the 1-position.
RNAMOD:050
http://en.wikipedia.org/wiki/Pseudouridine
Y
sequence
SO:0001229
The free molecule is CHEBI:17802.
pseudouridine
A modified RNA base in which the 5-position of the uracil is bound to the ribose ring instead of the 1-position.
RSC:cb
http://en.wikipedia.org/wiki/Pseudouridine
wiki
Y
A modified RNA base in which hypoxanthine is bound to the ribose ring.
http://en.wikipedia.org/wiki/Inosine
sequence
I
RNAMOD:017
SO:0001230
The free molecule is CHEBI:17596.
inosine
A modified RNA base in which hypoxanthine is bound to the ribose ring.
RSC:cb
http://library.med.utah.edu/RNAmods/
http://en.wikipedia.org/wiki/Inosine
wiki
A modified RNA base in which guanine is methylated at the 7-position.
7-methylguanine
seven methylguanine
sequence
SO:0001231
The free molecule is CHEBI:2274.
seven_methylguanine
A modified RNA base in which guanine is methylated at the 7-position.
RSC:cb
A modified RNA base in which thymine is bound to the ribose ring.
sequence
SO:0001232
The free molecule is CHEBI:30832.
ribothymidine
A modified RNA base in which thymine is bound to the ribose ring.
RSC:cb
A modified RNA base in which methylhypoxanthine is bound to the ribose ring.
sequence
SO:0001233
methylinosine
A modified RNA base in which methylhypoxanthine is bound to the ribose ring.
RSC:cb
An attribute describing a feature that has either intra-genome or intracellular mobility.
http://en.wikipedia.org/wiki/Mobile
sequence
SO:0001234
mobile
An attribute describing a feature that has either intra-genome or intracellular mobility.
RSC:cb
http://en.wikipedia.org/wiki/Mobile
wiki
A region containing at least one unique origin of replication and a unique termination site.
http://en.wikipedia.org/wiki/Replicon_(genetics)
sequence
SO:0001235
replicon
A region containing at least one unique origin of replication and a unique termination site.
ISBN:0716719207
http://en.wikipedia.org/wiki/Replicon_(genetics)
wiki
A base is a sequence feature that corresponds to a single unit of a nucleotide polymer.
http://en.wikipedia.org/wiki/Nucleobase
sequence
SO:0001236
base
A base is a sequence feature that corresponds to a single unit of a nucleotide polymer.
SO:ke
http://en.wikipedia.org/wiki/Nucleobase
wiki
A sequence feature that corresponds to a single amino acid residue in a polypeptide.
http://en.wikipedia.org/wiki/Amino_acid
amino acid
sequence
SO:0001237
In the future this will probably cross-reference ChEBI.
amino_acid
A sequence feature that corresponds to a single amino acid residue in a polypeptide.
RSC:cb
http://en.wikipedia.org/wiki/Amino_acid
wiki
The transcription start site that is most frequently used for transcription of a gene.
major TSS
major transcription start site
sequence
SO:0001238
major_TSS
A transcription start site that is not the most frequently used for transcription of a gene.
minor TSS
sequence
SO:0001239
minor_TSS
The region of a gene from the 5'-most TSS to the 3'-most TSS.
SO:0000167
TSS region
sequence
SO:0001240
Merged into promoter (SO:0000167) on 11 Feb 2021 by Dave Sant. GREEKC had asked us to merge these terms to reduce redundancy. See GitHub Issue #528
TSS_region
true
The region of a gene from the 5'-most TSS to the 3'-most TSS.
BBOP:nw
A gene that has multiple possible transcription start sites.
encodes alternate transcription start sites
sequence
SO:0001241
encodes_alternate_transcription_start_sites
true
A part of an miRNA primary_transcript.
miRNA primary transcript region
sequence
SO:0001243
miRNA_primary_transcript_region
A part of an miRNA primary_transcript.
SO:ke
The 60-70 nucleotide region that remains after Drosha processing of the primary transcript and folds back upon itself to form a hairpin structure.
pre-miRNA
sequence
SO:0001244
pre_miRNA
The 60-70 nucleotide region that remains after Drosha processing of the primary transcript and folds back upon itself to form a hairpin structure.
SO:ke
The stem of the hairpin loop formed by folding of the pre-miRNA.
miRNA stem
sequence
SO:0001245
miRNA_stem
The stem of the hairpin loop formed by folding of the pre-miRNA.
SO:ke
The loop of the hairpin loop formed by folding of the pre-miRNA.
miRNA loop
sequence
SO:0001246
miRNA_loop
The loop of the hairpin loop formed by folding of the pre-miRNA.
SO:ke
An oligo composed of synthetic nucleotides.
synthetic oligo
sequence
SO:0001247
synthetic_oligo
An oligo composed of synthetic nucleotides.
SO:ke
A region of the genome of known length that is composed by ordering and aligning two or more different regions.
http://en.wikipedia.org/wiki/Genome_assembly#Genome_assembly
sequence
SO:0001248
assembly
A region of the genome of known length that is composed by ordering and aligning two or more different regions.
SO:ke
http://en.wikipedia.org/wiki/Genome_assembly#Genome_assembly
wiki
A fragment assembly is a genome assembly that orders overlapping fragments of the genome based on landmark sequences. The base pair distance between the landmarks is known, allowing additivity of lengths.
fragment assembly
physical map
sequence
SO:0001249
fragment_assembly
A fragment assembly is a genome assembly that orders overlapping fragments of the genome based on landmark sequences. The base pair distance between the landmarks is known, allowing additivity of lengths.
SO:ke
A fingerprint_map is a physical map composed of restriction fragments.
BACmap
FPC
FPCmap
fingerprint map
restriction map
sequence
SO:0001250
fingerprint_map
A fingerprint_map is a physical map composed of restriction fragments.
SO:ke
An STS map is a physical map organized by the unique STS landmarks.
STS map
sequence
SO:0001251
STS_map
An STS map is a physical map organized by the unique STS landmarks.
SO:ke
A radiation hybrid map is a physical map.
RH map
radiation hybrid map
sequence
SO:0001252
RH_map
A radiation hybrid map is a physical map.
SO:ke
A DNA fragment generated by sonication. Sonication is a technique used to shear DNA into smaller fragments.
sonicate fragment
sequence
SO:0001253
sonicate_fragment
A DNA fragment generated by sonication. Sonication is a technique used to shear DNA into smaller fragments.
SO:ke
A kind of chromosome variation where the chromosome complement is an exact multiple of the haploid number and is greater than the diploid number.
http://en.wikipedia.org/wiki/Polyploid
sequence
SO:0001254
polyploid
A kind of chromosome variation where the chromosome complement is an exact multiple of the haploid number and is greater than the diploid number.
SO:ke
http://en.wikipedia.org/wiki/Polyploid
wiki
A polyploid where the multiple chromosome set was derived from the same organism.
http://en.wikipedia.org/wiki/Autopolyploid
sequence
SO:0001255
autopolyploid
A polyploid where the multiple chromosome set was derived from the same organism.
SO:ke
http://en.wikipedia.org/wiki/Autopolyploid
wiki
A polyploid where the multiple chromosome set was derived from a different organism.
http://en.wikipedia.org/wiki/Allopolyploid
sequence
SO:0001256
allopolyploid
A polyploid where the multiple chromosome set was derived from a different organism.
SO:ke
http://en.wikipedia.org/wiki/Allopolyploid
wiki
The binding site (recognition site) of a homing endonuclease. The binding site is typically large.
homing endonuclease binding site
sequence
SO:0001257
homing_endonuclease_binding_site
The binding site (recognition site) of a homing endonuclease. The binding site is typically large.
SO:ke
A sequence element characteristic of some RNA polymerase II promoters with sequence ATTTGCAT that binds POU-domain transcription factors.
octamer motif
sequence
SO:0001258
Nature. 1986 Oct 16-22;323(6089):640-3.
octamer_motif
A sequence element characteristic of some RNA polymerase II promoters with sequence ATTTGCAT that binds POU-domain transcription factors.
GOC:dh
PMID:3095662
A chromosome originating in an apicoplast.
apicoplast chromosome
sequence
SO:0001259
apicoplast_chromosome
A chromosome originating in an apicoplast.
SO:xp
A collection of discontinuous sequences.
sequence collection
sequence
SO:0001260
sequence_collection
A collection of discontinuous sequences.
SO:ke
A continuous region of sequence composed of the overlapping of multiple sequence_features, which ultimately provides evidence for another sequence_feature.
overlapping feature set
sequence
SO:0001261
This feature was requested by Nicole, tracker id 1911479. It is required to gather evidence together for annotation. An example would be overlapping ESTs that support an mRNA.
overlapping_feature_set
A continuous region of sequence composed of the overlapping of multiple sequence_features, which ultimately provides evidence for another sequence_feature.
SO:ke
A continuous experimental result region extending the length of multiple overlapping ESTs.
overlapping EST set
sequence
SO:0001262
overlapping_EST_set
A continuous experimental result region extending the length of multiple overlapping ESTs.
SO:ke
A gene that encodes a non-coding RNA.
ncRNA gene
non-coding RNA gene
sequence
SO:0001263
ncRNA_gene
A noncoding RNA that guides the insertion or deletion of uridine residues in mitochondrial mRNAs. This may also refer to synthetic RNAs used to guide DNA editing using the CRISPR/Cas9 system.
gRNA gene
sequence
SO:0001264
gRNA_gene
A small noncoding RNA of approximately 22 nucleotides in length which may be involved in regulation of gene expression.
SO:0001270
miRNA gene
stRNA gene
stRNA_gene
sequence
SO:0001265
Moved from ncRNA_gene to sncRNA_gene 27 April 2021 to be more consistent with the organization of the ncRNA branch of SO. Requested by FlyBase, moved by Dave Sant. See GitHub Issue #514.
miRNA_gene
A gene encoding a small noncoding RNA that is generally found only in the cytoplasm.
scRNA gene
sequence
SO:0001266
scRNA_gene
A gene encoding a small noncoding RNA that participates in the processing or chemical modifications of many RNAs, including ribosomal RNAs and spliceosomal RNAs.
snoRNA gene
sequence
SO:0001267
Moved from ncRNA_gene to sncRNA_gene 27 April 2021 to be more consistent with the organization of the ncRNA branch of SO. Requested by FlyBase, moved by Dave Sant. See GitHub Issue #514. Added additional children of snoRNA on 18 Nov 2021 at the request of Steven Marygold. See GitHub Issue #519.
snoRNA_gene
A gene that encodes a small nuclear RNA.
small nuclear RNA gene
snRNA gene
sequence
SO:0001268
Moved from ncRNA_gene to sncRNA_gene 27 April 2021 to be more consistent with the organization of the ncRNA branch of SO. Requested by FlyBase, moved by Dave Sant. See GitHub Issue #514.
snRNA_gene
A gene that encodes a small nuclear RNA.
http://en.wikipedia.org/wiki/Small_nuclear_RNA
A gene that encodes a signal recognition particle (SRP) RNA.
SRP RNA gene
sequence
SO:0001269
SRP_RNA_gene
true
A bacterial RNA with both tRNA and mRNA like properties.
tmRNA gene
sequence
SO:0001271
Moved from ncRNA_gene to sncRNA_gene 27 April 2021 to be more consistent with the organization of the ncRNA branch of SO. Requested by FlyBase, moved by Dave Sant. See GitHub Issue #514.
tmRNA_gene
A noncoding RNA that binds to a specific amino acid to allow that amino acid to be used by the ribosome during translation of RNA.
tRNA gene
sequence
SO:0001272
Moved from ncRNA_gene to sncRNA_gene 27 April 2021 to be more consistent with the organization of the ncRNA branch of SO. Requested by FlyBase, moved by Dave Sant. See GitHub Issue #514.
tRNA_gene
A modified adenosine is an adenosine base feature that has been altered.
modified adenosine
sequence
SO:0001273
modified_adenosine
A modified adenosine is an adenosine base feature that has been altered.
SO:ke
A modified inosine is an inosine base feature that has been altered.
modified inosine
sequence
SO:0001274
modified_inosine
A modified inosine is an inosine base feature that has been altered.
SO:ke
A modified cytidine is a cytidine base feature which has been altered.
modified cytidine
sequence
SO:0001275
modified_cytidine
A modified cytidine is a cytidine base feature which has been altered.
SO:ke
A guanosine base that has been modified.
modified guanosine
sequence
SO:0001276
modified_guanosine
A uridine base that has been modified.
modified uridine
sequence
SO:0001277
modified_uridine
1-methylinosine is a modified inosine.
RNAMOD:018
1-methylinosine
m1I
one methylinosine
sequence
SO:0001278
one_methylinosine
1-methylinosine is a modified inosine.
http://library.med.utah.edu/RNAmods/
m1I
1,2'-O-dimethylinosine is a modified inosine.
RNAMOD:019
1,2'-O-dimethylinosine
m'Im
one two prime O dimethylinosine
sequence
SO:0001279
one_two_prime_O_dimethylinosine
1,2'-O-dimethylinosine is a modified inosine.
http://library.med.utah.edu/RNAmods/
m'Im
2'-O-methylinosine is a modified inosine.
RNAMOD:081
2'-O-methylinosine
Im
two prime O methylinosine
sequence
SO:0001280
two_prime_O_methylinosine
2'-O-methylinosine is a modified inosine.
http://library.med.utah.edu/RNAmods/
Im
3-methylcytidine is a modified cytidine.
RNAMOD:020
3-methylcytidine
m3C
three methylcytidine
sequence
SO:0001281
three_methylcytidine
3-methylcytidine is a modified cytidine.
http://library.med.utah.edu/RNAmods/
m3C
5-methylcytidine is a modified cytidine.
RNAMOD:021
5-methylcytidine
five methylcytidine
m5C
sequence
SO:0001282
five_methylcytidine
5-methylcytidine is a modified cytidine.
http://library.med.utah.edu/RNAmods/
m5C
2'-O-methylcytidine is a modified cytidine.
RNAMOD:022
2'-O-methylcytidine
Cm
two prime O methylcytidine
sequence
SO:0001283
two_prime_O_methylcytidine
2'-O-methylcytidine is a modified cytidine.
http://library.med.utah.edu/RNAmods/
Cm
2-thiocytidine is a modified cytidine.
RNAMOD:023
2-thiocytidine
s2C
two thiocytidine
sequence
SO:0001284
two_thiocytidine
2-thiocytidine is a modified cytidine.
http://library.med.utah.edu/RNAmods/
s2C
N4-acetylcytidine is a modified cytidine.
RNAMOD:024
N4 acetylcytidine
N4-acetylcytidine
ac4C
sequence
SO:0001285
N4_acetylcytidine
N4-acetylcytidine is a modified cytidine.
http://library.med.utah.edu/RNAmods/
ac4C
5-formylcytidine is a modified cytidine.
RNAMOD:025
5-formylcytidine
f5C
five formylcytidine
sequence
SO:0001286
five_formylcytidine
5-formylcytidine is a modified cytidine.
http://library.med.utah.edu/RNAmods/
f5C
5,2'-O-dimethylcytidine is a modified cytidine.
RNAMOD:026
5,2'-O-dimethylcytidine
five two prime O dimethylcytidine
m5Cm
sequence
SO:0001287
five_two_prime_O_dimethylcytidine
5,2'-O-dimethylcytidine is a modified cytidine.
http://library.med.utah.edu/RNAmods/
m5Cm
N4-acetyl-2'-O-methylcytidine is a modified cytidine.
RNAMOD:027
N4 acetyl 2 prime O methylcytidine
N4-acetyl-2'-O-methylcytidine
ac4Cm
sequence
SO:0001288
N4_acetyl_2_prime_O_methylcytidine
N4-acetyl-2'-O-methylcytidine is a modified cytidine.
http://library.med.utah.edu/RNAmods/
ac4Cm
Lysidine is a modified cytidine.
RNAMOD:028
http://en.wikipedia.org/wiki/Lysidine
k2C
sequence
SO:0001289
lysidine
Lysidine is a modified cytidine.
http://library.med.utah.edu/RNAmods/
http://en.wikipedia.org/wiki/Lysidine
wiki
k2C
N4-methylcytidine is a modified cytidine.
RNAMOD:082
N4 methylcytidine
N4-methylcytidine
m4C
sequence
SO:0001290
N4_methylcytidine
N4-methylcytidine is a modified cytidine.
http://library.med.utah.edu/RNAmods/
m4C
N4,2'-O-dimethylcytidine is a modified cytidine.
RNAMOD:083
N4 2 prime O dimethylcytidine
N4,2'-O-dimethylcytidine
m4Cm
sequence
SO:0001291
N4_2_prime_O_dimethylcytidine
N4,2'-O-dimethylcytidine is a modified cytidine.
http://library.med.utah.edu/RNAmods/
m4Cm
5-hydroxymethylcytidine is a modified cytidine.
RNAMOD:084
5-hydroxymethylcytidine
five hydroxymethylcytidine
hm5C
sequence
SO:0001292
five_hydroxymethylcytidine
5-hydroxymethylcytidine is a modified cytidine.
http://library.med.utah.edu/RNAmods/
hm5C
5-formyl-2'-O-methylcytidine is a modified cytidine.
RNAMOD:095
5-formyl-2'-O-methylcytidine
f5Cm
five formyl two prime O methylcytidine
sequence
SO:0001293
five_formyl_two_prime_O_methylcytidine
5-formyl-2'-O-methylcytidine is a modified cytidine.
http://library.med.utah.edu/RNAmods/
f5Cm
N4,N4,2'-O-trimethylcytidine is a modified cytidine.
RNAMOD:107
N4,N4,2'-O-trimethylcytidine
m42Cm
sequence
SO:0001294
N4_N4_2_prime_O_trimethylcytidine
N4,N4,2'-O-trimethylcytidine is a modified cytidine.
http://library.med.utah.edu/RNAmods/
m42Cm
1_methyladenosine is a modified adenosine.
RNAMOD:001
1-methyladenosine
m1A
one methyladenosine
sequence
SO:0001295
one_methyladenosine
1_methyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
m1A
2_methyladenosine is a modified adenosine.
RNAMOD:002
2-methyladenosine
m2A
two methyladenosine
sequence
SO:0001296
two_methyladenosine
2_methyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
m2A
N6_methyladenosine is a modified adenosine.
RNAMOD:003
N6 methyladenosine
N6-methyladenosine
m6A
sequence
SO:0001297
N6_methyladenosine
N6_methyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
m6A
2prime_O_methyladenosine is a modified adenosine.
RNAMOD:004
2'-O-methyladenosine
Am
two prime O methyladenosine
sequence
SO:0001298
two_prime_O_methyladenosine
2prime_O_methyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
Am
2_methylthio_N6_methyladenosine is a modified adenosine.
RNAMOD:005
2-methylthio-N6-methyladenosine
ms2m6A
two methylthio N6 methyladenosine
sequence
SO:0001299
two_methylthio_N6_methyladenosine
2_methylthio_N6_methyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
ms2m6A
N6_isopentenyladenosine is a modified adenosine.
RNAMOD:006
N6 isopentenyladenosine
N6-isopentenyladenosine
i6A
sequence
SO:0001300
N6_isopentenyladenosine
N6_isopentenyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
i6A
2_methylthio_N6_isopentenyladenosine is a modified adenosine.
RNAMOD:007
2-methylthio-N6-isopentenyladenosine
ms2i6A
two methylthio N6 isopentenyladenosine
sequence
SO:0001301
two_methylthio_N6_isopentenyladenosine
2_methylthio_N6_isopentenyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
ms2i6A
N6_cis_hydroxyisopentenyl_adenosine is a modified adenosine.
RNAMOD:008
N6 cis hydroxyisopentenyl adenosine
N6-(cis-hydroxyisopentenyl)adenosine
io6A
sequence
SO:0001302
N6_cis_hydroxyisopentenyl_adenosine
N6_cis_hydroxyisopentenyl_adenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
io6A
2_methylthio_N6_cis_hydroxyisopentenyl_adenosine is a modified adenosine.
RNAMOD:009
2-methylthio-N6-(cis-hydroxyisopentenyl)adenosine
ms2io6A
two methylthio N6 cis hydroxyisopentenyl adenosine
sequence
SO:0001303
two_methylthio_N6_cis_hydroxyisopentenyl_adenosine
2_methylthio_N6_cis_hydroxyisopentenyl_adenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
ms2io6A
N6_glycinylcarbamoyladenosine is a modified adenosine.
RNAMOD:010
N6 glycinylcarbamoyladenosine
N6-glycinylcarbamoyladenosine
g6A
sequence
SO:0001304
N6_glycinylcarbamoyladenosine
N6_glycinylcarbamoyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
g6A
N6_threonylcarbamoyladenosine is a modified adenosine.
RNAMOD:011
N6 threonylcarbamoyladenosine
N6-threonylcarbamoyladenosine
t6A
sequence
SO:0001305
N6_threonylcarbamoyladenosine
N6_threonylcarbamoyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
t6A
2_methylthio_N6_threonyl_carbamoyladenosine is a modified adenosine.
RNAMOD:012
2-methylthio-N6-threonyl carbamoyladenosine
ms2t6A
two methylthio N6 threonyl carbamoyladenosine
sequence
SO:0001306
two_methylthio_N6_threonyl_carbamoyladenosine
2_methylthio_N6_threonyl_carbamoyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
ms2t6A
N6_methyl_N6_threonylcarbamoyladenosine is a modified adenosine.
RNAMOD:013
N6 methyl N6 threonylcarbamoyladenosine
N6-methyl-N6-threonylcarbamoyladenosine
m6t6A
sequence
SO:0001307
N6_methyl_N6_threonylcarbamoyladenosine
N6_methyl_N6_threonylcarbamoyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
m6t6A
N6_hydroxynorvalylcarbamoyladenosine is a modified adenosine.
RNAMOD:014
N6 hydroxynorvalylcarbamoyladenosine
N6-hydroxynorvalylcarbamoyladenosine
hn6A
sequence
SO:0001308
N6_hydroxynorvalylcarbamoyladenosine
N6_hydroxynorvalylcarbamoyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
hn6A
2_methylthio_N6_hydroxynorvalyl_carbamoyladenosine is a modified adenosine.
RNAMOD:015
2-methylthio-N6-hydroxynorvalyl carbamoyladenosine
ms2hn6A
two methylthio N6 hydroxynorvalyl carbamoyladenosine
sequence
SO:0001309
two_methylthio_N6_hydroxynorvalyl_carbamoyladenosine
2_methylthio_N6_hydroxynorvalyl_carbamoyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
ms2hn6A
2prime_O_ribosyladenosine_phosphate is a modified adenosine.
RNAMOD:016
2'-O-ribosyladenosine (phosphate)
Ar(p)
two prime O ribosyladenosine phosphate
sequence
SO:0001310
two_prime_O_ribosyladenosine_phosphate
2prime_O_ribosyladenosine_phosphate is a modified adenosine.
http://library.med.utah.edu/RNAmods/
Ar(p)
N6_N6_dimethyladenosine is a modified adenosine.
RNAMOD:080
N6,N6-dimethyladenosine
m62A
sequence
SO:0001311
N6_N6_dimethyladenosine
N6_N6_dimethyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
m62A
N6_2prime_O_dimethyladenosine is a modified adenosine.
RNAMOD:088
N6 2 prime O dimethyladenosine
N6,2'-O-dimethyladenosine
m6Am
sequence
SO:0001312
N6_2_prime_O_dimethyladenosine
N6_2prime_O_dimethyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
m6Am
N6_N6_2prime_O_trimethyladenosine is a modified adenosine.
RNAMOD:089
N6,N6,2'-O-trimethyladenosine
m62Am
sequence
SO:0001313
N6_N6_2_prime_O_trimethyladenosine
N6_N6_2prime_O_trimethyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
m62Am
1,2'-O-dimethyladenosine is a modified adenosine.
RNAMOD:097
1,2'-O-dimethyladenosine
m1Am
one two prime O dimethyladenosine
sequence
SO:0001314
one_two_prime_O_dimethyladenosine
1,2'-O-dimethyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
m1Am
N6_acetyladenosine is a modified adenosine.
RNAMOD:102
N6 acetyladenosine
N6-acetyladenosine
ac6A
sequence
SO:0001315
N6_acetyladenosine
N6_acetyladenosine is a modified adenosine.
http://library.med.utah.edu/RNAmods/
ac6A
7-deazaguanosine is a modified guanosine.
seven deazaguanosine
sequence
7-deazaguanosine
SO:0001316
seven_deazaguanosine
7-deazaguanosine is a modified guanosine.
http://library.med.utah.edu/RNAmods/
Queuosine is a modified 7-deazaguanosine.
RNAMOD:043
http://en.wikipedia.org/wiki/Queuosine
Q
sequence
SO:0001317
queuosine
Queuosine is a modified 7-deazaguanosine.
http://library.med.utah.edu/RNAmods/
http://en.wikipedia.org/wiki/Queuosine
wiki
Q
Epoxyqueuosine is a modified 7-deazaguanosine.
RNAMOD:044
eQ
sequence
SO:0001318
epoxyqueuosine
Epoxyqueuosine is a modified 7-deazaguanosine.
http://library.med.utah.edu/RNAmods/
eQ
Galactosyl_queuosine is a modified 7-deazaguanosine.
RNAMOD:045
galQ
galactosyl queuosine
galactosyl-queuosine
sequence
SO:0001319
galactosyl_queuosine
Galactosyl_queuosine is a modified 7-deazaguanosine.
http://library.med.utah.edu/RNAmods/
galQ
Mannosyl_queuosine is a modified 7-deazaguanosine.
RNAMOD:046
manQ
mannosyl queuosine
mannosyl-queuosine
sequence
SO:0001320
mannosyl_queuosine
Mannosyl_queuosine is a modified 7-deazaguanosine.
http://library.med.utah.edu/RNAmods/
manQ
7_cyano_7_deazaguanosine is a modified 7-deazaguanosine.
RNAMOD:047
7-cyano-7-deazaguanosine
preQ0
seven cyano seven deazaguanosine
sequence
SO:0001321
seven_cyano_seven_deazaguanosine
7_cyano_7_deazaguanosine is a modified 7-deazaguanosine.
http://library.med.utah.edu/RNAmods/
preQ0
7_aminomethyl_7_deazaguanosine is a modified 7-deazaguanosine.
RNAMOD:048
7-aminomethyl-7-deazaguanosine
preQ1
seven aminomethyl seven deazaguanosine
sequence
SO:0001322
seven_aminomethyl_seven_deazaguanosine
7_aminomethyl_7_deazaguanosine is a modified 7-deazaguanosine.
http://library.med.utah.edu/RNAmods/
preQ1
Archaeosine is a modified 7-deazaguanosine.
RNAMOD:049
G+
sequence
SO:0001323
archaeosine
Archaeosine is a modified 7-deazaguanosine.
http://library.med.utah.edu/RNAmods/
G+
1_methylguanosine is a modified guanosine base feature.
RNAMOD:029
1-methylguanosine
m1G
one methylguanosine
sequence
SO:0001324
one_methylguanosine
1_methylguanosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
m1G
N2_methylguanosine is a modified guanosine base feature.
RNAMOD:030
N2 methylguanosine
N2-methylguanosine
m2G
sequence
SO:0001325
N2_methylguanosine
N2_methylguanosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
m2G
7_methylguanosine is a modified guanosine base feature.
RNAMOD:031
7-methylguanosine
m7G
seven methylguanosine
sequence
SO:0001326
seven_methylguanosine
7_methylguanosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
m7G
2prime_O_methylguanosine is a modified guanosine base feature.
RNAMOD:032
2'-O-methylguanosine
Gm
two prime O methylguanosine
sequence
SO:0001327
two_prime_O_methylguanosine
2prime_O_methylguanosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
Gm
N2_N2_dimethylguanosine is a modified guanosine base feature.
RNAMOD:033
N2,N2-dimethylguanosine
m22G
sequence
SO:0001328
N2_N2_dimethylguanosine
N2_N2_dimethylguanosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
m22G
N2_2prime_O_dimethylguanosine is a modified guanosine base feature.
RNAMOD:034
N2 2 prime O dimethylguanosine
N2,2'-O-dimethylguanosine
m2Gm
sequence
SO:0001329
N2_2_prime_O_dimethylguanosine
N2_2prime_O_dimethylguanosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
m2Gm
N2_N2_2prime_O_trimethylguanosine is a modified guanosine base feature.
RNAMOD:035
N2,N2,2'-O-trimethylguanosine
m22Gm
sequence
SO:0001330
N2_N2_2_prime_O_trimethylguanosine
N2_N2_2prime_O_trimethylguanosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
m22Gm
2prime_O_ribosylguanosine_phosphate is a modified guanosine base feature.
RNAMOD:036
2'-O-ribosylguanosine (phosphate)
Gr(p)
two prime O ribosylguanosine phosphate
sequence
SO:0001331
two_prime_O_ribosylguanosine_phosphate
2prime_O_ribosylguanosine_phosphate is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
Gr(p)
Wybutosine is a modified guanosine base feature.
RNAMOD:037
yW
sequence
SO:0001332
wybutosine
Wybutosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
yW
Peroxywybutosine is a modified guanosine base feature.
RNAMOD:038
o2yW
sequence
SO:0001333
peroxywybutosine
Peroxywybutosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
o2yW
Hydroxywybutosine is a modified guanosine base feature.
RNAMOD:039
OHyW
sequence
SO:0001334
hydroxywybutosine
Hydroxywybutosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
OHyW
Undermodified_hydroxywybutosine is a modified guanosine base feature.
RNAMOD:040
OHyW*
undermodified hydroxywybutosine
sequence
SO:0001335
undermodified_hydroxywybutosine
Undermodified_hydroxywybutosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
OHyW*
Wyosine is a modified guanosine base feature.
RNAMOD:041
imG
sequence
SO:0001336
wyosine
Wyosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
imG
Methylwyosine is a modified guanosine base feature.
RNAMOD:042
mimG
sequence
SO:0001337
methylwyosine
Methylwyosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
mimG
N2_7_dimethylguanosine is a modified guanosine base feature.
RNAMOD:090
N2 7 dimethylguanosine
N2,7-dimethylguanosine
m2,7G
sequence
SO:0001338
N2_7_dimethylguanosine
N2_7_dimethylguanosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
m2,7G
N2_N2_7_trimethylguanosine is a modified guanosine base feature.
RNAMOD:091
N2,N2,7-trimethylguanosine
m2,2,7G
sequence
SO:0001339
N2_N2_7_trimethylguanosine
N2_N2_7_trimethylguanosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
m2,2,7G
1_2prime_O_dimethylguanosine is a modified guanosine base feature.
RNAMOD:096
1,2'-O-dimethylguanosine
m1Gm
one two prime O dimethylguanosine
sequence
SO:0001340
one_two_prime_O_dimethylguanosine
1_2prime_O_dimethylguanosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
m1Gm
4_demethylwyosine is a modified guanosine base feature.
RNAMOD:100
4-demethylwyosine
four demethylwyosine
imG-14
sequence
SO:0001341
four_demethylwyosine
4_demethylwyosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
imG-14
Isowyosine is a modified guanosine base feature.
RNAMOD:101
imG2
sequence
SO:0001342
isowyosine
Isowyosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
imG2
N2_7_2prime_O_trimethylguanosine is a modified guanosine base feature.
RNAMOD:106
N2 7 2prirme O trimethylguanosine
N2,7,2'-O-trimethylguanosine
m2,7Gm
sequence
SO:0001343
N2_7_2prirme_O_trimethylguanosine
N2_7_2prime_O_trimethylguanosine is a modified guanosine base feature.
http://library.med.utah.edu/RNAmods/
m2,7Gm
5_methyluridine is a modified uridine base feature.
RNAMOD:052
http://en.wikipedia.org/wiki/5-methyluridine
5-methyluridine
five methyluridine
m5U
sequence
SO:0001344
five_methyluridine
5_methyluridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
http://en.wikipedia.org/wiki/5-methyluridine
wiki
m5U
2prime_O_methyluridine is a modified uridine base feature.
RNAMOD:053
2'-O-methyluridine
Um
two prime O methyluridine
sequence
SO:0001345
two_prime_O_methyluridine
2prime_O_methyluridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
Um
5_2_prime_O_dimethyluridine is a modified uridine base feature.
RNAMOD:054
5,2'-O-dimethyluridine
five two prime O dimethyluridine
m5Um
sequence
SO:0001346
five_two_prime_O_dimethyluridine
5_2_prime_O_dimethyluridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
m5Um
1_methylpseudouridine is a modified uridine base feature.
RNAMOD:055
1-methylpseudouridine
m1Y
one methylpseudouridine
sequence
SO:0001347
one_methylpseudouridine
1_methylpseudouridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
m1Y
2prime_O_methylpseudouridine is a modified uridine base feature.
RNAMOD:056
2'-O-methylpseudouridine
Ym
two prime O methylpseudouridine
sequence
SO:0001348
two_prime_O_methylpseudouridine
2prime_O_methylpseudouridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
Ym
2_thiouridine is a modified uridine base feature.
RNAMOD:057
2-thiouridine
s2U
two thiouridine
sequence
SO:0001349
two_thiouridine
2_thiouridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
s2U
4_thiouridine is a modified uridine base feature.
RNAMOD:058
4-thiouridine
four thiouridine
s4U
sequence
SO:0001350
four_thiouridine
4_thiouridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
s4U
5_methyl_2_thiouridine is a modified uridine base feature.
RNAMOD:059
5-methyl-2-thiouridine
five methyl 2 thiouridine
m5s2U
sequence
SO:0001351
five_methyl_2_thiouridine
5_methyl_2_thiouridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
m5s2U
2_thio_2prime_O_methyluridine is a modified uridine base feature.
RNAMOD:060
2-thio-2'-O-methyluridine
s2Um
two thio two prime O methyluridine
sequence
SO:0001352
two_thio_two_prime_O_methyluridine
2_thio_2prime_O_methyluridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
s2Um
3_3_amino_3_carboxypropyl_uridine is a modified uridine base feature.
RNAMOD:061
3-(3-amino-3-carboxypropyl)uridine
acp3U
sequence
SO:0001353
three_three_amino_three_carboxypropyl_uridine
3_3_amino_3_carboxypropyl_uridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
acp3U
5_hydroxyuridine is a modified uridine base feature.
RNAMOD:060
5-hydroxyuridine
five hydroxyuridine
ho5U
sequence
SO:0001354
five_hydroxyuridine
5_hydroxyuridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
ho5U
5_methoxyuridine is a modified uridine base feature.
RNAMOD:063
5-methoxyuridine
five methoxyuridine
mo5U
sequence
SO:0001355
five_methoxyuridine
5_methoxyuridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
mo5U
Uridine_5_oxyacetic_acid is a modified uridine base feature.
RNAMOD:064
cmo5U
uridine 5-oxyacetic acid
uridine five oxyacetic acid
sequence
SO:0001356
uridine_five_oxyacetic_acid
Uridine_5_oxyacetic_acid is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
cmo5U
Uridine_5_oxyacetic_acid_methyl_ester is a modified uridine base feature.
RNAMOD:065
mcmo5U
uridine 5-oxyacetic acid methyl ester
uridine five oxyacetic acid methyl ester
sequence
SO:0001357
uridine_five_oxyacetic_acid_methyl_ester
Uridine_5_oxyacetic_acid_methyl_ester is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
mcmo5U
5_carboxyhydroxymethyl_uridine is a modified uridine base feature.
RNAMOD:066
5-(carboxyhydroxymethyl)uridine
chm5U
five carboxyhydroxymethyl uridine
sequence
SO:0001358
five_carboxyhydroxymethyl_uridine
5_carboxyhydroxymethyl_uridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
chm5U
5_carboxyhydroxymethyl_uridine_methyl_ester is a modified uridine base feature.
RNAMOD:067
5-(carboxyhydroxymethyl)uridine methyl ester
five carboxyhydroxymethyl uridine methyl ester
mchm5U
sequence
SO:0001359
five_carboxyhydroxymethyl_uridine_methyl_ester
5_carboxyhydroxymethyl_uridine_methyl_ester is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
mchm5U
Five_methoxycarbonylmethyluridine is a modified uridine base feature.
RNAMOD:068
5-methoxycarbonylmethyluridine
five methoxycarbonylmethyluridine
mcm5U
sequence
SO:0001360
five_methoxycarbonylmethyluridine
Five_methoxycarbonylmethyluridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
mcm5U
Five_methoxycarbonylmethyl_2_prime_O_methyluridine is a modified uridine base feature.
RNAMOD:069
5-methoxycarbonylmethyl-2'-O-methyluridine
five methoxycarbonylmethyl two prime O methyluridine
mcm5Um
sequence
SO:0001361
five_methoxycarbonylmethyl_two_prime_O_methyluridine
Five_methoxycarbonylmethyl_2_prime_O_methyluridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
mcm5Um
5_methoxycarbonylmethyl_2_thiouridine is a modified uridine base feature.
RNAMOD:070
5-methoxycarbonylmethyl-2-thiouridine
five methoxycarbonylmethyl two thiouridine
mcm5s2U
sequence
SO:0001362
five_methoxycarbonylmethyl_two_thiouridine
5_methoxycarbonylmethyl_2_thiouridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
mcm5s2U
5_aminomethyl_2_thiouridine is a modified uridine base feature.
RNAMOD:071
5-aminomethyl-2-thiouridine
five aminomethyl two thiouridine
nm5s2U
sequence
SO:0001363
five_aminomethyl_two_thiouridine
5_aminomethyl_2_thiouridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
nm5s2U
5_methylaminomethyluridine is a modified uridine base feature.
RNAMOD:072
5-methylaminomethyluridine
five methylaminomethyluridine
mnm5U
sequence
SO:0001364
five_methylaminomethyluridine
5_methylaminomethyluridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
mnm5U
5_methylaminomethyl_2_thiouridine is a modified uridine base feature.
RNAMOD:073
5-methylaminomethyl-2-thiouridine
five methylaminomethyl two thiouridine
mnm5s2U
sequence
SO:0001365
five_methylaminomethyl_two_thiouridine
5_methylaminomethyl_2_thiouridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
mnm5s2U
5_methylaminomethyl_2_selenouridine is a modified uridine base feature.
RNAMOD:074
5-methylaminomethyl-2-selenouridine
five methylaminomethyl two selenouridine
mnm5se2U
sequence
SO:0001366
five_methylaminomethyl_two_selenouridine
5_methylaminomethyl_2_selenouridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
mnm5se2U
5_carbamoylmethyluridine is a modified uridine base feature.
RNAMOD:075
5-carbamoylmethyluridine
five carbamoylmethyluridine
ncm5U
sequence
SO:0001367
five_carbamoylmethyluridine
5_carbamoylmethyluridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
ncm5U
5_carbamoylmethyl_2_prime_O_methyluridine is a modified uridine base feature.
RNAMOD:076
5-carbamoylmethyl-2'-O-methyluridine
five carbamoylmethyl two prime O methyluridine
ncm5Um
sequence
SO:0001368
five_carbamoylmethyl_two_prime_O_methyluridine
5_carbamoylmethyl_2_prime_O_methyluridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
ncm5Um
5_carboxymethylaminomethyluridine is a modified uridine base feature.
RNAMOD:077
5-carboxymethylaminomethyluridine
cmnm5U
five carboxymethylaminomethyluridine
sequence
SO:0001369
five_carboxymethylaminomethyluridine
5_carboxymethylaminomethyluridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
cmnm5U
5_carboxymethylaminomethyl_2_prime_O_methyluridine is a modified uridine base feature.
RNAMOD:078
5-carboxymethylaminomethyl-2'-O-methyluridine
cmnm5Um
five carboxymethylaminomethyl two prime O methyluridine
sequence
SO:0001370
five_carboxymethylaminomethyl_two_prime_O_methyluridine
5_carboxymethylaminomethyl_2_prime_O_methyluridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
cmnm5Um
5_carboxymethylaminomethyl_2_thiouridine is a modified uridine base feature.
RNAMOD:079
5-carboxymethylaminomethyl-2-thiouridine
cmnm5s2U
five carboxymethylaminomethyl two thiouridine
sequence
SO:0001371
five_carboxymethylaminomethyl_two_thiouridine
5_carboxymethylaminomethyl_2_thiouridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
cmnm5s2U
3_methyluridine is a modified uridine base feature.
RNAMOD:085
3-methyluridine
m3U
three methyluridine
sequence
SO:0001372
three_methyluridine
3_methyluridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
m3U
1_methyl_3_3_amino_3_carboxypropyl_pseudouridine is a modified uridine base feature.
RNAMOD:086
1-methyl-3-(3-amino-3-carboxypropyl)pseudouridine
m1acp3Y
sequence
SO:0001373
one_methyl_three_three_amino_three_carboxypropyl_pseudouridine
1_methyl_3_3_amino_3_carboxypropyl_pseudouridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
m1acp3Y
5_carboxymethyluridine is a modified uridine base feature.
RNAMOD:087
5-carboxymethyluridine
cm5U
five carboxymethyluridine
sequence
SO:0001374
five_carboxymethyluridine
5_carboxymethyluridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
cm5U
3_2prime_O_dimethyluridine is a modified uridine base feature.
RNAMOD:092
3,2'-O-dimethyluridine
m3Um
three two prime O dimethyluridine
sequence
SO:0001375
three_two_prime_O_dimethyluridine
3_2prime_O_dimethyluridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
m3Um
5_methyldihydrouridine is a modified uridine base feature.
RNAMOD:093
5-methyldihydrouridine
five methyldihydrouridine
m5D
sequence
SO:0001376
five_methyldihydrouridine
5_methyldihydrouridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
m5D
3_methylpseudouridine is a modified uridine base feature.
RNAMOD:094
3-methylpseudouridine
m3Y
three methylpseudouridine
sequence
SO:0001377
three_methylpseudouridine
3_methylpseudouridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
m3Y
5_taurinomethyluridine is a modified uridine base feature.
RNAMOD:098
5-taurinomethyluridine
five taurinomethyluridine
tm5U
sequence
SO:0001378
five_taurinomethyluridine
5_taurinomethyluridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
tm5U
5_taurinomethyl_2_thiouridine is a modified uridine base feature.
RNAMOD:099
5-taurinomethyl-2-thiouridine
five taurinomethyl two thiouridine
tm5s2U
sequence
SO:0001379
five_taurinomethyl_two_thiouridine
5_taurinomethyl_2_thiouridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
tm5s2U
5_isopentenylaminomethyl_uridine is a modified uridine base feature.
RNAMOD:103
5-(isopentenylaminomethyl)uridine
five isopentenylaminomethyl uridine
inm5U
sequence
SO:0001380
five_isopentenylaminomethyl_uridine
5_isopentenylaminomethyl_uridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
inm5U
5_isopentenylaminomethyl_2_thiouridine is a modified uridine base feature.
RNAMOD:104
5-(isopentenylaminomethyl)-2-thiouridine
five isopentenylaminomethyl two thiouridine
inm5s2U
sequence
SO:0001381
five_isopentenylaminomethyl_two_thiouridine
5_isopentenylaminomethyl_2_thiouridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
inm5s2U
5_isopentenylaminomethyl_2prime_O_methyluridine is a modified uridine base feature.
RNAMOD:105
5-(isopentenylaminomethyl)-2'-O-methyluridine
five isopentenylaminomethyl two prime O methyluridine
inm5Um
sequence
SO:0001382
five_isopentenylaminomethyl_two_prime_O_methyluridine
5_isopentenylaminomethyl_2prime_O_methyluridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/
inm5Um
A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues of a histone.
histone binding site
sequence
SO:0001383
histone_binding_site
A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues of a histone.
SO:ke
A portion of a CDS that is not the complete CDS.
CDS fragment
incomplete CDS
sequence
SO:0001384
CDS_fragment
A post translationally modified amino acid feature.
modified amino acid feature
sequence
SO:0001385
modified_amino_acid_feature
A post translationally modified amino acid feature.
SO:ke
A post translationally modified glycine amino acid feature.
MOD:00908
ModGly
modified glycine
sequence
SO:0001386
modified_glycine
A post translationally modified glycine amino acid feature.
SO:ke
ModGly
A post translationally modified alanine amino acid feature.
MOD:00901
ModAla
modified L alanine
modified L-alanine
sequence
SO:0001387
modified_L_alanine
A post translationally modified alanine amino acid feature.
SO:ke
ModAla
A post translationally modified asparagine amino acid feature.
MOD:00903
ModAsn
modified L asparagine
modified L-asparagine
sequence
SO:0001388
modified_L_asparagine
A post translationally modified asparagine amino acid feature.
SO:ke
ModAsn
A post translationally modified aspartic acid amino acid feature.
MOD:00904
ModAsp
modified L aspartic acid
modified L-aspartic acid
sequence
SO:0001389
modified_L_aspartic_acid
A post translationally modified aspartic acid amino acid feature.
SO:ke
ModAsp
A post translationally modified cysteine amino acid feature.
MOD:00905
ModCys
modified L cysteine
modified L-cysteine
sequence
SO:0001390
modified_L_cysteine
A post translationally modified cysteine amino acid feature.
SO:ke
ModCys
A post translationally modified glutamic acid amino acid feature.
MOD:00906
ModGlu
modified L glutamic acid
modified L-glutamic acid
sequence
SO:0001391
modified_L_glutamic_acid
ModGlu
A post translationally modified threonine amino acid feature.
MOD:00917
ModThr
modified L threonine
modified L-threonine
sequence
SO:0001392
modified_L_threonine
A post translationally modified threonine amino acid feature.
SO:ke
ModThr
A post translationally modified tryptophan amino acid feature.
MOD:00918
ModTrp
modified L tryptophan
modified L-tryptophan
sequence
SO:0001393
modified_L_tryptophan
A post translationally modified tryptophan amino acid feature.
SO:ke
ModTrp
A post translationally modified glutamine amino acid feature.
MOD:00907
ModGln
modified L glutamine
modified L-glutamine
sequence
SO:0001394
modified_L_glutamine
A post translationally modified glutamine amino acid feature.
SO:ke
A post translationally modified methionine amino acid feature.
MOD:00913
ModMet
modified L methionine
modified L-methionine
sequence
SO:0001395
modified_L_methionine
A post translationally modified methionine amino acid feature.
SO:ke
ModMet
A post translationally modified isoleucine amino acid feature.
MOD:00910
ModIle
modified L isoleucine
modified L-isoleucine
sequence
SO:0001396
modified_L_isoleucine
A post translationally modified isoleucine amino acid feature.
SO:ke
ModIle
A post translationally modified phenylalanine amino acid feature.
MOD:00914
ModPhe
modified L phenylalanine
modified L-phenylalanine
sequence
SO:0001397
modified_L_phenylalanine
A post translationally modified phenylalanine amino acid feature.
SO:ke
ModPhe
A post translationally modified histidine amino acid feature.
MOD:00909
ModHis
modified L histidine
modified L-histidine
sequence
SO:0001398
modified_L_histidine
A post translationally modified histidine amino acid feature.
SO:ke
A post translationally modified serine amino acid feature.
MOD:00916
ModSer
modified L serine
modified L-serine
sequence
SO:0001399
modified_L_serine
A post translationally modified serine amino acid feature.
SO:ke
MOD:00916
http://www.psidev.info/index.php?q=node/104
ModSer
A post translationally modified lysine amino acid feature.
MOD:00912
ModLys
modified L lysine
modified L-lysine
sequence
SO:0001400
modified_L_lysine
A post translationally modified lysine amino acid feature.
SO:ke
ModLys
A post translationally modified leucine amino acid feature.
MOD:00911
ModLeu
modified L leucine
modified L-leucine
sequence
SO:0001401
modified_L_leucine
A post translationally modified leucine amino acid feature.
SO:ke
ModLeu
A post translationally modified selenocysteine amino acid feature.
MOD:01158
modified L selenocysteine
modified L-selenocysteine
sequence
SO:0001402
modified_L_selenocysteine
A post translationally modified selenocysteine amino acid feature.
SO:ke
A post translationally modified valine amino acid feature.
MOD:00920
ModVal
modified L valine
modified L-valine
sequence
SO:0001403
modified_L_valine
A post translationally modified valine amino acid feature.
SO:ke
ModVal
A post translationally modified proline amino acid feature.
MOD:00915
ModPro
modified L proline
modified L-proline
sequence
SO:0001404
modified_L_proline
A post translationally modified proline amino acid feature.
SO:ke
ModPro
A post translationally modified tyrosine amino acid feature.
MOD:00919
ModTyr
modified L tyrosine
modified L-tyrosine
sequence
SO:0001405
modified_L_tyrosine
A post translationally modified tyrosine amino acid feature.
SO:ke
ModTyr
A post translationally modified arginine amino acid feature.
MOD:00902
ModArg
modified L arginine
modified L-arginine
sequence
SO:0001406
modified_L_arginine
A post translationally modified arginine amino acid feature.
SO:ke
ModArg
An attribute describing the nature of a proteinaceous polymer, whereby the amino acid units are joined by peptide bonds.
sequence
SO:0001407
peptidyl
An attribute describing the nature of a proteinaceous polymer, whereby the amino acid units are joined by peptide bonds.
SO:ke
The C-terminal residues of a polypeptide which are exchanged for a GPI-anchor.
cleaved for gpi anchor region
sequence
SO:0001408
cleaved_for_gpi_anchor_region
The C-terminal residues of a polypeptide which are exchanged for a GPI-anchor.
EBI:rh
A region which is intended for use in an experiment.
biomaterial region
sequence
SO:0001409
biomaterial_region
A region which is intended for use in an experiment.
SO:cb
A region which is the result of some arbitrary experimental procedure. The procedure may be carried out with biological material or inside a computer.
experimental output artefact
experimental_output_artefact
sequence
analysis feature
SO:0001410
experimental_feature
A region which is the result of some arbitrary experimental procedure. The procedure may be carried out with biological material or inside a computer.
SO:cb
A region defined by its disposition to be involved in a biological process.
INSDC_misc_feature
INSDC_note:biological_region
biological region
sequence
SO:0001411
biological_region
A region defined by its disposition to be involved in a biological process.
SO:cb
A DNA region within which self-interaction occurs more often than expected by chance because of DNA-looping.
topologically defined region
sequence
SO:0001412
topologically_defined_region
A DNA region within which self-interaction occurs more often than expected by chance because of DNA-looping.
PMID:32782014
SO:cb
The point within a chromosome where a translocation begins or ends.
translocation breakpoint
sequence
SO:0001413
translocation_breakpoint
The point within a chromosome where a translocation begins or ends.
SO:cb
The point within a chromosome where an insertion begins or ends.
insertion breakpoint
sequence
SO:0001414
insertion_breakpoint
The point within a chromosome where an insertion begins or ends.
SO:cb
The point within a chromosome where a deletion begins or ends.
deletion breakpoint
sequence
SO:0001415
deletion_breakpoint
The point within a chromosome where a deletion begins or ends.
SO:cb
A flanking region located five prime of a specific region.
five prime flanking region
sequence
5' flanking region
SO:0001416
five_prime_flanking_region
A flanking region located five prime of a specific region.
SO:chado
A flanking region located three prime of a specific region.
three prime flanking region
sequence
3' flanking region
SO:0001417
three_prime_flanking_region
A flanking region located three prime of a specific region.
SO:chado
An experimental region, defined by a tiling array experiment to be transcribed at some level.
transcribed fragment
sequence
transfrag
SO:0001418
Term requested by the MODencode group.
transcribed_fragment
An experimental region, defined by a tiling array experiment to be transcribed at some level.
SO:ke
Intronic 2 bp region bordering an exon. A splice_site that is adjacent_to an exon and overlaps an intron.
cis splice site
sequence
SO:0001419
cis_splice_site
Intronic 2 bp region bordering an exon. A splice_site that is adjacent_to an exon and overlaps an intron.
SO:cjm
SO:ke
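The cis_splice_site definition above (a 2 bp intronic region bordering an exon) can be illustrated with a minimal sketch. It assumes 1-based, inclusive intron coordinates; the function name cis_splice_sites and the coordinate convention are illustrative assumptions, not part of the ontology. The two sites are returned in coordinate order, so which one is the donor and which the acceptor depends on the strand.

```python
def cis_splice_sites(intron_start, intron_end):
    """Return the two 2 bp intronic regions that border the flanking exons.

    Coordinates are assumed to be 1-based and inclusive (an assumption of
    this sketch). The result is ((left_start, left_end), (right_start, right_end)).
    """
    if intron_end - intron_start + 1 < 4:
        # The two 2 bp sites would overlap in an intron shorter than 4 bp.
        raise ValueError("intron too short to hold two 2 bp splice sites")
    left_site = (intron_start, intron_start + 1)   # first two intronic bases
    right_site = (intron_end - 1, intron_end)      # last two intronic bases
    return left_site, right_site
```

For an intron spanning 101 to 200, the sites are positions 101-102 and 199-200; each lies inside the intron and is adjacent to an exon without overlapping it.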
Primary transcript region bordering trans-splice junction.
trans splice site
sequence
SO:0001420
trans_splice_site
Primary transcript region bordering trans-splice junction.
SO:ke
The boundary between an intron and an exon.
splice boundary
splice junction
sequence
SO:0001421
splice_junction
The boundary between an intron and an exon.
SO:ke
A region of a polypeptide, involved in the transition from one conformational state to another.
polypeptide conformational switch
sequence
SO:0001422
MM Young, K Kirshenbaum, KA Dill & S Highsmith. Predicting conformational switches in proteins. Protein Science, 1999, 8, 1752-64. K. Kirshenbaum, M.M. Young and S. Highsmith. Predicting Allosteric Switches in Myosins. Protein Science 8(9):1806-1815. 1999.
conformational_switch
A region of a polypeptide, involved in the transition from one conformational state to another.
SO:ke
A read produced by the dye terminator method of sequencing.
sequence
dye terminator read
SO:0001423
dye_terminator_read
A read produced by the dye terminator method of sequencing.
SO:ke
A read produced by pyrosequencing technology.
sequence
pyrosequenced read
SO:0001424
An example is a read produced by Roche 454 technology.
pyrosequenced_read
A read produced by pyrosequencing technology.
SO:ke
A read produced by ligation based sequencing technologies.
sequence
ligation based read
SO:0001425
An example of this kind of read is one produced by ABI SOLiD.
ligation_based_read
A read produced by ligation based sequencing technologies.
SO:ke
A read produced by the polymerase based sequence by synthesis method.
sequence
polymerase synthesis read
SO:0001426
An example is a read produced by Illumina technology.
polymerase_synthesis_read
A read produced by the polymerase based sequence by synthesis method.
SO:ke
A structural region in an RNA molecule which promotes ribosomal frameshifting of cis coding sequence.
cis regulatory frameshift element
sequence
SO:0001427
Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527.
cis_regulatory_frameshift_element
A structural region in an RNA molecule which promotes ribosomal frameshifting of cis coding sequence.
RFAM:jd
A sequence assembly derived from expressed sequences.
expressed sequence assembly
sequence
SO:0001428
From tracker [ 2372385 ] expressed_sequence_assembly.
expressed_sequence_assembly
A sequence assembly derived from expressed sequences.
SO:ke
A binding site that, in the molecule, interacts selectively and non-covalently with DNA.
DNA binding site
sequence
SO:0001429
DNA_binding_site
A binding site that, in the molecule, interacts selectively and non-covalently with DNA.
SO:ke
true
A gene that is not transcribed under normal conditions and is not critical to normal cellular functioning.
cryptic gene
sequence
SO:0001431
cryptic_gene
A gene that is not transcribed under normal conditions and is not critical to normal cellular functioning.
SO:ke
SO:0001545
sequence variant affecting polyadenylation
sequence
mutation affecting polyadenylation
SO:0001432
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_polyadenylation
true
A three prime RACE (Rapid Amplification of cDNA Ends) clone is a cDNA clone copied from the 3' end of an mRNA (using a poly-dT primer to capture the polyA tail and a gene-specific or randomly primed 5' primer), and spliced into a vector for propagation in a suitable host.
sequence
3' RACE clone
SO:0001433
three_prime_RACE_clone
A three prime RACE (Rapid Amplification of cDNA Ends) clone is a cDNA clone copied from the 3' end of an mRNA (using a poly-dT primer to capture the polyA tail and a gene-specific or randomly primed 5' primer), and spliced into a vector for propagation in a suitable host.
modENCODE:nlw
A cassette pseudogene is a kind of gene in an inactive form which may recombine at a telomeric locus to form a functional copy.
cassette pseudogene
sequence
cassette type pseudogene
SO:0001434
Requested by the Trypanosome community.
cassette_pseudogene
A cassette pseudogene is a kind of gene in an inactive form which may recombine at a telomeric locus to form a functional copy.
SO:ke
A non-polar, hydrophobic amino acid encoded by the codons GCN (GCT, GCC, GCA and GCG).
A
Ala
sequence
SO:0001435
A place holder for a cross product with chebi.
alanine
A
Ala
A non-polar, hydrophobic amino acid encoded by the codons GTN (GTT, GTC, GTA and GTG).
V
Val
sequence
SO:0001436
A place holder for a cross product with chebi.
valine
V
Val
A non-polar, hydrophobic amino acid encoded by the codons CTN (CTT, CTC, CTA and CTG), TTA and TTG.
L
Leu
sequence
SO:0001437
A place holder for a cross product with chebi.
leucine
L
Leu
A non-polar, hydrophobic amino acid encoded by the codons ATH (ATT, ATC and ATA).
I
Ile
sequence
SO:0001438
A place holder for a cross product with chebi.
isoleucine
I
Ile
A non-polar, hydrophobic amino acid encoded by the codons CCN (CCT, CCC, CCA and CCG).
P
Pro
sequence
SO:0001439
A place holder for a cross product with chebi.
proline
P
Pro
A non-polar, hydrophobic amino acid encoded by the codon TGG.
Trp
W
sequence
SO:0001440
A place holder for a cross product with chebi.
tryptophan
Trp
W
A non-polar, hydrophobic amino acid encoded by the codons TTT and TTC.
F
Phe
sequence
SO:0001441
A place holder for a cross product with chebi.
phenylalanine
F
Phe
A non-polar, hydrophobic amino acid encoded by the codon ATG.
M
Met
sequence
SO:0001442
A place holder for a cross product with chebi.
methionine
M
Met
A non-polar, hydrophilic amino acid encoded by the codons GGN (GGT, GGC, GGA and GGG).
G
Gly
sequence
SO:0001443
A place holder for a cross product with chebi.
glycine
G
Gly
A polar, hydrophilic amino acid encoded by the codons TCN (TCT, TCC, TCA, TCG), AGT and AGC.
S
Ser
sequence
SO:0001444
A place holder for a cross product with chebi.
serine
S
Ser
A polar, hydrophilic amino acid encoded by the codons ACN (ACT, ACC, ACA and ACG).
T
Thr
sequence
SO:0001445
A place holder for a cross product with chebi.
threonine
T
Thr
A polar, hydrophilic amino acid encoded by the codons TAT and TAC.
Tyr
Y
sequence
SO:0001446
A place holder for a cross product with chebi.
tyrosine
Tyr
Y
A polar amino acid encoded by the codons TGT and TGC.
C
Cys
sequence
SO:0001447
A place holder for a cross product with chebi.
cysteine
C
Cys
A polar, hydrophilic amino acid encoded by the codons CAA and CAG.
Gln
Q
sequence
SO:0001448
A place holder for a cross product with chebi.
glutamine
Gln
Q
A polar, hydrophilic amino acid encoded by the codons AAT and AAC.
Asn
N
sequence
SO:0001449
A place holder for a cross product with chebi.
asparagine
Asn
N
A positively charged, hydrophilic amino acid encoded by the codons AAA and AAG.
K
Lys
sequence
SO:0001450
A place holder for a cross product with chebi.
lysine
K
Lys
A positively charged, hydrophilic amino acid encoded by the codons CGN (CGT, CGC, CGA and CGG), AGA and AGG.
Arg
R
sequence
SO:0001451
A place holder for a cross product with chebi.
arginine
Arg
R
A positively charged, hydrophilic amino acid encoded by the codons CAT and CAC.
H
His
sequence
SO:0001452
A place holder for a cross product with chebi.
histidine
H
His
A negatively charged, hydrophilic amino acid encoded by the codons GAT and GAC.
Asp
D
aspartic acid
sequence
SO:0001453
A place holder for a cross product with chebi.
aspartic_acid
Asp
D
A negatively charged, hydrophilic amino acid encoded by the codons GAA and GAG.
E
Glu
glutamic acid
sequence
SO:0001454
A place holder for a cross product with chebi.
glutamic_acid
E
Glu
A relatively rare amino acid encoded by the codon UGA in some contexts, whereas UGA is a termination codon in other contexts.
Sec
U
sequence
SO:0001455
A place holder for a cross product with chebi.
selenocysteine
A relatively rare amino acid encoded by the codon UGA in some contexts, whereas UGA is a termination codon in other contexts.
PMID:23275319
Sec
U
A relatively rare amino acid encoded by the codon UAG in some contexts, whereas UAG is a termination codon in other contexts.
O
Pyl
sequence
SO:0001456
A place holder for a cross product with chebi.
pyrrolysine
A relatively rare amino acid encoded by the codon UAG in some contexts, whereas UAG is a termination codon in other contexts.
PMID:15788401
O
Pyl
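The amino acid definitions above use degenerate codon patterns such as GCN and ATH alongside their explicit expansions. The following is a minimal sketch of how such a pattern expands into concrete codons, assuming the standard IUPAC meanings of N (any base) and H (A, C or T). The names IUPAC and expand_codon are illustrative and do not come from the ontology.

```python
from itertools import product

# IUPAC nucleotide codes needed for the patterns in the definitions above.
# Only the codes that actually occur there (N and H) are included.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT",  # any base
    "H": "ACT",   # any base except G
}

def expand_codon(pattern):
    """Return the sorted list of concrete codons matched by a codon pattern."""
    choices = [IUPAC[base] for base in pattern]
    return sorted("".join(combo) for combo in product(*choices))
```

For example, expand_codon("GCN") yields the four alanine codons listed in the definition (GCA, GCC, GCG, GCT), and expand_codon("ATH") yields the three isoleucine codons (ATA, ATC, ATT).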
A region defined by a set of transcribed sequences from the same gene or expressed pseudogene.
transcribed cluster
sequence
unigene cluster
SO:0001457
This term was requested by Jeff Bowes, using the tracker, ID = 2594157.
transcribed_cluster
A region defined by a set of transcribed sequences from the same gene or expressed pseudogene.
SO:ke
A kind of transcribed_cluster defined by a set of transcribed sequences from a unique gene.
sequence
unigene cluster
SO:0001458
This term was requested by Jeff Bowes, using the tracker, ID = 2594157.
unigene_cluster
A kind of transcribed_cluster defined by a set of transcribed sequences from a unique gene.
SO:ke
Clustered Palindromic Repeats interspersed with bacteriophage derived spacer sequences.
http://en.wikipedia.org/wiki/CRISPR
CRISPR element
Clustered_Regularly_Interspaced_Short_Palindromic_Repeat
sequence
SO:0001459
CRISPR
Clustered Palindromic Repeats interspersed with bacteriophage derived spacer sequences.
RFAM:jd
A binding site that, in an insulator region of a nucleotide molecule, interacts selectively and non-covalently with polypeptide residues.
sequence
insulator binding site
SO:0001460
See tracker ID 2060908.
insulator_binding_site
A binding site that, in an insulator region of a nucleotide molecule, interacts selectively and non-covalently with polypeptide residues.
SO:ke
A binding site that, in the enhancer region of a nucleotide molecule, interacts selectively and non-covalently with polypeptide residues.
sequence
enhancer binding site
SO:0001461
enhancer_binding_site
A binding site that, in the enhancer region of a nucleotide molecule, interacts selectively and non-covalently with polypeptide residues.
SO:ke
A collection of contigs.
contig collection
sequence
SO:0001462
See tracker ID: 2138359.
contig_collection
A collection of contigs.
SO:ke
Long, intervening non-coding RNA. A transcript that does not overlap within the start or end genomic coordinates of a coding gene or pseudogene on either strand.
large intervening non-coding RNA
long intergenic non-coding RNA
long intervening non-coding RNA
sequence
SO:0001463
lincRNA
Long, intervening non-coding RNA. A transcript that does not overlap within the start or end genomic coordinates of a coding gene or pseudogene on either strand.
PMID:19182780
PMID:23463798
SO:ke
http://www.gencodegenes.org/gencode_biotypes.html
An EST spanning part or all of the untranslated regions of a protein-coding transcript.
UTR sequence tag
sequence
SO:0001464
UST
An EST spanning part or all of the untranslated regions of a protein-coding transcript.
SO:nlw
A UST located in the 3'UTR of a protein-coding transcript.
sequence
3' UST
SO:0001465
three_prime_UST
A UST located in the 3'UTR of a protein-coding transcript.
SO:nlw
A UST located in the 5'UTR of a protein-coding transcript.
sequence
5' UST
SO:0001466
five_prime_UST
A UST located in the 5'UTR of a protein-coding transcript.
SO:nlw
A tag produced from a single sequencing read from a RACE product; typically a few hundred base pairs long.
RACE sequence tag
sequence
SO:0001467
RST
A tag produced from a single sequencing read from a RACE product; typically a few hundred base pairs long.
SO:nlw
A tag produced from a single sequencing read from a 3'-RACE product; typically a few hundred base pairs long.
3' RST
sequence
SO:0001468
three_prime_RST
A tag produced from a single sequencing read from a 3'-RACE product; typically a few hundred base pairs long.
SO:nlw
A tag produced from a single sequencing read from a 5'-RACE product; typically a few hundred base pairs long.
sequence
5' RST
SO:0001469
five_prime_RST
A tag produced from a single sequencing read from a 5'-RACE product; typically a few hundred base pairs long.
SO:nlw
A match against a UST sequence.
UST match
sequence
SO:0001470
UST_match
A match against a UST sequence.
SO:nlw
A match against an RST sequence.
RST match
sequence
SO:0001471
RST_match
A match against an RST sequence.
SO:nlw
A nucleotide match to a primer sequence.
primer match
sequence
SO:0001472
primer_match
A nucleotide match to a primer sequence.
SO:nlw
A region of the pri miRNA that base pairs with the guide to form the hairpin.
kareneilbeck
2009-05-27T03:35:43Z
miRNA antiguide
miRNA passenger strand
miRNA star
sequence
SO:0001473
miRNA_antiguide
A region of the pri miRNA that base pairs with the guide to form the hairpin.
SO:ke
The boundary between the spliced leader and the first exon of the mRNA.
kareneilbeck
2009-07-13T04:50:49Z
trans-splice junction
sequence
SO:0001474
trans_splice_junction
The boundary between the spliced leader and the first exon of the mRNA.
SO:ke
A region of a primary transcript, that is removed via trans splicing.
kareneilbeck
2009-07-14T11:36:08Z
sequence
SO:0001475
outron
A region of a primary transcript, that is removed via trans splicing.
PMID:16401417
SO:ke
A plasmid that occurs naturally.
kareneilbeck
2009-09-01T03:43:06Z
natural plasmid
sequence
SO:0001476
natural_plasmid
A plasmid that occurs naturally.
SO:xp
A gene trap construct is a type of engineered plasmid which is designed to integrate into a genome and produce a fusion transcript between exons of the gene into which it inserts and a reporter element in the construct. Gene traps contain a splice acceptor, do not contain promoter elements for the reporter, and are mutagenic. Gene traps may be bicistronic with the second cassette containing a promoter driving a selectable marker.
kareneilbeck
2009-09-01T03:49:09Z
gene trap construct
sequence
SO:0001477
gene_trap_construct
A gene trap construct is a type of engineered plasmid which is designed to integrate into a genome and produce a fusion transcript between exons of the gene into which it inserts and a reporter element in the construct. Gene traps contain a splice acceptor, do not contain promoter elements for the reporter, and are mutagenic. Gene traps may be bicistronic with the second cassette containing a promoter driving a selectable marker.
ZFIN:dh
A promoter trap construct is a type of engineered plasmid which is designed to integrate into a genome and express a reporter when inserted in close proximity to a promoter element. Promoter traps typically do not contain promoter elements and are mutagenic.
kareneilbeck
2009-09-01T03:52:01Z
promoter trap construct
sequence
SO:0001478
promoter_trap_construct
A promoter trap construct is a type of engineered plasmid which is designed to integrate into a genome and express a reporter when inserted in close proximity to a promoter element. Promoter traps typically do not contain promoter elements and are mutagenic.
ZFIN:dh
An enhancer trap construct is a type of engineered plasmid which is designed to integrate into a genome and express a reporter when the expression from a basic minimal promoter is enhanced by genomic enhancer elements. Enhancer traps contain promoter elements and are not usually mutagenic.
kareneilbeck
2009-09-01T03:53:26Z
enhancer trap construct
sequence
SO:0001479
enhancer_trap_construct
An enhancer trap construct is a type of engineered plasmid which is designed to integrate into a genome and express a reporter when the expression from a basic minimal promoter is enhanced by genomic enhancer elements. Enhancer traps contain promoter elements and are not usually mutagenic.
ZFIN:dh
A region of sequence from the end of a PAC clone that may provide a highly specific marker.
kareneilbeck
2009-09-09T05:18:12Z
PAC end
sequence
SO:0001480
PAC_end
A region of sequence from the end of a PAC clone that may provide a highly specific marker.
ZFIN:mh
RAPD is a 'PCR product' where a sequence variant is identified through the use of PCR with random primers.
kareneilbeck
2009-09-09T05:26:10Z
Random Amplification Polymorphic DNA
sequence
SO:0001481
RAPD
RAPD is a 'PCR product' where a sequence variant is identified through the use of PCR with random primers.
ZFIN:mh
An enhancer that drives the pattern of transcription and binds to the same TF as the primary enhancer, but is located in the intron of or on the far side of a neighboring gene.
kareneilbeck
2009-09-09T05:29:29Z
shadow enhancer
sequence
SO:0001482
shadow_enhancer
An enhancer that drives the pattern of transcription and binds to the same TF as the primary enhancer, but is located in the intron of or on the far side of a neighboring gene.
PMID:22083793
SNVs are single nucleotide positions in genomic DNA at which different sequence alternatives exist.
kareneilbeck
2009-10-08T11:37:49Z
single nucleotide variant
sequence
SO:0001483
SNV
SNVs are single nucleotide positions in genomic DNA at which different sequence alternatives exist.
SO:bm
An X element combinatorial repeat is a repeat region located between the X element and the telomere or adjacent Y' element.
kareneilbeck
2009-11-10T11:03:37Z
INSDC_feature:repeat_region
INSDC_qualifier:x_element_combinatorial_repeat
X element combinatorial repeat
sequence
SO:0001484
X element combinatorial repeats contain Tbf1p binding sites, and possible functions include a role in telomerase-independent telomere maintenance via recombination or as a barrier against transcriptional silencing. These are usually present as a combination of one or more of several types of smaller elements (designated A, B, C, or D). This term was requested 2009-10-16 by Michel Dumontier, tracker id 2880747.
X_element_combinatorial_repeat
An X element combinatorial repeat is a repeat region located between the X element and the telomere or adjacent Y' element.
http://www.yeastgenome.org/help/glossary.html
A Y' element is a repeat region (SO:0000657) located adjacent to telomeric repeats or X element combinatorial repeats, either as a single copy or tandem repeat of two to four copies.
kareneilbeck
2009-11-10T12:08:57Z
INSDC_feature:repeat_region
INSDC_qualifier:Y_prime_element
Y prime element
Y' element
sequence
SO:0001485
This term was requested 2009-10-16 by Michel Dumontier, tracker id 2880747.
Y_prime_element
A Y' element is a repeat region (SO:0000657) located adjacent to telomeric repeats or X element combinatorial repeats, either as a single copy or tandem repeat of two to four copies.
http://www.yeastgenome.org/help/glossary.html
The status of a whole genome sequence, where the data is minimally filtered or un-filtered, from any number of sequencing platforms, and is assembled into contigs. Genome sequence of this quality may harbour regions of poor quality and can be relatively incomplete.
kareneilbeck
2009-10-23T12:48:32Z
standard draft
sequence
SO:0001486
standard_draft
The status of a whole genome sequence, where the data is minimally filtered or un-filtered, from any number of sequencing platforms, and is assembled into contigs. Genome sequence of this quality may harbour regions of poor quality and can be relatively incomplete.
DOI:10.1126
The status of a whole genome sequence, where overall coverage represents at least 90 percent of the genome.
kareneilbeck
2009-10-23T12:52:36Z
high quality draft
sequence
SO:0001487
high_quality_draft
The status of a whole genome sequence, where overall coverage represents at least 90 percent of the genome.
DOI:10.1126
The status of a whole genome sequence, where additional work has been performed, using either manual or automated methods, such as gap resolution.
kareneilbeck
2009-10-23T12:54:35Z
improved high quality draft
sequence
SO:0001488
improved_high_quality_draft
The status of a whole genome sequence, where additional work has been performed, using either manual or automated methods, such as gap resolution.
DOI:10.1126
The status of a whole genome sequence, where annotation and verification of coding regions have occurred.
kareneilbeck
2009-10-23T12:57:10Z
annotation directed improvement
sequence
SO:0001489
annotation_directed_improved_draft
The status of a whole genome sequence, where annotation and verification of coding regions have occurred.
DOI:10.1126
The status of a whole genome sequence, where the assembly is high quality, closure approaches have been successful for most gaps, misassemblies and low quality regions.
kareneilbeck
2009-10-23T01:01:07Z
non contiguous finished
sequence
SO:0001490
noncontiguous_finished
The status of a whole genome sequence, where the assembly is high quality, closure approaches have been successful for most gaps, misassemblies and low quality regions.
DOI:10.1126
The status of a whole genome sequence, with less than 1 error per 100,000 base pairs.
kareneilbeck
2009-10-23T01:04:43Z
finished
finished genome
sequence
SO:0001491
finished_genome
The status of a whole genome sequence, with less than 1 error per 100,000 base pairs.
DOI:10.1126
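Two of the whole genome sequence statuses above carry numeric thresholds: high_quality_draft requires coverage of at least 90 percent of the genome, and finished_genome requires less than 1 error per 100,000 base pairs. The sketch below expresses only those two thresholds as checks; the function names and the choice of inputs (base-pair counts and an error count) are assumptions made for illustration, and the other criteria in the definitions are not modelled.

```python
def has_high_quality_draft_coverage(covered_bp, genome_bp):
    """True if coverage is at least 90 percent of the genome."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return covered_bp * 10 >= genome_bp * 9

def meets_finished_error_rate(error_count, assembled_bp):
    """True if there is less than 1 error per 100,000 base pairs."""
    return error_count * 100_000 < assembled_bp
```

Under these checks, 9 errors in a 1,000,000 bp assembly satisfies the finished threshold, while 10 errors (exactly 1 per 100,000 bp) does not.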
A regulatory region that is part of an intron.
kareneilbeck
2009-11-08T02:48:02Z
intronic regulatory region
sequence
SO:0001492
Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527.
intronic_regulatory_region
A regulatory region that is part of an intron.
SO:ke
A centromere DNA Element I (CDEI) is a conserved region, part of the centromere, consisting of a consensus region of 8-11 bp which enables binding by the centromere binding factor 1 (Cbf1p).
kareneilbeck
2009-11-09T05:47:23Z
CDEI
Centromere DNA Element I
sequence
SO:0001493
This term was requested 2009-10-16 by Michel Dumontier, tracker id 2880699.
centromere_DNA_Element_I
A centromere DNA Element I (CDEI) is a conserved region, part of the centromere, consisting of a consensus region of 8-11 bp which enables binding by the centromere binding factor 1 (Cbf1p).
PMID:11222754
A centromere DNA Element II (CDEII) is a conserved region, part of the centromere, consisting of a consensus region that is AT-rich and ~75-100 bp in length.
kareneilbeck
2009-11-09T05:51:26Z
CDEII
centromere DNA Element II
sequence
SO:0001494
This term was requested 2009-10-16 by Michel Dumontier, tracker id 2880699.
centromere_DNA_Element_II
A centromere DNA Element II (CDEII) is a conserved region, part of the centromere, consisting of a consensus region that is AT-rich and ~75-100 bp in length.
PMID:11222754
A centromere DNA Element III (CDEIII) is a conserved region, part of the centromere, consisting of a 25-bp consensus region which enables binding by the centromere DNA binding factor 3 (CBF3) complex.
kareneilbeck
2009-11-09T05:54:47Z
CDEIII
centromere DNA Element III
sequence
SO:0001495
This term was requested 2009-10-16 by Michel Dumontier, tracker id 2880699.
centromere_DNA_Element_III
A centromere DNA Element III (CDEIII) is a conserved region, part of the centromere, consisting of a 25-bp consensus region which enables binding by the centromere DNA binding factor 3 (CBF3) complex.
PMID:11222754
The telomeric repeat is a repeat region, part of the chromosome, which in yeast, is a G-rich terminal sequence of the form (TG(1-3))n or more precisely ((TG)(1-6)TG(2-3))n.
kareneilbeck
2009-11-09T06:00:42Z
INSDC_feature:repeat_region
INSDC_qualifier:telomeric_repeat
telomeric repeat
sequence
SO:0001496
The repeats are maintained by telomerase and there are generally 300 +/- 75 bp of TG(1-3) at a given end. Telomeric repeats function in completing chromosome replication and protecting the ends from degradation and end-to-end fusions. This term was requested 2009-10-16 by Michel Dumontier, tracker id 2880739.
telomeric_repeat
The telomeric repeat is a repeat region, part of the chromosome, which in yeast, is a G-rich terminal sequence of the form (TG(1-3))n or more precisely ((TG)(1-6)TG(2-3))n.
PMID:8720065
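The two repeat forms given in the telomeric_repeat definition above, (TG(1-3))n and ((TG)(1-6)TG(2-3))n, translate directly into regular expressions. This is a minimal sketch under one reading of that notation: TG(1-3) is taken to mean a T followed by one to three Gs, and (TG)(1-6) to mean one to six copies of TG. The names LOOSE_FORM, PRECISE_FORM and is_telomeric_repeat are illustrative assumptions.

```python
import re

# (TG(1-3))n: one or more units of T followed by one to three Gs.
LOOSE_FORM = re.compile(r"(?:TG{1,3})+")

# ((TG)(1-6)TG(2-3))n: one or more units of one to six TG dinucleotides
# followed by T and two or three Gs.
PRECISE_FORM = re.compile(r"(?:(?:TG){1,6}TG{2,3})+")

def is_telomeric_repeat(sequence, precise=False):
    """True if the whole sequence matches the chosen repeat form."""
    pattern = PRECISE_FORM if precise else LOOSE_FORM
    return pattern.fullmatch(sequence.upper()) is not None
```

A sequence such as TGTGGTGTGGG matches both forms, whereas TGTGGTGGG matches only the looser one because its final TGGG unit is not preceded by a TG dinucleotide.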
The X element is a conserved region, of the telomere, of ~475 bp that contains an ARS sequence and in most cases an Abf1p binding site.
kareneilbeck
2009-11-10T10:56:54Z
X element core sequence
sequence
X element
SO:0001497
Possible functions include roles in chromosomal segregation, maintenance of chromosome stability, recombinational sequestering, or as a barrier to transcriptional silencing. This term was requested 2009-10-16 by Michel Dumontier, tracker id 2880747.
From Janos Demeter: The only region shared by all chromosome ends, the X element core sequence is a small conserved element (~475 bp) that contains an ARS sequence and in most cases an Abf1p binding site. Between these is a GC-rich region nearly identical to the meiosis-specific regulatory sequence URS1.
X_element
The X element is a conserved region, of the telomere, of ~475 bp that contains an ARS sequence and in most cases an Abf1p binding site.
PMID:7785338
PMID:8005434
http://www.yeastgenome.org/help/glossary.html#xelemcoresequence
A region of sequence from the end of a YAC clone that may provide a highly specific marker.
kareneilbeck
2009-11-19T11:07:18Z
YAC end
sequence
SO:0001498
YAC_end
A region of sequence from the end of a YAC clone that may provide a highly specific marker.
SO:ke
The status of whole genome sequence.
kareneilbeck
2009-10-23T12:47:47Z
whole genome sequence status
sequence
SO:0001499
This term and its children were added to SO in response to a tracker request by Patrick Chain. The paper "Genome Project Standards in a New Era of Sequencing" (Science, October 9th 2009) addresses these terms.
whole_genome_sequence_status
The status of whole genome sequence.
DOI:10.1126
A biological_region characterized as a single heritable trait in a phenotype screen. The heritable phenotype may be mapped to a chromosome but generally has not been characterized to a specific gene locus.
kareneilbeck
2009-12-07T01:50:55Z
heritable phenotypic marker
phenotypic marker
sequence
SO:0001500
heritable_phenotypic_marker
A biological_region characterized as a single heritable trait in a phenotype screen. The heritable phenotype may be mapped to a chromosome but generally has not been characterized to a specific gene locus.
JAX:hdene
A collection of peptide sequences.
kareneilbeck
2009-12-11T10:58:58Z
peptide collection
peptide set
sequence
SO:0001501
Term requested via tracker ID: 2910829.
peptide_collection
A collection of peptide sequences.
BBOP:nlw
An experimental feature with high sequence identity to another sequence.
kareneilbeck
2009-12-11T11:06:05Z
high identity region
sequence
SO:0001502
Requested by tracker ID: 2902685.
high_identity_region
An experimental feature with high sequence identity to another sequence.
SO:ke
A transcript for which no open reading frame has been identified and for which no other function has been determined.
kareneilbeck
2009-12-21T05:37:14Z
processed transcript
sequence
SO:0001503
Ensembl and Vega also use this term name. Requested by Howard Deen of MGI.
processed_transcript
A transcript for which no open reading frame has been identified and for which no other function has been determined.
MGI:hdeen
A chromosome variation derived from an event during meiosis.
kareneilbeck
2010-03-02T05:03:18Z
sequence
assortment derived variation
SO:0001504
assortment_derived_variation
A chromosome variation derived from an event during meiosis.
SO:ke
A collection of sequences (often chromosomes) taken as the standard for a given organism and genome assembly.
kareneilbeck
2010-03-03T02:10:03Z
sequence
reference genome
SO:0001505
reference_genome
A collection of sequences (often chromosomes) taken as the standard for a given organism and genome assembly.
SO:ke
A collection of sequences (often chromosomes) of an individual.
kareneilbeck
2010-03-03T02:11:25Z
sequence
variant genome
SO:0001506
variant_genome
A collection of sequences (often chromosomes) of an individual.
SO:ke
A collection of one or more sequences of an individual.
kareneilbeck
2010-03-03T02:13:28Z
sequence
variant collection
SO:0001507
variant_collection
A collection of one or more sequences of an individual.
SO:ke
An attribute of alteration of one or more chromosomes.
kareneilbeck
2010-03-04T02:53:23Z
alteration attribute
sequence
SO:0001508
alteration_attribute
An attribute of a change in the structure or number of chromosomes.
kareneilbeck
2010-03-04T02:54:30Z
chromosomal variation attribute
sequence
SO:0001509
chromosomal_variation_attribute
A change in chromosomes that occurs between two sections of the same chromosome or between homologous chromosomes.
kareneilbeck
2010-03-04T02:55:25Z
sequence
SO:0001510
intrachromosomal
A change in chromosomes that occurs between two separate chromosomes.
kareneilbeck
2010-03-04T02:55:43Z
sequence
SO:0001511
interchromosomal
A quality of a chromosomal insertion.
kareneilbeck
2010-03-04T02:55:56Z
insertion attribute
sequence
SO:0001512
insertion_attribute
A quality of a chromosomal insertion.
SO:ke
An insertion or extension of a tandem repeat.
kareneilbeck
2010-03-04T02:56:37Z
sequence
SO:0001513
tandem
A quality of an insertion where the insert is not in a cytologically inverted orientation.
kareneilbeck
2010-03-04T02:56:49Z
sequence
SO:0001514
direct
A quality of an insertion where the insert is not in a cytologically inverted orientation.
SO:ke
A quality of an insertion where the insert is in a cytologically inverted orientation.
kareneilbeck
2010-03-04T02:57:40Z
sequence
SO:0001515
inverted
A quality of an insertion where the insert is in a cytologically inverted orientation.
SO:ke
The quality of a duplication where the new region exists independently of the original.
kareneilbeck
2010-03-04T02:57:51Z
sequence
SO:0001516
free
The quality of a duplication where the new region exists independently of the original.
SO:ke
When a region of a chromosome is changed to the reverse order without duplication or deletion.
kareneilbeck
2010-03-04T02:58:10Z
inversion attribute
sequence
SO:0001517
inversion_attribute
An inversion event that includes the centromere.
kareneilbeck
2010-03-04T02:58:24Z
sequence
SO:0001518
pericentric
An inversion event that does not include the centromere.
kareneilbeck
2010-03-04T02:58:35Z
sequence
SO:0001519
paracentric
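The pericentric and paracentric definitions above differ only in whether the inverted region includes the centromere, which can be checked from coordinates. The sketch below assumes both regions are given as inclusive start and end positions on the same chromosome; the function name classify_inversion is illustrative. An inversion that only partly overlaps the centromere is not covered by either definition, so this sketch raises an error for that case rather than guessing.

```python
def classify_inversion(inv_start, inv_end, cen_start, cen_end):
    """Classify an inversion as 'pericentric' or 'paracentric'.

    All coordinates are inclusive positions on one chromosome
    (an assumption of this sketch).
    """
    if inv_start > inv_end or cen_start > cen_end:
        raise ValueError("start must not exceed end")
    if inv_start <= cen_start and cen_end <= inv_end:
        return "pericentric"   # inverted region includes the centromere
    if inv_end < cen_start or cen_end < inv_start:
        return "paracentric"   # inverted region lies entirely in one arm
    raise ValueError("inversion partially overlaps the centromere")
```

An inversion spanning 100 to 900 around a centromere at 400 to 500 is pericentric; one spanning 100 to 300 on the same chromosome is paracentric.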
An attribute of a translocation, which is then a region of nucleotide sequence that has translocated to a new position. The observed adjacency of two previously separated regions.
kareneilbeck
2010-03-04T02:58:47Z
translocation attribute
sequence
SO:0001520
translocaton_attribute
When translocation occurs between nonhomologous chromosomes and involves an equal exchange of genetic material.
kareneilbeck
2010-03-04T02:59:34Z
sequence
SO:0001521
reciprocal
When a translocation is simply moving genetic material from one chromosome to another.
kareneilbeck
2010-03-04T02:59:51Z
sequence
SO:0001522
insertional
An attribute of a duplication, which is an insertion which derives from, or is identical in sequence to, nucleotides present at a known location in the genome.
kareneilbeck
2010-03-05T01:56:33Z
sequence
duplication attribute
SO:0001523
duplication_attribute
When a genome contains an abnormal number of chromosomes.
kareneilbeck
2010-03-05T02:21:00Z
sequence
chromosomally aberrant genome
SO:0001524
chromosomally_aberrant_genome
A region of sequence where the final nucleotide assignment differs from the original assembly due to an improvement that replaces a mistake.
kareneilbeck
2010-03-09T02:16:31Z
sequence
assembly error correction
SO:0001525
assembly_error_correction
A region of sequence where the final nucleotide assignment differs from the original assembly due to an improvement that replaces a mistake.
SO:ke
A region of sequence where the final nucleotide assignment is different from that given by the base caller due to an improvement that replaces a mistake.
kareneilbeck
2010-03-09T02:18:07Z
sequence
base call error correction
SO:0001526
base_call_error_correction
A region of sequence where the final nucleotide assignment is different from that given by the base caller due to an improvement that replaces a mistake.
SO:ke
A region of peptide sequence used to target the polypeptide molecule to a specific organelle.
kareneilbeck
2010-03-11T02:15:05Z
peptide localization signal
sequence
localization signal
SO:0001527
peptide_localization_signal
A region of peptide sequence used to target the polypeptide molecule to a specific organelle.
SO:ke
A polypeptide region that targets a polypeptide to the nucleus.
kareneilbeck
2010-03-11T02:16:38Z
http://en.wikipedia.org/wiki/Nuclear_localization_signal
NLS
sequence
SO:0001528
nuclear_localization_signal
A polypeptide region that targets a polypeptide to the nucleus.
SO:ke
http://en.wikipedia.org/wiki/Nuclear_localization_signal
wikipedia
A polypeptide region that targets a polypeptide to the endosome.
kareneilbeck
2010-03-11T02:20:58Z
endosomal localization signal
sequence
SO:0001529
endosomal_localization_signal
A polypeptide region that targets a polypeptide to the endosome.
SO:ke
A polypeptide region that targets a polypeptide to the lysosome.
kareneilbeck
2010-03-11T02:24:10Z
lysosomal localization signal
sequence
SO:0001530
lysosomal_localization_signal
A polypeptide region that targets a polypeptide to the lysosome.
SO:ke
A polypeptide region that targets a polypeptide to the cytoplasm.
kareneilbeck
2010-03-11T02:25:25Z
http://en.wikipedia.org/wiki/Nuclear_export_signal
NES
nuclear export signal
sequence
SO:0001531
nuclear_export_signal
A polypeptide region that targets a polypeptide to the cytoplasm.
SO:ke
A region recognized by a recombinase.
kareneilbeck
2010-03-11T03:16:47Z
http://en.wikipedia.org/wiki/Recombination_Signal_Sequences
sequence
recombination signal sequence
SO:0001532
recombination_signal_sequence
A region recognized by a recombinase.
SO:ke
http://en.wikipedia.org/wiki/Recombination_Signal_Sequences
wikipedia
A splice site in a part of the transcript that is not normally spliced. Such sites arise via mutation or transcriptional error.
kareneilbeck
2010-03-11T03:25:06Z
cryptic splice site
sequence
cryptic splice signal
SO:0001533
cryptic_splice_site
A splice site in a part of the transcript that is not normally spliced. Such sites arise via mutation or transcriptional error.
SO:ke
A polypeptide region that targets a polypeptide to the nuclear rim.
kareneilbeck
2010-03-11T03:31:30Z
PMID:16027110
sequence
nuclear rim localization signal
SO:0001534
nuclear_rim_localization_signal
A polypeptide region that targets a polypeptide to the nuclear rim.
SO:ke
A P-element is a DNA transposon responsible for hybrid dysgenesis. P elements in this terminal inverted repeat (TIR) transposon superfamily have 31 bp perfect TIR and upon insertion duplicate an 8 bp sequence. It contains transposase that may lack the DDE domain.
kareneilbeck
2010-03-12T03:40:33Z
DTP transposon
P TIR transposon
P element
P transposable element
P-element
sequence
SO:0001535
Moved from under DNA_transposon (SO:0000182) by Dave Sant as per request from GitHub issue #488 on June 25, 2020
P_TIR_transposon
A P-element is a DNA transposon responsible for hybrid dysgenesis. P elements in this terminal inverted repeat (TIR) transposon superfamily have 31 bp perfect TIR and upon insertion duplicate an 8 bp sequence. It contains transposase that may lack the DDE domain.
PMID:6309410
SO:ke
A variant whereby the effect is evaluated with respect to a reference.
kareneilbeck
2010-03-22T11:30:25Z
functional effect variant
functional variant
sequence
SO:0001536
Updated after request from Lea Starita, lea.starita@gmail.com from the NCBI.
functional_effect_variant
A variant whereby the effect is evaluated with respect to a reference.
SO:ke
A sequence variant that changes one or more structural features.
kareneilbeck
2010-03-22T11:31:01Z
http://vat.gersteinlab.org/formats.php
Jannovar:structural_variant
VAT:svOverlap
sequence
structural variant
SO:0001537
structural_variant
A sequence variant that changes one or more structural features.
SO:ke
http://vat.gersteinlab.org/formats.php
VAT
Jannovar:structural_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VAT:svOverlap
A sequence variant which alters the functioning of a transcript with respect to a reference sequence.
kareneilbeck
2010-03-22T11:32:58Z
transcript function variant
sequence
SO:0001538
transcript_function_variant
A sequence variant which alters the functioning of a transcript with respect to a reference sequence.
SO:ke
A sequence variant that affects the functioning of a translational product with respect to a reference sequence.
kareneilbeck
2010-03-22T11:46:15Z
translational product variant
sequence
SO:0001539
translational_product_function_variant
A sequence variant that affects the functioning of a translational product with respect to a reference sequence.
SO:ke
A sequence variant which alters the level of a transcript.
kareneilbeck
2010-03-22T11:47:07Z
level of transcript variant
sequence
SO:0001540
level_of_transcript_variant
A sequence variant which alters the level of a transcript.
SO:ke
A sequence variant that decreases the level of mature, spliced and processed RNA with respect to a reference sequence.
kareneilbeck
2010-03-22T11:47:47Z
decreased transcript level
sequence
SO:0001541
decreased_transcript_level_variant
A sequence variant that decreases the level of mature, spliced and processed RNA with respect to a reference sequence.
SO:ke
A sequence variant that increases the level of mature, spliced and processed RNA with respect to a reference sequence.
kareneilbeck
2010-03-22T11:48:17Z
increased transcript level variant
sequence
SO:0001542
increased_transcript_level_variant
A sequence variant that increases the level of mature, spliced and processed RNA with respect to a reference sequence.
SO:ke
A sequence variant that affects the post transcriptional processing of a transcript with respect to a reference sequence.
kareneilbeck
2010-03-22T11:48:48Z
transcript processing variant
sequence
SO:0001543
transcript_processing_variant
A sequence variant that affects the post transcriptional processing of a transcript with respect to a reference sequence.
SO:ke
A transcript processing variant whereby the process of editing is disrupted with respect to the reference.
kareneilbeck
2010-03-22T11:49:25Z
editing variant
sequence
SO:0001544
editing_variant
A transcript processing variant whereby the process of editing is disrupted with respect to the reference.
SO:ke
A sequence variant that changes polyadenylation with respect to a reference sequence.
kareneilbeck
2010-03-22T11:49:40Z
polyadenylation variant
sequence
SO:0001545
polyadenylation_variant
A sequence variant that changes polyadenylation with respect to a reference sequence.
SO:ke
A variant that changes the stability of a transcript with respect to a reference sequence.
kareneilbeck
2010-03-22T11:50:01Z
transcript stability variant
sequence
SO:0001546
transcript_stability_variant
A variant that changes the stability of a transcript with respect to a reference sequence.
SO:ke
A sequence variant that decreases transcript stability with respect to a reference sequence.
kareneilbeck
2010-03-22T11:50:23Z
decrease transcript stability variant
sequence
SO:0001547
decreased_transcript_stability_variant
A sequence variant that decreases transcript stability with respect to a reference sequence.
SO:ke
A sequence variant that increases transcript stability with respect to a reference sequence.
kareneilbeck
2010-03-22T11:50:39Z
increased transcript stability variant
sequence
SO:0001548
increased_transcript_stability_variant
A sequence variant that increases transcript stability with respect to a reference sequence.
SO:ke
A variant that alters the transcription of a transcript with respect to a reference sequence.
kareneilbeck
2010-03-22T11:51:26Z
transcription variant
sequence
SO:0001549
transcription_variant
A variant that alters the transcription of a transcript with respect to a reference sequence.
SO:ke
A sequence variant that changes the rate of transcription with respect to a reference sequence.
kareneilbeck
2010-03-22T11:51:50Z
rate of transcription variant
sequence
SO:0001550
rate_of_transcription_variant
A sequence variant that changes the rate of transcription with respect to a reference sequence.
SO:ke
A sequence variant that increases the rate of transcription with respect to a reference sequence.
kareneilbeck
2010-03-22T11:52:17Z
increased transcription rate variant
sequence
SO:0001551
increased_transcription_rate_variant
A sequence variant that increases the rate of transcription with respect to a reference sequence.
SO:ke
A sequence variant that decreases the rate of transcription with respect to a reference sequence.
kareneilbeck
2010-03-22T11:52:43Z
decreased transcription rate variant
sequence
SO:0001552
decreased_transcription_rate_variant
A sequence variant that decreases the rate of transcription with respect to a reference sequence.
SO:ke
A functional variant that changes the translational product level with respect to a reference sequence.
kareneilbeck
2010-03-22T11:53:32Z
translational product level variant
sequence
SO:0001553
translational_product_level_variant
A functional variant that changes the translational product level with respect to a reference sequence.
SO:ke
A sequence variant which changes polypeptide functioning with respect to a reference sequence.
kareneilbeck
2010-03-22T11:53:54Z
polypeptide function variant
sequence
SO:0001554
polypeptide_function_variant
A sequence variant which changes polypeptide functioning with respect to a reference sequence.
SO:ke
A sequence variant which decreases the translational product level with respect to a reference sequence.
kareneilbeck
2010-03-22T11:54:25Z
decrease translational product level
sequence
SO:0001555
decreased_translational_product_level
A sequence variant which decreases the translational product level with respect to a reference sequence.
SO:ke
A sequence variant which increases the translational product level with respect to a reference sequence.
kareneilbeck
2010-03-22T11:55:25Z
increase translational product level
sequence
SO:0001556
increased_translational_product_level
A sequence variant which increases the translational product level with respect to a reference sequence.
SO:ke
A sequence variant which causes gain of polypeptide function with respect to a reference sequence.
kareneilbeck
2010-03-22T11:56:12Z
polypeptide gain of function variant
sequence
SO:0001557
polypeptide_gain_of_function_variant
A sequence variant which causes gain of polypeptide function with respect to a reference sequence.
SO:ke
A sequence variant which changes the localization of a polypeptide with respect to a reference sequence.
kareneilbeck
2010-03-22T11:56:37Z
polypeptide localization variant
sequence
SO:0001558
polypeptide_localization_variant
A sequence variant which changes the localization of a polypeptide with respect to a reference sequence.
SO:ke
A sequence variant that causes the loss of a polypeptide function with respect to a reference sequence.
kareneilbeck
2010-03-22T11:56:58Z
polypeptide loss of function variant
sequence
SO:0001559
polypeptide_loss_of_function_variant
A sequence variant that causes the loss of a polypeptide function with respect to a reference sequence.
SO:ke
A sequence variant that causes the inactivation of a ligand binding site with respect to a reference sequence.
kareneilbeck
2010-03-22T11:58:00Z
inactive ligand binding site
sequence
SO:0001560
inactive_ligand_binding_site
A sequence variant that causes the inactivation of a ligand binding site with respect to a reference sequence.
SO:ke
A sequence variant that causes some but not all loss of polypeptide function with respect to a reference sequence.
kareneilbeck
2010-03-22T11:58:32Z
polypeptide partial loss of function
sequence
SO:0001561
polypeptide_partial_loss_of_function
A sequence variant that causes some but not all loss of polypeptide function with respect to a reference sequence.
SO:ke
A sequence variant that causes a change in post translational processing of the peptide with respect to a reference sequence.
kareneilbeck
2010-03-22T11:59:06Z
polypeptide post translational processing variant
sequence
SO:0001562
polypeptide_post_translational_processing_variant
A sequence variant that causes a change in post translational processing of the peptide with respect to a reference sequence.
SO:ke
A sequence variant where copies of a feature (CNV) are either increased or decreased.
kareneilbeck
2010-03-22T02:27:33Z
copy number change
sequence
SO:0001563
copy_number_change
A sequence variant where copies of a feature (CNV) are either increased or decreased.
SO:ke
A sequence variant where the structure of the gene is changed.
kareneilbeck
2010-03-22T02:28:01Z
http://snpeff.sourceforge.net/SnpEff_manual.html
Jannovar:gene_variant
VAAST:gene_variant
gene structure variant
snpEff:GENE
sequence
SO:0001564
gene_variant
A sequence variant where the structure of the gene is changed.
SO:ke
Jannovar:gene_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VAAST:gene_variant
snpEff:GENE
A sequence variant whereby two genes have become joined.
kareneilbeck
2010-03-22T02:28:28Z
gene fusion
sequence
SO:0001565
gene_fusion
A sequence variant whereby two genes have become joined.
SO:ke
A sequence variant located within a regulatory region.
kareneilbeck
2010-03-22T02:28:48Z
http://snpeff.sourceforge.net/SnpEff_manual.html
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
Jannovar:regulatory_region_variant
VEP:regulatory_region_variant
regulatory region variant
regulatory_region_
snpEff:REGULATION
sequence
SO:0001566
EBI term: Regulatory region variations - In regulatory region annotated by Ensembl.
regulatory_region_variant
A sequence variant located within a regulatory region.
SO:ke
Jannovar:regulatory_region_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VEP:regulatory_region_variant
regulatory_region_
http://ensembl.org/info/docs/variation/index.html
snpEff:REGULATION
A sequence variant where at least one base in the terminator codon is changed, but the terminator remains.
kareneilbeck
2010-04-19T05:02:30Z
http://snpeff.sourceforge.net/SnpEff_manual.html
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
Jannovar:stop_retained_variant
VAAST:stop_retained
VAAST:stop_retained_variant
VEP:stop_retained_variant
snpEff:NON_SYNONYMOUS_STOP
snpEff:SYNONYMOUS_STOP
stop retained variant
sequence
SO:0001567
stop_retained_variant
A sequence variant where at least one base in the terminator codon is changed, but the terminator remains.
SO:ke
Jannovar:stop_retained_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VAAST:stop_retained
VAAST:stop_retained_variant
VEP:stop_retained_variant
snpEff:NON_SYNONYMOUS_STOP
snpEff:SYNONYMOUS_STOP
A sequence variant that changes the process of splicing.
kareneilbeck
2010-03-22T02:29:22Z
Jannovar:splicing_variant
splicing variant
sequence
SO:0001568
splicing_variant
A sequence variant that changes the process of splicing.
SO:ke
Jannovar:splicing_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
A sequence variant causing a new (functional) splice site.
kareneilbeck
2010-03-22T02:29:41Z
cryptic splice site activation
sequence
SO:0001569
cryptic_splice_site_variant
A sequence variant causing a new (functional) splice site.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
A sequence variant whereby a new splice site is created due to the activation of a new acceptor.
kareneilbeck
2010-03-22T02:30:11Z
cryptic splice acceptor
sequence
SO:0001570
cryptic_splice_acceptor
A sequence variant whereby a new splice site is created due to the activation of a new acceptor.
SO:ke
A sequence variant whereby a new splice site is created due to the activation of a new donor.
kareneilbeck
2010-03-22T02:30:35Z
cryptic splice donor
sequence
SO:0001571
cryptic_splice_donor
A sequence variant whereby a new splice site is created due to the activation of a new donor.
SO:ke
A sequence variant whereby an exon is lost from the transcript.
kareneilbeck
2010-03-22T02:31:09Z
http://snpeff.sourceforge.net/SnpEff_manual.html
Jannovar:exon_loss_variant
exon loss
snpEff:EXON_DELETED
sequence
SO:0001572
exon_loss_variant
A sequence variant whereby an exon is lost from the transcript.
SO:ke
Jannovar:exon_loss_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
snpEff:EXON_DELETED
A sequence variant whereby an intron is gained by the processed transcript; usually a result of an alteration of the donor or acceptor.
kareneilbeck
2010-03-22T02:31:25Z
intron gain
intron gain variant
sequence
SO:0001573
intron_gain_variant
A sequence variant whereby an intron is gained by the processed transcript; usually a result of an alteration of the donor or acceptor.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
A splice variant that changes the 2 base region at the 3' end of an intron.
kareneilbeck
2010-03-22T02:31:52Z
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
http://snpeff.sourceforge.net/SnpEff_manual.html
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
Jannovar:splice_acceptor_variant
Seattleseq:splice-acceptor
VAAST:splice_acceptor_variant
VEP:splice_acceptor_variant
snpEff:SPLICE_SITE_ACCEPTOR
splice acceptor variant
sequence
SO:0001574
splice_acceptor_variant
A splice variant that changes the 2 base region at the 3' end of an intron.
SO:ke
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
Jannovar:splice_acceptor_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
Seattleseq:splice-acceptor
VAAST:splice_acceptor_variant
VEP:splice_acceptor_variant
snpEff:SPLICE_SITE_ACCEPTOR
A splice variant that changes the 2 base pair region at the 5' end of an intron.
kareneilbeck
2010-03-22T02:32:10Z
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
http://snpeff.sourceforge.net/SnpEff_manual.html
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
Jannovar:splice_donor_variant
Seattleseq:splice-donor
VAAST:splice_donor_variant
VEP:splice_donor_variant
snpEff:SPLICE_SITE_DONOR
splice donor variant
sequence
SO:0001575
splice_donor_variant
A splice variant that changes the 2 base pair region at the 5' end of an intron.
SO:ke
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
Jannovar:splice_donor_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
Seattleseq:splice-donor
VAAST:splice_donor_variant
VEP:splice_donor_variant
snpEff:SPLICE_SITE_DONOR
A sequence variant that changes the structure of the transcript.
kareneilbeck
2010-03-22T02:32:41Z
http://snpeff.sourceforge.net/SnpEff_manual.html
Jannovar:transcript_variant
VAAST:transcript_variant
snpEff:TRANSCRIPT
transcript variant
sequence
SO:0001576
transcript_variant
A sequence variant that changes the structure of the transcript.
SO:ke
Jannovar:transcript_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VAAST:transcript_variant
snpEff:TRANSCRIPT
A transcript variant with a complex indel: an insertion or deletion that spans an exon/intron border or a coding sequence/UTR border.
kareneilbeck
2010-03-22T02:33:03Z
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
complex change in transcript
complex transcript variant
complex_indel
sequence
Seattleseq:codingComplex
Seattleseq:codingComplex-near-splice
SO:0001577
EBI term: Complex InDel - Insertion or deletion that spans an exon/intron border or a coding sequence/UTR border.
complex_transcript_variant
A transcript variant with a complex indel: an insertion or deletion that spans an exon/intron border or a coding sequence/UTR border.
http://ensembl.org/info/docs/variation/index.html
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
complex_indel
http://ensembl.org/info/docs/variation/index.html
Seattleseq:codingComplex
Seattleseq:codingComplex-near-splice
A sequence variant where at least one base of the terminator codon (stop) is changed, resulting in an elongated transcript.
kareneilbeck
2010-03-23T03:46:42Z
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
http://snpeff.sourceforge.net/SnpEff_manual.html
http://vat.gersteinlab.org/formats.php
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
ANNOVAR:stoploss
Jannovar:stop_lost
Seattleseq:stop-lost
VAAST:stop_lost
VAT:removedStop
VEP:stop_lost
snpEff:STOP_LOST
stop codon lost
stop lost
sequence
Seattleseq:stop-lost-near-splice
SO:0001578
EBI term: Stop lost - In coding sequence, resulting in the loss of a stop codon.
stop_lost
A sequence variant where at least one base of the terminator codon (stop) is changed, resulting in an elongated transcript.
SO:ke
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
http://vat.gersteinlab.org/formats.php
VAT
ANNOVAR:stoploss
http://www.openbioinformatics.org/annovar/annovar_download.html
Jannovar:stop_lost
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
Seattleseq:stop-lost
VAAST:stop_lost
VAT:removedStop
VEP:stop_lost
snpEff:STOP_LOST
stop lost
http://ensembl.org/info/docs/variation/index.html
Seattleseq:stop-lost-near-splice
transcript sequence variant
sequence
SO:0001579
transcript_sequence_variant
true
A sequence variant that changes the coding sequence.
kareneilbeck
2010-03-22T02:34:36Z
SO:0001581
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
http://snpeff.sourceforge.net/SnpEff_manual.html
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
Jannovar:coding_sequence_variant
Seattleseq:coding
VAAST:coding_sequence_variant
VEP:coding_sequence_variant
coding sequence variant
coding variant
codon variant
codon_variant
snpEff:CDS
sequence
snpEff:CODON_CHANGE
SO:0001580
coding_sequence_variant
A sequence variant that changes the coding sequence.
SO:ke
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
Jannovar:coding_sequence_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
Seattleseq:coding
VAAST:coding_sequence_variant
VEP:coding_sequence_variant
snpEff:CDS
snpEff:CODON_CHANGE
true
A codon variant that changes at least one base of the first codon of a transcript.
kareneilbeck
2010-03-22T02:35:18Z
http://snpeff.sourceforge.net/SnpEff_manual.html
http://vat.gersteinlab.org/formats.php
loinc:LA6695-6
Jannovar:initiator_codon_variant
VAT:startOverlap
initiator codon variant
initiator codon change
sequence
snpEff:NON_SYNONYMOUS_START
SO:0001582
This is being used to annotate changes to the first codon of a transcript when the first annotated codon is not a methionine codon. A variant is predicted to change the first amino acid of a translation irrespective of whether the underlying codon is an AUG. As such, for transcripts with an incomplete CDS (sequence does not start with an AUG), the variant is still called.
initiator_codon_variant
A codon variant that changes at least one base of the first codon of a transcript.
SO:ke
http://vat.gersteinlab.org/formats.php
VAT
loinc:LA6695-6
Initiating Methionine
Jannovar:initiator_codon_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VAT:startOverlap
snpEff:NON_SYNONYMOUS_START
A sequence variant that changes one or more bases, resulting in a different amino acid sequence but where the length is preserved.
kareneilbeck
2010-03-22T02:35:49Z
SO:0001584
SO:0001783
http://en.wikipedia.org/wiki/Missense_mutation
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
http://snpeff.sourceforge.net/SnpEff_manual.html
http://vat.gersteinlab.org/formats.php
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
loinc:LA6698-0
Jannovar:missense_variant
Seattleseq:missense
VAAST:missense_variant
VAT:nonsynonymous
VEP:missense_variant
missense
missense codon
snpEff:NON_SYNONYMOUS_CODING
sequence
ANNOVAR:nonsynonymous SNV
Seattleseq:missense-near-splice
VAAST:non_synonymous_codon
SO:0001583
EBI term: Non-synonymous SNPs. SNPs that are located in the coding sequence and result in an amino acid change in the encoded peptide sequence. A change that causes a non_synonymous_codon can be more than 3 bases, for example a 4-base substitution.
missense_variant
A sequence variant that changes one or more bases, resulting in a different amino acid sequence but where the length is preserved.
EBI:fc
EBI:gr
SO:ke
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
http://vat.gersteinlab.org/formats.php
VAT
loinc:LA6698-0
Missense
Jannovar:missense_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
Seattleseq:missense
VAAST:missense_variant
VAT:nonsynonymous
VEP:missense_variant
missense
ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd
snpEff:NON_SYNONYMOUS_CODING
ANNOVAR:nonsynonymous SNV
http://www.openbioinformatics.org/annovar/annovar_download.html
Seattleseq:missense-near-splice
VAAST:non_synonymous_codon
true
A sequence variant whereby at least one base of a codon is changed resulting in a codon that encodes for a different but similar amino acid. These variants may or may not be deleterious.
kareneilbeck
2010-03-22T02:36:40Z
conservative missense codon
conservative missense variant
sequence
neutral missense codon
quiet missense codon
SO:0001585
conservative_missense_variant
A sequence variant whereby at least one base of a codon is changed resulting in a codon that encodes for a different but similar amino acid. These variants may or may not be deleterious.
SO:ke
A sequence variant whereby at least one base of a codon is changed resulting in a codon that encodes for an amino acid with different biochemical properties.
kareneilbeck
2010-03-22T02:37:16Z
non conservative missense codon
non conservative missense variant
sequence
SO:0001586
non_conservative_missense_variant
A sequence variant whereby at least one base of a codon is changed resulting in a codon that encodes for an amino acid with different biochemical properties.
SO:ke
A sequence variant whereby at least one base of a codon is changed, resulting in a premature stop codon, leading to a shortened polypeptide.
kareneilbeck
2010-03-22T02:37:52Z
http://ensembl.org/info/genome/variation/prediction/predicted_data.html
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
http://snpeff.sourceforge.net/SnpEff_manual.html
http://vat.gersteinlab.org/formats.php
loinc:LA6699-8
ANNOVAR:stopgain
Jannovar:stop_gained
Seattleseq:stop-gained
VAAST:stop_gained
VAT:prematureStop
VEP:stop_gained
nonsense
nonsense codon
snpEff:STOP_GAINED
stop gained
sequence
Seattleseq:stop-gained-near-splice
stop codon gained
SO:0001587
EBI term: Stop gained - In coding sequence, resulting in the gain of a stop codon (i.e. leading to a shortened peptide sequence).
stop_gained
A sequence variant whereby at least one base of a codon is changed, resulting in a premature stop codon, leading to a shortened polypeptide.
SO:ke
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
http://vat.gersteinlab.org/formats.php
VAT
loinc:LA6699-8
Nonsense
ANNOVAR:stopgain
http://www.openbioinformatics.org/annovar/annovar_download.html
Jannovar:stop_gained
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
Seattleseq:stop-gained
VAAST:stop_gained
VAT:prematureStop
VEP:stop_gained
nonsense
ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd
snpEff:STOP_GAINED
stop gained
http://ensembl.org/info/docs/variation/index.html
Seattleseq:stop-gained-near-splice
true
A sequence variant which causes a disruption of the translational reading frame, because the number of nucleotides inserted or deleted is not a multiple of three.
kareneilbeck
2010-03-22T02:40:19Z
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
http://snpeff.sourceforge.net/SnpEff_manual.html
http://vat.gersteinlab.org/formats.php
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
loinc:LA6694-9
Jannovar:frameshift_variant
Seattleseq:frameshift
VAAST:frameshift_variant
VEP:frameshift_variant
frameshift variant
frameshift_
frameshift_coding
snpEff:FRAME_SHIFT
VAT:deletionFS
VAT:insertionFS
sequence
ANNOVAR:frameshift block substitution
ANNOVAR:frameshift substitution
Seattleseq:frameshift-near-splice
SO:0001589
EBI term: Frameshift variations - In coding sequence, resulting in a frameshift.
frameshift_variant
A sequence variant which causes a disruption of the translational reading frame, because the number of nucleotides inserted or deleted is not a multiple of three.
SO:ke
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
http://vat.gersteinlab.org/formats.php
VAT
loinc:LA6694-9
Frameshift
Jannovar:frameshift_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
Seattleseq:frameshift
VAAST:frameshift_variant
VEP:frameshift_variant
frameshift_
ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd
frameshift_coding
http://ensembl.org/info/docs/variation/index.html
snpEff:FRAME_SHIFT
VAT:deletionFS
VAT:insertionFS
ANNOVAR:frameshift block substitution
http://www.openbioinformatics.org/annovar/annovar_download.html
Seattleseq:frameshift-near-splice
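The frameshift_variant definition above turns on one arithmetic condition: the number of nucleotides inserted or deleted is not a multiple of three. A minimal sketch of that check follows. It assumes the variant is given as VCF-style reference and alternate allele strings lying wholly within coding sequence; the function name is_frameshift is hypothetical and is not part of the ontology or of any annotation tool listed here.

```python
def is_frameshift(ref: str, alt: str) -> bool:
    """Return True when the net length change is not a multiple of three.

    Assumption: ref and alt are allele strings that lie entirely inside
    a coding sequence, so only the net inserted/deleted length matters.
    """
    net_change = len(alt) - len(ref)  # > 0 for insertions, < 0 for deletions
    return net_change % 3 != 0


print(is_frameshift("A", "AT"))    # one base inserted
print(is_frameshift("A", "ATGC"))  # three bases inserted, frame preserved
```

A real annotator would also need the transcript model to decide whether the alleles overlap coding sequence at all; that lookup is deliberately left out of this sketch.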
A sequence variant whereby at least one of the bases in the terminator codon is changed.
kareneilbeck
2010-03-22T02:40:37Z
SO:0001625
http://vat.gersteinlab.org/formats.php
loinc:LA6700-2
VAT:endOverlap
terminal codon variant
terminal_codon_variant
terminator codon variant
sequence
SO:0001590
The terminal codon may be the terminator, or in an incomplete transcript the last available codon.
terminator_codon_variant
A sequence variant whereby at least one of the bases in the terminator codon is changed.
SO:ke
http://vat.gersteinlab.org/formats.php
VAT
loinc:LA6700-2
Stop Codon Mutation
VAT:endOverlap
A sequence variant that reverts the sequence of a previous frameshift mutation back to the initial frame.
kareneilbeck
2010-03-22T02:41:09Z
frame restoring variant
sequence
SO:0001591
frame_restoring_variant
A sequence variant that reverts the sequence of a previous frameshift mutation back to the initial frame.
SO:ke
A sequence variant which causes a disruption of the translational reading frame, by shifting one base ahead.
kareneilbeck
2010-03-22T02:41:30Z
-1 frameshift variant
minus 1 frameshift variant
sequence
SO:0001592
minus_1_frameshift_variant
A sequence variant which causes a disruption of the translational reading frame, by shifting one base ahead.
http://arjournals.annualreviews.org/doi/pdf/10.1146/annurev.ge.08.120174.001535
A sequence variant which causes a disruption of the translational reading frame, by shifting two bases forward.
kareneilbeck
2010-03-22T02:41:52Z
-2 frameshift variant
minus 2 frameshift variant
sequence
SO:0001593
minus_2_frameshift_variant
A sequence variant which causes a disruption of the translational reading frame, by shifting one base backward.
kareneilbeck
2010-03-22T02:42:06Z
+1 frameshift variant
plus 1 frameshift variant
sequence
SO:0001594
plus_1_frameshift_variant
A sequence variant which causes a disruption of the translational reading frame, by shifting one base backward.
http://arjournals.annualreviews.org/doi/pdf/10.1146/annurev.ge.08.120174.001535
A sequence variant which causes a disruption of the translational reading frame, by shifting two bases backward.
kareneilbeck
2010-03-22T02:42:23Z
+2 frameshift variant
plus 2 frameshift variant
sequence
SO:0001595
plus_2_frameshift_variant
A sequence variant within a transcript that changes the secondary structure of the RNA product.
kareneilbeck
2010-03-22T02:43:18Z
transcript secondary structure variant
sequence
SO:0001596
transcript_secondary_structure_variant
A sequence variant within a transcript that changes the secondary structure of the RNA product.
SO:ke
A secondary structure variant that compensates for the change made by a previous variant.
kareneilbeck
2010-03-22T02:43:54Z
compensatory transcript secondary structure variant
sequence
SO:0001597
compensatory_transcript_secondary_structure_variant
A secondary structure variant that compensates for the change made by a previous variant.
SO:ke
A sequence variant within the transcript that changes the structure of the translational product.
kareneilbeck
2010-03-22T02:44:17Z
translational product structure variant
sequence
SO:0001598
translational_product_structure_variant
A sequence variant within the transcript that changes the structure of the translational product.
SO:ke
A sequence variant that changes the resulting polypeptide structure.
kareneilbeck
2010-03-22T02:44:46Z
3D polypeptide structure variant
sequence
SO:0001599
3D_polypeptide_structure_variant
A sequence variant that changes the resulting polypeptide structure.
SO:ke
A sequence variant that changes the resulting polypeptide structure.
kareneilbeck
2010-03-22T02:45:13Z
complex 3D structural variant
sequence
SO:0001600
complex_3D_structural_variant
A sequence variant that changes the resulting polypeptide structure.
SO:ke
A sequence variant in the CDS region that causes a conformational change in the resulting polypeptide sequence.
kareneilbeck
2010-03-22T02:45:48Z
conformational change variant
sequence
SO:0001601
conformational_change_variant
A sequence variant in the CDS region that causes a conformational change in the resulting polypeptide sequence.
SO:ke
A variant that changes the translational product with respect to the reference.
kareneilbeck
2010-03-22T02:46:54Z
complex change of translational product variant
sequence
SO:0001602
complex_change_of_translational_product_variant
A sequence variant within the CDS that causes a change in the resulting polypeptide sequence.
kareneilbeck
2010-03-22T02:47:13Z
polypeptide sequence variant
sequence
SO:0001603
polypeptide_sequence_variant
A sequence variant within the CDS that causes a change in the resulting polypeptide sequence.
SO:ke
A sequence variant within a CDS resulting in the loss of an amino acid from the resulting polypeptide.
kareneilbeck
2010-03-22T02:47:36Z
amino acid deletion
sequence
SO:0001604
amino_acid_deletion
A sequence variant within a CDS resulting in the loss of an amino acid from the resulting polypeptide.
SO:ke
A sequence variant within a CDS resulting in the gain of an amino acid to the resulting polypeptide.
kareneilbeck
2010-03-22T02:47:56Z
amino acid insertion
sequence
SO:0001605
amino_acid_insertion
A sequence variant within a CDS resulting in the gain of an amino acid to the resulting polypeptide.
SO:ke
A sequence variant of a codon resulting in the substitution of one amino acid for another in the resulting polypeptide.
kareneilbeck
2010-03-22T02:48:17Z
VAAST:amino_acid_substitution
amino acid substitution
sequence
SO:0001606
amino_acid_substitution
A sequence variant of a codon resulting in the substitution of one amino acid for another in the resulting polypeptide.
SO:ke
VAAST:amino_acid_substitution
A sequence variant of a codon causing the substitution of a similar amino acid for another in the resulting polypeptide.
kareneilbeck
2010-03-22T02:48:57Z
conservative amino acid substitution
sequence
SO:0001607
conservative_amino_acid_substitution
A sequence variant of a codon causing the substitution of a similar amino acid for another in the resulting polypeptide.
SO:ke
A sequence variant of a codon causing the substitution of a non conservative amino acid for another in the resulting polypeptide.
kareneilbeck
2010-03-22T02:49:23Z
non conservative amino acid substitution
sequence
SO:0001608
non_conservative_amino_acid_substitution
A sequence variant of a codon causing the substitution of a non conservative amino acid for another in the resulting polypeptide.
SO:ke
An elongation of a polypeptide sequence deriving from a sequence variant extending the CDS.
kareneilbeck
2010-03-22T02:49:52Z
elongated polypeptide
sequence
SO:0001609
elongated_polypeptide
An elongation of a polypeptide sequence deriving from a sequence variant extending the CDS.
SO:ke
An elongation of a polypeptide sequence at the C terminus deriving from a sequence variant extending the CDS.
kareneilbeck
2010-03-22T02:50:20Z
elongated polypeptide C terminal
sequence
SO:0001610
elongated_polypeptide_C_terminal
An elongation of a polypeptide sequence at the C terminus deriving from a sequence variant extending the CDS.
SO:ke
An elongation of a polypeptide sequence at the N terminus deriving from a sequence variant extending the CDS.
kareneilbeck
2010-03-22T02:50:31Z
elongated polypeptide N terminal
sequence
SO:0001611
elongated_polypeptide_N_terminal
An elongation of a polypeptide sequence at the N terminus deriving from a sequence variant extending the CDS.
SO:ke
A sequence variant within the CDS that causes in frame elongation of the resulting polypeptide sequence at the C terminus.
kareneilbeck
2010-03-22T02:51:05Z
elongated in frame polypeptide C terminal
sequence
SO:0001612
elongated_in_frame_polypeptide_C_terminal
A sequence variant within the CDS that causes in frame elongation of the resulting polypeptide sequence at the C terminus.
SO:ke
A sequence variant within the CDS that causes out of frame elongation of the resulting polypeptide sequence at the C terminus.
kareneilbeck
2010-03-22T02:51:20Z
elongated polypeptide out of frame C terminal
sequence
SO:0001613
elongated_out_of_frame_polypeptide_C_terminal
A sequence variant within the CDS that causes out of frame elongation of the resulting polypeptide sequence at the C terminus.
SO:ke
A sequence variant within the CDS that causes in frame elongation of the resulting polypeptide sequence at the N terminus.
kareneilbeck
2010-03-22T02:51:49Z
elongated in frame polypeptide N terminal
sequence
SO:0001614
elongated_in_frame_polypeptide_N_terminal_elongation
A sequence variant within the CDS that causes in frame elongation of the resulting polypeptide sequence at the N terminus.
SO:ke
A sequence variant within the CDS that causes out of frame elongation of the resulting polypeptide sequence at the N terminus.
kareneilbeck
2010-03-22T02:52:05Z
elongated out of frame N terminal
sequence
SO:0001615
elongated_out_of_frame_polypeptide_N_terminal
A sequence variant within the CDS that causes out of frame elongation of the resulting polypeptide sequence at the N terminus.
SO:ke
A sequence variant that causes a fusion of two polypeptide sequences.
kareneilbeck
2010-03-22T02:52:43Z
polypeptide fusion
sequence
SO:0001616
polypeptide_fusion
A sequence variant that causes a fusion of two polypeptide sequences.
SO:ke
A sequence variant of the CDS that causes a truncation of the resulting polypeptide.
kareneilbeck
2010-03-22T02:53:07Z
polypeptide truncation
sequence
SO:0001617
polypeptide_truncation
A sequence variant of the CDS that causes a truncation of the resulting polypeptide.
SO:ke
A sequence variant that causes the inactivation of a catalytic site with respect to a reference sequence.
kareneilbeck
2010-03-22T03:06:14Z
inactive catalytic site
sequence
SO:0001618
inactive_catalytic_site
A sequence variant that causes the inactivation of a catalytic site with respect to a reference sequence.
SO:ke
A transcript variant of a non coding RNA gene.
kareneilbeck
2010-03-23T11:16:23Z
http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences
Jannovar:non_coding_transcript_variant
VEP:non_coding_transcript_variant
nc transcript variant
non coding transcript variant
within_non_coding_gene
ANNOVAR:ncRNA
sequence
SO:0001619
Within non-coding gene - Located within a gene that does not code for a protein.
non_coding_transcript_variant
A transcript variant of a non coding RNA gene.
SO:ke
Jannovar:non_coding_transcript_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VEP:non_coding_transcript_variant
within_non_coding_gene
http://ensembl.org/info/docs/variation/index.html
ANNOVAR:ncRNA
http://annovar.openbioinformatics.org/en/latest/user-guide/gene/
A transcript variant located within the sequence of the mature miRNA.
kareneilbeck
2010-03-23T11:16:58Z
http://snpeff.sourceforge.net/SnpEff_manual.html
VEP:mature_miRNA_variant
mature miRNA variant
snpEff:MICRO_RNA
within_mature_miRNA
sequence
SO:0001620
EBI term: Within mature miRNA - Located within a microRNA.
mature_miRNA_variant
A transcript variant located within the sequence of the mature miRNA.
SO:ke
VEP:mature_miRNA_variant
snpEff:MICRO_RNA
within_mature_miRNA
http://ensembl.org/info/docs/variation/index.html
A variant in a transcript that is the target of nonsense-mediated mRNA decay.
kareneilbeck
2010-03-23T11:20:40Z
http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences
NMD transcript variant
NMD_transcript
Nonsense mediated decay transcript variant
VEP:NMD_transcript_variant
sequence
SO:0001621
NMD_transcript_variant
A variant in a transcript that is the target of nonsense-mediated mRNA decay.
SO:ke
NMD_transcript
http://ensembl.org/info/docs/variation/index.html
VEP:NMD_transcript_variant
A transcript variant that is located within the UTR.
kareneilbeck
2010-03-23T11:22:58Z
UTR variant
UTR_
sequence
SO:0001622
UTR_variant
A transcript variant that is located within the UTR.
SO:ke
UTR_
http://ensembl.org/info/docs/variation/index.html
A UTR variant of the 5' UTR.
kareneilbeck
2010-03-23T11:23:29Z
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
http://snpeff.sourceforge.net/SnpEff_manual.html
http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences
5'UTR variant
5PRIME_UTR
Jannovar:5_prime_utr_variant
Seattleseq:5-prime-UTR
VAAST:5_prime_UTR_variant
VAAST:five_prime_UTR_variant
VEP:5_prime_UTR_variant
five prime UTR variant
snpEff:UTR_5_PRIME
untranslated-5
sequence
ANNOVAR:UTR5
SO:0001623
EBI term: 5prime UTR variations - In 5prime UTR (untranslated region).
5_prime_UTR_variant
A UTR variant of the 5' UTR.
SO:ke
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
5PRIME_UTR
http://ensembl.org/info/docs/variation/index.html
Jannovar:5_prime_utr_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
Seattleseq:5-prime-UTR
VAAST:5_prime_UTR_variant
VAAST:five_prime_UTR_variant
VEP:5_prime_UTR_variant
snpEff:UTR_5_PRIME
untranslated-5
ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd
ANNOVAR:UTR5
http://www.openbioinformatics.org/annovar/annovar_download.html
A UTR variant of the 3' UTR.
kareneilbeck
2010-03-23T11:23:54Z
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
http://snpeff.sourceforge.net/SnpEff_manual.html
http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences
3'UTR variant
3PRIME_UTR
Jannovar:3_prime_utr_variant
Seattleseq:3-prime-UTR
VAAST:3_prime_UTR_variant
VAAST:three_prime_UTR_variant
VEP:3_prime_UTR_variant
snpEff:UTR_3_PRIME
three prime UTR variant
untranslated-3
sequence
ANNOVAR:UTR3
SO:0001624
EBI term: 3prime UTR variations - In 3prime UTR.
3_prime_UTR_variant
A UTR variant of the 3' UTR.
SO:ke
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
3PRIME_UTR
http://ensembl.org/info/docs/variation/index.html
Jannovar:3_prime_utr_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
Seattleseq:3-prime-UTR
VAAST:3_prime_UTR_variant
VAAST:three_prime_UTR_variant
VEP:3_prime_UTR_variant
snpEff:UTR_3_PRIME
untranslated-3
ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd
ANNOVAR:UTR3
http://www.openbioinformatics.org/annovar/annovar_download.html
true
A sequence variant where at least one base of the final codon of an incompletely annotated transcript is changed.
kareneilbeck
2010-03-23T03:51:15Z
http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences
VEP:incomplete_terminal_codon_variant
incomplete terminal codon variant
partial_codon
sequence
SO:0001626
EBI term: Partial codon - Located within the final, incomplete codon of a transcript with a shortened coding sequence where the end is unknown.
incomplete_terminal_codon_variant
A sequence variant where at least one base of the final codon of an incompletely annotated transcript is changed.
SO:ke
VEP:incomplete_terminal_codon_variant
partial_codon
http://ensembl.org/info/docs/variation/index.html
A transcript variant occurring within an intron.
kareneilbeck
2010-03-23T03:52:38Z
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
http://snpeff.sourceforge.net/SnpEff_manual.html
http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences
Jannovar:intron_variant
Seattleseq:intron
VAAST:intron_variant
VEP:intron_variant
intron variant
intron_
intronic
snpEff:INTRON
sequence
ANNOVAR:intronic
Seattleseq:intron-near-splice
SO:0001627
EBI term: Intronic variations - In intron.
intron_variant
A transcript variant occurring within an intron.
SO:ke
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
Jannovar:intron_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
Seattleseq:intron
VAAST:intron_variant
VEP:intron_variant
intron_
ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd
intronic
http://ensembl.org/info/docs/variation/index.html
snpEff:INTRON
ANNOVAR:intronic
http://www.openbioinformatics.org/annovar/annovar_download.html
Seattleseq:intron-near-splice
A sequence variant located in the intergenic region, between genes.
kareneilbeck
2010-03-23T05:07:37Z
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
http://snpeff.sourceforge.net/SnpEff_manual.html
http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences
Jannovar:intergenic_variant
Seattleseq:intergenic
VEP:intergenic_variant
intergenic
intergenic variant
snpEff:INTERGENIC
sequence
ANNOVAR:intergenic
SO:0001628
EBI term: Intergenic variations - More than 5 kb either upstream or downstream of a transcript.
intergenic_variant
A sequence variant located in the intergenic region, between genes.
SO:ke
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
Jannovar:intergenic_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
Seattleseq:intergenic
VEP:intergenic_variant
intergenic
http://ensembl.org/info/docs/variation/index.html
snpEff:INTERGENIC
ANNOVAR:intergenic
http://www.openbioinformatics.org/annovar/annovar_download.html
A sequence variant that changes the first two or last two bases of an intron, or the 5th base from the start of the intron in the orientation of the transcript.
kareneilbeck
2010-03-24T09:42:00Z
http://vat.gersteinlab.org/formats.php
VAT:spliceOverlap
essential_splice_site
splice site variant
sequence
SO:0001629
EBI term - essential splice site - In the first 2 or the last 2 base pairs of an intron. The 5th base is on the donor (5') side of the intron. Updated to be in line with Cancer Genome Project at the Sanger.
splice_site_variant
A sequence variant that changes the first two or last two bases of an intron, or the 5th base from the start of the intron in the orientation of the transcript.
http://ensembl.org/info/docs/variation/index.html
http://vat.gersteinlab.org/formats.php
VAT
VAT:spliceOverlap
essential_splice_site
http://ensembl.org/info/docs/variation/index.html
A sequence variant in which a change has occurred within the region of the splice site, either within 1-3 bases of the exon or 3-8 bases of the intron.
kareneilbeck
2010-03-24T09:46:02Z
http://snpeff.sourceforge.net/SnpEff_manual.html
http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences
Jannovar:splice_region_variant
VAAST:splice_region_variant
VEP:splice_region_variant
snpEff:SPLICE_SITE_REGION
splice region variant
sequence
ANNOVAR:splicing
snpEff:SPLICE_SITE_BRANCH
snpEff:SPLICE_SITE_BRANCH_U12
SO:0001630
EBI term: splice site - 1-3 bps into an exon or 3-8 bps into an intron.
splice_region_variant
A sequence variant in which a change has occurred within the region of the splice site, either within 1-3 bases of the exon or 3-8 bases of the intron.
http://ensembl.org/info/docs/variation/index.html
Jannovar:splice_region_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VAAST:splice_region_variant
VEP:splice_region_variant
snpEff:SPLICE_SITE_REGION
splice region variant
http://ensembl.org/info/docs/variation/index.html
ANNOVAR:splicing
http://www.openbioinformatics.org/annovar/annovar_download.html
snpEff:SPLICE_SITE_BRANCH
snpEff:SPLICE_SITE_BRANCH_U12
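The positional rules in the splice_site_variant and splice_region_variant definitions above can be sketched for a single intronic position. This is an illustration under stated assumptions: the offset is 1-based from the 5' end of the intron in transcript orientation, only intronic positions are handled (the 1-3 exonic bases are omitted), and the function name `classify_intronic_position` is hypothetical.

```python
def classify_intronic_position(offset: int, intron_length: int) -> str:
    """Classify a position inside an intron (hypothetical helper).

    offset is 1-based from the intron's 5' end, in transcript orientation.
    """
    if not 1 <= offset <= intron_length:
        raise ValueError("offset lies outside the intron")
    from_end = intron_length - offset + 1  # 1-based distance from the 3' end
    # First two or last two bases of the intron, or the 5th base from its start.
    if offset in (1, 2, 5) or from_end in (1, 2):
        return "splice_site_variant"
    # Otherwise, 3-8 bases into the intron from the nearer boundary.
    if 3 <= min(offset, from_end) <= 8:
        return "splice_region_variant"
    return "intron_variant"
```

The splice site test runs first, so the 5th donor-side base is reported as a splice_site_variant even though it also falls inside the 3-8 base window.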
A sequence variant located 5' of a gene.
kareneilbeck
2010-03-24T09:49:13Z
http://snpeff.sourceforge.net/SnpEff_manual.html
http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences
Jannovar:upstream_gene_variant
VEP:upstream_gene_variant
snpEff:UPSTREAM
upstream gene variant
sequence
ANNOVAR:upstream
SO:0001631
Different groups annotate up and downstream to different lengths. The subtypes are specific and are backed up with cross references.
upstream_gene_variant
A sequence variant located 5' of a gene.
SO:ke
Jannovar:upstream_gene_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VEP:upstream_gene_variant
snpEff:UPSTREAM
ANNOVAR:upstream
http://www.openbioinformatics.org/annovar/annovar_download.html
A sequence variant located 3' of a gene.
kareneilbeck
2010-03-24T09:49:38Z
http://snpeff.sourceforge.net/SnpEff_manual.html
http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences
Jannovar:downstream_gene_variant
VEP:downstream_gene_variant
downstream gene variant
snpEff:DOWNSTREAM
sequence
ANNOVAR:downstream
SO:0001632
Different groups annotate up and downstream to different lengths. The subtypes are specific and are backed up with cross references.
downstream_gene_variant
A sequence variant located 3' of a gene.
SO:ke
Jannovar:downstream_gene_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VEP:downstream_gene_variant
snpEff:DOWNSTREAM
ANNOVAR:downstream
http://www.openbioinformatics.org/annovar/annovar_download.html
A sequence variant located within 5 KB of the end of a gene.
kareneilbeck
2010-03-24T09:50:16Z
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
5KB downstream variant
Seattleseq:downstream-gene
downstream
sequence
within 5KB downstream
SO:0001633
EBI term: Downstream variations - Within 5 kb downstream of the 3prime end of a transcript.
5KB_downstream_variant
A sequence variant located within 5 KB of the end of a gene.
SO:ke
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
Seattleseq:downstream-gene
downstream
http://ensembl.org/info/docs/variation/index.html
A sequence variant located within a half KB of the end of a gene.
kareneilbeck
2010-03-24T09:50:42Z
500B downstream variant
near-gene-3
sequence
SO:0001634
500B_downstream_variant
A sequence variant located within a half KB of the end of a gene.
SO:ke
near-gene-3
ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd
A sequence variant located within 5KB 5' of a gene.
kareneilbeck
2010-03-24T09:51:06Z
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
5kb upstream variant
Seattleseq:upstream-gene
upstream
sequence
SO:0001635
EBI term: Upstream variations - Within 5 kb upstream of the 5prime end of a transcript.
5KB_upstream_variant
A sequence variant located within 5KB 5' of a gene.
SO:ke
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
Seattleseq:upstream-gene
upstream
http://ensembl.org/info/docs/variation/index.html
A sequence variant located within 2KB 5' of a gene.
kareneilbeck
2010-03-24T09:51:22Z
2KB upstream variant
near-gene-5
sequence
SO:0001636
2KB_upstream_variant
A sequence variant located within 2KB 5' of a gene.
SO:ke
near-gene-5
ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd
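The distance-based upstream and downstream terms above (5KB, 2KB and 500B windows) can be sketched as a lookup. This is a rough illustration: it assumes 1 KB means 1000 bp, treats the window limits as inclusive, and falls back to the general upstream_gene_variant or downstream_gene_variant term beyond the largest window. The names `FLANK_WINDOWS` and `flanking_terms` are hypothetical.

```python
from typing import List

# Window sizes in base pairs, assuming 1 KB = 1000 bp and inclusive limits.
FLANK_WINDOWS = {
    "upstream": [("2KB_upstream_variant", 2000), ("5KB_upstream_variant", 5000)],
    "downstream": [("500B_downstream_variant", 500), ("5KB_downstream_variant", 5000)],
}


def flanking_terms(side: str, distance_bp: int) -> List[str]:
    """Return every windowed term whose limit covers the given distance."""
    if side not in FLANK_WINDOWS:
        raise ValueError("side must be 'upstream' or 'downstream'")
    if distance_bp < 1:
        raise ValueError("distance_bp must be at least 1")
    terms = [name for name, limit in FLANK_WINDOWS[side] if distance_bp <= limit]
    # Beyond the largest window, fall back to the general term (an assumption).
    return terms or [side + "_gene_variant"]
```

Returning all matching terms, rather than only the narrowest, reflects the note above that different groups annotate up and downstream to different lengths.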
A gene that encodes ribosomal RNA.
kareneilbeck
2010-04-21T10:10:32Z
rDNA
rRNA gene
sequence
SO:0001637
Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold, GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
rRNA_gene
A gene that encodes ribosomal RNA.
SO:ke
A gene that encodes a piwi associated RNA.
kareneilbeck
2010-04-21T10:11:36Z
piRNA gene
sequence
SO:0001638
Moved from ncRNA_gene to sncRNA_gene 27 April 2021 to be more consistent with the organization of the ncRNA branch of SO. Requested by FlyBase, moved by Dave Sant. See GitHub Issue #514.
piRNA_gene
A gene that encodes a piwi associated RNA.
SO:ke
A gene that encodes an RNase P RNA.
kareneilbeck
2010-04-21T10:13:23Z
RNase P RNA gene
sequence
SO:0001639
Moved under enzymatic_RNA_gene on 18 Nov 2021. See GitHub Issue #533.
RNase_P_RNA_gene
A gene that encodes an RNase P RNA.
SO:ke
A gene that encodes an RNase_MRP_RNA.
kareneilbeck
2010-04-21T10:13:58Z
sequence
RNase MRP RNA gene
SO:0001640
Moved under enzymatic_RNA_gene on 18 Nov 2021. See GitHub Issue #533.
RNase_MRP_RNA_gene
A gene that encodes an RNase_MRP_RNA.
SO:ke
A gene that encodes a long, intervening non-coding RNA.
kareneilbeck
2010-04-21T10:14:24Z
lincRNA gene
sequence
SO:0001641
lincRNA_gene
A gene that encodes a long, intervening non-coding RNA.
PMID:23463798
SO:ke
http://www.gencodegenes.org/gencode_biotypes.html
A mathematically defined repeat (MDR) is an experimental feature that is determined by querying overlapping oligomers of length k against a database of shotgun sequence data and identifying regions in the query sequence that exceed a statistically determined threshold of repetitiveness.
kareneilbeck
2010-05-03T11:50:14Z
mathematically defined repeat
sequence
SO:0001642
Mathematically defined repeat regions are determined without regard to the biological origin of the repetitive region. The repeat units of an MDR are the overlapping oligomers of size k that were used for the query. Tools that can annotate mathematically defined repeats include Tallymer (Kurtz et al 2008, BMC Genomics: 517) and RePS (Wang et al, Genome Res 12(5): 824-831.).
mathematically_defined_repeat
A mathematically defined repeat (MDR) is an experimental feature that is determined by querying overlapping oligomers of length k against a database of shotgun sequence data and identifying regions in the query sequence that exceed a statistically determined threshold of repetitiveness.
SO:jestill
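The procedure in the mathematically defined repeat definition above (query overlapping k-mers against shotgun data, then flag those exceeding a threshold) can be sketched in a few lines. This is a simplified illustration, not how Tallymer or RePS work internally: the database is an in-memory list of reads, the threshold is supplied by the caller rather than derived statistically, and the function names are hypothetical.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every overlapping k-mer across a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def repetitive_positions(query, counts, k, threshold):
    """Return 0-based starts of query k-mers seen more than `threshold` times."""
    return [
        i
        for i in range(len(query) - k + 1)
        if counts[query[i:i + k]] > threshold
    ]
```

Runs of adjacent flagged positions would then be merged into repeat regions; that merging step is left out here.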
A telomerase RNA gene is a non coding RNA gene the RNA product of which is a component of telomerase.
kareneilbeck
2010-05-18T05:26:38Z
http:http://en.wikipedia.org/wiki/Telomerase_RNA_component
TERC
Telomerase RNA component
telomerase RNA gene
sequence
SO:0001643
telomerase_RNA_gene
A telomerase RNA gene is a non coding RNA gene the RNA product of which is a component of telomerase.
SO:ke
http:http://en.wikipedia.org/wiki/Telomerase_RNA_component
wikipedia
An engineered vector that is able to take part in homologous recombination in a host with the intent of introducing site specific genomic modifications.
kareneilbeck
2010-05-28T02:05:25Z
sequence
targeting vector
SO:0001644
targeting_vector
An engineered vector that is able to take part in homologous recombination in a host with the intent of introducing site specific genomic modifications.
MGD:tm
PMID:10354467
A measurable sequence feature that varies within a population.
kareneilbeck
2010-05-28T02:33:07Z
sequence
genetic marker
SO:0001645
genetic_marker
A measurable sequence feature that varies within a population.
SO:db
A genetic marker, discovered using Diversity Arrays Technology (DArT).
kareneilbeck
2010-05-28T02:34:43Z
DArT marker
sequence
SO:0001646
DArT_marker
A genetic marker, discovered using Diversity Arrays Technology (DArT).
SO:ke
A kind of ribosome entry site, specific to eukaryotic organisms, that overlaps part of both the 5' UTR and the CDS sequence.
kareneilbeck
2010-06-07T03:12:20Z
http://en.wikipedia.org/wiki/Kozak_consensus_sequence
kozak consensus
kozak consensus sequence
kozak sequence
sequence
SO:0001647
kozak_sequence
A kind of ribosome entry site, specific to eukaryotic organisms, that overlaps part of both the 5' UTR and the CDS sequence.
SO:ke
http://en.wikipedia.org/wiki/Kozak_consensus_sequence
wikipedia
A transposon that is disrupted by the insertion of another element.
kareneilbeck
2010-06-23T03:22:57Z
nested transposon
sequence
SO:0001648
nested_transposon
A transposon that is disrupted by the insertion of another element.
SO:ke
A repeat that is disrupted by the insertion of another element.
kareneilbeck
2010-06-23T03:24:55Z
INSDC_feature:repeat_region
INSDC_qualifier:nested
nested repeat
sequence
SO:0001649
nested_repeat
A repeat that is disrupted by the insertion of another element.
SO:ke
A sequence variant which does not cause a disruption of the translational reading frame.
kareneilbeck
2010-07-19T01:24:44Z
VAAST:inframe_variant
cds-indel
inframe variant
sequence
ANNOVAR:nonframeshift block substitution
ANNOVAR:nonframeshift substitution
SO:0001650
inframe_variant
A sequence variant which does not cause a disruption of the translational reading frame.
SO:ke
VAAST:inframe_variant
cds-indel
ANNOVAR:nonframeshift block substitution
http://www.openbioinformatics.org/annovar/annovar_download.html
ANNOVAR:nonframeshift substitution
true
true
A transcription factor binding site of variable direct repeats of the sequence PuGGTCA spaced by five nucleotides (DR5) found in the promoters of retinoic acid-responsive genes, to which retinoic acid receptors bind.
kareneilbeck
2010-08-03T10:46:12Z
RARE
retinoic acid responsive element
sequence
SO:0001653
retinoic_acid_responsive_element
A transcription factor binding site of variable direct repeats of the sequence PuGGTCA spaced by five nucleotides (DR5) found in the promoters of retinoic acid-responsive genes, to which retinoic acid receptors bind.
PMID:11327309
PMID:19917671
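The DR5 arrangement in the retinoic acid responsive element definition above (two direct repeats of PuGGTCA separated by five nucleotides) can be written as a pattern. A minimal sketch follows, assuming Pu means a purine (A or G), exact matches to the stated motif, and forward-strand scanning only. Because the definition describes the repeats as variable, real sites may deviate from this strict form. The names `DR5_PATTERN` and `find_dr5` are hypothetical.

```python
import re

# Two PuGGTCA half-sites (Pu = A or G) separated by exactly five bases.
# The lookahead lets overlapping candidate sites be reported as well.
DR5_PATTERN = re.compile(r"(?=([AG]GGTCA[ACGT]{5}[AG]GGTCA))")


def find_dr5(sequence: str):
    """Return 0-based start positions of strict DR5 matches on the given strand."""
    return [match.start() for match in DR5_PATTERN.finditer(sequence.upper())]
```

A tolerant search (for example, allowing one mismatch per half-site) would need a different approach than a plain regular expression.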
A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues.
kareneilbeck
2010-08-03T12:26:05Z
sequence
nucleotide to protein binding site
SO:0001654
nucleotide_to_protein_binding_site
A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues.
SO:ke
A binding site that, in the molecule, interacts selectively and non-covalently with nucleotide residues.
kareneilbeck
2010-08-03T12:30:04Z
np_bind
nucleotide binding site
sequence
SO:0001655
See GO:0000166 : nucleotide binding.
nucleotide_binding_site
A binding site that, in the molecule, interacts selectively and non-covalently with nucleotide residues.
SO:cb
np_bind
uniprot:feature
A binding site that, in the molecule, interacts selectively and non-covalently with metal ions.
kareneilbeck
2010-08-03T12:31:42Z
sequence
metal binding site
SO:0001656
See GO:0046872 : metal ion binding.
metal_binding_site
A binding site that, in the molecule, interacts selectively and non-covalently with metal ions.
SO:cb
A binding site that, in the molecule, interacts selectively and non-covalently with a small molecule such as a drug, or hormone.
kareneilbeck
2010-08-03T12:32:58Z
ligand binding site
sequence
SO:0001657
ligand_binding_site
A binding site that, in the molecule, interacts selectively and non-covalently with a small molecule such as a drug, or hormone.
SO:ke
An NTR is a nested repeat of two distinct tandem motifs interspersed with each other.
kareneilbeck
2010-08-26T09:36:16Z
NTR
nested tandem repeat
sequence
SO:0001658
Tracker ID: 3052459.
nested_tandem_repeat
An NTR is a nested repeat of two distinct tandem motifs interspersed with each other.
SO:AF
An element that can exist within the promoter region of a gene.
kareneilbeck
2010-10-01T11:48:32Z
promoter element
sequence
SO:0001659
Moved from is_a: SO:0001055 transcriptional_cis_regulatory_region as per request from GREEKC initiative in August 2020.
promoter_element
An element that only exists within the promoter region of a eukaryotic gene.
kareneilbeck
2010-10-01T11:49:03Z
core eukaryotic promoter element
sequence
general transcription factor binding site
SO:0001660
core_eukaryotic_promoter_element
An element that only exists within the promoter region of a eukaryotic gene.
GREEKC:cl
A TATA box core promoter of a gene transcribed by RNA polymerase II.
kareneilbeck
2010-10-01T02:42:12Z
RNA polymerase II TATA box
sequence
SO:0001661
RNA_polymerase_II_TATA_box
A TATA box core promoter of a gene transcribed by RNA polymerase II.
PMID:16858867
A TATA box core promoter of a gene transcribed by RNA polymerase III.
kareneilbeck
2010-10-01T02:43:16Z
RNA polymerase III TATA box
sequence
SO:0001662
RNA_polymerase_III_TATA_box
A TATA box core promoter of a gene transcribed by RNA polymerase III.
SO:ke
A core RNA polymerase II promoter element with consensus (G/A)T(T/G/A)(T/A)(G/T)(T/G)(T/G).
kareneilbeck
2010-10-01T02:49:55Z
BREd
sequence
BREd motif
SO:0001663
BREd_motif
A core RNA polymerase II promoter element with consensus (G/A)T(T/G/A)(T/A)(G/T)(T/G)(T/G).
PMID:16858867
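The BREd consensus above uses parenthesised alternatives such as (G/A). A small sketch of turning that notation into a regular expression is shown below, assuming the notation contains only the bases A, C, G and T with slash-separated alternatives. The helper `consensus_to_regex` and the constant `BRED` are hypothetical names introduced for illustration.

```python
import re


def consensus_to_regex(consensus: str) -> str:
    """Convert '(G/A)T...' style consensus notation into a regex string."""
    # "(G/A)" becomes the character class "[GA]"; plain letters pass through.
    return re.sub(
        r"\(([ACGT](?:/[ACGT])*)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )


BRED = re.compile(consensus_to_regex("(G/A)T(T/G/A)(T/A)(G/T)(T/G)(T/G)"))
```

`BRED.search(sequence)` then reports the first exact match to the consensus. A match to a seven-base degenerate motif is weak evidence on its own, so position relative to the TATA box would still need to be checked.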
A discontinuous core element of RNA polymerase II transcribed genes, situated downstream of the TSS. It is composed of three sub elements: SI, SII and SIII.
kareneilbeck
2010-10-01T02:56:41Z
sequence
downstream core element
SO:0001664
DCE
A discontinuous core element of RNA polymerase II transcribed genes, situated downstream of the TSS. It is composed of three sub elements: SI, SII and SIII.
PMID:16858867
A sub element of the DCE core promoter element, with consensus sequence CTTC.
kareneilbeck
2010-10-01T03:00:10Z
sequence
DCE SI
SO:0001665
DCE_SI
A sub element of the DCE core promoter element, with consensus sequence CTTC.
PMID:16858867
SO:ke
A sub element of the DCE core promoter element with consensus sequence CTGT.
kareneilbeck
2010-10-01T03:00:30Z
DCE SII
sequence
SO:0001666
DCE_SII
A sub element of the DCE core promoter element with consensus sequence CTGT.
PMID:16858867
SO:ke
A sub element of the DCE core promoter element with consensus sequence AGC.
kareneilbeck
2010-10-01T03:00:44Z
DCE SIII
sequence
SO:0001667
DCE_SIII
A sub element of the DCE core promoter element with consensus sequence AGC.
PMID:16858867
SO:ke
DNA segment that ranges from about -250 to -40 relative to +1 of the RNA transcription start site, where sequence specific DNA-binding transcription factors bind, such as Sp1, CTF (CCAAT-binding transcription factor), and CBF (CCAAT-box binding factor).
kareneilbeck
2010-10-01T03:10:23Z
sequence
proximal promoter element
specific transcription factor binding site
SO:0001668
proximal_promoter_element
DNA segment that ranges from about -250 to -40 relative to +1 of the RNA transcription start site, where sequence specific DNA-binding transcription factors bind, such as Sp1, CTF (CCAAT-binding transcription factor), and CBF (CCAAT-box binding factor).
PMID:12515390
PMID:9679020
SO:ml
The minimal portion of the promoter required to properly initiate transcription in RNA polymerase II transcribed genes.
kareneilbeck
2010-10-01T03:13:41Z
RNApol II core promoter
sequence
SO:0001669
RNApol_II_core_promoter
The minimal portion of the promoter required to properly initiate transcription in RNA polymerase II transcribed genes.
PMID:16858867
A regulatory promoter element that is distal from the TSS.
kareneilbeck
2010-10-01T03:21:08Z
sequence
distal promoter element
SO:0001670
distal_promoter_element
A DNA sequence to which bacterial RNA polymerase sigma 70 binds, to begin transcription.
kareneilbeck
2010-10-06T01:41:34Z
bacterial RNA polymerase promoter sigma 70
sequence
SO:0001671
bacterial_RNApol_promoter_sigma_70_element
A DNA sequence to which bacterial RNA polymerase sigma 54 binds, to begin transcription.
kareneilbeck
2010-10-06T01:42:37Z
bacterial RNA polymerase promoter sigma54
sequence
SO:0001672
bacterial_RNApol_promoter_sigma54_element
A conserved region about 12-bp upstream of the start point of bacterial transcription units, involved with sigma factor 54.
kareneilbeck
2010-10-06T01:44:57Z
minus 12 signal
sequence
SO:0001673
minus_12_signal
A conserved region about 12-bp upstream of the start point of bacterial transcription units, involved with sigma factor 54.
PMID:18331472
A conserved region about 24-bp upstream of the start point of bacterial transcription units, involved with sigma factor 54.
kareneilbeck
2010-10-06T01:45:24Z
sequence
minus 24 signal
SO:0001674
minus_24_signal
A conserved region about 24-bp upstream of the start point of bacterial transcription units, involved with sigma factor 54.
PMID:18331472
An A box within an RNA polymerase III type 1 promoter.
kareneilbeck
2010-10-06T05:43:43Z
sequence
A box type 1
SO:0001675
The A box can be found in the promoters of type 1 and type 2 (pol III), so sub-typing here allows the part_of relationship of the subtypes to remain true.
A_box_type_1
An A box within an RNA polymerase III type 1 promoter.
SO:ke
An A box within an RNA polymerase III type 2 promoter.
kareneilbeck
2010-10-06T05:44:18Z
sequence
A box type 2
SO:0001676
The A box can be found in the promoters of type 1 and type 2 (pol III), so sub-typing here allows the part_of relationship of the subtypes to remain true.
A_box_type_2
An A box within an RNA polymerase III type 2 promoter.
SO:ke
A core promoter region of RNA polymerase III type 1 promoters.
kareneilbeck
2010-10-06T05:52:03Z
IE
sequence
intermediate element
SO:0001677
intermediate_element
A core promoter region of RNA polymerase III type 1 promoters.
PMID:12381659
A promoter element that is not part of the core promoter, but provides the promoter with a specific regulatory region.
kareneilbeck
2010-10-07T04:39:48Z
sequence
regulatory promoter element
SO:0001678
regulatory_promoter_element
A promoter element that is not part of the core promoter, but provides the promoter with a specific regulatory region.
PMID:12381659
A regulatory region that is involved in the control of the process of transcription.
kareneilbeck
2010-10-12T03:49:35Z
transcription regulatory region
sequence
SO:0001679
Obsoleted by David Sant on 11 Feb 2021 when it was merged with transcriptional_cis_regulatory_region (SO:0001055) to reduce redundancy and be consistent with Gene Ontology. See GitHub Issue #527.
transcription_regulatory_region
true
A regulatory region that is involved in the control of the process of transcription.
SO:ke
A regulatory region that is involved in the control of the process of translation.
kareneilbeck
2010-10-12T03:52:45Z
translation regulatory region
sequence
SO:0001680
translation_regulatory_region
A regulatory region that is involved in the control of the process of translation.
SO:ke
A regulatory region that is involved in the control of the process of recombination.
kareneilbeck
2010-10-12T03:53:35Z
recombination regulatory region
sequence
SO:0001681
recombination_regulatory_region
A regulatory region that is involved in the control of the process of recombination.
SO:ke
A regulatory region that is involved in the control of the process of nucleotide replication.
kareneilbeck
2010-10-12T03:54:09Z
INSDC_feature:regulatory
INSDC_qualifier:replication_regulatory_region
replication regulatory region
sequence
SO:0001682
replication_regulatory_region
A regulatory region that is involved in the control of the process of nucleotide replication.
SO:ke
A sequence motif is a nucleotide or amino-acid sequence pattern that may have biological significance.
kareneilbeck
2010-10-14T04:13:22Z
http://en.wikipedia.org/wiki/Sequence_motif
sequence
sequence motif
SO:0001683
sequence_motif
A sequence motif is a nucleotide or amino-acid sequence pattern that may have biological significance.
http://en.wikipedia.org/wiki/Sequence_motif
http://en.wikipedia.org/wiki/Sequence_motif
wikipedia
An attribute of an experimentally derived feature.
kareneilbeck
2010-10-28T02:22:23Z
sequence
experimental feature attribute
SO:0001684
experimental_feature_attribute
An attribute of an experimentally derived feature.
SO:ke
The score of an experimentally derived feature such as a p-value.
kareneilbeck
2010-10-28T02:23:16Z
sequence
SO:0001685
score
The score of an experimentally derived feature such as a p-value.
SO:ke
An experimental feature attribute that defines the quality of the feature in a quantitative way, such as a phred quality score.
kareneilbeck
2010-10-28T02:24:11Z
sequence
quality value
SO:0001686
quality_value
An experimental feature attribute that defines the quality of the feature in a quantitative way, such as a phred quality score.
SO:ke
The nucleotide region (usually a palindrome) that is recognized by a restriction enzyme. This may or may not be equal to the restriction enzyme binding site.
kareneilbeck
2010-10-29T12:29:57Z
restriction endonuclease recognition site
restriction enzyme recognition site
sequence
SO:0001687
restriction_enzyme_recognition_site
The nucleotide region (usually a palindrome) that is recognized by a restriction enzyme. This may or may not be equal to the restriction enzyme binding site.
SO:ke
The boundary at which a restriction enzyme breaks the nucleotide sequence.
kareneilbeck
2010-10-29T12:35:02Z
restriction enzyme cleavage junction
sequence
SO:0001688
restriction_enzyme_cleavage_junction
The boundary at which a restriction enzyme breaks the nucleotide sequence.
SO:ke
The restriction enzyme cleavage junction on the 5' strand of the nucleotide sequence.
kareneilbeck
2010-10-29T12:36:24Z
5' restriction enzyme junction
sequence
SO:0001689
five_prime_restriction_enzyme_junction
The restriction enzyme cleavage junction on the 5' strand of the nucleotide sequence.
SO:ke
The restriction enzyme cleavage junction on the 3' strand of the nucleotide sequence.
kareneilbeck
2010-10-29T12:37:52Z
3' restriction enzyme junction
sequence
SO:0001690
three_prime_restriction_enzyme_junction
A restriction enzyme recognition site that, when cleaved, results in no overhangs.
kareneilbeck
2010-10-29T12:39:53Z
blunt end restriction enzyme cleavage site
sequence
SO:0001691
blunt_end_restriction_enzyme_cleavage_site
A restriction enzyme recognition site that, when cleaved, results in no overhangs.
SBOL:jgquinn
SO:ke
A site where restriction enzymes can cleave that will produce an overhang or 'sticky end'.
kareneilbeck
2010-10-29T12:40:50Z
sequence
sticky end restriction enzyme cleavage site
SO:0001692
sticky_end_restriction_enzyme_cleavage_site
A restriction enzyme cleavage site where both strands are cut at the same position.
kareneilbeck
2010-10-29T12:43:14Z
sequence
blunt end restriction enzyme cleavage site
SO:0001693
blunt_end_restriction_enzyme_cleavage_junction
A restriction enzyme cleavage site where both strands are cut at the same position.
SO:ke
A restriction enzyme cleavage site whereby only one strand is cut.
kareneilbeck
2010-10-29T12:44:48Z
sequence
single strand restriction enzyme cleavage site
SO:0001694
single_strand_restriction_enzyme_cleavage_site
A restriction enzyme cleavage site whereby only one strand is cut.
SO:ke
A terminal region of DNA sequence where the end of the region is not blunt ended.
kareneilbeck
2010-10-29T12:48:35Z
single strand overhang
sequence
sticky end
SO:0001695
restriction_enzyme_single_strand_overhang
A terminal region of DNA sequence where the end of the region is not blunt ended.
SO:ke
A region that has been implicated in binding although the exact coordinates of binding may be unknown.
kareneilbeck
2010-11-02T11:39:59Z
sequence
experimentally defined binding region
SO:0001696
experimentally_defined_binding_region
A region that has been implicated in binding although the exact coordinates of binding may be unknown.
SO:ke
A region of sequence identified by ChIP-seq technology to contain a protein binding site.
kareneilbeck
2010-11-02T11:43:07Z
sequence
ChIP seq region
SO:0001697
ChIP_seq_region
A region of sequence identified by ChIP-seq technology to contain a protein binding site.
SO:ke
A primer containing an SNV at the 3' end for accurate genotyping.
kareneilbeck
2010-11-11T03:25:21Z
ASPE primer
allele specific primer extension primer
sequence
SO:0001698
ASPE_primer
A primer containing an SNV at the 3' end for accurate genotyping.
http://www.ncbi.nlm.nih.gov/pubmed/11252801
A primer with one or more mismatches to the DNA template corresponding to a position within a restriction enzyme recognition site.
kareneilbeck
2010-11-11T03:27:09Z
dCAPS primer
derived cleaved amplified polymorphic primer
sequence
SO:0001699
dCAPS_primer
A primer with one or more mismatches to the DNA template corresponding to a position within a restriction enzyme recognition site.
http://www.ncbi.nlm.nih.gov/pubmed/9628033
Histone modification is a post-translationally modified region whereby residues of the histone protein are modified by methylation, acetylation, phosphorylation, ubiquitination, sumoylation, citrullination, or ADP-ribosylation.
kareneilbeck
2010-03-31T10:22:08Z
histone modification
sequence
histone modification site
SO:0001700
histone_modification
Histone modification is a post-translationally modified region whereby residues of the histone protein are modified by methylation, acetylation, phosphorylation, ubiquitination, sumoylation, citrullination, or ADP-ribosylation.
http://en.wikipedia.org/wiki/Histone
A histone modification site where the modification is the methylation of the residue.
kareneilbeck
2010-03-31T10:23:02Z
histone methylation
histone methylation site
sequence
SO:0001701
histone_methylation_site
A histone modification site where the modification is the methylation of the residue.
SO:ke
A histone modification site where the modification is the acetylation of the residue.
kareneilbeck
2010-03-31T10:23:27Z
histone acetylation
histone acetylation site
sequence
SO:0001702
histone_acetylation_site
A histone modification site where the modification is the acetylation of the residue.
SO:ke
A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein is acetylated.
kareneilbeck
2010-03-31T10:25:05Z
H3K9 acetylation site
H3K9ac
sequence
SO:0001703
H3K9_acetylation_site
A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein is acetylated.
http://en.wikipedia.org/wiki/Histone
A kind of histone modification site, whereby the 14th residue (a lysine), from the start of the H3 histone protein is acetylated.
kareneilbeck
2010-03-31T10:25:53Z
H3K14 acetylation site
H3K14ac
sequence
SO:0001704
H3K14_acetylation_site
A kind of histone modification site, whereby the 14th residue (a lysine), from the start of the H3 histone protein is acetylated.
http://en.wikipedia.org/wiki/Histone
A kind of histone modification, whereby the 4th residue (a lysine), from the start of the H3 protein is mono-methylated.
kareneilbeck
2010-03-31T10:28:14Z
H3K4 mono-methylation site
sequence
H3K4me1
SO:0001705
H3K4_monomethylation_site
A kind of histone modification, whereby the 4th residue (a lysine), from the start of the H3 protein is mono-methylated.
http://en.wikipedia.org/wiki/Histone
A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H3 protein is tri-methylated.
kareneilbeck
2010-03-31T10:29:12Z
H3K4 tri-methylation
sequence
H3K4me3
SO:0001706
H3K4_trimethylation
A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H3 protein is tri-methylated.
http://en.wikipedia.org/wiki/Histone
A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein is tri-methylated.
kareneilbeck
2010-03-31T10:30:34Z
H3K9 tri-methylation site
sequence
H3K9Me3
SO:0001707
H3K9_trimethylation_site
A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein is tri-methylated.
http://en.wikipedia.org/wiki/Histone
A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is mono-methylated.
kareneilbeck
2010-03-31T10:31:54Z
H3K27 mono-methylation site
sequence
H3K27Me1
SO:0001708
H3K27_monomethylation_site
A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is mono-methylated.
http://en.wikipedia.org/wiki/Histone
A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is tri-methylated.
kareneilbeck
2010-03-31T10:32:41Z
H3K27 tri-methylation site
sequence
H3K27Me3
SO:0001709
H3K27_trimethylation_site
A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is tri-methylated.
http://en.wikipedia.org/wiki/Histone
A kind of histone modification site, whereby the 79th residue (a lysine), from the start of the H3 histone protein is mono-methylated.
kareneilbeck
2010-03-31T10:33:42Z
H3K79 mono-methylation site
sequence
H3K79me1
SO:0001710
H3K79_monomethylation_site
A kind of histone modification site, whereby the 79th residue (a lysine), from the start of the H3 histone protein is mono-methylated.
http://en.wikipedia.org/wiki/Histone
A kind of histone modification site, whereby the 79th residue (a lysine), from the start of the H3 histone protein is di-methylated.
kareneilbeck
2010-03-31T10:34:39Z
H3K79 di-methylation site
sequence
H3K79Me2
SO:0001711
H3K79_dimethylation_site
A kind of histone modification site, whereby the 79th residue (a lysine), from the start of the H3 histone protein is di-methylated.
http://en.wikipedia.org/wiki/Histone
A kind of histone modification site, whereby the 79th residue (a lysine), from the start of the H3 histone protein is tri-methylated.
kareneilbeck
2010-03-31T10:35:30Z
H3K79 tri-methylation site
sequence
H3K79Me3
SO:0001712
H3K79_trimethylation_site
A kind of histone modification site, whereby the 79th residue (a lysine), from the start of the H3 histone protein is tri-methylated.
http://en.wikipedia.org/wiki/Histone
A kind of histone modification site, whereby the 20th residue (a lysine), from the start of the H4 histone protein is mono-methylated.
kareneilbeck
2010-03-31T10:36:43Z
H4K20 mono-methylation site
sequence
H4K20Me1
SO:0001713
H4K20_monomethylation_site
A kind of histone modification site, whereby the 20th residue (a lysine), from the start of the H4 histone protein is mono-methylated.
http://en.wikipedia.org/wiki/Histone
A kind of histone modification site, whereby the 5th residue (a lysine), from the start of the H2B histone protein is mono-methylated.
kareneilbeck
2010-03-31T10:38:12Z
H2BK5 mono-methylation site
sequence
SO:0001714
H2BK5_monomethylation_site
A kind of histone modification site, whereby the 5th residue (a lysine), from the start of the H2B histone protein is mono-methylated.
http://en.wikipedia.org/wiki/Histone
An ISRE is a transcriptional cis regulatory region, containing the consensus region: YAGTTTC(A/T)YTTTYCC, responsible for increased transcription via interferon binding.
kareneilbeck
2010-04-05T11:15:08Z
interferon stimulated response element
sequence
SO:0001715
Term requested via tracker (2981725) by Alan Ruttenberg, April 2010. It has been described as both an enhancer and a promoter, so the parent is the more general term. Moved from is_a SO:0001055 transcriptional_cis_regulatory_region to SO:0000235 TF_binding_site after Colin Logie pointed out that this is a consensus sequence where transcription factors bind, GREEKC Jan 21, 2021.
ISRE
An ISRE is a transcriptional cis regulatory region, containing the consensus region: YAGTTTC(A/T)YTTTYCC, responsible for increased transcription via interferon binding.
http://genesdev.cshlp.org/content/2/4/383.abstrac
A histone modification site where ubiquitin may be added.
kareneilbeck
2010-04-13T10:12:18Z
sequence
histone ubiquitination site
SO:0001716
histone_ubiqitination_site
A histone modification site where ubiquitin may be added.
SO:ke
A histone modification site on H2B where ubiquitin may be added.
kareneilbeck
2010-04-13T10:13:28Z
sequence
H2BUbiq
SO:0001717
H2B_ubiquitination_site
A histone modification site on H2B where ubiquitin may be added.
SO:ke
A kind of histone modification site, whereby the 18th residue (a lysine), from the start of the H3 histone protein is acetylated.
kareneilbeck
2010-04-13T10:39:35Z
H3K18 acetylation site
H3K18ac
sequence
SO:0001718
H3K18_acetylation_site
A kind of histone modification site, whereby the 18th residue (a lysine), from the start of the H3 histone protein is acetylated.
SO:ke
A kind of histone modification, whereby the 23rd residue (a lysine), from the start of the H3 histone protein is acetylated.
kareneilbeck
2010-04-13T10:42:45Z
H3K23 acetylation site
H3K23ac
sequence
SO:0001719
H3K23_acetylation_site
A kind of histone modification, whereby the 23rd residue (a lysine), from the start of the H3 histone protein is acetylated.
SO:ke
A biological DNA region implicated in epigenomic changes caused by mechanisms other than changes in the underlying DNA sequence. This includes nucleosomal histone post-translational modifications, nucleosome depletion to render DNA accessible, and post-replicational base modifications such as cytosine modification.
kareneilbeck
2010-03-27T12:02:29Z
sequence
epigenetically modified region
SO:0001720
Moved from is_a biological_region (SO:0001411) to is_a regulatory_region (SO:0005836) on 11 Feb 2021. GREEKC members pointed out that this would be a more appropriate location. See GitHub Issue #530. 11 Feb 2021 updated definition along with addition of epigenomically_modified_region (SO:0002332). Epigenetically modified region is now not inherited while epigenomically modified region is not annotated as inherited. See GitHub Issue #532 and issue #534.
epigenetically_modified_region
A biological DNA region implicated in epigenomic changes caused by mechanisms other than changes in the underlying DNA sequence. This includes nucleosomal histone post-translational modifications, nucleosome depletion to render DNA accessible, and post-replicational base modifications such as cytosine modification.
SO:ke
http://en.wikipedia.org/wiki/Epigenetics
A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is acylated.
kareneilbeck
2010-04-13T10:44:09Z
H3K27 acylation site
sequence
H3K27Ac
SO:0001721
H3K27_acylation_site
true
A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is acylated.
SO:ke
A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is mono-methylated.
kareneilbeck
2010-04-13T10:46:32Z
H3K36 mono-methylation site
sequence
H3K36Me1
SO:0001722
H3K36_monomethylation_site
A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is mono-methylated.
SO:ke
A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is dimethylated.
kareneilbeck
2010-04-13T10:59:35Z
H3K36 di-methylation site
sequence
H3K36Me2
SO:0001723
H3K36_dimethylation_site
A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is dimethylated.
SO:ke
A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is tri-methylated.
kareneilbeck
2010-04-13T11:01:58Z
H3K36 tri-methylation site
sequence
H3K36Me3
SO:0001724
H3K36_trimethylation_site
A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is tri-methylated.
SO:ke
A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H3 histone protein is di-methylated.
kareneilbeck
2010-04-13T11:03:15Z
H3K4 di-methylation site
sequence
H3K4Me2
SO:0001725
H3K4_dimethylation_site
A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H3 histone protein is di-methylated.
SO:ke
A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is di-methylated.
kareneilbeck
2010-04-13T01:45:41Z
H3K27 di-methylation site
sequence
H3K27Me2
SO:0001726
H3K27_dimethylation_site
A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is di-methylated.
SO:ke
A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein is mono-methylated.
kareneilbeck
2010-04-13T11:06:17Z
H3K9 mono-methylation site
sequence
H3K9Me1
SO:0001727
H3K9_monomethylation_site
A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein is mono-methylated.
SO:ke
A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein may be dimethylated.
kareneilbeck
2010-04-13T11:08:19Z
H3K9 di-methylation site
sequence
H3K9Me2
SO:0001728
H3K9_dimethylation_site
A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein may be dimethylated.
SO:ke
A kind of histone modification site, whereby the 16th residue (a lysine), from the start of the H4 histone protein is acetylated.
kareneilbeck
2010-04-13T11:09:41Z
H4K16 acetylation site
H4K16ac
sequence
SO:0001729
H4K16_acetylation_site
A kind of histone modification site, whereby the 16th residue (a lysine), from the start of the H4 histone protein is acetylated.
SO:ke
A kind of histone modification site, whereby the 5th residue (a lysine), from the start of the H4 histone protein is acetylated.
kareneilbeck
2010-04-13T11:13:00Z
H4K5 acetylation site
H4K5ac
sequence
SO:0001730
H4K5_acetylation_site
A kind of histone modification site, whereby the 5th residue (a lysine), from the start of the H4 histone protein is acetylated.
SO:ke
A kind of histone modification site, whereby the 8th residue (a lysine), from the start of the H4 histone protein is acetylated.
kareneilbeck
2010-04-13T11:14:24Z
H4K8 acetylation site
H4K8ac
sequence
SO:0001731
H4K8_acetylation_site
A kind of histone modification site, whereby the 8th residue (a lysine), from the start of the H4 histone protein is acetylated.
SO:KE
A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is methylated.
kareneilbeck
2010-04-13T11:26:22Z
H3K27 methylation site
sequence
SO:0001732
H3K27_methylation_site
A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is methylated.
SO:ke
A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is methylated.
kareneilbeck
2010-04-13T11:27:28Z
H3K36 methylation site
sequence
SO:0001733
H3K36_methylation_site
A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is methylated.
SO:ke
A kind of histone modification, whereby the 4th residue (a lysine), from the start of the H3 protein is methylated.
kareneilbeck
2010-04-13T11:28:14Z
H3K4 methylation site
sequence
SO:0001734
H3K4_methylation_site
A kind of histone modification, whereby the 4th residue (a lysine), from the start of the H3 protein is methylated.
SO:ke
A kind of histone modification site, whereby the 79th residue (a lysine), from the start of the H3 histone protein is methylated.
kareneilbeck
2010-04-13T11:29:16Z
H3K79 methylation site
sequence
SO:0001735
H3K79_methylation_site
A kind of histone modification site, whereby the 79th residue (a lysine), from the start of the H3 histone protein is methylated.
SO:ke
A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein is methylated.
kareneilbeck
2010-04-13T11:31:37Z
H3K9 methylation site
sequence
SO:0001736
H3K9_methylation_site
A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein is methylated.
SO:ke
A histone modification, whereby the histone protein is acylated at multiple sites in a region.
kareneilbeck
2010-04-13T01:58:21Z
sequence
histone acylation region
SO:0001737
histone_acylation_region
A histone modification, whereby the histone protein is acylated at multiple sites in a region.
SO:ke
A region of the H4 histone whereby multiple lysines are acylated.
kareneilbeck
2010-04-13T02:00:06Z
H4K acylation region
sequence
H4KAc
SO:0001738
H4K_acylation_region
A region of the H4 histone whereby multiple lysines are acylated.
SO:ke
A gene with a start codon other than AUG.
kareneilbeck
2011-01-10T01:30:31Z
gene with non canonical start codon
sequence
SO:0001739
Requested by flybase, Dec 2010.
gene_with_non_canonical_start_codon
A gene with a start codon other than AUG.
SO:xp
A gene with a translational start codon of CUG.
kareneilbeck
2011-01-10T01:32:35Z
gene with start codon CUG
sequence
SO:0001740
Requested by flybase, Dec 2010.
gene_with_start_codon_CUG
A gene with a translational start codon of CUG.
SO:mc
A gene segment which when incorporated by somatic recombination in the final gene transcript results in a nonfunctional product.
batchelorc
2011-02-15T05:07:52Z
pseudogenic gene segment
sequence
SO:0001741
pseudogenic_gene_segment
A gene segment which when incorporated by somatic recombination in the final gene transcript results in a nonfunctional product.
SO:hd
A sequence alteration whereby the copy number of a given region is greater than the reference sequence.
kareneilbeck
2011-02-28T01:54:09Z
copy number gain
sequence
gain
SO:0001742
copy_number_gain
A sequence alteration whereby the copy number of a given region is greater than the reference sequence.
SO:ke
gain
http://www.ncbi.nlm.nih.gov/dbvar/
A sequence alteration whereby the copy number of a given region is less than the reference sequence.
kareneilbeck
2011-02-28T01:55:02Z
copy number loss
sequence
loss
SO:0001743
copy_number_loss
A sequence alteration whereby the copy number of a given region is less than the reference sequence.
SO:ke
loss
http://www.ncbi.nlm.nih.gov/dbvar/
Uniparental disomy is a sequence_alteration where a diploid individual receives two copies for all or part of a chromosome from one parent and no copies of the same chromosome or region from the other parent.
kareneilbeck
2011-02-28T02:01:05Z
http://en.wikipedia.org/wiki/Uniparental_disomy
UPD
uniparental disomy
sequence
SO:0001744
UPD
Uniparental disomy is a sequence_alteration where a diploid individual receives two copies for all or part of a chromosome from one parent and no copies of the same chromosome or region from the other parent.
SO:BM
http://en.wikipedia.org/wiki/Uniparental_disomy
wikipedia
UPD
http://www.ncbi.nlm.nih.gov/dbvar/
Maternal uniparental disomy is a sequence_alteration where a diploid individual receives two copies for all or part of a chromosome from the mother and no copies of the same chromosome or region from the father.
kareneilbeck
2011-02-28T02:03:01Z
maternal uniparental disomy
sequence
SO:0001745
maternal_uniparental_disomy
Maternal uniparental disomy is a sequence_alteration where a diploid individual receives two copies for all or part of a chromosome from the mother and no copies of the same chromosome or region from the father.
SO:bm
Paternal uniparental disomy is a sequence_alteration where a diploid individual receives two copies for all or part of a chromosome from the father and no copies of the same chromosome or region from the mother.
kareneilbeck
2011-02-28T02:03:30Z
paternal uniparental disomy
sequence
SO:0001746
paternal_uniparental_disomy
Paternal uniparental disomy is a sequence_alteration where a diploid individual receives two copies for all or part of a chromosome from the father and no copies of the same chromosome or region from the mother.
SO:bm
A DNA sequence that in the normal state of the chromosome corresponds to an unfolded, un-complexed stretch of double-stranded DNA.
kareneilbeck
2011-02-28T02:21:52Z
open chromatin region
sequence
SO:0001747
Requested by John Calley 3125900.
open_chromatin_region
A DNA sequence that in the normal state of the chromosome corresponds to an unfolded, un-complexed stretch of double-stranded DNA.
SO:cb
An SL2_acceptor_site which appends the SL3 RNA leader sequence to the 5' end of an mRNA. SL3 acceptor sites occur in genes in internal segments of polycistronic transcripts.
kareneilbeck
2011-02-28T02:58:40Z
SL3 acceptor site
sequence
SO:0001748
SL3_acceptor_site
An SL2_acceptor_site which appends the SL3 RNA leader sequence to the 5' end of an mRNA. SL3 acceptor sites occur in genes in internal segments of polycistronic transcripts.
SO:nlw
An SL2_acceptor_site which appends the SL4 RNA leader sequence to the 5' end of an mRNA. SL4 acceptor sites occur in genes in internal segments of polycistronic transcripts.
kareneilbeck
2011-02-28T03:08:47Z
SL4 acceptor site
sequence
SO:0001749
SL4_acceptor_site
An SL2_acceptor_site which appends the SL4 RNA leader sequence to the 5' end of an mRNA. SL4 acceptor sites occur in genes in internal segments of polycistronic transcripts.
SO:nlw
An SL2_acceptor_site which appends the SL5 RNA leader sequence to the 5' end of an mRNA. SL5 acceptor sites occur in genes in internal segments of polycistronic transcripts.
kareneilbeck
2011-02-28T03:09:36Z
SL5 acceptor site
sequence
SO:0001750
SL5_acceptor_site
An SL2_acceptor_site which appends the SL5 RNA leader sequence to the 5' end of an mRNA. SL5 acceptor sites occur in genes in internal segments of polycistronic transcripts.
SO:nlw
An SL2_acceptor_site which appends the SL6 RNA leader sequence to the 5' end of an mRNA. SL6 acceptor sites occur in genes in internal segments of polycistronic transcripts.
kareneilbeck
2011-02-28T03:10:14Z
SL6 acceptor site
sequence
SO:0001751
SL6_acceptor_site
An SL2_acceptor_site which appends the SL6 RNA leader sequence to the 5' end of an mRNA. SL6 acceptor sites occur in genes in internal segments of polycistronic transcripts.
SO:nlw
An SL2_acceptor_site which appends the SL7 RNA leader sequence to the 5' end of an mRNA. SL7 acceptor sites occur in genes in internal segments of polycistronic transcripts.
kareneilbeck
2011-02-28T03:13:20Z
SL7 acceptor site
sequence
SO:0001752
SL7_acceptor_site
An SL2_acceptor_site which appends the SL7 RNA leader sequence to the 5' end of an mRNA. SL7 acceptor sites occur in genes in internal segments of polycistronic transcripts.
SO:nlw
An SL2_acceptor_site which appends the SL8 RNA leader sequence to the 5' end of an mRNA. SL8 acceptor sites occur in genes in internal segments of polycistronic transcripts.
kareneilbeck
2011-02-28T03:15:26Z
SL8 acceptor site
sequence
SO:0001753
SL8_acceptor_site
An SL2_acceptor_site which appends the SL8 RNA leader sequence to the 5' end of an mRNA. SL8 acceptor sites occur in genes in internal segments of polycistronic transcripts.
SO:nlw
A SL2_acceptor_site which appends the SL9 RNA leader sequence to the 5' end of an mRNA. SL9 acceptor sites occur in genes in internal segments of polycistronic transcripts.
kareneilbeck
2011-02-28T03:15:57Z
SL9 acceptor site
sequence
SO:0001754
SL9_acceptor_site
A SL2_acceptor_site which appends the SL9 RNA leader sequence to the 5' end of an mRNA. SL9 acceptor sites occur in genes in internal segments of polycistronic transcripts.
SO:nlw
A SL2_acceptor_site which appends the SL10 RNA leader sequence to the 5' end of an mRNA. SL10 acceptor sites occur in genes in internal segments of polycistronic transcripts.
kareneilbeck
2011-02-28T03:16:31Z
SL10 acceptor site
sequence
SO:0001755
SL10_acceptor_site
A SL2_acceptor_site which appends the SL10 RNA leader sequence to the 5' end of an mRNA. SL10 acceptor sites occur in genes in internal segments of polycistronic transcripts.
SO:nlw
A SL2_acceptor_site which appends the SL11 RNA leader sequence to the 5' end of an mRNA. SL11 acceptor sites occur in genes in internal segments of polycistronic transcripts.
kareneilbeck
2011-02-28T03:16:54Z
SL11 acceptor site
sequence
SO:0001756
SL11_acceptor_site
A SL2_acceptor_site which appends the SL11 RNA leader sequence to the 5' end of an mRNA. SL11 acceptor sites occur in genes in internal segments of polycistronic transcripts.
SO:nlw
A SL2_acceptor_site which appends the SL12 RNA leader sequence to the 5' end of an mRNA. SL12 acceptor sites occur in genes in internal segments of polycistronic transcripts.
kareneilbeck
2011-02-28T03:17:23Z
SL12 acceptor site
sequence
SO:0001757
SL12_acceptor_site
A SL2_acceptor_site which appends the SL12 RNA leader sequence to the 5' end of an mRNA. SL12 acceptor sites occur in genes in internal segments of polycistronic transcripts.
SO:nlw
A pseudogene that arose via gene duplication. Generally duplicated pseudogenes have the same structure as the original gene, including intron-exon structure and some regulatory sequence.
kareneilbeck
2011-03-09T09:58:04Z
sequence
duplicated pseudogene
SO:0001758
duplicated_pseudogene
A pseudogene that arose via gene duplication. Generally duplicated pseudogenes have the same structure as the original gene, including intron-exon structure and some regulatory sequence.
http://en.wikipedia.org/wiki/Pseudogene
A pseudogene, deactivated from original state by mutation, fixed in a population, where the ortholog in a reference species such as mouse remains functional.
kareneilbeck
2011-03-09T10:04:04Z
INSDC_feature:gene
INSDC_qualifier:unitary
sequence
disabled gene
unitary pseudogene
SO:0001759
This is different from a non processed pseudogene because the gene was not duplicated. An example is the L-gulono-lactone oxidase pseudogene in primates.
unitary_pseudogene
A pseudogene, deactivated from original state by mutation, fixed in a population, where the ortholog in a reference species such as mouse remains functional.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
SO:ke
http://en.wikipedia.org/wiki/Pseudogene
A pseudogene that arose from a means other than retrotransposition. A pseudogene created via genomic duplication of a functional protein-coding parent gene followed by accumulation of deleterious mutations.
kareneilbeck
2011-03-09T10:54:47Z
INSDC_feature:gene
INSDC_qualifier:unprocessed
unprocessed pseudogene
unprocessed_pseudogene
sequence
non processed pseudogene
SO:0001760
non_processed_pseudogene
A pseudogene that arose from a means other than retrotransposition. A pseudogene created via genomic duplication of a functional protein-coding parent gene followed by accumulation of deleterious mutations.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
SO:ke
A dependent entity that inheres in a bearer, a sequence variant.
kareneilbeck
2011-03-15T03:40:35Z
variant quality
sequence
SO:0001761
variant_quality
A dependent entity that inheres in a bearer, a sequence variant.
PMID:17597783
SO:ke
A quality inhering in a variant by virtue of its origin.
kareneilbeck
2011-03-15T03:42:13Z
variant origin
sequence
SO:0001762
variant_origin
A quality inhering in a variant by virtue of its origin.
PMID:17597783
SO:ke
A physical quality which inheres to the variant by virtue of the number of instances of the variant within a population.
kareneilbeck
2011-03-15T03:44:39Z
variant frequency
sequence
SO:0001763
variant_frequency
A physical quality which inheres to the variant by virtue of the number of instances of the variant within a population.
PMID:17597783
SO:ke
A physical quality which inheres to the variant by virtue of the number of instances of the variant within a population.
kareneilbeck
2011-03-15T03:47:20Z
unique variant
sequence
SO:0001764
unique_variant
A physical quality which inheres to the variant by virtue of the number of instances of the variant within a population.
SO:ke
When a variant from the genomic sequence is rarely found in the general population. The threshold for 'rare' varies between studies.
kareneilbeck
2011-03-15T03:48:29Z
rare variant
sequence
SO:0001765
rare_variant
A variant that affects one of several possible alleles at that location, such as the major histocompatibility complex (MHC) genes.
kareneilbeck
2011-03-15T03:48:51Z
polymorphic variant
sequence
SO:0001766
polymorphic_variant
When a variant from the genomic sequence is commonly found in the general population.
kareneilbeck
2011-03-15T03:50:36Z
common variant
sequence
SO:0001767
common_variant
When a variant has become fixed in the population so that it is now the only variant.
kareneilbeck
2011-03-15T03:50:53Z
fixed variant
sequence
SO:0001768
fixed_variant
A quality inhering in a variant by virtue of its phenotype.
kareneilbeck
2011-03-15T03:53:15Z
variant phenotype
sequence
SO:0001769
variant_phenotype
A quality inhering in a variant by virtue of its phenotype.
PMID:17597783
SO:ke
A variant that does not affect the function of the gene or cause disease.
kareneilbeck
2011-03-15T03:55:40Z
benign variant
sequence
SO:0001770
benign_variant
A variant that has been found to be associated with disease.
kareneilbeck
2011-03-15T04:05:16Z
disease associated variant
sequence
SO:0001771
disease_associated_variant
A variant that has been found to cause disease.
kareneilbeck
2011-03-15T04:05:46Z
disease causing variant
sequence
SO:0001772
disease_causing_variant
A sequence variant where the mutated gene product does not allow for one or more basic functions necessary for survival.
kareneilbeck
2011-03-15T04:06:22Z
lethal variant
sequence
SO:0001773
lethal_variant
A variant within a gene that contributes to a quantitative trait such as height or weight.
kareneilbeck
2011-03-15T04:28:13Z
quantitative variant
sequence
SO:0001774
quantitative_variant
A variant in the genetic material inherited from the mother.
kareneilbeck
2011-03-15T04:30:23Z
maternal variant
sequence
SO:0001775
maternal_variant
A variant in the genetic material inherited from the father.
kareneilbeck
2011-03-15T04:30:47Z
paternal variant
sequence
SO:0001776
paternal_variant
A variant that has arisen after splitting of the embryo, resulting in the variant being found in only some of the tissues or cells of the body.
kareneilbeck
2011-03-15T04:31:12Z
somatic variant
sequence
SO:0001777
somatic_variant
A variant present in the embryo that is carried by every cell in the body.
kareneilbeck
2011-03-15T04:31:46Z
germline variant
sequence
SO:0001778
germline_variant
A variant that is found only in individuals that belong to the same pedigree.
kareneilbeck
2011-03-15T04:32:18Z
pedigree specific variant
sequence
SO:0001779
pedigree_specific_variant
A variant found only within specific populations.
kareneilbeck
2011-03-15T04:33:05Z
population specific variant
sequence
SO:0001780
population_specific_variant
A variant arising in the offspring that is not found in either of the parents.
kareneilbeck
2011-03-15T04:33:34Z
de novo variant
sequence
SO:0001781
de_novo_variant
A sequence variant located within a transcription factor binding site.
kareneilbeck
2011-03-17T10:59:20Z
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
Jannovar:tf_binding_site_variant
TF binding site variant
VEP:TF_binding_site_variant
sequence
SO:0001782
TF_binding_site_variant
A sequence variant located within a transcription factor binding site.
EBI:fc
Jannovar:tf_binding_site_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VEP:TF_binding_site_variant
true
A structural sequence alteration or rearrangement encompassing one or more genome fragments, with 4 or more breakpoints.
kareneilbeck
2011-03-23T03:21:19Z
SO:1000146
complex chromosomal mutation
complex_chromosomal_mutation
sequence
complex
SO:0001784
complex_structural_alteration
A structural sequence alteration or rearrangement encompassing one or more genome fragments, with 4 or more breakpoints.
FB:reference_manual
NCBI:th
SO:ke
complex
http://www.ncbi.nlm.nih.gov/dbvar/
An alteration of the genome that leads to a change in the structure of one or more chromosomes.
kareneilbeck
2011-03-25T02:27:41Z
structural alteration
sequence
SO:0001785
structural_alteration
A functional variant whereby the sequence alteration causes a loss of function of one allele of a gene.
kareneilbeck
2011-03-25T02:32:58Z
LOH
loss of heterozygosity
sequence
SO:0001786
loss_of_heterozygosity
A functional variant whereby the sequence alteration causes a loss of function of one allele of a gene.
SO:ke
A sequence variant that causes a change at the 5th base pair after the start of the intron in the orientation of the transcript.
kareneilbeck
2011-04-05T04:16:28Z
splice donor 5th base variant
sequence
SO:0001787
splice_donor_5th_base_variant
A sequence variant that causes a change at the 5th base pair after the start of the intron in the orientation of the transcript.
EBI:gr
A U-box is a conserved T-rich region upstream of a retroviral polypurine tract that is involved in PPT primer creation during reverse transcription.
kareneilbeck
2011-04-08T10:39:14Z
U-box
sequence
SO:0001788
U_box
A U-box is a conserved T-rich region upstream of a retroviral polypurine tract that is involved in PPT primer creation during reverse transcription.
PMID:10556309
PMID:11577982
PMID:9649446
A specialized region in the genomes of some yeast and fungi, the genes of which regulate mating type.
kareneilbeck
2011-04-08T11:14:07Z
http://en.wikipedia.org/wiki/Mating-type_region
mating type region
sequence
SO:0001789
mating_type_region
A specialized region in the genomes of some yeast and fungi, the genes of which regulate mating type.
SO:ke
An assembly region that has been sequenced from both ends resulting in a read_pair (mate_pair).
kareneilbeck
2011-04-14T01:48:20Z
paired end fragment
sequence
SO:0001790
paired_end_fragment
An assembly region that has been sequenced from both ends resulting in a read_pair (mate_pair).
SO:ke
A sequence variant that changes exon sequence.
kareneilbeck
2011-05-06T01:51:17Z
http://snpeff.sourceforge.net/SnpEff_manual.html
ANNOVAR:exonic
Jannovar:exon_variant
VAAST:exon_variant
exon variant
snpEff:EXON
sequence
SO:0001791
exon_variant
A sequence variant that changes exon sequence.
SO:ke
ANNOVAR:exonic
http://www.openbioinformatics.org/annovar/annovar_download.html
Jannovar:exon_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VAAST:exon_variant
snpEff:EXON
A sequence variant that changes non-coding exon sequence in a non-coding transcript.
kareneilbeck
2011-05-06T01:51:59Z
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq:non-coding-exon
VEP:non_coding_transcript_exon_variant
non coding transcript exon variant
non_coding_transcript_exon_variant
snpEff:non_coding_exon_variant
ANNOVAR:ncRNA_exonic
Jannovar:non_coding_transcript_exon_variant
sequence
Seattleseq:non-coding-exon-near-splice
SO:0001792
non_coding_transcript_exon_variant
A sequence variant that changes non-coding exon sequence in a non-coding transcript.
EBI:fc
SO:ke
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
Seattleseq:non-coding-exon
VEP:non_coding_transcript_exon_variant
non_coding_transcript_exon_variant
snpEff:non_coding_exon_variant
ANNOVAR:ncRNA_exonic
Jannovar:non_coding_transcript_exon_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
Seattleseq:non-coding-exon-near-splice
A read from an end of the clone sequence.
kareneilbeck
2011-05-13T11:32:27Z
clone end
sequence
SO:0001793
clone_end
A read from an end of the clone sequence.
SO:ke
A point centromere is a relatively small centromere (about 125 bp DNA) in discrete sequence, found in some yeast including S. cerevisiae.
kareneilbeck
2011-05-31T12:42:35Z
point centromere
sequence
SO:0001794
point_centromere
A point centromere is a relatively small centromere (about 125 bp DNA) in discrete sequence, found in some yeast including S. cerevisiae.
PMID:7502067
SO:vw
A regional centromere is a large modular centromere found in fission yeast and higher eukaryotes. It consists of a central core region flanked by inverted inner and outer repeat regions.
kareneilbeck
2011-05-31T12:43:07Z
regional centromere
sequence
SO:0001795
regional_centromere
A regional centromere is a large modular centromere found in fission yeast and higher eukaryotes. It consists of a central core region flanked by inverted inner and outer repeat regions.
PMID:7502067
SO:vw
A conserved region within the central region of a modular centromere, where the kinetochore is formed.
kareneilbeck
2011-05-31T12:56:30Z
regional centromere central core
sequence
SO:0001796
regional_centromere_central_core
A conserved region within the central region of a modular centromere, where the kinetochore is formed.
SO:vw
A repeat region found within the modular centromere.
kareneilbeck
2011-05-31T12:59:27Z
INSDC_feature:repeat_region
INSDC_qualifier:centromeric_repeat
centromeric repeat
sequence
SO:0001797
centromeric_repeat
A repeat region found within the modular centromere.
SO:ke
The inner inverted repeat region of a modular centromere and part of the central core surrounding a non-conserved central region. This region is adjacent to the central core, on each chromosome arm.
kareneilbeck
2011-05-31T01:01:08Z
lmr repeat
lmr1L
lmr1R
regional centromere inner repeat region
sequence
SO:0001798
regional_centromere_inner_repeat_region
The inner inverted repeat region of a modular centromere and part of the central core surrounding a non-conserved central region. This region is adjacent to the central core, on each chromosome arm.
SO:vw
The heterochromatic outer repeat region of a modular centromere. These repeats exist in tandem arrays on both chromosome arms.
kareneilbeck
2011-05-31T01:03:23Z
regional centromere outer repeat region
sequence
SO:0001799
regional_centromere_outer_repeat_region
The heterochromatic outer repeat region of a modular centromere. These repeats exist in tandem arrays on both chromosome arms.
SO:vw
The sequence of a 21 nucleotide double stranded, polyadenylated non coding RNA, transcribed from the TAS gene.
kareneilbeck
2011-05-31T03:24:06Z
sequence
trans acting small interfering RNA
SO:0001800
tasiRNA
The sequence of a 21 nucleotide double stranded, polyadenylated non coding RNA, transcribed from the TAS gene.
PMID:16145017
A primary transcript encoding a tasiRNA.
kareneilbeck
2011-05-31T03:27:35Z
tasiRNA primary transcript
sequence
SO:0001801
tasiRNA_primary_transcript
A primary transcript encoding a tasiRNA.
PMID:16145017
A transcript processing variant whereby polyadenylation of the encoded transcript is increased with respect to the reference.
kareneilbeck
2011-06-01T10:53:12Z
increased polyadenylation variant
sequence
SO:0001802
Term requested by M. Dumontier, June 1 2011.
increased_polyadenylation_variant
A transcript processing variant whereby polyadenylation of the encoded transcript is increased with respect to the reference.
SO:ke
A transcript processing variant whereby polyadenylation of the encoded transcript is decreased with respect to the reference.
kareneilbeck
2011-06-01T10:53:40Z
decreased polyadenylation variant
sequence
SO:0001803
Term requested by M. Dumontier, June 1 2011.
decreased_polyadenylation_variant
A transcript processing variant whereby polyadenylation of the encoded transcript is decreased with respect to the reference.
SO:ke
A conserved polypeptide motif that mediates protein-protein interaction and defines adaptor proteins for DDB1/cullin 4 ubiquitin ligases.
kareneilbeck
2011-06-17T12:10:44Z
DDB box
DDB-box
sequence
SO:0001804
Note: PMID:18794354 describes the DDB box, and has lots of alignments, but doesn't actually come out with a consensus sequence.
DDB_box
A conserved polypeptide motif that mediates protein-protein interaction and defines adaptor proteins for DDB1/cullin 4 ubiquitin ligases.
PMID:18794354
PMID:19818632
A conserved polypeptide motif that can be recognized by both Fizzy/Cdc20- and FZR/Cdh1-activated anaphase-promoting complex/cyclosome (APC/C) and targets a protein for ubiquitination and subsequent degradation by the APC/C. The consensus sequence is RXXLXXXXN.
kareneilbeck
2011-06-17T12:16:02Z
D-box
destruction box
sequence
SO:0001805
destruction_box
A conserved polypeptide motif that can be recognized by both Fizzy/Cdc20- and FZR/Cdh1-activated anaphase-promoting complex/cyclosome (APC/C) and targets a protein for ubiquitination and subsequent degradation by the APC/C. The consensus sequence is RXXLXXXXN.
PMID:12208841
PMID:1842691
A C-terminal tetrapeptide motif that mediates retention of a protein in (or retrieval to) the endoplasmic reticulum. In mammals the sequence is KDEL, and in fungi HDEL or DDEL.
kareneilbeck
2011-06-17T12:19:49Z
ER retention signal
endoplasmic reticulum retention signal
sequence
SO:0001806
ER_retention_signal
A C-terminal tetrapeptide motif that mediates retention of a protein in (or retrieval to) the endoplasmic reticulum. In mammals the sequence is KDEL, and in fungi HDEL or DDEL.
PMID:2077689
doi:10.1093/jxb/50.331.157
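The C-terminal test described in the ER_retention_signal definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the constant are hypothetical, the input is taken to be a one-letter amino acid string with no trailing stop symbol, and only the three tetrapeptides named in the definition are checked.

```python
# Tetrapeptides named in the definition above (KDEL in mammals, HDEL or DDEL in fungi).
ER_RETENTION_MOTIFS = ("KDEL", "HDEL", "DDEL")


def has_er_retention_signal(protein: str) -> bool:
    """Return True if the sequence ends in one of the listed tetrapeptides.

    Hypothetical helper: it only inspects the last four residues and does
    not consider organism or any other sequence context.
    """
    return protein.upper().endswith(ER_RETENTION_MOTIFS)
```

A sequence such as MSTKDEL would pass the check, while KDELMST would not, because the motif must sit at the C-terminus.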
A conserved polypeptide motif that can be recognized by FZR/Cdh1-activated anaphase-promoting complex/cyclosome (APC/C) and targets a protein for ubiquitination and subsequent degradation by the APC/C. The consensus sequence is KENXXXN.
kareneilbeck
2011-06-17T12:24:14Z
KEN box
sequence
SO:0001807
KEN_box
A conserved polypeptide motif that can be recognized by FZR/Cdh1-activated anaphase-promoting complex/cyclosome (APC/C) and targets a protein for ubiquitination and subsequent degradation by the APC/C. The consensus sequence is KENXXXN.
PMID:10733526
PMID:1220884
PMID:18426916
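The consensus sequences given for destruction_box (RXXLXXXXN) and KEN_box (KENXXXN) can be turned into simple pattern searches. The sketch below assumes that X stands for any residue and that proteins are given as one-letter strings; the names D_BOX, KEN_BOX and find_motifs are hypothetical. A match to the consensus is only a candidate site, not evidence of APC/C recognition.

```python
import re

# Consensus patterns copied from the definitions above; X is read as "any residue".
D_BOX = re.compile(r"R..L....N")
KEN_BOX = re.compile(r"KEN...N")


def find_motifs(protein: str, pattern: re.Pattern) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for every match, including overlapping ones."""
    # A capturing group inside a lookahead lets the scan advance one residue
    # at a time, so overlapping candidate sites are not skipped.
    lookahead = re.compile(f"(?=({pattern.pattern}))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(protein)]
```

For example, find_motifs("MARTALGDVSNKENAAAN", D_BOX) reports the candidate RTALGDVSN starting at offset 2.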
A polypeptide region that targets a polypeptide to the mitochondrion.
kareneilbeck
2011-06-17T12:26:35Z
MTS
mitochondrial signal sequence
mitochondrial targeting signal
sequence
SO:0001808
mitochondrial_targeting_signal
A polypeptide region that targets a polypeptide to the mitochondrion.
PomBase:mah
A signal sequence that is not cleaved from the polypeptide. Anchors a Type II membrane protein to the membrane.
kareneilbeck
2011-06-17T12:28:53Z
signal anchor
uncleaved signal peptide
sequence
SO:0001809
signal_anchor
A signal sequence that is not cleaved from the polypeptide. Anchors a Type II membrane protein to the membrane.
http://www.cbs.dtu.dk/services/SignalP/background/biobackground.php
A polypeptide region that mediates binding to PCNA. The consensus sequence is QXX(hh)XX(aa), where (h) denotes residues with moderately hydrophobic side chains and (a) denotes residues with highly hydrophobic aromatic side chains.
kareneilbeck
2011-06-17T12:33:25Z
PIP box
sequence
SO:0001810
PIP_box
A polypeptide region that mediates binding to PCNA. The consensus sequence is QXX(hh)XX(aa), where (h) denotes residues with moderately hydrophobic side chains and (a) denotes residues with highly hydrophobic aromatic side chains.
PMID:9631646
A post-translationally modified region in which residues of the protein are modified by phosphorylation.
kareneilbeck
2011-06-17T12:36:20Z
phosphorylation site
sequence
SO:0001811
phosphorylation_site
A post-translationally modified region in which residues of the protein are modified by phosphorylation.
PomBase:mah
A region that traverses the lipid bilayer and adopts a helical secondary structure.
kareneilbeck
2011-06-17T12:39:46Z
transmembrane helix
sequence
SO:0001812
transmembrane_helix
A region that traverses the lipid bilayer and adopts a helical secondary structure.
PomBase:mah
A polypeptide region that targets a polypeptide to the vacuole.
kareneilbeck
2011-06-17T12:42:48Z
vacuolar sorting signal
sequence
SO:0001813
vacuolar_sorting_signal
A polypeptide region that targets a polypeptide to the vacuole.
PomBase:mah
An attribute of a coding genomic variant.
kareneilbeck
2011-06-24T03:32:25Z
coding variant quality
sequence
SO:0001814
coding_variant_quality
A variant that does not lead to any change in the amino acid sequence.
kareneilbeck
2011-06-24T03:33:16Z
sequence
SO:0001815
synonymous
A variant that leads to the change of an amino acid within the protein.
kareneilbeck
2011-06-24T03:33:36Z
sequence
non synonymous
SO:0001816
non_synonymous
An attribute describing a sequence that contains a mutation involving the deletion or insertion of one or more bases, where this number is divisible by 3.
kareneilbeck
2011-06-24T03:34:03Z
sequence
SO:0001817
inframe
An attribute describing a sequence that contains a mutation involving the deletion or insertion of one or more bases, where this number is divisible by 3.
SO:ke
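The divisibility rule in the inframe definition can be written as a small check. This is a minimal sketch, assuming the variant is supplied as reference and alternate allele strings that lie entirely within coding sequence; the function name is hypothetical.

```python
def is_inframe_indel(ref: str, alt: str) -> bool:
    """Return True when ref and alt differ in length by a non-zero multiple of 3.

    Hypothetical helper: a length difference of zero is a substitution,
    not an insertion or deletion, so it is reported as False.
    """
    delta = abs(len(alt) - len(ref))
    return delta > 0 and delta % 3 == 0
```

Under these assumptions, inserting three bases (A to ACGT) is inframe, while inserting a single base (A to AC) is not.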
A sequence_variant which is predicted to change the protein encoded in the coding sequence.
kareneilbeck
2011-06-24T03:38:02Z
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
VEP:protein_altering_variant
protein altering variant
sequence
SO:0001818
protein_altering_variant
A sequence_variant which is predicted to change the protein encoded in the coding sequence.
EBI:gr
VEP:protein_altering_variant
A sequence variant where there is no resulting change to the encoded amino acid.
kareneilbeck
2011-06-24T03:38:30Z
SO:0001588
http://en.wikipedia.org/wiki/Silent_mutation
http://en.wikipedia.org/wiki/Synonymous_mutation
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
http://snpeff.sourceforge.net/SnpEff_manual.html
http://vat.gersteinlab.org/formats.php
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
Jannovar:synonymous_variant
Seattleseq:synonymous
VAAST:synonymous_codon
VAAST:synonymous_variant
VAT:synonymous
VEP:synonymous_variant
coding-synon
snpEff:SYNONYMOUS_CODING
synonymous codon
synonymous_coding
synonymous_codon
sequence
ANNOVAR:synonymous SNV
Seattleseq:synonymous-near-splice
silent mutation
silent substitution
silent_mutation
SO:0001819
EBI term: Synonymous SNPs - In coding sequence, not resulting in an amino acid change (i.e. silent mutation).
This term is sometimes used synonymously with the more general term 'silent mutation', although a silent mutation may occur in non coding sequence. The best practice is to annotate to the most specific term.
synonymous_variant
A sequence variant where there is no resulting change to the encoded amino acid.
SO:ke
http://en.wikipedia.org/wiki/Silent_mutation
wiki
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
http://vat.gersteinlab.org/formats.php
VAT
Jannovar:synonymous_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
Seattleseq:synonymous
VAAST:synonymous_codon
VAAST:synonymous_variant
VAT:synonymous
VEP:synonymous_variant
coding-synon
ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd
snpEff:SYNONYMOUS_CODING
ANNOVAR:synonymous SNV
http://www.openbioinformatics.org/annovar/annovar_download.html
Seattleseq:synonymous-near-splice
A coding sequence variant where the change does not alter the frame of the transcript.
kareneilbeck
2011-06-27T11:25:33Z
inframe change in CDS length
inframe indel
sequence
SO:0001820
inframe_indel
A coding sequence variant where the change does not alter the frame of the transcript.
SO:ke
An inframe non synonymous variant that inserts bases into the coding sequence.
kareneilbeck
2011-06-27T11:26:22Z
SO:0001651
http://snpeff.sourceforge.net/SnpEff_manual.html
http://vat.gersteinlab.org/formats.php
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
ANNOVAR:nonframeshift insertion
Jannovar:inframe_insertion
VAT:insertionNFS
VEP:inframe_insertion
inframe increase in CDS length
inframe insertion
inframe_codon_gain
snpEFF:CODON_INSERTION
sequence
inframe codon gain
SO:0001821
inframe_insertion
An inframe non synonymous variant that inserts bases into the coding sequence.
EBI:gr
http://vat.gersteinlab.org/formats.php
VAT
ANNOVAR:nonframeshift insertion
http://www.openbioinformatics.org/annovar/annovar_download.html
Jannovar:inframe_insertion
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VAT:insertionNFS
VEP:inframe_insertion
snpEFF:CODON_INSERTION
An inframe non synonymous variant that deletes bases from the coding sequence.
kareneilbeck
2011-06-27T11:27:10Z
SO:0001652
http://snpeff.sourceforge.net/SnpEff_manual.html
http://vat.gersteinlab.org/formats.php
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
ANNOVAR:nonframeshift deletion
Jannovar:inframe_deletion
VAT:deletionNFS
VEP:inframe_deletion
inframe decrease in CDS length
inframe_codon_loss
sequence
inframe codon loss
inframe deletion
snpEff:CODON_DELETION
SO:0001822
inframe_deletion
An inframe non synonymous variant that deletes bases from the coding sequence.
EBI:gr
http://vat.gersteinlab.org/formats.php
VAT
ANNOVAR:nonframeshift deletion
http://www.openbioinformatics.org/annovar/annovar_download.html
Jannovar:inframe_deletion
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VAT:deletionNFS
VEP:inframe_deletion
snpEff:CODON_DELETION
An inframe increase in cds length that inserts one or more codons into the coding sequence between existing codons.
kareneilbeck
2011-06-27T11:28:02Z
conservative increase in CDS length
conservative inframe insertion
sequence
SO:0001823
conservative_inframe_insertion
An inframe increase in cds length that inserts one or more codons into the coding sequence between existing codons.
EBI:gr
An inframe increase in cds length that inserts one or more codons into the coding sequence within an existing codon.
kareneilbeck
2011-06-27T11:28:37Z
http://snpeff.sourceforge.net/SnpEff_manual.html
Jannovar:disruptive_inframe_insertion
disruptive increase in CDS length
disruptive inframe insertion
snpEff:CODON_CHANGE_PLUS_CODON_INSERTION
sequence
SO:0001824
disruptive_inframe_insertion
An inframe increase in cds length that inserts one or more codons into the coding sequence within an existing codon.
EBI:gr
Jannovar:disruptive_inframe_insertion
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
snpEff:CODON_CHANGE_PLUS_CODON_INSERTION
An inframe decrease in cds length that deletes one or more entire codons from the coding sequence but does not change any remaining codons.
kareneilbeck
2011-06-27T11:30:43Z
conservative inframe deletion
sequence
conservative decrease in CDS length
SO:0001825
conservative_inframe_deletion
An inframe decrease in cds length that deletes one or more entire codons from the coding sequence but does not change any remaining codons.
EBI:gr
An inframe decrease in cds length that deletes bases from the coding sequence starting within an existing codon.
kareneilbeck
2011-06-27T11:31:31Z
http://snpeff.sourceforge.net/SnpEff_manual.html
Jannovar:disruptive_inframe_deletion
disruptive decrease in CDS length
disruptive inframe deletion
snpEff:CODON_CHANGE_PLUS_CODON_DELETION
sequence
SO:0001826
disruptive_inframe_deletion
An inframe decrease in cds length that deletes bases from the coding sequence starting within an existing codon.
EBI:gr
Jannovar:disruptive_inframe_deletion
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
snpEff:CODON_CHANGE_PLUS_CODON_DELETION
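The distinction between conservative_inframe_deletion and disruptive_inframe_deletion depends on where the deletion starts relative to codon boundaries. The sketch below is one possible reading of the two definitions, not the logic of any particular annotation tool: it assumes a 0-based offset of the first deleted base within the CDS, a deletion lying wholly inside the CDS, and a hypothetical function name.

```python
def classify_inframe_deletion(cds_start: int, length: int) -> str:
    """Classify an inframe deletion by its start position within the CDS.

    cds_start: 0-based offset of the first deleted base in the CDS (assumed).
    length:    number of deleted bases; must be a positive multiple of 3.
    """
    if length <= 0 or length % 3 != 0:
        raise ValueError("not an inframe deletion: length must be a positive multiple of 3")
    if cds_start % 3 == 0:
        # Starts on a codon boundary, so only entire codons are removed.
        return "conservative_inframe_deletion"
    # Starts within an existing codon, so a flanking codon is rebuilt.
    return "disruptive_inframe_deletion"
```

A three-base deletion starting at offset 3 removes exactly the second codon and is classed as conservative; the same deletion starting at offset 4 begins inside that codon and is classed as disruptive.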
A sequencer read of an mRNA substrate.
kareneilbeck
2011-06-28T04:04:32Z
mRNA read
sequence
SO:0001827
Requested by Bayer Cropscience June, 2011.
mRNA_read
A sequencer read of an mRNA substrate.
SO:ke
A sequencer read of a genomic DNA substrate.
kareneilbeck
2011-06-28T04:06:10Z
gDNA read
gDNA_read
genomic DNA read
sequence
SO:0001828
genomic_DNA_read
A sequencer read of a genomic DNA substrate.
SO:ke
A contig composed of mRNA_reads.
kareneilbeck
2011-06-28T04:07:09Z
sequence
mRNA contig
SO:0001829
Requested by Bayer Cropscience June, 2011.
mRNA_contig
A contig composed of mRNA_reads.
SO:ke
A PCR product obtained by applying the AFLP technique, based on a restriction enzyme digestion of genomic DNA and an amplification of the resulting fragments.
kareneilbeck
2011-07-14T12:12:35Z
http://en.wikipedia.org/wiki/Amplified_fragment_length_polymorphism
AFLP
AFLP fragment
AFLP-PCR
amplified fragment length polymorphism
amplified fragment length polymorphism PCR
sequence
SO:0001830
Requested by Bayer Cropscience June, 2011.
AFLP_fragment
A PCR product obtained by applying the AFLP technique, based on a restriction enzyme digestion of genomic DNA and an amplification of the resulting fragments.
GMOD:ea
http://en.wikipedia.org/wiki/Amplified_fragment_length_polymorphism
wiki
A match to a protein HMM such as pfam.
kareneilbeck
2011-08-11T03:20:27Z
protein hmm match
sequence
SO:0001831
protein_hmm_match
A match to a protein HMM such as pfam.
SO:ke
A region of immunoglobulin sequence, either constant or variable.
kareneilbeck
2011-09-01T03:27:20Z
immunoglobulin region
sequence
SO:0001832
immunoglobulin_region
A region of immunoglobulin sequence, either constant or variable.
SO:ke
The variable region of an immunoglobulin polypeptide sequence.
kareneilbeck
2011-09-01T03:28:40Z
INSDC_feature:V_region
V region
sequence
SO:0001833
V_region
The variable region of an immunoglobulin polypeptide sequence.
SO:ke
The constant region of an immunoglobulin polypeptide sequence.
kareneilbeck
2011-09-01T03:29:41Z
C region
sequence
SO:0001834
C_region
The constant region of an immunoglobulin polypeptide sequence.
SO:ke
Extra nucleotides inserted between rearranged immunoglobulin segments.
kareneilbeck
2011-09-01T03:50:16Z
INSDC_feature:N_region
N-region
sequence
SO:0001835
N_region
Extra nucleotides inserted between rearranged immunoglobulin segments.
SO:ke
The switch region of immunoglobulin heavy chains; it is involved in the rearrangement of heavy chain DNA leading to the expression of different immunoglobulin classes from the same B-cell.
kareneilbeck
2011-09-01T03:52:05Z
INSDC_feature:S_region
S region
sequence
SO:0001836
S_region
The switch region of immunoglobulin heavy chains; it is involved in the rearrangement of heavy chain DNA leading to the expression of different immunoglobulin classes from the same B-cell.
SO:ke
A kind of insertion where the inserted sequence is a mobile element.
kareneilbeck
2011-10-04T12:36:52Z
mobile element insertion
sequence
SO:0001837
Requested by the EBI.
mobile_element_insertion
A kind of insertion where the inserted sequence is a mobile element.
EBI:dvga
An insertion the sequence of which cannot be mapped to the reference genome.
kareneilbeck
2011-10-04T01:14:50Z
novel sequence insertion
sequence
SO:0001838
Requested by the NCBI.
novel_sequence_insertion
An insertion the sequence of which cannot be mapped to the reference genome.
NCBI:th
A promoter element with consensus sequence GTGRGAA, bound by CSL (CBF1/RBP-JK/Suppressor of Hairless/LAG-1) transcription factors.
kareneilbeck
2011-10-07T03:37:43Z
CSL response element
sequence
SO:0001839
CSL_response_element
A promoter element with consensus sequence GTGRGAA, bound by CSL (CBF1/RBP-JK/Suppressor of Hairless/LAG-1) transcription factors.
PMID:19101542
A GATA transcription factor element containing the consensus sequence WGATAR (in which W indicates A/T and R indicates A/G).
kareneilbeck
2011-10-07T03:42:05Z
GATA box
sequence
GATA element
SO:0001840
Changed to is_a SO:0001055 transcriptional_cis_regulatory_region from core_eukaryotic_promoter_element SO:0001660 after Ruth Lovering from GREEKC initiative pointed out that GATA boxes are frequently in enhancer regions, Dave Sant Aug 2020. Moved from is_a SO:0001055 transcriptional_cis_regulatory_region to SO:0000235 TF_binding_site after Colin Logie pointed out that this is a consensus sequence where transcription factors bind, GREEKC Jan 21, 2021.
GATA_box
A GATA transcription factor element containing the consensus sequence WGATAR (in which W indicates A/T and R indicates A/G).
PMID:8321208
A pseudogene in the reference genome, though known to be intact in the genomes of other individuals of the same species. The annotation process has confirmed that the pseudogenisation event is not a genomic sequencing error.
kareneilbeck
2011-10-07T03:46:57Z
polymorphic pseudogene
sequence
SO:0001841
This term is used by Ensembl and Vega. A pseudogene owing to a SNP/DIP, although in other individuals/haplotypes/strains the gene is translated.
polymorphic_pseudogene
A pseudogene in the reference genome, though known to be intact in the genomes of other individuals of the same species. The annotation process has confirmed that the pseudogenisation event is not a genomic sequencing error.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
JAX:hd
A promoter element with consensus sequence TGACTCA, bound by AP-1 and related transcription factors.
kareneilbeck
2011-10-07T03:54:52Z
AP-1 binding site
sequence
SO:0001842
AP_1_binding_site
A promoter element with consensus sequence TGACTCA, bound by AP-1 and related transcription factors.
PMID:1899230
PMID:3034432
PMID:3125983
MERGED DEFINITION:
TARGET DEFINITION: A promoter element with consensus sequence TGACGTCA; bound by the ATF/CREB family of transcription factors.
--------------------
SOURCE DEFINITION: A promoter element that contains a core sequence TGACGT, bound by a protein complex that regulates transcription of genes encoding PKA pathway components.
kareneilbeck
2011-10-07T03:58:48Z
SO:0001900
ATF/CRE site
Atf1/Pcr1 recognition motif
M26 binding site
M26_binding_site
cyclic AMP response element
m26 site
sequence
SO:0001843
New synonym Atf1/Pcr1 recognition motif added in response to Antonia Lock GitHub Issue Request #437, PMID:15716492
CRE
MERGED DEFINITION:
TARGET DEFINITION: A promoter element with consensus sequence TGACGTCA; bound by the ATF/CREB family of transcription factors.
--------------------
SOURCE DEFINITION: A promoter element that contains a core sequence TGACGT, bound by a protein complex that regulates transcription of genes encoding PKA pathway components.
PMID:11483355
PMID:11483993
PMID:15448137
ATF/CRE site
PMID:11483993
A promoter element bound by copper ion-sensing transcription factors such as S. cerevisiae Mac1p or S. pombe Cuf1; the consensus sequence is HTHNNGCTGD (more specifically TTTGCKCR in budding yeast).
kareneilbeck
2011-10-07T04:02:51Z
copper-response element
sequence
SO:0001844
CuRE
A promoter element bound by copper ion-sensing transcription factors such as S. cerevisiae Mac1p or S. pombe Cuf1; the consensus sequence is HTHNNGCTGD (more specifically TTTGCKCR in budding yeast).
PMID:10593913
PMID:9188496
PMID:9211922
A promoter element with consensus sequence CGWGGWNGMM, bound by transcription factors related to RecA and found in promoters of genes expressed following several types of DNA damage or inhibition of DNA synthesis.
kareneilbeck
2011-10-07T04:17:25Z
DNA damage response element
sequence
SO:0001845
DRE
A promoter element with consensus sequence CGWGGWNGMM, bound by transcription factors related to RecA and found in promoters of genes expressed following several types of DNA damage or inhibition of DNA synthesis.
PMID:11073995
PMID:8668127
A promoter element that has consensus sequence GTAAACAAACAAAM and contains a heptameric core GTAAACA, bound by transcription factors with a forkhead DNA-binding domain.
kareneilbeck
2011-10-07T04:20:01Z
sequence
FLEX element
SO:0001846
FLEX_element
A promoter element that has consensus sequence GTAAACAAACAAAM and contains a heptameric core GTAAACA, bound by transcription factors with a forkhead DNA-binding domain.
PMID:10747048
PMID:14871934
A promoter element with consensus sequence TTTRTTTACA, bound by transcription factors with a forkhead DNA-binding domain.
kareneilbeck
2011-10-07T04:22:06Z
forkhead motif
sequence
SO:0001847
forkhead_motif
A promoter element with consensus sequence TTTRTTTACA, bound by transcription factors with a forkhead DNA-binding domain.
PMID:15195092
A core promoter element that has the consensus sequence CAGTCACA (or its inverted form TGTGACTG), and plays the role of a TATA box in promoters that do not contain a canonical TATA sequence.
kareneilbeck
2011-10-07T04:24:14Z
homoID
homol D box
sequence
SO:0001848
homol_D_box
A core promoter element that has the consensus sequence CAGTCACA (or its inverted form TGTGACTG), and plays the role of a TATA box in promoters that do not contain a canonical TATA sequence.
PMID:21673110
PMID:7501449
PMID:8458332
A core promoter element that has the consensus sequence ACCCTACCCT (or its inverted form AGGGTAGGGT), and is found near the homol D box in some promoters that use a homol D box instead of a canonical TATA sequence.
kareneilbeck
2011-10-07T04:26:09Z
homol E box
sequence
SO:0001849
homol_E_box
A core promoter element that has the consensus sequence ACCCTACCCT (or its inverted form AGGGTAGGGT), and is found near the homol D box in some promoters that use a homol D box instead of a canonical TATA sequence.
PMID:7501449
A promoter element that consists of at least three copies of the pentanucleotide NGAAN, bound by the heat shock transcription factor HSF.
kareneilbeck
2011-10-07T04:29:10Z
heat shock element
sequence
SO:0001850
HSE
A promoter element that consists of at least three copies of the pentanucleotide NGAAN, bound by the heat shock transcription factor HSF.
PMID:17347150
PMID:8689565
A GATA promoter element with consensus sequence WGATAA, found in promoters of genes repressed in the presence of iron.
kareneilbeck
2011-10-07T04:32:42Z
IDP (GATA)
iron repressed GATA element
sequence
SO:0001851
The synonym IDP (GATA) is found in an annotation but has not been traced in the literature.
iron_repressed_GATA_element
A GATA promoter element with consensus sequence WGATAA, found in promoters of genes repressed in the presence of iron.
PMID:11956219
PMID:17211681
A promoter element with consensus sequence ACAAT, found in promoters of mating type M-specific genes in fission yeast and bound by the transcription factor Mat1-Mc.
kareneilbeck
2011-10-07T04:39:43Z
mating type M-box
sequence
SO:0001852
Note that this should not be confused with the M-box that has consensus sequence CATGTG and is bound by bHLH transcription factors such as MITF.
mating_type_M_box
A promoter element with consensus sequence ACAAT, found in promoters of mating type M-specific genes in fission yeast and bound by the transcription factor Mat1-Mc.
PMID:9233811
A non-palindromic sequence found in the promoters of genes whose expression is regulated in response to androgen.
kareneilbeck
2011-10-10T04:52:44Z
ARE
androgen response element
sequence
SO:0001853
androgen_response_element
A non-palindromic sequence found in the promoters of genes whose expression is regulated in response to androgen.
PMID:21796522
An smFISH probe is a probe that binds RNA in a single-molecule in situ hybridization experiment.
kareneilbeck
2011-10-10T05:00:30Z
single molecule fish probe
sequence
smFISH probe
SO:0001854
smFISH_probe
An smFISH probe is a probe that binds RNA in a single-molecule in situ hybridization experiment.
PMID:18806792
A promoter element with consensus sequence ACGCGT, bound by the transcription factor complex MBF (MCB-binding factor) and found in promoters of genes expressed during the G1/S transition of the cell cycle.
kareneilbeck
2011-10-10T05:09:45Z
MluI cell cycle box
sequence
SO:0001855
MCB
A promoter element with consensus sequence ACGCGT, bound by the transcription factor complex MBF (MCB-binding factor) and found in promoters of genes expressed during the G1/S transition of the cell cycle.
PMID:16285853
A promoter element with consensus sequence CCAAT, bound by a protein complex that represses transcription in response to low iron levels.
kareneilbeck
2011-10-10T05:13:54Z
CCAAT motif
sequence
SO:0001856
CCAAT_motif
A promoter element with consensus sequence CCAAT, bound by a protein complex that represses transcription in response to low iron levels.
PMID:16963626
A promoter element with consensus sequence CCAGCC, bound by the fungal transcription factor Ace2.
kareneilbeck
2011-10-10T05:19:10Z
Ace2 upstream activating sequence
sequence
SO:0001857
Ace2_UAS
A promoter element with consensus sequence CCAGCC, bound by the fungal transcription factor Ace2.
PMID:16678171
A promoter element with consensus sequence TTCTTTGTTY, bound by an HMG-box transcription factor such as S. pombe Ste11, and found in promoters of genes up-regulated early in meiosis.
kareneilbeck
2011-10-10T05:22:13Z
TR box
sequence
SO:0001858
TR_box
A promoter element with consensus sequence TTCTTTGTTY, bound by an HMG-box transcription factor such as S. pombe Ste11, and found in promoters of genes up-regulated early in meiosis.
PMID:1657709
A promoter element with consensus sequence CCCCTC, bound by the PKA-responsive zinc finger transcription factor Rst2.
kareneilbeck
2011-10-14T10:25:02Z
stress-starvation response element of Schizosaccharomyces pombe
sequence
STREP motif
SO:0001859
STREP_motif
A promoter element with consensus sequence CCCCTC, bound by the PKA-responsive zinc finger transcription factor Rst2.
PMID:11739717
A DNA motif that contains a core consensus sequence AGGTAAGGGTAATGCAC, is found in the intergenic regions of rDNA repeats, and is bound by an RNA polymerase I transcription termination factor (e.g. S. pombe Reb1). The S. pombe telomeric repeat consensus is TTAC(0-1)A(0-1)G(1-8).
kareneilbeck
2011-10-19T11:23:09Z
rDIS
sequence
SO:0001860
Page 208 of ISBN:978-0199638901
rDNA_intergenic_spacer_element
A DNA motif that contains a core consensus sequence AGGTAAGGGTAATGCAC, is found in the intergenic regions of rDNA repeats, and is bound by an RNA polymerase I transcription termination factor (e.g. S. pombe Reb1). The S. pombe telomeric repeat consensus is TTAC(0-1)A(0-1)G(1-8).
ISBN:978-0199638901
PMID:9016645
A 10-bp promoter element bound by sterol regulatory element binding proteins (SREBPs), found in promoters of genes involved in sterol metabolism. Many variants of the sequence ATCACCCCAC function as SREs.
kareneilbeck
2011-10-19T03:02:05Z
SRE
sequence
SO:0001861
sterol_regulatory_element
A 10-bp promoter element bound by sterol regulatory element binding proteins (SREBPs), found in promoters of genes involved in sterol metabolism. Many variants of the sequence ATCACCCCAC function as SREs.
GO:mah
PMID:11111080
PMID:16537923
SRE
GO:mah
A dinucleotide repeat region composed of GT repeating elements.
kareneilbeck
2011-10-19T03:54:37Z
d(GT)n
sequence
SO:0001862
paper:PMID:16043634.
GT_dinucleotide_repeat
A dinucleotide repeat region composed of GT repeating elements.
SO:ke
A trinucleotide repeat region composed of GTT repeating elements.
kareneilbeck
2011-10-19T03:56:54Z
d(GTT)
sequence
SO:0001863
GTT_trinucleotide_repeat
A trinucleotide repeat region composed of GTT repeating elements.
SO:ke
A DNA motif to which the S. pombe Sap1 protein binds. The consensus sequence is 5'-TARGCAGNTNYAACGMG-3'; it is found at the mating type locus, where it is important for mating type switching, and at replication fork barriers in rDNA repeats.
kareneilbeck
2011-10-19T04:24:16Z
Sap1 recognition site
sequence
SO:0001864
Sap1_recognition_motif
A DNA motif to which the S. pombe Sap1 protein binds. The consensus sequence is 5'-TARGCAGNTNYAACGMG-3'; it is found at the mating type locus, where it is important for mating type switching, and at replication fork barriers in rDNA repeats.
PMID:16166653
PMID:7651412
An RNA polymerase II promoter element found in the promoters of genes regulated by calcineurin. The consensus sequence is GNGGCKCA.
kareneilbeck
2011-10-20T10:12:19Z
CDRE motif
calcineurin-dependent response element
sequence
SO:0001865
CDRE_motif
An RNA polymerase II promoter element found in the promoters of genes regulated by calcineurin. The consensus sequence is GNGGCKCA.
PMID:16928959
calcineurin-dependent response element
PMID:16928959
A contig of BAC reads.
kareneilbeck
2012-01-17T02:45:04Z
BAC read contig
sequence
SO:0001866
Requested by Bayer Cropscience December, 2011.
BAC_read_contig
A contig of BAC reads.
GMOD:ea
A gene suspected of being involved in the expression of a trait.
kareneilbeck
2012-01-17T02:53:03Z
candidate gene
target gene
sequence
SO:0001867
Requested by Bayer Cropscience December, 2011.
candidate_gene
A gene suspected of being involved in the expression of a trait.
GMOD:ea
A candidate gene whose association with a trait is based on the gene's location on a chromosome.
kareneilbeck
2012-01-17T02:54:42Z
positional candidate gene
sequence
positional target gene
SO:0001868
Requested by Bayer Cropscience December, 2011.
positional_candidate_gene
A candidate gene whose association with a trait is based on the gene's location on a chromosome.
GMOD:ea
A candidate gene whose function has something in common biologically with the trait under investigation.
kareneilbeck
2012-01-17T02:57:30Z
functional candidate gene
functional target gene
sequence
SO:0001869
Requested by Bayer Cropscience December, 2011.
functional_candidate_gene
A candidate gene whose function has something in common biologically with the trait under investigation.
GMOD:ea
A short ncRNA that is transcribed from an enhancer. May have a regulatory function.
kareneilbeck
2012-01-17T03:09:35Z
eRNA
sequence
SO:0001870
enhancerRNA
A short ncRNA that is transcribed from an enhancer. May have a regulatory function.
SO:cjm
doi:10.1038/465173a
A promoter element with consensus sequence GNAACR, bound by the transcription factor complex PBF (PCB-binding factor) and found in promoters of genes expressed during the M/G1 transition of the cell cycle.
kareneilbeck
2012-01-17T03:14:02Z
sequence
SO:0001871
PCB
A promoter element with consensus sequence GNAACR, bound by the transcription factor complex PBF (PCB-binding factor) and found in promoters of genes expressed during the M/G1 transition of the cell cycle.
GO:mah
PMID:12411492
A region of a chromosome, where the chromosome has undergone a large structural rearrangement that altered the genome organization. There is no longer synteny to the reference genome.
kareneilbeck
2012-02-03T04:38:35Z
rearrangement region
sequence
SO:0001872
NCBI definition: An orphan rearrangement between chromosomal locations observed in isolation.
rearrangement_region
A region of a chromosome, where the chromosome has undergone a large structural rearrangement that altered the genome organization. There is no longer synteny to the reference genome.
NCBI:th
PMID:18564416
A rearrangement breakpoint between two different chromosomes.
kareneilbeck
2012-02-03T04:43:45Z
interchromosomal breakpoint
sequence
SO:0001873
interchromosomal_breakpoint
A rearrangement breakpoint between two different chromosomes.
NCBI:th
A rearrangement breakpoint within the same chromosome.
kareneilbeck
2012-02-03T04:44:53Z
intrachromosomal breakpoint
sequence
SO:0001874
intrachromosomal_breakpoint
A rearrangement breakpoint within the same chromosome.
NCBI:th
A supercontig that has not been assigned to any ultracontig during a genome assembly project.
kareneilbeck
2012-02-14T05:02:20Z
unassigned supercontig
sequence
unassigned scaffold
SO:0001875
Requested by Bayer Cropscience January, 2012.
unassigned_supercontig
A supercontig that has not been assigned to any ultracontig during a genome assembly project.
GMOD:ea
A partial DNA sequence assembly of a chromosome or full genome, which contains gaps that are filled with N's.
kareneilbeck
2012-02-14T05:05:32Z
pseudomolecule
partial genomic sequence assembly
sequence assembly with N-gaps
sequence
SO:0001876
Requested by Bayer Cropscience January, 2012.
partial_genomic_sequence_assembly
A partial DNA sequence assembly of a chromosome or full genome, which contains gaps that are filled with N's.
GMOD:ea
A non-coding RNA generally longer than 200 nucleotides that cannot be classified as any other ncRNA subtype. Similar to mRNAs, lncRNAs are mainly transcribed by RNA polymerase II, are often capped by 7-methyl guanosine at their 5' ends, polyadenylated at their 3' ends and may be spliced.
kareneilbeck
2012-02-14T05:18:01Z
INSDC_feature:ncRNA
http://www.gencodegenes.org/gencode_biotypes.html
INSDC_qualifier:lncRNA
lncRNA_transcript
long non-coding RNA
sequence
SO:0001877
Updated the definition of lncRNA (SO:0001877) from "A non-coding RNA over 200 nucleotides in length." to "A non-coding RNA generally longer than 200 nucleotides that cannot be classified as any other ncRNA subtype. Similar to mRNAs, lncRNAs are mainly transcribed by RNA polymerase II, are often capped by 7-methyl guanosine at their 5' ends, polyadenylated at their 3' ends and may be spliced." See GitHub Issue #575.
lncRNA
A non-coding RNA generally longer than 200 nucleotides that cannot be classified as any other ncRNA subtype. Similar to mRNAs, lncRNAs are mainly transcribed by RNA polymerase II, are often capped by 7-methyl guanosine at their 5' ends, polyadenylated at their 3' ends and may be spliced.
HGNC:mw
PMID:33353982
http://www.gencodegenes.org/gencode_biotypes.html
GENCODE
A sequence variant that falls entirely or partially within a genomic feature.
kareneilbeck
2012-04-03T11:27:27Z
feature alteration
sequence
SO:0001878
Created in conjunction with the EBI.
feature_variant
A sequence variant that falls entirely or partially within a genomic feature.
EBI:fc
SO:ke
A sequence variant, caused by an alteration of the genomic sequence, where the structural change, a deletion, is greater than the extent of the underlying genomic features.
kareneilbeck
2012-04-03T11:36:48Z
feature ablation
sequence
SO:0001879
Created in conjunction with the EBI.
feature_ablation
A sequence variant, caused by an alteration of the genomic sequence, where the structural change, a deletion, is greater than the extent of the underlying genomic features.
SO:ke
A sequence variant, caused by an alteration of the genomic sequence, where the structural change, an amplification of sequence, is greater than the extent of the underlying genomic features.
kareneilbeck
2012-04-03T11:37:48Z
feature amplification
sequence
SO:0001880
Created in conjunction with the EBI.
feature_amplification
A sequence variant, caused by an alteration of the genomic sequence, where the structural change, an amplification of sequence, is greater than the extent of the underlying genomic features.
SO:ke
A sequence variant, caused by an alteration of the genomic sequence, where the structural change, a translocation, is greater than the extent of the underlying genomic features.
kareneilbeck
2012-04-03T11:38:52Z
feature translocation
sequence
SO:0001881
Created in conjunction with the EBI.
feature_translocation
A sequence variant, caused by an alteration of the genomic sequence, where the structural change, a translocation, is greater than the extent of the underlying genomic features.
SO:ke
A sequence variant, caused by an alteration of the genomic sequence, where a deletion fuses genomic features.
kareneilbeck
2012-04-03T11:39:20Z
feature fusion
sequence
SO:0001882
Created in conjunction with the EBI.
feature_fusion
A sequence variant, caused by an alteration of the genomic sequence, where a deletion fuses genomic features.
SO:ke
A feature translocation where the region contains a transcript.
kareneilbeck
2012-04-03T12:29:52Z
transcript translocation
sequence
SO:0001883
Created in conjunction with the EBI.
transcript_translocation
A feature translocation where the region contains a transcript.
SO:ke
A feature translocation where the region contains a regulatory region.
kareneilbeck
2012-04-03T12:31:04Z
regulatory region translocation
sequence
SO:0001884
Created in conjunction with the EBI.
regulatory_region_translocation
A feature translocation where the region contains a regulatory region.
SO:ke
A feature translocation where the region contains a transcription factor binding site.
kareneilbeck
2012-04-03T12:31:15Z
TFBS binding site translocation
transcription factor binding site translocation
sequence
SO:0001885
Created in conjunction with the EBI.
TFBS_translocation
A feature translocation where the region contains a transcription factor binding site.
SO:ke
A feature fusion where the deletion brings together transcript regions.
kareneilbeck
2012-04-03T12:34:56Z
transcript fusion
sequence
SO:0001886
Created in conjunction with the EBI.
transcript_fusion
A feature fusion where the deletion brings together transcript regions.
SO:ke
A feature fusion where the deletion brings together regulatory regions.
kareneilbeck
2012-04-03T12:35:58Z
regulatory region fusion
sequence
SO:0001887
Created in conjunction with the EBI.
regulatory_region_fusion
A feature fusion where the deletion brings together regulatory regions.
SO:ke
A feature fusion where the deletion brings together transcription factor binding sites.
kareneilbeck
2012-04-03T12:36:42Z
TFBS fusion
transcription factor binding site fusion
sequence
SO:0001888
Created in conjunction with the EBI.
TFBS_fusion
A feature fusion where the deletion brings together transcription factor binding sites.
SO:ke
A feature amplification of a region containing a transcript.
kareneilbeck
2012-04-03T12:39:23Z
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
VEP:transcript_amplification
transcript amplification
sequence
SO:0001889
Created in conjunction with the EBI.
transcript_amplification
A feature amplification of a region containing a transcript.
SO:ke
VEP:transcript_amplification
A feature fusion where the deletion brings together a regulatory region and a transcript region.
kareneilbeck
2012-04-03T12:40:17Z
transcript regulatory region fusion
sequence
SO:0001890
Created in conjunction with the EBI.
transcript_regulatory_region_fusion
A feature fusion where the deletion brings together a regulatory region and a transcript region.
SO:ke
A feature amplification of a region containing a regulatory region.
kareneilbeck
2012-04-03T12:41:28Z
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
VEP:regulatory_region_amplification
regulatory region amplification
sequence
SO:0001891
Created in conjunction with the EBI.
regulatory_region_amplification
A feature amplification of a region containing a regulatory region.
SO:ke
VEP:regulatory_region_amplification
A feature amplification of a region containing a transcription factor binding site.
kareneilbeck
2012-04-03T12:42:48Z
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
TFBS amplification
VEP:TFBS_amplification
transcription factor binding site amplification
sequence
SO:0001892
Created in conjunction with the EBI.
TFBS_amplification
A feature amplification of a region containing a transcription factor binding site.
SO:ke
VEP:TFBS_amplification
A feature ablation whereby the deleted region includes a transcript feature.
kareneilbeck
2012-04-03T12:44:19Z
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
Jannovar:transcript_ablation
VEP:transcript_ablation
transcript ablation
sequence
SO:0001893
Created in conjunction with the EBI.
transcript_ablation
A feature ablation whereby the deleted region includes a transcript feature.
SO:ke
Jannovar:transcript_ablation
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VEP:transcript_ablation
A feature ablation whereby the deleted region includes a regulatory region.
kareneilbeck
2012-04-03T12:45:13Z
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
VEP:regulatory_region_ablation
regulatory region ablation
sequence
SO:0001894
Created in conjunction with the EBI.
regulatory_region_ablation
A feature ablation whereby the deleted region includes a regulatory region.
SO:ke
VEP:regulatory_region_ablation
A feature ablation whereby the deleted region includes a transcription factor binding site.
kareneilbeck
2012-04-03T12:45:56Z
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
TFBS ablation
VEP:TFBS_ablation
transcription factor binding site ablation
sequence
SO:0001895
Created in conjunction with the EBI.
TFBS_ablation
A feature ablation whereby the deleted region includes a transcription factor binding site.
SO:ke
VEP:TFBS_ablation
A CDS that is part of a transposable element.
kareneilbeck
2012-04-05T01:57:04Z
transposable element CDS
sequence
SO:0001896
transposable_element_CDS
A CDS that is part of a transposable element.
SO:ke
A pseudogene contained within a transposable element.
kareneilbeck
2012-04-05T04:09:45Z
transposable element pseudogene
sequence
SO:0001897
transposable_element_pseudogene
A pseudogene contained within a transposable element.
SO:ke
A repeat region which is part of the regional centromere outer repeat region.
kareneilbeck
2012-04-06T11:48:48Z
dg repeat
sequence
SO:0001898
For the S. pombe project - requested by Val Wood.
dg_repeat
A repeat region which is part of the regional centromere outer repeat region.
PMID:16407326
SO:vw
A repeat region which is part of the regional centromere outer repeat region.
kareneilbeck
2012-04-06T11:50:07Z
dh repeat
sequence
SO:0001899
For the S. pombe project - requested by Val Wood.
dh_repeat
A repeat region which is part of the regional centromere outer repeat region.
PMID:16407326
SO:vw
true
A conserved 17-bp sequence (5'-ATCA(C/A)AACCCTAACCCT-3') commonly present upstream of the start site of histone transcription units functioning as a transcription factor binding site.
kareneilbeck
2012-04-06T12:05:24Z
AACCCT box
sequence
SO:0001901
AACCCT_box
A conserved 17-bp sequence (5'-ATCA(C/A)AACCCTAACCCT-3') commonly present upstream of the start site of histone transcription units functioning as a transcription factor binding site.
PMID:17452352
PMID:4092687
A region surrounding a cis_splice site, either within 1-3 bases of the exon or 3-8 bases of the intron.
kareneilbeck
2012-04-06T12:23:32Z
sequence
splice region
SO:0001902
splice_region
A region surrounding a cis_splice site, either within 1-3 bases of the exon or 3-8 bases of the intron.
SO:bm
true
A non-coding RNA transcribed from the opposite DNA strand relative to other transcripts, overlapping in part with sense RNA.
kareneilbeck
2012-04-06T04:36:44Z
natural antisense transcript
sequence
antisense lncRNA
SO:0001904
Relationship is_a SO:0000644 antisense_RNA added 23 April 2021. See GitHub Issue #443
antisense_lncRNA
A non-coding RNA transcribed from the opposite DNA strand relative to other transcripts, overlapping in part with sense RNA.
PMID:19638999
A transcript that is transcribed from the outer repeat region of a regional centromere.
kareneilbeck
2012-04-11T04:54:22Z
centromere outer repeat transcript
regional centromere outer repeat region transcript
regional_centromere_outer_repeat_region_transcript
sequence
SO:0001905
regional_centromere_outer_repeat_transcript
A transcript that is transcribed from the outer repeat region of a regional centromere.
PomBase:mah
A sequence variant that causes the reduction of a genomic feature, with regard to the reference sequence.
kareneilbeck
2012-04-12T05:05:28Z
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
Jannovar:feature_truncation
VEP:feature_truncation
feature truncation
sequence
SO:0001906
feature_truncation
A sequence variant that causes the reduction of a genomic feature, with regard to the reference sequence.
SO:ke
Jannovar:feature_truncation
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VEP:feature_truncation
A sequence variant that causes the extension of a genomic feature, with regard to the reference sequence.
kareneilbeck
2012-04-12T05:05:56Z
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
VEP:feature_elongation
feature elongation
sequence
SO:0001907
feature_elongation
A sequence variant that causes the extension of a genomic feature, with regard to the reference sequence.
SO:ke
VEP:feature_elongation
A sequence variant that causes the extension of a genomic feature from within the feature rather than from the terminus of the feature, with regard to the reference sequence.
kareneilbeck
2012-04-12T05:06:20Z
Jannovar:internal_feature_elongation
internal feature elongation
sequence
SO:0001908
internal_feature_elongation
A sequence variant that causes the extension of a genomic feature from within the feature rather than from the terminus of the feature, with regard to the reference sequence.
SO:ke
Jannovar:internal_feature_elongation
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
A frameshift variant that causes the translational reading frame to be extended relative to the reference feature.
kareneilbeck
2012-04-12T05:10:05Z
ANNOVAR:frameshift insertion
Jannovar:frameshift_elongation
frameshift elongation
sequence
SO:0001909
frameshift_elongation
A frameshift variant that causes the translational reading frame to be extended relative to the reference feature.
SO:ke
ANNOVAR:frameshift insertion
http://www.openbioinformatics.org/annovar/annovar_download.html
Jannovar:frameshift_elongation
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
A frameshift variant that causes the translational reading frame to be shortened relative to the reference feature.
kareneilbeck
2012-04-12T05:10:45Z
ANNOVAR:frameshift deletion
Jannovar:frameshift_truncation
frameshift truncation
sequence
SO:0001910
frameshift_truncation
A frameshift variant that causes the translational reading frame to be shortened relative to the reference feature.
SO:ke
ANNOVAR:frameshift deletion
http://www.openbioinformatics.org/annovar/annovar_download.html
Jannovar:frameshift_truncation
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
A sequence variant where copies of a feature are increased relative to the reference.
kareneilbeck
2012-04-13T11:26:32Z
copy number increase
sequence
SO:0001911
copy_number_increase
A sequence variant where copies of a feature are increased relative to the reference.
SO:ke
A sequence variant where copies of a feature are decreased relative to the reference.
kareneilbeck
2012-04-13T11:27:52Z
copy number decrease
sequence
SO:0001912
copy_number_decrease
A sequence variant where copies of a feature are decreased relative to the reference.
SO:ke
A bacterial promoter with sigma ecf factor binding dependency. This is a type of bacterial promoter that requires a sigma ECF factor to bind to identified -10 and -35 sequence regions in order to mediate binding of the RNA polymerase to the promoter region as part of transcription initiation.
kareneilbeck
2012-06-11T02:41:33Z
bacterial RNApol promoter sigma ecf
sequence
SO:0001913
Requested by Kevin Clancy (Invitrogen), May 2012.
bacterial_RNApol_promoter_sigma_ecf_element
A bacterial promoter with sigma ecf factor binding dependency. This is a type of bacterial promoter that requires a sigma ECF factor to bind to identified -10 and -35 sequence regions in order to mediate binding of the RNA polymerase to the promoter region as part of transcription initiation.
Invitrogen:kc
A DNA motif that is found in eukaryotic rDNA repeats, and is a site of replication fork pausing.
kareneilbeck
2012-06-11T02:55:02Z
DNA spacer replication fork barrier
RFB
RTS1 barrier
RTS1 element
rDNA replication fork barrier
sequence
SO:0001914
Requested by Midori - June 2012.
rDNA_replication_fork_barrier
A DNA motif that is found in eukaryotic rDNA repeats, and is a site of replication fork pausing.
PMID:14645529
A region defined by a cluster of experimentally determined transcription start sites.
kareneilbeck
2012-10-17T12:09:50Z
TSC
TSS cluster
transcriptional initiation cluster
transcriptional start site cluster
sequence
SO:0001915
transcription_start_cluster
A region defined by a cluster of experimentally determined transcription start sites.
PMID:19624849
PMID:21372179
SO:andrewgibson
A CAGE tag is a sequence tag that corresponds to 5' ends of mRNA at cap sites, produced by cap analysis gene expression and used to identify transcriptional start sites.
kareneilbeck
2012-10-17T12:36:58Z
CAGE tag
sequence
SO:0001916
CAGE_tag
A CAGE tag is a sequence tag that corresponds to 5' ends of mRNA at cap sites, produced by cap analysis gene expression and used to identify transcriptional start sites.
SO:andrewgibson
A kind of transcription_initiation_cluster defined by the clustering of CAGE tags on a sequence region.
kareneilbeck
2012-10-17T12:42:03Z
INSDC_feature:misc_feature
CAGE cluster
CAGE peak
CAGE_peak
INSDC_note:CAGE_cluster
sequence
SO:0001917
CAGE_cluster
A kind of transcription_initiation_cluster defined by the clustering of CAGE tags on a sequence region.
PMID:16645617
SO:andrewgibson
A cytosine methylated at the 5 carbon.
kareneilbeck
2012-10-17T12:46:10Z
http://www.insdc.org/files/feature_table.html#7.4.2
http://www.pacificbiosciences.com/pdf/WP_Detecting_DNA_Base_Modifications_Using_SMRT_Sequencing.pdf
5 methylcytosine
5-mC
m-5C
m5c
sequence
SO:0001918
5_methylcytosine
A cytosine methylated at the 5 carbon.
SO:rtapella
A cytosine methylated at the 4 nitrogen.
kareneilbeck
2012-10-17T12:50:40Z
http://www.insdc.org/files/feature_table.html#7.4.2
http://www.pacificbiosciences.com/pdf/WP_Detecting_DNA_Base_Modifications_Using_SMRT_Sequencing.pdf
4-mC
4-methylcytosine
N4 methylcytosine
N4-methylcytosine
N4_methylcytosine
m-4C
m4c
sequence
SO:0001919
4_methylcytosine
A cytosine methylated at the 4 nitrogen.
SO:rtapella
An adenine methylated at the 6 nitrogen.
kareneilbeck
2012-10-17T12:54:23Z
http://www.pacificbiosciences.com/pdf/WP_Detecting_DNA_Base_Modifications_Using_SMRT_Sequencing.pdf
6-mA
6-methyladenine
6mA
N6-methyladenine
m-6A
m6a
sequence
SO:0001920
N6_methyladenine
An adenine methylated at the 6 nitrogen.
SO:rtapella
A contig of mitochondria derived sequences.
kareneilbeck
2012-10-31T12:34:38Z
mitochondrial contig
sequence
SO:0001921
Requested by Bayer Cropscience, October, 2012.
mitochondrial_contig
A contig of mitochondria derived sequences.
GMOD:ea
A scaffold composed of mitochondrial contigs.
kareneilbeck
2012-10-31T12:42:45Z
mitochondrial scaffold
mitochondrial supercontig
mitochondrial_scaffold
sequence
SO:0001922
mitochondrial_supercontig
A scaffold composed of mitochondrial contigs.
GMOD:ea
A non-coding RNA transcript, derived from the transcription of the telomere. These transcripts contain G rich telomeric RNA repeats and RNA tracts corresponding to adjacent subtelomeric sequences. They are 100-9000 bases long.
kareneilbeck
2012-10-31T01:06:40Z
sequence
telomeric repeat containing RNA
SO:0001923
Telomeric transcription has been documented in mammals, birds, fish, plants and yeast. Requested by Antonia Lock, October 2012.
TERRA
A non-coding RNA transcript, derived from the transcription of the telomere. These transcripts contain G rich telomeric RNA repeats and RNA tracts corresponding to adjacent subtelomeric sequences. They are 100-9000 bases long.
PMID:22139915
A non-coding RNA transcript, complementary to the subtelomeric tract of the TERRA transcript but devoid of the repeats.
kareneilbeck
2012-10-31T01:11:49Z
sequence
SO:0001924
Telomeric transcription has been documented in mammals, birds, fish, plants and yeast. Requested by Antonia Lock, October 2012.
ARRET
A non-coding RNA transcript, complementary to the subtelomeric tract of the TERRA transcript but devoid of the repeats.
PMID:22139915
A non-coding RNA transcript, derived from the transcription of the telomere. These transcripts consist of C rich repeats.
kareneilbeck
2012-10-31T01:24:37Z
sequence
SO:0001925
Telomeric transcription has been documented in mammals, birds, fish, plants and yeast. Requested by Antonia Lock, October 2012.
ARIA
A non-coding RNA transcript, derived from the transcription of the telomere. These transcripts consist of C rich repeats.
PMID:22139915
A non-coding RNA transcript, derived from the transcription of the telomere. These transcripts are antisense of ARRET transcripts.
kareneilbeck
2012-10-31T01:40:22Z
anti-ARRET
sequence
SO:0001926
Telomeric transcription has been documented in mammals, birds, fish, plants and yeast. Requested by Antonia Lock, October 2012.
anti_ARRET
A non-coding RNA transcript, derived from the transcription of the telomere. These transcripts are antisense of ARRET transcripts.
PMID:22139915
A non-coding transcript derived from the transcription of the telomere.
kareneilbeck
2012-10-31T01:42:15Z
telomeric transcript
sequence
SO:0001927
telomeric_transcript
A non-coding transcript derived from the transcription of the telomere.
PMID:22139915
A duplication of the distal region of a chromosome.
kareneilbeck
2012-10-31T01:56:44Z
distal duplication
sequence
SO:0001928
This term is used by Complete Genomics in the structural variant analysis files.
distal_duplication
A duplication of the distal region of a chromosome.
SO:bm
A sequencer read of a mitochondrial DNA sample.
kareneilbeck
2012-11-14T04:39:56Z
mitochondrial DNA read
sequence
SO:0001929
Requested by Bayer Cropscience, October, 2012.
mitochondrial_DNA_read
A sequencer read of a mitochondrial DNA sample.
GMOD:ea
A sequencer read of a chloroplast DNA sample.
kareneilbeck
2012-11-14T04:43:45Z
chloroplast DNA read
sequence
SO:0001930
Requested by Bayer Cropscience, October, 2012.
chloroplast_DNA_read
A sequencer read of a chloroplast DNA sample.
GMOD:ea
Genomic DNA sequence produced from some base calling or alignment algorithm which uses aligned or assembled multiple gDNA sequences as input.
kareneilbeck
2012-11-28T12:53:14Z
consensus gDNA
consensus genomic DNA
sequence
SO:0001931
Requested by Bayer Cropscience November, 2012.
consensus_gDNA
Genomic DNA sequence produced from some base calling or alignment algorithm which uses aligned or assembled multiple gDNA sequences as input.
GMOD:ea
A terminal region of DNA sequence where the end of the region is not blunt ended and the exposed single strand terminates at the 5' end.
kareneilbeck
2013-03-06T09:50:44Z
restriction enzyme five prime single strand overhang
sequence
SO:0001932
restriction_enzyme_five_prime_single_strand_overhang
A terminal region of DNA sequence where the end of the region is not blunt ended and the exposed single strand terminates at the 5' end.
SO:ke
A terminal region of DNA sequence where the end of the region is not blunt ended and the exposed single strand terminates at the 3' end.
kareneilbeck
2013-03-06T09:52:14Z
restriction enzyme three prime single strand overhang
sequence
SO:0001933
restriction_enzyme_three_prime_single_strand_overhang
A terminal region of DNA sequence where the end of the region is not blunt ended and the exposed single strand terminates at the 3' end.
SO:ke
A repeat_region containing repeat_units of 1 bp that are repeated multiple times in tandem.
kareneilbeck
2013-03-06T09:59:15Z
monomeric repeat
sequence
SO:0001934
monomeric_repeat
A repeat_region containing repeat_units of 1 bp that are repeated multiple times in tandem.
SO:ke
A kind of histone modification site, whereby the 20th residue (a lysine), from the start of the H3 protein is tri-methylated.
kareneilbeck
2013-03-06T10:13:48Z
H3K20 trimethylation site
sequence
SO:0001935
H3K20_trimethylation_site
A kind of histone modification site, whereby the 20th residue (a lysine), from the start of the H3 protein is tri-methylated.
EBI:nj
A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is acetylated.
kareneilbeck
2013-03-06T10:16:55Z
H3K36 acetylation site
H3K36ac
sequence
SO:0001936
H3K36_acetylation_site
A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is acetylated.
EBI:nj
A kind of histone modification site, whereby the 12th residue (a lysine), from the start of the H2B protein is acetylated.
kareneilbeck
2013-03-06T10:19:13Z
H2BK12 acetylation site
H2BK12ac
sequence
SO:0001937
H2BK12_acetylation_site
A kind of histone modification site, whereby the 12th residue (a lysine), from the start of the H2B protein is acetylated.
EBI:nj
A kind of histone modification site, whereby the 5th residue (a lysine), from the start of the H2A histone protein is acetylated.
kareneilbeck
2013-03-06T10:20:57Z
H2AK5 acetylation site
H2AK5ac
sequence
SO:0001938
H2AK5_acetylation_site
A kind of histone modification site, whereby the 5th residue (a lysine), from the start of the H2A histone protein is acetylated.
EBI:nj
A kind of histone modification site, whereby the 12th residue (a lysine), from the start of the H4 histone protein is acetylated.
kareneilbeck
2013-03-06T10:26:15Z
H4K12 acetylation site
H4K12ac
sequence
SO:0001939
H4K12_acetylation_site
A kind of histone modification site, whereby the 12th residue (a lysine), from the start of the H4 histone protein is acetylated.
EBI:nj
A kind of histone modification site, whereby the 120th residue (a lysine), from the start of the H2B histone protein is acetylated.
kareneilbeck
2013-03-06T10:28:38Z
H2BK120 acetylation site
H2BK120ac
sequence
SO:0001940
H2BK120_acetylation_site
A kind of histone modification site, whereby the 120th residue (a lysine), from the start of the H2B histone protein is acetylated.
EBI:nj
http://dx.doi.org/10.4161/epi.6.5.15623
A kind of histone modification site, whereby the 91st residue (a lysine), from the start of the H4 histone protein is acetylated.
kareneilbeck
2013-03-06T10:41:04Z
H4K91 acetylation site
H4K91ac
sequence
SO:0001941
H4K91_acetylation_site
A kind of histone modification site, whereby the 91st residue (a lysine), from the start of the H4 histone protein is acetylated.
EBI:nj
A kind of histone modification site, whereby the 20th residue (a lysine), from the start of the H2B histone protein is acetylated.
kareneilbeck
2013-03-06T10:44:31Z
H2BK20 acetylation site
H2BK20ac
sequence
SO:0001942
H2BK20_acetylation_site
A kind of histone modification site, whereby the 20th residue (a lysine), from the start of the H2B histone protein is acetylated.
EBI:nj
A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H3 histone protein is acetylated.
kareneilbeck
2013-03-06T10:46:32Z
H3K4 acetylation site
H3K4ac
sequence
SO:0001943
H3K4_acetylation_site
A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H3 histone protein is acetylated.
EBI:nj
A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H2A histone protein is acetylated.
kareneilbeck
2013-03-06T10:48:11Z
H2AK9 acetylation site
H2AK9ac
sequence
SO:0001944
H2AK9_acetylation_site
A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H2A histone protein is acetylated.
EBI:nj
A kind of histone modification site, whereby the 56th residue (a lysine), from the start of the H3 histone protein is acetylated.
kareneilbeck
2013-03-06T10:51:14Z
H3K56 acetylation site
H3K56ac
sequence
SO:0001945
H3K56_acetylation_site
A kind of histone modification site, whereby the 56th residue (a lysine), from the start of the H3 histone protein is acetylated.
EBI:nj
A kind of histone modification site, whereby the 15th residue (a lysine), from the start of the H2B histone protein is acetylated.
kareneilbeck
2013-03-06T10:53:23Z
H2BK15 acetylation site
H2BK15ac
sequence
SO:0001946
H2BK15_acetylation_site
A kind of histone modification site, whereby the 15th residue (a lysine), from the start of the H2B histone protein is acetylated.
EBI:nj
A kind of histone modification site, whereby the 2nd residue (an arginine), from the start of the H3 protein is mono-methylated.
kareneilbeck
2013-03-06T10:57:13Z
H3R2me1
H3R2 monomethylation site
sequence
SO:0001947
H3R2_monomethylation_site
A kind of histone modification site, whereby the 2nd residue (an arginine), from the start of the H3 protein is mono-methylated.
EBI:nj
A kind of histone modification site, whereby the 2nd residue (an arginine), from the start of the H3 protein is di-methylated.
kareneilbeck
2013-03-06T10:59:17Z
H3R2 dimethylation site
H3R2me2
sequence
SO:0001948
H3R2_dimethylation_site
A kind of histone modification site, whereby the 2nd residue (an arginine), from the start of the H3 protein is di-methylated.
EBI:nj
A kind of histone modification site, whereby the 3rd residue (an arginine), from the start of the H4 protein is di-methylated.
kareneilbeck
2013-03-06T11:01:27Z
H4R3 dimethylation site
H4R3me2
sequence
SO:0001949
H4R3_dimethylation_site
A kind of histone modification site, whereby the 3rd residue (an arginine), from the start of the H4 protein is di-methylated.
EBI:nj
A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H4 protein is tri-methylated.
kareneilbeck
2013-03-06T11:03:29Z
H4K4me3
H4K4 trimethylation site
sequence
SO:0001950
H4K4_trimethylation_site
A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H4 protein is tri-methylated.
EBI:nj
A kind of histone modification site, whereby the 23rd residue (a lysine), from the start of the H3 protein is di-methylated.
kareneilbeck
2013-03-06T11:05:33Z
H3K23 dimethylation site
H3K23me2
sequence
SO:0001951
H3K23_dimethylation_site
A kind of histone modification site, whereby the 23rd residue (a lysine), from the start of the H3 protein is di-methylated.
EBI:nj
A region immediately adjacent to a promoter which may or may not contain transcription factor binding sites.
kareneilbeck
2013-03-06T11:36:25Z
promoter flanking region
sequence
SO:0001952
promoter_flanking_region
A region immediately adjacent to a promoter which may or may not contain transcription factor binding sites.
EBI:nj
A region of DNA sequence formed from the ligation of two sticky ends where the palindrome is broken and no longer comprises the recognition site and thus cannot be re-cut by the restriction enzymes used to create the sticky ends.
kareneilbeck
2013-03-06T03:18:11Z
sequence
SO:0001953
restriction_enzyme_assembly_scar
A region of DNA sequence formed from the ligation of two sticky ends where the palindrome is broken and no longer comprises the recognition site and thus cannot be re-cut by the restriction enzymes used to create the sticky ends.
SO:ke
A region related to restriction enzyme function.
kareneilbeck
2013-03-06T03:23:34Z
sequence
restriction enzyme region
SO:0001954
Not a great term for annotation, but used to classify the various regions related to restriction enzymes.
restriction_enzyme_region
A region related to restriction enzyme function.
SO:ke
A polypeptide region that provides structure in a protein that affects the stability of the protein.
kareneilbeck
2013-03-06T03:32:47Z
sequence
protein stability element
SO:0001955
protein_stability_element
A polypeptide region that provides structure in a protein that affects the stability of the protein.
SO:ke
A polypeptide_region that codes for a protease cleavage site.
kareneilbeck
2013-03-06T03:36:28Z
protease site
sequence
SO:0001956
protease_site
A polypeptide_region that codes for a protease cleavage site.
SO:ke
RNA secondary structure that affects the stability of an RNA molecule.
kareneilbeck
2013-03-06T03:38:35Z
sequence
rna stability element
SO:0001957
RNA_stability_element
true
RNA secondary structure that affects the stability of an RNA molecule.
SO:ke
A kind of intron whereby the excision is driven by lariat formation.
kareneilbeck
2013-03-07T10:58:40Z
lariat intron
sequence
SO:0001958
Requested by PomBase 3604508.
lariat_intron
A kind of intron whereby the excision is driven by lariat formation.
SO:ke
A cis-regulatory element with conserved sequence YYC+1TTTYY that spans -2 to +6 relative to the +1 TSS. It is present in most ribosomal protein genes in Drosophila and mammals but not in the yeast Saccharomyces cerevisiae. It resembles the initiator (TCAKTY in Drosophila) but is functionally distinct from it.
kareneilbeck
2013-05-17T04:38:48Z
TCT element
polypyrimidine initiator
sequence
SO:0001959
TCT_motif
A cis-regulatory element with conserved sequence YYC+1TTTYY that spans -2 to +6 relative to the +1 TSS. It is present in most ribosomal protein genes in Drosophila and mammals but not in the yeast Saccharomyces cerevisiae. It resembles the initiator (TCAKTY in Drosophila) but is functionally distinct from it.
PMID:20801935
SO:myl
A modified DNA cytosine base feature, modified by a hydroxymethyl group at the 5 carbon.
kareneilbeck
2013-05-17T05:05:31Z
http://www.pacificbiosciences.com/pdf/WP_Detecting_DNA_Base_Modifications_Using_SMRT_Sequencing.pdf
5-hmC
5-hydroxymethylcytosine
sequence
SO:0001960
5_hydroxymethylcytosine
A modified DNA cytosine base feature, modified by a hydroxymethyl group at the 5 carbon.
SO:ke
A modified DNA cytosine base feature, modified by a formyl group at the 5 carbon.
kareneilbeck
2013-05-17T05:06:13Z
http://www.pacificbiosciences.com/pdf/WP_Detecting_DNA_Base_Modifications_Using_SMRT_Sequencing.pdf
5-fC
5-formylcytosine
sequence
SO:0001961
5_formylcytosine
A modified DNA cytosine base feature, modified by a formyl group at the 5 carbon.
SO:ke
A modified adenine DNA base feature.
kareneilbeck
2013-05-20T01:22:30Z
sequence
SO:0001962
modified_adenine
A modified adenine DNA base feature.
SO:ke
A modified cytosine DNA base feature.
kareneilbeck
2013-05-20T01:23:47Z
sequence
SO:0001963
modified_cytosine
A modified cytosine DNA base feature.
SO:ke
A modified guanine DNA base feature.
kareneilbeck
2013-05-20T01:25:31Z
sequence
SO:0001964
modified_guanine
A modified guanine DNA base feature.
SO:ke
A modified DNA guanine base, at the 8 carbon, often the product of DNA damage.
kareneilbeck
2013-05-20T01:27:51Z
http://www.pacificbiosciences.com/pdf/WP_Detecting_DNA_Base_Modifications_Using_SMRT_Sequencing.pdf
8-oxoG
8-oxoguanine
sequence
SO:0001965
8_oxoguanine
A modified DNA guanine base, at the 8 carbon, often the product of DNA damage.
SO:ke
A modified DNA cytosine base feature, modified by a carboxy group at the 5 carbon.
kareneilbeck
2013-05-20T01:30:01Z
http://www.pacificbiosciences.com/pdf/WP_Detecting_DNA_Base_Modifications_Using_SMRT_Sequencing.pdf
5-caC
5-carboxycytosine
sequence
SO:0001966
5_carboxylcytosine
A modified DNA cytosine base feature, modified by a carboxy group at the 5 carbon.
SO:ke
A modified DNA adenine base, at the 8 carbon, often the product of DNA damage.
kareneilbeck
2013-05-20T01:31:05Z
http://www.pacificbiosciences.com/pdf/WP_Detecting_DNA_Base_Modifications_Using_SMRT_Sequencing.pdf
8-oxoA
8-oxoadenine
sequence
SO:0001967
8_oxoadenine
A modified DNA adenine base, at the 8 carbon, often the product of DNA damage.
SO:ke
A transcript variant of a protein coding gene.
kareneilbeck
2013-05-22T04:34:49Z
Jannovar:coding_transcript_variant
coding transcript variant
sequence
SO:0001968
coding_transcript_variant
A transcript variant of a protein coding gene.
SO:ke
Jannovar:coding_transcript_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
A transcript variant occurring within an intron of a coding transcript.
kareneilbeck
2013-05-23T10:54:17Z
Jannovar:coding_transcript_intron_variant
coding sequence intron variant
sequence
SO:0001969
coding_transcript_intron_variant
A transcript variant occurring within an intron of a coding transcript.
SO:ke
Jannovar:coding_transcript_intron_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
A transcript variant occurring within an intron of a non coding transcript.
kareneilbeck
2013-05-23T10:55:03Z
non coding transcript intron variant
ANNOVAR:ncRNA_intronic
Jannovar:non_coding_transcript_intron_variant
sequence
SO:0001970
non_coding_transcript_intron_variant
A transcript variant occurring within an intron of a non coding transcript.
SO:ke
ANNOVAR:ncRNA_intronic
Jannovar:non_coding_transcript_intron_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
A binding site to which a polypeptide will bind with a zinc finger motif, which is characterized by requiring one or more Zinc 2+ ions for stabilized folding.
kareneilbeck
2013-07-29T04:41:53Z
zinc finger binding site
zinc_fing
sequence
SO:0001971
zinc_finger_binding_site
zinc_fing
A histone 4 modification where the modification is the acetylation of the residue.
kareneilbeck
2013-07-30T10:43:04Z
H4ac
histone 4 acetylation site
sequence
SO:0001972
histone_4_acetylation_site
A histone 4 modification where the modification is the acetylation of the residue.
EBI:nj
ISBN:0815341059
SO:ke
A histone 3 modification where the modification is the acetylation of the residue.
kareneilbeck
2013-07-30T10:46:42Z
H3ac
histone 3 acetylation site
sequence
SO:0001973
histone_3_acetylation_site
A histone 3 modification where the modification is the acetylation of the residue.
EBI:nj
ISBN:0815341059
SO:ke
A transcription factor binding site with consensus sequence CCGCGNGGNGGCAG, bound by CCCTC-binding factor.
kareneilbeck
2013-07-30T10:59:11Z
CCCTF binding site
CTCF binding site
sequence
SO:0001974
CTCF_binding_site
A transcription factor binding site with consensus sequence CCGCGNGGNGGCAG, bound by CCCTC-binding factor.
EBI:nj
A restriction enzyme recognition site that, when cleaved, results in 5 prime overhangs.
kareneilbeck
2013-07-30T11:32:16Z
five prime sticky end restriction enzyme cleavage site
sequence
SO:0001975
Requested by Jackie Quinn. The sticky restriction sites are different from junctions because they include the sequence that is cut, inclusive of the five prime junction and the three prime junction.
five_prime_sticky_end_restriction_enzyme_cleavage_site
A restriction enzyme recognition site that, when cleaved, results in 5 prime overhangs.
SO:ke
A restriction enzyme recognition site that, when cleaved, results in 3 prime overhangs.
kareneilbeck
2013-07-30T11:37:19Z
three prime sticky end restriction enzyme cleavage site
sequence
SO:0001976
Requested by Jackie Quinn. The sticky restriction sites are different from junctions because they include the sequence that is cut, inclusive of the five prime junction and the three prime junction.
three_prime_sticky_end_restriction_enzyme_cleavage_site
A restriction enzyme recognition site that, when cleaved, results in 3 prime overhangs.
SO:ke
A region of a transcript encoding the cleavage site for a ribonuclease enzyme.
kareneilbeck
2013-07-30T11:41:06Z
ribonuclease site
sequence
SO:0001977
ribonuclease_site
A region of a transcript encoding the cleavage site for a ribonuclease enzyme.
SO:ke
A region of sequence where developer information is encoded.
kareneilbeck
2013-07-30T11:49:22Z
DNA signature
sequence
SO:0001978
Requested by Jackie Quinn for use in synthetic biology.
signature
A region of sequence where developer information is encoded.
SO:ke
A motif that affects the stability of RNA.
kareneilbeck
2013-07-30T03:33:53Z
RNA stability element
sequence
SO:0001979
RNA_stability_element
A motif that affects the stability of RNA.
PMID:22495308
SO:ke
A regulatory promoter element identified in mutation experiments, with consensus sequence: CACGTG. Present in promoters, intergenic regions, coding regions, and introns. They are involved in gene expression responses to light and interact with G-box binding factor and I-box binding factor 1a.
kareneilbeck
2013-07-30T04:00:50Z
G-box
GBF binding sequence
sequence
SO:0001980
A plant specific region.
G_box
A regulatory promoter element identified in mutation experiments, with consensus sequence: CACGTG. Present in promoters, intergenic regions, coding regions, and introns. They are involved in gene expression responses to light and interact with G-box binding factor and I-box binding factor 1a.
PMID:19249238
PMID:8571452
SO:ml
An orientation dependent regulatory promoter element, with consensus sequence of TTGCACAN4TTGCACA, found in plants.
kareneilbeck
2013-07-30T04:12:19Z
L-box
L-box promoter element
sequence
SO:0001981
L_box
An orientation dependent regulatory promoter element, with consensus sequence of TTGCACAN4TTGCACA, found in plants.
PMID:17381552
PMID:2902624
SO:ml
A plant regulatory promoter motif, composed of a highly conserved hexamer GATAAG (I-box core).
kareneilbeck
2013-07-30T04:17:55Z
I-box promoter motif
sequence
SO:0001982
I-box
A plant regulatory promoter motif, composed of a highly conserved hexamer GATAAG (I-box core).
PMID:2347304
PMID:2902624
SO:ml
A 5' UTR variant where a premature start codon is introduced, moved or lost.
kareneilbeck
2013-07-30T04:36:25Z
5' UTR premature start codon variant
sequence
SO:0001983
Requested by Andy Menzies at the Sanger. This isn't necessarily a protein coding change. A premature start codon can affect the production of a mature protein product by providing a competing translation start point. Some genes balance their expression this way, e.g. THPO requires the presence of a premature start to limit expression; its loss leads to familial thrombocythemia.
5_prime_UTR_premature_start_codon_variant
A 5' UTR variant where a premature start codon is introduced, moved or lost.
SANGER:am
A gene cassette array that corresponds to a silenced version of a mating type region.
kareneilbeck
2013-07-31T02:40:38Z
sequence
silent mating-type cassette
SO:0001984
silent_mating_type_cassette_array
A gene cassette array that corresponds to a silenced version of a mating type region.
PomBase:mah
Any of the DNA segments produced by discontinuous synthesis of the lagging strand during DNA replication.
kareneilbeck
2013-07-31T02:57:55Z
Okazaki fragment
sequence
SO:0001985
Requested by Midori Harris, 2013.
Okazaki_fragment
Any of the DNA segments produced by discontinuous synthesis of the lagging strand during DNA replication.
ISBN:0805350152
A feature variant, where the alteration occurs upstream of the transcript TSS.
kareneilbeck
2013-07-31T03:46:14Z
upstream transcript variant
sequence
SO:0001986
Requested by Graham Ritchie, EBI/Sanger.
upstream_transcript_variant
A feature variant, where the alteration occurs upstream of the transcript TSS.
EBI:gr
A feature variant, where the alteration occurs downstream of the transcript termination site.
kareneilbeck
2013-07-31T03:47:51Z
downstream transcript variant
sequence
SO:0001987
Requested by Graham Ritchie, EBI/Sanger.
downstream_transcript_variant
A 5' UTR variant where a premature start codon is gained.
kareneilbeck
2013-07-31T03:53:06Z
http://snpeff.sourceforge.net/SnpEff_manual.html
5 prime UTR premature start codon gain variant
Jannovar:5_prime_UTR_premature_start_codon_gain_variant
snpEff:START_GAINED
sequence
SO:0001988
5_prime_UTR_premature_start_codon_gain_variant
A 5' UTR variant where a premature start codon is gained.
SANGER:am
Jannovar:5_prime_UTR_premature_start_codon_gain_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
snpEff:START_GAINED
A 5' UTR variant where a premature start codon is lost.
kareneilbeck
2013-07-31T03:56:48Z
sequence
SO:0001989
5_prime_UTR_premature_start_codon_loss_variant
A 5' UTR variant where a premature start codon is lost.
SANGER:am
A 5' UTR variant where a premature start codon is moved.
kareneilbeck
2013-07-31T03:57:47Z
sequence
SO:0001990
five_prime_UTR_premature_start_codon_location_variant
A 5' UTR variant where a premature start codon is moved.
SANGER:am
A consensus AFLP fragment is an AFLP sequence produced from any alignment algorithm which uses assembled multiple AFLP sequences as input.
kareneilbeck
2013-09-24T10:43:41Z
consensus AFLP fragment
consensus amplified fragment length polymorphism fragment
sequence
SO:0001991
Requested by Bayer Cropscience September, 2013.
consensus_AFLP_fragment
A consensus AFLP fragment is an AFLP sequence produced from any alignment algorithm which uses assembled multiple AFLP sequences as input.
GMOD:ea
A non-synonymous variant is an inframe, protein altering variant, resulting in a codon change.
kareneilbeck
2013-10-16T11:47:51Z
non_synonymous_coding
nonsynonymous variant
sequence
SO:0001992
nonsynonymous_variant
A non-synonymous variant is an inframe, protein altering variant, resulting in a codon change.
SO:ke
non_synonymous_coding
http://ensembl.org/info/docs/variation/index.html
Intronic positions associated with cis-splicing. Contains the first and second positions immediately before the exon and the first, second and fifth positions immediately after.
kareneilbeck
2014-01-04T06:20:00Z
extended cis splice site
sequence
SO:0001993
Added by Andy Menzies (Sanger).
extended_cis_splice_site
Intronic positions associated with cis-splicing. Contains the first and second positions immediately before the exon and the first, second and fifth positions immediately after.
SANGER:am
Fifth intronic position after the intron exon boundary, close to the 5' edge of the intron.
kareneilbeck
2014-01-04T06:26:02Z
intron base 5
sequence
SO:0001994
intron_base_5
Fifth intronic position after the intron exon boundary, close to the 5' edge of the intron.
SANGER:am
A sequence variant occurring in the intron, within 10 bases of an exon.
kareneilbeck
2014-01-04T06:37:27Z
extended intronic splice region variant
sequence
SO:0001995
Added by Andy Menzies (Sanger).
extended_intronic_splice_region_variant
A sequence variant occurring in the intron, within 10 bases of an exon.
SANGER:am
Region of intronic sequence within 10 bases of an exon.
kareneilbeck
2014-01-04T06:41:23Z
extended intronic splice region
sequence
SO:0001996
extended_intronic_splice_region
Region of intronic sequence within 10 bases of an exon.
SANGER:am
A heterochromatic region of the chromosome, adjacent to the telomere (on the centromeric side), that contains repetitive DNA and sometimes genes, and is transcribed.
kareneilbeck
2014-01-05T07:02:01Z
sequence
SO:0001997
subtelomere
A heterochromatic region of the chromosome, adjacent to the telomere (on the centromeric side), that contains repetitive DNA and sometimes genes, and is transcribed.
POMBE:al
A small RNA oligo, typically about 20 bases, that guides the Cas nuclease to a target DNA sequence in the CRISPR/Cas mutagenesis method.
kareneilbeck
2014-01-05T07:25:08Z
small guide RNA
sequence
gRNA
guide RNA
SO:0001998
sgRNA
A small RNA oligo, typically about 20 bases, that guides the Cas nuclease to a target DNA sequence in the CRISPR/Cas mutagenesis method.
PMID:23934893
DNA motif that is a component of a mating type region.
kareneilbeck
2014-01-05T07:30:17Z
mating type region motif
sequence
SO:0001999
mating_type_region_motif
DNA motif that is a component of a mating type region.
SO:ke
true
A segment of non-homology between a and alpha mating alleles, found at all three mating loci (HML, MAT, and HMR); it has two forms (Ya and Yalpha).
kareneilbeck
2014-01-05T07:33:30Z
Y-region
sequence
SO:0002001
Requested by Janos Demeter, SGD.
Y_region
A segment of non-homology between a and alpha mating alleles, found at all three mating loci (HML, MAT, and HMR); it has two forms (Ya and Yalpha).
SGD:jd
A mating type region motif, one of two segments of homology found at all three mating loci (HML, MAT, and HMR).
kareneilbeck
2014-01-05T07:34:59Z
Z1-region
sequence
SO:0002002
Requested by Janos Demeter, SGD.
Z1_region
A mating type region motif, one of two segments of homology found at all three mating loci (HML, MAT, and HMR).
SGD:jd
A mating type region motif, the rightmost segment of homology in the HML and MAT mating loci (not present in HMR).
kareneilbeck
2014-01-05T07:36:45Z
Z2-segment
sequence
SO:0002003
Requested by Janos Demeter, SGD.
Z2_region
A mating type region motif, the rightmost segment of homology in the HML and MAT mating loci (not present in HMR).
SGD:jd
The ACS is an 11-bp sequence of the form 5'-WTTTAYRTTTW-3' which is at the core of every yeast ARS, and is necessary but not sufficient for recognition and binding by the origin recognition complex (ORC). Functional ARSs require an ACS, as well as other cis elements in the 5' (C domain) and 3' (B domain) flanking sequences of the ACS.
kareneilbeck
2014-01-05T07:47:48Z
ACS
ARS consensus sequence
sequence
SO:0002004
ARS_consensus_sequence
The ACS is an 11-bp sequence of the form 5'-WTTTAYRTTTW-3' which is at the core of every yeast ARS, and is necessary but not sufficient for recognition and binding by the origin recognition complex (ORC). Functional ARSs require an ACS, as well as other cis elements in the 5' (C domain) and 3' (B domain) flanking sequences of the ACS.
SGD:jd
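The consensus given in the ARS_consensus_sequence definition can be scanned for with a regular expression. The following is a minimal sketch, assuming the standard IUPAC ambiguity codes (W = A/T, Y = C/T, R = A/G); the names `IUPAC`, `consensus_to_regex`, and `find_acs_matches` are hypothetical and not part of the ontology. As the definition notes, a consensus match is necessary but not sufficient for ORC binding, so hits are candidates only.

```python
import re

# Standard IUPAC nucleotide ambiguity codes (only those needed here).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "Y": "[CT]", "R": "[AG]",
}

# Consensus taken from the ARS_consensus_sequence definition.
ACS_CONSENSUS = "WTTTAYRTTTW"


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression string."""
    return "".join(IUPAC[base] for base in consensus)


def find_acs_matches(sequence):
    """Return 0-based start positions of consensus matches on the given strand.

    A lookahead is used so that overlapping matches are also reported.
    The reverse strand is not scanned in this sketch.
    """
    pattern = re.compile("(?=(" + consensus_to_regex(ACS_CONSENSUS) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    print(find_acs_matches("GGATTTATGTTTAGG"))
```

To cover both strands, the same function could be applied to the reverse complement of the input.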
The determinant of selective removal (DSR) motif consists of repeats of U(U/C)AAAC. The motif targets meiotic transcripts for removal during mitosis via the exosome.
kareneilbeck
2014-01-05T07:51:27Z
DSR motif
sequence
SO:0002005
Requested by Antonia Lock (Pombe).
DSR_motif
The determinant of selective removal (DSR) motif consists of repeats of U(U/C)AAAC. The motif targets meiotic transcripts for removal during mitosis via the exosome.
PMID:22645662
A promoter element that has the consensus sequence GNMGATC, and is found in promoters of genes repressed in the presence of zinc.
kareneilbeck
2014-01-05T09:23:27Z
zinc repressed element
sequence
SO:0002006
This element is bound by Loz1 in S. pombe. The paper does not name the element. This term was requested by Midori Harris, for Pombe.
zinc_repressed_element
A promoter element that has the consensus sequence GNMGATC, and is found in promoters of genes repressed in the presence of zinc.
PMID:24003116
POMBE:mh
An MNV is a multiple nucleotide variant (substitution) in which the inserted sequence is the same length as the replaced sequence.
kareneilbeck
2014-01-13T03:48:40Z
multiple nucleotide substitution
multiple nucleotide variant
sequence
SO:0002007
MNV
An MNV is a multiple nucleotide variant (substitution) in which the inserted sequence is the same length as the replaced sequence.
NCBI:th
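The length constraint in the MNV definition can be expressed as a simple check on reference and alternate alleles. This is a minimal sketch; the function name `is_mnv` is hypothetical, and it assumes that "multiple" means more than one base and that the two alleles must actually differ.

```python
def is_mnv(ref, alt):
    """Return True when alt replaces ref with a different sequence of equal length.

    Assumptions (not stated in the definition): alleles are plain base strings,
    "multiple" means a length greater than one, and ref must differ from alt.
    """
    ref = ref.upper()
    alt = alt.upper()
    return len(ref) == len(alt) and len(ref) > 1 and ref != alt


if __name__ == "__main__":
    print(is_mnv("AC", "GT"))  # equal length, two bases
    print(is_mnv("A", "G"))    # single base, so not "multiple"
    print(is_mnv("AC", "G"))   # lengths differ
```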
A sequence variant whereby at least one base of a codon encoding a rare amino acid is changed, resulting in a different encoded amino acid.
kareneilbeck
2014-03-24T02:24:01Z
http://snpeff.sourceforge.net/SnpEff_manual.html
Jannovar:rare_amino_acid_variant
rare amino acid variant
snpEff:RARE_AMINO_ACID
sequence
SO:0002008
Request from Uma Devi Paila, UVA. Variants in the sites of rare amino acids e.g. Selenocysteine. These are important impact terms since a loss of such rare amino acids may lead to a loss of function.
rare_amino_acid_variant
A sequence variant whereby at least one base of a codon encoding a rare amino acid is changed, resulting in a different encoded amino acid.
SO:ke
Jannovar:rare_amino_acid_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
snpEff:RARE_AMINO_ACID
A sequence variant whereby at least one base of a codon encoding selenocysteine is changed, resulting in a different encoded amino acid.
kareneilbeck
2014-03-24T02:29:44Z
selenocysteine loss
sequence
SO:0002009
Request from Uma Devi Paila, UVA. Variants in the sites of rare amino acids e.g. Selenocysteine. These are important impact terms since a loss of such rare amino acids may lead to a loss of function.
selenocysteine_loss
A sequence variant whereby at least one base of a codon encoding selenocysteine is changed, resulting in a different encoded amino acid.
SO:ke
A sequence variant whereby at least one base of a codon encoding pyrrolysine is changed, resulting in a different encoded amino acid.
kareneilbeck
2014-03-24T02:30:16Z
pyrrolysine loss
sequence
SO:0002010
Request from Uma Devi Paila, UVA. Variants in the sites of rare amino acids e.g. Selenocysteine. These are important impact terms since a loss of such rare amino acids may lead to a loss of function.
pyrrolysine_loss
A sequence variant whereby at least one base of a codon encoding pyrrolysine is changed, resulting in a different encoded amino acid.
SO:ke
A variant that occurs within a gene but falls outside of all transcript features. This occurs when alternate transcripts of a gene do not share overlapping sequence.
kareneilbeck
2014-03-24T02:33:13Z
http://snpeff.sourceforge.net/SnpEff_manual.html
Jannovar:intragenic_variant
intragenic variant
snpEff:INTRAGENIC
sequence
SO:0002011
Requested by Pablo Cingolani, for use in SnpEff.
intragenic_variant
A variant that occurs within a gene but falls outside of all transcript features. This occurs when alternate transcripts of a gene do not share overlapping sequence.
SO:ke
Jannovar:intragenic_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
snpEff:INTRAGENIC
A codon variant that changes at least one base of the canonical start codon.
kareneilbeck
2014-03-24T02:41:28Z
http://snpeff.sourceforge.net/SnpEff_manual.html
http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
Jannovar:start_lost
VEP:start_lost
snpEff:START_LOST
sequence
SO:0002012
Request from Uma Devi Paila, UVA. This term should not be applied to incomplete transcripts.
start_lost
A codon variant that changes at least one base of the canonical start codon.
SO:ke
Jannovar:start_lost
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
VEP:start_lost
snpEff:START_LOST
A sequence variant that causes the reduction of the 5' UTR with regard to the reference sequence.
kareneilbeck
2014-03-25T10:46:42Z
http://snpeff.sourceforge.net/SnpEff_manual.html
5 prime UTR truncation
Jannovar:5_prime_utr_truncation
snpEff:UTR_5_DELETED
sequence
SO:0002013
5_prime_UTR_truncation
A sequence variant that causes the reduction of the 5' UTR with regard to the reference sequence.
SO:ke
Jannovar:5_prime_utr_truncation
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
snpEff:UTR_5_DELETED
A sequence variant that causes the extension of the 5' UTR with regard to the reference sequence.
kareneilbeck
2014-03-25T10:48:26Z
5 prime UTR elongation
sequence
SO:0002014
5_prime_UTR_elongation
A sequence variant that causes the extension of the 5' UTR with regard to the reference sequence.
SO:ke
A sequence variant that causes the reduction of the 3' UTR with regard to the reference sequence.
kareneilbeck
2014-03-25T10:54:50Z
http://snpeff.sourceforge.net/SnpEff_manual.html
3 prime UTR truncation
Jannovar:3_prime_utr_truncation
snpEff:UTR_3_DELETED
sequence
SO:0002015
3_prime_UTR_truncation
A sequence variant that causes the reduction of the 3' UTR with regard to the reference sequence.
SO:ke
Jannovar:3_prime_utr_truncation
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
snpEff:UTR_3_DELETED
A sequence variant that causes the extension of the 3' UTR with regard to the reference sequence.
kareneilbeck
2014-03-25T10:55:33Z
3 prime UTR elongation
sequence
SO:0002016
3_prime_UTR_elongation
A sequence variant that causes the extension of the 3' UTR with regard to the reference sequence.
SO:ke
A sequence variant located in a conserved intergenic region, between genes.
kareneilbeck
2014-03-25T02:54:39Z
http://snpeff.sourceforge.net/SnpEff_manual.html
Jannovar:conserved_intergenic_variant
conserved intergenic variant
snpEff:INTERGENIC_CONSERVED
sequence
SO:0002017
Requested by Uma Paila (UVA) for snpEff.
conserved_intergenic_variant
A sequence variant located in a conserved intergenic region, between genes.
SO:ke
Jannovar:conserved_intergenic_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
snpEff:INTERGENIC_CONSERVED
A transcript variant occurring within a conserved region of an intron.
kareneilbeck
2014-03-25T02:58:41Z
http://snpeff.sourceforge.net/SnpEff_manual.html
Jannovar:conserved_intron_variant
conserved intron variant
snpEff:INTRON_CONSERVED
sequence
SO:0002018
Requested by Uma Paila (UVA) for snpEff.
conserved_intron_variant
A transcript variant occurring within a conserved region of an intron.
SO:ke
Jannovar:conserved_intron_variant
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
snpEff:INTRON_CONSERVED
A sequence variant where at least one base in the start codon is changed, but the start remains.
kareneilbeck
2014-03-28T10:08:41Z
snpEff:SYNONYMOUS_START
sequence
SO:0002019
Requested by Uma Paila as this term is annotated by snpEff. This would be used for non_AUG start codon annotation.
start_retained_variant
A sequence variant where at least one base in the start codon is changed, but the start remains.
SO:ke
snpEff:SYNONYMOUS_START
Boundary elements are DNA motifs that prevent heterochromatin from spreading into neighboring euchromatic regions.
kareneilbeck
2014-05-30T14:45:37Z
boundary element
sequence
insulator
SO:0002020
Requested by Antonia Lock. Insulator is included as a related synonym since this is used to refer to insulator in the literature (NCBI:cf).
boundary_element
Boundary elements are DNA motifs that prevent heterochromatin from spreading into neighboring euchromatic regions.
PMID:24013502
A DNA motif that is found in eukaryotic rDNA repeats, and is a site of replication fork pausing.
kareneilbeck
2014-05-30T14:57:26Z
mating type region replication fork barrier
sequence
SO:0002021
Requested by Midori Harris.
mating_type_region_replication_fork_barrier
A DNA motif that is found in eukaryotic rDNA repeats, and is a site of replication fork pausing.
PMID:17614787
A small RNA molecule, 22-23 nt in size, that is the product of a longer RNA. The production of priRNAs is independent of dicer and involves binding of RNA by argonaute and trimming by triman. In fission yeast, priRNAs trigger the establishment of heterochromatin. PriRNAs are primarily generated from centromeric transcripts (dg and dh repeats), but may also be produced from degradation products of primary transcripts.
kareneilbeck
2014-05-30T15:01:24Z
primal small RNA
sequence
SO:0002022
priRNA
A small RNA molecule, 22-23 nt in size, that is the product of a longer RNA. The production of priRNAs is independent of dicer and involves binding of RNA by argonaute and trimming by triman. In fission yeast, priRNAs trigger the establishment of heterochromatin. PriRNAs are primarily generated from centromeric transcripts (dg and dh repeats), but may also be produced from degradation products of primary transcripts.
PMID:20178743
PMID:24095277
PomBase:al
A nucleic acid tag used in a ligation step of the library preparation process to allow pooling of samples and creation of a multiplexed library, while maintaining the ability to identify individual source material.
kareneilbeck
2014-05-30T15:13:16Z
multiplexing sequence identifier
sequence
SO:0002023
multiplexing_sequence_identifier
A nucleic acid tag used in a ligation step of the library preparation process to allow pooling of samples and creation of a multiplexed library, while maintaining the ability to identify individual source material.
OBO:prs
PMID:22574170
The leftmost segment of homology in the HML and MAT mating loci, but not present in HMR.
kareneilbeck
2014-07-11T13:20:08Z
SO:0002000
W-region
sequence
SO:0002024
Requested by Janos Demeter, SGD.
W_region
The leftmost segment of homology in the HML and MAT mating loci, but not present in HMR.
SGD:jd
A genome region where pairing occurs preferentially during homologous chromosome pairing in early prophase of meiosis I.
kareneilbeck
2014-07-14T11:40:34Z
cis-acting homologous chromosome pairing region
sequence
SO:0002025
An example of this is the Sme2 locus in the fission yeast S. pombe, where it is coincident with a ribonuclear complex termed the "Mei2 dot". This term was requested by Val Wood, PomBase.
cis_acting_homologous_chromosome_pairing_region
A genome region where pairing occurs preferentially during homologous chromosome pairing in early prophase of meiosis I.
PMID:22582262
PMID:23117617
PMID:24173580
PomBase:vw
The nucleotide sequence which encodes the intein portion of the precursor gene.
kareneilbeck
2014-07-14T11:53:21Z
sequence
SO:0002026
Requested by Janos Demeter 2014.
intein_encoding_region
The nucleotide sequence which encodes the intein portion of the precursor gene.
PMID:8165123
A short open reading frame that is found in the 5' untranslated region of an mRNA and plays a role in translational regulation.
kareneilbeck
2014-07-14T11:59:23Z
PMID:26684391
regulatory uORF
upstream ORF
sequence
SO:0002027
uORF
A short open reading frame that is found in the 5' untranslated region of an mRNA and plays a role in translational regulation.
PMID:12890013
PMID:16153175
POMBASE:mah
An open reading frame that encodes a peptide of less than 100 amino acids.
kareneilbeck
2014-07-14T12:02:33Z
smORF
small ORF
sequence
SO:0002028
sORF
An open reading frame that encodes a peptide of less than 100 amino acids.
PMID:23970561
PMID:24705786
POMBASE:mah
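The size threshold in the sORF definition can be checked from the nucleotide length of an open reading frame. This is a minimal sketch; the function name `is_sorf` and the `includes_stop` parameter are hypothetical, and the sketch assumes that the ORF length is a multiple of three and that the stop codon does not count towards the peptide length.

```python
def is_sorf(orf_nt_length, includes_stop=True):
    """Return True when the ORF encodes a peptide of fewer than 100 amino acids.

    Assumptions (not stated in the definition): orf_nt_length is the ORF length
    in nucleotides, and when includes_stop is True the final codon is a stop
    codon that does not contribute an amino acid.
    """
    if orf_nt_length <= 0 or orf_nt_length % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt_length // 3
    amino_acids = codons - 1 if includes_stop else codons
    return 0 < amino_acids < 100


if __name__ == "__main__":
    print(is_sorf(300))  # 100 codons including stop, so 99 amino acids
    print(is_sorf(303))  # 101 codons including stop, so 100 amino acids
```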
A translated ORF encoded entirely within the antisense strand of a known protein coding gene.
kareneilbeck
2014-07-14T12:04:32Z
translated nested antisense gene
sequence
SO:0002029
tnaORF
A translated ORF encoded entirely within the antisense strand of a known protein coding gene.
POMBASE:vw
One of two segments of homology found at all three mating loci (HML, MAT and HMR).
kareneilbeck
2014-07-14T18:43:21Z
X-region
sequence
SO:0002030
X_region
One of two segments of homology found at all three mating loci (HML, MAT and HMR).
SGD:jd
A short hairpin RNA (shRNA) is an RNA transcript that makes a tight hairpin turn that can be used to silence target gene expression via RNA interference.
kareneilbeck
2014-10-23T09:16:29Z
http://en.wikipedia.org/wiki/Small_hairpin_RNA
short hairpin RNA
small hairpin RNA
sequence
SO:0002031
shRNA
A short hairpin RNA (shRNA) is an RNA transcript that makes a tight hairpin turn that can be used to silence target gene expression via RNA interference.
PMID:6699500
SO:ke
http://en.wikipedia.org/wiki/Small_hairpin_RNA
wikipedia
A non-coding transcript encoded by sequences adjacent to the ends of the 5' and 3' miR-encoding sequences that abut the loop in precursor miRNA.
kareneilbeck
2015-01-09T13:57:43Z
microRNA-offset RNA
sequence
SO:0002032
MoRs are generated from miR hairpins that are longer and can produce two functional miR per strand. They are called moRs because they are not located next to the loop and thus their biogenesis process is a little different, but functionally, they are supposed to act like miRs. It is the same for loRs that are the loop fragments, they are generated differently than miRs or moRs but if loaded into the risc they are supposed to act the same way miRs do.
Requested by Thomas Desvignes, Jan 2015.
moR
A non-coding transcript encoded by sequences adjacent to the ends of the 5' and 3' miR-encoding sequences that abut the loop in precursor miRNA.
SO:ke
A short, non-coding transcript of loop-derived sequences encoded in precursor miRNA.
kareneilbeck
2015-01-09T14:02:02Z
loop-origin miRs
sequence
SO:0002033
MoRs are generated from miR hairpins that are longer and can produce two functional miR per strand. They are called moRs because they are not located next to the loop and thus their biogenesis process is a little different, but functionally, they are supposed to act like miRs. It is the same for loRs that are the loop fragments, they are generated differently than miRs or moRs but if loaded into the risc they are supposed to act the same way miRs do.
Requested by Thomas Desvignes, Jan 2015.
loR
A short, non-coding transcript of loop-derived sequences encoded in precursor miRNA.
SO:ke
A snoRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA.
kareneilbeck
2015-01-09T15:02:13Z
miR encoding snoRNA primary transcript
sequence
SO:0002034
miR_encoding_snoRNA_primary_transcript
A snoRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA.
SO:ke
A primary transcript encoding a lncRNA.
kareneilbeck
2015-01-09T15:23:03Z
lncRNA primary transcript
sequence
SO:0002035
lncRNA_primary_transcript
A primary transcript encoding a lncRNA.
SO:ke
A lncRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA.
kareneilbeck
2015-01-09T15:23:48Z
miR encoding lncRNA primary transcript
sequence
SO:0002036
miR_encoding_lncRNA_primary_transcript
A lncRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA.
SO:ke
A tRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA.
kareneilbeck
2015-01-09T15:28:23Z
miR encoding tRNA primary transcript
sequence
SO:0002037
miR_encoding_tRNA_primary_transcript
A tRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA.
SO:ke
A primary transcript encoding an shRNA.
kareneilbeck
2015-01-09T15:30:43Z
shRNA primary transcript
sequence
SO:0002038
shRNA_primary_transcript
A primary transcript encoding an shRNA.
SO:ke
An shRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA.
kareneilbeck
2015-01-09T15:32:00Z
miR encoding shRNA primary transcript
sequence
SO:0002039
miR_encoding_shRNA_primary_transcript
An shRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA.
SO:ke
A primary transcript encoding a vaultRNA.
kareneilbeck
2015-01-09T15:33:33Z
vaultRNA primary transcript
sequence
SO:0002040
vaultRNA_primary_transcript
A primary transcript encoding a vaultRNA.
SO:ke
A vaultRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA.
kareneilbeck
2015-01-09T15:34:32Z
miR encoding vaultRNA primary transcript
sequence
SO:0002041
miR_encoding_vaultRNA_primary_transcript
A vaultRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA.
SO:ke
A primary transcript encoding a Y-RNA.
kareneilbeck
2015-01-09T15:36:51Z
Y-RNA primary transcript
sequence
SO:0002042
Y_RNA_primary_transcript
A primary transcript encoding a Y-RNA.
SO:ke
A Y-RNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA.
kareneilbeck
2015-01-09T15:37:46Z
miR encoding Y-RNA primary transcript
sequence
SO:0002043
miR_encoding_Y_RNA_primary_transcript
A Y-RNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA.
SO:ke
A TCS element is a (yeast) transcription factor binding site, bound by the TEA DNA binding domain (DBD) of transcription factors. The consensus site is CATTCC or CATTCT.
kareneilbeck
2015-02-09T15:02:53Z
TCS element
TEA Consensus Sequence
sequence
SO:0002044
Requested by Rama - SGD.
TCS_element
A TCS element is a (yeast) transcription factor binding site, bound by the TEA DNA binding domain (DBD) of transcription factors. The consensus site is CATTCC or CATTCT.
PMID:1489142
PMID:20118212
SO:ke
A PRE is a (yeast) TFBS with consensus site [TGAAAC(A/G)].
kareneilbeck
2015-02-09T15:05:43Z
PRE
pheromone response element
sequence
SO:0002045
Requested by Rama, SGD.
pheromone_response_element
A PRE is a (yeast) TFBS with consensus site [TGAAAC(A/G)].
PMID:1489142
SO:ke
A FRE is an enhancer element necessary and sufficient to confer filamentation associated expression in S. cerevisiae.
kareneilbeck
2015-02-09T15:09:47Z
filamentation and invasion response element
sequence
SO:0002046
Requested by Rama, SGD.
FRE
A FRE is an enhancer element necessary and sufficient to confer filamentation associated expression in S. cerevisiae.
PMID:1489142
SO:ke
Transcription pause sites are regions of a gene where RNA polymerase may pause during transcription. The functional role of pausing may be to facilitate factor recruitment, RNA folding, and synchronization with translation. Consensus transcription pause sites have been observed in E. coli.
kareneilbeck
2015-02-09T15:32:52Z
transcription pause site
sequence
SO:0002047
Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527.
transcription_pause_site
Transcription pause sites are regions of a gene where RNA polymerase may pause during transcription. The functional role of pausing may be to facilitate factor recruitment, RNA folding, and synchronization with translation. Consensus transcription pause sites have been observed in E. coli.
PMID:24789973
SO:ke
A reading frame that could encode a full-length protein but which contains obvious mid-sequence disablements (frameshifts or premature stop codons).
kareneilbeck
2015-02-09T16:15:46Z
dORF
disabled ORF
sequence
disabled reading frame
SO:0002048
disabled_reading_frame
A reading frame that could encode a full-length protein but which contains obvious mid-sequence disablements (frameshifts or premature stop codons).
SGD:se
A kind of histone modification site, whereby the 27th residue (a lysine) from the start of the H3 histone protein is acetylated.
kareneilbeck
2015-05-14T10:17:11Z
H3K27 acetylation site
H3K27ac
sequence
SO:0002049
Requested by: Sagar Jain, Richard Scheuermann.
H3K27_acetylation_site
A kind of histone modification site, whereby the 27th residue (a lysine) from the start of the H3 histone protein is acetylated.
SO:rs
A promoter that allows for continual transcription of a gene.
kareneilbeck
2015-05-14T10:39:09Z
constitutive promoter
sequence
SO:0002050
constitutive_promoter
A promoter that allows for continual transcription of a gene.
SO:ke
A promoter whereby activity is induced by the presence or absence of biotic or abiotic factors.
kareneilbeck
2015-05-14T10:39:56Z
inducible promoter
sequence
SO:0002051
inducible_promoter
A promoter whereby activity is induced by the presence or absence of biotic or abiotic factors.
SO:ke
A variant where the mutated gene product adversely affects the other (wild type) gene product.
kareneilbeck
2015-05-14T11:16:28Z
dominant negative
dominant negative variant
sequence
SO:0002052
Requested by Deanna Church.
dominant_negative_variant
A variant where the mutated gene product adversely affects the other (wild type) gene product.
SO:ke
A sequence variant whereby new or enhanced function is conferred on the gene product.
kareneilbeck
2015-05-14T11:20:47Z
gain of function variant
sequence
SO:0002053
gain_of_function_variant
A sequence variant whereby new or enhanced function is conferred on the gene product.
SO:ke
A sequence variant whereby the gene product has diminished or abolished function.
kareneilbeck
2015-05-14T11:21:29Z
loss of function variant
sequence
SO:0002054
loss_of_function_variant
A sequence variant whereby the gene product has diminished or abolished function.
SO:ke
A variant whereby the gene product is not functional or the gene product is not produced.
kareneilbeck
2015-05-14T11:21:57Z
null mutation
sequence
SO:0002055
null_mutation
A variant whereby the gene product is not functional or the gene product is not produced.
SO:ke
An intronic splicing regulatory element that functions to recruit trans-acting splicing factors that suppress the transcription of the gene or genes they control.
kareneilbeck
2015-05-14T12:24:10Z
ISS
intronic splicing silencer
sequence
SO:0002056
Requested by Javier Diez Perez.
intronic_splicing_silencer
An intronic splicing regulatory element that functions to recruit trans-acting splicing factors that suppress the transcription of the gene or genes they control.
PMID:23241926
SO:ke
kareneilbeck
2015-05-14T12:28:31Z
ISE
sequence
SO:0002057
intronic_splicing_enhancer
true
An exonic splicing regulatory element that functions to recruit trans-acting splicing factors that suppress the transcription of the gene or genes they control.
kareneilbeck
2015-05-14T12:42:12Z
ESS
exonic splicing silencer
sequence
SO:0002058
Requested by Javier Diez Perez.
exonic_splicing_silencer
An exonic splicing regulatory element that functions to recruit trans-acting splicing factors that suppress the transcription of the gene or genes they control.
PMID:23241926
SO:ke
A regulatory_region that promotes or induces the process of recombination.
kareneilbeck
2015-05-14T13:08:58Z
recombination enhancer
sequence
SO:0002059
recombination_enhancer
A regulatory_region that promotes or induces the process of recombination.
PMID:8861911
SGD:se
A translocation where the regions involved are from different chromosomes.
kareneilbeck
2015-06-18T11:10:30Z
sequence
SO:0002060
interchromosomal_translocation
A translocation where the regions involved are from different chromosomes.
NCBI:th
A translocation where the regions involved are from the same chromosome.
kareneilbeck
2015-06-18T11:10:51Z
sequence
SO:0002061
intrachromosomal_translocation
A translocation where the regions involved are from the same chromosome.
NCBI:th
A contiguous cluster of translocations, usually the result of a single catastrophic event such as chromothripsis or chromoanasynthesis.
kareneilbeck
2015-06-18T11:24:55Z
complex chromosomal rearrangement
sequence
SO:0002062
complex_chromosomal_rearrangement
A contiguous cluster of translocations, usually the result of a single catastrophic event such as chromothripsis or chromoanasynthesis.
NCBI:th
An insertion of sequence from the Alu family of mobile elements.
kareneilbeck
2015-06-18T11:30:36Z
Alu insertion
sequence
SO:0002063
Alu_insertion
An insertion of sequence from the Alu family of mobile elements.
NCBI:th
An insertion from the Line1 family of mobile elements.
kareneilbeck
2015-06-18T11:34:44Z
sequence
line1 insertion
SO:0002064
LINE1_insertion
An insertion from the Line1 family of mobile elements.
NCBI:th
An insertion of sequence from the SVA family of mobile elements.
kareneilbeck
2015-06-18T11:36:12Z
sequence
SO:0002065
SVA_insertion
An insertion of sequence from the SVA family of mobile elements.
NCBI:th
A deletion of a mobile element when comparing a reference sequence (has mobile element) to an individual sequence (does not have mobile element).
kareneilbeck
2015-09-04T13:40:43Z
mobile element deletion
sequence
SO:0002066
mobile_element_deletion
A deletion of a mobile element when comparing a reference sequence (has mobile element) to an individual sequence (does not have mobile element).
NCBI:th
A deletion of the HERV mobile element with respect to a reference.
kareneilbeck
2015-09-04T13:42:52Z
HERV deletion
sequence
SO:0002067
HERV_deletion
A deletion of the HERV mobile element with respect to a reference.
NCBI:th
A deletion of an SVA mobile element.
kareneilbeck
2015-09-04T13:45:22Z
SVA deletion
sequence
SO:0002068
SVA_deletion
A deletion of an SVA mobile element.
NCBI:th
A deletion of a LINE1 mobile element with respect to a reference.
kareneilbeck
2015-09-04T13:46:26Z
sequence
LINE1 deletion
SO:0002069
LINE1_deletion
A deletion of a LINE1 mobile element with respect to a reference.
NCBI:th
A deletion of an Alu mobile element with respect to a reference.
kareneilbeck
2015-09-04T13:47:16Z
sequence
SO:0002070
Alu_deletion
A deletion of an Alu mobile element with respect to a reference.
NCBI:th
A CDS that is supported by proteomics data.
kareneilbeck
2015-10-12T13:25:02Z
sequence
SO:0002071
CDS_supported_by_peptide_spectrum_match
A CDS that is supported by proteomics data.
SO:ke
A position or feature where two sequences have been compared.
kareneilbeck
2015-11-23T14:14:32Z
INSDC_feature:misc_feature
INSDC_note:sequence_comparison
sequence comparison
sequence
SO:0002072
sequence_comparison
A position or feature within a sequence that is identical to the comparable position or feature of a specified reference sequence.
kareneilbeck
2015-11-23T14:15:08Z
no sequence alteration
sequence
SO:0002073
This term is requested by the ClinVar data model group for use in the allele registry and such. A sequence at a defined location that is defined to match the reference assembly.
no_sequence_alteration
A position or feature within a sequence that is identical to the comparable position or feature of a specified reference sequence.
SO:ke
A variant that falls in an intergenic region that is 1 kb or less between 2 genes.
kareneilbeck
2015-11-23T14:24:16Z
ANNOVAR:upstream;downstream
sequence
SO:0002074
This term is added to map to the ANNOVAR annotation 'upstream,downstream'.
intergenic_1kb_variant
A variant that falls in an intergenic region that is 1 kb or less between 2 genes.
SO:ke
ANNOVAR:upstream;downstream
A sequence variant that intersects an incompletely annotated transcript.
kareneilbeck
2015-11-23T14:43:51Z
http://annovar.openbioinformatics.org/en/latest/user-guide/gene/
incomplete transcript variant
sequence
SO:0002075
This term maps to the ANNOVAR term 'ncRNA' (http://annovar.openbioinformatics.org/en/latest/user-guide/gene/). The description in the documentation (11/23/15) is 'variant overlaps a transcript without coding annotation in the gene definition', and this is further clarified in the document: ncRNA above refers to RNA without coding annotation. It does not mean that this is a RNA that will never be translated; it merely means that the user-selected gene annotation system was not able to give a coding sequence annotation. It could still code protein products and may have such annotations in future versions of gene annotation or in another gene annotation system. For example, BC039000 is regarded as ncRNA by ANNOVAR when using UCSC Known Gene annotation, but it is regarded as a protein-coding gene by ANNOVAR when using ENSEMBL annotation.
It is further clarified in the comments section as: ncRNA does NOT mean conventional non-coding RNA. It means a RNA without complete coding sequence, and it can be a coding RNA that is annotated incorrectly by RefSeq or other gene definition systems.
incomplete_transcript_variant
A sequence variant that intersects an incompletely annotated transcript.
SO:ke
A sequence variant that intersects the 3' UTR of an incompletely annotated transcript.
kareneilbeck
2015-11-23T14:45:52Z
http://annovar.openbioinformatics.org/en/latest/user-guide/gene/
ANNOVAR:ncRNA_UTR3
sequence
incomplete transcript 3UTR variant
SO:0002076
incomplete_transcript_3UTR_variant
A sequence variant that intersects the 3' UTR of an incompletely annotated transcript.
SO:ke
ANNOVAR:ncRNA_UTR3
http://annovar.openbioinformatics.org/en/latest/user-guide/gene/
A sequence variant that intersects the 5' UTR of an incompletely annotated transcript.
kareneilbeck
2015-11-24T12:39:17Z
http://annovar.openbioinformatics.org/en/latest/user-guide/gene/
ANNOVAR:ncRNA_UTR5
incomplete transcript 5UTR variant
sequence
SO:0002077
incomplete_transcript_5UTR_variant
A sequence variant that intersects the 5' UTR of an incompletely annotated transcript.
SO:ke
ANNOVAR:ncRNA_UTR5
http://annovar.openbioinformatics.org/en/latest/user-guide/gene/
A sequence variant that intersects the intron of an incompletely annotated transcript.
kareneilbeck
2015-11-24T12:51:45Z
incomplete transcript intronic variant
sequence
SO:0002078
incomplete_transcript_intronic_variant
A sequence variant that intersects the intron of an incompletely annotated transcript.
SO:ke
A sequence variant that intersects the splice region of an incompletely annotated transcript.
kareneilbeck
2015-11-24T12:52:06Z
incomplete transcript splice region variant
sequence
SO:0002079
incomplete_transcript_splice_region_variant
A sequence variant that intersects the splice region of an incompletely annotated transcript.
SO:ke
A sequence variant that intersects the exon of an incompletely annotated transcript.
kareneilbeck
2015-11-24T12:52:10Z
incomplete transcript exonic variant
sequence
SO:0002080
incomplete_transcript_exonic_variant
A sequence variant that intersects the exon of an incompletely annotated transcript.
SO:ke
A sequence variant that intersects the coding regions of an incompletely annotated transcript.
kareneilbeck
2015-11-24T15:32:27Z
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq:coding-notMod3
Seattleseq:coding-unknown
sequence
SO:0002081
incomplete_transcript_CDS
A sequence variant that intersects the coding regions of an incompletely annotated transcript.
SO:ke
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
Seattleseq:coding-notMod3
Seattleseq:coding-unknown
A sequence variant that intersects the coding sequence near a splice region of an incompletely annotated transcript.
kareneilbeck
2015-11-24T15:51:06Z
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq:coding-notMod3-near-splice
Seattleseq:coding-unknown-near-splice
incomplete transcript coding splice variant
sequence
SO:0002082
incomplete_transcript_coding_splice_variant
A sequence variant that intersects the coding sequence near a splice region of an incompletely annotated transcript.
SO:ke
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
Seattleseq:coding-notMod3-near-splice
Seattleseq:coding-unknown-near-splice
A sequence variant located within 2KB 3' of a gene.
kareneilbeck
2015-11-24T15:55:49Z
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq:near-gene-3
sequence
SO:0002083
2KB_downstream_variant
A sequence variant located within 2KB 3' of a gene.
SO:ke
http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp
Seattleseq
Seattleseq:near-gene-3
A sequence variant in which a change has occurred within the exonic region of the splice site, 1-2 bases from boundary.
kareneilbeck
2015-12-01T14:38:47Z
ANNOVAR:exonic;splicing
exonic splice region variant
sequence
Seattleseq:coding-near-splice
SO:0002084
exonic_splice_region_variant
A sequence variant in which a change has occurred within the exonic region of the splice site, 1-2 bases from boundary.
SO:ke
ANNOVAR:exonic;splicing
Seattleseq:coding-near-splice
A sequence variant whereby two genes on the same strand have become joined.
kareneilbeck
2016-02-23T12:16:48Z
unidirectional gene fusion
sequence
SO:0002085
Requested by SNPEFF team. Feb 2016.
unidirectional_gene_fusion
A sequence variant whereby two genes on the same strand have become joined.
SO:ke
A sequence variant whereby two genes on alternate strands have become joined.
kareneilbeck
2016-02-23T12:17:18Z
bidirectional gene fusion
sequence
SO:0002086
Requested by SNPEFF team. Feb 2016.
bidirectional_gene_fusion
A sequence variant whereby two genes on alternate strands have become joined.
SO:ke
A non-functional descendant of the coding portion of a coding transcript, part of a pseudogene.
kareneilbeck
2016-02-29T12:58:52Z
INSDC_feature:CDS
INSDC_qualifier:pseudo
pseudogenic CDS
sequence
SO:0002087
pseudogenic_CDS
A non-functional descendant of the coding portion of a coding transcript, part of a pseudogene.
SO:ke
A transcript variant occurring within the splice region (1-3 bases of the exon or 3-8 bases of the intron) of a non coding transcript.
kareneilbeck
2016-03-07T09:40:46Z
ANNOVAR:ncRNA_splicing
sequence
SO:0002088
non_coding_transcript_splice_region_variant
A transcript variant occurring within the splice region (1-3 bases of the exon or 3-8 bases of the intron) of a non coding transcript.
SO:ke
A UTR variant of exonic sequence of the 3' UTR.
kareneilbeck
2016-03-07T10:37:04Z
3 prime UTR exon variant
sequence
SO:0002089
Requested by visze github tracker ID 346.
3_prime_UTR_exon_variant
A UTR variant of exonic sequence of the 3' UTR.
SO:ke
A UTR variant of intronic sequence of the 3' UTR.
kareneilbeck
2016-03-07T10:37:41Z
3 prime UTR intron variant
sequence
SO:0002090
Requested by visze github tracker ID 346.
3_prime_UTR_intron_variant
A UTR variant of intronic sequence of the 3' UTR.
SO:ke
A UTR variant of intronic sequence of the 5' UTR.
kareneilbeck
2016-03-07T10:38:04Z
5 prime UTR intron variant
sequence
SO:0002091
Requested by visze github tracker ID 346.
5_prime_UTR_intron_variant
A UTR variant of intronic sequence of the 5' UTR.
SO:ke
A UTR variant of exonic sequence of the 5' UTR.
kareneilbeck
2016-03-07T10:38:26Z
5 prime UTR exon variant
sequence
SO:0002092
Requested by visze github tracker ID 346.
5_prime_UTR_exon_variant
A UTR variant of exonic sequence of the 5' UTR.
SO:ke
A variant that impacts the internal interactions of the resulting polypeptide structure.
kareneilbeck
2016-03-07T11:43:55Z
structural interaction variant
sequence
SO:0002093
Requested by Pablo Cingolani. The way I calculate this is simply by looking at the PDB entry of one protein and then marking those AA that are within 3 Angstrom of each other (and far away in the AA sequence, e.g. over 20 AA distance). The assumption is that, since they are very close in distance, they must be "interacting" and thus important for protein structure.
structural_interaction_variant
A variant that impacts the internal interactions of the resulting polypeptide structure.
SO:ke
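The calculation described in the comment for structural_interaction_variant (residues within 3 Angstrom of each other but more than 20 residues apart in the sequence) can be sketched as follows. This is only an illustration under stated assumptions: each residue is reduced to a single hypothetical (x, y, z) coordinate in Angstrom, and the function name and parameters are not part of SO, SnpEff, or any PDB library.

```python
import math

def interacting_pairs(coords, max_dist=3.0, min_seq_sep=20):
    """Return index pairs (i, j), i < j, of residues that are close in space
    but far apart in the amino acid sequence.

    coords      -- one (x, y, z) tuple per residue, in Angstrom (assumed input)
    max_dist    -- spatial cutoff; "within 3 Angstrom" is read as <= 3.0
    min_seq_sep -- "over 20 AA distance" is read as j - i > 20
    """
    pairs = []
    for i in range(len(coords)):
        # Start at i + min_seq_sep + 1 so that j - i is strictly greater
        # than the sequence separation threshold.
        for j in range(i + min_seq_sep + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= max_dist:
                pairs.append((i, j))
    return pairs
```

A real structure has many atoms per residue, so the choice of representative coordinate (or a minimum over atom pairs) is a design decision this sketch leaves open.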
A genomic region at a non-allelic position where exchange of genetic material happens as a result of homologous recombination.
nicole
2016-05-17T13:34:12Z
INSDC_feature:misc_recomb
INSDC_qualifier:non_allelic_homologous
INSDC_qualifier:non_allelic_homologous_recombination
NAHRR
non allelic homologous recombination region
sequence
SO:0002094
non_allelic_homologous_recombination_region
A ncRNA, specific to the Cajal body, that has been demonstrated to function as a guide RNA in the site-specific synthesis of 2'-O-ribose-methylated nucleotides and pseudouridines in the RNA polymerase II-transcribed U1, U2, U4 and U5 spliceosomal small nuclear RNAs (snRNAs).
nicole
2016-05-19T13:42:45Z
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC126017/
small Cajal body specific RNA
small Cajal body-specific RNA
sequence
SO:0002095
Moved from is_a ncRNA (SO:0000655) to is_a snoRNA (SO:0000275) as per request from FlyBase by Dave Sant 24 April 2021. See GitHub Issue #509.
scaRNA
A ncRNA, specific to the Cajal body, that has been demonstrated to function as a guide RNA in the site-specific synthesis of 2'-O-ribose-methylated nucleotides and pseudouridines in the RNA polymerase II-transcribed U1, U2, U4 and U5 spliceosomal small nuclear RNAs (snRNAs).
PMC:126017
PMID:27775477
PMID:28869095
SO:nrs
A variation that expands or contracts a tandem repeat with regard to a reference.
kareneilbeck
2016-07-14T16:04:40Z
short tandem repeat variation
str variation
sequence
SO:0002096
short_tandem_repeat_variation
A variation that expands or contracts a tandem repeat with regard to a reference.
SO:ke
A pseudogene derived from a vertebrate immune system gene.
kareneilbeck
2016-07-15T16:00:22Z
vertebrate immune system pseudogene
sequence
SO:0002097
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
vertebrate_immune_system_pseudogene
A pseudogene derived from a vertebrate immune system gene.
SO:ke
A pseudogene derived from an immunoglobulin gene.
kareneilbeck
2016-07-15T16:01:47Z
immunoglobulin pseudogene
sequence
SO:0002098
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
immunoglobulin_pseudogene
A pseudogene derived from an immunoglobulin gene.
SO:ke
A pseudogene derived from a T-cell receptor gene.
kareneilbeck
2016-07-15T16:02:18Z
sequence
SO:0002099
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
T_cell_receptor_pseudogene
A pseudogene derived from a T-cell receptor gene.
SO:ke
A pseudogenic constant region of an immunoglobulin gene which closely resembles a known functional immunoglobulin constant gene but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon.
kareneilbeck
2016-07-15T16:05:08Z
IG C pseudogene
sequence
SO:0002100
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
IG_C_pseudogene
A pseudogenic constant region of an immunoglobulin gene which closely resembles a known functional immunoglobulin constant gene but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php
A pseudogenic joining region which closely resembles a known functional immunoglobulin joining gene (one that rearranges at the DNA level and codes the joining region of the variable domain of an immunoglobulin chain) but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon.
kareneilbeck
2016-07-15T16:05:34Z
IG J pseudogene
IG joining pseudogene
IG_joining_pseudogene
Immunoglobulin Joining Pseudogene
Immunoglobulin_Joining_Pseudogene
sequence
SO:0002101
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
IG_J_pseudogene
A pseudogenic joining region which closely resembles a known functional immunoglobulin joining gene (one that rearranges at the DNA level and codes the joining region of the variable domain of an immunoglobulin chain) but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php
A pseudogenic variable region which closely resembles a known functional immunoglobulin variable gene (one that rearranges at the DNA level and codes the variable region of an immunoglobulin chain) but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon.
kareneilbeck
2016-07-15T16:05:56Z
IG V pseudogene
IG variable pseudogene
IG_variable_pseudogene
Immunoglobulin variable pseudogene
Immunoglobulin_variable_pseudogene
sequence
SO:0002102
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
IG_V_pseudogene
A pseudogenic variable region which closely resembles a known functional immunoglobulin variable gene (one that rearranges at the DNA level and codes the variable region of an immunoglobulin chain) but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php
A pseudogenic variable region which closely resembles a known functional T-cell receptor variable gene (one that rearranges at the DNA level and codes the variable region of a T-cell receptor chain) but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon.
kareneilbeck
2016-07-15T16:06:29Z
T cell receptor V pseudogene
T cell receptor Variable pseudogene
TR V pseudogene
T_cell_receptor_V_pseudogene
T_cell_receptor_Variable_pseudogene
sequence
SO:0002103
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
TR_V_pseudogene
A pseudogenic variable region which closely resembles a known functional T-cell receptor variable gene (one that rearranges at the DNA level and codes the variable region of a T-cell receptor chain) but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php
A pseudogenic joining region which closely resembles a known functional T-cell receptor (TR) joining gene (one that rearranges at the DNA level and codes the joining region of the variable domain of a T-cell receptor chain) but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon.
kareneilbeck
2016-07-15T16:06:51Z
T cell receptor J pseudogene
T cell receptor Joining pseudogene
TR J pseudogene
T_cell_receptor_J_pseudogene
T_cell_receptor_Joining_pseudogene
sequence
SO:0002104
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
TR_J_pseudogene
A pseudogenic joining region which closely resembles a known functional T-cell receptor (TR) joining gene (one that rearranges at the DNA level and codes the joining region of the variable domain of a T-cell receptor chain) but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php
A processed pseudogene where there is evidence (mass spec data) suggesting that it is also translated.
kareneilbeck
2016-07-18T12:31:53Z
translated processed pseudogene
sequence
SO:0002105
Term added as part of collaboration with Gencode, adding biotypes used in annotation.
translated_processed_pseudogene
A processed pseudogene where there is evidence (mass spec data) suggesting that it is also translated.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
A non-processed pseudogene where there is evidence (mass spec data) suggesting that it is also translated.
kareneilbeck
2016-07-18T12:34:42Z
translated unprocessed pseudogene
translated_nonprocessed_pseudogene
sequence
SO:0002106
Term added as part of collaboration with Gencode, adding biotypes used in annotation.
translated_unprocessed_pseudogene
A non-processed pseudogene where there is evidence (mass spec data) suggesting that it is also translated.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
An unprocessed pseudogene supported by locus-specific evidence of transcription.
kareneilbeck
2016-07-18T12:41:53Z
transcribed unprocessed pseudogene
transcribed_non_processed_pseudogene
sequence
SO:0002107
Term added as part of collaboration with Gencode, adding biotypes used in annotation.
transcribed_unprocessed_pseudogene
An unprocessed pseudogene supported by locus-specific evidence of transcription.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
A species specific unprocessed pseudogene without a parent gene, as it has an active orthologue in another species.
kareneilbeck
2016-07-18T12:44:26Z
transcribed unitary pseudogene
sequence
SO:0002108
Term added as part of collaboration with Gencode, adding biotypes used in annotation.
transcribed_unitary_pseudogene
A species specific unprocessed pseudogene without a parent gene, as it has an active orthologue in another species.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
A processed_pseudogene overlapped by locus-specific evidence of transcription.
kareneilbeck
2016-07-18T12:45:48Z
transcribed processed pseudogene
sequence
SO:0002109
Term added as part of collaboration with Gencode, adding biotypes used in annotation.
transcribed_processed_pseudogene
A processed_pseudogene overlapped by locus-specific evidence of transcription.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
A polymorphic pseudogene in the reference genome, containing a retained intron, known to be intact in the genomes of other individuals of the same species. The annotation process has confirmed that the pseudogenisation event is not a genomic sequencing error.
kareneilbeck
2016-07-18T12:47:33Z
polymorphic pseudogene with retained intron
sequence
SO:0002110
Term added as part of collaboration with Gencode, adding biotypes used in annotation.
polymorphic_pseudogene_with_retained_intron
A polymorphic pseudogene in the reference genome, containing a retained intron, known to be intact in the genomes of other individuals of the same species. The annotation process has confirmed that the pseudogenisation event is not a genomic sequencing error.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
A processed_transcript supported by EST and/or mRNA evidence that aligns unambiguously to a pseudogene locus (i.e. alignment to the pseudogene locus clearly better than alignment to parent locus).
kareneilbeck
2016-07-18T14:07:00Z
pseudogene processed transcript
sequence
SO:0002111
Term added as part of collaboration with Gencode, adding biotypes used in annotation.
pseudogene_processed_transcript
A processed_transcript supported by EST and/or mRNA evidence that aligns unambiguously to a pseudogene locus (i.e. alignment to the pseudogene locus clearly better than alignment to parent locus).
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
A protein coding transcript containing a retained intron.
kareneilbeck
2016-07-18T14:09:49Z
sequence
mRNA with retained intron
SO:0002112
Term added as part of collaboration with Gencode, adding biotypes used in annotation.
coding_transcript_with_retained_intron
A protein coding transcript containing a retained intron.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
A lncRNA transcript containing a retained intron.
kareneilbeck
2016-07-18T14:13:07Z
lncRNA with retained intron
lncRNA_retained_intron
sequence
SO:0002113
Term added as part of collaboration with Gencode, adding biotypes used in annotation.
lncRNA_with_retained_intron
A lncRNA transcript containing a retained intron.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
A protein coding transcript that contains a CDS but has one or more splice junctions >50bp downstream of stop codon, making it susceptible to nonsense mediated decay.
kareneilbeck
2016-07-18T14:16:13Z
http://www.gencodegenes.org/gencode_biotypes.html
NMD transcript
nonsense mediated decay transcript
protein_coding_NMD
sequence
SO:0002114
Term added as part of collaboration with Gencode, adding biotypes used in annotation.
NMD_transcript
A protein coding transcript that contains a CDS but has one or more splice junctions >50bp downstream of stop codon, making it susceptible to nonsense mediated decay.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
http://www.gencodegenes.org/gencode_biotypes.html
GENCODE
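The rule in the NMD_transcript definition (one or more splice junctions more than 50 bp downstream of the stop codon) can be sketched as a simple check. This is a minimal illustration, assuming all positions are given in spliced-transcript coordinates and that the distance is measured from the last base of the stop codon; the function name, parameters, and that measuring convention are hypothetical and not taken from GENCODE.

```python
def has_nmd_junction(stop_codon_end, junction_positions, threshold=50):
    """Return True if any splice junction lies more than `threshold` bp
    downstream of the stop codon.

    stop_codon_end     -- transcript coordinate of the last base of the stop codon (assumed)
    junction_positions -- transcript coordinates of exon-exon junctions (assumed)
    threshold          -- distance in bp; the definition uses > 50
    """
    return any(pos - stop_codon_end > threshold for pos in junction_positions)
```

For example, with a stop codon ending at position 300, a junction at 351 satisfies the rule while a junction at 350 (exactly 50 bp downstream) does not.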
A transcript supported by EST and/or mRNA evidence that aligns unambiguously to the pseudogene locus; has retained intronic sequence compared to a reference transcript sequence.
kareneilbeck
2016-07-18T14:19:04Z
pseudogene retained intron
sequence
SO:0002115
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes.
pseudogenic_transcript_with_retained_intron
A transcript supported by EST and/or mRNA evidence that aligns unambiguously to the pseudogene locus; has retained intronic sequence compared to a reference transcript sequence.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
A processed transcript that does not contain a CDS that fulfills annotation criteria and is not necessarily functionally non-coding.
kareneilbeck
2016-07-18T14:23:59Z
polymorphic pseudogene processed transcript
sequence
SO:0002116
Term added as part of collaboration with Gencode, adding biotypes used in annotation.
polymorphic_pseudogene_processed_transcript
A processed transcript that does not contain a CDS that fulfills annotation criteria and is not necessarily functionally non-coding.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
kareneilbeck
2016-07-18T14:27:21Z
sequence
SO:0002117
<new term>
true
A polymorphic pseudogene transcript that contains a CDS but has one or more splice junctions >50bp downstream of stop codon. Premature stop codon is not introduced, directly or indirectly, as a result of the variation i.e. must be present in both protein_coding and pseudogenic alleles.
kareneilbeck
2016-07-18T14:28:02Z
NMD polymorphic pseudogene transcript
nonsense_mediated_decay_polymorphic_pseudogene
sequence
SO:0002118
Term added as part of collaboration with Gencode, adding biotypes used in annotation.
NMD_polymorphic_pseudogene_transcript
A polymorphic pseudogene transcript that contains a CDS but has one or more splice junctions >50bp downstream of stop codon. Premature stop codon is not introduced, directly or indirectly, as a result of the variation i.e. must be present in both protein_coding and pseudogenic alleles.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
A physical quality which inheres to the allele by virtue of the number of instances of the allele within a population. This is the relative frequency of the allele at a given locus in a population.
kareneilbeck
2016-07-21T11:58:55Z
wikipedia:Allele_frequency
sequence
SO:0002119
Requested by HL7 clinical genomics group.
allelic_frequency
A physical quality which inheres to the allele by virtue of the number of instances of the allele within a population. This is the relative frequency of the allele at a given locus in a population.
SO:ke
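The relative frequency in the allelic_frequency definition is the count of one allele divided by the count of all alleles observed at the locus. A minimal sketch, assuming the input is a flat list of allele calls at a single locus (one entry per chromosome copy sampled); the function name and input shape are illustrative only.

```python
from collections import Counter

def allele_frequencies(observed_alleles):
    """Return {allele: relative frequency} for allele calls at one locus.

    observed_alleles -- iterable of allele labels, one per sampled chromosome copy (assumed)
    """
    counts = Counter(observed_alleles)
    total = sum(counts.values())
    if total == 0:
        # No observations: no frequencies can be computed.
        return {}
    return {allele: count / total for allele, count in counts.items()}
```

For instance, four observed copies with three "A" calls and one "G" call give frequencies of 0.75 and 0.25.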
Transcript where ditag (digital gene expression profiling) and/or published experimental data strongly supports the existence of short non-coding transcripts transcribed from the 3'UTR.
nicole
2016-08-23T15:48:21Z
3'_overlapping_ncrna
3prime_overlapping_ncRNA
three prime overlapping noncoding rna
sequence
SO:0002120
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes.
three_prime_overlapping_ncrna
Transcript where ditag (digital gene expression profiling) and/or published experimental data strongly supports the existence of short non-coding transcripts transcribed from the 3'UTR.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
The configuration of the IG and TR variable (V), diversity (D) and joining (J) germline genes before DNA rearrangements, with or without constant (C) genes in undefined configuration (germline, non-rearranged regions of the IG DNA loci).
nicole
2016-08-23T15:54:51Z
immune_gene
sequence
SO:0002121
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
vertebrate_immune_system_gene
The configuration of the IG and TR variable (V), diversity (D) and joining (J) germline genes before DNA rearrangements, with or without constant (C) genes in undefined configuration (germline, non-rearranged regions of the IG DNA loci).
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php
A germline immunoglobulin gene.
nicole
2016-08-23T15:56:09Z
All_IG_genes
IG_genes
sequence
SO:0002122
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
immunoglobulin_gene
A germline immunoglobulin gene.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
A constant (C) gene, a gene that codes the constant region of an immunoglobulin chain.
nicole
2016-08-23T15:57:29Z
IGC_gene
Immunoglobulin_Constant_germline_Gene
immunoglobulin_C_gene
sequence
SO:0002123
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
IG_C_gene
A constant (C) gene, a gene that codes the constant region of an immunoglobulin chain.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php
A gene that rearranges at the DNA level and codes the diversity region of the variable domain of an immunoglobulin (IG) gene.
nicole
2016-08-23T15:59:10Z
IGD_gene
Immunoglobulin_Diversity_gene
immunoglobulin_D_gene
sequence
SO:0002124
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
IG_D_gene
A gene that rearranges at the DNA level and codes the diversity region of the variable domain of an immunoglobulin (IG) gene.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php
A joining gene that rearranges at the DNA level and codes the joining region of the variable domain of an immunoglobulin chain.
nicole
2016-08-23T16:00:36Z
IG_joining_gene
Immunoglobulin_Joining_Gene
immunoglobulin_J_gene
sequence
SO:0002125
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
IG_J_gene
A joining gene that rearranges at the DNA level and codes the joining region of the variable domain of an immunoglobulin chain.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php
A variable gene that rearranges at the DNA level and codes the variable region of the variable domain of an Immunoglobulin chain.
nicole
2016-08-23T16:02:09Z
IGV_gene
IG_variable_gene
Immunoglobulin_variable_gene
sequence
SO:0002126
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
IG_V_gene
A variable gene that rearranges at the DNA level and codes the variable region of the variable domain of an Immunoglobulin chain.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php
A gene that encodes a long non-coding RNA.
nicole
2016-08-23T16:03:33Z
lnc RNA gene
lnc_RNA_gene
long_non_coding_RNA_gene
sequence
SO:0002127
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes.
lncRNA_gene
A gene that encodes a long non-coding RNA.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
Mitochondrial rRNA is an RNA component of the small or large subunits of mitochondrial ribosomes.
nicole
2016-08-23T16:08:59Z
Mt rRNA
Mt_rRNA
mitochondrial rRNA
mitochondrial_rRNA
sequence
SO:0002128
Updated definition to be consistent with format of other rRNA definitions on 10 June 2021. Requested by EBI. See GitHub Issue #493.
mt_rRNA
Mitochondrial rRNA is an RNA component of the small or large subunits of mitochondrial ribosomes.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
Mitochondrial transfer RNA.
nicole
2016-08-23T16:10:17Z
Mt_tRNA
mitochondrial_tRNA
sequence
SO:0002129
mt_tRNA
Mitochondrial transfer RNA.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
A transcript that contains a CDS but has no stop codon before the polyA site is reached.
nicole
2016-08-23T16:11:34Z
non_stop_decay_transcript
sequence
SO:0002130
NSD_transcript
A transcript that contains a CDS but has no stop codon before the polyA site is reached.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
A long non-coding transcript found within an intron of a coding or non-coding gene, with no overlap of exonic sequence.
nicole
2016-08-23T16:15:02Z
SO:0001903
sense intronic lncRNA
sense_intronic
sense_intronic_lncRNA
sense_intronic_non-coding_RNA
sequence
SO:0002131
Updating the names of sense_intronic_ncRNA (SO:0002131), sense_overlap_ncRNA (SO:0002132), sense_overlap_ncRNA_gene (SO:0002183), and sense_intronic_ncRNA_gene (SO:0002184) to _lncRNA. See GitHub Issue #579.
sense_intronic_lncRNA
A long non-coding transcript found within an intron of a coding or non-coding gene, with no overlap of exonic sequence.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
A long non-coding transcript that contains a protein coding gene within its intronic sequence on the same strand, with no overlap of exonic sequence.
nicole
2016-08-23T16:16:13Z
sense overlap lncRNA
sense_overlap_lncRNA
sense_overlapping
sequence
SO:0002132
Updating the names of sense_intronic_ncRNA (SO:0002131), sense_overlap_ncRNA (SO:0002132), sense_overlap_ncRNA_gene (SO:0002183), and sense_intronic_ncRNA_gene (SO:0002184) to _lncRNA. See GitHub Issue #579.
sense_overlap_lncRNA
A long non-coding transcript that contains a protein coding gene within its intronic sequence on the same strand, with no overlap of exonic sequence.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
A T-cell receptor germline gene.
nicole
2016-08-23T16:17:12Z
TR_gene
sequence
SO:0002133
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
T_cell_receptor_gene
A constant (C) gene, a gene that codes the constant region of a T-cell receptor chain.
nicole
2016-08-23T16:19:20Z
T_cell_receptor_C_gene
sequence
SO:0002134
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
TR_C_Gene
A constant (C) gene, a gene that codes the constant region of a T-cell receptor chain.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php
A gene that rearranges at the DNA level and codes the diversity region of the variable domain of a T-cell receptor gene.
nicole
2016-08-23T16:20:06Z
T_cell_receptor_D_gene
sequence
SO:0002135
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
TR_D_Gene
A gene that rearranges at the DNA level and codes the diversity region of the variable domain of a T-cell receptor gene.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php
A joining gene that rearranges at the DNA level and codes the joining region of the variable domain of a T-cell receptor chain.
nicole
2016-08-23T16:20:36Z
T_cell_receptor_J_gene
sequence
SO:0002136
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
TR_J_Gene
A joining gene that rearranges at the DNA level and codes the joining region of the variable domain of a T-cell receptor chain.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php
A variable gene that rearranges at the DNA level and codes the variable region of the variable domain of a T-cell receptor chain.
nicole
2016-08-23T16:21:04Z
T_cell_receptor_V_gene
sequence
SO:0002137
These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IMGT.
TR_V_Gene
A variable gene that rearranges at the DNA level and codes the variable region of the variable domain of a T-cell receptor chain.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php
A transcript feature that has been predicted but is not yet validated.
nicole
2016-08-23T16:27:38Z
predicted transcript
sequence
SO:0002138
predicted_transcript
A transcript feature that has been predicted but is not yet validated.
SO:ke
This is used for non-spliced EST clusters that have polyA features. This category has been specifically created for the ENCODE project to highlight regions that could indicate the presence of protein coding genes that require experimental validation, either by 5' RACE or RT-PCR to extend the transcripts, or by confirming expression of the putatively-encoded peptide with specific antibodies.
nicole
2016-08-23T16:28:07Z
TEC
to_be_experimentally_confirmed_transcript
sequence
SO:0002139
unconfirmed_transcript
This is used for non-spliced EST clusters that have polyA features. This category has been specifically created for the ENCODE project to highlight regions that could indicate the presence of protein coding genes that require experimental validation, either by 5' RACE or RT-PCR to extend the transcripts, or by confirming expression of the putatively-encoded peptide with specific antibodies.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
An origin of replication that initiates early in S phase.
nicole
2016-09-15T15:53:36Z
early origin
early origin of replication
early replication origin
sequence
SO:0002140
early_origin_of_replication
An origin of replication that initiates early in S phase.
PMID:23348837
PMID:9115207
An origin of replication that initiates late in S phase.
nicole
2016-09-15T15:56:07Z
late origin
late origin of replication
late replication origin
sequence
SO:0002141
late_origin_of_replication
An origin of replication that initiates late in S phase.
PMID:23348837
PMID:9115207
A histone 2A modification where the modification is the acetylation of the residue.
nicole
2016-10-25T12:03:46Z
H2Aac
histone 2A acetylation site
sequence
SO:0002142
histone_2A_acetylation_site
A histone 2A modification where the modification is the acetylation of the residue.
ISBN:0815341059
A histone 2B modification where the modification is the acetylation of the residue.
nicole
2016-10-25T12:04:04Z
H2Bac
histone 2B acetylation site
sequence
SO:0002143
histone_2B_acetylation_site
A histone 2B modification where the modification is the acetylation of the residue.
ISBN:0815341059
A histone 2AZ modification where the modification is the acetylation of the residue.
nicole
2016-10-25T14:11:49Z
H2A.Zac
H2AZac
histone 2AZ acetylation site
sequence
SO:0002144
histone_2AZ_acetylation_site
A histone 2AZ modification where the modification is the acetylation of the residue.
PMID:19385636
PMID:24316985
PMID:27087541
A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H2AZ histone protein is acetylated.
nicole
2016-10-25T14:19:43Z
H2A.ZK4ac
H2AZK4 acetylation site
H2AZK4ac
sequence
SO:0002145
H2AZK4_acetylation_site
A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H2AZ histone protein is acetylated.
PMID:19385636
PMID:24316985
PMID:27087541
A kind of histone modification site, whereby the 7th residue (a lysine), from the start of the H2AZ histone protein is acetylated.
nicole
2016-10-25T14:23:11Z
H2A.ZK7ac
H2AZK7 acetylation site
H2AZK7ac
sequence
SO:0002146
H2AZK7_acetylation_site
A kind of histone modification site, whereby the 7th residue (a lysine), from the start of the H2AZ histone protein is acetylated.
PMID:19385636
PMID:24316985
PMID:27087541
A kind of histone modification site, whereby the 11th residue (a lysine), from the start of the H2AZ histone protein is acetylated.
nicole
2016-10-25T14:23:31Z
H2A.ZK11ac
H2AZK11 acetylation site
H2AZK11ac
sequence
SO:0002147
H2AZK11_acetylation_site
A kind of histone modification site, whereby the 11th residue (a lysine), from the start of the H2AZ histone protein is acetylated.
PMID:19385636
PMID:24316985
PMID:27087541
A kind of histone modification site, whereby the 13th residue (a lysine), from the start of the H2AZ histone protein is acetylated.
nicole
2016-10-25T14:23:50Z
H2A.ZK13ac
H2AZK13 acetylation site
H2AZK13ac
sequence
SO:0002148
H2AZK13_acetylation_site
A kind of histone modification site, whereby the 13th residue (a lysine), from the start of the H2AZ histone protein is acetylated.
PMID:19385636
PMID:24316985
PMID:27087541
A kind of histone modification site, whereby the 15th residue (a lysine), from the start of the H2AZ histone protein is acetylated.
nicole
2016-10-25T14:24:08Z
H2A.ZK15ac
H2AZK15 acetylation site
H2AZK15ac
sequence
SO:0002149
H2AZK15_acetylation_site
A kind of histone modification site, whereby the 15th residue (a lysine), from the start of the H2AZ histone protein is acetylated.
PMID:19385636
PMID:24316985
PMID:27087541
A uORF beginning with the canonical start codon AUG.
nicole
2016-10-26T09:37:11Z
AUG initiated uORF
sequence
SO:0002150
AUG_initiated_uORF
A uORF beginning with the canonical start codon AUG.
PMID:26684391
PMID:27313038
A uORF beginning with a codon other than AUG.
nicole
2016-10-26T09:37:45Z
non AUG initiated uORF
sequence
SO:0002151
non_AUG_initiated_uORF
A uORF beginning with a codon other than AUG.
PMID:26684391
PMID:27313038
A variant that falls downstream of a transcript, but within the genic region of the gene due to alternately transcribed isoforms.
nicole
2016-10-28T10:20:55Z
genic 3 prime transcript variant
genic 3' transcript variant
genic downstream transcript variant
sequence
SO:0002152
genic_downstream_transcript_variant
A variant that falls downstream of a transcript, but within the genic region of the gene due to alternately transcribed isoforms.
NCBI:dm
SO:ke
A variant that falls upstream of a transcript, but within the genic region of the gene due to alternately transcribed isoforms.
nicole
2016-10-28T10:23:17Z
genic 5 prime transcript variant
genic 5' transcript variant
genic upstream transcript variant
sequence
SO:0002153
genic_upstream_transcript_variant
A variant that falls upstream of a transcript, but within the genic region of the gene due to alternately transcribed isoforms.
NCBI:dm
SO:ke
A genomic region where there is an exchange of genetic material with another genomic region, occurring in somatic cells.
nicole
2016-10-28T10:33:54Z
INSDC_feature:misc_recomb
INSDC_qualifier:mitotic
INSDC_qualifier:mitotic_recombination
mitotic recombination region
sequence
SO:0002154
mitotic_recombination_region
A genomic region where there is an exchange of genetic material with another genomic region, occurring in somatic cells.
NCBI:cf
SO:ke
A genomic region in which there is an exchange of genetic material as a result of the repair of meiosis-specific double strand breaks that occur during meiotic prophase.
nicole
2016-10-28T10:34:55Z
INSDC_feature:misc_recomb
INSDC_qualifier:meiotic
INSDC_qualifier:meiotic_recombination
meiotic recombination region
sequence
SO:0002155
meiotic_recombination_region
A genomic region in which there is an exchange of genetic material as a result of the repair of meiosis-specific double strand breaks that occur during meiotic prophase.
NCBI:cf
SO:ke
A promoter element bound by the MADS family of transcription factors with consensus 5'-(C/T)TA(T/A)4TA(G/A)-3'.
nicole
2016-10-28T10:42:06Z
CArG box
sequence
SO:0002156
Requested by Antonia Lock
CArG_box
A promoter element bound by the MADS family of transcription factors with consensus 5'-(C/T)TA(T/A)4TA(G/A)-3'.
PMID:1748287
PMID:7623803
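The CArG_box consensus above can be read as a regular expression. The following is a minimal sketch, assuming plain A/C/G/T input and a scan of the given strand only; the function name find_carg_boxes is hypothetical and not part of the ontology.

```python
import re

# Consensus from the definition: 5'-(C/T)TA(T/A)4TA(G/A)-3'
# (T/A)4 is read here as four consecutive positions that are each T or A.
CARG_BOX = re.compile(r"[CT]TA[TA]{4}TA[GA]")


def find_carg_boxes(dna):
    """Return (0-based start, matched text) for each non-overlapping match.

    Only the strand passed in is scanned; the reverse complement is not.
    """
    return [(m.start(), m.group()) for m in CARG_BOX.finditer(dna.upper())]
```

Scanning the reverse complement as well would be a natural extension, since promoter elements can occur on either strand.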
A gene cassette array containing H+ mating type specific information.
nicole
2016-11-17T10:59:00Z
sequence
SO:0002157
Mat2P
A gene cassette array containing H+ mating type specific information.
PMID:18354497
A gene cassette array containing H- mating type specific information.
nicole
2016-11-17T11:02:27Z
sequence
SO:0002158
Mat3M
A gene cassette array containing H- mating type specific information.
PMID:18354497
A conserved Cdc48/p97 interaction motif with strict consensus sequence F[PI]GKG[TK][RK]LG[GT] and relaxed consensus sequence FXGKGX[RK]LG.
nicole
2016-12-15T09:48:38Z
SHP box
sequence
SO:0002159
SHP_box
A conserved Cdc48/p97 interaction motif with strict consensus sequence F[PI]GKG[TK][RK]LG[GT] and relaxed consensus sequence FXGKGX[RK]LG.
PMID:17083136
PMID:27655872
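The strict and relaxed SHP_box consensus sequences above map directly onto two regular expressions. This is a sketch only, assuming one-letter amino acid codes and treating X as any single residue; the name classify_shp_box is hypothetical.

```python
import re

# Strict consensus from the definition: F[PI]GKG[TK][RK]LG[GT]
SHP_STRICT = re.compile(r"F[PI]GKG[TK][RK]LG[GT]")
# Relaxed consensus from the definition: FXGKGX[RK]LG, with X as any residue
SHP_RELAXED = re.compile(r"F.GKG.[RK]LG")


def classify_shp_box(peptide):
    """Return 'strict', 'relaxed', or 'none' for a one-letter peptide string."""
    seq = peptide.upper()
    if SHP_STRICT.search(seq):
        return "strict"
    if SHP_RELAXED.search(seq):
        return "relaxed"
    return "none"
```

The strict pattern is tested first because every strict match also satisfies the relaxed pattern.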
A sequence variant that changes the length of one or more sequence features.
nicole
2017-04-26T12:31:12Z
sequence length variant
sequence
SO:0002160
sequence_length_variant
A sequence variant where the copies of a short tandem repeat (STR) feature are either contracted or expanded.
nicole
2017-04-26T12:50:55Z
short tandem repeat change
str change
sequence
SO:0002161
short_tandem_repeat_change
A short tandem repeat variant containing more repeat units than the reference sequence.
nicole
2017-04-26T12:51:26Z
short tandem repeat expansion
str expansion
sequence
SO:0002162
short_tandem_repeat_expansion
A short tandem repeat variant containing fewer repeat units than the reference sequence.
nicole
2017-04-26T12:52:33Z
short tandem repeat contraction
str contraction
sequence
SO:0002163
short_tandem_repeat_contraction
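The expansion and contraction definitions above both compare repeat unit counts against the reference. A minimal sketch of that comparison follows, assuming each allele string starts at the first repeat unit and that the repeat unit is already known; both function names are hypothetical.

```python
def count_tandem_units(seq, unit):
    """Count consecutive copies of `unit` starting at the beginning of `seq`."""
    if not unit:
        raise ValueError("repeat unit must be non-empty")
    n = 0
    while seq.startswith(unit, n * len(unit)):
        n += 1
    return n


def classify_str_change(ref_allele, alt_allele, unit):
    """Name the SO term that fits the change in repeat unit count."""
    ref_n = count_tandem_units(ref_allele, unit)
    alt_n = count_tandem_units(alt_allele, unit)
    if alt_n > ref_n:
        return "short_tandem_repeat_expansion"
    if alt_n < ref_n:
        return "short_tandem_repeat_contraction"
    return "no_change"  # not an SO term; placeholder for equal counts
```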
A kind of histone modification site, whereby the 5th residue (a lysine), from the start of the H2B histone protein is acetylated.
nicole
2017-05-17T15:22:58Z
H2BK5 acetylation site
H2BK5ac
sequence
SO:0002164
H2BK5_acetylation_site
A kind of histone modification site, whereby the 5th residue (a lysine), from the start of the H2B histone protein is acetylated.
PMID:18552846
http://www.actrec.gov.in/histome/ptm_sp.php?ptm_sp=H2BK5ac
A short tandem repeat expansion with an increase in a sequence of three nucleotide units repeated in tandem compared to a reference sequence.
nicole
2017-06-02T10:43:42Z
trinucleotide repeat expansion
sequence
SO:0002165
trinucleotide_repeat_expansion
A ref_miRNA (RefSeq-miRNA) sequence is assigned at the creation of a new mature miRNA entry in a database. The ref_miRNA sequence designation remains unchanged even if a different isomiR is later shown to be expressed at a higher level. A ref_miRNA can be produced by one or multiple pre-miRNA.
nicole
2017-06-22T11:05:49Z
RefSeq miRNA
RefSeq-miRNA
ref miRNA
sequence
SO:0002166
ref_miRNA
A ref_miRNA (RefSeq-miRNA) sequence is assigned at the creation of a new mature miRNA entry in a database. The ref_miRNA sequence designation remains unchanged even if a different isomiR is later shown to be expressed at a higher level. A ref_miRNA can be produced by one or multiple pre-miRNA.
PMID:26453491
IsomiRs are all the bona fide variants of a mature product. An isomiR should be connected to the ref_miRNA it is most likely to be a variant of. Some isomiRs can be variations of one or multiple ref_miRNA.
nicole
2017-06-22T11:09:42Z
sequence
SO:0002167
isomiR
IsomiRs are all the bona fide variants of a mature product. An isomiR should be connected to the ref_miRNA it is most likely to be a variant of. Some isomiRs can be variations of one or multiple ref_miRNA.
PMID:26453491
An RNA_thermometer is a cis element in the 5' end of an mRNA that can change its secondary structure in response to temperature and coordinate temperature-dependent gene expression.
nicole
2017-07-17T10:07:45Z
https://en.wikipedia.org/wiki/RNA_thermometer
RNA thermometer
RNA thermoregulator
RNAT
thermoregulator
sequence
SO:0002168
RNA_thermometer
An RNA_thermometer is a cis element in the 5' end of an mRNA that can change its secondary structure in response to temperature and coordinate temperature-dependent gene expression.
PMID:22421878
https://en.wikipedia.org/wiki/RNA_thermometer
wiki
A sequence variant that falls in the polypyrimidine tract at the 3' end of an intron, between 17 and 3 bases from the end (acceptor -3 to acceptor -17).
nicole
2017-07-31T13:40:13Z
splice polypyrimidine tract variant
sequence
SO:0002169
splice_polypyrimidine_tract_variant
A sequence variant that falls in the region between the 3rd and 6th base after the splice junction (5' end of the intron).
nicole
2017-07-31T13:48:32Z
splice donor region variant
sequence
SO:0002170
splice_donor_region_variant
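The positional ranges in the splice_polypyrimidine_tract_variant and splice_donor_region_variant definitions can be expressed as a small classifier. This is a sketch under stated assumptions: both ranges are taken as inclusive, positions are 1-based offsets into the intron, and the donor check is applied first when a very short intron makes the ranges overlap. The function name and the "other" label are hypothetical.

```python
def classify_intronic_position(offset_from_start, intron_length):
    """Classify a 1-based intronic position.

    offset_from_start == 1 is the first intronic base after the donor junction;
    offset_from_start == intron_length is the last base before the acceptor.
    """
    if not 1 <= offset_from_start <= intron_length:
        raise ValueError("position lies outside the intron")
    # Distance counted back from the 3' end: 1 corresponds to acceptor -1.
    dist_from_end = intron_length - offset_from_start + 1
    if 3 <= offset_from_start <= 6:
        return "splice_donor_region_variant"
    if 3 <= dist_from_end <= 17:
        return "splice_polypyrimidine_tract_variant"
    return "other"  # placeholder; other SO terms cover the remaining positions
```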
A telomeric D-loop is a three-stranded DNA displacement loop that forms at the site where the telomeric 3' single-stranded DNA overhang (formed of the repeat sequence TTAGGG in mammals) is tucked back inside the double-stranded component of telomeric DNA molecule, thus forming a t-loop or telomeric-loop and protecting the chromosome terminus.
nicole
2017-08-01T13:12:11Z
telomeric D loop
sequence
SO:0002171
This definition is from GO:0061820 telomeric D-loop disassembly.
telomeric_D_loop
A telomeric D-loop is a three-stranded DNA displacement loop that forms at the site where the telomeric 3' single-stranded DNA overhang (formed of the repeat sequence TTAGGG in mammals) is tucked back inside the double-stranded component of telomeric DNA molecule, thus forming a t-loop or telomeric-loop and protecting the chromosome terminus.
PMID:10338204
PMID:15071557
PMID:24012755
A sequence_alteration where the source of the alteration is due to an artifact in the base-calling or assembly process.
nicole
2017-08-18T13:43:26Z
sequence alteration artifact
sequence
SO:0002172
sequence_alteration_artifact
An indel that is the result of base-calling or assembly error.
nicole
2017-08-18T15:16:20Z
indel artifact
sequence
SO:0002173
indel_artifact
A deletion that is the result of base-calling or assembly error.
nicole
2017-08-18T15:17:11Z
deletion artifact
sequence
SO:0002174
deletion_artifact
An insertion that is the result of base-calling or assembly error.
nicole
2017-08-18T15:17:42Z
insertion artifact
sequence
SO:0002175
insertion_artifact
A substitution that is the result of base-calling or assembly error.
nicole
2017-08-18T15:18:12Z
substitution artifact
sequence
SO:0002176
substitution_artifact
A duplication that is the result of base-calling or assembly error.
nicole
2017-08-18T15:26:00Z
duplication artifact
sequence
SO:0002177
duplication_artifact
An SNV that is the result of base-calling or assembly error.
nicole
2017-08-18T15:26:49Z
SNV artifact
sequence
SO:0002178
SNV_artifact
An MNV that is the result of base-calling or assembly error.
nicole
2017-08-18T15:27:21Z
MNV artifact
sequence
SO:0002179
MNV_artifact
A gene that encodes an enzymatic RNA.
nicole
2017-09-27T10:30:27Z
enzymatic RNA gene
sequence
SO:0002180
enzymatic_RNA_gene
A gene that encodes a ribozyme.
nicole
2017-09-27T10:31:09Z
ribozyme gene
sequence
SO:0002181
ribozyme_gene
A gene that encodes an antisense long, non-coding RNA.
nicole
2017-09-27T10:44:00Z
antisense lncRNA gene
sequence
SO:0002182
antisense_lncRNA_gene
A gene that encodes a sense overlap long non-coding RNA.
nicole
2017-09-27T10:48:05Z
sense overlap lncRNA gene
sense overlap ncRNA gene
sense_overlap_lncRNA_gene
sequence
SO:0002183
Updating the names of sense_intronic_ncRNA (SO:0002131), sense_overlap_ncRNA (SO:0002132), sense_overlap_ncRNA_gene (SO:0002183), and sense_intronic_ncRNA_gene (SO:0002184) to _lncRNA. See GitHub Issue #579.
sense_overlap_lncRNA_gene
A gene that encodes a sense intronic long non-coding RNA.
nicole
2017-09-27T11:03:50Z
sense intronic lncRNA gene
sense intronic ncRNA gene
sense_intronic_lncRNA_gene
sequence
SO:0002184
Updating the names of sense_intronic_ncRNA (SO:0002131), sense_overlap_ncRNA (SO:0002132), sense_overlap_ncRNA_gene (SO:0002183), and sense_intronic_ncRNA_gene (SO:0002184) to _lncRNA. See GitHub Issue #579.
sense_intronic_lncRNA_gene
A non-coding locus that originates from within the promoter region of a protein-coding gene, with transcription proceeding in the opposite direction on the other strand.
nicole
2017-10-03T11:43:48Z
bidirectional promoter lncRNA
bidirectional promoter lncRNA gene
bidirectional_promoter_lncRNA_gene
sequence
SO:0002185
This is a gencode term. See GitHub Issue #408. Synonyms "bidirectional promoter lncRNA gene" and "bidirectional_promoter_lncRNA_gene" added 23 April 2021 by David Sant. See GitHub Issue #506.
bidirectional_promoter_lncRNA_gene
A non-coding locus that originates from within the promoter region of a protein-coding gene, with transcription proceeding in the opposite direction on the other strand.
https://www.gencodegenes.org/pages/biotypes.html
A region of genomic sequence known to undergo mutational events with greater frequency than expected by chance.
nicole
2017-11-07T12:27:51Z
mutational hotspot
sequence
SO:0002186
mutational_hotspot
An insertion of sequence from the HERV family of mobile elements with respect to a reference.
nicole
2017-11-20T11:52:51Z
HERV insertion
sequence
SO:0002187
HERV_insertion
An insertion of sequence from the HERV family of mobile elements with respect to a reference.
NCBI:th
A gene_member_region that encodes sequence that directly contributes to the molecular function of its gene or gene product.
nicole
2017-12-15T11:08:43Z
functional gene region
sequence
SO:0002188
A functional_gene_region is a sequence feature that resides within a gene. But it is typically the corresponding region of translated/transcribed sequence in a gene product, that performs the molecular function qualifying it as a functional_gene_region. Here, a functional_gene_region must contribute directly to the molecular function of the gene product - regions that code for purely structural elements in a gene product that connect such directly functional elements together are not considered functional_gene_regions. Examples of regions considered 'functional' include those encoding enzymatic activity, binding activity, regions required for localization or membrane association, channel-forming regions, and signal peptides or other elements critical for processing of a gene product. In addition, regions that function at the genomic/DNA level are also included - e.g. regions of sequence known to be critical for binding transcription or splicing factors.
functional_gene_region
A gene_member_region that encodes sequence that directly contributes to the molecular function of its gene or gene product.
Clingen:mb
A (unitary) pseudogene that is stable in the population but, importantly, also has a functional alternative allele in the population; i.e., one strain may have the gene while another strain has the pseudogene. MHC haplotypes have allelic pseudogenes.
nicole
2018-01-03T15:47:32Z
INSDC_feature:gene
INSDC_qualifier:allelic
allelic pseudogene
sequence
SO:0002189
allelic_pseudogene
A transcriptional cis regulatory region that when located between an enhancer and a gene's promoter prevents the enhancer from modulating the expression of the gene. Sometimes referred to as an insulator but may not include the barrier function of an insulator.
nicole
2018-01-04T17:28:52Z
INSDC_feature:regulatory
INSDC_qualifier:enhancer_blocking_element
enhancer blocking element
sequence
insulator
SO:0002190
Insulator is included as a related synonym since it is used in the literature to refer to enhancer blocking elements (NCBI:cf).
enhancer_blocking_element
A transcriptional cis regulatory region that when located between an enhancer and a gene's promoter prevents the enhancer from modulating the expression of the gene. Sometimes referred to as an insulator but may not include the barrier function of an insulator.
NCBI:cf
A regulatory region that controls epigenetic imprinting and affects the expression of target genes in an allele- or parent-of-origin-specific manner. Associated regulatory elements may include differentially methylated regions and non-coding RNAs.
nicole
2018-01-04T17:35:34Z
INSDC_feature:regulatory
INSDC_qualifier:imprinting_control_region
imprinting control region
sequence
SO:0002191
Moved from is_a regulatory_region (SO:0005836) to is_a epigenetically_modified_region (SO:0001720) on 11 Feb 2021. GREEKC members pointed out that this would be a more appropriate location. See GitHub Issue #530.
imprinting_control_region
A repeat lying outside the sequence for which it has functional significance (e.g., transposon insertion target sites).
nicole
2018-01-05T16:27:21Z
INSDC_feature:repeat_region
INSDC_qualifier:flanking
flanking repeat
sequence
SO:0002192
flanking_repeat
The pseudogene has arisen by reverse transcription of an mRNA into cDNA, followed by reintegration into the genome. Therefore, it has lost any intron/exon structure, and it might have a pseudo-polyA-tail.
nicole
2018-01-08T11:43:58Z
INSDC_feature:rRNA
INSDC_qualifier:processed
processed pseudogenic rRNA
sequence
SO:0002193
processed_pseudogenic_rRNA
The pseudogene has arisen from a copy of the parent gene by duplication followed by accumulation of random mutations. The changes, compared to their functional homolog, include insertions, deletions, premature stop codons, frameshifts and a higher proportion of non-synonymous versus synonymous substitutions.
nicole
2018-01-08T11:49:41Z
INSDC_feature:rRNA
INSDC_qualifier:unprocessed
unprocessed pseudogenic rRNA
sequence
SO:0002194
unprocessed_pseudogenic_rRNA
The pseudogene has no parent. It is the original gene, which is functional in some species but disrupted in some way (indels, mutation, recombination) in another species or strain.
nicole
2018-01-08T11:51:59Z
INSDC_feature:rRNA
INSDC_qualifier:unitary
unitary pseudogenic rRNA
sequence
SO:0002195
unitary_pseudogenic_rRNA
A (unitary) pseudogene that is stable in the population but, importantly, also has a functional alternative allele in the population; i.e., one strain may have the gene while another strain has the pseudogene. MHC haplotypes have allelic pseudogenes.
nicole
2018-01-08T11:53:13Z
INSDC_feature:rRNA
INSDC_qualifier:allelic
allelic pseudogenic rRNA
sequence
SO:0002196
allelic_pseudogenic_rRNA
The pseudogene has arisen by reverse transcription of an mRNA into cDNA, followed by reintegration into the genome. Therefore, it has lost any intron/exon structure, and it might have a pseudo-polyA-tail.
nicole
2018-01-08T12:10:10Z
INSDC_feature:tRNA
INSDC_qualifier:processed
processed pseudogenic tRNA
sequence
SO:0002197
processed_pseudogenic_tRNA
The pseudogene has arisen from a copy of the parent gene by duplication followed by accumulation of random mutations. The changes, compared to their functional homolog, include insertions, deletions, premature stop codons, frameshifts and a higher proportion of non-synonymous versus synonymous substitutions.
nicole
2018-01-08T12:14:34Z
INSDC_feature:tRNA
INSDC_qualifier:unprocessed
unprocessed pseudogenic tRNA
sequence
SO:0002198
unprocessed_pseudogenic_tRNA
The pseudogene has no parent. It is the original gene, which is functional in some species but disrupted in some way (indels, mutation, recombination) in another species or strain.
nicole
2018-01-08T12:16:59Z
INSDC_feature:tRNA
INSDC_qualifier:unitary
unitary pseudogenic tRNA
sequence
SO:0002199
unitary_pseudogenic_tRNA
A (unitary) pseudogene that is stable in the population but, importantly, also has a functional alternative allele in the population; i.e., one strain may have the gene while another strain has the pseudogene. MHC haplotypes have allelic pseudogenes.
nicole
2018-01-08T12:18:38Z
INSDC_feature:tRNA
INSDC_qualifier:allelic
allelic pseudogenic tRNA
sequence
SO:0002200
allelic_pseudogenic_tRNA
A repeat at the ends of and within the sequence for which it has functional significance other than long terminal repeat.
nicole
2018-01-08T13:00:59Z
INSDC_feature:repeat_region
INSDC_qualifier:terminal
terminal repeat
sequence
SO:0002201
terminal_repeat
A repeat region that is prone to expansions and/or contractions.
nicole
2018-01-09T11:19:55Z
INSDC_feature:misc_feature
INSDC_note:repeat_instability_region
repeat instability region
sequence
SO:0002202
repeat_instability_region
A nucleotide site from which replication initiates.
nicole
2018-01-09T11:23:35Z
INSDC_feature:misc_feature
INSDC_note:replication_start_site
replication start site
sequence
SO:0002203
replication_start_site
A nucleotide site from which replication initiates.
NCBI:cf
A point in nucleic acid where a cleavage event occurs.
nicole
2018-01-09T11:30:34Z
INSDC_feature:misc_feature
INSDC_note:nucleotide_cleavage_site
nucleotide cleavage site
sequence
SO:0002204
nucleotide_cleavage_site
A regulatory element that acts in response to a stimulus, usually via transcription factor binding.
nicole
2018-01-10T16:33:25Z
INSDC_feature:regulatory
INSDC_qualifier:response_element
response element
sequence
SO:0002205
Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527.
response_element
Identifies the biological source of the specified span of the sequence.
nicole
2018-01-26T09:50:58Z
INSDC_feature:source
sequence source
sequence
SO:0002206
Terms such as genomic_DNA or mRNA can be used to describe a sequence source.
sequence_source
Identifies the biological source of the specified span of the sequence.
NCBI:tm
A hexameric RNA motif consisting of nucleotides UNAAAC (where N can be any nucleotide) that targets the RNA for degradation.
nicole
2018-02-06T12:23:24Z
UNAAAC motif
sequence
SO:0002207
UNAAAC_motif
A hexameric RNA motif consisting of nucleotides UNAAAC (where N can be any nucleotide) that targets the RNA for degradation.
PMID:22645662
PMID:28765164
PomBase:al
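The UNAAAC_motif definition above fixes five of six positions and leaves N free, which a short pattern can capture. A minimal sketch follows, assuming the input may be written with either U or T (T is converted to U before matching); the function name find_unaaac is hypothetical.

```python
import re

# UNAAAC from the definition, with N as any of the four RNA nucleotides.
UNAAAC = re.compile(r"U[ACGU]AAAC")


def find_unaaac(rna):
    """Return 0-based start positions of UNAAAC hexamers in the sequence."""
    seq = rna.upper().replace("T", "U")  # assumption: accept DNA-style input
    return [m.start() for m in UNAAAC.finditer(seq)]
```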
An RNA that is transcribed from a long terminal repeat.
nicole
2018-02-07T11:51:45Z
LTR transcript
long terminal repeat transcript
sequence
SO:0002208
long_terminal_repeat_transcript
An RNA that is transcribed from a long terminal repeat.
PMID:24256266
PomBase:mh
A contig composed of genomic DNA derived sequences.
nicole
2018-03-21T12:25:14Z
gDNA contig
gDNA_contig
genomic DNA contig
sequence
SO:0002209
Requested by Bayer Crop Science, March 2018
genomic_DNA_contig
A contig composed of genomic DNA derived sequences.
BCS:etrwz
A variation qualifying the presence of a sequence in a genome which is entirely missing in another genome.
nicole
2018-03-21T12:59:14Z
PAV
presence absence variation
presence-absence variation
presence-absence_variation
presence/absence variation
presence/absence_variation
sequence
SO:0002210
Requested by Bayer Crop Science, March 2018
presence_absence_variation
A variation qualifying the presence of a sequence in a genome which is entirely missing in another genome.
BCS:bbean
PMID:19956538
PMID:25881062
A self replicating circular nucleic acid molecule that is distinct from a chromosome in the organism.
nicole
2018-04-18T11:13:38Z
circular plasmid
sequence
SO:0002211
circular_plasmid
A self replicating circular nucleic acid molecule that is distinct from a chromosome in the organism.
PMID:21719542
SBOL:jb
A self replicating linear nucleic acid molecule that is distinct from a chromosome in the organism. They are capped by terminal proteins covalently bound to the 5' ends of the DNA.
nicole
2018-04-18T11:14:04Z
linear plasmid
sequence
SO:0002212
linear_plasmid
A self replicating linear nucleic acid molecule that is distinct from a chromosome in the organism. They are capped by terminal proteins covalently bound to the 5' ends of the DNA.
PMID:21719542
SBOL:jb
A termination signal preferentially observed downstream of a polyadenylation signal.
nicole
2018-05-18T17:10:14Z
(A(U)GUA) motif
Nrd1 binding motif
Nrd1-dependent terminator
UCUUG motif
UGUAA/G motif
polyA site associated transcription termination signal
polyA site downstream element
transcription termination signal
sequence
SO:0002213
transcription_termination_signal
A termination signal preferentially observed downstream of a polyadenylation signal.
PMID:28367989
A sequence variant whereby at least one base of a codon is changed, resulting in a stop codon inserted next to an existing stop codon. This leads to a polypeptide of the same length.
nicole
2018-06-13T09:53:31Z
redundant inserted stop gained
sequence
SO:0002214
redundant_inserted_stop_gained
A DNA motif to which the S. pombe Zas1 protein binds. The consensus sequence is 5'-(Y)CCCCAY-3'.
nicole
2018-06-20T10:05:17Z
Zas1 recognition motif
sequence
SO:0002215
Zas1_recognition_motif
A DNA motif to which the S. pombe Zas1 protein binds. The consensus sequence is 5'-(Y)CCCCAY-3'.
PMID:29735745
PomBase:vw
A promoter element with consensus sequence [5'-TCG(G/C)(A/T)xxTTxAA], bound by the transcription factor Pho7.
nicole
2018-09-12T12:26:50Z
Pho7 binding site
sequence
SO:0002216
Pho7_binding_site
A promoter element with consensus sequence [5'-TCG(G/C)(A/T)xxTTxAA], bound by the transcription factor Pho7.
PMID:28811350
A sequence alteration which includes an insertion or a deletion. This describes a sequence length change when the direction of the change is unspecified or when such changes are pooled into one category.
nicoleruiz
2019-02-24T18:26:05Z
insertion or deletion
unspecified indel
sequence
SO:0002217
This term is used when there is a change that is either an insertion or a deletion but it is unknown which event occurred.
unspecified_indel
A sequence alteration which includes an insertion or a deletion. This describes a sequence length change when the direction of the change is unspecified or when such changes are pooled into one category.
ZFIN:st
A sequence variant in which the function of a gene product is altered with respect to a reference.
david
2019-03-01T10:21:26Z
function modified variant
sequence
function_modified_variant
functionally abnormal
SO:0002218
Added after request from Lea Starita, lea.starita@gmail.com from the NCBI Feb 2019.
functionally_abnormal
A sequence variant in which the function of a gene product is retained with respect to a reference.
david
2019-03-01T10:28:12Z
function retained variant
sequence
function_retained_variant
functionally normal
SO:0002219
Added after request from Lea Starita, lea.starita@gmail.com from the NCBI Feb 2019.
functionally_normal
A sequence variant in which the function of a gene product is unknown with respect to a reference.
david
2019-03-01T10:29:01Z
function uncertain variant
function_uncertain_variant
sequence
SO:0002220
Added after request from Lea Starita, lea.starita@gmail.com from the NCBI Feb 2019.
function_uncertain_variant
A regulatory_region including the Transcription Start Site (TSS) of a gene and serving as a platform for Pre-Initiation Complex (PIC) assembly, enabling transcription of a gene under certain conditions.
david
2019-07-31T14:01:20Z
Eukaryotic promoter
sequence
SO:0002221
eukaryotic_promoter
A regulatory_region essential for the specific initiation of transcription at a defined location in a DNA molecule, although this location might not be one single base. It is recognized by a specific RNA polymerase(RNAP)-holoenzyme, and this recognition is not necessarily autonomous.
david
2019-07-31T14:02:26Z
Prokaryotic promoter
sequence
SO:0002222
prokaryotic_promoter
A regulatory_region essential for the specific initiation of transcription at a defined location in a DNA molecule, although this location might not be one single base. It is recognized by a specific RNA polymerase(RNAP)-holoenzyme, and this recognition is not necessarily autonomous.
PMID:32665585
Sequences that decrease interactions between biological regions, such as between a promoter, its 5' context and/or the translational unit(s) it regulates. Spacers can affect regulation of translation, transcription, and other biological processes.
david
2019-09-06T19:05:52Z
doi:10.1101/584664
sequence
Inert DNA Spacer
SO:0002223
Updated by Evan Christensen on May 27, 2021 per github request https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/494
inert_DNA_spacer
Sequences that decrease interactions between biological regions, such as between a promoter, its 5' context and/or the translational unit(s) it regulates. Spacers can affect regulation of translation, transcription, and other biological processes.
PMID:20843779
PMID:24933158
PMID:27034378
PMID:28422998
doi:10.1101/584664
https://www.biorxiv.org/content/10.1101/584664v1
A region that codes for a 2A self-cleaving polypeptide region, which is a region that can result in a break in the peptide sequence at its terminal G-P junction.
david
2019-10-21T10:41:49Z
sequence
2A polypeptide region
2A self-cleaving polypeptide region
SO:0002224
Added by Dave Sant on October 21, 2019 per github request https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/475
2A_self_cleaving_peptide_region
A region that codes for a 2A self-cleaving polypeptide region, which is a region that can result in a break in the peptide sequence at its terminal G-P junction.
PMID:22301656
PMID:28526819
A conserved sequence (5'-CGNMGATCNTY-3') transcription repressor binding site required for gene repression in the presence of high zinc.
david
2019-10-30T11:19:52Z
sequence
LRE
LRE element
Loz1 response element
SO:0002225
Added on October 30, 2019 at the request of Val Wood. See GitHub Issue #476 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/476).
LOZ1_response_element
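The LOZ1_response_element consensus above uses IUPAC ambiguity codes (N, M, Y). One way to search for it is to expand those codes into a regular expression. The sketch below assumes the standard IUPAC nucleotide code table and DNA input on the given strand; iupac_to_regex and LRE are hypothetical names.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def iupac_to_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


# Consensus from the definition: 5'-CGNMGATCNTY-3'
LRE = iupac_to_regex("CGNMGATCNTY")
```

For example, LRE.search("TTCGAAGATCGTCAA") finds a match starting at index 2, while a sequence with G at the M position does not match.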
A group II intron that recognizes IBS1/EBS1 for the 5-prime exon and IBS3/EBS3 for the 3-prime exon and may also recognize a stem-loop in the RNA.
david
2020-03-27T08:56:34Z
group IIC intron
sequence
SO:0002226
group_IIC_intron
A group II intron that recognizes IBS1/EBS1 for the 5-prime exon and IBS3/EBS3 for the 3-prime exon and may also recognize a stem-loop in the RNA.
PMID:20463000
A sequence variant extending the CDS, that causes elongation of the resulting polypeptide sequence.
david
2020-03-27T17:56:30Z
CDS Extension
elongated CDS
elongated_CDS
sequence
SO:0002227
Added as per request by Edward Wallace GitHub issue #480 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/480)
CDS_extension
A sequence variant that extends the CDS, causing elongation of the resulting polypeptide sequence.
PMID:14732127
PMID:15864293
PMID:27984720
PMID:31216041
PMID:32020195
A sequence variant that extends the CDS at the 5' end, causing elongation of the resulting polypeptide sequence at the N terminus.
david
2020-03-27T17:57:30Z
CDS Extension 5 prime
CDS Extension five prime
elongated CDS five prime
elongated_CDS_five_prime
sequence
SO:0002228
Added as per request by Edward Wallace GitHub issue #480 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/480)
CDS_five_prime_extension
A sequence variant that extends the CDS at the 5' end, causing elongation of the resulting polypeptide sequence at the N terminus.
PMID:14732127
PMID:15864293
PMID:27984720
PMID:31216041
PMID:32020195
A sequence variant that extends the CDS at the 3' end, causing elongation of the resulting polypeptide sequence at the C terminus.
david
2020-03-27T17:58:30Z
CDS Extension 3 prime
CDS Extension three prime
elongated CDS three prime
elongated_CDS_three_prime
sequence
SO:0002229
Added as per request by Edward Wallace GitHub issue #480 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/480)
CDS_three_prime_extension
A sequence variant that extends the CDS at the 3' end, causing elongation of the resulting polypeptide sequence at the C terminus.
PMID:14732127
PMID:15864293
PMID:27984720
PMID:31216041
PMID:32020195
A C-terminal protein motif (CAAX) serving as a post-translational prenylation site, modified by the attachment of either a farnesyl or a geranylgeranyl group to a cysteine residue. Farnesyltransferase recognizes CaaX boxes where X = M, S, Q, A, or C, whereas geranylgeranyltransferase I recognizes CaaX boxes with X = L or E.
david
2020-03-27T18:04:30Z
CAAX box
sequence
SO:0002230
Added as per request by Val Wood GitHub issue #479 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/479)
CAAX_box
An RNA that catalyzes its own cleavage.
david
2020-03-30T16:02:30Z
self cleaving ribozyme
sequence
SO:0002231
Added as per request by John T. Sexton GitHub issue #470 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/470)
self_cleaving_ribozyme
A genetic feature that encodes a trait used for artificial selection of a subpopulation.
david
2020-04-01T10:04:30Z
selectable marker
selectable_marker
selection marker
sequence
SO:0002232
Added as per request by Bryan Bartley GitHub issues #468 and #402 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/468)
selection_marker
A chromosomal locus where complementary lncRNA and associated proteins accumulate at the corresponding lncRNA gene loci to tether homologous chromosomes during chromosome pairing at meiosis I.
david
2020-04-14T10:09:30Z
homologous chromosome recognition and pairing locus
sequence
SO:0002233
Added as per request by Val Wood GitHub issue #483 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/483)
homologous_chromosome_recognition_and_pairing_locus
A chromosomal locus where complementary lncRNA and associated proteins accumulate at the corresponding lncRNA gene loci to tether homologous chromosomes during chromosome pairing at meiosis I.
PMID:22582262
PMID:31811152
A cis-acting element involved in RNA stability, found in the 3' UTR of some RNAs (consensus UGUAAAUA).
david
2020-04-14T10:40:30Z
PRE binding RNA
pumilio response element
sequence
SO:0002234
Added as per request by Val Wood GitHub issue #455 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/455)
pumilio_response_element
A cis-acting element involved in RNA stability, found in the 3' UTR of some RNAs (consensus UGUAAAUA).
PMID:30601114
A polypeptide region that mediates binding to SUMO. The motif contains a hydrophobic core sequence consisting of three or four Ile, Leu, or Val residues plus one acidic or polar residue at position 2 or 3.
david
2020-04-22T12:40:30Z
SBM
SIM
SUMO binding motif
SUMO interaction motif
sequence
SO:0002235
Added as per request GitHub issue #434 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/434)
SUMO_interaction_motif
A polypeptide region that mediates binding to SUMO. The motif contains a hydrophobic core sequence consisting of three or four Ile, Leu, or Val residues plus one acidic or polar residue at position 2 or 3.
PMID:15388847
PMID:16524884
A gene which codes for 18S_rRNA, which functions as a component of the small subunit of the ribosome in eukaryotes.
david
2020-05-07T16:12:30Z
18S rRNA gene
18S_rRNA_gene
sequence
SO:0002236
Added as per request by Antonia Lock GitHub issue #472 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/472). Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
cytosolic_rRNA_18S_gene
A gene which codes for 16S_rRNA, which functions as a component of the small subunit of the ribosome in prokaryotes.
david
2020-05-07T16:12:30Z
16S rRNA gene
16S_rRNA_gene
sequence
SO:0002237
Added as per request by Antonia Lock GitHub issue #472 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/472). Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
cytosolic_rRNA_16S_gene
A gene which codes for 5S_rRNA, which is a portion of the large subunit of the ribosome in both eukaryotes and prokaryotes.
david
2020-05-07T16:12:30Z
5S rRNA gene
5S_rRNA_gene
sequence
SO:0002238
Added as per request by Antonia Lock GitHub issue #472 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/472). Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
cytosolic_rRNA_5S_gene
A gene which codes for 28S_rRNA, which functions as a component of the large subunit of the ribosome in eukaryotes.
david
2020-05-07T16:12:30Z
28S rRNA gene
28S_rRNA_gene
sequence
SO:0002239
Added as per request by Antonia Lock GitHub issue #472 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/472). Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
cytosolic_rRNA_28S_gene
A gene which codes for 5_8S_rRNA (5.8S rRNA), which functions as a component of the large subunit of the ribosome in eukaryotes.
david
2020-05-07T16:12:30Z
5.8S rRNA gene
5_8S rRNA gene
5_8S_rRNA_gene
sequence
SO:0002240
Added as per request by Antonia Lock GitHub issue #472 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/472). Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
cytosolic_rRNA_5_8S_gene
A gene which codes for 21S_rRNA, which functions as a component of the large subunit of the ribosome in mitochondria.
SO:0002364
david
2020-05-07T16:12:30Z
21S rRNA gene
21S_rRNA_gene
sequence
SO:0002241
Added as per request by Antonia Lock GitHub issue #472 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/472). Removed relationship derives_from SO:0001171 on 10 June 2021 when SO:0001171 rRNA_21S was obsoleted into SO:0002345 mt_LSU_rRNA. See GitHub Issue #493. OBSOLETED on 12 September 2022, merged into SO:0002364 mt_LSU_rRNA_gene; see GitHub Issue #513.
rRNA_21S_gene
true
A gene which codes for 25S_rRNA, which functions as a component of the large subunit of the ribosome in some eukaryotes.
david
2020-05-07T16:12:30Z
25S rRNA gene
25S_rRNA_gene
sequence
SO:0002242
Added as per request by Antonia Lock GitHub issue #472 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/472). Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
cytosolic_rRNA_25S_gene
A gene which codes for 23S_rRNA, which functions as a component of the large subunit of the ribosome in prokaryotes.
david
2020-05-07T16:12:30Z
23S rRNA gene
23S_rRNA_gene
sequence
SO:0002243
Added as per request by Antonia Lock GitHub issue #472 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/472). Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
cytosolic_rRNA_23S_gene
A transcript which is partially duplicated due to duplication of DNA, leading to a new transcript that is only partial and likely nonfunctional.
david
2020-05-13T09:07:30Z
partially duplicated transcript
sequence
SO:0002244
Added as per request from the Illumina group
partially_duplicated_transcript
A partially_duplicated_transcript where the 5' end of the transcript is duplicated.
david
2020-05-13T09:07:30Z
5' duplicated transcript
five prime duplicated transcript
five prime partially duplicated transcript
sequence
SO:0002245
Added as per request from the Illumina group
five_prime_duplicated_transcript
A partially_duplicated_transcript where the 3' end of the transcript is duplicated.
david
2020-05-13T09:07:30Z
3' duplicated transcript
three prime duplicated transcript
three prime partially duplicated transcript
sequence
SO:0002246
Added as per request from the Illumina group
three_prime_duplicated_transcript
A non-coding RNA less than 200 nucleotides in length.
david
2020-05-13T11:07:30Z
Small noncoding RNA
sequence
SO:0002247
Added as per request from GitHub Issue #485 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/485)
sncRNA
A non-coding RNA less than 200 nucleotides in length.
PMID:30069443
A region of DNA that a protein detection algorithm predicts to be transcribed and translated into a protein, but that is not transcribed in nature.
david
2020-05-13T11:40:30Z
spurious protein
sequence
SO:0002248
Added as per request from GitHub Issue #478 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/478)
spurious_protein
A region of DNA that a protein detection algorithm predicts to be transcribed and translated into a protein, but that is not transcribed in nature.
PMID:21771858
A CDS region corresponding to a mature protein region of a polypeptide.
david
2020-05-13T13:40:30Z
INSDC_feature:mat_peptide
mature protein region of CDS
sequence
SO:0002249
Added as per request from GitHub Issue #484 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/484)
mature_protein_region_of_CDS
A CDS region corresponding to a propeptide of a polypeptide.
david
2020-05-13T13:40:30Z
INSDC_feature:propeptide
propeptide region of CDS
sequence
SO:0002250
Added as per request from GitHub Issue #484 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/484)
propeptide_region_of_CDS
A CDS region corresponding to a signal peptide of a polypeptide.
david
2020-05-13T13:40:30Z
INSDC_feature:sig_peptide
Signal peptide region of CDS
sequence
SO:0002251
Added as per request from GitHub Issue #484 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/484)
signal_peptide_region_of_CDS
A CDS region corresponding to a transit peptide region of a polypeptide.
david
2020-05-13T13:40:30Z
INSDC_feature:transit_peptide
transit peptide region of CDS
sequence
SO:0002252
Added as per request from GitHub Issue #484 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/484)
transit_peptide_region_of_CDS
A portion of a stem loop secondary structure in RNA.
david
2020-05-13T11:40:30Z
stem loop region
sequence
SO:0002253
Added as per request from GitHub Issue #451 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/451)
stem_loop_region
The loop portion of a stem loop, which is not folded back upon itself.
david
2020-05-13T11:40:30Z
loop portion of stem loop
sequence
SO:0002254
Added as per request from GitHub Issue #451 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/451)
loop
The portion of a stem loop where the RNA is folded back upon itself.
david
2020-05-13T11:40:30Z
stem portion of stem loop
sequence
SO:0002255
Added as per request from GitHub Issue #451 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/451)
stem
A region of a stem in a stem loop structure where the sequences are non-complementary.
david
2020-05-13T11:40:30Z
non-complimentary stem
noncomplimentary stem
sequence
SO:0002256
Added as per request from GitHub Issue #451 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/451)
non_complimentary_stem
Cytologically observable heterochromatic regions of chromosomes away from centromeres that contain predominantly large tandem repeats and retrotransposons.
david
2020-05-27T10:45:30Z
Heterochromatin Knob
sequence
SO:0002257
Added as per request from GitHub Issue #487 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/487)
knob
Cytologically observable heterochromatic regions of chromosomes away from centromeres that contain predominantly large tandem repeats and retrotransposons.
PMID:6439888
A binding motif with the consensus sequence TTAGGG to which Teb1 binds.
david
2020-05-27T11:03:30Z
teb1 recognition motif
sequence
SO:0002258
Requested by Antonia Lock (Pombe), as per GitHub Issue Request #439 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/439)
teb1_recognition_motif
A binding motif with the consensus sequence TTAGGG to which Teb1 binds.
PMID:23314747
PMID:27901072
A region defined by a cluster of experimentally determined polyadenylation sites, typically less than 25 bp in length and associated with a single polyadenylation signal.
david
2020-05-27T14:17:30Z
polyA cluster
polyA site cluster
polyA_cluster
sequence
SO:0002259
Added as per GitHub Issue Request #450 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/450)
polyA_site_cluster
A region defined by a cluster of experimentally determined polyadenylation sites, typically less than 25 bp in length and associated with a single polyadenylation signal.
PMID:17202160
PMID:24072873
PMID:25906188
Large Retrotransposon Derivative (LARD) elements are long-terminal repeats that contain reverse transcriptase priming sites and are conserved in sequence but contain no open reading frames encoding typical retrotransposon proteins. The LARDs identified in barley and other Triticeae have LTRs of ~5.5 kb and an internal domain of ~3.5 kb. LARDs lack coding domains and thus do not encode proteins.
david
2020-05-27T15:47:30Z
Large Retrotransposon Derivative
large_retrotransposon_derivative
sequence
SO:0002260
Added as per GitHub Issue Request #429 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/429)
LARD
Large Retrotransposon Derivative (LARD) elements are long-terminal repeats that contain reverse transcriptase priming sites and are conserved in sequence but contain no open reading frames encoding typical retrotransposon proteins. The LARDs identified in barley and other Triticeae have LTRs of ~5.5 kb and an internal domain of ~3.5 kb. LARDs lack coding domains and thus do not encode proteins.
PMID:15082561
TRIM elements have terminal direct repeat sequences of 100-250 bp in length that flank an internal domain of 100–300 bp. TRIMs lack coding domains and thus do not encode proteins.
david
2020-05-27T15:47:30Z
terminal-repeat retrotransposons in miniature
terminal-repeat_retrotransposons_in_miniature
sequence
SO:0002261
Added as per GitHub Issue Request #429 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/429)
TRIM
TRIM elements have terminal direct repeat sequences of 100-250 bp in length that flank an internal domain of 100–300 bp. TRIMs lack coding domains and thus do not encode proteins.
PMID:11717436
An absolute reference to the strand. When a chromosome has p and q arms, the Watson strand is the strand whose 5'-end is on the short arm of the chromosome. Note that 'plus strand' is defined relative to a reference sequence, in which the plus strand is preferably, but not necessarily, the Watson strand; 'plus strand' is therefore not an exact synonym.
david
2020-05-28T10:33:30Z
Plus strand
Forward strand
Top strand
Watson strand
sequence
SO:0002262
Added as per GitHub Issue Request #419 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/419)
Watson_strand
An absolute reference to the strand. When a chromosome has p and q arms, the Watson strand is the strand whose 5'-end is on the short arm of the chromosome. Note that 'plus strand' is defined relative to a reference sequence, in which the plus strand is preferably, but not necessarily, the Watson strand; 'plus strand' is therefore not an exact synonym.
PMID:21303550
An absolute reference to the strand. When a chromosome has p and q arms, the Crick strand is the strand whose 5'-end is on the long arm of the chromosome. Note that 'minus strand' is defined relative to a reference sequence, in which the minus strand is preferably, but not necessarily, the Crick strand; 'minus strand' is therefore not an exact synonym.
david
2020-05-28T10:33:30Z
Minus strand
Bottom strand
Crick strand
Reverse strand
sequence
SO:0002263
Added as per GitHub Issue Request #419 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/419)
Crick_strand
An absolute reference to the strand. When a chromosome has p and q arms, the Crick strand is the strand whose 5'-end is on the long arm of the chromosome. Note that 'minus strand' is defined relative to a reference sequence, in which the minus strand is preferably, but not necessarily, the Crick strand; 'minus strand' is therefore not an exact synonym.
PMID:21303550
LTR retrotransposons in the Copia superfamily contain elements coding for specific proteins in this order: GAG, AP, INT, RT, RH. GAG is a structural protein for virus-like particles. AP is aspartic proteinase. INT is a DDE integrase. RT is a reverse transcriptase. RH is RNase H.
david
2020-06-25T14:00:30Z
Copia LTR retrotransposon
RLC retrotransposon
Ty1 retrotransposon
sequence
SO:0002264
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
Copia_LTR_retrotransposon
LTR retrotransposons in the Copia superfamily contain elements coding for specific proteins in this order: GAG, AP, INT, RT, RH. GAG is a structural protein for virus-like particles. AP is aspartic proteinase. INT is a DDE integrase. RT is a reverse transcriptase. RH is RNase H.
PMID:17984973
LTR retrotransposons in the Gypsy superfamily contain elements coding for specific proteins in this order: GAG, AP, RT, RH, INT. GAG is a structural protein for virus-like particles. AP is aspartic proteinase. INT is a DDE integrase. RT is a reverse transcriptase. RH is RNase H.
david
2020-06-25T14:00:30Z
Gypsy LTR retrotransposon
RLG retrotransposon
Ty3 retrotransposon
sequence
SO:0002265
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
Gypsy_LTR_retrotransposon
LTR retrotransposons in the Gypsy superfamily contain elements coding for specific proteins in this order: GAG, AP, RT, RH, INT. GAG is a structural protein for virus-like particles. AP is aspartic proteinase. INT is a DDE integrase. RT is a reverse transcriptase. RH is RNase H.
PMID:17984973
LTR retrotransposons in the Bel-Pao superfamily are similar to LTR retrotransposons in the Gypsy and Retrovirus superfamilies. Mainly described in metazoan genomes, this superfamily contains elements coding for specific proteins in this order: GAG, AP, RT, RH, and INT. GAG is a structural protein for virus-like particles. AP is aspartic proteinase. INT is a DDE integrase. RT is a reverse transcriptase. RH is RNase H.
david
2020-06-25T14:00:30Z
Bel Pao LTR retrotransposon
Bel-Pao LTR retrotransposon
RLB retrotransposon
sequence
SO:0002266
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
Bel_Pao_LTR_retrotransposon
LTR retrotransposons in the Bel-Pao superfamily are similar to LTR retrotransposons in the Gypsy and Retrovirus superfamilies. Mainly described in metazoan genomes, this superfamily contains elements coding for specific proteins in this order: GAG, AP, RT, RH, and INT. GAG is a structural protein for virus-like particles. AP is aspartic proteinase. INT is a DDE integrase. RT is a reverse transcriptase. RH is RNase H.
PMID:17984973
LTR retrotransposons in the retrovirus superfamily are similar to LTR retrotransposons in the Gypsy and Bel-Pao superfamilies. Mainly described in vertebrate animals, this superfamily contains elements coding for specific proteins in this order: GAG, AP, RT, RH, INT, and ENV. GAG is a structural protein for virus-like particles. AP is aspartic proteinase. INT is a DDE integrase. RT is a reverse transcriptase. RH is RNase H. ENV is envelope protein.
david
2020-06-25T14:00:30Z
RLR retrotransposon
Retrovirus LTR retrotransposon
sequence
SO:0002267
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
Retrovirus_LTR_retrotransposon
LTR retrotransposons in the retrovirus superfamily are similar to LTR retrotransposons in the Gypsy and Bel-Pao superfamilies. Mainly described in vertebrate animals, this superfamily contains elements coding for specific proteins in this order: GAG, AP, RT, RH, INT, and ENV. GAG is a structural protein for virus-like particles. AP is aspartic proteinase. INT is a DDE integrase. RT is a reverse transcriptase. RH is RNase H. ENV is envelope protein.
PMID:17984973
Endogenous retrovirus (ERV) retrotransposons are abundant in the genomes of jawed vertebrates. Human ERVs (HERVs) are classified based on their homologies to animal retroviruses. Class I families are similar in sequence to mammalian Gammaretroviruses (Type C) and Epsilonretroviruses (Type E). Class II families show homology to mammalian Betaretroviruses (Type B) and Deltaretroviruses (Type D). F-Class III families are similar to foamy viruses.
david
2020-06-25T14:00:30Z
Endogenous Retrovirus LTR retrotransposon
HERV
RLE retrotransposon
sequence
SO:0002268
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
Endogenous_Retrovirus_LTR_retrotransposon
Endogenous retrovirus (ERV) retrotransposons are abundant in the genomes of jawed vertebrates. Human ERVs (HERVs) are classified based on their homologies to animal retroviruses. Class I families are similar in sequence to mammalian Gammaretroviruses (Type C) and Epsilonretroviruses (Type E). Class II families show homology to mammalian Betaretroviruses (Type B) and Deltaretroviruses (Type D). F-Class III families are similar to foamy viruses.
PMID:17984973
R2 retrotransposons are LINE elements (SO:0000194) that insert site-specifically into the host organism's 28S ribosomal RNA (rRNA) genes.
david
2020-06-25T14:00:30Z
R2 LINE retrotransposon
R2 retrotransposon
RIR retrotransposon
sequence
SO:0002269
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
R2_LINE_retrotransposon
R2 retrotransposons are LINE elements (SO:0000194) that insert site-specifically into the host organism's 28S ribosomal RNA (rRNA) genes.
PMID:21734471
RTE retrotransposons are LINE elements (SO:0000194) that contain a domain with homology to the apurinic-apyrimidinic (AP) endonucleases in addition to the previously identified reverse transcriptase domain.
david
2020-06-25T14:00:30Z
RIT retrotransposon
RTE LINE retrotransposon
RTE retrotransposon
sequence
SO:0002270
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
RTE_LINE_retrotransposon
RTE retrotransposons are LINE elements (SO:0000194) that contain a domain with homology to the apurinic-apyrimidinic (AP) endonucleases in addition to the previously identified reverse transcriptase domain.
PMID:9729877
Jockey retrotransposons are LINE elements (SO:0000194) found only in arthropods. The full-length element is ~5 kb and contains two open reading frames (SO:0000236), ORF1 (568 aa) and ORF2 (916 aa), the second of which encodes an apurinic endonuclease (APE) and a reverse transcriptase (RT).
david
2020-06-25T14:00:30Z
Jockey LINE retrotransposon
LINE Jockey element
RIJ retrotransposon
sequence
SO:0002271
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
Jockey_LINE_retrotransposon
Jockey retrotransposons are LINE elements (SO:0000194) found only in arthropods. The full-length element is ~5 kb and contains two open reading frames (SO:0000236), ORF1 (568 aa) and ORF2 (916 aa), the second of which encodes an apurinic endonuclease (APE) and a reverse transcriptase (RT).
PMID:31709017
Long interspersed element-1 (LINE-1) elements are found in the human genome and contain ORF1 (open reading frame 1, including CC, coiled coil; RRM, RNA recognition motif; CTD, carboxyl-terminal domain) and ORF2 (including EN, endonuclease; RT, reverse transcriptase; C, cysteine-rich domain). The L1-encoded proteins (ORF1p and ORF2p) can mobilize nonautonomous retrotransposons, other noncoding RNAs, and messenger RNAs.
david
2020-06-25T14:00:30Z
L1 LINE retrotransposon
L1 element
LINE 1 element
LINE-1 element
sequence
SO:0002272
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
L1_LINE_retrotransposon
Long interspersed element-1 (LINE-1) elements are found in the human genome and contain ORF1 (open reading frame 1, including CC, coiled coil; RRM, RNA recognition motif; CTD, carboxyl-terminal domain) and ORF2 (including EN, endonuclease; RT, reverse transcriptase; C, cysteine-rich domain). The L1-encoded proteins (ORF1p and ORF2p) can mobilize nonautonomous retrotransposons, other noncoding RNAs, and messenger RNAs.
PMID:31709017
Elements of the LINE I superfamily are similar to those of the Jockey and L1 superfamilies. They contain two ORFs, the second of which includes apurinic endonuclease (APE) and reverse transcriptase (RT). The I superfamily encodes an RH (RNase H) domain downstream of the RT domain.
david
2020-06-25T14:00:30Z
https://www.sciencedirect.com/topics/biochemistry-genetics-and-molecular-biology/long-interspersed-nuclear-element
I LINE retrotransposon
LINE I element
RII retrotransposon
sequence
SO:0002273
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
I_LINE_retrotransposon
Short interspersed elements that originated from tRNAs.
david
2020-06-25T14:00:30Z
RST retrotransposon
tRNA SINE element
tRNA SINE retrotransposon
sequence
SO:0002274
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
tRNA_SINE_retrotransposon
Short interspersed elements that originated from tRNAs.
PMID:21673742
Short interspersed elements that originated from 7SL RNAs.
david
2020-06-25T14:00:30Z
7SL SINE element
7SL SINE retrotransposon
RSL retrotransposon
sequence
SO:0002275
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
7SL_SINE_retrotransposon
Short interspersed elements that originated from 7SL RNAs.
PMID:21673742
Short interspersed elements that originated from 5S rRNAs.
david
2020-06-25T14:00:30Z
5S SINE element
5S SINE retrotransposon
RSS retrotransposon
sequence
SO:0002276
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
5S_SINE_retrotransposon
Short interspersed elements that originated from 5S rRNAs.
PMID:21673742
Crypton is a superfamily of DNA transposons that use tyrosine recombinase (YR) to cut and rejoin the recombining DNA molecules.
david
2020-06-25T14:00:30Z
Crypton YR transposon
Crypton transposon
DYC transposon
sequence
SO:0002277
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
Crypton_YR_transposon
Crypton is a superfamily of DNA transposons that use tyrosine recombinase (YR) to cut and rejoin the recombining DNA molecules.
PMID:22011512
Elements of the Tc1-Mariner terminal inverted repeat transposon superfamily (also called mariner transposons) are named after the Transposon of C. elegans number 1 transposase. Their activity creates a 2-bp (TA) target-site duplication (TSD). Stowaway is the non-autonomous element in this superfamily, usually shorter than 600 bp.
david
2020-06-25T14:00:30Z
DTT transposon
Mariner
Stowaway
Tc1 Mariner TIR transposon
Tc1 transposon
TcMar-Stowaway transposon
sequence
SO:0002278
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
Tc1_Mariner_TIR_transposon
Elements of the Tc1-Mariner terminal inverted repeat transposon superfamily (also called mariner transposons) are named after the Transposon of C. elegans number 1 transposase. Their activity creates a 2-bp (TA) target-site duplication (TSD). Stowaway is the non-autonomous element in this superfamily, usually shorter than 600 bp.
PMID:17984973
PMID:8556864
The hAT terminal inverted repeat transposon superfamily elements were first found in maize (the Ac/Ds elements). Members of the hAT superfamily have TSDs of 8 bp, relatively short TIRs of 5–27 bp and overall lengths of less than 4 kb.
david
2020-06-25T14:00:30Z
Ac transposon
Ac/Ds transposon
DTA transposon
Ds transposon
hAT TIR transposon
hAT transposon
hAT-Ac transposon
sequence
SO:0002279
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
hAT_TIR_transposon
The hAT terminal inverted repeat transposon superfamily elements were first found in maize (the Ac/Ds elements). Members of the hAT superfamily have TSDs of 8 bp, relatively short TIRs of 5–27 bp and overall lengths of less than 4 kb.
PMID:11454746
Members of the Mutator family of terminal inverted repeat (TIR) transposons are usually long but are also highly divergent, either sharing only terminal G…C nucleotides, or with the G…C nucleotides absent. The length of the TSD (7-11 bp, usually 9 bp) remains probably the most useful criterion for identification.
david
2020-06-25T14:00:30Z
DTM transposon
MLE transposon
MULE
Mu transposon
MuDR
Mutator TIR transposon
Mutator transposon
sequence
SO:0002280
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
Mutator_TIR_transposon
Members of the Mutator family of terminal inverted repeat (TIR) transposons are usually long but are also highly divergent, either sharing only terminal G…C nucleotides, or with the G…C nucleotides absent. The length of the TSD (7-11 bp, usually 9 bp) remains probably the most useful criterion for identification.
PMID:17984973
Elements of the Merlin terminal inverted repeat (TIR) transposon superfamily create 8-9 bp target-site duplications (TSD).
david
2020-06-25T14:00:30Z
DTE transposon
Merlin TIR transposon
Merlin transposon
sequence
SO:0002281
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
Merlin_TIR_transposon
Elements of the Merlin terminal inverted repeat (TIR) transposon superfamily create 8-9 bp target-site duplications (TSD).
PMID:17984973
Terminal inverted repeat (TIR) transposons of the superfamily Transib contain the DDE motif, which is related to the RAG1 protein involved in V(D)J recombination.
david
2020-06-25T14:00:30Z
DTR transposon
Transib TIR transposon
transib transposon
sequence
SO:0002282
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
Transib_TIR_transposon
Terminal inverted repeat (TIR) transposons of the superfamily Transib contain the DDE motif, which is related to the RAG1 protein involved in V(D)J recombination.
PMID:17984973
Primarily found in animals, the terminal inverted repeat (TIR) transposon superfamily piggyBac elements favour insertion adjacent to TTAA.
david
2020-06-25T14:00:30Z
DTB transposon
PiggyBac transposable element
piggyBac TIR transposon
sequence
SO:0002283
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
piggyBac_TIR_transposon
Primarily found in animals, the terminal inverted repeat (TIR) transposon superfamily piggyBac elements favour insertion adjacent to TTAA.
PMID:17984973
Terminal inverted repeat transposons in the PIF/Harbinger/Tourist superfamily create 3-bp target site duplications that are mainly 'TAA' or 'TTA'. The autonomous PIF-Harbinger elements are relatively small in size, usually a few kb in length. Non-autonomous elements in this superfamily, usually shorter than 600 bp, are referred to as Tourist elements. The terminal sequences for PIF/Harbinger/Tourist elements are 'GGG/CCC…GGC/GCC' or 'GA/GGCA…TGCC/TC'.
david
2020-06-25T14:00:30Z
DTH transposon
Harbinger transposon
PIF Harbinger TIR transposon
PIF transposon
Tourist transposon element
sequence
SO:0002284
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
PIF_Harbinger_TIR_transposon
Terminal inverted repeat transposons in the PIF/Harbinger/Tourist superfamily create 3-bp target site duplications that are mainly 'TAA' or 'TTA'. The autonomous PIF-Harbinger elements are relatively small in size, usually a few kb in length. Non-autonomous elements in this superfamily, usually shorter than 600 bp, are referred to as Tourist elements. The terminal sequences for PIF/Harbinger/Tourist elements are 'GGG/CCC…GGC/GCC' or 'GA/GGCA…TGCC/TC'.
PMID:26709091
Terminal inverted repeat (TIR) transposons of the CACTA superfamily generate 3-bp target site duplications (TSD) upon insertion. CACTA elements do not have a significant preference for genic region insertions. The superfamily is named CACTA because its terminal sequences are 'CACTA/G…C/TAGTG'.
david
2020-06-25T14:00:30Z
CACTA TIR transposon
CACTA transposon element
CACTC transposon
CMC-EnSpm transposon
DTC transposon
En transposon
En-Spm transposon
EnSpm transposon
Spm transposon
dSpm transposon
sequence
SO:0002285
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
CACTA_TIR_transposon
Terminal inverted repeat (TIR) transposons of the CACTA superfamily generate 3-bp target site duplications (TSD) upon insertion. CACTA elements do not have a significant preference for genic region insertions. The superfamily is named CACTA because its terminal sequences are 'CACTA/G…C/TAGTG'.
PMID:26709091
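As an illustration only, the target-site duplication (TSD) lengths quoted in the TIR transposon definitions above can be gathered into a small lookup. The Python sketch below assumes hypothetical names (TSD_RANGES, candidate_superfamilies) that are not part of the ontology, and it covers only the superfamilies whose definitions state a TSD length. A TSD length alone is not enough to classify an element; terminal sequences and other criteria still apply.

```python
# Hypothetical helper, not part of the Sequence Ontology.
# TSD length ranges (in bp) are copied from the term definitions above.
TSD_RANGES = {
    "Mutator_TIR_transposon": (7, 11),       # SO:0002280, usually 9 bp
    "Merlin_TIR_transposon": (8, 9),         # SO:0002281
    "PIF_Harbinger_TIR_transposon": (3, 3),  # SO:0002284
    "CACTA_TIR_transposon": (3, 3),          # SO:0002285
}


def candidate_superfamilies(tsd_length):
    """Return the term labels whose stated TSD range includes tsd_length."""
    return sorted(
        label
        for label, (shortest, longest) in TSD_RANGES.items()
        if shortest <= tsd_length <= longest
    )
```

A 3-bp TSD, for example, is compatible with both PIF/Harbinger and CACTA elements, so the function returns a list of candidates rather than a single answer.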
Tyrosine recombinase (YR) retrotransposons are a subclass of non-LTR retrotransposons. These YR-encoding elements consist of central gag, pol and tyrosine recombinase (YR) open reading frames (ORFs) flanked by terminal repeats. The pol ORF includes a reverse transcriptase (RT), an RNase H (RH) and, in the case of DIRS, a domain similar to bacterial and phage DNA N-6-adenine-methyltransferase (MT). Compared to the retroviral pol (LTR retrotransposons, non-LTR retrotransposons and Penelope elements), both aspartic protease and DDE integrase are absent from YR retrotransposons. YR retrotransposons have inverted terminal repeats (ITRs).
david
2020-06-25T14:00:30Z
YR retrotransposon
tyrosine kinase retrotransposon
sequence
SO:0002286
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
YR_retrotransposon
Tyrosine recombinase (YR) retrotransposons are a subclass of non-LTR retrotransposons. These YR-encoding elements consist of central gag, pol and tyrosine recombinase (YR) open reading frames (ORFs) flanked by terminal repeats. The pol ORF includes a reverse transcriptase (RT), an RNase H (RH) and, in the case of DIRS, a domain similar to bacterial and phage DNA N-6-adenine-methyltransferase (MT). Compared to the retroviral pol (LTR retrotransposons, non-LTR retrotransposons and Penelope elements), both aspartic protease and DDE integrase are absent from YR retrotransposons. YR retrotransposons have inverted terminal repeats (ITRs).
PMID:24086727
Dictyostelium intermediate repeat sequence (DIRS) retrotransposons are members of the YR_retrotransposon (SO:0002286) superfamily with the following protein domains: RT, RH, YR, and MT. RT is a reverse transcriptase. RH is RNAse H. YR is tyrosine recombinase. MT is DNA N-6-adenine-methyltransferase.
david
2020-06-25T14:00:30Z
DIRS YR retrotransposon
DIRS retrotransposon
RYD retrotransposon
sequence
SO:0002287
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
DIRS_YR_retrotransposon
Dictyostelium intermediate repeat sequence (DIRS) retrotransposons are members of the YR_retrotransposon (SO:0002286) superfamily with the following protein domains: RT, RH, YR, and MT. RT is a reverse transcriptase. RH is RNAse H. YR is tyrosine recombinase. MT is DNA N-6-adenine-methyltransferase.
PMID:24086727
Ngaro retrotransposons are members of the YR_retrotransposon (SO:0002286) superfamily with the following protein domains: RT, RH, YR. RT is a reverse transcriptase. RH is RNAse H. YR is Tyrosine recombinase. Inverted terminal repeats (ITRs) in Ngaro are arranged in A-pol-B-A-B order where A and B represent ITRs.
david
2020-06-25T14:00:30Z
Ngaro YR retrotransposon
Ngaro retrotransposon
RYN retrotransposon
sequence
SO:0002288
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
Ngaro_YR_retrotransposon
Ngaro retrotransposons are members of the YR_retrotransposon (SO:0002286) superfamily with the following protein domains: RT, RH, YR. RT is a reverse transcriptase. RH is RNAse H. YR is Tyrosine recombinase. Inverted terminal repeats (ITRs) in Ngaro are arranged in A-pol-B-A-B order where A and B represent ITRs.
PMID:24086727
VIPER retrotransposons are members of the YR_retrotransposon (SO:0002286) superfamily with protein domains: RT, RH, YR. RT is a reverse transcriptase. RH is RNAse H. YR is Tyrosine recombinase. Inverted terminal repeats (ITRs) in VIPER are arranged in A-pol-B-A-B order where A and B represent ITRs. VIPER is only found in kinetoplastida genomes.
david
2020-06-25T14:00:30Z
RYV retrotransposon
Viper YR retrotransposon
Viper retrotransposon
sequence
SO:0002289
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
Viper_YR_retrotransposon
VIPER retrotransposons are members of the YR_retrotransposon (SO:0002286) superfamily with protein domains: RT, RH, YR. RT is a reverse transcriptase. RH is RNAse H. YR is Tyrosine recombinase. Inverted terminal repeats (ITRs) in VIPER are arranged in A-pol-B-A-B order where A and B represent ITRs. VIPER is only found in kinetoplastida genomes.
PMID:16297462
Penelope is a subclass of non_LTR_retrotransposons (SO:0000189). Penelope retrotransposons contain the structural features TR, RT, EN, TR; the terminal repeats (TR) can be in tandem or inverse orientation in different Penelope copies. RT is reverse transcriptase. EN is endonuclease.
david
2020-06-25T14:00:30Z
Penelope retrotransposon
RPP retrotransposon
sequence
SO:0002290
Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488)
Penelope_retrotransposon
Penelope is a subclass of non_LTR_retrotransposons (SO:0000189). Penelope retrotransposons contain the structural features TR, RT, EN, TR; the terminal repeats (TR) can be in tandem or inverse orientation in different Penelope copies. RT is reverse transcriptase. EN is endonuclease.
PMID:23914310
A non-coding RNA that is generated by backsplicing of exons or introns, resulting in a covalently closed loop without a 5’ cap or 3’ polyA tail.
david
2020-07-01T11:49:30Z
circRNA
circular ncRNA
noncoding circRNA
sequence
SO:0002291
Added as per GitHub Issue Request #490 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/490) and GitHub Issue Request #391 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/391)
circular_ncRNA
A non-coding RNA that is generated by backsplicing of exons or introns, resulting in a covalently closed loop without a 5’ cap or 3’ polyA tail.
PMID:29086764
PMID:29182528
PMID:29230098
PMID:29576969
PMID:29626935
An mRNA that is generated by backsplicing of exons or introns, resulting in a covalently closed loop without a 5’ cap or 3’ polyA tail.
david
2020-07-01T11:49:30Z
circular mRNA
coding circRNA
sequence
SO:0002292
Added as per GitHub Issue Request #490 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/490) and GitHub Issue Request #391 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/391)
circular_mRNA
An mRNA that is generated by backsplicing of exons or introns, resulting in a covalently closed loop without a 5’ cap or 3’ polyA tail.
PMID:29086764
PMID:29182528
PMID:29576969
The non-coding region of the mitochondrial genome that controls RNA and DNA synthesis.
david
2020-07-01T16:40:30Z
https://en.wikipedia.org/wiki/MtDNA_control_region
Mitochondrial A+T region
Mitochondrial DNA control region
Mitochondrial NCR
Mitochondrial noncoding region
MtDNA control region
MtDNA_control_region
sequence
SO:0002293
Added as per GitHub Issue Request #417 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/417) from INSDC
mitochondrial_control_region
The non-coding region of the mitochondrial genome that controls RNA and DNA synthesis.
PMID:19407924
PMID:10968878
https://en.wikipedia.org/wiki/MtDNA_control_region
wiki
Mitochondrial displacement loop; a region within mitochondrial DNA in which a short stretch of RNA is paired with one strand of DNA, displacing the original partner DNA strand in this region.
david
2020-07-08T11:49:30Z
http://en.wikipedia.org/wiki/D_loop
Mitochondrial D loop
Mitochondrial displacement loop
sequence
SO:0002294
Added as per request by Terence Murphy (INSDC) for GitHub Issue #417 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/417)
mitochondrial_D_loop
Mitochondrial displacement loop; a region within mitochondrial DNA in which a short stretch of RNA is paired with one strand of DNA, displacing the original partner DNA strand in this region.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/D_loop
wiki
A TF_binding_site that is involved in regulation of expression.
david
2020-08-05T11:49:30Z
TFRS
transcription factor regulatory site
sequence
SO:0002295
Added as per Mejia-Almonte et al. PMID:32665585
transcription_factor_regulatory_site
A TF_binding_site that is involved in regulation of expression.
Bacterial_regulation_working_group:CMA
PMID:32665585
The possibly discontinuous stretch of DNA that is the combination of one or several TFRSs whose bound TFs work jointly in the regulation of a promoter.
david
2020-08-05T11:49:30Z
TFRS module
TFRS phrase
transcription factor regulatory site module
transcription factor regulatory site phrase
sequence
SO:0002296
Added as per Mejia-Almonte et al. PMID:32665585. Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527.
TFRS_module
The possibly discontinuous stretch of DNA that is the combination of one or several TFRSs whose bound TFs work jointly in the regulation of a promoter.
Bacterial_regulation_working_group:CMA
PMID:32665585
The possibly discontinuous stretch of DNA that encompasses all the TFRSs that regulate a promoter.
david
2020-08-05T11:49:30Z
TFRS collection
transcription factor regulatory site collection
sequence
SO:0002297
Added as per Mejia-Almonte et al. PMID:32665585. Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527.
TFRS_collection
The possibly discontinuous stretch of DNA that encompasses all the TFRSs that regulate a promoter.
Bacterial_regulation_working_group:CMA
PMID:32665585
An operon whose transcription is coordinated on a single transcription unit.
david
2020-08-05T11:49:30Z
simple operon
sequence
SO:0002298
Added as per Mejia-Almonte et al. PMID:32665585
simple_operon
An operon whose transcription is coordinated on a single transcription unit.
Bacterial_regulation_working_group:CMA
PMID:32665585
An operon whose transcription is coordinated on several mutually overlapping transcription units transcribed in the same direction and sharing at least one gene.
david
2020-08-05T11:49:30Z
complex operon
sequence
SO:0002299
Added as per Mejia-Almonte et al. PMID:32665585
complex_operon
An operon whose transcription is coordinated on several mutually overlapping transcription units transcribed in the same direction and sharing at least one gene.
Bacterial_regulation_working_group:CMA
PMID:32665585
Transcription units or transcribed coding sequences.
david
2020-08-05T11:49:30Z
unit of gene expression
sequence
SO:0002300
Added as per Mejia-Almonte et al. PMID:32665585
unit_of_gene_expression
Transcription units or transcribed coding sequences.
Bacterial_regulation_working_group:CMA
PMID:32665585
DNA regions delimited by different nonspurious TSS-TTS pairs.
david
2020-08-05T11:49:30Z
transcription unit
sequence
SO:0002301
Added as per Mejia-Almonte et al. PMID:32665585
transcription_unit
DNA regions delimited by different nonspurious TSS-TTS pairs.
Bacterial_regulation_working_group:CMA
PMID:32665585
A regulon defined by considering one regulatory gene product.
david
2020-08-05T11:49:30Z
simple regulon
sequence
SO:0002302
Added as per Mejia-Almonte et al. PMID:32665585
simple_regulon
A regulon defined by considering one regulatory gene product.
Bacterial_regulation_working_group:CMA
PMID:32665585
A regulon defined by considering the units of expression regulated by a specified set of regulatory gene products.
david
2020-08-05T11:49:30Z
complex regulon
sequence
SO:0002303
Added as per Mejia-Almonte et al. PMID:32665585
complex_regulon
A regulon defined by considering the units of expression regulated by a specified set of regulatory gene products.
Bacterial_regulation_working_group:CMA
PMID:32665585
An instance of a self-interacting DNA region flanked by left and right TAD boundaries.
david
2020-08-12T14:01:30Z
TAD
topologically associated domain
sequence
SO:0002304
Added by Dave to be consistent with other ontologies updated with GREEKC initiative.
topologically_associated_domain
An instance of a self-interacting DNA region flanked by left and right TAD boundaries.
GREEKC:cl
PMID:32782014
A DNA region enriched in DNA loop anchors and across which DNA loops occur less often than expected by chance.
david
2020-08-12T14:01:30Z
TAD boundary
TAD_boundary
topologically associated domain boundary
sequence
SO:0002305
Added by Dave to be consistent with other ontologies updated with GREEKC initiative.
topologically_associated_domain_boundary
A DNA region enriched in DNA loop anchors and across which DNA loops occur less often than expected by chance.
GREEKC:cl
PMID:32782014
A region of a chromosome where regulatory events occur, including epigenetic modifications. These epigenetic modifications can include nucleosome modifications and post-replicational DNA modifications.
david
2020-08-12T14:01:30Z
chromatin regulatory region
sequence
SO:0002306
Added by Dave to be consistent with other ontologies updated with GREEKC initiative.
chromatin_regulatory_region
A region of a chromosome where regulatory events occur, including epigenetic modifications. These epigenetic modifications can include nucleosome modifications and post-replicational DNA modifications.
GREEKC:cl
PMID:32782014
A region of DNA between two loop anchor positions that are held in close physical proximity.
david
2020-08-12T14:01:30Z
DNA loop
sequence
SO:0002307
Added by Dave to be consistent with other ontologies updated with GREEKC initiative. DS updated definition Feb 16, 2021. See GitHub Issue #534.
DNA_loop
A region of DNA between two loop anchor positions that are held in close physical proximity.
GREEKC:cl
PMID:32782014
The ends of a DNA loop where the two strands of DNA are held in close physical proximity. During interphase the anchors of DNA loops are convergently oriented CTCF binding sites.
david
2020-08-12T14:01:30Z
DNA loop anchor
sequence
SO:0002308
Added by Dave to be consistent with other ontologies updated with GREEKC initiative. DS updated definition Feb 16, 2021. See GitHub Issue #534.
DNA_loop_anchor
The ends of a DNA loop where the two strands of DNA are held in close physical proximity. During interphase the anchors of DNA loops are convergently oriented CTCF binding sites.
GREEKC:cl
PMID:32782014
An element that always exists within the promoter region of a gene. When multiple transcripts exist for a gene, the separate transcripts may have separate core_promoter_elements.
david
2020-08-12T14:01:30Z
core promoter element
sequence
SO:0002309
Added by Dave to be consistent with other ontologies updated with GREEKC initiative.
core_promoter_element
An element that always exists within the promoter region of a gene. When multiple transcripts exist for a gene, the separate transcripts may have separate core_promoter_elements.
GREEKC:rl
The promoter of a cryptic gene.
david
2020-08-12T14:01:30Z
cryptic promoter
sequence
SO:0002310
Added by Dave to be consistent with other ontologies updated with GREEKC initiative.
cryptic_promoter
The promoter of a cryptic gene.
GREEKC:cl
A regulatory_region including the Transcription Start Site (TSS) of a gene found in genes of viruses.
david
2020-08-12T14:01:30Z
viral promoter
sequence
SO:0002311
viral_promoter
A regulatory_region including the Transcription Start Site (TSS) of a gene found in genes of viruses.
GREEKC:cl
An element that always exists within the promoter region of a prokaryotic gene.
david
2020-08-12T14:01:30Z
core prokaryotic promoter element
sequence
general transcription factor binding site
SO:0002312
core_prokaryotic_promoter_element
An element that always exists within the promoter region of a prokaryotic gene.
GREEKC:rl
An element that always exists within the promoter region of a viral gene.
david
2020-08-12T14:01:30Z
core viral promoter element
sequence
general transcription factor binding site
SO:0002313
core_viral_promoter_element
An element that always exists within the promoter region of a viral gene.
GREEKC:rl
A sequence variant that alters the level or amount of gene product produced. This high level term can be applied where the direction of level change (increased vs decreased gene product level) is unknown or not confirmed.
david
2020-12-18T22:35:30Z
altered gene product level
altered transcription level
altered_transcription_level
sequence
SO:0002314
Added as per request from Ang Roberts as part of GenCC November 2020. See Issue Request #501 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/501). Updated definition 17 Feb 2023 along with updates from GenCC. See Issue Request #612.
altered_gene_product_level
A sequence variant that alters the level or amount of gene product produced. This high level term can be applied where the direction of level change (increased vs decreased gene product level) is unknown or not confirmed.
GenCC:AR
A variant that increases the level or amount of gene product produced.
david
2020-12-18T22:35:30Z
increased gene product level
increased transcription level
increased_transcription_level
sequence
SO:0002315
Added as per request from Ang Roberts as part of GenCC November 2020. See Issue Request #501 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/501). Updated definition 17 Feb 2023 along with updates from GenCC. See Issue Request #612.
increased_gene_product_level
A variant that increases the level or amount of gene product produced.
GenCC:AR
A sequence variant that decreases the level or amount of gene product produced.
david
2020-12-18T22:35:30Z
decreased gene product level
decreased transcription level
decreased_transcription_level
reduced gene product level
reduced transcription level
reduced_gene_product_level
reduced_transcription_level
sequence
SO:0002316
Added as per request from Ang Roberts as part of GenCC November 2020. See Issue Request #501 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/501). Updated definition 17 Feb 2023 along with updates from GenCC. See Issue Request #612.
decreased_gene_product_level
A sequence variant that decreases the level or amount of gene product produced.
GenCC:AR
A sequence variant that results in no gene product.
david
2020-12-18T22:35:30Z
absent gene product
sequence
SO:0002317
Added as per request from Ang Roberts as part of GenCC November 2020. See Issue Request #501 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/501). Updated definition 17 Feb 2023 along with updates from GenCC. See Issue Request #612.
absent_gene_product
A sequence variant that results in no gene product.
GenCC:AR
A sequence variant that alters the sequence of a gene product.
david
2020-12-18T22:35:30Z
altered gene product structure
sequence
SO:0002318
Added as per request from Ang Roberts as part of GenCC November 2020. See Issue Request #501 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/501)
altered_gene_product_sequence
A sequence variant that alters the sequence of a gene product.
GenCC:AR
A sequence variant that leads to a change in the location of a termination codon in a transcript that leads to nonsense-mediated decay (NMD). The change in location of a termination codon can be caused by several different types of sequence variants, including stop_gained (SO:0001587), frameshift_variant (SO:0001589), splice_donor_variant (SO:0001575), and splice_acceptor_variant (SO:0001574) types of variants.
david
2020-12-30T17:12:30Z
NMD triggering variant
nonsense-mediated decay triggering variant
sequence
SO:0002319
Added as per request from Ang Roberts as part of GenCC November 2020.
NMD_triggering_variant
A sequence variant that leads to a change in the location of a termination codon in a transcript that leads to nonsense-mediated decay (NMD). The change in location of a termination codon can be caused by several different types of sequence variants, including stop_gained (SO:0001587), frameshift_variant (SO:0001589), splice_donor_variant (SO:0001575), and splice_acceptor_variant (SO:0001574) types of variants.
GenCC:AR
A sequence variant that leads to a change in the location of a termination codon in a transcript but allows the transcript to escape nonsense-mediated decay (NMD). The change in location of a termination codon can be caused by several different types of sequence variants, including stop_gained (SO:0001587), frameshift_variant (SO:0001589), splice_donor_variant (SO:0001575), and splice_acceptor_variant (SO:0001574) types of variants.
david
2020-12-30T17:12:30Z
NMD escaping variant
nonsense-mediated decay escaping variant
sequence
SO:0002320
Added as per request from Ang Roberts as part of GenCC November 2020.
NMD_escaping_variant
A sequence variant that leads to a change in the location of a termination codon in a transcript but allows the transcript to escape nonsense-mediated decay (NMD). The change in location of a termination codon can be caused by several different types of sequence variants, including stop_gained (SO:0001587), frameshift_variant (SO:0001589), splice_donor_variant (SO:0001575), and splice_acceptor_variant (SO:0001574) types of variants.
GenCC:AR
A stop_gained (SO:0001587) variant that is degraded by nonsense-mediated decay (NMD).
david
2020-12-30T17:12:30Z
stop gained variant-nonsense-mediated decay triggering
stop gained-NMD triggering
sequence
SO:0002321
Added as per request from Ang Roberts as part of GenCC November 2020.
stop_gained_NMD_triggering
A stop_gained (SO:0001587) variant that is degraded by nonsense-mediated decay (NMD).
GenCC:AR
A stop_gained (SO:0001587) variant that allows the transcript to escape nonsense-mediated decay (NMD).
david
2020-12-30T17:12:30Z
stop gained variant-nonsense-mediated decay escaping
stop gained-NMD escaping
sequence
SO:0002322
Added as per request from Ang Roberts as part of GenCC November 2020.
stop_gained_NMD_escaping
A stop_gained (SO:0001587) variant that allows the transcript to escape nonsense-mediated decay (NMD).
GenCC:AR
A frameshift_variant (SO:0001589) that is degraded by nonsense-mediated decay (NMD).
david
2020-12-30T17:12:30Z
frameshift variant-NMD triggering
frameshift variant-nonsense-mediated decay triggering
sequence
SO:0002323
Added as per request from Ang Roberts as part of GenCC November 2020.
frameshift_variant_NMD_triggering
A frameshift_variant (SO:0001589) that is degraded by nonsense-mediated decay (NMD).
GenCC:AR
A frameshift_variant (SO:0001589) that allows the transcript to escape nonsense-mediated decay (NMD).
david
2020-12-30T17:12:30Z
frameshift variant-NMD escaping
frameshift variant-nonsense-mediated decay escaping
sequence
SO:0002324
Added as per request from Ang Roberts as part of GenCC November 2020.
frameshift_variant_NMD_escaping
A frameshift_variant (SO:0001589) that allows the transcript to escape nonsense-mediated decay (NMD).
GenCC:AR
A splice_donor_variant (SO:0001575) that is degraded by nonsense-mediated decay (NMD).
david
2020-12-30T17:12:30Z
splice donor variant-NMD triggering
splice donor variant-nonsense-mediated decay triggering
sequence
SO:0002325
Added as per request from Ang Roberts as part of GenCC November 2020.
splice_donor_variant_NMD_triggering
A splice_donor_variant (SO:0001575) that is degraded by nonsense-mediated decay (NMD).
GenCC:AR
A splice_donor_variant (SO:0001575) that allows the transcript to escape nonsense-mediated decay (NMD).
david
2020-12-30T17:12:30Z
splice donor variant-NMD escaping
splice donor variant-nonsense-mediated decay escaping
sequence
SO:0002326
Added as per request from Ang Roberts as part of GenCC November 2020.
splice_donor_variant_NMD_escaping
A splice_donor_variant (SO:0001575) that allows the transcript to escape nonsense-mediated decay (NMD).
GenCC:AR
A splice_acceptor_variant (SO:0001574) that is degraded by nonsense-mediated decay (NMD).
david
2020-12-30T17:12:30Z
splice acceptor variant-NMD triggering
splice acceptor variant-nonsense-mediated decay triggering
sequence
SO:0002327
Added as per request from Ang Roberts as part of GenCC November 2020.
splice_acceptor_variant_NMD_triggering
A splice_acceptor_variant (SO:0001574) that is degraded by nonsense-mediated decay (NMD).
GenCC:AR
A splice_acceptor_variant (SO:0001574) that allows the transcript to escape nonsense-mediated decay (NMD).
david
2020-12-30T17:12:30Z
splice acceptor variant-NMD escaping
splice acceptor variant-nonsense-mediated decay escaping
sequence
SO:0002328
Added as per request from Ang Roberts as part of GenCC November 2020.
splice_acceptor_variant_NMD_escaping
A splice_acceptor_variant (SO:0001574) that allows the transcript to escape nonsense-mediated decay (NMD).
GenCC:AR
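The eight consequence-specific NMD terms above follow a regular pattern: each pairs a parent variant type with an NMD outcome. As a minimal sketch, assuming the caller already knows the parent consequence and whether the transcript undergoes NMD, the pairing can be written as a Python lookup. The names NMD_TERMS and nmd_term are hypothetical; the IDs are those listed in the records above, and the fallback to the generic terms SO:0002319 and SO:0002320 is a design choice of this sketch, not something the ontology prescribes.

```python
# Hypothetical lookup, not part of the Sequence Ontology.
# Keys are (parent consequence label, triggers_nmd); values are SO IDs from the records above.
NMD_TERMS = {
    ("stop_gained", True): "SO:0002321",
    ("stop_gained", False): "SO:0002322",
    ("frameshift_variant", True): "SO:0002323",
    ("frameshift_variant", False): "SO:0002324",
    ("splice_donor_variant", True): "SO:0002325",
    ("splice_donor_variant", False): "SO:0002326",
    ("splice_acceptor_variant", True): "SO:0002327",
    ("splice_acceptor_variant", False): "SO:0002328",
}

# Generic terms used when the parent consequence has no specific child term.
GENERIC_NMD_TRIGGERING = "SO:0002319"  # NMD_triggering_variant
GENERIC_NMD_ESCAPING = "SO:0002320"    # NMD_escaping_variant


def nmd_term(consequence, triggers_nmd):
    """Return the most specific NMD term ID for a consequence and NMD outcome."""
    generic = GENERIC_NMD_TRIGGERING if triggers_nmd else GENERIC_NMD_ESCAPING
    return NMD_TERMS.get((consequence, triggers_nmd), generic)
```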
The region of mRNA 1 base long that is included as part of two separate codons during the process of translational frameshifting (GO:0006452), causing the reading frame to be different.
david
2021-02-03T20:33:30Z
minus 1 ribosomal frameshift
minus 1 ribosomal slippage
minus 1 translational frameshift
sequence
SO:0002329
Added along with the update to the definition of translational_frameshift SO:0001210 Feb 2021, brought to our attention by Terence Murphy of INSDC. See GitHub Issue #522.
minus_1_translational_frameshift
The region of mRNA 1 base long that is included as part of two separate codons during the process of translational frameshifting (GO:0006452), causing the reading frame to be different.
SO:ds
The region of mRNA 2 bases long that is included as part of two separate codons during the process of translational frameshifting (GO:0006452), causing the reading frame to be different.
david
2021-02-03T20:33:30Z
minus 2 ribosomal frameshift
minus 2 ribosomal slippage
minus 2 translational frameshift
sequence
SO:0002330
Added along with the update to the definition of translational_frameshift SO:0001210 Feb 2021, brought to our attention by Terence Murphy of INSDC. See GitHub Issue #522.
minus_2_translational_frameshift
The region of mRNA 2 bases long that is included as part of two separate codons during the process of translational frameshifting (GO:0006452), causing the reading frame to be different.
SO:ds
A region of DNA that is depleted of nucleosomes and accessible to DNA-binding proteins including transcription factors and nucleases.
david
2021-02-11T16:41:30Z
accessible DNA region
sequence
SO:0002331
Added as part of GREEKC terms. See GitHub Issues #531 & #534.
accessible_DNA_region
A region of DNA that is depleted of nucleosomes and accessible to DNA-binding proteins including transcription factors and nucleases.
PMID:25903461
SO:ds
A biological region implicated in inherited changes caused by mechanisms other than changes in the underlying DNA sequence.
david
2021-02-11T21:16:30Z
https://epi.grants.cancer.gov/epigen/#:~:text=mail.nih.gov-,Overview,a%20cell%20or%20entire%20organism.
epigenomically modified region
sequence
SO:0002332
Added as part of GREEKC terms to differentiate between inherited and not inherited epigenetic changes. See GitHub Issue #532.
epigenomically_modified_region
A biological region implicated in inherited changes caused by mechanisms other than changes in the underlying DNA sequence.
SO:ds
http://en.wikipedia.org/wiki/Epigenetics
A stop codon with the DNA sequence TAG.
david
2021-04-21T21:16:30Z
Amber stop codon
sequence
SO:0002333
Added as per GitHub request #537.
amber_stop_codon
A stop codon with the DNA sequence TAG.
https://en.wikipedia.org/wiki/Stop_codon
A stop codon with the DNA sequence TAA.
david
2021-04-21T21:16:30Z
Ochre stop codon
sequence
SO:0002334
Added as per GitHub request #537.
ochre_stop_codon
A stop codon with the DNA sequence TAA.
https://en.wikipedia.org/wiki/Stop_codon
A stop codon with the DNA sequence TGA.
david
2021-04-21T21:16:30Z
Opal stop codon
sequence
SO:0002335
Added as per GitHub request #537.
opal_stop_codon
A stop codon with the DNA sequence TGA.
https://en.wikipedia.org/wiki/Stop_codon
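The three stop codon terms above map one DNA triplet to one term each, which can be sketched as a Python lookup. The names STOP_CODON_TERMS and classify_stop_codon are hypothetical and not part of the ontology. Accepting lower-case input and RNA spellings (U in place of T) is an assumption of this sketch; the definitions themselves are stated in terms of DNA sequence only.

```python
# Hypothetical helper, not part of the Sequence Ontology.
# Triplets, labels and IDs are taken from the three stop codon records above.
STOP_CODON_TERMS = {
    "TAG": ("amber_stop_codon", "SO:0002333"),
    "TAA": ("ochre_stop_codon", "SO:0002334"),
    "TGA": ("opal_stop_codon", "SO:0002335"),
}


def classify_stop_codon(codon):
    """Return (label, SO ID) for a stop codon, or None for any other triplet."""
    # Assumption: normalise case and treat RNA 'U' as DNA 'T' before the lookup.
    dna = codon.upper().replace("U", "T")
    return STOP_CODON_TERMS.get(dna)
```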
A gene that encodes 2S ribosomal RNA, which functions as a component of the large subunit of the ribosome in Drosophila and at least some other Diptera.
david
2021-04-23T22:59:30Z
2S rRNA gene
rRNA 2S gene
sequence
SO:0002336
Added as a request from FlyBase. See GitHub Issue #507. Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
cytosolic_rRNA_2S_gene
A gene that encodes 2S ribosomal RNA, which functions as a component of the large subunit of the ribosome in Drosophila and at least some other Diptera.
PMID:118436
PMID:29474379
PMID:3136294
PMID:10788608
PMID:407103
PMID:4847940
PMID:768488
Cytosolic 2S rRNA is a 30 nucleotide RNA component of the large subunit of cytosolic ribosomes in Drosophila and at least some other Diptera. It is homologous to the 3' part of other 5.8S rRNA molecules. The 3' end of the 5.8S molecule is able to base-pair with the 5' end of the 2S rRNA to generate a helical region equivalent in position to the 'GC-rich hairpin' found in all previously sequenced 5.8S molecules.
david
2021-04-23T22:59:30Z
cytosolic 2S rRNA
cytosolic rRNA 2S
sequence
SO:0002337
Added as a request from FlyBase. See GitHub Issue #507. Renamed from rRNA_2S to cytosolic_2S_rRNA on 27 May 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493.
cytosolic_2S_rRNA
Cytosolic 2S rRNA is a 30 nucleotide RNA component of the large subunit of cytosolic ribosomes in Drosophila and at least some other Diptera. It is homologous to the 3' part of other 5.8S rRNA molecules. The 3' end of the 5.8S molecule is able to base-pair with the 5' end of the 2S rRNA to generate a helical region equivalent in position to the 'GC-rich hairpin' found in all previously sequenced 5.8S molecules.
PMID:118436
PMID:29474379
PMID:3136294
PMID:10788608
PMID:407103
PMID:4847940
PMID:768488
A 57 to 71 nucleotide RNA that is a component of the U7 small nuclear ribonucleoprotein complex (U7 snRNP). The U7 snRNP is required for histone pre-mRNA processing.
david
2021-04-24T16:59:30Z
U7 small nuclear RNA
U7 snRNA
small nuclear RNA U7
snRNA U7
sequence
SO:0002338
Added as a request from FlyBase. See GitHub Issue #508
U7_snRNA
A 57 to 71 nucleotide RNA that is a component of the U7 small nuclear ribonucleoprotein complex (U7 snRNP). The U7 snRNP is required for histone pre-mRNA processing.
PMID:15526162
A gene that encodes a scaRNA (small Cajal body-specific RNA).
david
2021-04-24T16:59:30Z
Small Cajal body-specific RNA gene
scaRNA gene
sequence
SO:0002339
Added as a request from FlyBase. See GitHub Issue #510
scaRNA_gene
A gene that encodes a scaRNA (small Cajal body-specific RNA).
PMID:27775477
PMID:28869095
An abundant small nuclear RNA that, together with associated cellular proteins, regulates the activity of the positive transcription elongation factor b (P-TEFb). It is often described in literature as similar to a snRNA, except of longer length.
david
2021-04-27T14:50:30Z
7SK RNA
RNA 7SK
sequence
SO:0002340
Added as a request from FlyBase. See GitHub Issue #512
RNA_7SK
An abundant small nuclear RNA that, together with associated cellular proteins, regulates the activity of the positive transcription elongation factor b (P-TEFb). It is often described in literature as similar to a snRNA, except of longer length.
PMID:19246988
PMID:21853533
PMID:27369380
A gene encoding a 7SK RNA (SO:0002340).
david
2021-04-27T14:50:30Z
7SK RNA gene
RNA 7SK gene
sequence
SO:0002341
Added as a request from FlyBase. See GitHub Issue #512
RNA_7SK_gene
A gene encoding a 7SK RNA (SO:0002340).
PMID:19246988
PMID:21853533
PMID:27369380
A ncRNA_gene that encodes an ncRNA less than 200 nucleotides in length.
david
2021-04-27T14:50:30Z
small non-coding RNA gene
sncRNA gene
sequence
SO:0002342
Added as a request from FlyBase to make the ncRNA_gene branch in SO mirror the ncRNA branch. See GitHub Issue #514
sncRNA_gene
A ncRNA_gene that encodes an ncRNA less than 200 nucleotides in length.
PMID:28449079
PMID:30069443
PMID:30937442
Cytosolic rRNA is an RNA component of the small or large subunits of cytosolic ribosomes.
david
2021-06-10T16:45:30Z
cytosolic rRNA
cytosolic ribosomal RNA
sequence
SO:0002343
Added as a request from EBI. See GitHub Issue #493
cytosolic_rRNA
Cytosolic rRNA is an RNA component of the small or large subunits of cytosolic ribosomes.
PMID:3044395
Mitochondrial SSU rRNA is an RNA component of the small subunit of mitochondrial ribosomes.
david
2021-06-10T16:45:30Z
MT SSU rRNA
mitochondrial SSU rRNA
mitochondrial small subunit rRNA
sequence
SO:0002344
Added as a request from EMBL. See GitHub Issue #493
mt_SSU_rRNA
Mitochondrial SSU rRNA is an RNA component of the small subunit of mitochondrial ribosomes.
PMID:24572720
PMID:3044395
Mitochondrial LSU rRNA is an RNA component of the large subunit of mitochondrial ribosomes.
david
2021-06-10T16:45:30Z
MT LSU rRNA
mitochondrial LSU rRNA
mitochondrial large subunit rRNA
sequence
SO:0002345
Added as a request from EMBL. See GitHub Issue #493
mt_LSU_rRNA
Mitochondrial LSU rRNA is an RNA component of the large subunit of mitochondrial ribosomes.
PMID:24572720
PMID:3044395
Plastid rRNA is an RNA component of the small or large subunits of plastid (such as chloroplast) ribosomes.
david
2021-06-10T16:45:30Z
plastid rRNA
sequence
SO:0002346
Added as a request from EMBL. See GitHub Issue #493
plastid_rRNA
Plastid rRNA is an RNA component of the small or large subunits of plastid (such as chloroplast) ribosomes.
PMID:24572720
PMID:3044395
Plastid SSU rRNA is an RNA component of the small subunit of plastid (such as chloroplast) ribosomes.
david
2021-06-10T16:45:30Z
plastid SSU rRNA
plastid small subunit rRNA
sequence
SO:0002347
Added as a request from EMBL. See GitHub Issue #493
plastid_SSU_rRNA
Plastid SSU rRNA is an RNA component of the small subunit of plastid (such as chloroplast) ribosomes.
PMID:24572720
PMID:3044395
Plastid LSU rRNA is an RNA component of the large subunit of plastid (such as chloroplast) ribosomes.
david
2021-06-10T16:45:30Z
plastid LSU rRNA
plastid large subunit rRNA
sequence
SO:0002348
Added as a request from EMBL. See GitHub Issue #493
plastid_LSU_rRNA
Plastid LSU rRNA is an RNA component of the large subunit of plastid (such as chloroplast) ribosomes.
PMID:24572720
PMID:3044395
A heritable locus on a chromosome that is prone to DNA breakage.
evan
2021-09-30T19:29:24Z
sequence
SO:0002349
See GitHub Issue #301.
fragile_site
A fragile site considered part of the normal chromosomal structure.
evan
2021-09-30T19:33:59Z
sequence
SO:0002350
See GitHub Issue #301.
common_fragile_site
A fragile site considered part of the normal chromosomal structure.
PMID:16236432
PMID:17608616
A fragile site found in the chromosomes of less than five percent of the human population.
evan
2021-09-30T19:34:13Z
sequence
SO:0002351
See GitHub Issue #301.
rare_fragile_site
A fragile site found in the chromosomes of less than five percent of the human population.
PMID:16236432
PMID:17608616
A non-coding RNA typically derived from intronic sequence of the sense strand of a cognate host gene, that is not rapidly degraded. It may contain exonic sequences, 5′ caps, and/or polyA tails.
evan
2021-09-30T21:07:18Z
Stable intronic sequence RNA
stable_intronic_sequence_RNA
sequence
SO:0002352
See GitHub Issue #515.
sisRNA
A non-coding RNA typically derived from intronic sequence of the sense strand of a cognate host gene, that is not rapidly degraded. It may contain exonic sequences, 5′ caps, and/or polyA tails.
PMID:27147469
PMID:29397203
PMID:30391089
A gene encoding a stem-bulge RNA.
evan
2021-09-30T21:25:37Z
Stem-bulge RNA gene
stem_bulge_RNA_gene
sequence
SO:0002353
See GitHub Issue #516.
sbRNA_gene
A gene encoding a stem-bulge RNA.
PMID:25908866
PMID:30666901
A small non-coding stem-loop RNA present in nematodes and insects, functionally and structurally related to vertebrate Y RNA.
evan
2021-09-30T21:29:19Z
Stem-bulge RNA
stem_bulge_RNA
sequence
SO:0002354
See GitHub Issue #516.
sbRNA
A small non-coding stem-loop RNA present in nematodes and insects, functionally and structurally related to vertebrate Y RNA.
PMID:25908866
PMID:30666901
A gene encoding a hpRNA.
evan
2021-10-07T17:09:18Z
Hairpin RNA gene
sequence
SO:0002355
See GitHub Issue #518.
hpRNA_gene
A gene encoding a hpRNA.
PMID:18463630
PMID:18719707
PMID:25544562
An RNA comprising an extended inverted repeat, the stem of which is typically much longer than that of miRNA precursors and can be up to 400 base pairs in length. hpRNAs are processed by Dicer-2 to generate endogenous short interfering RNAs (siRNAs).
evan
2021-10-07T17:35:56Z
Hairpin RNA
sequence
SO:0002356
See GitHub Issue #518.
hpRNA
An RNA comprising an extended inverted repeat, the stem of which is typically much longer than that of miRNA precursors and can be up to 400 base pairs in length. hpRNAs are processed by Dicer-2 to generate endogenous short interfering RNAs (siRNAs).
PMID:18463630
PMID:18719707
PMID:25544562
A physically clustered group of two or more genes in a particular genome that together encode a biosynthetic pathway for the production of a specialized metabolite (including its chemical variants).
evan
2021-10-07T18:20:34Z
Metabolic gene cluster
sequence
SO:0002357
See GitHub Issue #558.
biosynthetic_gene_cluster
A physically clustered group of two or more genes in a particular genome that together encode a biosynthetic pathway for the production of a specialized metabolite (including its chemical variants).
PMID:26284661
A gene that encodes a vault RNA.
evan
2021-11-11T23:25:13Z
sequence
SO:0002358
As of 11 November 2021, HGNC lists four genes as "RNA, vault". Their HGNC IDs are 12654, 12655, 12656, and 37054.
vault_RNA_gene
A gene that encodes a vault RNA.
PMID:19298825
PMID:19491402
PMID:22058117
PMID:22926522
PMID:30773316
PMID:9535882
A gene that encodes a Y RNA.
evan
2021-11-11T23:52:21Z
sequence
SO:0002359
There are four genes from HGNC that are annotated this way. HGNC IDs: 10242, 10243, 10244, and 10248.
Y_RNA_gene
A gene that encodes a Y RNA.
PMID:1698620
PMID:6187471
PMID:6816230
PMID:7520568
PMID:7539809
PMID:8836182
A gene that codes for cytosolic rRNA.
evan
2021-11-19T04:30:40Z
sequence
SO:0002360
Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold. See GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
cytosolic_rRNA_gene
A gene that codes for cytosolic LSU rRNA.
evan
2021-11-19T04:32:20Z
cytosolic large subunit rRNA gene
sequence
SO:0002361
Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold. See GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
cytosolic_LSU_rRNA_gene
A gene that codes for cytosolic SSU rRNA.
evan
2021-11-19T04:37:02Z
cytosolic small subunit rRNA gene
sequence
SO:0002362
Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold. See GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
cytosolic_SSU_rRNA_gene
A gene that codes for mitochondrial rRNA.
evan
2021-11-19T04:55:58Z
mitochondrial rRNA gene
sequence
SO:0002363
Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold. See GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
mt_rRNA_gene
A gene that codes for mitochondrial LSU rRNA.
evan
2021-11-19T04:57:49Z
mitochondrial large subunit rRNA gene
rRNA 21S gene
rRNA_21S_gene
sequence
SO:0002364
Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold. See GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). Obsoleted term rRNA_21S_gene (SO:0002241) merged into this term on 12 Sept 2022, see GitHub Issue #513.
mt_LSU_rRNA_gene
A gene that codes for mitochondrial SSU rRNA.
evan
2021-11-19T04:58:07Z
mitochondrial small subunit rRNA gene
sequence
SO:0002365
Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold. See GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
mt_SSU_rRNA_gene
A gene that codes for plastid rRNA.
evan
2021-11-19T05:00:08Z
sequence
SO:0002366
Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold. See GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
plastid_rRNA_gene
A gene that codes for plastid LSU rRNA.
evan
2021-11-19T05:00:49Z
plastid large subunit rRNA gene
sequence
SO:0002367
Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold. See GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
plastid_LSU_rRNA_gene
A gene that codes for plastid SSU rRNA.
evan
2021-11-19T05:01:03Z
plastid small subunit rRNA gene
sequence
SO:0002368
Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold. See GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
plastid_SSU_rRNA_gene
A scaRNA possessing a box C/D sequence motif, guiding the methylation of snRNAs.
evan
2021-11-19T05:34:44Z
C/D scaRNA
sequence
SO:0002369
Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519).
C_D_box_scaRNA
A scaRNA possessing a box C/D sequence motif, guiding the methylation of snRNAs.
PMID:17099227
PMID:24659245
A scaRNA possessing a box H/ACA sequence motif, guiding the pseudouridylation of snRNAs.
evan
2021-11-19T05:35:04Z
H/ACA scaRNA
sequence
SO:0002370
Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519).
H_ACA_box_scaRNA
A scaRNA possessing a box H/ACA sequence motif, guiding the pseudouridylation of snRNAs.
PMID:17099227
PMID:24659245
A scaRNA possessing both box C/D and box H/ACA sequence motifs, guiding both the methylation and pseudouridylation of snRNAs.
evan
2021-11-19T05:35:23Z
C/D-H/ACA scaRNA
sequence
SO:0002371
Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519).
C-D_H_ACA_box_scaRNA
A scaRNA possessing both box C/D and box H/ACA sequence motifs, guiding both the methylation and pseudouridylation of snRNAs.
PMID:17099227
PMID:24659245
A gene that codes for scaRNA possessing a box C/D sequence motif, guiding the methylation of snRNAs.
evan
2021-11-19T05:40:46Z
C/D scaRNA gene
sequence
SO:0002372
Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519).
C_D_box_scaRNA_gene
A gene that codes for scaRNA possessing a box C/D sequence motif, guiding the methylation of snRNAs.
PMID:17099227
PMID:24659245
A gene that codes for scaRNA possessing a box H/ACA sequence motif, guiding the pseudouridylation of snRNAs.
evan
2021-11-19T05:40:58Z
H/ACA scaRNA gene
sequence
SO:0002373
Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519).
H_ACA_box_scaRNA_gene
A gene that codes for scaRNA possessing a box H/ACA sequence motif, guiding the pseudouridylation of snRNAs.
PMID:17099227
PMID:24659245
A gene that codes for scaRNA possessing both box C/D and box H/ACA sequence motifs, guiding both the methylation and pseudouridylation of snRNAs.
evan
2021-11-19T05:41:06Z
C/D-H/ACA scaRNA gene
sequence
SO:0002374
Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519).
C-D_H_ACA_box_scaRNA_gene
A gene that codes for scaRNA possessing both box C/D and box H/ACA sequence motifs, guiding both the methylation and pseudouridylation of snRNAs.
PMID:17099227
PMID:24659245
A gene that codes for a C_D_box_snoRNA. Most box C/D snoRNAs also contain long (>10 nt) sequences complementary to rRNA. Boxes C and D, as well as boxes C' and D', are usually located in close proximity, and form a structure known as the box C/D motif. This motif is important for snoRNA stability, processing, nucleolar targeting and function. A small number of box C/D snoRNAs are involved in rRNA processing; most, however, are known or predicted to serve as guide RNAs in ribose methylation of rRNA. Targeting involves direct base pairing of the snoRNA at the rRNA site to be modified and selection of a rRNA nucleotide a fixed distance from box D or D'.
evan
2021-11-19T05:49:55Z
box C/D snoRNA gene, C D box snoRNA gene, C/D box snoRNA gene
sequence
SO:0002375
Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519). Added citations. See GitHub Issue #565.
C_D_box_snoRNA_gene
A gene that codes for a C_D_box_snoRNA. Most box C/D snoRNAs also contain long (>10 nt) sequences complementary to rRNA. Boxes C and D, as well as boxes C' and D', are usually located in close proximity, and form a structure known as the box C/D motif. This motif is important for snoRNA stability, processing, nucleolar targeting and function. A small number of box C/D snoRNAs are involved in rRNA processing; most, however, are known or predicted to serve as guide RNAs in ribose methylation of rRNA. Targeting involves direct base pairing of the snoRNA at the rRNA site to be modified and selection of a rRNA nucleotide a fixed distance from box D or D'.
PMID:12457565
PMID:22065625
A gene that codes for H_ACA_box_snoRNA. Members of the box H/ACA family contain an ACA triplet, exactly 3 nt upstream from the 3' end and an H-box in a hinge region that links two structurally similar functional domains of the molecule. Both boxes are important for snoRNA biosynthesis and function. A few box H/ACA snoRNAs are involved in rRNA processing; most others are known or predicted to participate in selection of uridine nucleosides in rRNA to be converted to pseudouridines. Site selection is mediated by direct base pairing of the snoRNA with rRNA through one or both targeting domains.
evan
2021-11-19T05:50:14Z
box H/ACA snoRNA gene, H ACA box snoRNA gene, H/ACA box snoRNA gene
sequence
SO:0002376
Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519). Added citations. See GitHub Issue #565.
H_ACA_box_snoRNA_gene
A gene that codes for H_ACA_box_snoRNA. Members of the box H/ACA family contain an ACA triplet, exactly 3 nt upstream from the 3' end and an H-box in a hinge region that links two structurally similar functional domains of the molecule. Both boxes are important for snoRNA biosynthesis and function. A few box H/ACA snoRNAs are involved in rRNA processing; most others are known or predicted to participate in selection of uridine nucleosides in rRNA to be converted to pseudouridines. Site selection is mediated by direct base pairing of the snoRNA with rRNA through one or both targeting domains.
PMID:12457565
PMID:22065625
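The positional constraint in the definition above (an ACA triplet exactly 3 nt upstream from the 3' end) can be sketched as a simple string check. This is a minimal illustration, not part of the ontology: the function name `has_aca_box` is hypothetical, and it assumes that "exactly 3 nt upstream" means the triplet is followed by exactly three nucleotides before the 3' end. It does not look for the H box or check secondary structure.

```python
def has_aca_box(rna: str) -> bool:
    """Return True if an ACA triplet is followed by exactly 3 nt at the 3' end.

    Assumption: the sequence is written 5' to 3'. DNA input (T) is
    normalised to RNA (U) so that either alphabet can be passed in.
    """
    seq = rna.upper().replace("T", "U")
    # Six positions are needed: the ACA triplet plus three trailing nucleotides.
    if len(seq) < 6:
        return False
    return seq[-6:-3] == "ACA"


if __name__ == "__main__":
    print(has_aca_box("GGGACAUUU"))
    print(has_aca_box("GGGACAUU"))
```

A real annotation pipeline would combine a check like this with structural evidence; the sketch only encodes the one positional rule quoted in the definition.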
A gene that codes for U14_snoRNA. U14 small nucleolar RNA (U14 snoRNA) is required for early cleavages of eukaryotic precursor rRNAs. In yeasts, this molecule possesses a stem-loop region (known as the Y-domain) which is essential for function. A similar structure, but with a different consensus sequence, is found in plants, but is absent in vertebrates.
evan
2021-11-19T05:50:43Z
small nucleolar RNA U14 gene, snoRNA U14 gene, U14 small nucleolar RNA gene, U14 snoRNA gene
sequence
SO:0002377
Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519).
U14_snoRNA_gene
A gene that codes for U3_snoRNA. U3 snoRNA is a member of the box C/D class of small nucleolar RNAs. The U3 snoRNA secondary structure is characterised by a small 5' domain (with boxes A and A'), and a larger 3' domain (with boxes B, C, C', and D), the two domains being linked by a single-stranded hinge. Boxes B and C form the B/C motif, which appears to be exclusive to U3 snoRNAs, and boxes C' and D form the C'/D motif. The latter is functionally similar to the C/D motifs found in other snoRNAs. The 5' domain and the hinge region act as a pre-rRNA-binding domain. The 3' domain has conserved protein-binding sites. Both the box B/C and box C'/D motifs are sufficient for nuclear retention of U3 snoRNA. The box C'/D motif is also necessary for nucleolar localization, stability and hypermethylation of U3 snoRNA. Both box B/C and C'/D motifs are involved in specific protein interactions and are necessary for the rRNA processing functions of U3 snoRNA.
evan
2021-11-19T05:50:57Z
small nucleolar RNA U3 gene, snoRNA U3 gene, U3 small nucleolar RNA gene, U3 snoRNA gene
sequence
SO:0002378
Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519).
U3_snoRNA_gene
A gene that codes for methylation_guide_snoRNA. A snoRNA that specifies the site of 2'-O-ribose methylation in an RNA molecule by base pairing with a short sequence around the target residue.
evan
2021-11-19T05:51:12Z
methylation guide snoRNA gene
sequence
SO:0002379
Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519).
methylation_guide_snoRNA_gene
A gene that codes for methylation_guide_snoRNA. A snoRNA that specifies the site of 2'-O-ribose methylation in an RNA molecule by base pairing with a short sequence around the target residue.
PMID:12457565
A gene that codes for pseudouridylation_guide_snoRNA. A snoRNA that specifies the site of pseudouridylation in an RNA molecule by base pairing with a short sequence around the target residue.
evan
2021-11-19T05:51:40Z
pseudouridylation guide snoRNA gene
sequence
SO:0002380
Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519).
pseudouridylation_guide_snoRNA_gene
A gene that codes for pseudouridylation_guide_snoRNA. A snoRNA that specifies the site of pseudouridylation in an RNA molecule by base pairing with a short sequence around the target residue.
PMID:12457565
A long non-coding RNA which is produced using the promoter of a protein-coding gene but with transcription occurring in the opposite direction.
evan
2022-10-28T19:05:35Z
bidirectional lncRNA
bidirectional promoter lncRNA
bidirectional promoter long non-coding RNA
bidirectional_lncRNA
bidirectional_promoter_long_non-coding_RNA
sequence
SO:0002381
Created new term "bidirectional_promoter_lncRNA" (SO:0002381). See GitHub Issue #579.
bidirectional_promoter_lncRNA
A long non-coding RNA which is produced using the promoter of a protein-coding gene but with transcription occurring in the opposite direction.
PMID:30175284
PMID:34956340
PMID:26578749
A conserved cis-acting element that confers extreme-distance regulatory activity to an enhancer.
2024-06-06T16:53:56Z
evan
REX
sequence
SO:0002382
Added at the request of Chris Mungall (Berkeley Lab) on 6 June 2024. See GitHub Issue #649.
range_extender_element
A conserved cis-acting element that confers extreme-distance regulatory activity to an enhancer.
https://doi.org/10.1101/2024.05.26.595809
On its own, a range extender element is not a classical enhancer, but its addition can extend the range of action of a heterologous short-range enhancer by more than 10-fold compared to its native range. In extreme cases, extended ranges may span approximately 840 kilobases of genomic space. The evolutionarily conserved homeodomain motif [C/T]AATA, required for long-range enhancer activity, is present within the range extender element. Range extender elements are distinct from and do not share sequence similarity with other cis-regulatory elements like tethering and remote control elements in Drosophila; or CTCF sites, CpG islands, and enhancer booster elements in mammals. Rather than conferring robustness of remote enhancer activity, a range extender element is both required and sufficient for long-range enhancer-promoter activation.
A variant that has been found to be pathogenic in the context of a neoplastic disease.
2024-06-06T20:09:59Z
evan
sequence
SO:0002383
See GitHub Issue #643.
oncogenic_variant
A variant that has been found to be pathogenic in the context of a neoplastic disease.
PMID:35101336
A region of sequence that is involved in the control of a biological process.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Regulatory_region
INSDC_qualifier:other
regulatory region
sequence
SO:0005836
regulatory_region
A region of sequence that is involved in the control of a biological process.
SO:ke
http://en.wikipedia.org/wiki/Regulatory_region
wiki
The primary transcript of an evolutionarily conserved eukaryotic low molecular weight RNA capable of intermolecular hybridization with both homologous and heterologous 18S rRNA.
4.5S snRNA primary transcript
U14 snoRNA primary transcript
sequence
SO:0005837
U14_snoRNA_primary_transcript
The primary transcript of an evolutionarily conserved eukaryotic low molecular weight RNA capable of intermolecular hybridization with both homologous and heterologous 18S rRNA.
PMID:2251119
true
A snoRNA that specifies the site of 2'-O-ribose methylation in an RNA molecule by base pairing with a short sequence around the target residue.
methylation guide snoRNA
sequence
SO:0005841
Has RNA 2'-O-ribose methylation guide activity (GO:0030561).
methylation_guide_snoRNA
A snoRNA that specifies the site of 2'-O-ribose methylation in an RNA molecule by base pairing with a short sequence around the target residue.
GOC:mah
PMID:12457565
An ncRNA that is part of a ribonucleoprotein that cleaves the primary pre-rRNA transcript in the process of producing mature rRNA molecules.
rRNA cleavage RNA
sequence
SO:0005843
rRNA_cleavage_RNA
An ncRNA that is part of a ribonucleoprotein that cleaves the primary pre-rRNA transcript in the process of producing mature rRNA molecules.
GOC:kgc
An exon that is the only exon in a gene.
exon of single exon gene
singleton exon
sequence
single_exon
SO:0005845
exon_of_single_exon_gene
An exon that is the only exon in a gene.
RSC:cb
A gene that is a member of a gene cassette, which is a mobile genetic element.
cassette array member
sequence
SO:0005847
cassette_array_member
A gene that is a member of a gene cassette, which is a mobile genetic element.
gene cassette member
sequence
SO:0005848
gene_cassette_member
A gene that is a member of a group of genes that are either regulated or transcribed together within a larger group of genes that are regulated or transcribed together.
gene subarray member
sequence
SO:0005849
gene_subarray_member
Non-covalent primer binding site for initiation of replication, transcription, or reverse transcription.
http://en.wikipedia.org/wiki/Primer_binding_site
INSDC_feature:primer_bind
primer binding site
sequence
SO:0005850
primer_binding_site
Non-covalent primer binding site for initiation of replication, transcription, or reverse transcription.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Primer_binding_site
wiki
An array includes two or more genes, or two or more gene subarrays, contiguously arranged where the individual genes, or subarrays, are either identical in sequence, or essentially so.
gene array
sequence
SO:0005851
This would include, for example, a cluster of genes each encoding the major ribosomal RNAs and a cluster of histone gene subarrays.
gene_array
An array includes two or more genes, or two or more gene subarrays, contiguously arranged where the individual genes, or subarrays, are either identical in sequence, or essentially so.
SO:ma
A subarray is, by definition, a member of a gene array (SO:0005851); the members of a subarray may differ substantially in sequence, but are closely related in function.
gene subarray
sequence
SO:0005852
This would include, for example, a cluster of genes encoding different histones.
gene_subarray
A subarray is, by definition, a member of a gene array (SO:0005851); the members of a subarray may differ substantially in sequence, but are closely related in function.
SO:ma
A gene that can be substituted for a related gene at a different site in the genome.
http://en.wikipedia.org/wiki/Gene_cassette
gene cassette
sequence
SO:0005853
This would include, for example, the mating type gene cassettes of S. cerevisiae. Gene cassettes usually exist as linear sequences as part of a larger DNA molecule, such as a chromosome or plasmid.
gene_cassette
A gene that can be substituted for a related gene at a different site in the genome.
SGD:se
http://en.wikipedia.org/wiki/Gene_cassette
wiki
An array of non-functional genes whose members, when captured by recombination form functional genes.
gene cassette array
sequence
SO:0005854
This would include, for example, the arrays of non-functional VSG genes of Trypanosomes.
gene_cassette_array
An array of non-functional genes whose members, when captured by recombination form functional genes.
SO:ma
A collection of related genes.
gene group
sequence
SO:0005855
gene_group
A collection of related genes.
SO:ma
A primary transcript encoding selenocysteinyl tRNA (SO:0005857).
selenocysteine tRNA primary transcript
sequence
SO:0005856
selenocysteine_tRNA_primary_transcript
A primary transcript encoding selenocysteinyl tRNA (SO:0005857).
SO:ke
A tRNA sequence that has a selenocysteine anticodon, and a 3' selenocysteine binding region.
selenocysteinyl tRNA
selenocysteinyl-transfer RNA
selenocysteinyl-transfer ribonucleic acid
sequence
SO:0005857
selenocysteinyl_tRNA
A tRNA sequence that has a selenocysteine anticodon, and a 3' selenocysteine binding region.
SO:ke
A region in which two or more pairs of homologous markers occur on the same chromosome in two or more species.
syntenic region
sequence
SO:0005858
syntenic_region
A region in which two or more pairs of homologous markers occur on the same chromosome in two or more species.
http://www.informatics.jax.org/silverbook/glossary.shtml
A region of a peptide that is involved in a biochemical function.
biochemical motif
biochemical region of peptide
sequence
biochemical_region
SO:0100001
Range.
biochemical_region_of_peptide
A region of a peptide that is involved in a biochemical function.
EBIBS:GAR
A region that is involved in contact with another molecule.
sequence
molecular contact region
SO:0100002
Range.
molecular_contact_region
A region that is involved in contact with another molecule.
EBIBS:GAR
A region of polypeptide chain with high conformational flexibility.
intrinsically unstructured polypeptide region
sequence
disordered region
SO:0100003
intrinsically_unstructured_polypeptide_region
A region of polypeptide chain with high conformational flexibility.
EBIBS:GAR
disordered region
A motif of 3 consecutive residues with dihedral angles as follows: res i: phi -90 bounds -120 to -60, res i: psi -10 bounds -50 to 30, res i+1: phi -75 bounds -100 to -50, res i+1: psi 140 bounds 110 to 170. An extra restriction of the length of the O to O distance would be useful, that it be less than 5 Angstrom. More precisely these two oxygens are the main chain carbonyl oxygen atoms of residues i-1 and i+1.
catmat-3l
sequence
SO:0100004
catmat_left_handed_three
A motif of 3 consecutive residues with dihedral angles as follows: res i: phi -90 bounds -120 to -60, res i: psi -10 bounds -50 to 30, res i+1: phi -75 bounds -100 to -50, res i+1: psi 140 bounds 110 to 170. An extra restriction of the length of the O to O distance would be useful, that it be less than 5 Angstrom. More precisely these two oxygens are the main chain carbonyl oxygen atoms of residues i-1 and i+1.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
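The dihedral-angle bounds in the catmat_left_handed_three definition can be checked mechanically. The sketch below is illustrative only: the function name `matches_catmat3`, the `(phi, psi)` tuple layout, the inclusive treatment of the bounds, and the optional `o_o_distance` argument are assumptions, not part of the ontology or of the MSDmotif service.

```python
from typing import Optional, Sequence, Tuple

# (phi_min, phi_max, psi_min, psi_max) in degrees for residues i and i+1,
# copied from the bounds stated in the definition text.
CATMAT3_BOUNDS = (
    (-120.0, -60.0, -50.0, 30.0),    # residue i
    (-100.0, -50.0, 110.0, 170.0),   # residue i+1
)


def matches_catmat3(
    angles: Sequence[Tuple[float, float]],
    o_o_distance: Optional[float] = None,
) -> bool:
    """Check (phi, psi) pairs for residues i and i+1 against the bounds.

    Assumption: bounds are inclusive. If o_o_distance (in Angstrom, between
    the main-chain carbonyl oxygens of residues i-1 and i+1) is given, it
    must also be less than 5.
    """
    if len(angles) != len(CATMAT3_BOUNDS):
        return False
    for (phi, psi), (phi_lo, phi_hi, psi_lo, psi_hi) in zip(angles, CATMAT3_BOUNDS):
        if not (phi_lo <= phi <= phi_hi and psi_lo <= psi <= psi_hi):
            return False
    if o_o_distance is not None and not o_o_distance < 5.0:
        return False
    return True


if __name__ == "__main__":
    # The central values quoted in the definition fall inside the bounds.
    print(matches_catmat3([(-90.0, -10.0), (-75.0, 140.0)]))
```

The distance check is optional because the definition describes it as a restriction that "would be useful" rather than as a required criterion.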
A motif of 4 consecutive residues with dihedral angles as follows: res i: phi -90 bounds -120 to -60, res i: psi -10 bounds -50 to 30, res i+1: phi -90 bounds -120 to -60, res i+1: psi -10 bounds -50 to 30, res i+2: phi -75 bounds -100 to -50, res i+2: psi 140 bounds 110 to 170. The extra restriction of the length of the O to O distance is similar, that it be less than 5 Angstrom. In this case these two Oxygen atoms are the main chain carbonyl oxygen atoms of residues i-1 and i+2.
catmat-4l
sequence
SO:0100005
catmat_left_handed_four
A motif of 4 consecutive residues with dihedral angles as follows: res i: phi -90 bounds -120 to -60, res i: psi -10 bounds -50 to 30, res i+1: phi -90 bounds -120 to -60, res i+1: psi -10 bounds -50 to 30, res i+2: phi -75 bounds -100 to -50, res i+2: psi 140 bounds 110 to 170. The extra restriction of the length of the O to O distance is similar, that it be less than 5 Angstrom. In this case these two Oxygen atoms are the main chain carbonyl oxygen atoms of residues i-1 and i+2.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
A motif of 3 consecutive residues with dihedral angles as follows: res i: phi -90 bounds -120 to -60, res i: psi -10 bounds -50 to 30, res i+1: phi -75 bounds -100 to -50, res i+1: psi 140 bounds 110 to 170. An extra restriction of the length of the O to O distance would be useful, that it be less than 5 Angstrom. More precisely these two oxygens are the main chain carbonyl oxygen atoms of residues i-1 and i+1.
catmat-3r
sequence
SO:0100006
catmat_right_handed_three
A motif of 3 consecutive residues with dihedral angles as follows: res i: phi -90 bounds -120 to -60, res i: psi -10 bounds -50 to 30, res i+1: phi -75 bounds -100 to -50, res i+1: psi 140 bounds 110 to 170. An extra restriction of the length of the O to O distance would be useful, that it be less than 5 Angstrom. More precisely these two oxygens are the main chain carbonyl oxygen atoms of residues i-1 and i+1.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
A motif of 4 consecutive residues with dihedral angles as follows: res i: phi -90 bounds -120 to -60, res i: psi -10 bounds -50 to 30, res i+1: phi -90 bounds -120 to -60, res i+1: psi -10 bounds -50 to 30, res i+2: phi -75 bounds -100 to -50, res i+2: psi 140 bounds 110 to 170. The extra restriction of the length of the O to O distance is similar, that it be less than 5 Angstrom. In this case these two Oxygen atoms are the main chain carbonyl oxygen atoms of residues i-1 and i+2.
catmat-4r
sequence
SO:0100007
catmat_right_handed_four
A motif of 4 consecutive residues with dihedral angles as follows: res i: phi -90 bounds -120 to -60, res i: psi -10 bounds -50 to 30, res i+1: phi -90 bounds -120 to -60, res i+1: psi -10 bounds -50 to 30, res i+2: phi -75 bounds -100 to -50, res i+2: psi 140 bounds 110 to 170. The extra restriction of the length of the O to O distance is similar, that it be less than 5 Angstrom. In this case these two Oxygen atoms are the main chain carbonyl oxygen atoms of residues i-1 and i+2.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
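The dihedral-angle bounds in the catmat definitions above can be checked mechanically. The following Python sketch is illustrative only: the names `CATMAT3_BOUNDS`, `CATMAT4_BOUNDS` and `matches_catmat` are hypothetical, the bounds are treated as inclusive (an assumption), and the O to O distance is assumed to be measured elsewhere and passed in as a number in Angstrom.

```python
# Each entry is ((phi_min, phi_max), (psi_min, psi_max)) for one residue,
# copied from the bounds stated in the catmat definitions above.
CATMAT3_BOUNDS = [
    ((-120, -60), (-50, 30)),    # res i
    ((-100, -50), (110, 170)),   # res i+1
]
CATMAT4_BOUNDS = [
    ((-120, -60), (-50, 30)),    # res i
    ((-120, -60), (-50, 30)),    # res i+1
    ((-100, -50), (110, 170)),   # res i+2
]


def matches_catmat(dihedrals, bounds, o_o_distance=None, max_o_o=5.0):
    """Return True if every (phi, psi) pair lies within its bounds.

    dihedrals: list of (phi, psi) tuples in degrees, one per residue.
    o_o_distance: optional carbonyl O to O distance in Angstrom; when
    given, it must be less than max_o_o.
    Inclusive bounds are an assumption of this sketch.
    """
    if len(dihedrals) != len(bounds):
        return False
    for (phi, psi), ((phi_lo, phi_hi), (psi_lo, psi_hi)) in zip(dihedrals, bounds):
        if not (phi_lo <= phi <= phi_hi and psi_lo <= psi <= psi_hi):
            return False
    if o_o_distance is not None and not o_o_distance < max_o_o:
        return False
    return True
```

Passing the ideal angles quoted in the definitions, for example `matches_catmat([(-90, -10), (-75, 140)], CATMAT3_BOUNDS)`, satisfies the check; the distance test is skipped unless a distance is supplied.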
A motif of five consecutive residues and two H-bonds in which: H-bond between CO of residue(i) and NH of residue(i+4), H-bond between CO of residue(i) and NH of residue(i+3), phi angles of residues(i+1), (i+2) and (i+3) are negative.
alpha beta motif
sequence
SO:0100008
alpha_beta_motif
A motif of five consecutive residues and two H-bonds in which: H-bond between CO of residue(i) and NH of residue(i+4), H-bond between CO of residue(i) and NH of residue(i+3), phi angles of residues(i+1), (i+2) and (i+3) are negative.
EBIBS:GAR
http://www.ebi.ac.uk/msd-srv/msdmotif/
A peptide that acts as a signal for both membrane translocation and lipid attachment in prokaryotes.
lipoprotein signal peptide
prokaryotic membrane lipoprotein lipid attachment site
sequence
SO:0100009
lipoprotein_signal_peptide
A peptide that acts as a signal for both membrane translocation and lipid attachment in prokaryotes.
EBIBS:GAR
An experimental region where an analysis has been run and not produced any annotation.
no output
sequence
SO:0100010
no_output
An experimental region where an analysis has been run and not produced any annotation.
EBIBS:GAR
no output
The cleaved_peptide_region is the region of a peptide sequence that is cleaved during maturation.
cleaved peptide region
sequence
SO:0100011
Range.
cleaved_peptide_region
The cleaved_peptide_region is the region of a peptide sequence that is cleaved during maturation.
EBIBS:GAR
Irregular, unstructured regions of a protein's backbone, as distinct from the regular region (namely alpha helix and beta strand - characterised by specific patterns of main-chain hydrogen bonds).
peptide coil
sequence
coil
random coil
SO:0100012
peptide_coil
Irregular, unstructured regions of a protein's backbone, as distinct from the regular region (namely alpha helix and beta strand - characterised by specific patterns of main-chain hydrogen bonds).
EBIBS:GAR
coil
random coil
Hydrophobic regions are regions with a low affinity for water.
hydrophobic_region
sequence
hydropathic
hydrophobic region of peptide
hydrophobicity
SO:0100013
Range.
hydrophobic_region_of_peptide
Hydrophobic regions are regions with a low affinity for water.
EBIBS:GAR
The amino-terminal positively-charged region of a signal peptide (approx 1-5 aa).
sequence
N-region
SO:0100014
n_terminal_region
The amino-terminal positively-charged region of a signal peptide (approx 1-5 aa).
EBIBS:GAR
The more polar, carboxy-terminal region of the signal peptide (approx 3-7 aa).
sequence
C-region
SO:0100015
c_terminal_region
The more polar, carboxy-terminal region of the signal peptide (approx 3-7 aa).
EBIBS:GAR
The central, hydrophobic region of the signal peptide (approx 7-15 aa).
central hydrophobic region of signal peptide
sequence
H-region
central_hydrophobic_region
SO:0100016
central_hydrophobic_region_of_signal_peptide
The central, hydrophobic region of the signal peptide (approx 7-15 aa).
EBIBS:GAR
A conserved motif is a short (up to 20 amino acids) region of biological interest that is conserved in different proteins. They may or may not have functional or structural significance within the proteins in which they are found.
sequence
motif
SO:0100017
polypeptide_conserved_motif
A conserved motif is a short (up to 20 amino acids) region of biological interest that is conserved in different proteins. They may or may not have functional or structural significance within the proteins in which they are found.
EBIBS:GAR
A polypeptide binding motif is a short (up to 20 amino acids) polypeptide region of biological interest that contains one or more amino acids experimentally shown to bind to a ligand.
polypeptide binding motif
sequence
binding
SO:0100018
polypeptide_binding_motif
A polypeptide binding motif is a short (up to 20 amino acids) polypeptide region of biological interest that contains one or more amino acids experimentally shown to bind to a ligand.
EBIBS:GAR
binding
uniprot:feature_type
A polypeptide catalytic motif is a short (up to 20 amino acids) polypeptide region that contains one or more active site residues.
polypeptide catalytic motif
sequence
catalytic_motif
SO:0100019
polypeptide_catalytic_motif
A polypeptide catalytic motif is a short (up to 20 amino acids) polypeptide region that contains one or more active site residues.
EBIBS:GAR
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with DNA.
DNA_bind
polypeptide DNA contact
sequence
SO:0100020
polypeptide_DNA_contact
A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with DNA.
EBIBS:GAR
SO:ke
DNA_bind
uniprot:feature
A subsection of sequence with biological interest that is conserved in different proteins. They may or may not have functional or structural significance within the proteins in which they are found.
polypeptide conserved region
sequence
SO:0100021
polypeptide_conserved_region
A subsection of sequence with biological interest that is conserved in different proteins. They may or may not have functional or structural significance within the proteins in which they are found.
EBIBS:GAR
true
A sequence alteration where the length of the change in the variant is the same as that of the reference.
loinc:LA6690-7
sequence
SO:1000002
substitution
A sequence alteration where the length of the change in the variant is the same as that of the reference.
SO:ke
loinc:LA6690-7
Substitution
true
When no simple or well defined DNA mutation event describes the observed DNA change, the keyword "complex" should be used. Usually there are multiple equally plausible explanations for the change.
complex substitution
sequence
SO:1000005
complex_substitution
When no simple or well defined DNA mutation event describes the observed DNA change, the keyword "complex" should be used. Usually there are multiple equally plausible explanations for the change.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
true
A single nucleotide change which has occurred at the same position of a corresponding nucleotide in a reference sequence.
http://en.wikipedia.org/wiki/Point_mutation
point mutation
sequence
SO:1000008
point_mutation
A single nucleotide change which has occurred at the same position of a corresponding nucleotide in a reference sequence.
SO:immuno_workshop
http://en.wikipedia.org/wiki/Point_mutation
wiki
Change of a pyrimidine nucleotide, C or T, into another pyrimidine nucleotide, or change of a purine nucleotide, A or G, into another purine nucleotide.
sequence
SO:1000009
transition
Change of a pyrimidine nucleotide, C or T, into another pyrimidine nucleotide, or change of a purine nucleotide, A or G, into another purine nucleotide.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
A substitution of a pyrimidine, C or T, for another pyrimidine.
pyrimidine transition
sequence
SO:1000010
pyrimidine_transition
A substitution of a pyrimidine, C or T, for another pyrimidine.
SO:ke
A transition of a cytidine to a thymine.
C to T transition
sequence
SO:1000011
C_to_T_transition
A transition of a cytidine to a thymine.
SO:ke
The transition of cytidine to thymine occurring at a pCpG site as a consequence of the spontaneous deamination of 5'-methylcytidine.
C to T transition at pCpG site
sequence
SO:1000012
C_to_T_transition_at_pCpG_site
The transition of cytidine to thymine occurring at a pCpG site as a consequence of the spontaneous deamination of 5'-methylcytidine.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
A transition of a thymine to a cytidine.
T to C transition
sequence
SO:1000013
T_to_C_transition
A substitution of a purine, A or G, for another purine.
purine transition
sequence
SO:1000014
purine_transition
A substitution of a purine, A or G, for another purine.
SO:ke
A transition of an adenine to a guanine.
A to G transition
sequence
SO:1000015
A_to_G_transition
A transition of an adenine to a guanine.
SO:ke
A transition of a guanine to an adenine.
G to A transition
sequence
SO:1000016
G_to_A_transition
A transition of a guanine to an adenine.
SO:ke
Change of a pyrimidine nucleotide, C or T, into a purine nucleotide, A or G, or vice versa.
http://en.wikipedia.org/wiki/Transversion
sequence
SO:1000017
transversion
Change of a pyrimidine nucleotide, C or T, into a purine nucleotide, A or G, or vice versa.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
http://en.wikipedia.org/wiki/Transversion
wiki
Change of a pyrimidine nucleotide, C or T, into a purine nucleotide, A or G.
pyrimidine to purine transversion
sequence
SO:1000018
pyrimidine_to_purine_transversion
Change of a pyrimidine nucleotide, C or T, into a purine nucleotide, A or G.
SO:ke
A transversion from cytidine to adenine.
C to A transversion
sequence
SO:1000019
C_to_A_transversion
A transversion from cytidine to adenine.
SO:ke
A transversion of a cytidine to a guanine.
C to G transversion
sequence
SO:1000020
C_to_G_transversion
A transversion from T to A.
T to A transversion
sequence
SO:1000021
T_to_A_transversion
A transversion from T to A.
SO:ke
A transversion from T to G.
T to G transversion
sequence
SO:1000022
T_to_G_transversion
A transversion from T to G.
SO:ke
Change of a purine nucleotide, A or G, into a pyrimidine nucleotide, C or T.
purine to pyrimidine transversion
sequence
SO:1000023
purine_to_pyrimidine_transversion
Change of a purine nucleotide, A or G, into a pyrimidine nucleotide, C or T.
SO:ke
A transversion from adenine to cytidine.
A to C transversion
sequence
SO:1000024
A_to_C_transversion
A transversion from adenine to cytidine.
SO:ke
A transversion from adenine to thymine.
A to T transversion
sequence
SO:1000025
A_to_T_transversion
A transversion from adenine to thymine.
SO:ke
A transversion from guanine to cytidine.
G to C transversion
sequence
SO:1000026
G_to_C_transversion
A transversion from guanine to cytidine.
SO:ke
A transversion from guanine to thymine.
G to T transversion
sequence
SO:1000027
G_to_T_transversion
A transversion from guanine to thymine.
SO:ke
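The transition and transversion terms above partition the twelve possible single-base substitutions. A minimal Python sketch of that mapping follows; the function name `classify_substitution` and the lookup table `SO_TERMS` are illustrative and not part of the ontology, while the identifiers and names are the ones listed in the entries above.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# (reference base, variant base) -> (SO identifier, SO term name),
# taken from the transition and transversion entries above.
SO_TERMS = {
    ("C", "T"): ("SO:1000011", "C_to_T_transition"),
    ("T", "C"): ("SO:1000013", "T_to_C_transition"),
    ("A", "G"): ("SO:1000015", "A_to_G_transition"),
    ("G", "A"): ("SO:1000016", "G_to_A_transition"),
    ("C", "A"): ("SO:1000019", "C_to_A_transversion"),
    ("C", "G"): ("SO:1000020", "C_to_G_transversion"),
    ("T", "A"): ("SO:1000021", "T_to_A_transversion"),
    ("T", "G"): ("SO:1000022", "T_to_G_transversion"),
    ("A", "C"): ("SO:1000024", "A_to_C_transversion"),
    ("A", "T"): ("SO:1000025", "A_to_T_transversion"),
    ("G", "C"): ("SO:1000026", "G_to_C_transversion"),
    ("G", "T"): ("SO:1000027", "G_to_T_transversion"),
}


def classify_substitution(ref, alt):
    """Return (kind, so_id, so_name) for a single-base DNA substitution.

    kind is "transition" when both bases are purines or both are
    pyrimidines, and "transversion" otherwise.
    """
    ref, alt = ref.upper(), alt.upper()
    if (ref, alt) not in SO_TERMS:
        raise ValueError(f"not a single-base substitution: {ref!r} -> {alt!r}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    kind = "transition" if same_class else "transversion"
    so_id, so_name = SO_TERMS[(ref, alt)]
    return kind, so_id, so_name
```

The purine and pyrimidine sets decide the kind independently of the table, so the two sources of truth can be cross-checked against the term names.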
A chromosomal structure variation within a single chromosome.
intrachromosomal mutation
sequence
SO:1000028
intrachromosomal_mutation
A chromosomal structure variation within a single chromosome.
SO:ke
An incomplete chromosome.
http://en.wikipedia.org/wiki/Chromosomal_deletion
chromosomal deletion
deficiency
sequence
(Drosophila)Df
(bacteria)Δ
(fungi)D
SO:1000029
chromosomal_deletion
An incomplete chromosome.
SO:ke
http://en.wikipedia.org/wiki/Chromosomal_deletion
wiki
An intrachromosomal mutation where a region of the chromosome is inverted with respect to wild type.
http://en.wikipedia.org/wiki/Chromosomal_inversion
chromosomal inversion
sequence
(Drosophila)In
(bacteria)IN
(fungi)In
SO:1000030
chromosomal_inversion
An intrachromosomal mutation where a region of the chromosome is inverted with respect to wild type.
SO:ke
http://en.wikipedia.org/wiki/Chromosomal_inversion
wiki
A chromosomal structure variation whereby more than one chromosome is involved.
interchromosomal mutation
sequence
SO:1000031
interchromosomal_mutation
A chromosomal structure variation whereby more than one chromosome is involved.
SO:ke
A sequence alteration which includes an insertion and a deletion, affecting 2 or more bases.
http://en.wikipedia.org/wiki/Indel
loinc:LA9659-9
deletion-insertion
indel
sequence
SO:1000032
Indels can have a different number of bases than the corresponding reference sequence. The term name was changed from indel to delins on 2/24/2019 to align with the HGVS nomenclature term for a deletion-insertion. Indel was causing confusion in the annotation community (github issue 445). The HGVS nomenclature definition of deletion-insertion (delins) is a sequence change where, compared to a reference sequence, one or more nucleotides are replaced by one or more other nucleotides and which is not a substitution, inversion or conversion.
delins
A sequence alteration which includes an insertion and a deletion, affecting 2 or more bases.
http://varnomen.hgvs.org/recommendations/DNA/variant/delins/
http://en.wikipedia.org/wiki/Indel
wiki
loinc:LA9659-9
Insertion and Deletion
true
true
An insertion which derives from, or is identical in sequence to, nucleotides present at a known location in the genome.
loinc:LA6686-5
nucleotide duplication
sequence
nucleotide_duplication
SO:1000035
duplication
An insertion which derives from, or is identical in sequence to, nucleotides present at a known location in the genome.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
NCBI:th
loinc:LA6686-5
Duplication
A continuous nucleotide sequence is inverted in the same position.
loinc:LA6689-9
inversion
sequence
SO:1000036
inversion
A continuous nucleotide sequence is inverted in the same position.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
loinc:LA6689-9
Inversion
inversion
http://www.ncbi.nlm.nih.gov/dbvar/
An extra chromosome.
http://en.wikipedia.org/wiki/Chromosomal_duplication
chromosomal duplication
sequence
(Drosophila)Dp
(fungi)Dp
SO:1000037
chromosomal_duplication
An extra chromosome.
SO:ke
http://en.wikipedia.org/wiki/Chromosomal_duplication
wiki
A duplication that occurred within a chromosome.
intrachromosomal duplication
sequence
SO:1000038
intrachromosomal_duplication
A duplication that occurred within a chromosome.
SO:ke
A tandem duplication where the individual regions are in the same orientation.
direct tandem duplication
sequence
SO:1000039
direct_tandem_duplication
A tandem duplication where the individual regions are in the same orientation.
SO:ke
A tandem duplication where the individual regions are not in the same orientation.
inverted tandem duplication
sequence
mirror duplication
SO:1000040
inverted_tandem_duplication
A tandem duplication where the individual regions are not in the same orientation.
SO:ke
A chromosome structure variation whereby a transposition occurred within a chromosome.
intrachromosomal transposition
sequence
(Drosophila)Tp
SO:1000041
intrachromosomal_transposition
A chromosome structure variation whereby a transposition occurred within a chromosome.
SO:ke
A chromosome structure variant where a monocentric element is caused by the fusion of two chromosome arms.
compound chromosome
sequence
SO:1000042
compound_chromosome
A chromosome structure variant where a monocentric element is caused by the fusion of two chromosome arms.
SO:ke
A non reciprocal translocation whereby the participating chromosomes break at their centromeres and the long arms fuse to form a single chromosome with a single centromere.
http://en.wikipedia.org/wiki/Robertsonian_fusion
Robertsonian fusion
centric-fusion translocations
whole-arm translocations
sequence
SO:1000043
Robertsonian_fusion
A non reciprocal translocation whereby the participating chromosomes break at their centromeres and the long arms fuse to form a single chromosome with a single centromere.
http://en.wikipedia.org/wiki/Robertsonian_translocation
http://en.wikipedia.org/wiki/Robertsonian_fusion
wiki
A chromosomal mutation. Rearrangements that alter the pairing of telomeres are classified as translocations.
http://en.wikipedia.org/wiki/Chromosomal_translocation
chromosomal translocation
sequence
(Drosophila)T
(fungi)T
SO:1000044
chromosomal_translocation
A chromosomal mutation. Rearrangements that alter the pairing of telomeres are classified as translocations.
FB:reference_manual
http://en.wikipedia.org/wiki/Chromosomal_translocation
wiki
A ring chromosome is a chromosome whose arms have fused together to form a ring, often with the loss of the ends of the chromosome.
http://en.wikipedia.org/wiki/Ring_chromosome
ring chromosome
sequence
(Drosophila)R
(fungi)C
SO:1000045
ring_chromosome
A ring chromosome is a chromosome whose arms have fused together to form a ring, often with the loss of the ends of the chromosome.
http://en.wikipedia.org/wiki/Ring_chromosome
http://en.wikipedia.org/wiki/Ring_chromosome
wiki
A chromosomal inversion that includes the centromere.
pericentric inversion
sequence
SO:1000046
pericentric_inversion
A chromosomal inversion that includes the centromere.
FB:reference_manual
A chromosomal inversion that does not include the centromere.
paracentric inversion
sequence
SO:1000047
paracentric_inversion
A chromosomal inversion that does not include the centromere.
FB:reference_manual
A chromosomal translocation with two breaks; two chromosome segments have simply been exchanged.
reciprocal chromosomal translocation
sequence
SO:1000048
reciprocal_chromosomal_translocation
A chromosomal translocation with two breaks; two chromosome segments have simply been exchanged.
FB:reference_manual
Any change in mature, spliced and processed, RNA that results from a change in the corresponding DNA sequence.
SO:0001576
SO:1000177
SO:1000179
mutation affecting transcript
sequence variant causing partially characterised change in transcript
sequence variant causing uncharacterised change in transcript
sequence variation affecting transcript
sequence_variant_causing_partially_characterised_change_in_transcript
sequence_variant_causing_uncharacterised_change_in_transcript
sequence
mutation causing partially characterised change in transcript
mutation causing uncharacterised change in transcript
SO:1000049
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variation_affecting_transcript
true
Any change in mature, spliced and processed, RNA that results from a change in the corresponding DNA sequence.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
No effect on the state of the RNA.
sequence variant causing no change in transcript
sequence
mutation causing no change in transcript
SO:1000050
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. Also, as there is no change, it is not a good ontological term.
sequence_variant_causing_no_change_in_transcript
true
No effect on the state of the RNA.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
Any of the amino acid coding triplets of a gene are affected by the DNA mutation.
SO:0001580
mutation affecting coding sequence
sequence
sequence variation affecting coding sequence
SO:1000054
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variation_affecting_coding_sequence
true
Any of the amino acid coding triplets of a gene are affected by the DNA mutation.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
The DNA mutation changes, usually destroys, the first coding triplet of a gene. Usually prevents translation although another initiator codon may be used.
SO:0001582
sequence variant causing initiator codon change in transcript
sequence
mutation causing initiator codon change in transcript
SO:1000055
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_initiator_codon_change_in_transcript
true
The DNA mutation changes, usually destroys, the first coding triplet of a gene. Usually prevents translation although another initiator codon may be used.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
The DNA mutation affects the amino acid coding sequence of a gene; this region includes both the initiator and terminator codons.
SO:0001606
sequence variant causing amino acid coding codon change in transcript
sequence
mutation causing amino acid coding codon change in transcript
SO:1000056
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_amino_acid_coding_codon_change_in_transcript
true
The DNA mutation affects the amino acid coding sequence of a gene; this region includes both the initiator and terminator codons.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
The changed codon has the same translation product as the original codon.
SO:0001819
sequence variant causing synonymous codon change in transcript
sequence
mutation causing synonymous codon change in transcript
SO:1000057
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_synonymous_codon_change_in_transcript
true
The changed codon has the same translation product as the original codon.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
A DNA point mutation that causes a substitution of an amino acid by another.
SO:0001583
non-synonymous codon change in transcript
sequence variant causing non synonymous codon change in transcript
sequence
mutation causing non synonymous codon change in transcript
SO:1000058
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_non_synonymous_codon_change_in_transcript
true
A DNA point mutation that causes a substitution of an amino acid by another.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
The nucleotide change in the codon leads to a new codon coding for a new amino acid.
SO:0001583
sequence variant causing missense codon change in transcript
sequence
mutation causing missense codon change in transcript
SO:1000059
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_missense_codon_change_in_transcript
true
The nucleotide change in the codon leads to a new codon coding for a new amino acid.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
The amino acid change following from the codon change does not change the gross properties (size, charge, hydrophobicity) of the amino acid at that position.
SO:0001585
sequence variant causing conservative missense codon change in transcript
sequence
mutation causing conservative missense codon change in transcript
SO:1000060
The exact rules need to be stated; a common set of rules can be derived from e.g. the BLOSUM62 amino acid distance matrix.
sequence_variant_causing_conservative_missense_codon_change_in_transcript
true
The amino acid change following from the codon change does not change the gross properties (size, charge, hydrophobicity) of the amino acid at that position.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
The amino acid change following from the codon change changes the gross properties (size, charge, hydrophobicity) of the amino acid in that position.
SO:0001586
sequence variant causing nonconservative missense codon change in transcript
sequence
mutation causing nonconservative missense codon change in transcript
SO:1000061
The exact rules need to be stated; a common set of rules can be derived from e.g. the BLOSUM62 amino acid distance matrix.
sequence_variant_causing_nonconservative_missense_codon_change_in_transcript
true
The amino acid change following from the codon change changes the gross properties (size, charge, hydrophobicity) of the amino acid in that position.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
The nucleotide change in the codon triplet creates a terminator codon.
SO:0001587
sequence variant causing nonsense codon change in transcript
sequence
mutation causing nonsense codon change in transcript
SO:1000062
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_nonsense_codon_change_in_transcript
true
The nucleotide change in the codon triplet creates a terminator codon.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
The nucleotide change in the codon triplet changes the stop codon, causing an elongated transcript sequence.
SO:0001590
sequence variant causing terminator codon change in transcript
sequence
mutation causing terminator codon change in transcript
SO:1000063
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_terminator_codon_change_in_transcript
true
The nucleotide change in the codon triplet changes the stop codon, causing an elongated transcript sequence.
SO:ke
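The codon-change definitions above (synonymous, missense, nonsense, terminator codon change) can be told apart by translating the reference and variant codons. The Python sketch below assumes the standard genetic code; the names `CODON_TABLE` and `classify_codon_change` are hypothetical, and the returned labels are informal shorthand for the definitions above rather than ontology term names.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
# "*" marks a terminator (stop) codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codon_change(ref_codon, alt_codon):
    """Classify a codon change under the standard genetic code.

    Returns one of "synonymous", "missense", "nonsense" or
    "terminator codon change".
    """
    ref = ref_codon.upper().replace("U", "T")
    alt = alt_codon.upper().replace("U", "T")
    if ref not in CODON_TABLE or alt not in CODON_TABLE:
        raise ValueError("codons must be three of A, C, G, T (or U)")
    if ref == alt:
        raise ValueError("reference and variant codons are identical")
    ref_aa, alt_aa = CODON_TABLE[ref], CODON_TABLE[alt]
    if ref_aa == alt_aa:
        return "synonymous"               # same translation product
    if alt_aa == "*":
        return "nonsense"                 # change creates a terminator codon
    if ref_aa == "*":
        return "terminator codon change"  # stop codon replaced by a coding codon
    return "missense"                     # new codon codes for a new amino acid
```

Splitting missense changes into conservative and non-conservative would need an explicit rule set, which the source notes has not been stated.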
An umbrella term for terms describing an effect of a sequence variation on the frame of translation.
mutation affecting reading frame
sequence
sequence variation affecting reading frame
SO:1000064
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variation_affecting_reading_frame
true
An umbrella term for terms describing an effect of a sequence variation on the frame of translation.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
A mutation causing a disruption of the translational reading frame, because the number of nucleotides inserted or deleted is not a multiple of three.
http://en.wikipedia.org/wiki/Frameshift_mutation
frameshift mutation
sequence
frameshift sequence variation
out of frame mutation
SO:1000065
frameshift_sequence_variation
true
A mutation causing a disruption of the translational reading frame, because the number of nucleotides inserted or deleted is not a multiple of three.
SO:ke
http://en.wikipedia.org/wiki/Frameshift_mutation
wiki
A mutation causing a disruption of the translational reading frame, due to the insertion of a nucleotide.
SO:0001594
plus 1 frameshift mutation
sequence variant causing plus 1 frameshift mutation
sequence
SO:1000066
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_plus_1_frameshift_mutation
true
A mutation causing a disruption of the translational reading frame, due to the insertion of a nucleotide.
SO:ke
A mutation causing a disruption of the translational reading frame, due to the deletion of a nucleotide.
SO:0001592
minus 1 frameshift mutation
sequence variant causing minus 1 frameshift
sequence
SO:1000067
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_minus_1_frameshift
true
A mutation causing a disruption of the translational reading frame, due to the deletion of a nucleotide.
SO:ke
A mutation causing a disruption of the translational reading frame, due to the insertion of two nucleotides.
SO:0001595
plus 2 frameshift mutation
sequence variant causing plus 2 frameshift
sequence
SO:1000068
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_plus_2_frameshift
true
A mutation causing a disruption of the translational reading frame, due to the insertion of two nucleotides.
SO:ke
A mutation causing a disruption of the translational reading frame, due to the deletion of two nucleotides.
SO:0001593
minus 2 frameshift mutation
sequence variant causing minus 2 frameshift
sequence
SO:1000069
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_minus_2_frameshift
true
A mutation causing a disruption of the translational reading frame, due to the deletion of two nucleotides.
SO:ke
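The frameshift definitions above rest on one arithmetic rule: the reading frame is disrupted when the number of nucleotides inserted or deleted is not a multiple of three. A short Python sketch of that rule follows; `is_frameshift` and `frameshift_term` are hypothetical helper names, and the plus/minus terms they return are the entries listed above, which the source marks as obsolete.

```python
# Net length change -> (SO identifier, SO term name), from the entries above.
FRAMESHIFT_TERMS = {
    1: ("SO:1000066", "sequence_variant_causing_plus_1_frameshift_mutation"),
    -1: ("SO:1000067", "sequence_variant_causing_minus_1_frameshift"),
    2: ("SO:1000068", "sequence_variant_causing_plus_2_frameshift"),
    -2: ("SO:1000069", "sequence_variant_causing_minus_2_frameshift"),
}
GENERIC_FRAMESHIFT = ("SO:1000065", "frameshift_sequence_variation")


def is_frameshift(inserted, deleted):
    """True when the net length change is not a multiple of three."""
    if inserted < 0 or deleted < 0:
        raise ValueError("nucleotide counts must be non-negative")
    return (inserted - deleted) % 3 != 0


def frameshift_term(inserted, deleted):
    """Return the matching (SO id, name), or None if the frame is kept.

    Net changes other than +1, -1, +2 or -2 fall back to the generic
    frameshift term; that fallback is a choice made for this sketch.
    """
    if not is_frameshift(inserted, deleted):
        return None
    return FRAMESHIFT_TERMS.get(inserted - deleted, GENERIC_FRAMESHIFT)
```

For example, `frameshift_term(0, 2)` selects the minus 2 entry, while a three-base deletion returns `None` because the frame is preserved.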
Sequence variant affects the way in which the primary transcriptional product is processed to form the mature transcript.
SO:0001543
sequence variant affecting transcript processing
sequence
mutation affecting transcript processing
SO:1000070
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_transcript_processing
true
Sequence variant affects the way in which the primary transcriptional product is processed to form the mature transcript.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
A sequence_variant_effect where the way in which the primary transcriptional product is processed to form the mature transcript, specifically by the removal (splicing) of intron sequences is changed.
SO:0001568
sequence variant affecting splicing
sequence
mutation affecting splicing
SO:1000071
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_splicing
true
A sequence_variant_effect where the way in which the primary transcriptional product is processed to form the mature transcript, specifically by the removal (splicing) of intron sequences is changed.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
A sequence_variant_effect that changes the splice donor sequence.
SO:0001575
splice donor mutation
sequence
mutation affecting splice donor
sequence variant affecting splice donor
SO:1000072
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_splice_donor
true
A sequence_variant_effect that changes the splice donor sequence.
SO:ke
A sequence_variant_effect that changes the splice acceptor sequence.
SO:0001574
splice acceptor mutation
sequence
mutation affecting splicing
sequence variant affecting splice acceptor
SO:1000073
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_splice_acceptor
true
A sequence_variant_effect that changes the splice acceptor sequence.
SO:ke
A sequence variant causing a new (functional) splice site.
SO:0001569
cryptic splice activator sequence variant
sequence variant causing cryptic splice activator
sequence
mutation causing cryptic splice activator
SO:1000074
A cryptic splice site is only used when the natural splice site has been disrupted by a sequence alteration.
sequence_variant_causing_cryptic_splice_activation
true
A sequence variant causing a new (functional) splice site.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
Sequence variant affects the editing of the transcript.
SO:0001544
sequence variant affecting editing
sequence
mutation affecting editing
SO:1000075
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_editing
true
Sequence variant affects the editing of the transcript.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
Mutation affects the process of transcription, its initiation, progression or termination.
SO:0001549
sequence variant affecting transcription
sequence
mutation affecting transcription
SO:1000076
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_transcription
true
Mutation affects the process of transcription, its initiation, progression or termination.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
A sequence variation that decreases the rate at which transcription of the sequence occurs.
sequence variation decreasing rate of transcription
sequence
mutation decreasing rate of transcription
SO:1000078
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_decreasing_rate_of_transcription
true
A sequence variation that decreases the rate at which transcription of the sequence occurs.
SO:ke
mutation affecting transcript sequence
sequence variation affecting transcript sequence
sequence
SO:1000079
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variation_affecting_transcript_sequence
true
sequence variation increasing rate of transcription
sequence
mutation increasing rate of transcription
SO:1000080
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_increasing_rate_of_transcription
true
A mutation that alters the rate at which transcription of the sequence occurs.
SO:0001550
sequence variant affecting rate of transcription
sequence
mutation affecting rate of transcription
SO:1000081
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_rate_of_transcription
true
A mutation that alters the rate at which transcription of the sequence occurs.
SO:ke
Sequence variant affects the stability of the transcript.
SO:0001546
sequence variant affecting transcript stability
sequence
mutation affecting transcript stability
SO:1000082
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_transcript_stability
true
Sequence variant affects the stability of the transcript.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
Sequence variant increases the stability (half-life) of the transcript.
sequence variant increasing transcript stability
sequence
mutation increasing transcript stability
SO:1000083
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_increasing_transcript_stability
true
Sequence variant increases the stability (half-life) of the transcript.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
Sequence variant decreases the stability (half-life) of the transcript.
sequence variant decreasing transcript stability
sequence
mutation decreasing transcript stability
SO:1000084
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_decreasing_transcript_stability
true
Sequence variant decreases the stability (half-life) of the transcript.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
A sequence variation that causes a change in the level of mature, spliced and processed RNA, resulting from a change in the corresponding DNA sequence.
SO:0001540
sequence variation affecting level of transcript
sequence
mutation affecting level of transcript
SO:1000085
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variation_affecting_level_of_transcript
true
A sequence variation that causes a change in the level of mature, spliced and processed RNA, resulting from a change in the corresponding DNA sequence.
SO:ke
A sequence variation that causes a decrease in the level of mature, spliced and processed RNA, resulting from a change in the corresponding DNA sequence.
mutation decreasing level of transcript
sequence
sequence variation decreasing level of transcript
SO:1000086
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variation_decreasing_level_of_transcript
true
A sequence variation that causes a decrease in the level of mature, spliced and processed RNA, resulting from a change in the corresponding DNA sequence.
SO:ke
A sequence_variation that causes an increase in the level of mature, spliced and processed RNA, resulting from a change in the corresponding DNA sequence.
mutation increasing level of transcript
sequence variation increasing level of transcript
sequence
SO:1000087
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variation_increasing_level_of_transcript
true
A sequence_variation that causes an increase in the level of mature, spliced and processed RNA, resulting from a change in the corresponding DNA sequence.
SO:ke
A sequence variant causing a change in primary translation product of a transcript.
SO:0001553
SO:1000090
SO:1000091
sequence variant affecting translational product
sequence variant causing partially characterised change of translational product
sequence variant causing uncharacterised change of translational product
sequence_variant_causing_partially_characterised_change_of_translational_product
sequence_variant_causing_uncharacterised_change_of_translational_product
sequence
mutation affecting translational product
mutation causing partially characterised change of translational product
mutation causing uncharacterised change of translational product
SO:1000088
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_translational_product
true
A sequence variant causing a change in primary translation product of a transcript.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
The sequence variant at RNA level does not lead to any change in polypeptide.
sequence variant causing no change of translational product
sequence
mutation causing no change of translational product
SO:1000089
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. Also, as there is no change, this is not a good ontological term.
sequence_variant_causing_no_change_of_translational_product
true
The sequence variant at RNA level does not lead to any change in polypeptide.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
true
true
Any sequence variant effect that is known at nucleotide level but cannot be explained by using other key terms.
SO:0001539
sequence variant causing complex change of translational product
sequence
mutation causing complex change of translational product
SO:1000092
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_complex_change_of_translational_product
true
Any sequence variant effect that is known at nucleotide level but cannot be explained by using other key terms.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
The replacement of a single amino acid by another.
SO:0001606
sequence variant causing amino acid substitution
sequence
mutation causing amino acid substitution
SO:1000093
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_amino_acid_substitution
true
The replacement of a single amino acid by another.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
SO:0001607
sequence variant causing conservative amino acid substitution
sequence
mutation causing conservative amino acid substitution
SO:1000094
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_conservative_amino_acid_substitution
true
SO:0001607
sequence variant causing nonconservative amino acid substitution
sequence
mutation causing nonconservative amino acid substitution
SO:1000095
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_nonconservative_amino_acid_substitution
true
The insertion of one or more amino acids into the polypeptide, without affecting the surrounding sequence.
SO:0001605
sequence variant causing amino acid insertion
sequence
mutation causing amino acid insertion
SO:1000096
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_amino_acid_insertion
true
The insertion of one or more amino acids into the polypeptide, without affecting the surrounding sequence.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
The deletion of one or more amino acids from the polypeptide, without affecting the surrounding sequence.
SO:0001825
sequence variant causing amino acid deletion
sequence
mutation causing amino acid deletion
SO:1000097
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_amino_acid_deletion
true
The deletion of one or more amino acids from the polypeptide, without affecting the surrounding sequence.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
The translational product is truncated at its C-terminus, usually a result of a nonsense codon change in transcript (SO:1000062).
SO:0001587
sequence variant causing polypeptide truncation
sequence
mutation causing polypeptide truncation
SO:1000098
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_polypeptide_truncation
true
The translational product is truncated at its C-terminus, usually a result of a nonsense codon change in transcript (SO:1000062).
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
The extension of the translational product at the N-terminus, the C-terminus, or both.
SO:0001609
sequence variant causing polypeptide elongation
sequence
mutation causing polypeptide elongation
SO:1000099
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_polypeptide_elongation
true
The extension of the translational product at the N-terminus, the C-terminus, or both.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
.
SO:0001611
mutation causing polypeptide N terminal elongation
polypeptide N-terminal elongation
sequence
SO:1000100
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
mutation_causing_polypeptide_N_terminal_elongation
true
.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
.
SO:0001610
mutation causing polypeptide C terminal elongation
polypeptide C-terminal elongation
sequence
SO:1000101
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
mutation_causing_polypeptide_C_terminal_elongation
true
.
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
SO:0001553
sequence variant affecting level of translational product
sequence
mutation affecting level of translational product
SO:1000102
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_level_of_translational_product
true
SO:0001555
sequence variant decreasing level of translation product
sequence
mutation decreasing level of translation product
SO:1000103
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_decreasing_level_of_translation_product
true
sequence variant increasing level of translation product
sequence
mutation increasing level of translation product
SO:1000104
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_increasing_level_of_translation_product
true
SO:0001603
sequence variant affecting polypeptide amino acid sequence
sequence
mutation affecting polypeptide amino acid sequence
SO:1000105
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_polypeptide_amino_acid_sequence
true
SO:0001614
inframe polypeptide N-terminal elongation
mutation causing inframe polypeptide N terminal elongation
sequence
SO:1000106
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
mutation_causing_inframe_polypeptide_N_terminal_elongation
true
SO:0001615
mutation causing out of frame polypeptide N terminal elongation
out of frame polypeptide N-terminal elongation
sequence
SO:1000107
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
mutation_causing_out_of_frame_polypeptide_N_terminal_elongation
true
SO:0001612
inframe polypeptide C-terminal elongation
mutation causing inframe polypeptide C terminal elongation
sequence
SO:1000108
mutaton_causing_inframe_polypeptide_C_terminal_elongation
true
SO:0001613
mutation causing out of frame polypeptide C terminal elongation
out of frame polypeptide C-terminal elongation
sequence
SO:1000109
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
mutation_causing_out_of_frame_polypeptide_C_terminal_elongation
true
A mutation that reverts the sequence of a previous frameshift mutation back to the initial frame.
frame restoring mutation
frame restoring sequence variant
sequence
SO:1000110
frame_restoring_sequence_variant
true
A mutation that reverts the sequence of a previous frameshift mutation back to the initial frame.
SO:ke
A mutation that changes the amino acid sequence of the peptide in such a way that it changes the 3D structure of the molecule.
SO:0001599
SO:1000113
SO:1000114
sequence variant affecting 3D structure of polypeptide
sequence variant affecting 3D-structure of polypeptide
sequence variant causing partially characterised 3D structural change
sequence variant causing uncharacterised 3D structural change
sequence_variant_causing_partially_characterised_3D_structural_change
sequence_variant_causing_uncharacterised_3D_structural_change
sequence
mutation affecting 3D structure of polypeptide
mutation causing partially characterised 3D structural change
mutation causing uncharacterised 3D structural change
SO:1000111
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_3D_structure_of_polypeptide
true
A mutation that changes the amino acid sequence of the peptide in such a way that it changes the 3D structure of the molecule.
SO:ke
sequence variant causing no 3D structural change
sequence
mutation causing no 3D structural change
SO:1000112
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. Also as there is no effect, it is not a good term.
sequence_variant_causing_no_3D_structural_change
true
true
true
SO:0001600
sequence variant causing complex 3D structural change
sequence
mutation causing complex 3D structural change
SO:1000115
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_complex_3D_structural_change
true
SO:0001601
sequence variant causing conformational change
sequence
mutation causing conformational change
SO:1000116
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_conformational_change
true
SO:0001554
sequence variant affecting polypeptide function
sequence
mutation affecting polypeptide function
SO:1000117
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_polypeptide_function
true
SO:0001559
sequence variant causing loss of function of polypeptide
sequence
loss of function of polypeptide
mutation causing loss of function of polypeptide
SO:1000118
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_loss_of_function_of_polypeptide
true
SO:0001560
sequence variant causing inactive ligand binding site
sequence
mutation causing inactive ligand binding site
SO:1000119
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_inactive_ligand_binding_site
true
SO:0001618
sequence variant causing inactive catalytic site
sequence
mutation causing inactive catalytic site
SO:1000120
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_inactive_catalytic_site
true
SO:0001558
sequence variant causing polypeptide localization change
sequence
mutation causing polypeptide localization change
SO:1000121
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_polypeptide_localization_change
true
SO:0001562
polypeptide post-translational processing affected
sequence variant causing polypeptide post translational processing change
sequence
mutation causing polypeptide post translational processing change
SO:1000122
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_polypeptide_post_translational_processing_change
true
sequence
polypeptide_post-translational_processing_affected
SO:1000123
polypeptide_post_translational_processing_affected
true
SO:0001561
partial loss of function of polypeptide
sequence variant causing partial loss of function of polypeptide
sequence
mutation causing partial loss of function of polypeptide
SO:1000124
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_partial_loss_of_function_of_polypeptide
true
SO:0001557
gain of function of polypeptide
sequence variant causing gain of function of polypeptide
sequence
mutation causing gain of function of polypeptide
SO:1000125
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_gain_of_function_of_polypeptide
true
A sequence variant that affects the secondary structure (folding) of the RNA transcript molecule.
SO:0001596
sequence variant affecting transcript secondary structure
sequence
mutation affecting transcript secondary structure
SO:1000126
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_transcript_secondary_structure
true
A sequence variant that affects the secondary structure (folding) of the RNA transcript molecule.
SO:ke
SO:0001597
sequence variant causing compensatory transcript secondary structure mutation
sequence
mutation causing compensatory transcript secondary structure mutation
SO:1000127
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_compensatory_transcript_secondary_structure_mutation
true
The effect of a change in nucleotide sequence.
sequence
sequence variant effect
SO:1000132
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
Updated after discussion with Peter Taschner - Feb 09.
sequence_variant_effect
true
The effect of a change in nucleotide sequence.
SO:ke
SO:0001616
sequence variant causing polypeptide fusion
sequence
mutation causing polypeptide fusion
SO:1000134
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_polypeptide_fusion
true
An autosynaptic chromosome is the aneuploid product of recombination between a pericentric inversion and a cytologically wild-type chromosome.
autosynaptic chromosome
sequence
(Drosophila)A
SO:1000136
autosynaptic_chromosome
An autosynaptic chromosome is the aneuploid product of recombination between a pericentric inversion and a cytologically wild-type chromosome.
PMID:6804304
A compound chromosome whereby two copies of the same chromosomal arm are attached to a common centromere. The chromosome is diploid for the arm involved.
homo compound chromosome
homo-compound chromosome
sequence
SO:1000138
homo_compound_chromosome
A compound chromosome whereby two copies of the same chromosomal arm are attached to a common centromere. The chromosome is diploid for the arm involved.
SO:ke
A compound chromosome whereby two arms from different chromosomes are connected through the centromere of one of them.
hetero compound chromosome
hetero-compound chromosome
sequence
SO:1000140
hetero_compound_chromosome
A compound chromosome whereby two arms from different chromosomes are connected through the centromere of one of them.
FB:reference_manual
SO:ke
A chromosome that arose by the division of a larger chromosome.
chromosome fission
sequence
SO:1000141
chromosome_fission
A chromosome that arose by the division of a larger chromosome.
SO:ke
An autosynaptic chromosome carrying the two right (D = dextro) telomeres.
dextrosynaptic chromosome
sequence
SO:1000142
Corrected spelling from dexstrosynaptic_chromosome to dextrosynaptic_chromosome on April 14, 2020 in response to GitHub request #447
dextrosynaptic_chromosome
An autosynaptic chromosome carrying the two right (D = dextro) telomeres.
FB:manual
A laevosynaptic chromosome (LS) is an autosynaptic chromosome carrying the two left (L = levo) telomeres.
laevosynaptic chromosome
sequence
SO:1000143
laevosynaptic_chromosome
A laevosynaptic chromosome (LS) is an autosynaptic chromosome carrying the two left (L = levo) telomeres.
FB:manual
A chromosome structure variation whereby the duplicated sequences are carried as a free centric element.
free duplication
sequence
SO:1000144
free_duplication
A chromosome structure variation whereby the duplicated sequences are carried as a free centric element.
FB:reference_manual
A ring chromosome which is a copy of another chromosome.
free ring duplication
sequence
(Drosophila)R
SO:1000145
free_ring_duplication
A ring chromosome which is a copy of another chromosome.
SO:ke
true
A chromosomal deletion whereby a translocation occurs in which one of the four broken ends loses a segment before re-joining.
deficient translocation
sequence
(Drosophila)Df
(Drosophila)DfT
SO:1000147
deficient_translocation
A chromosomal deletion whereby a translocation occurs in which one of the four broken ends loses a segment before re-joining.
FB:reference_manual
A chromosomal translocation whereby the first two breaks are in the same chromosome, and the region between them is rejoined in inverted order to the other side of the first break, such that both sides of break one are present on the same chromosome. The remaining free ends are joined as a translocation with those resulting from the third break.
inversion cum translocation
sequence
(Drosophila)InT
(Drosophila)T
SO:1000148
inversion_cum_translocation
A chromosomal translocation whereby the first two breaks are in the same chromosome, and the region between them is rejoined in inverted order to the other side of the first break, such that both sides of break one are present on the same chromosome. The remaining free ends are joined as a translocation with those resulting from the third break.
FB:reference_manual
An interchromosomal mutation whereby the (large) region between the first two breaks listed is lost, and the two flanking segments (one of them centric) are joined as a translocation to the free ends resulting from the third break.
bipartite duplication
sequence
(Drosophila)bDp
SO:1000149
bipartite_duplication
An interchromosomal mutation whereby the (large) region between the first two breaks listed is lost, and the two flanking segments (one of them centric) are joined as a translocation to the free ends resulting from the third break.
FB:reference_manual
A chromosomal translocation whereby three breaks occurred in three different chromosomes. The centric segment resulting from the first break listed is joined to the acentric segment resulting from the second, rather than the third.
cyclic translocation
sequence
SO:1000150
cyclic_translocation
A chromosomal translocation whereby three breaks occurred in three different chromosomes. The centric segment resulting from the first break listed is joined to the acentric segment resulting from the second, rather than the third.
FB:reference_manual
A chromosomal inversion caused by three breaks in the same chromosome; both central segments are inverted in place (i.e., they are not transposed).
bipartite inversion
sequence
(Drosophila)bIn
SO:1000151
bipartite_inversion
A chromosomal inversion caused by three breaks in the same chromosome; both central segments are inverted in place (i.e., they are not transposed).
FB:reference_manual
An insertional duplication where a copy of the segment between the first two breaks listed is inserted at the third break; the insertion is in cytologically the same orientation as its flanking segments.
uninverted insertional duplication
sequence
(Drosophila)eDp
SO:1000152
uninverted_insertional_duplication
An insertional duplication where a copy of the segment between the first two breaks listed is inserted at the third break; the insertion is in cytologically the same orientation as its flanking segments.
FB:reference_manual
An insertional duplication where a copy of the segment between the first two breaks listed is inserted at the third break; the insertion is in cytologically inverted orientation with respect to its flanking segments.
inverted insertional duplication
sequence
(Drosophila)iDp
SO:1000153
inverted_insertional_duplication
An insertional duplication where a copy of the segment between the first two breaks listed is inserted at the third break; the insertion is in cytologically inverted orientation with respect to its flanking segments.
FB:reference_manual
A chromosome duplication involving the insertion of a duplicated region (as opposed to a free duplication).
insertional duplication
sequence
(Drosophila)Dpp
SO:1000154
insertional_duplication
A chromosome duplication involving the insertion of a duplicated region (as opposed to a free duplication).
SO:ke
A chromosome structure variation whereby a transposition occurred between chromosomes.
interchromosomal transposition
sequence
(Drosophila)Tp
SO:1000155
interchromosomal_transposition
A chromosome structure variation whereby a transposition occurred between chromosomes.
SO:ke
An interchromosomal transposition whereby a copy of the segment between the first two breaks listed is inserted at the third break; the insertion is in cytologically inverted orientation with respect to its flanking segments.
inverted interchromosomal transposition
sequence
(Drosophila)iTp
SO:1000156
inverted_interchromosomal_transposition
An interchromosomal transposition whereby a copy of the segment between the first two breaks listed is inserted at the third break; the insertion is in cytologically inverted orientation with respect to its flanking segments.
FB:reference_manual
An interchromosomal transposition whereby the segment between the first two breaks listed is removed and inserted at the third break; the insertion is in cytologically the same orientation as its flanking segments.
uninverted interchromosomal transposition
sequence
(Drosophila)eTp
SO:1000157
uninverted_interchromosomal_transposition
An interchromosomal transposition whereby the segment between the first two breaks listed is removed and inserted at the third break; the insertion is in cytologically the same orientation as its flanking segments.
FB:reference_manual
An intrachromosomal transposition whereby the segment between the first two breaks listed is removed and inserted at the third break; the insertion is in cytologically inverted orientation with respect to its flanking segments.
inverted intrachromosomal transposition
sequence
(Drosophila)iTp
SO:1000158
inverted_intrachromosomal_transposition
An intrachromosomal transposition whereby the segment between the first two breaks listed is removed and inserted at the third break; the insertion is in cytologically inverted orientation with respect to its flanking segments.
FB:reference_manual
An intrachromosomal transposition whereby the segment between the first two breaks listed is removed and inserted at the third break; the insertion is in cytologically the same orientation as its flanking segments.
uninverted intrachromosomal transposition
sequence
(Drosophila)eTp
SO:1000159
uninverted_intrachromosomal_transposition
An intrachromosomal transposition whereby the segment between the first two breaks listed is removed and inserted at the third break; the insertion is in cytologically the same orientation as its flanking segments.
FB:reference_manual
An insertional duplication where a copy of the segment between the first two breaks listed is inserted at the third break; the orientation of the insertion with respect to its flanking segments is not recorded.
unoriented insertional duplication
sequence
(Drosophila)uDp
SO:1000160
Flag - unknown in the definition.
unoriented_insertional_duplication
An insertional duplication where a copy of the segment between the first two breaks listed is inserted at the third break; the orientation of the insertion with respect to its flanking segments is not recorded.
FB:reference_manual
An interchromosomal transposition whereby a copy of the segment between the first two breaks listed is inserted at the third break; the orientation of the insertion with respect to its flanking segments is not recorded.
unorientated interchromosomal transposition
sequence
(Drosophila)uTp
SO:1000161
FLAG - term describes an unknown.
unoriented_interchromosomal_transposition
An interchromosomal transposition whereby a copy of the segment between the first two breaks listed is inserted at the third break; the orientation of the insertion with respect to its flanking segments is not recorded.
FB:reference_manual
An intrachromosomal transposition whereby the segment between the first two breaks listed is removed and inserted at the third break; the orientation of the insertion with respect to its flanking segments is not recorded.
unorientated intrachromosomal transposition
sequence
(Drosophila)uTp
SO:1000162
FLAG - definition describes an unknown.
unoriented_intrachromosomal_transposition
An intrachromosomal transposition whereby the segment between the first two breaks listed is removed and inserted at the third break; the orientation of the insertion with respect to its flanking segments is not recorded.
FB:reference_manual
A chromosome structure variant that has not been characterized.
uncharacterized chromosomal mutation
sequence
SO:1000170
uncharacterized_chromosomal_mutation
A chromosomal deletion whereby three breaks occur in the same chromosome; one central region is lost, and the other is inverted.
deficient inversion
sequence
(Drosophila)Df
(Drosophila)DfIn
SO:1000171
deficient_inversion
A chromosomal deletion whereby three breaks occur in the same chromosome; one central region is lost, and the other is inverted.
FB:reference_manual
SO:ke
A duplication consisting of 2 identical adjacent regions.
tandem duplication
sequence
erverted
SO:1000173
tandem_duplication
A duplication consisting of 2 identical adjacent regions.
SO:ke
erverted
http://www.ncbi.nlm.nih.gov/dbvar/
A chromosome structure variant that has not been characterized fully.
partially characterized chromosomal mutation
sequence
SO:1000175
partially_characterized_chromosomal_mutation
true
true
A sequence_variant_effect that changes the gene structure.
SO:0001564
sequence variant affecting gene structure
sequence
mutation affecting gene structure
SO:1000180
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_gene_structure
true
A sequence_variant_effect that changes the gene structure.
SO:ke
A sequence_variant_effect that changes the gene structure by causing a fusion to another gene.
SO:0001565
sequence variant causing gene fusion
sequence
mutation causing gene fusion
SO:1000181
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_gene_fusion
true
A sequence_variant_effect that changes the gene structure by causing a fusion to another gene.
SO:ke
A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number.
Jannovar:chromosome_number_variation
chromosome number variation
sequence
SO:1000182
chromosome_number_variation
A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number.
SO:ke
Jannovar:chromosome_number_variation
http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html
An alteration of the genome that leads to a change in the structure or number of one or more chromosomes.
http://snpeff.sourceforge.net/SnpEff_manual.html
chromosome structure variation
snpEff:CHROMOSOME_LARGE_DELETION
sequence
SO:1000183
chromosome_structure_variation
snpEff:CHROMOSOME_LARGE_DELETION
A sequence variant that affects splicing and causes an exon loss.
sequence variant causes exon loss
sequence
mutation causes exon loss
SO:1000184
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causes_exon_loss
true
A sequence variant that affects splicing and causes an exon loss.
SO:ke
A sequence variant effect, causing an intron to be gained by the processed transcript; usually a result of a donor acceptor mutation (SO:1000072).
sequence variant causes intron gain
sequence
mutation causes intron gain
SO:1000185
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causes_intron_gain
true
A sequence variant effect, causing an intron to be gained by the processed transcript; usually a result of a donor acceptor mutation (SO:1000072).
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html
SO:0001571
sequence variant causing cryptic splice donor activation
sequence
SO:1000186
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_cryptic_splice_donor_activation
true
SO:0001570
sequence variant causing cryptic splice acceptor activation
sequence
SO:1001186
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_causing_cryptic_splice_acceptor_activation
true
A transcript that is alternatively spliced.
alternatively spliced transcript
sequence
SO:1001187
alternatively_spliced_transcript
A transcript that is alternatively spliced.
SO:xp
A gene that is alternatively spliced, but encodes only one polypeptide.
encodes 1 polypeptide
sequence
SO:1001188
encodes_1_polypeptide
A gene that is alternatively spliced, but encodes only one polypeptide.
SO:ke
A gene that is alternatively spliced and encodes more than one polypeptide.
encodes greater than 1 polypeptide
sequence
SO:1001189
encodes_greater_than_1_polypeptide
A gene that is alternatively spliced and encodes more than one polypeptide.
SO:ke
A gene that is alternatively spliced and encodes more than one polypeptide, where the polypeptides have overlapping peptide sequences but use different stop codons.
encodes different polypeptides different stop
sequence
SO:1001190
encodes_different_polypeptides_different_stop
A gene that is alternatively spliced and encodes more than one polypeptide, where the polypeptides have overlapping peptide sequences but use different stop codons.
SO:ke
A gene that is alternatively spliced and encodes more than one polypeptide, where the polypeptides have overlapping peptide sequences but use different start codons.
encodes overlapping peptides different start
sequence
SO:1001191
encodes_overlapping_peptides_different_start
A gene that is alternatively spliced and encodes more than one polypeptide, where the polypeptides have overlapping peptide sequences but use different start codons.
SO:ke
A gene that is alternatively spliced and encodes more than one polypeptide, where the polypeptides do not have overlapping peptide sequences.
encodes disjoint polypeptides
sequence
SO:1001192
encodes_disjoint_polypeptides
A gene that is alternatively spliced and encodes more than one polypeptide, where the polypeptides do not have overlapping peptide sequences.
SO:ke
A gene that is alternatively spliced and encodes more than one polypeptide, where the polypeptides have overlapping peptide sequences but use different start and stop codons.
encodes overlapping polypeptides different start and stop
sequence
SO:1001193
encodes_overlapping_polypeptides_different_start_and_stop
A gene that is alternatively spliced and encodes more than one polypeptide, where the polypeptides have overlapping peptide sequences but use different start and stop codons.
SO:ke
sequence
SO:1001194
alternatively_spliced_gene_encoding_greater_than_1_polypeptide_coding_regions_overlapping
true
A gene that is alternatively spliced and encodes more than one polypeptide, where the polypeptides have overlapping peptide sequences.
encodes overlapping peptides
sequence
SO:1001195
encodes_overlapping_peptides
A gene that is alternatively spliced and encodes more than one polypeptide, where the polypeptides have overlapping peptide sequences.
SO:ke
A maxicircle gene so extensively edited that it cannot be matched to its edited mRNA sequence.
sequence
SO:1001196
cryptogene
A maxicircle gene so extensively edited that it cannot be matched to its edited mRNA sequence.
SO:ma
A primary transcript that has the quality dicistronic.
dicistronic primary transcript
sequence
SO:1001197
dicistronic_primary_transcript
A primary transcript that has the quality dicistronic.
SO:xp
A gene that is a member of a group of genes that are either regulated or transcribed together.
member of regulon
sequence
SO:1001217
member_of_regulon
sequence
alternatively_spliced_transcript_encoding_greater_than_1_polypeptide_different_start_codon_different_stop_codon_coding_regions_non-overlapping
SO:1001244
alternatively_spliced_transcript_encoding_greater_than_1_polypeptide_different_start_codon_different_stop_codon_coding_regions_non_overlapping
true
A CDS with the evidence status of being independently known.
CDS independently known
sequence
SO:1001246
CDS_independently_known
A CDS with the evidence status of being independently known.
SO:xp
A CDS whose predicted amino acid sequence is unsupported by any experimental evidence or by any match with any other known sequence.
orphan CDS
sequence
SO:1001247
orphan_CDS
A CDS whose predicted amino acid sequence is unsupported by any experimental evidence or by any match with any other known sequence.
SO:ma
A CDS that is supported by domain similarity.
CDS supported by domain match data
sequence
SO:1001249
CDS_supported_by_domain_match_data
A CDS that is supported by domain similarity.
SO:xp
A CDS that is supported by sequence similarity data.
CDS supported by sequence similarity data
sequence
SO:1001251
CDS_supported_by_sequence_similarity_data
A CDS that is supported by sequence similarity data.
SO:xp
A CDS that is predicted.
CDS predicted
sequence
SO:1001254
CDS_predicted
A CDS that is predicted.
SO:ke
sequence
SO:1001255
status_of_coding_sequence
true
A CDS that is supported by similarity to EST or cDNA data.
CDS supported by EST or cDNA data
sequence
SO:1001259
CDS_supported_by_EST_or_cDNA_data
A CDS that is supported by similarity to EST or cDNA data.
SO:xp
A Shine-Dalgarno sequence that stimulates recoding through interactions with the anti-Shine-Dalgarno in the RNA of small ribosomal subunits of translating ribosomes. The signal is only operative in Bacteria.
internal Shine Dalgarno sequence
internal Shine-Dalgarno sequence
sequence
SO:1001260
internal_Shine_Dalgarno_sequence
A Shine-Dalgarno sequence that stimulates recoding through interactions with the anti-Shine-Dalgarno in the RNA of small ribosomal subunits of translating ribosomes. The signal is only operative in Bacteria.
PMID:12519954
SO:ke
The sequence of a mature mRNA transcript, modified before translation or during translation, usually by special cis-acting signals.
recoded mRNA
sequence
SO:1001261
recoded_mRNA
The sequence of a mature mRNA transcript, modified before translation or during translation, usually by special cis-acting signals.
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=8811194&dopt=Abstract
An attribute describing a translational frameshift of -1.
minus 1 translationally frameshifted
sequence
SO:1001262
minus_1_translationally_frameshifted
An attribute describing a translational frameshift of -1.
SO:ke
An attribute describing a translational frameshift of +1.
plus 1 translationally frameshifted
sequence
SO:1001263
plus_1_translationally_frameshifted
An attribute describing a translational frameshift of +1.
SO:ke
A recoded_mRNA where translation was suspended at a particular codon and resumed at a particular non-overlapping downstream codon.
mRNA recoded by translational bypass
sequence
SO:1001264
mRNA_recoded_by_translational_bypass
A recoded_mRNA where translation was suspended at a particular codon and resumed at a particular non-overlapping downstream codon.
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=8811194&dopt=Abstract
A recoded_mRNA that was modified by an alteration of codon meaning.
mRNA recoded by codon redefinition
sequence
SO:1001265
mRNA_recoded_by_codon_redefinition
A recoded_mRNA that was modified by an alteration of codon meaning.
SO:ma
sequence
SO:1001266
stop_codon_redefinition_as_selenocysteine
true
sequence
SO:1001267
stop_codon_readthrough
true
A site in an mRNA sequence that stimulates the recoding of a region in the same mRNA.
INSDC_feature:regulatory
INSDC_qualifier:recoding_stimulatory_region
recoding stimulatory region
recoding stimulatory signal
sequence
SO:1001268
recoding_stimulatory_region
A site in an mRNA sequence that stimulates the recoding of a region in the same mRNA.
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=12519954&dopt=Abstract
A non-canonical start codon with 4 base pairs.
4bp start codon
four bp start codon
sequence
SO:1001269
four_bp_start_codon
A non-canonical start codon with 4 base pairs.
SO:ke
sequence
SO:1001270
stop_codon_redefinition_as_pyrrolysine
true
An intron characteristic of archaeal tRNA and rRNA genes, where the intron transcript generates a bulge-helix-bulge motif that is recognised by a splicing endoribonuclease.
archaeal intron
sequence
SO:1001271
Intron characteristic of tRNA genes; splices by an endonuclease-ligase mediated mechanism.
archaeal_intron
An intron characteristic of archaeal tRNA and rRNA genes, where the intron transcript generates a bulge-helix-bulge motif that is recognised by a splicing endoribonuclease.
PMID:9301331
SO:ma
An intron found in tRNA that is spliced via endonucleolytic cleavage and ligation rather than transesterification.
pre-tRNA intron
tRNA intron
sequence
SO:1001272
Could be a cross product with Gene ontology, GO:0006388.
tRNA_intron
An intron found in tRNA that is spliced via endonucleolytic cleavage and ligation rather than transesterification.
SO:ke
A non-canonical start codon of sequence CTG.
CTG start codon
sequence
SO:1001273
CTG_start_codon
A non-canonical start codon of sequence CTG.
SO:ke
The incorporation of selenocysteine into a protein sequence is directed by an in-frame UGA codon (usually a stop codon) within the coding region of the mRNA. Selenoprotein mRNAs contain a conserved secondary structure in the 3' UTR that is required for the distinction of UGA stop from UGA selenocysteine. The selenocysteine insertion sequence (SECIS) is around 60 nt in length and adopts a hairpin structure which is sufficiently well-defined and conserved to act as a computational screen for selenoprotein genes.
http://en.wikipedia.org/wiki/SECIS_element
SECIS element
sequence
SO:1001274
SECIS_element
The incorporation of selenocysteine into a protein sequence is directed by an in-frame UGA codon (usually a stop codon) within the coding region of the mRNA. Selenoprotein mRNAs contain a conserved secondary structure in the 3' UTR that is required for the distinction of UGA stop from UGA selenocysteine. The selenocysteine insertion sequence (SECIS) is around 60 nt in length and adopts a hairpin structure which is sufficiently well-defined and conserved to act as a computational screen for selenoprotein genes.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00031
http://en.wikipedia.org/wiki/SECIS_element
wiki
Sequence coding for a short, single-stranded, DNA sequence via a retrotransposed RNA intermediate; characteristic of some microbial genomes.
sequence
SO:1001275
retron
Sequence coding for a short, single-stranded, DNA sequence via a retrotransposed RNA intermediate; characteristic of some microbial genomes.
SO:ma
The recoding stimulatory signal located downstream of the recoding site.
three prime recoding site
sequence
SO:1001277
three_prime_recoding_site
The recoding stimulatory signal located downstream of the recoding site.
SO:ke
A recoding stimulatory region in which the stem-loop secondary structural element is downstream of the redefined region.
three prime stem loop structure
sequence
SO:1001279
three_prime_stem_loop_structure
A recoding stimulatory region in which the stem-loop secondary structural element is downstream of the redefined region.
PMID:12519954
SO:ke
The recoding stimulatory signal located upstream of the recoding site.
five prime recoding site
sequence
SO:1001280
five_prime_recoding_site
The recoding stimulatory signal located upstream of the recoding site.
SO:ke
Four base pair sequence immediately downstream of the redefined region. The redefined region is a frameshift site. The quadruplet is 2 overlapping codons.
flanking three prime quadruplet recoding signal
sequence
SO:1001281
flanking_three_prime_quadruplet_recoding_signal
Four base pair sequence immediately downstream of the redefined region. The redefined region is a frameshift site. The quadruplet is 2 overlapping codons.
PMID:12519954
SO:ke
A stop codon signal for a UAG stop codon redefinition.
UAG stop codon signal
sequence
SO:1001282
UAG_stop_codon_signal
A stop codon signal for a UAG stop codon redefinition.
SO:ke
A stop codon signal for a UAA stop codon redefinition.
UAA stop codon signal
sequence
SO:1001283
UAA_stop_codon_signal
A stop codon signal for a UAA stop codon redefinition.
SO:ke
A set of units of gene expression directly regulated by a common set of one or more common regulatory gene products.
http://en.wikipedia.org/wiki/Regulon
sequence
SO:1001284
Definition updated with Mejia-Almonte et al. (PMID:32665585) on Aug 5, 2020. Added relationship has_part SO:0002300.
regulon
A set of units of gene expression directly regulated by a common set of one or more common regulatory gene products.
ISBN:0198506732
PMID:32665585
http://en.wikipedia.org/wiki/Regulon
wiki
A stop codon signal for a UGA stop codon redefinition.
UGA stop codon signal
sequence
SO:1001285
UGA_stop_codon_signal
A stop codon signal for a UGA stop codon redefinition.
SO:ke
A recoding stimulatory signal consisting of a downstream sequence that contains repetitive elements and is important for recoding.
three prime repeat recoding signal
sequence
SO:1001286
three_prime_repeat_recoding_signal
A recoding stimulatory signal consisting of a downstream sequence that contains repetitive elements and is important for recoding.
PMID:12519954
SO:ke
A recoding signal that is found many hundreds of nucleotides 3' of a redefined stop codon.
distant three prime recoding signal
sequence
SO:1001287
distant_three_prime_recoding_signal
A recoding signal that is found many hundreds of nucleotides 3' of a redefined stop codon.
http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=8709208&dopt=Abstract
A recoding stimulatory signal that is a stop codon and affects the efficiency of recoding.
stop codon signal
sequence
SO:1001288
This term does not include the stop codons that are redefined. For example, a stop codon that partially overlaps a frameshifting site would be a stimulatory signal.
stop_codon_signal
A recoding stimulatory signal that is a stop codon and affects the efficiency of recoding.
PMID:12519954
SO:ke
The sequence referred to by an entry in a databank such as GenBank or SwissProt.
databank entry
sequence
accession
SO:2000061
databank_entry
The sequence referred to by an entry in a databank such as GenBank or SwissProt.
SO:ke
A gene component region which acts as a recombinational unit of a gene whose functional form is generated through somatic recombination.
gene segment
sequence
SO:3000000
Requested by tracker 2021594, July 2008, by Alex.
gene_segment
A gene component region which acts as a recombinational unit of a gene whose functional form is generated through somatic recombination.
GOC:add