Comprehensive comparisons of the current human, mouse, and rat RefSeq, Ensembl, EST, and FANTOM3 datasets: identification of new human genes with specific tissue expression profile
Research output: Contribution to journal › Journal article › Research › peer-review
Standard
Comprehensive comparisons of the current human, mouse, and rat RefSeq, Ensembl, EST, and FANTOM3 datasets: identification of new human genes with specific tissue expression profile. / Nordström, Karl J V; Mirza, Majd A I; Larsson, Thomas P; Gloriam, David E.; Fredriksson, Robert; Schiöth, Helgi B.
In: Biochemical and Biophysical Research Communications, Vol. 348, No. 3, 29.09.2006, p. 1063-74.
RIS
TY - JOUR
T1 - Comprehensive comparisons of the current human, mouse, and rat RefSeq, Ensembl, EST, and FANTOM3 datasets
T2 - identification of new human genes with specific tissue expression profile
AU - Nordström, Karl J V
AU - Mirza, Majd A I
AU - Larsson, Thomas P
AU - Gloriam, David E.
AU - Fredriksson, Robert
AU - Schiöth, Helgi B
PY - 2006/9/29
Y1 - 2006/9/29
N2 - Our understanding of functional genetic elements in the genomes is continuously growing and new entries are entered in various databases on a regular basis. We have here merged the genetic elements in RefSeq, Ensembl, FANTOM3, HINV, and NCBI:s ESTdb using the genome assemblies in order to achieve a comprehensive picture of the current status of the identity and gene number in human, mouse, and rat. The number of human protein coding genes has not increased (25,043) while the increased sequencing of mouse transcripts has provided the considerably higher number of protein coding genes (31,578) in mouse. The results indicate large discrepancies between the datasets, as considerable numbers of unique transcripts can be found in each dataset. Despite the high number of ncRNA (38,129 in mouse) there are also almost 20,000 EST clusters in both mouse and humans with more than one EST that do not overlap any transcript suggesting that several new genetic elements are still to be found. We also demonstrated presence of new genes by identifying new human ones that have specific tissue profiles, using RT-PCR on rat tissues.
KW - Animals
KW - Computational Biology
KW - Databases, Genetic
KW - Expressed Sequence Tags
KW - Gene Expression Profiling
KW - Genome
KW - Genome, Human
KW - Humans
KW - Mice
KW - Multigene Family
KW - Organ Specificity
KW - Rats
KW - Reverse Transcriptase Polymerase Chain Reaction
KW - Software
U2 - 10.1016/j.bbrc.2006.07.153
DO - 10.1016/j.bbrc.2006.07.153
M3 - Journal article
C2 - 16904064
VL - 348
SP - 1063
EP - 1074
JO - Biochemical and Biophysical Research Communications
JF - Biochemical and Biophysical Research Communications
SN - 0006-291X
IS - 3
ER -
ID: 45811572