Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags

Sandro J. De Souza, Anamaria A. Camargo, Marcelo R.S. Briones, Fernando F. Costa, Maria Aparecida Nagai, Sergio Verjovski-Almeida, Marco A. Zago, Luis Eduardo C. Andrade, Helaine Carrer, Hamza F.A. El-Dorry, Enilza M. Espreafico, Angelita Habr-Gama, Daniel Giannella-Neto, Gustavo H. Goldman, Arthur Gruber, Christine Hackel, Edna T. Kimura, Rui M.B. Maciel, Suely K.N. Marie, Elizabeth A.L. Martins, Marina P. Nóbrega, Maria Luisa Paçó-Larson, Maria Inès M.C. Pardini, Gonçalo G. Pereira, João Bosco Pesquero, Vanderlei Rodrigues, Silvia R. Rogatto, Ismael D.C.G. Da Silva, Mari C. Sogayar, Maria De Fátima Sonati, Eloiza H. Tajara, Sandro R. Valentini, Marcio Acencio, Fernando L. Alberto, Maria Elisabete J. Amaral, Ivy Aneas, Mario Henrique Bengtson, Dirce M. Carraro, Alex F. Carvalho, Lúcia Helena Carvalho, Janete M. Cerutti, Maria Lucia C. Corrêa, Maria Cristina R. Costa, Cyntia Curcio, Tsieko Gushiken, Paulo L. Ho, Elza Kimura, Luciana C.C. Leite, Gustavo Maia, Paromita Majumder, Mozart Marins, Adriana Matsukuma, Analy S.A. Melo, Carlos Alberto Mestriner, Elisabete C. Miracca, Daniela C. Miranda, Ana Lucia T.O. Nascimento, Francisco G. Nóbrega, Élida P.B. Ojopi, Jose Rodrigo C. Pandolfi, Luciana Gilbert Pessoa, Paula Rahal, Claudia A. Rainho, Nancy Da Ro's, Renata G. De Sá, Magaly M. Sales, Neusa P. Da Silva, Tereza C. Silva, Wilson Da Silva, Daniel F. Simão, Josane F. Sousa, Daniella Stecconi, Fernando Tsukumo, Valeria Valente, Heloisa Zalcberg, Ricardo R. Brentani, Luis F.L. Reis, Emmanuel Dias-Neto, Andrew J.G. Simpson

Research output: Contribution to journal (Article)

49 Citations (Scopus)

Abstract

Transcribed sequences in the human genome can be identified with confidence only by alignment with sequences derived from cDNAs synthesized from naturally occurring mRNAs. We constructed a set of 250,000 cDNAs that represent partial expressed gene sequences and that are biased toward the central coding regions of the resulting transcripts. They are termed ORF expressed sequence tags (ORESTES). The 250,000 ORESTES were assembled into 81,429 contigs. Of these, 1,181 (1.45%) were found to match sequences in chromosome 22, with at least one ORESTES contig for 162 (65.6%) of the 247 known genes, for 67 (44.6%) of the 150 related genes, and for 45 (30.4%) of the 148 EST-predicted genes on this chromosome. Using a set of stringent criteria to validate our sequences, we identified a further 219 previously unannotated transcribed sequences on chromosome 22. Of these, 171 were in fact also defined by EST or full-length cDNA sequences available in GenBank but not utilized in the initial annotation of the first human chromosome sequence. Thus, despite representing less than 15% of all expressed human sequences in the public databases at the time of the present analysis, ORESTES sequences defined 48 transcribed sequences on chromosome 22 not defined by other sequences. All of the transcribed sequences defined by ORESTES coincided with DNA regions predicted as encoding exons by GENSCAN (http://genes.mit.edu/GENSCAN.html).
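The counts quoted in the abstract can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch: the numbers are copied from the abstract, while the variable and function names are assumptions introduced here and do not come from the paper.

```python
# Counts copied from the abstract; all names below are illustrative only.
total_contigs = 81_429   # ORESTES contigs assembled from 250,000 sequences
chr22_contigs = 1_181    # contigs matching chromosome 22

# (genes hit by at least one ORESTES contig, genes annotated) per category
coverage = {
    "known": (162, 247),
    "related": (67, 150),
    "EST-predicted": (45, 148),
}


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


chr22_share = percent(chr22_contigs, total_contigs)
coverage_pct = {name: percent(hit, n) for name, (hit, n) in coverage.items()}

# Previously unannotated transcribed sequences, and the subset that was
# also supported by other GenBank EST or cDNA sequences.
new_transcribed = 219
also_in_genbank = 171
orestes_only = new_transcribed - also_in_genbank

print(f"chromosome 22 share of contigs: {chr22_share:.2f}%")
for name, pct in coverage_pct.items():
    print(f"{name} genes covered: {pct:.2f}%")
print(f"sequences defined only by ORESTES: {orestes_only}")
```

The computed shares agree with the figures in the abstract to one decimal place, with one small caveat: 67 of 150 is 44.67%, so the reported 44.6% looks truncated rather than rounded.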

Original language: English (US)
Pages (from-to): 12690-12693
Number of pages: 4
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 97
Issue number: 23
DOI: https://doi.org/10.1073/pnas.97.23.12690
State: Published - Nov 7 2000

Fingerprint

Chromosomes, Human, Pair 22
Expressed Sequence Tags
Human Chromosomes
Open Reading Frames
Complementary DNA
Genes
Sequence Alignment
Nucleic Acid Databases
Human Genome
Sequence Analysis
Exons
Chromosomes
Databases
Messenger RNA
DNA

ASJC Scopus subject areas

  • General

Cite this

De Souza, S. J., Camargo, A. A., Briones, M. R. S., Costa, F. F., Nagai, M. A., Verjovski-Almeida, S., ... Simpson, A. J. G. (2000). Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags. Proceedings of the National Academy of Sciences of the United States of America, 97(23), 12690-12693. https://doi.org/10.1073/pnas.97.23.12690

@article{7a7ac5c507b84bc48b12f05647bb5c67,
title = "Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags",
abstract = "Transcribed sequences in the human genome can be identified with confidence only by alignment with sequences derived from cDNAs synthesized from naturally occurring mRNAs. We constructed a set of 250,000 cDNAs that represent partial expressed gene sequences and that are biased toward the central coding regions of the resulting transcripts. They are termed ORF expressed sequence tags (ORESTES). The 250,000 ORESTES were assembled into 81,429 contigs. Of these, 1,181 (1.45{\%}) were found to match sequences in chromosome 22 with at least one ORESTES contig for 162 (65.6{\%}) of the 247 known genes, for 67 (44.6{\%}) of the 150 related genes, and for 45 of the 148 (30.4{\%}) EST-predicted genes on this chromosome. Using a set of stringent criteria to validate our sequences, we identified a further 219 previously unannotated transcribed sequences on chromosome 22. Of these, 171 were in fact also defined by EST or full length cDNA sequences available in GenBank but not utilized in the initial annotation of the first human chromosome sequence. Thus despite representing less than 15{\%} of all expressed human sequences in the public databases at the time of the present analysis, ORESTES sequences defined 48 transcribed sequences on chromosome 22 not defined by other sequences. All of the transcribed sequences defined by ORESTES coincided with DNA regions predicted as encoding exons by GENSCAN. (http://genes.mit.edu/GENSCAN.html).",
author = "{De Souza}, {Sandro J.} and Camargo, {Anamaria A.} and Briones, {Marcelo R.S.} and Costa, {Fernando F.} and Nagai, {Maria Aparecida} and Sergio Verjovski-Almeida and Zago, {Marco A.} and Andrade, {Luis Eduardo C.} and Helaine Carrer and El-Dorry, {Hamza F.A.} and Espreafico, {Enilza M.} and Angelita Habr-Gama and Daniel Giannella-Neto and Goldman, {Gustavo H.} and Arthur Gruber and Christine Hackel and Kimura, {Edna T.} and Maciel, {Rui M.B.} and Marie, {Suely K.N.} and Martins, {Elizabeth A.L.} and N{\'o}brega, {Marina P.} and Pa{\cc}{\'o}-Larson, {Maria Luisa} and Pardini, {Maria In{\`e}s M.C.} and Pereira, {Gon{\cc}alo G.} and Pesquero, {Jo{\~a}o Bosco} and Vanderlei Rodrigues and Rogatto, {Silvia R.} and {Da Silva}, {Ismael D.C.G.} and Sogayar, {Mari C.} and {De F{\'a}tima Sonati}, Maria and Tajara, {Eloiza H.} and Valentini, {Sandro R.} and Marcio Acencio and Alberto, {Fernando L.} and Amaral, {Maria Elisabete J.} and Ivy Aneas and Bengtson, {Mario Henrique} and Carraro, {Dirce M.} and Carvalho, {Alex F.} and Carvalho, {L{\'u}cia Helena} and Cerutti, {Janete M.} and Corr{\^e}a, {Maria Lucia C.} and Costa, {Maria Cristina R.} and Cyntia Curcio and Tsieko Gushiken and Ho, {Paulo L.} and Elza Kimura and Leite, {Luciana C.C.} and Gustavo Maia and Paromita Majumder and Mozart Marins and Adriana Matsukuma and Melo, {Analy S.A.} and Mestriner, {Carlos Alberto} and Miracca, {Elisabete C.} and Miranda, {Daniela C.} and Nascimento, {Ana Lucia T.O.} and N{\'o}brega, {Francisco G.} and Ojopi, {{\'E}lida P.B.} and Pandolfi, {Jose Rodrigo C.} and Pessoa, {Luciana Gilbert} and Paula Rahal and Rainho, {Claudia A.} and {Da Ro's}, Nancy and {De S{\'a}}, {Renata G.} and Sales, {Magaly M.} and {Da Silva}, {Neusa P.} and Silva, {Tereza C.} and {Da Silva}, Wilson and Sim{\~a}o, {Daniel F.} and Sousa, {Josane F.} and Daniella Stecconi and Fernando Tsukumo and Valeria Valente and Heloisa Zalcberg and Brentani, {Ricardo R.} and Reis, {Luis F.L.} and Emmanuel Dias-Neto and 
Simpson, {Andrew J.G.}",
year = "2000",
month = "11",
day = "7",
doi = "10.1073/pnas.97.23.12690",
language = "English (US)",
volume = "97",
pages = "12690--12693",
journal = "Proceedings of the National Academy of Sciences of the United States of America",
issn = "0027-8424",
number = "23",

}

TY - JOUR

T1 - Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags

AU - De Souza, Sandro J.

AU - Camargo, Anamaria A.

AU - Briones, Marcelo R.S.

AU - Costa, Fernando F.

AU - Nagai, Maria Aparecida

AU - Verjovski-Almeida, Sergio

AU - Zago, Marco A.

AU - Andrade, Luis Eduardo C.

AU - Carrer, Helaine

AU - El-Dorry, Hamza F.A.

AU - Espreafico, Enilza M.

AU - Habr-Gama, Angelita

AU - Giannella-Neto, Daniel

AU - Goldman, Gustavo H.

AU - Gruber, Arthur

AU - Hackel, Christine

AU - Kimura, Edna T.

AU - Maciel, Rui M.B.

AU - Marie, Suely K.N.

AU - Martins, Elizabeth A.L.

AU - Nóbrega, Marina P.

AU - Paçó-Larson, Maria Luisa

AU - Pardini, Maria Inès M.C.

AU - Pereira, Gonçalo G.

AU - Pesquero, João Bosco

AU - Rodrigues, Vanderlei

AU - Rogatto, Silvia R.

AU - Da Silva, Ismael D.C.G.

AU - Sogayar, Mari C.

AU - De Fátima Sonati, Maria

AU - Tajara, Eloiza H.

AU - Valentini, Sandro R.

AU - Acencio, Marcio

AU - Alberto, Fernando L.

AU - Amaral, Maria Elisabete J.

AU - Aneas, Ivy

AU - Bengtson, Mario Henrique

AU - Carraro, Dirce M.

AU - Carvalho, Alex F.

AU - Carvalho, Lúcia Helena

AU - Cerutti, Janete M.

AU - Corrêa, Maria Lucia C.

AU - Costa, Maria Cristina R.

AU - Curcio, Cyntia

AU - Gushiken, Tsieko

AU - Ho, Paulo L.

AU - Kimura, Elza

AU - Leite, Luciana C.C.

AU - Maia, Gustavo

AU - Majumder, Paromita

AU - Marins, Mozart

AU - Matsukuma, Adriana

AU - Melo, Analy S.A.

AU - Mestriner, Carlos Alberto

AU - Miracca, Elisabete C.

AU - Miranda, Daniela C.

AU - Nascimento, Ana Lucia T.O.

AU - Nóbrega, Francisco G.

AU - Ojopi, Élida P.B.

AU - Pandolfi, Jose Rodrigo C.

AU - Pessoa, Luciana Gilbert

AU - Rahal, Paula

AU - Rainho, Claudia A.

AU - Da Ro's, Nancy

AU - De Sá, Renata G.

AU - Sales, Magaly M.

AU - Da Silva, Neusa P.

AU - Silva, Tereza C.

AU - Da Silva, Wilson

AU - Simão, Daniel F.

AU - Sousa, Josane F.

AU - Stecconi, Daniella

AU - Tsukumo, Fernando

AU - Valente, Valeria

AU - Zalcberg, Heloisa

AU - Brentani, Ricardo R.

AU - Reis, Luis F.L.

AU - Dias-Neto, Emmanuel

AU - Simpson, Andrew J.G.

PY - 2000/11/7

Y1 - 2000/11/7

N2 - Transcribed sequences in the human genome can be identified with confidence only by alignment with sequences derived from cDNAs synthesized from naturally occurring mRNAs. We constructed a set of 250,000 cDNAs that represent partial expressed gene sequences and that are biased toward the central coding regions of the resulting transcripts. They are termed ORF expressed sequence tags (ORESTES). The 250,000 ORESTES were assembled into 81,429 contigs. Of these, 1,181 (1.45%) were found to match sequences in chromosome 22 with at least one ORESTES contig for 162 (65.6%) of the 247 known genes, for 67 (44.6%) of the 150 related genes, and for 45 of the 148 (30.4%) EST-predicted genes on this chromosome. Using a set of stringent criteria to validate our sequences, we identified a further 219 previously unannotated transcribed sequences on chromosome 22. Of these, 171 were in fact also defined by EST or full length cDNA sequences available in GenBank but not utilized in the initial annotation of the first human chromosome sequence. Thus despite representing less than 15% of all expressed human sequences in the public databases at the time of the present analysis, ORESTES sequences defined 48 transcribed sequences on chromosome 22 not defined by other sequences. All of the transcribed sequences defined by ORESTES coincided with DNA regions predicted as encoding exons by GENSCAN. (http://genes.mit.edu/GENSCAN.html).

UR - http://www.scopus.com/inward/record.url?scp=12944266836&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=12944266836&partnerID=8YFLogxK

U2 - 10.1073/pnas.97.23.12690

DO - 10.1073/pnas.97.23.12690

M3 - Article

C2 - 11070084

AN - SCOPUS:12944266836

VL - 97

SP - 12690

EP - 12693

JO - Proceedings of the National Academy of Sciences of the United States of America

JF - Proceedings of the National Academy of Sciences of the United States of America

SN - 0027-8424

IS - 23

ER -