Published online 19 November 2013 | Kim D. Pruitt*, Garth R. Brown, Susan M. Hiatt, Françoise Thibaud-Nissen, Alexander Astashyn, Olga Ermolaeva, Catherine M. Farrell, Jennifer Hart, Melissa J. Landrum, Kelly M. McGarvey, Michael R. Murphy, Nuala A. O'Leary, Shashikant Pujar, Bhanu Rajput, Sanjida H. Rangwala, Lillian D. Riddick, Andrei Shkeda, Hanzhen Sun, Pamela Tamez, Raymond E. Tully, Craig Wallin, David Webb, Janet Weber, Wendy Wu, Michael DiCuccio, Paul Kitts, Donna R. Maglott, Terence D. Murphy and James M. Ostell
The National Center for Biotechnology Information (NCBI) has updated its Reference Sequence (RefSeq) database, which includes annotated genomic, transcript, and protein sequences. The recent updates focus on the growth of the mammalian and human subsets, improvements to the eukaryotic annotation pipeline, and modifications to transcript and protein records. The addition of RNAseq data to the pipeline has significantly expanded the number of transcripts and novel exons annotated on mammalian RefSeq genomes. Recent changes include reporting supporting evidence for transcript records, modifying exon feature annotation, and adding structured reports of gene and sequence attributes. The protein annotation policy for alternatively spliced transcripts with more divergent predicted proteins has also been revised. The RefSeqGene project, which provides stable coordinate systems for clinical testing laboratories, is described, along with the current status of its development. The article highlights the comprehensive nature of the RefSeq dataset, its accessibility, and the ongoing efforts to improve its quality and utility.