RefSeq

      The Reference Sequence (RefSeq) database is an open access, annotated and curated collection of publicly available nucleotide sequences (DNA, RNA) and their protein products. RefSeq was introduced in 2000. The database is built by the National Center for Biotechnology Information (NCBI) and, unlike GenBank, provides only a single record for each natural biological molecule (i.e. DNA, RNA or protein) for major organisms ranging from viruses to bacteria to eukaryotes.
      For each model organism, RefSeq aims to provide separate and linked records for the genomic DNA, the gene transcripts, and the proteins arising from those transcripts. RefSeq is limited to major organisms for which sufficient data are available (121,461 distinct "named" organisms as of July 2022), while GenBank includes sequences for any organism submitted (approximately 504,000 formally described species).


      RefSeq categories


      The RefSeq collection comprises different data types with different origins, so standard categories and identifiers are needed to store each data type. The most important categories are:

      For more details and more categories, see Table 1 in Chapter 18 of the book The Reference Sequence (RefSeq) Database.
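      As an illustration of how these categories surface in practice, RefSeq accessions carry a two-letter prefix followed by an underscore, and the prefix indicates the molecule type. The sketch below covers only a subset of the commonly documented prefixes; the function name, the dictionary layout and the short descriptions are assumptions made for this example, and the pattern deliberately handles only accessions with a purely numeric body (so whole-genome-shotgun style accessions are out of scope).

```python
import re

# Subset of commonly documented RefSeq accession prefixes.
# The wording of each description is a paraphrase for illustration.
REFSEQ_PREFIXES = {
    "NC_": ("genomic", "complete genomic molecule"),
    "NG_": ("genomic", "incomplete genomic region"),
    "NT_": ("genomic", "contig or scaffold"),
    "NW_": ("genomic", "contig or scaffold, mainly whole-genome shotgun"),
    "NM_": ("mRNA", "protein-coding transcript"),
    "NR_": ("RNA", "non-protein-coding transcript"),
    "XM_": ("mRNA", "predicted protein-coding transcript"),
    "XR_": ("RNA", "predicted non-protein-coding transcript"),
    "NP_": ("protein", "protein product of an NM_ or NC_ record"),
    "XP_": ("protein", "predicted protein product"),
    "WP_": ("protein", "non-redundant protein"),
}

# prefix, numeric body, optional ".version" suffix
ACCESSION_RE = re.compile(r"^([A-Z]{2}_)(\d+)(?:\.(\d+))?$")


def classify_accession(accession):
    """Split a RefSeq-style accession and look up its molecule type.

    Hypothetical helper: raises ValueError for anything that does not
    match the simple numeric pattern or uses a prefix not listed above.
    """
    match = ACCESSION_RE.match(accession.strip())
    if match is None:
        raise ValueError(f"not a recognised RefSeq-style accession: {accession!r}")
    prefix, number, version = match.groups()
    if prefix not in REFSEQ_PREFIXES:
        raise ValueError(f"prefix {prefix!r} is not in this illustrative table")
    molecule, description = REFSEQ_PREFIXES[prefix]
    return {
        "prefix": prefix,
        "number": number,
        "version": int(version) if version is not None else None,
        "molecule": molecule,
        "description": description,
    }


if __name__ == "__main__":
    for acc in ("NM_000546.6", "NP_000537", "NC_000017.11"):
        print(acc, classify_accession(acc))
```

      Keeping the prefix table as plain data makes it easy to extend with further categories from the referenced Table 1 without touching the parsing logic.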


      RefSeq Projects


      Several projects to improve RefSeq services are currently in development by the NCBI, often in collaboration with research centers such as EMBL-EBI:

      Consensus CDS (CCDS): This project aims to identify a core set of human and mouse protein-coding regions and standardize sets of genes with high and consistent levels of genomic annotation quality. This project was announced in 2009 and is still in development.
      RefSeq Functional Elements (RefSeqFE): It is focused on describing non-genic functional elements, including gene regulatory regions such as enhancers, silencers and DNase I hypersensitive regions, as well as DNA replication origins. The current scope of this project is restricted to the human and mouse genomes.
      RefSeqGene: Its main goal is to define genomic sequences to be used as reference standards for well-characterized genes. Previously described mRNA, protein and chromosome sequences have two weaknesses as reference standards: mRNA and protein sequences do not provide explicit genomic coordinates for gene-flanking and intronic regions, and chromosome sequences have awkwardly large coordinates that change with every new genome assembly. The RefSeqGene project is designed to address these weaknesses.
      Targeted Loci: This project records molecular markers, especially protein-coding and ribosomal RNA loci, that are used for phylogenetic and barcoding analysis. The scope of this project includes sequences for Archaea, Bacteria and Fungi, accessible via Entrez and BLAST queries. It also includes GenBank sequences for Animals, Plants and Protists, accessible via BLAST queries.
      Virus Variation (ViV): It is a specialized resource of sequence data processing pipelines and analysis tools for displaying and retrieving sequences from several viral groups, such as influenza virus, ebolavirus, MERS coronavirus and Zika virus. New viruses, processing pipelines, tools and other features are added regularly.
      RefSeq Select: Many genes are represented by multiple RefSeq transcripts and proteins because of alternative splicing, and this complexity is problematic for studies such as comparative genomics or the exchange of clinical variant data. This project aims to designate a RefSeq Select transcript as the most representative one for every protein-coding gene, based on multiple criteria: prior use in clinical databases, transcript expression, evolutionary conservation of the coding region, etc.
      MANE (Matched Annotation from the NCBI and EMBL-EBI): It is a collaborative project between NCBI and EMBL-EBI whose main goal is to define a set of transcripts and their proteins for all the protein-coding genes in the human genome, thereby reducing the differences in transcript annotation between the RefSeq and Ensembl/GENCODE annotation systems. The MANE Select transcript set is created as a universal standard for clinical reporting and comparative or evolutionary genomics. A second set, MANE Plus Clinical, is created with additional transcripts to report all Pathogenic (P) or Likely Pathogenic (LP) clinical variants available in public resources. This project was announced in 2018 and is expected to finish in 2022.
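
      Several of the projects above expose their records through Entrez. A minimal sketch of how a single RefSeq record could be requested programmatically is to build an EFetch URL for NCBI's E-utilities service. The helper name and its defaults are assumptions made for this example, the accession is only a placeholder, and the code merely constructs the URL without sending a request; the E-utilities documentation should be consulted for usage limits and the full parameter list.

```python
from urllib.parse import urlencode

# Public E-utilities EFetch endpoint.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="nuccore", rettype="fasta"):
    """Return an EFetch URL for one accession (hypothetical helper).

    db      -- Entrez database, e.g. "nuccore" for nucleotide records
               or "protein" for protein records
    rettype -- requested format, e.g. "fasta", or "gb" for a
               GenBank-style flat file from a nucleotide database
    """
    params = {
        "db": db,
        "id": accession,
        "rettype": rettype,
        "retmode": "text",
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder accession used only to show the URL shape.
    print(build_efetch_url("NM_000546"))
    print(build_efetch_url("NP_000537", db="protein"))
```

      Separating URL construction from the network call keeps the example testable offline; the resulting URL can be passed to any HTTP client.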


      Statistics


      According to RefSeq release 213 (July 2022), the numbers of species represented in the database, counted by distinct taxonomic IDs, are as follows:

      The counts of accessions and base pairs per molecule type are:


      See also


      GenBank
      Sequence analysis
      Sequence profiling tool
      Sequence motif
      UniProt
      List of sequenced eukaryotic genomes
      List of sequenced archaeal genomes


      Sources


      This article incorporates public domain material from NCBI Handbook. National Center for Biotechnology Information.


      External links


      RefSeq
      GenBank, RefSeq, TPA and UniProt: What's in a Name?
