TY - JOUR
T1 - proGenomes3: approaching one million accurately and consistently annotated high-quality prokaryotic genomes
AU - Fullam, Anthony
AU - Letunic, Ivica
AU - Schmidt, Thomas S. B.
AU - Ducarmon, Quinten R.
AU - Karcher, Nicolai
AU - Khedkar, Supriya
AU - Kuhn, Michael
AU - Larralde, Martin
AU - Maistrenko, Oleksandr M.
AU - Malfertheiner, Lukas
AU - Milanese, Alessio
AU - Rodrigues, Joao Frederico Matias
AU - Sanchis-López, Claudia
AU - Schudoma, Christian
AU - Szklarczyk, Damian
AU - Sunagawa, Shinichi
AU - Zeller, Georg
AU - Huerta-Cepas, Jaime
AU - von Mering, Christian
AU - Bork, Peer
AU - Mende, Daniel R.
N1 - Publisher Copyright: © 2023 The Author(s). Published by Oxford University Press on behalf of Nucleic Acids Research.
PY - 2023/1/6
Y1 - 2023/1/6
N2 - The interpretation of genomic, transcriptomic and other microbial 'omics data is highly dependent on the availability of well-annotated genomes. As the number of publicly available microbial genomes continues to increase exponentially, the need for quality control and consistent annotation is becoming critical. We present proGenomes3, a database of 907 388 high-quality genomes containing 4 billion genes that passed stringent criteria and have been consistently annotated using multiple functional and taxonomic databases including mobile genetic elements and biosynthetic gene clusters. proGenomes3 encompasses 41 171 species-level clusters, defined based on universal single copy marker genes, for which pan-genomes and contextual habitat annotations are provided. The database is available at http://progenomes.embl.de/
UR - http://www.scopus.com/inward/record.url?scp=85153113169&partnerID=8YFLogxK
U2 - https://doi.org/10.1093/nar/gkac1078
DO - 10.1093/nar/gkac1078
M3 - Article
C2 - 36408900
SN - 0305-1048
VL - 51
SP - D760-D766
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - D1
ER -