proGenomes3: approaching one million accurately and consistently annotated high-quality prokaryotic genomes

Anthony Fullam, Ivica Letunic, Thomas S. B. Schmidt, Quinten R. Ducarmon, Nicolai Karcher, Supriya Khedkar, Michael Kuhn, Martin Larralde, Oleksandr M. Maistrenko, Lukas Malfertheiner, Alessio Milanese, Joao Frederico Matias Rodrigues, Claudia Sanchis-López, Christian Schudoma, Damian Szklarczyk, Shinichi Sunagawa, Georg Zeller, Jaime Huerta-Cepas, Christian von Mering, Peer Bork, Daniel R. Mende

Research output: Contribution to journal › Article › Academic › peer-review

10 Citations (Scopus)

Abstract

The interpretation of genomic, transcriptomic and other microbial 'omics data is highly dependent on the availability of well-annotated genomes. As the number of publicly available microbial genomes continues to increase exponentially, the need for quality control and consistent annotation is becoming critical. We present proGenomes3, a database of 907 388 high-quality genomes containing 4 billion genes that passed stringent criteria and have been consistently annotated using multiple functional and taxonomic databases including mobile genetic elements and biosynthetic gene clusters. proGenomes3 encompasses 41 171 species-level clusters, defined based on universal single copy marker genes, for which pan-genomes and contextual habitat annotations are provided. The database is available at http://progenomes.embl.de/
Original language: English
Pages (from-to): D760-D766
Journal: Nucleic Acids Research
Volume: 51
Issue number: D1
Publication status: Published - 6 Jan 2023
