Databases come in two flavours: reference only, or all genomes. Some also come
with external clusters (noted below) which give compatibility with existing schemes.
For larger databases, you may wish to download over FTP, which gives more reliable transfers and lets you restart interrupted downloads.
To do so, replace http:// with ftp:// in the links below and use your favourite client.
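If you are scripting the download, the scheme swap can be done programmatically. The following is a minimal sketch only: the helper name `http_to_ftp` and the example.org link are placeholders invented for illustration, not part of PopPUNK or a real database location.

```python
def http_to_ftp(url: str) -> str:
    """Return url with a leading http:// swapped for ftp://."""
    prefix = "http://"
    if not url.startswith(prefix):
        # Leave anything else (already ftp://, https://, ...) untouched.
        return url
    return "ftp://" + url[len(prefix):]


if __name__ == "__main__":
    # Placeholder link, not a real database location.
    print(http_to_ftp("http://example.org/dbs/example_refs.tar.gz"))
```

The resulting ftp:// link can then be passed to whichever client you prefer; the sketch deliberately performs no download itself.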
Typically, the reference-only database will be sufficient for the main use case: assigning new samples to PopPUNK clusters, and updating the database with any new clusters found. The reference databases are usually significantly smaller.
For more detailed analyses, you may wish to download the all-genomes database. Running poppunk-visualise, or any subclustering within strains, requires the full database.
In either case, only the reference genomes are actually used for query assignment. This does not change the results, but gives a good speed-up in program runtime.