metacache

MetaCache - Metagenomic Classification System

MetaCache maps genomic sequences (short reads, long reads, contigs, etc.) from metagenomic samples to their most likely taxon of origin. It aims to reduce the memory requirement usually associated with k-mer-based methods while retaining their speed. MetaCache uses locality-sensitive hashing to quickly identify candidate regions within one or multiple reference genomes. A read is then classified based on its similarity to those regions.
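To make the idea of locality-sensitive hashing over reference genomes more concrete, here is a minimal Python sketch. It assumes a minhash-style scheme: each reference window is summarized by its few smallest k-mer hashes, and a read is assigned to the reference whose windows share the most of those hashes. The function names (`sketch`, `build_index`, `classify`) and the parameters `K`, `SKETCH` and `WINDOW` are hypothetical and chosen for illustration only. This is not MetaCache's actual implementation, data layout or default configuration.

```python
import hashlib
from collections import Counter, defaultdict

# Illustrative parameters (assumptions, not MetaCache's defaults).
K = 12                    # k-mer length
SKETCH = 4                # number of smallest k-mer hashes kept per window
WINDOW = 48               # window length in bases
STRIDE = WINDOW - K + 1   # consecutive windows overlap by K - 1 bases


def kmer_hash(kmer):
    """Deterministic 64-bit hash of a k-mer string."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq):
    """Return the SKETCH smallest distinct k-mer hashes of seq."""
    hashes = {kmer_hash(seq[i:i + K]) for i in range(len(seq) - K + 1)}
    return sorted(hashes)[:SKETCH]


def windows(seq):
    """Yield (window index, window sequence) pairs covering seq."""
    starts = range(0, max(len(seq) - K + 1, 1), STRIDE)
    for index, start in enumerate(starts):
        yield index, seq[start:start + WINDOW]


def build_index(references):
    """Map each sketch hash to the (reference name, window index) pairs it occurs in."""
    index = defaultdict(set)
    for name, genome in references.items():
        for window_id, window in windows(genome):
            for feature in sketch(window):
                index[feature].add((name, window_id))
    return index


def classify(read, index):
    """Return the reference name with the most shared sketch hashes, or None."""
    hits = Counter()
    for _, window in windows(read):
        for feature in sketch(window):
            for name, _window_id in index.get(feature, ()):
                hits[name] += 1
    if not hits:
        return None
    return hits.most_common(1)[0][0]
```

Because only a handful of hashes per window are stored instead of every k-mer, the index stays small. That is the memory versus speed trade-off described above, shown here in toy form. A real classifier would also map the matched reference to a taxon and handle ambiguous hits, which this sketch leaves out.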

For an independent comparison to other tools in terms of classification accuracy, see the LEMMI benchmarking site.

The latest version of MetaCache classifies around 60 million reads (of length 100) per minute against all complete bacterial, viral and archaeal genomes from NCBI RefSeq Release 97 when running with 88 threads on a workstation with 2 Intel(R) Xeon(R) Gold 6238 CPUs.

Applications

Mapping Reads From Metagenomic Bacterial Samples

Quick Start: Bacterial mapping with NCBI’s RefSeq
See the LEMMI benchmarking site for an independent comparison to other tools in terms of classification accuracy, speed and memory consumption.
The paper that introduced MetaCache.

AFS-MetaCache: Food Ingredient Detection & Abundance Analysis…

Documentation Overview

All Command Line Options

for mode build: build database(s) from reference genomes
for mode query: query reads against databases
for mode merge: merge results of independent queries
for mode modify: add reference genomes to database or update taxonomy
for mode info: get information about a database

MetaCache Copyright (C) 2016-2021 André Müller & Robin Kobus.
This program comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under certain conditions. See the file ‘LICENSE’ for details.
