aboutsummaryrefslogtreecommitdiff
path: root/academic/diamond/README
diff options
context:
space:
mode:
Diffstat (limited to 'academic/diamond/README')
-rw-r--r--academic/diamond/README33
1 files changed, 33 insertions, 0 deletions
diff --git a/academic/diamond/README b/academic/diamond/README
new file mode 100644
index 0000000000000..a5c66d0988923
--- /dev/null
+++ b/academic/diamond/README
@@ -0,0 +1,33 @@
+DIAMOND is a sequence aligner for protein and translated DNA searches,
+designed for high performance analysis of big sequence data. The key
+features are:
+
+- Pairwise alignment of proteins and translated DNA at 500x-20,000x
+ speed of BLAST.
+- Frameshift alignments for long read analysis.
+- Low resource requirements and suitable for running on standard
+ desktops or laptops.
+- Various output formats, including BLAST pairwise, tabular and XML,
+ as well as taxonomic classification.
+
+To now run an alignment task, we assume to have a protein database file
+in FASTA format named `nr.faa` and a file of DNA reads that we want to
+align named `reads.fna`.
+
+In order to set up a reference database for DIAMOND, the `makedb`
+command needs to be executed with the following command line:
+
+ $ diamond makedb --in nr.faa -d nr
+
+This will create a binary DIAMOND database file with the specified name
+(`nr.dmnd`). The alignment task may then be initiated using the `blastx`
+command like this:
+
+ $ diamond blastx -d nr -q reads.fna -o matches.m8
+
+The output file here is specified with the `–o` option and named
+`matches.m8`. By default, it is generated in BLAST tabular format.
+
+Publication:
+Buchfink B, Xie C, Huson DH, "Fast and sensitive protein alignment using
+DIAMOND", Nature Methods 12, 59-60 (2015). doi:10.1038/nmeth.3176