.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.48.5.
.TH MUSCLE "1" "January 2022" "muscle 5.1" "User Commands"
.SH NAME
muscle \- multiple alignment program for protein sequences
.SH DESCRIPTION
MUSCLE is a multiple alignment program for protein sequences. MUSCLE
stands for multiple sequence comparison by log-expectation. In the
author's tests, MUSCLE achieved the highest scores of all tested
programs on several alignment accuracy benchmarks, and is also one of
the fastest programs available.
.SH USAGE
.SS "Align FASTA input, write aligned FASTA (AFA) output:"
.IP
muscle \fB\-align\fR input.fa \fB\-output\fR aln.afa
.PP
Align large input using the Super5 algorithm if \fB\-align\fR is too expensive;
this is typically needed with more than a few hundred sequences:
.IP
muscle \fB\-super5\fR input.fa \fB\-output\fR aln.afa
.SS "Single replicate alignment:"
.IP
muscle \fB\-align\fR input.fa \fB\-perm\fR PERM \fB\-perturb\fR SEED \fB\-output\fR aln.afa
.br
muscle \fB\-super5\fR input.fa \fB\-perm\fR PERM \fB\-perturb\fR SEED \fB\-output\fR aln.afa
.IP
PERM is the guide tree permutation: none, abc, acb or bca (default none).
.br
SEED is the perturbation seed: 0, 1, 2, ... (default 0, meaning no perturbation).
.PP
Generate an ensemble of replicate alignments, written in Ensemble FASTA (EFA)
format. EFA has one aligned FASTA for each replicate, introduced by a header
line "<PERM.SEED":
.IP
muscle \fB\-align\fR input.fa \fB\-stratified\fR \fB\-output\fR stratified_ensemble.efa
.br
muscle \fB\-align\fR input.fa \fB\-diversified\fR \fB\-output\fR diversified_ensemble.efa
.HP
\fB\-replicates\fR N
.IP
Number of replicates; defaults are 4, 100 and 100 for stratified,
diversified and resampled ensembles respectively. With \fB\-stratified\fR,
N replicates are generated for each of the four guide tree permutations,
so the total is 4 x N.
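.PP
The EFA layout described above (one aligned FASTA per replicate, each
introduced by a "<PERM.SEED" header line) can be split into its replicates
with a short script. The following Python sketch is illustrative only and is
not part of MUSCLE: the function name split_efa and the sample data are
hypothetical, and it assumes that every line starting with "<" opens a new
replicate and that the lines after it are ordinary aligned FASTA.
.PP
.nf
```python
def split_efa(text):
    """Split EFA text into {replicate_name: {label: aligned_sequence}}."""
    ensemble = {}
    records = None  # sequences of the replicate currently being read
    label = None    # FASTA label currently being read
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("<"):
            # Assumed: a "<" header such as "<abc.1" opens a new replicate.
            records = ensemble.setdefault(line[1:], {})
            label = None
        elif line.startswith(">"):
            if records is None:
                raise ValueError("FASTA record found before any replicate header")
            label = line[1:]
            records[label] = ""
        else:
            if records is None or label is None:
                raise ValueError("sequence data found before a header")
            records[label] += line
    return ensemble


# Made-up two-replicate ensemble, for illustration only.
example = """<none.0
>seq1
MK-V
>seq2
MKAV
<abc.1
>seq1
MKV-
>seq2
MKAV
"""

ensemble = split_efa(example)
for name, records in ensemble.items():
    print(name, records)
```
.fi
.PP
The result maps each replicate name, such as none.0, to its aligned
sequences, which is convenient when replicates are to be compared in a script.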
.PP
Generate resampled ensemble from existing ensemble by sampling columns
with replacement:
.IP
muscle \fB\-resample\fR ensemble.efa \fB\-output\fR resampled.efa
.HP
\fB\-maxgapfract\fR F
.IP
Maximum fraction of gaps in a column (F=0..1, default 0.5).
.HP
\fB\-minconf\fR CC
.IP
Minimum column confidence (CC=0..1, default 0.5).
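.PP
Sampling columns with replacement, as mentioned above, is the same idea as a
bootstrap over alignment columns. The Python sketch below shows that idea on a
single made-up alignment. It is not MUSCLE's implementation: the function name
resample_columns and its parameters are hypothetical, and the
\fB\-maxgapfract\fR and \fB\-minconf\fR filters are deliberately left out.
.PP
.nf
```python
import random


def resample_columns(alignment, n_columns=None, seed=None):
    """Build a new alignment by drawing columns with replacement."""
    if not alignment:
        raise ValueError("alignment is empty")
    rows = list(alignment.values())
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all aligned sequences must have the same length")
    if n_columns is None:
        n_columns = width  # keep the original number of columns
    rng = random.Random(seed)
    # The same column index may be drawn more than once.
    picks = [rng.randrange(width) for _ in range(n_columns)]
    return {
        label: "".join(seq[i] for i in picks)
        for label, seq in alignment.items()
    }


# Made-up alignment, for illustration only.
aln = {"seq1": "MK-VL", "seq2": "MKAVL", "seq3": "M-AVL"}
resampled = resample_columns(aln, seed=1)
for label, seq in resampled.items():
    print(label, seq)
```
.fi
.PP
Every column of the result is a column of the input, so rows stay aligned
with each other even though column order and multiplicity change.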
.PP
If the ensemble output filename contains @, one FASTA file is generated
for each replicate, with @ replaced by perm.s; otherwise all replicates
are written to one EFA file.
.SS "Calculate dispersion of an ensemble:"
.IP
muscle \fB\-disperse\fR ensemble.efa
.SS "Extract replicate with highest total CC (diversified input recommended):"
.IP
muscle \fB\-maxcc\fR ensemble.efa \fB\-output\fR maxcc.afa
.SS "Extract aligned FASTA files from EFA file:"
.IP
muscle \fB\-efa_explode\fR ensemble.efa
.SS "Convert FASTA to EFA, input has one filename per line:"
.IP
muscle \fB\-fa2efa\fR filenames.txt \fB\-output\fR ensemble.efa
.PP
Update ensemble by adding two sequences of digits to each replicate. The digits
are column confidence (CC) values, e.g. "73" means CC=0.73, "++" is CC=1.0:
.IP
muscle \fB\-addconfseqs\fR ensemble.efa \fB\-output\fR ensemble_cc.efa
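.PP
The digit encoding described above can be turned back into numbers with a
few lines of code. The Python sketch below is illustrative only and makes one
assumption that this page does not state: that the two added sequences hold
the first and the second character of each column's value, column by column.
The function name decode_column_confidence is hypothetical.
.PP
.nf
```python
DIGITS = set("0123456789")


def decode_column_confidence(first_digits, second_digits):
    """Decode two confidence sequences into one CC value per column."""
    if len(first_digits) != len(second_digits):
        raise ValueError("confidence sequences differ in length")
    values = []
    for a, b in zip(first_digits, second_digits):
        if a == "+" and b == "+":
            values.append(1.0)  # "++" stands for CC=1.0
        elif a in DIGITS and b in DIGITS:
            values.append(int(a + b) / 100)  # "73" stands for CC=0.73
        else:
            raise ValueError("unexpected confidence pair: " + a + b)
    return values


# Made-up three-column example: "73", "++" and "05".
cc_values = decode_column_confidence("7+0", "3+5")
print(cc_values)
```
.fi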
.PP
Calculate letter confidence (LC) values. \fB\-ref\fR specifies the alignment to
compare against the ensemble (e.g. from \fB\-maxcc\fR). Output is in aligned
FASTA format with LC values 0, 1 ... 9 instead of letters:
.IP
muscle \fB\-letterconf\fR ensemble.efa \fB\-ref\fR aln.afa \fB\-output\fR letterconf.afa
.HP
\fB\-html\fR aln.html
.IP
Alignment colored by LC in HTML format.
.HP
\fB\-jalview\fR aln.features
.IP
Jalview feature file with LC values and colors.
.SS "More documentation at:"
.IP
https://drive5.com/muscle
.SH AUTHOR
This manpage was written by Andreas Tille for the Debian distribution and
may be used for any other distribution of the program.