author: Petar Petrov <slackalaxy@gmail.com> 2022-10-08 18:13:17 +0100
committer: Willy Sudiarto Raharjo <willysr@slackbuilds.org> 2022-10-15 10:47:28 +0700
commit: 1f43c07f3ce63f69512eda9410131fad0d266ea5
tree: 5ed4ffaf5afc8a991dfa015786554799d23483ef /academic
parent: d5622d171bc7a0a532386d1d82232d3c0b384275
academic/muscle5: Added (MUSCLE 5: Next-generation MUSCLE)
Signed-off-by: Willy Sudiarto Raharjo <willysr@slackbuilds.org>
Diffstat (limited to 'academic')
-rw-r--r-- academic/muscle5/README | 28
-rw-r--r-- academic/muscle5/References | 5
-rw-r--r-- academic/muscle5/muscle5.1 | 93
-rw-r--r-- academic/muscle5/muscle5.SlackBuild | 118
-rw-r--r-- academic/muscle5/muscle5.info | 10
-rw-r--r-- academic/muscle5/slack-desc | 19
6 files changed, 273 insertions, 0 deletions
diff --git a/academic/muscle5/README b/academic/muscle5/README
new file mode 100644
index 0000000000000..bdea0f68e6d36
--- /dev/null
+++ b/academic/muscle5/README
@@ -0,0 +1,28 @@
+MUSCLE 5: Next-generation MUSCLE
+
+Muscle v5 is a major re-write of MUSCLE based on new algorithms.
+
+* Highest accuracy, scalable to thousands of sequences:
+Compared to previous versions, Muscle v5 is much more accurate, is often
+faster, and scales to much larger datasets. At the time of writing (late
+2021), Muscle v5 has the highest scores on multiple alignment benchmarks
+including Balibase, Bralibase, Prefab and Balifam. It can align tens of
+thousands of sequences with high accuracy on a low-cost commodity
+computer (say, an 8-core Intel CPU with 32 GB RAM). On large datasets,
+Muscle v5 is 20-30% more accurate than MAFFT and Clustal-Omega.
+
+* Alignment ensembles:
+Muscle v5 can generate ensembles of high-accuracy alternative
+alignments. All replicates have equal average accuracy on benchmark
+tests, including the MSA made with default parameters. By comparing
+results of downstream analysis (trees, structure prediction...) on
+different replicates, you can assess the effects of alignment errors on
+your study.
+
+* Manual:
+https://drive5.com/muscle5/manual/
+
+* Reference (included in the package)
+R.C. Edgar (2021) "MUSCLE v5 enables improved estimates of phylogenetic
+tree confidence by ensemble bootstrapping"
+https://www.biorxiv.org/content/10.1101/2021.06.20.449169v1.full.pdf
diff --git a/academic/muscle5/References b/academic/muscle5/References
new file mode 100644
index 0000000000000..e11f73531f734
--- /dev/null
+++ b/academic/muscle5/References
@@ -0,0 +1,5 @@
+References
+
+R.C. Edgar (2021) "MUSCLE v5 enables improved estimates of phylogenetic
+tree confidence by ensemble bootstrapping"
+https://www.biorxiv.org/content/10.1101/2021.06.20.449169v1.full.pdf
diff --git a/academic/muscle5/muscle5.1 b/academic/muscle5/muscle5.1
new file mode 100644
index 0000000000000..d1c2661ec23d8
--- /dev/null
+++ b/academic/muscle5/muscle5.1
@@ -0,0 +1,93 @@
+.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.48.5.
+.TH MUSCLE "1" "January 2022" "muscle 5.1" "User Commands"
+.SH NAME
+muscle \- Multiple alignment program of protein sequences
+.SH DESCRIPTION
+MUSCLE is a multiple alignment program for protein sequences. MUSCLE
+stands for multiple sequence comparison by log-expectation. In the
+author's tests, MUSCLE achieved the highest scores of all tested
+programs on several alignment accuracy benchmarks, and is also one of
+the fastest programs out there.
+.SH USAGE
+.SS "Align FASTA input, write aligned FASTA (AFA) output:"
+.IP
+muscle \fB\-align\fR input.fa \fB\-output\fR aln.afa
+.PP
+Align large input using Super5 algorithm if \fB\-align\fR is too expensive,
+typically needed with more than a few hundred sequences:
+.IP
+muscle \fB\-super5\fR input.fa \fB\-output\fR aln.afa
+.SS "Single replicate alignment:"
+.IP
+muscle \fB\-align\fR input.fa \fB\-perm\fR PERM \fB\-perturb\fR SEED \fB\-output\fR aln.afa
+muscle \fB\-super5\fR input.fa \fB\-perm\fR PERM \fB\-perturb\fR SEED \fB\-output\fR aln.afa
+.IP
+PERM is guide tree permutation none, abc, acb, bca (default none).
+SEED is perturbation seed 0, 1, 2... (default 0 = don't perturb).
+.PP
+Ensemble of replicate alignments, output in Ensemble FASTA (EFA) format,
+EFA has one aligned FASTA for each replicate with header line "<PERM.SEED":
+.IP
+muscle \fB\-align\fR input.fa \fB\-stratified\fR \fB\-output\fR stratified_ensemble.efa
+muscle \fB\-align\fR input.fa \fB\-diversified\fR \fB\-output\fR diversified_ensemble.efa
+.HP
+\fB\-replicates\fR N
+.IP
+Number of replicates, defaults 4, 100, 100 for stratified,
+.IP
+diversified, resampled. With \fB\-stratified\fR there is one
+replicate per guide tree permutation, total is 4 x N.
+.PP
+Generate resampled ensemble from existing ensemble by sampling columns
+with replacement:
+.IP
+muscle \fB\-resample\fR ensemble.efa \fB\-output\fR resampled.efa
+.HP
+\fB\-maxgapfract\fR F
+.IP
+Maximum fraction of gaps in a column (F=0..1, default 0.5).
+.HP
+\fB\-minconf\fR CC
+.IP
+Minimum column confidence (CC=0..1, default 0.5).
+.PP
+If ensemble output filename has @, then one FASTA file is generated
+for each replicate where @ is replaced by perm.s, otherwise all replicates
+are written to one EFA file.
+.SS "Calculate dispersion of an ensemble:"
+.IP
+muscle \fB\-disperse\fR ensemble.efa
+.SS "Extract replicate with highest total CC (diversified input recommended):"
+.IP
+muscle \fB\-maxcc\fR ensemble.efa \fB\-output\fR maxcc.afa
+.SS "Extract aligned FASTA files from EFA file:"
+.IP
+muscle \fB\-efa_explode\fR ensemble.efa
+.SS "Convert FASTA to EFA, input has one filename per line:"
+.IP
+muscle \fB\-fa2efa\fR filenames.txt \fB\-output\fR ensemble.efa
+.PP
+Update ensemble by adding two sequences of digits to each replicate, digits
+are column confidence (CC) values, e.g. "73" means CC=0.73, "++" is CC=1.0:
+.IP
+muscle \fB\-addconfseqs\fR ensemble.efa \fB\-output\fR ensemble_cc.efa
+.PP
+Calculate letter confidence (LC) values, \fB\-ref\fR specifies the alignment to
+compare against the ensemble (e.g. from \fB\-maxcc\fR), output is in aligned
+FASTA format with LC values 0, 1 ... 9 instead of letters:
+.IP
+muscle \fB\-letterconf\fR ensemble.efa \fB\-ref\fR aln.afa \fB\-output\fR letterconf.afa
+.HP
+\fB\-html\fR aln.html
+.IP
+Alignment colored by LC in HTML format.
+.HP
+\fB\-jalview\fR aln.features
+.IP
+Jalview feature file with LC values and colors.
+.SS "More documentation at:"
+.IP
+https://drive5.com/muscle
+.SH AUTHOR
+ This manpage was written by Andreas Tille for the Debian distribution and
+ can be used for any other usage of the program.
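Note on the man page above: its usage lines can be chained into a small ensemble workflow. The sketch below is an illustration only and is not part of this commit. It assumes the binary is installed as `muscle5` (the name this SlackBuild installs it under); the helper names `run` and `ensemble_workflow`, the `DRY_RUN` and `MUSCLE` variables, and the output file names are hypothetical. Only flags listed in the man page are used.

```shell
#!/bin/bash
# Hypothetical helper: print each command, and run it unless DRY_RUN is set.
run() {
  echo "+ $*"
  if [ -z "${DRY_RUN:-}" ]; then
    "$@"
  fi
}

# Sketch of an ensemble workflow built only from flags in the man page.
# $1 = input FASTA, $2 = prefix for output files (both are assumptions).
ensemble_workflow() {
  local input=$1 prefix=$2
  local muscle=${MUSCLE:-muscle5}   # binary name as installed by this package

  # Diversified ensemble (the man page recommends it as input for -maxcc).
  run "$muscle" -align "$input" -diversified -output "$prefix.efa" || return 1
  # Report the dispersion of the ensemble.
  run "$muscle" -disperse "$prefix.efa" || return 1
  # Extract the replicate with the highest total column confidence.
  run "$muscle" -maxcc "$prefix.efa" -output "$prefix.maxcc.afa" || return 1
}
```

Running `DRY_RUN=1 ensemble_workflow input.fa run1` prints the three commands without executing them, which is a cheap way to check the flags before starting a long alignment.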
diff --git a/academic/muscle5/muscle5.SlackBuild b/academic/muscle5/muscle5.SlackBuild
new file mode 100644
index 0000000000000..541a2182a3f3b
--- /dev/null
+++ b/academic/muscle5/muscle5.SlackBuild
@@ -0,0 +1,118 @@
+#!/bin/bash
+
+# Slackware build script for muscle5
+
+# Copyright 2022 Petar Petrov slackalaxy@gmail.com
+# All rights reserved.
+#
+# Redistribution and use of this script, with or without modification, is
+# permitted provided that the following conditions are met:
+#
+# 1. Redistributions of this script must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+#
+# THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS OR IMPLIED
+# WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+# MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
+# EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+# OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+# WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+# OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+# ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+cd $(dirname $0) ; CWD=$(pwd)
+
+PRGNAM=muscle5
+VERSION=${VERSION:-5.1}
+BUILD=${BUILD:-1}
+TAG=${TAG:-_SBo}
+PKGTYPE=${PKGTYPE:-tgz}
+
+SRCNAM=muscle
+
+if [ -z "$ARCH" ]; then
+  case "$( uname -m )" in
+    i?86) ARCH=i586 ;;
+    arm*) ARCH=arm ;;
+       *) ARCH=$( uname -m ) ;;
+  esac
+fi
+
+# If the variable PRINT_PACKAGE_NAME is set, then this script will report what
+# the name of the created package would be, and then exit. This information
+# could be useful to other scripts.
+if [ ! -z "${PRINT_PACKAGE_NAME}" ]; then
+  echo "$PRGNAM-$VERSION-$ARCH-$BUILD$TAG.$PKGTYPE"
+  exit 0
+fi
+
+TMP=${TMP:-/tmp/SBo}
+PKG=$TMP/package-$PRGNAM
+OUTPUT=${OUTPUT:-/tmp}
+
+if [ "$ARCH" = "i586" ]; then
+  SLKCFLAGS="-O2 -march=i586 -mtune=i686"
+  LIBDIRSUFFIX=""
+elif [ "$ARCH" = "i686" ]; then
+  SLKCFLAGS="-O2 -march=i686 -mtune=i686"
+  LIBDIRSUFFIX=""
+elif [ "$ARCH" = "x86_64" ]; then
+  SLKCFLAGS="-O2 -fPIC"
+  LIBDIRSUFFIX="64"
+else
+  SLKCFLAGS="-O2"
+  LIBDIRSUFFIX=""
+fi
+
+set -e
+
+rm -rf $PKG
+mkdir -p $TMP $PKG $OUTPUT
+cd $TMP
+rm -rf $SRCNAM-$VERSION
+tar xvf $CWD/$SRCNAM-$VERSION.tar.gz
+cd $SRCNAM-$VERSION
+
+chown -R root:root .
+find -L . \
+ \( -perm 777 -o -perm 775 -o -perm 750 -o -perm 711 -o -perm 555 \
+ -o -perm 511 \) -exec chmod 755 {} \; -o \
+ \( -perm 666 -o -perm 664 -o -perm 640 -o -perm 600 -o -perm 444 \
+ -o -perm 440 -o -perm 400 \) -exec chmod 644 {} \;
+
+cd src
+
+# do not create static executable
+sed -i "s:LDFLAGS += -static:#LDFLAGS += -static:" Makefile
+make CFLAGS="$SLKCFLAGS" \
+CXXFLAGS="$SLKCFLAGS"
+
+install -D -m755 Linux/$SRCNAM $PKG/usr/bin/$PRGNAM
+cd ..
+
+# Thanks to Debian for the man page
+mkdir -p $PKG/usr/man/man1
+cp $CWD/$PRGNAM.1 $PKG/usr/man/man1/$PRGNAM.1
+
+# The Makefile strips the binary...
+#find $PKG -print0 | xargs -0 file | grep -e "executable" -e "shared object" | grep ELF \
+# | cut -f 1 -d : | xargs strip --strip-unneeded 2> /dev/null || true
+
+find $PKG/usr/man -type f -exec gzip -9 {} \;
+for i in $( find $PKG/usr/man -type l ) ; do ln -s $( readlink $i ).gz $i.gz ; rm $i ; done
+
+mkdir -p $PKG/usr/doc/$PRGNAM-$VERSION
+cp -a \
+ CONTRIBUTING.md LICENSE README.md \
+ $PKG/usr/doc/$PRGNAM-$VERSION
+
+cat $CWD/$PRGNAM.SlackBuild > $PKG/usr/doc/$PRGNAM-$VERSION/$PRGNAM.SlackBuild
+cat $CWD/References > $PKG/usr/doc/$PRGNAM-$VERSION/References
+
+mkdir -p $PKG/install
+cat $CWD/slack-desc > $PKG/install/slack-desc
+
+cd $PKG
+/sbin/makepkg -l y -c n $OUTPUT/$PRGNAM-$VERSION-$ARCH-$BUILD$TAG.$PKGTYPE
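Note on the SlackBuild above: when `ARCH` is not preset, the script maps the `uname -m` string to a package architecture, and with `PRINT_PACKAGE_NAME` set it prints the resulting package file name and exits. The sketch below mirrors that logic as two standalone functions so the naming can be checked without running the build. The function names `sbo_arch` and `sbo_package_name` are hypothetical; the defaults are copied from the script.

```shell
#!/bin/bash
# Sketch mirroring the ARCH detection in muscle5.SlackBuild.
# $1 = machine string as printed by `uname -m`.
sbo_arch() {
  case "$1" in
    i?86) echo i586 ;;
    arm*) echo arm ;;
    *)    echo "$1" ;;
  esac
}

# Sketch mirroring the PRINT_PACKAGE_NAME output, using the script's defaults.
# $1 = machine string; VERSION, BUILD, TAG and PKGTYPE may be overridden
# through the environment, as in the SlackBuild.
sbo_package_name() {
  local arch
  arch=$(sbo_arch "$1")
  echo "muscle5-${VERSION:-5.1}-$arch-${BUILD:-1}${TAG:-_SBo}.${PKGTYPE:-tgz}"
}
```

For example, `sbo_package_name "$(uname -m)"` reports the file name that the build would write to `$OUTPUT` with default settings.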
diff --git a/academic/muscle5/muscle5.info b/academic/muscle5/muscle5.info
new file mode 100644
index 0000000000000..1749642b988a9
--- /dev/null
+++ b/academic/muscle5/muscle5.info
@@ -0,0 +1,10 @@
+PRGNAM="muscle5"
+VERSION="5.1"
+HOMEPAGE="https://github.com/rcedgar/muscle"
+DOWNLOAD="https://github.com/rcedgar/muscle/archive/v5.1/muscle-5.1.tar.gz"
+MD5SUM="99b5ef38a119994e7a8f0ea7a12b5987"
+DOWNLOAD_x86_64=""
+MD5SUM_x86_64=""
+REQUIRES=""
+MAINTAINER="Petar Petrov"
+EMAIL="slackalaxy@gmail.com"
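Note on the .info file above: the `MD5SUM` field is meant to be compared against the downloaded source archive before building. A minimal sketch of that check follows. The function name `verify_source` is hypothetical, and it assumes `md5sum` from coreutils is available; it is not part of this commit or of any SlackBuilds.org tooling.

```shell
#!/bin/bash
# Hypothetical helper: check a downloaded source archive against the
# MD5SUM recorded in a SlackBuilds-style .info file.
# $1 = path to the .info file, $2 = path to the downloaded archive.
# Returns 0 on match, 1 on mismatch, 2 if the .info file cannot be read.
verify_source() {
  local info=$1 archive=$2 expected actual
  # Make sure `.` reads the given file instead of searching PATH.
  case "$info" in
    */*) ;;
    *) info=./$info ;;
  esac
  # Source the file in a subshell so its variables do not leak out.
  expected=$( . "$info" && echo "$MD5SUM" ) || return 2
  actual=$(md5sum "$archive" 2>/dev/null | cut -d ' ' -f 1)
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

A typical call would be `verify_source muscle5.info muscle-5.1.tar.gz && sh muscle5.SlackBuild`, so that the build only starts when the checksum matches.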
diff --git a/academic/muscle5/slack-desc b/academic/muscle5/slack-desc
new file mode 100644
index 0000000000000..bc8ca327050a3
--- /dev/null
+++ b/academic/muscle5/slack-desc
@@ -0,0 +1,19 @@
+# HOW TO EDIT THIS FILE:
+# The "handy ruler" below makes it easier to edit a package description.
+# Line up the first '|' above the ':' following the base package name, and
+# the '|' on the right side marks the last column you can put a character in.
+# You must make exactly 11 lines for the formatting to be correct. It's also
+# customary to leave one space after the ':' except on otherwise blank lines.
+
+       |-----handy-ruler------------------------------------------------------|
+muscle5: muscle5 (MUSCLE 5: Next-generation MUSCLE)
+muscle5:
+muscle5: Muscle v5 is a major re-write of MUSCLE based on new algorithms.
+muscle5: Compared to previous versions, Muscle v5 is much more accurate,
+muscle5: often faster, and scales to much larger datasets.
+muscle5:
+muscle5: https://drive5.com/muscle5/
+muscle5: https://drive5.com/muscle5/manual/
+muscle5:
+muscle5:
+muscle5:
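Note on the slack-desc above: the file's own comments state two rules, namely exactly 11 description lines, and no character past the right-hand `|` of the handy ruler. The sketch below checks both on an installed or checked-out slack-desc (without the leading `+` of the diff). The function name `check_slack_desc` is hypothetical, and the width limit is taken from the ruler line in the file itself, not from a hard-coded number.

```shell
#!/bin/bash
# Hypothetical helper: check a slack-desc file against the two rules
# stated in its header comments.
# $1 = path to slack-desc, $2 = package name (the prefix before ':').
# Returns 0 if both rules hold, 1 if one is violated, 2 if no ruler is found.
check_slack_desc() {
  local file=$1 prgnam=$2 ruler max count too_long
  # The ruler line ends with the '|' that marks the last usable column.
  ruler=$(grep -m 1 -e '|-*handy-ruler-*|$' "$file") || return 2
  max=${#ruler}
  # Count description lines; grep -c prints 0 but exits non-zero on no match.
  count=$(grep -c -e "^$prgnam:" "$file" || true)
  if [ "$count" -ne 11 ]; then
    echo "expected 11 description lines, found $count"
    return 1
  fi
  # Count description lines that extend past the ruler.
  too_long=$(grep -e "^$prgnam:" "$file" | awk -v max="$max" 'length($0) > max' | wc -l)
  if [ "$too_long" -ne 0 ]; then
    echo "$too_long line(s) run past the ruler"
    return 1
  fi
}
```

For this package the call would be `check_slack_desc slack-desc muscle5`, run from the directory that holds the file.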