author | Petar Petrov <slackalaxy@gmail.com> | 2017-10-08 15:06:27 +0100 |
---|---|---|
committer | Willy Sudiarto Raharjo <willysr@slackbuilds.org> | 2017-10-11 06:19:52 +0700 |
commit | 54bdcd5c7316c1cd1cf86aef60bf34183469cb4a (patch) | |
tree | 20c446e56bdece7702239962c8f9c6bb19314d69 /academic/meme-suite | |
parent | de99c62a1757d76ae87d3c127bcba5ea71eed030 (diff) |
academic/meme-suite: Added (Motif based sequence analysis tools).
Signed-off-by: David Spencer <idlemoor@slackbuilds.org>
Diffstat (limited to 'academic/meme-suite')
-rw-r--r-- | academic/meme-suite/README | 35 |
-rw-r--r-- | academic/meme-suite/References | 78 |
-rw-r--r-- | academic/meme-suite/meme-suite.SlackBuild | 123 |
-rw-r--r-- | academic/meme-suite/meme-suite.info | 10 |
-rw-r--r-- | academic/meme-suite/slack-desc | 19 |
5 files changed, 265 insertions, 0 deletions
diff --git a/academic/meme-suite/README b/academic/meme-suite/README
new file mode 100644
index 000000000000..f516c7e37141
--- /dev/null
+++ b/academic/meme-suite/README
@@ -0,0 +1,35 @@
+The MEME suite: motif based sequence analysis tools
+
+The MEME suite provides tools for discovering and using protein and
+DNA sequence motifs. A motif is a pattern of nucleotides or amino acids
+that appears repeatedly in a group of related DNA or protein sequences.
+
+The MEME suite represents motifs as position-dependent scoring matrices.
+It consists of programs which allow you to:
+
+- meme - discovery of motifs shared by a group of sequences
+- mast - search of databases for sequences containing these motifs
+- tomtom - searching databases of motifs for similar motifs
+- gomo - finding Gene Ontology terms linked to the motifs
+- glam2 - discovery of gapped motifs
+- glam2scan - scanning sequences with gapped motifs
+- fimo - scanning sequences with motifs
+- mcast - finding motif clusters
+- meme-chip - analysis of large DNA datasets like ChIPseq output
+- spamo - finding motif complexes by analysing motif spacing
+- dreme - discovery of short regular expression motifs
+
+Note: building on a 32bit architecture fails at the 'make test' step
+(check script). If the step is disabled, the suite builds, however it
+may or may NOT work properly. Therefore, 32bit is set as 'UNTESTED'.
+The 'make test' step will also fail if you don't build in a proper root
+environment.
+
+To cite the full MEME suite:
+Timothy L. Bailey, Mikael Bodén, Fabian A. Buske, Martin Frith,
+Charles E. Grant, Luca Clementi, Jingyuan Ren, Wilfred W. Li,
+William S. Noble, "MEME SUITE: tools for motif discovery and searching",
+Nucleic Acids Research, 37:W202-W208, 2009.
+
+To cite individual tools, please check the citation page:
+http://meme-suite.org/doc/cite.html
diff --git a/academic/meme-suite/References b/academic/meme-suite/References
new file mode 100644
index 000000000000..242a70d914e3
--- /dev/null
+++ b/academic/meme-suite/References
@@ -0,0 +1,78 @@
+Authors
+
+The MEME Suite was developed by
+
+Timothy Bailey at the Institute for Molecular Bioscience at the University of Queensland,
+William Stafford Noble in the Department of Genome Sciences at the University of Washington and
+with input from
+
+Charles Elkan and
+Michael Gribskov.
+Development of portions of the MEME Suite have previously been supported by
+
+Columbia University,
+the Computational Biology Research Center at the National Institute of Advanced Industrial Science and Technology, Japan,
+the National Biomedical Computation Resource, and
+the San Diego Supercomputer Center.
+
+Citing MEME Suite Programs
+
+To cite the full MEME Suite
+Timothy L. Bailey, Mikael Bodén, Fabian A. Buske, Martin Frith, Charles E. Grant, Luca Clementi, Jingyuan Ren, Wilfred W. Li, William S. Noble, "MEME SUITE: tools for motif discovery and searching", Nucleic Acids Research, 37:W202-W208, 2009. [full text]
+To cite individual tools
+AMA
+Fabian A. Buske, Mikael Bodén, Denis C. Bauer and Timothy L. Bailey, "Assigning roles to DNA regulatory motifs using comparative genomics", Bioinformatics, 26(7):860-866, 2010. [full text]
+AME
+Robert C. McLeay, Timothy L. Bailey, "Motif Enrichment Analysis: a unified framework and an evaluation on ChIP data", BMC Bioinformatics, 11:165, 2010. [full text]
+CentriMo
+Timothy L. Bailey and Philip Machanick, "Inferring direct DNA binding from ChIP-seq", Nucleic Acids Research, 40:e128, 2012. [Abstract and Full Text]
+DREME
+Timothy L. Bailey, "DREME: Motif discovery in transcription factor ChIP-seq data", Bioinformatics, 27(12):1653-1659, 2011. [full text]
+FIMO
+Charles E. Grant, Timothy L. Bailey, and William Stafford Noble, "FIMO: Scanning for occurrences of a given motif", Bioinformatics 27(7):1017–1018, 2011. [full text]
+GLAM2 and GLAM2SCAN
+Martin C. Frith, Neil F. W. Saunders, Bostjan Kobe, Timothy L. Bailey, "Discovering sequence motifs with arbitrary insertions and deletions", PLoS Computational Biology, 4(5):e1000071, 2008. [full text]
+GOMO
+Fabian A. Buske, Mikael Bodén, Denis C. Bauer and Timothy L. Bailey, "Assigning roles to DNA regulatory motifs using comparative genomics", Bioinformatics, 26(7), 860-866, 2010. [full text]
+MAST
+Timothy L. Bailey and Michael Gribskov, "Combining evidence using p-values: application to sequence homology searches", Bioinformatics, 14(1):48-54, 1998. [pdf]
+MEME
+Timothy L. Bailey and Charles Elkan, "Fitting a mixture model by expectation maximization to discover motifs in biopolymers", Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, pp. 28-36, AAAI Press, Menlo Park, California, 1994. [postscript] [pdf]
+MEME-ChIP
+Philip Machanick and Timothy L. Bailey, "MEME-ChIP: motif analysis of large DNA datasets", Bioinformatics 27(12):1696-1697, 2011. [full text]
+PSPs
+Timothy L. Bailey, Mikael Bodén, Tom Whitington, and Philip Machanick, "The value of position-specific priors in motif discovery using MEME", BMC Bioinformatics, 11(1):179, 2010. [full text]
+MCAST
+Timothy Bailey and William Stafford Noble, "Searching for statistically significant regulatory modules", Bioinformatics (Proceedings of the European Conference on Computational Biology), 19(Suppl. 2):ii16-ii25, 2003. [full text]
+SpaMo
+Tom Whitington, Martin C. Frith, James Johnson, and Timothy L. Bailey "Inferring transcription factor complexes from ChIP-seq data", Nucl. Acids Res. 39(15):e98, 2011. [full text]
+Tomtom
+Shobhit Gupta, JA Stamatoyannopolous, Timothy Bailey and William Stafford Noble, "Quantifying similarity between motifs", Genome Biology, 8(2):R24, 2007. [full text]
+Related papers
+GOMO related
+Mikael Bodén and Timothy L. Bailey, "Associating transcription factor binding site motifs with target Go terms and target genes", Nucl. Acids Res, 36, 4108-4117, 2008. [full text]
+MAST related
+Timothy L. Bailey and Michael Gribskov, "Score distributions for simultaneous matching to multiple motifs" Journal of Computational Biology, Vol. 4, pp. 45-59, 1997. [pdf]
+Timothy L. Bailey and Michael Gribskov, "Methods and statistics for combining motif match scores" Journal of Computational Biology, Vol. 5, pp. 211-221, 1998. [pdf]
+MEME related
+Timothy L. Bailey and Charles Elkan, "Fitting a mixture model by expectation maximization to discover motifs in biopolymers", Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, pp. 28-36, AAAI Press, Menlo Park, California, 1994. [postscript] [pdf]
+Timothy L. Bailey and Charles Elkan, "Fitting a mixture model by expectation maximization to discover motifs in biopolymers", UCSD Technical Report, CS94-351, March, 1994, University of California at San Diego. [pdf]
+Timothy L. Bailey, "Discovering motifs in DNA and protein sequences: The approximate common substring problem", Ph.D. dissertation, University of California at San Diego, 1995. [pdf]
+Timothy L. Bailey and Charles Elkan, "Unsupervised Learning of Multiple Motifs in Biopolymers using EM" Machine Learning, 21(1-2):51-80, October, 1995. [pdf]
+Timothy L. Bailey and Charles Elkan, "The value of prior knowledge in discovering motifs with MEME", Proceedings of the Third International Conference on Intelligent Systems for Molecular Biology, pp. 21-29, AAAI Press, Menlo Park, California, 1995. [pdf]
+Timothy L. Bailey and Michael Gribskov, "The megaprior heuristic for discovering protein sequence patterns"", Proceedings of the Fourth International Conference on Intelligent Systems for Molecular Biology, pp. 15-24 AAAI Press, Menlo Park, California, 1996. [pdf]
+William N. Grundy, Timothy L. Bailey and Charles P. Elkan, "ParaMEME: A Parallel Implementation and a Web Interface for a DNA and Protein Motif Discovery Tool", Computer Applications in the Biological Sciences (CABIOS), Vol. 12(4), pp. 303-310, 1996. [pdf]
+Timothy L. Bailey, Michael E. Baker and Charles P. Elkan, "An artificial intelligence approach to motif discovery in protein sequences: application to steriod dehydrogenases", Journal of steroid biochemistry and molecular biology, Vol. 62, 1997.
+Martin Tompa, Nan Li, Timothy L. Bailey, George M. Church, Bart De Moor, Eleazar Eskin, Alexander V. Favorov, Martin C. Frith, Yutao Fu, W. James Kent, Vsevolod J. Makeev, Andrei A. Mironov,, William S. Noble, Giulio Pavesi, Graziano Pesole, Mireille Regnier, Nicolas Simonis, Saurabh Sinha, Gert Thijs, Jacques van Helden, Mathias Vandenbogaert, Zhiping Weng, Christopher Workman, Chun Ye and Zhou Zhu. "Assessing computational tools for the discovery of transcription factor binding sites." Nature Biotechnology, 23(1), 137-144, 2005. [full text]
+Timothy L. Bailey, Nadya Williams, Chris Misleh, and Wilfred W. Li, "MEME: discovering and analyzing DNA and protein sequence motifs" Nucleic Acids Research, Vol. 34, pp. W369-W373, 2006. [html] [pdf]
+Wenxiu Ma, William Stafford Noble and Timothy L. Bailey, "Motif-based analysis of large nucleotide datasets using MEME-ChIP" Nature Protocols, 9(6):1428-1450, 2014. [html]
+Tomtom related
+Emi Tanaka, Timothy L. Bailey, Charles E. Grant, William S. Noble, and Uri Keich, "Improved similarity scores for comparing motifs" Bioinformatics 27(12): 1603-1609, 2011. [full text]
+Other
+William N. Grundy, Timothy L. Bailey, Charles P. Elkan and Michael E. Baker. "Hidden Markov Model Analysis of Motifs in Steroid Dehydrogenases and their Homologs" Biochemical and Biophysical Research Communications, Vol 231, pp. 760-766, 1997. [pdf]
+William N. Grundy, Timothy L. Bailey, Charles P. Elkan and Michael E. Baker. "Meta-MEME: Motif-based Hidden Markov Models of Protein Families" Computer Applications in the Biological Sciences (CABIOS), Vol. 13(4), pp. 397-406, 1997. [pdf]
+Michael E. Baker, William N. Grundy, and Charles P. Elkan, "Spinach CSP41, an mRNA-binding protein and ribonuclease, is homologous to nucleotide-sugar epimerases and hydroxysteroid dehydrogenases", Biochemical and Biophysical Research Communications 248(2), 250-254, 1998. [pdf]
+Michael E. Baker, William N. Grundy, and Charles P. Elkan, "A common ancestor for a subunit in the mitochondrial proton-translocating NADH:ubiquinone oxidoreductase (complex I) and short-chain dehydrogenases/reductases", Cellular and Molecular Life Sciences, Vol. 55(3), 450-455, 1999.
+John Hawkins, Charles Grant, William S. Noble, and Timothy L. Bailey, "Assessing phylogenetic motif models for predicting transcription factor binding sites" Bioinformatics (Proceedings of the Intelligent Systems for Molecular Biology Conference), 25(12), i339--347, 2009. [full text]
+Gabriel Cuellar-Partida, Fabian A. Buske, Robert C. McLeay, Tom Whitington, William Stafford Noble, and Timothy L. Bailey, "Epigenetic priors for identifying active transcription factor binding sites", Bioinformatics 28(1): 56-62, 2012 [pdf]
+Timothy L. Bailey, James Johnson, Charles E. Grant and William Stafford Noble, "The MEME Suite." Nucleic Acids Resesearch, 43(W1):W39-49, 2015 [full text]
diff --git a/academic/meme-suite/meme-suite.SlackBuild b/academic/meme-suite/meme-suite.SlackBuild
new file mode 100644
index 000000000000..a21bb00ea193
--- /dev/null
+++ b/academic/meme-suite/meme-suite.SlackBuild
@@ -0,0 +1,123 @@
+#!/bin/sh
+
+# Slackware build script for meme-suite
+
+# Copyright 2017 Petar Petrov slackalaxy@gmail.com
+# All rights reserved.
+#
+# Redistribution and use of this script, with or without modification, is
+# permitted provided that the following conditions are met:
+#
+# 1. Redistributions of this script must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+#
+# THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS OR IMPLIED
+# WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+# MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
+# EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+# OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+# WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+# OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+# ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+PRGNAM=meme-suite
+SRCNAM=meme
+VERSION=${VERSION:-4.12.0}
+BUILD=${BUILD:-1}
+TAG=${TAG:-_SBo}
+
+if [ -z "$ARCH" ]; then
+  case "$( uname -m )" in
+    i?86) ARCH=i586 ;;
+    arm*) ARCH=arm ;;
+       *) ARCH=$( uname -m ) ;;
+  esac
+fi
+
+CWD=$(pwd)
+TMP=${TMP:-/tmp/SBo}
+PKG=$TMP/package-$PRGNAM
+OUTPUT=${OUTPUT:-/tmp}
+
+if [ "$ARCH" = "i586" ]; then
+  SLKCFLAGS="-O2 -march=i586 -mtune=i686"
+  LIBDIRSUFFIX=""
+elif [ "$ARCH" = "i686" ]; then
+  SLKCFLAGS="-O2 -march=i686 -mtune=i686"
+  LIBDIRSUFFIX=""
+elif [ "$ARCH" = "x86_64" ]; then
+  SLKCFLAGS="-O2 -fPIC"
+  LIBDIRSUFFIX="64"
+else
+  SLKCFLAGS="-O2"
+  LIBDIRSUFFIX=""
+fi
+
+set -e
+
+rm -rf $PKG
+mkdir -p $TMP $PKG $OUTPUT
+cd $TMP
+rm -rf ${SRCNAM}_${VERSION}
+tar xvf $CWD/${SRCNAM}_${VERSION}.tar.gz
+cd ${SRCNAM}_${VERSION}
+chown -R root:root .
+find -L . \
+ \( -perm 777 -o -perm 775 -o -perm 750 -o -perm 711 -o -perm 555 \
+  -o -perm 511 \) -exec chmod 755 {} \; -o \
+ \( -perm 666 -o -perm 664 -o -perm 640 -o -perm 600 -o -perm 444 \
+  -o -perm 440 -o -perm 400 \) -exec chmod 644 {} \;
+
+# Fix a few paths
+sed -i "s:share/doc:doc/$PRGNAM-$VERSION/doc/:g" Makefile.in
+sed -i "s:share/doc:doc/$PRGNAM-$VERSION/doc/:g" doc/Makefile.*
+sed -i "s:share/doc:doc/$PRGNAM-$VERSION/doc/:g" doc/css/Makefile.*
+sed -i "s:share/doc:doc/$PRGNAM-$VERSION/doc/:g" doc/images/Makefile.*
+sed -i "s:share/doc:doc/$PRGNAM-$VERSION/doc/:g" doc/js/Makefile.*
+sed -i "s:share/doc:doc/$PRGNAM-$VERSION/doc/:g" doc/examples/Makefile.*
+sed -i "s:share/doc:doc/$PRGNAM-$VERSION/doc/:g" doc/examples/compute_prior_dist_example_output_files/Makefile.*
+sed -i "s:share/doc:doc/$PRGNAM-$VERSION/doc/:g" doc/examples/sample_opal_scripts/Makefile.*
+sed -i "s:\\$(libdir)/perl:&/vendor_perl:g" scripts/Makefile.*
+sed -i "s:\\$(libdir)/python:$(libdir)/python2.7/site-packages:g" scripts/Makefile.*
+
+./configure \
+  --prefix=/usr \
+  --libdir=/usr/lib${LIBDIRSUFFIX} \
+  --docdir=/usr/doc/$PRGNAM-$VERSION \
+  --with-url="http://meme-suite.org" \
+  --datarootdir=/usr/share/$PRGNAM/data \
+  --sysconfdir=/usr/share/$PRGNAM/etc \
+  --localstatedir=/var \
+  --with-db=/var/lib/$PRGNAM \
+  --with-logs=/var/log/$PRGNAM \
+  --with-temp=/tmp
+
+# CFLAGS should be specified here, otherwise they are not accepted
+make CFLAGS="$SLKCFLAGS -std=gnu89"
+
+# The tests are recommended, but take quite some time. Be patient or comment out the line below.
+# Also, some tests fail on a 32bit system, therefore it is listed as unsupported. If you find a
+# fix, let me know.
+make test
+make install DESTDIR=$PKG
+
+find $PKG -print0 | xargs -0 file | grep -e "executable" -e "shared object" | grep ELF \
+  | cut -f 1 -d : | xargs strip --strip-unneeded 2> /dev/null || true
+
+# The databases directory should contain folders for motif, gomo and fasta databases
+mkdir -p $PKG/var/lib/$PRGNAM/{motif_databases,gomo_databases,fasta_databases}
+
+mkdir -p $PKG/usr/doc/$PRGNAM-$VERSION
+cp -a \
+  AUTHORS COPYING INSTALL README \
+  $PKG/usr/doc/$PRGNAM-$VERSION
+cat $CWD/$PRGNAM.SlackBuild > $PKG/usr/doc/$PRGNAM-$VERSION/$PRGNAM.SlackBuild
+cat $CWD/References > $PKG/usr/doc/$PRGNAM-$VERSION/References
+
+mkdir -p $PKG/install
+cat $CWD/slack-desc > $PKG/install/slack-desc
+
+cd $PKG
+/sbin/makepkg -l y -c n $OUTPUT/$PRGNAM-$VERSION-$ARCH-$BUILD$TAG.${PKGTYPE:-tgz}
diff --git a/academic/meme-suite/meme-suite.info b/academic/meme-suite/meme-suite.info
new file mode 100644
index 000000000000..9f67bd2c7e41
--- /dev/null
+++ b/academic/meme-suite/meme-suite.info
@@ -0,0 +1,10 @@
+PRGNAM="meme-suite"
+VERSION="4.12.0"
+HOMEPAGE="http://meme-suite.org/"
+DOWNLOAD="UNSUPPORTED"
+MD5SUM=""
+DOWNLOAD_x86_64="http://meme-suite.org/meme-software/4.12.0/meme_4.12.0.tar.gz"
+MD5SUM_x86_64="40d282cc33f7dedb06b24b9f34ac15c1"
+REQUIRES="openmpi perl-HTML-Template perl-HTML-Tree perl-File-Which perl-JSON"
+MAINTAINER="Petar Petrov"
+EMAIL="slackalaxy@gmail.com"
diff --git a/academic/meme-suite/slack-desc b/academic/meme-suite/slack-desc
new file mode 100644
index 000000000000..67c8c76d61eb
--- /dev/null
+++ b/academic/meme-suite/slack-desc
@@ -0,0 +1,19 @@
+# HOW TO EDIT THIS FILE:
+# The "handy ruler" below makes it easier to edit a package description.
+# Line up the first '|' above the ':' following the base package name, and
+# the '|' on the right side marks the last column you can put a character in.
+# You must make exactly 11 lines for the formatting to be correct. It's also
+# customary to leave one space after the ':' except on otherwise blank lines.
+
+          |-----handy-ruler------------------------------------------------------|
+meme-suite: meme-suite (Motif based sequence analysis tools)
+meme-suite:
+meme-suite: The MEME suite: motif based sequence analysis tools
+meme-suite:
+meme-suite: The MEME suite provides tools for discovering and using protein
+meme-suite: and DNA sequence motifs. A motif is a pattern of nucleotides or
+meme-suite: amino acids that appears repeatedly in a group of related DNA or
+meme-suite: protein sequences.
+meme-suite:
+meme-suite: Home: http://meme-suite.org/
+meme-suite: