academic/hyphy/README


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53

HyPhy: Hypothesis testing using Phylogenies

HyPhy is an open-source software package for the analysis of genetic
sequences (in particular the inference of natural selection) using
techniques in phylogenetics, molecular evolution, and machine learning.
It features a rich scripting language for limitless customization of
analyses. Additionally, HyPhy features support for parallel computing
environments (via message passing interface).

HyPhy was designed to allow the specification and fitting of a broad
class of continuous-time discrete-space Markov models of sequence
evolution. To implement these models, HyPhy provides its own scripting
language - HBL, or HyPhy Batch Language, which can be used to develop
custom analyses or modify existing ones. Importantly, it is not
necessary to learn (or even be aware of) HBL in order to use HyPhy, as
most common models and analyses have been implemented for user
convenience. Once a model is defined, it can be fitted to data (using a
fixed topology tree), its parameters can be constrained in user-defined
ways to test various hypotheses (e.g. is rate1 > rate2), and simulate
data from. HyPhy primarily implements maximum likelihood methods, but
it can also be used to perform some forms of Bayesian inference (e.g.
FUBAR), fit Bayesian graphical models to data, run genetic algorithms to
perform complex model selection.

Features
- Support for arbitrary sequence data, including nucleotide, amino-acid,
  codon, binary, count (microsattelite) data, including multiple
  partitions mixing differen data types.
- Complex models of rate variation, including site-to-site, branch-to-
  branch, hidden markov model (autocorrelated rates), between/within
  partitions, and co-varion type models.
- Fast numerical fitting routines, supporting parallel and distributed
  execution.
- A broad collection of pre-defined evolutionary models.
- The ability to specify flexible constraints on model parameters and
  estimate confidence intervals on MLEs.
- Ancestral sequence reconstruction and sampling.
- Simulate data from any model that can be defined and fitted in the
  language.
- Apply unique (for this domain) machine learning methods to discover
  patterns in the data, e.g. genetic algorithms, stochastic context free
  grammars, Bayesian graphical models.
- Script analyses completely in HBL including flow control, I/O,
  parallelization, etc.

Registration
you are highly advised to fill the registration form found at:
https://veg.github.io/hyphy-site/register/

Citing
Sergei L. Kosakovsky Pond, Simon D. W. Frost and Spencer V. Muse (2005)
HyPhy: hypothesis testing using phylogenies.
Bioinformatics 21(5): 676-679