indel-Seq-Gen v2.0
Author: Cory Strope cstrope AT cse DOT unl DOT edu
indel-Seq-Gen allows the user to simulate multiple subsequences according to different
evolutionary parameters, allowing for the simulation of realistic families of
protein sequences. indel-Seq-Gen records all evolutionary events and outputs the "true"
multiple alignment of the sequences, and can generate a larger simulated
sequence space by allowing the use of multiple related root sequences. indel-Seq-Gen can be
used to test the accuracy of multiple alignment methods, evolutionary hypotheses, ancestral
protein reconstruction methods, and protein superfamily classification methods. indel-Seq-Gen
is built on top of Seq-Gen [3].
indel-Seq-Gen (v. 2.0) Downloadable files:
Obsolete versions:
To extract .tar.gz files, type:
- gunzip [filename].tar.gz
- tar -xvf [filename].tar
For a manual of indel-Seq-Gen v2.0 commands, click here.
- v1.0.3
- indel-Seq-Gen has been tested on RedHat Linux, SuSE Linux, IRIX, and MacOS X (versions 10.3.9,
10.2.8, and 10.4.7). In addition to the gcc compiler (needed only if the provided executables do
not run on your system), indel-Seq-Gen requires Perl. We have not tested
indel-Seq-Gen in a Windows environment. The indel-Seq-Gen ZIP archive provided above does not
include executables for Windows.
- v2.0
- indel-Seq-Gen version 2.0 [2] (Open Access) includes the following
functionality upgrades:
- Includes protein, coding DNA, and non-coding DNA simulation,
- Incorporates PROSITE-like regular expressions for motif conservation,
- Allows the user to impose minimum and maximum lengths on subsequences (e.g., GPCR transmembrane
subsequence can be a minimum of 17 and a maximum of 24 amino acids),
- Changed the continuous simulation to break branch lengths up into small, discrete steps,
- Logs insertion and deletion events, outputting
- Type of event,
- branch of occurrence,
- length of event,
- relative time of occurrence, and
- position of the inserted characters (for insertions) or gaps (for deletions) in the true multiple
sequence alignment.
- Introduced a novel representation of event probability (insertions, deletions and substitutions) for
a sequence that is constrained, and
- Fixed a flaw in the modeling of indels (see [2] for details).
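To illustrate the motif-conservation item in the list above, here is a minimal Python sketch of how a pattern written in standard PROSITE syntax can be turned into a regular expression and used to check whether a motif survives a proposed change. It assumes plain PROSITE notation; the exact "PROSITE-like" syntax that indel-Seq-Gen accepts is described in its manual and may differ. The function names `prosite_to_regex` and `motif_is_conserved` are hypothetical and are not part of indel-Seq-Gen.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a standard PROSITE pattern into a Python regular expression.

    Assumption: plain PROSITE syntax (x, [..], {..}, (n), (n,m), <, >).
    This is not indel-Seq-Gen's own parser.
    """
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{"):       # any residue except those listed
            core = "[^" + core[1:-1] + "]"
        # A [..] class or a single residue letter is already valid regex.
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def motif_is_conserved(sequence: str, start: int, pattern: str) -> bool:
    """Return True if the motif still matches the sequence at position start."""
    return re.compile(prosite_to_regex(pattern)).match(sequence, start) is not None


# Example pattern in PROSITE syntax (used here only for illustration).
pattern = "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H"
sequence = "MKCAACQQQLAAAAAAAAHAAAH"
mutated = sequence[:9] + "A" + sequence[10:]   # replace the conserved L with A

print(prosite_to_regex(pattern))
print(motif_is_conserved(sequence, 2, pattern))
print(motif_is_conserved(mutated, 2, pattern))
```

In a simulator, a check of this kind could be used to reject substitutions or indels that would break a conserved motif; how indel-Seq-Gen actually enforces the constraint is documented in [2].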
The Perl script from iSGv1.0 has been converted into ANSI C++, and iSGv2.0 is now packaged using GNU
autotools for compatibility.
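Two items from the v2.0 feature list, the discrete stepping of branch lengths and the indel event log, can be illustrated together with a small Python sketch. It is only a toy model under stated assumptions: the step size, the per-site indel rate, the uniform indel length, and all names (`IndelEvent`, `split_branch`, `simulate_branch`) are hypothetical, and the recorded position refers to the evolving sequence rather than to the true multiple alignment that indel-Seq-Gen reports. See [2] for the model indel-Seq-Gen actually uses.

```python
import random
from dataclasses import dataclass


@dataclass
class IndelEvent:
    kind: str             # "insertion" or "deletion"
    branch: str           # label of the branch on which the event occurred
    length: int           # number of residues inserted or deleted
    relative_time: float  # fraction of the branch elapsed at the event (0..1)
    position: int         # position in the evolving sequence (simplification)


def split_branch(branch_length, step=0.01):
    """Return a list of step sizes that add up to branch_length."""
    n_full = int(branch_length / step)
    steps = [step] * n_full
    remainder = branch_length - n_full * step
    if remainder > 1e-12:
        steps.append(remainder)
    return steps


def simulate_branch(seq_len, branch, branch_length, indel_rate,
                    step=0.01, max_indel=5, rng=None):
    """Walk along one branch in small steps and log any indel events."""
    rng = rng or random.Random()
    events = []
    elapsed = 0.0
    for dt in split_branch(branch_length, step):
        elapsed += dt
        # For a small step, rate * length * dt approximates the chance
        # of a single indel event occurring within the step.
        if rng.random() < indel_rate * seq_len * dt:
            kind = rng.choice(["insertion", "deletion"])
            size = rng.randint(1, max_indel)
            if kind == "deletion":
                size = min(size, seq_len)
                position = rng.randrange(seq_len - size + 1)
                seq_len -= size
            else:
                position = rng.randrange(seq_len + 1)
                seq_len += size
            events.append(IndelEvent(kind, branch, size,
                                     elapsed / branch_length, position))
    return seq_len, events


final_len, log = simulate_branch(100, "root->A", 1.0, indel_rate=0.05,
                                 step=0.0625, rng=random.Random(1))
for event in log:
    print(event)
```

Because the sequence length is updated after every step, the event probability in later steps reflects earlier insertions and deletions, which is the main reason to step through a branch rather than draw all events at once.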
- 1.0.3
- Fixed the problem of no indels occurring with the Chang and Benner model of indel evolution.
- 1.0.2
- Fixed ancestral sequence option (indel-Seq-Gen option "-w a"). NOTE:
Unlike seq-gen, indel-Seq-Gen will allow you to use the ancestral sequence output option for
multi-partition simulation runs. However, all partitions MUST have the same branching pattern
for the output ancestral sequences to be meaningful. Ancestral sequences are also included in
the true multiple alignment.
- 1.0.1
- The indel-Seq-Gen script now finds the Perl environment and makes portable calls to seq-gen-i.
Fixed a bug related to the PHYLIP output format.
- 1
- Strope, C.L., Scott, S.D., and Moriyama, E.N. 2007. indel-Seq-Gen: A new
protein family simulator incorporating domains, motifs, and indels. Mol. Biol. Evol. 24: 640-649.
- 2
- Strope, C.L., Abel, K., Scott, S.D., and Moriyama, E.N. 2009. Biological sequence
simulation for complex evolutionary hypotheses with indel-Seq-Gen version 2. Mol. Biol. Evol.
doi:10.1093/molbev/msp174.
- 3
- Rambaut, A. and Grassly, N. 1997. Seq-Gen: an application for the
Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. CABIOS 13: 235-238.
cory strope
2009-08-10