Manual

TreeMaker

Interactive construction of taxonomies and species richness data

Author: Paul-Michael Agapow
Contact: treemaker@agapow.net
Date: 2008/8/4
Web site: http://www.agapow.net/software/treemaker

Introduction

Biodiversity assessment demands objective measures, because ultimately conservation is an issue of economics, prioritizing the use of limited resources for preserving taxa. The most general framework for such metrics is one that assesses evolutionary distinctiveness by how much of a phylogeny is conserved. However, the applicability of these metrics is limited by the still small proportion of taxa that have been reliably placed in a phylogeny. Given that this is unlikely to be corrected soon, alternatives are needed. Taxonomy can be used as a reasonable surrogate for phylogeny. When this is combined with searches for combinations of local sites containing maximal diversity, the efficacy of any conservation scheme can be determined from a taxonomy of the organisms involved and the abundance data at potential preservation sites.

To this end, TreeMaker is software that allows the interactive building and editing of a taxonomy and its conversion into a phylogeny for the above calculations. It also allows the editing of site abundance and species richness data. This data may be imported from and exported to a variety of formats for interoperability with other programs. While it is mainly intended for use in conservation and biodiversity, it can be used as a simple tool for building phylogenies.

Technical description

TreeMaker can be downloaded from http://www.agapow.net/software/treemaker. Several associated programs (like MeSA and Conserve) can be found on the same site at http://www.agapow.net/software/. TreeMaker is available as a standalone program for MacOS (as a Universal Binary), Windows and Linux. Differences between platforms are cosmetic, not functional. Similarly, the data files TreeMaker produces may be used across platforms. There are no special memory or library requirements.

The TreeMaker distribution includes:

  • The TreeMaker application

  • A set of example files including:

    • example.tree, a data file (in raw tree format)
    • example.trmk, a data file (in TreeMaker format)
  • treemaker_manual, this manual

TreeMaker may be installed by simply copying it to an appropriate place on a local hard disk. To use the online help from within TreeMaker, the HTML manual file must be in the same directory as the application.

Typical use

To illustrate the use of TreeMaker, we'll follow the construction of a small taxonomy along with some abundance data. Minor details may differ depending on the version of TreeMaker used. First, we create a new TreeMaker document using New on the File menu. This presents a dialog that allows us to specify the initial number of taxonomic levels:

new_doc.jpg

A new document is created with a taxonomy containing a single node, the root. We can now extend the taxonomy by selecting the root node and choosing New Daughter from the Tree-building menu:

new_daughter.jpg

We continue this for some time, adding nodes to the right places on the tree. Note how the list of the terminal taxa updates in the right-hand abundance pane as the taxonomy changes.

Of course, all the nodes have the default and cryptic names (indicated by being grayed out). We need to rename them to something meaningful. Select a node and choose Rename Node from the Tree-building menu:

rename_node.jpg

Continuing on, we complete our taxonomy:

full_tax.jpg

Now we want to add abundance data: how many times particular species of ants have been seen at particular sites. So first we add a site:

add_site.jpg

After adding another, we can directly edit the abundance data for each terminal taxon:

edit_abundance.jpg

Now we can save the document for later use. Also we can export the data to another format for use in another program, using the Export option on the File menu. Parameters for how the data is exported can be found in the Settings option on the same menu:

settings.jpg

The document

example_doc.jpg

A TreeMaker document presents its data in two panes. On the left is the tree hierarchy. This presents a taxonomy in a semi-columnar format. Each column represents a taxonomic level, e.g. family, genus, species, and can be named as users desire. The number of levels can be defined at document creation or by later adding or deleting levels from the Tree-building menu. Note that level names are mainly cosmetic, and serve as a help in laying out a taxonomy.

Below the level names is the taxonomy laid out as a staggered tree. Nodes in the same column are considered to be at the same taxonomic level. Any child (immediately descendant) nodes will be in the

On the right is the site data. This allows the association of terminal taxa in the taxonomy with abundance (or incidence) data at a series of discrete sites. Sites may be added or deleted from data sets. If the terminal taxa of the taxonomy are changed (i.e. a tip is deleted or gains a daughter), the rows of site data are updated automatically. Sites may be selected for menu operations by selecting the column header. Individual site data can be edited directly by clicking on them.

The menus

File

New
Create a new TreeMaker document. The user will first be asked for the number of initial taxonomic levels and given a chance to name them.
Open
Open a previously saved TreeMaker document.
Save
Save the current data as a TreeMaker document. If you wish to save in other formats (e.g. for using with other programs), use the Export To option.
Save As
Save the current data as a TreeMaker document with a new name.
Export To
Save the data in a foreign format suitable for use with other programs.
Settings
This produces a dialog that allows the setting of various options controlling the presentation and export of data.

Edit

This presents the usual options for editing text fields and boxes that are present in TreeMaker documents and dialog boxes.

Tree-building

Rename Node
Change the name of the selected node. Note that all nodes, not just terminal ones, may be named. If nodes are not given a specific name, a default is supplied but displayed in grayed out text.
Rename Level
Change the name of the level. Again, if a level name is not given, a default will be generated and displayed in light grey.
Add Level
Add another level to the taxonomy.
Remove Level
Delete the selected level from the taxonomy. For safety, a level cannot be removed unless it is empty. That is, level removal will not happen until the level contains no nodes.
Shift Left / Delete
Delete the selected node. If it has any children, make those children of their grandparent node. In effect this shifts a subtree to the left.
Shift Right
Insert a new node above the selected node, thus shifting a subtree to the right.
New Daughter
Add a child node to the current selected node. Note that this operation is not available for nodes in the terminal level of the taxonomy, a safety measure to stop the accidental addition of levels.
Flatten to Star Phylogeny
Transform the tree so that all terminal taxa are immediate children of the root node.
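
The Shift Left / Delete and Flatten to Star Phylogeny operations described above can be sketched on a table that maps each node to its parent. This is only an illustration of the described behaviour: the node names, the dictionary layout and the refusal to delete the root are assumptions, not TreeMaker's own code.

```python
def delete_node(parents, node):
    """Shift Left / Delete: remove `node`; its children attach to its parent."""
    grandparent = parents[node]
    if grandparent is None:
        # Assumption for this sketch: the root cannot be deleted.
        raise ValueError("cannot delete the root")
    result = {}
    for child, parent in parents.items():
        if child == node:
            continue
        result[child] = grandparent if parent == node else parent
    return result


def flatten_to_star(parents):
    """Keep only the root and the terminal taxa, all tips becoming children of the root."""
    root = next(n for n, p in parents.items() if p is None)
    has_children = set(parents.values())
    return {
        n: (None if n == root else root)
        for n in parents
        if n == root or n not in has_children
    }


# Hypothetical taxonomy: node -> parent (None marks the root).
parents = {
    "root": None,
    "g1": "root", "one": "g1", "A": "one",
    "g2": "root", "two": "g2", "B": "two",
    "three": "g2", "C": "three", "D": "three",
}

shifted = delete_node(parents, "one")   # A now hangs directly off g1
star = flatten_to_star(parents)         # A, B, C and D hang directly off root
```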

Abundance

Rename Site
If an abundance figure is selected for editing, change the name of the selected site.
Add Site
Add a new column to the site data pane, representing a new discrete site.
Delete Site
Delete the selected site and any associated abundance data.

Tree

This is a context-dependent menu to ease navigation around large trees. If a node is selected in the tree pane, this menu changes its name to that of the node and gives a number of options for operating upon it. These are New daughter (which works as the New Daughter option in the Tree-building menu above) and Rename (which works as Rename Node). Attached below these is a series of cascading menus for the subtree headed by the selected node. This allows the same two operations on any of the descendant nodes.

Other menu choices

The exact position and style of these may differ based on platform and version.

About TreeMaker
Shows an information box with credits and the application version number.
TreeMaker Help
Opens the local copy of the help file (essentially this document) in a web browser. Note that for this to work, the help file must be in the same directory as the application.
Go To Website
Open a web browser pointing at the TreeMaker home page.

Settings & export

This leads to a dialog for several options that control how data is presented and exported, in particular how branch lengths are treated within the exported phylogenies. Note that these work on a per-document basis - changes in the settings for one document do not affect those in another. In combination, these options can create confusion, so a simple example taxonomy will be used to illustrate how trees are produced:

sample-taxonomy.jpg

If translated directly into a phylogeny, in Newick format this tree would normally be represented as:

(((A)), ((B), (C, D)))

or, with the intermediate taxonomic node indicated:

sample-phyl.png

The difficulty arises in the case of the tips A & B. These "singleton" nodes are the only child of their parent node. In theory, the Newick format permits such nodes. In practice, many programs do not, and expect any parent node to be at least bifurcating. This is prima facie reasonable: internal nodes in molecular phylogenies or cladograms are inferred by the presence of at least two child nodes. However, there are cases where such solitary nodes can arise. Our present case where taxonomies are literally translated into phylogenies is one. Some families may contain only one genus, some genera only one species. Another case is where phyletic transformation has led to one species giving rise to a distinct and different one. Finally, extinction may cull the children of a speciation event so that a parent species gives rise to a single child species.

The obvious way to handle such singletons is to collapse them up into their parents until a valid (and at least bifurcating) tree is formed. TreeMaker provides a number of options for this.

Taxa names as binomials

If this option is checked, in the respective case (display or export), the names of terminal taxa are given as x y, where y is the node name and x is the name of their parent node, i.e. the penultimate taxonomic group. For example:

((('one A')), (('two B'), ('three C', 'three D')))
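
As a rough sketch of how such binomial names could be built, each tip below is written as its parent's name followed by its own. The (name, children) tuples and the names "upper1" and "upper2" for the unnamed higher nodes are assumptions made for illustration only.

```python
def binomial_newick(node, parent_name=None):
    """Write a tree in Newick form, naming each tip as 'parent tip'."""
    name, children = node
    if not children:
        # A terminal taxon: prefix it with the name of its parent node.
        return f"'{parent_name} {name}'"
    inner = ", ".join(binomial_newick(child, name) for child in children)
    return "(" + inner + ")"


# The sample taxonomy; "upper1" and "upper2" are hypothetical names.
tree = ("root", [
    ("upper1", [("one", [("A", [])])]),
    ("upper2", [
        ("two", [("B", [])]),
        ("three", [("C", []), ("D", [])]),
    ]),
])

print(binomial_newick(tree))
```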

Collapse singletons

If any tree nodes are singletons (as above) collapse them up into their parents in the exported tree. For example:

(A, (B, (C, D)))
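
The collapsing step can be sketched as a short recursion: any node with exactly one child is replaced by that child. The nested-list representation below (a tip is a string, an internal node is a list of children) is an assumption for illustration, not TreeMaker's internal structure.

```python
def collapse(node):
    """Replace every node that has exactly one child with that child."""
    if isinstance(node, str):
        return node
    children = [collapse(child) for child in node]
    if len(children) == 1:
        return children[0]
    return children


def to_newick(node):
    """Write a nested-list tree in Newick form, without branch lengths."""
    if isinstance(node, str):
        return node
    return "(" + ", ".join(to_newick(child) for child in node) + ")"


sample = [[["A"]], [["B"], ["C", "D"]]]
print(to_newick(sample))             # the uncollapsed tree
print(to_newick(collapse(sample)))   # the tree with singletons collapsed
```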

Branchlengths: None

No distances are written to the output tree. For example:

(((A)), ((B), (C, D)))

Branchlengths: As is

In the exported trees, use any branchlengths that were already in the tree when it was imported. Otherwise, set any branchlength to 1.0. For example, if the tree was imported from a file that marked C and D as joining their parents with branches of 0.5 in length:

(((A:1.0):1.0):1.0, ((B:1.0):1.0, (C:0.5, D:0.5):1.0):1.0)

Branchlengths: All inter-level distances equal

Every branch in the tree is set to the value given in the "Distance" field. So, if "Distance" is set to 0.7:

(((A:0.7):0.7):0.7, ((B:0.7):0.7, (C:0.7, D:0.7):0.7):0.7)

If the "collapse singletons" option is set, the result will be:

(A:0.7, (B:0.7, (C:0.7, D:0.7):0.7):0.7)

This has no effect if "None" or "As is" is used.

Distance is total

If set, "Distance" is interpreted as the maximum distance (root to tip) in the tree, and each branch distance is set to this value divided by the maximum path length in nodes. So if "Distance" is set to 2.0:

(((A:0.67):0.67):0.67, ((B:0.67):0.67, (C:0.67, D:0.67):0.67):0.67)

That is, from the root to the furthest tips (all of them) is 2.0. If the "collapse singletons" option is set, the result will be:

(A:0.67, (B:0.67, (C:0.67, D:0.67):0.67):0.67)

That is, from the root to the furthest tips (C & D) is 2.0. Obviously this only applies with "All inter-level distances equal".
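
The arithmetic behind these figures can be sketched as follows: count the branches on the longest root-to-tip path and divide the total distance by that count. The nested-list trees are an assumed representation used only for this example.

```python
def max_depth(node):
    """Number of branches on the longest path from this node to a tip."""
    if isinstance(node, str):
        return 0
    return 1 + max(max_depth(child) for child in node)


def branch_length(tree, total):
    """Per-branch distance when `total` is the root-to-furthest-tip distance."""
    return total / max_depth(tree)


sample = [[["A"]], [["B"], ["C", "D"]]]   # (((A)), ((B), (C, D)))
collapsed = ["A", ["B", ["C", "D"]]]      # (A, (B, (C, D)))

print(round(branch_length(sample, 2.0), 2))
print(round(branch_length(collapsed, 2.0), 2))
```

Both trees have a longest path of three branches, which is why the per-branch figure is the same with and without collapsed singletons.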

Singletons accumulate distances

If singletons are collapsed, they acquire the distance of the sum of branches that have been collapsed. That is, if M subtends N subtends O subtends P, with branches of 1.0 each, it collapses to M-P with a branch-length of 3. This applies to "As is" and "All inter-level distances equal", and only when singletons are collapsed. So, if "Distance" is set to 0.7 and "Collapse singletons" is on, we get:

(A:2.1, (B:1.4, (C:0.7, D:0.7):0.7):0.7)
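
One way to picture the accumulation is a recursion that, whenever it meets a singleton, drops the node and carries its branch length down to the surviving child. This is a sketch under assumptions: trees are nested lists, every branch has the same length, and a singleton root simply disappears.

```python
def render(node, dist, carried=0.0, is_root=True):
    """Newick output with singletons collapsed and their lengths summed."""
    if isinstance(node, list) and len(node) == 1:
        # Singleton: remove it and hand its branch length to its only child.
        # Assumption: a singleton root has no branch, so nothing is carried.
        passed = 0.0 if is_root else carried + dist
        return render(node[0], dist, passed, is_root)
    if isinstance(node, str):
        text = node
    else:
        text = "(" + ", ".join(render(c, dist, 0.0, False) for c in node) + ")"
    if is_root:
        return text
    # This node's own branch plus whatever was collapsed above it.
    return f"{text}:{carried + dist:g}"


sample = [[["A"]], [["B"], ["C", "D"]]]
print(render(sample, 0.7))
```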

Data format

TreeMaker uses a simple plain-text format based on a subset of the YAML specification [3]. This is done so that if necessary users can hack at the data files, converting datasets to and from TreeMaker format where other methods fall short. The format is briefly described below, but more can be learnt by studying TreeMaker output and YAML documentation. Minor details may differ depending on the version of TreeMaker used.

TreeMaker documents begin with #TREEMAKER on the first line by itself. Following this is an optional comment section that is ignored until the formal start of the YAML document. This is indicated by three hyphens ---, again on a line by itself. The YAML elements that follow are line-based, with indentation used to indicate nesting of sections and ownership. The main types of document elements are key:value pairs, lists, inline lists and inline key:value lists:

key:value pair
The key is an identifier associated with some following data, which can be a simple value, a list or entire indented section. The key name is followed by a colon :.
list
A list is a series of items, all indented to the same depth and prefixed by a hyphen -.
inline list
A list placed on a single line, with members separated by commas and flanked by square brackets.
inline key:value list
A series of key:value pairs, separated by commas and flanked by braces, e.g. { a: 1, b: 2 }.

The TreeMaker data is therefore a series of sections given as key:value pairs. These sections are "version", "levels", "sites", "tree" and "abundances".

The version section indicates the format version used by the file and should not be changed.

The levels section gives an inline list of the level names. If a level has not been named, a blank is given.

Similarly, sites is a list of the site names.

The tree section specifies a phylogeny as a list of nodes. Each node is identified with a unique id and given an inline key:value list with information about the node - what its name is and what the id of its parent node is. Note that the root has no entry for a parent.

Abundances is a list of the above tree node ids, each followed by an inline list of the abundances at each site. These abundances are integers.
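
As an illustration of what can be done with data in this shape, the sketch below counts species richness per site, that is, how many terminal taxa have a non-zero abundance there. The node ids and counts are invented for the example, and the dictionary is an assumed in-memory form of the abundances section rather than anything TreeMaker itself produces.

```python
# Hypothetical abundances: node id -> counts at each site, in site order.
abundances = {
    "n4": [3, 0],
    "n6": [0, 0],
    "n7": [1, 5],
    "n8": [0, 2],
}


def richness(abundances):
    """Number of taxa present (count > 0) at each site."""
    per_site = zip(*abundances.values())   # regroup the counts by site
    return [sum(1 for count in column if count > 0) for column in per_site]


print(richness(abundances))
```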

The settings section provides the values for the document options that control presentation and output. I'd recommend that you don't mess with this. Actually this section can be deleted without harm.

Note that at the moment the order of these sections is immutable, and that the indentation depth is set to 3 spaces. This is not strictly in line with the YAML specification but is done for simplicity. The type of line-breaks (eolns) used in the file is unimportant, although by convention TreeMaker saves using Unix line-endings.
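
For those who do want to hack at the data files, the overall layout described above (the #TREEMAKER line, an optional comment section, then --- and the YAML body) could be separated with something like the following. This is a minimal sketch: the function name is made up, and the sample text is an illustrative fragment, not a complete TreeMaker file.

```python
def split_treemaker(text):
    """Return (comment_lines, body_lines) for a TreeMaker-style document."""
    lines = text.splitlines()   # copes with Unix, Windows and old Mac line endings
    if not lines or lines[0].strip() != "#TREEMAKER":
        raise ValueError("first line must be #TREEMAKER")
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything before the marker is comment, everything after is YAML.
            return lines[1:i], lines[i + 1:]
    raise ValueError("no --- line marking the start of the YAML document")


# Illustrative fragment only; a real file carries all the sections listed above.
sample_text = (
    "#TREEMAKER\n"
    "Ant survey, edited by hand\n"
    "---\n"
    "levels: [family, genus, species]\n"
)

comment, body = split_treemaker(sample_text)
```

The body lines can then be handed to a YAML parser or processed line by line.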

Q & A

Can subtrees be cut from one part of the taxonomy and pasted onto another?
No.
Can TreeMaker version x open files created by TreeMaker version y?
Generally any version of TreeMaker can open a file created by any previous version. However, there is no guarantee that it will be able to open a file created by any future version. This is because the capabilities of TreeMaker have expanded through time and the data file format has changed to accommodate these.
Is there an Undo function to reverse mistakes?
No.
Can there be?
No.
Are there any limits on the types of trees produced?
TreeMaker has been used (and abused) to make trees with hundreds of nodes. However, attention may have to be paid to how data is exported, as some external programs only read a subset of legal tree formats. For example, some programs may expect only strictly bifurcating trees (e.g. ((A,B),(C,D))) while others do not tolerate singleton nodes (where a node has only a single descendant, e.g. ((A))). (See the section on data export above.) Others will have restrictions about how nodes are named.

Credits

If you use TreeMaker in any research resulting in a publication, please cite [1]. TreeMaker was written in RealBasic [4] and developed under MacOSX. It was produced with the help of Ross Crozier and Lisa Dunnett of James Cook University, Australia. Lisa has also produced a program called TreeMaker with roughly equivalent functionality. It can be found at http://homes.jcu.edu.au/~jc125033/Treemaker.htm.

References

[1] Ross H Crozier, Lisa J Dunnett, Paul-Michael Agapow (2005). Phylogenetic biodiversity assessment based on systematic nomenclature. Evolutionary Bioinformatics Online 1:11-36.
[2] MeSA - Macroevolutionary Simulation & Analysis. Software. http://www.agapow.net/software/mesa
[3] YAML - Yet Another Markup Language. http://www.yaml.org/
[4] RealBasic. http://www.realsoftware.com/
[5] Conserve. http://www.agapow.net/software/conserve