expam’s tree module
A programmatic API to interact with phylogenetic trees, particularly those used in reference databases.
expam.tree.location.Location
- class expam.tree.location.Location(name='', type='', dist=0.0, coord=None, accession_id=None, taxid=None, **kwargs)
Represents a node in the phylogeny.
- expam.tree.location.Location.__init__(self, name='', type='', dist=0.0, coord=None, accession_id=None, taxid=None, **kwargs)
- Parameters
name (str, optional) – name of node, defaults to “”
type (str, optional) – Leaf or Branch, defaults to “”
dist (float, optional) – distance to parent node, defaults to 0.0
coord (list, optional) – binary coordinate from root to node, defaults to None
accession_id (str, optional) – NCBI accession id, defaults to None
taxid (int, optional) – NCBI taxonomy id, defaults to None
- Variables
name – node name
type – “Leaf” or “Branch”
distance – distance to parent node
coordinate – list of binary digits representing the path from root to node
nchildren – number of children below this node
accession_id – NCBI accession id (only valid for leaves)
taxid – NCBI taxonomy id
expam.tree.tree.Index
- class expam.tree.tree.Index
Phylogeny index that can load, save and manipulate Newick trees.
- expam.tree.tree.Index.load_newick(path, keep_names=False, verbose=True)
Load a Newick tree from file.
- Parameters
path (str) – path to Newick file
- Raises
OSError – file does not exist
- Returns
names of leaves and the phylogeny Index object
- Return type
List[str], expam.tree.Index
- expam.tree.tree.Index.from_newick(newick_string, keep_names=False, verbose=True)
Parse a Newick string.
- Parameters
newick_string (str) – Newick string encoding tree.
- Returns
names of leaves and the phylogeny Index object
- Return type
List[str], expam.tree.Index
Example loading an Index object from a Newick string.
>>> from expam.tree.tree import Index
>>> tree_string = "(B:6.0,(A:5.0,C:3.0,E:4.0):5.0,D:11.0);"
>>> leaves, index = Index.from_newick(tree_string)
* Initialising node pool...
* Checking for polytomies...
Polytomy (degree=3) detected! Resolving...
Polytomy (degree=3) detected! Resolving...
* Finalising index...
>>> leaves
['B', 'A', 'C', 'E', 'D']
>>> index
<Phylogeny Index, length=10>
>>> index['A']
<expam.tree.Location object at 0x109ac7970>
>>> index['A'].name
'A'
>>> index['A'].coordinate
[0, 0, 1, 0]
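To make the parsing step concrete, here is a minimal sketch of a recursive-descent Newick parser. It is not expam's implementation: the `Node` class and the functions `parse_newick` and `leaf_names` are hypothetical names introduced for illustration, and the sketch assumes simple Newick input with no quoted labels or comments.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a tree node (not expam's Location)."""
    name: str = ""
    dist: float = 0.0
    children: List["Node"] = field(default_factory=list)


def parse_newick(text: str) -> Node:
    """Parse a simple Newick string into a tree of Node objects."""
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    node, pos = _parse_subtree(text, 0)
    if text[pos] != ";":
        raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
    return node


def _parse_subtree(text: str, pos: int) -> Tuple[Node, int]:
    """Parse one subtree starting at pos; return it with the next position."""
    node = Node()
    if text[pos] == "(":
        # Internal node: comma-separated children up to the closing ')'.
        pos += 1
        while True:
            child, pos = _parse_subtree(text, pos)
            node.children.append(child)
            if text[pos] == ",":
                pos += 1
            elif text[pos] == ")":
                pos += 1
                break
            else:
                raise ValueError(f"expected ',' or ')' at position {pos}")
    # Optional label, read up to the next structural character.
    start = pos
    while text[pos] not in ",():;":
        pos += 1
    node.name = text[start:pos]
    # Optional ':length' giving the distance to the parent.
    if text[pos] == ":":
        start = pos + 1
        pos = start
        while text[pos] not in ",();":
            pos += 1
        node.dist = float(text[start:pos])
    return node, pos


def leaf_names(node: Node) -> List[str]:
    """Collect leaf names in left-to-right order."""
    if not node.children:
        return [node.name]
    names: List[str] = []
    for child in node.children:
        names.extend(leaf_names(child))
    return names
```

Applied to the tree string from the example above, `leaf_names(parse_newick(tree_string))` gives the leaves in the same left-to-right order as the documented `leaves` list. The sketch does not resolve polytomies or build an index.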
- expam.tree.tree.Index.resolve_polytomies(pool)
If the phylogeny contains polytomies (nodes with more than two children), repeatedly join the first two children under a new parent at distance 0 until each polytomy is resolved.
- Parameters
pool (list) – node pool to resolve.
- Returns
None
- Return type
None
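The joining rule described above can be sketched on a small stand-alone tree. This is an illustration under stated assumptions, not expam's code: `Node` and `resolve` are hypothetical names, and the sketch works on nested node objects rather than on the node pool that `resolve_polytomies` receives.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a tree node (not expam's Location)."""
    name: str = ""
    dist: float = 0.0
    children: List["Node"] = field(default_factory=list)


def resolve(node: Node) -> None:
    """Make the subtree rooted at node binary, in place."""
    while len(node.children) > 2:
        first, second = node.children[0], node.children[1]
        # The new parent sits at distance 0, so root-to-leaf
        # path lengths are unchanged by the join.
        joined = Node(dist=0.0, children=[first, second])
        node.children[:2] = [joined]
    for child in node.children:
        resolve(child)
```

A node of degree 3 needs one join and a node of degree 4 needs two, since each join reduces the number of children by one.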
- expam.tree.tree.Index.coord(self, coordinate)
Return the Location (node) at a coordinate.
- Parameters
coordinate (list) – binary list representing path to node
- Returns
node in tree
- Return type
expam.tree.Location
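As an illustration of how a binary coordinate addresses a node, the sketch below walks a toy binary tree built from nested tuples. The convention that 0 selects the first child and 1 the second is an assumption made for this example, and `node_at` and `coordinate_of` are hypothetical helpers, not part of expam.

```python
from typing import List, Optional, Sequence, Tuple, Union

# A leaf is a name; a branch is a pair (first child, second child).
Tree = Union[str, Tuple["Tree", "Tree"]]


def node_at(tree: Tree, coordinate: Sequence[int]) -> Tree:
    """Follow the coordinate from the root and return the node reached."""
    node = tree
    for step in coordinate:
        if step not in (0, 1):
            raise ValueError(f"coordinate entries must be 0 or 1, got {step!r}")
        if isinstance(node, str):
            raise KeyError("coordinate continues past a leaf")
        node = node[step]
    return node


def coordinate_of(tree: Tree, name: str, path: Tuple[int, ...] = ()) -> Optional[List[int]]:
    """Return the coordinate of the leaf called name, or None if absent."""
    if isinstance(tree, str):
        return list(path) if tree == name else None
    for step, child in enumerate(tree):
        found = coordinate_of(child, name, path + (step,))
        if found is not None:
            return found
    return None


toy = ((("A", "C"), "E"), "D")
print(coordinate_of(toy, "C"))   # [0, 0, 1]
print(node_at(toy, [0, 1]))      # E
```

The empty coordinate addresses the root, and every additional digit descends one level.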
- expam.tree.tree.Index.to_newick(self)
Output the tree in Newick format.
- Returns
Newick format tree
- Return type
str
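For readers unfamiliar with the format, the following sketch shows one way a tree can be serialised to Newick. It is a hedged illustration rather than expam's `to_newick`: `Node` and `newick` are hypothetical names, and omitting the branch length on the root is a choice made for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a tree node (not expam's Location)."""
    name: str = ""
    dist: float = 0.0
    children: List["Node"] = field(default_factory=list)


def newick(root: Node) -> str:
    """Serialise the tree rooted at root as a Newick string."""
    return _format(root, is_root=True) + ";"


def _format(node: Node, is_root: bool = False) -> str:
    text = node.name
    if node.children:
        # Children go inside parentheses, followed by the node's own label.
        inner = ",".join(_format(child) for child in node.children)
        text = f"({inner}){node.name}"
    if not is_root:
        # Every non-root node carries its distance to the parent.
        text += f":{node.dist}"
    return text


tree = Node(children=[
    Node("B", 6.0),
    Node(dist=5.0, children=[Node("A", 5.0), Node("C", 3.0)]),
    Node("D", 11.0),
])
print(newick(tree))  # (B:6.0,(A:5.0,C:3.0):5.0,D:11.0);
```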
- expam.tree.tree.Index.yield_child_nodes(self, node_name)
Yield the node and all nodes below it (both branches and leaves).
- Parameters
node_name (str) – name of node to start yielding from
- Yield
node names at or below node_name
- Return type
str
>>> for node in index.yield_child_nodes('p1'):  # p1 will always be the root
...     print(node)
...
1
D
2
B
3
E
4
A
C
Note
Internal node (branch) names may be written with a leading ‘p’ (for example ‘p1’), but the prefix can also be omitted (‘1’).
- expam.tree.tree.Index.yield_leaves(self, node_name)
Yield only the leaves at or below some node.
- Parameters
node_name (str) – node to retrieve leaves from.
- Yield
leaf names at or below node_name.
- Return type
str
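The two generators can be pictured as a pre-order traversal, with the leaf variant filtering out branches. The sketch below is an assumption-laden illustration and not expam's code: `iter_nodes` and `iter_leaves` are hypothetical names, and the adjacency mapping is a guess at a topology that would produce the node order printed in the `yield_child_nodes` example.

```python
from typing import Dict, Iterator, List

# Maps each branch name to its children; leaves have no entry.
# Assumed topology, chosen to reproduce the documented print order.
CHILDREN: Dict[str, List[str]] = {
    "1": ["D", "2"],
    "2": ["B", "3"],
    "3": ["E", "4"],
    "4": ["A", "C"],
}


def iter_nodes(children: Dict[str, List[str]], node_name: str) -> Iterator[str]:
    """Yield node_name, then everything below it, in pre-order."""
    yield node_name
    for child in children.get(node_name, []):
        yield from iter_nodes(children, child)


def iter_leaves(children: Dict[str, List[str]], node_name: str) -> Iterator[str]:
    """Yield only the leaves at or below node_name."""
    for name in iter_nodes(children, node_name):
        if not children.get(name):
            yield name


print(list(iter_nodes(CHILDREN, "1")))   # ['1', 'D', '2', 'B', '3', 'E', '4', 'A', 'C']
print(list(iter_leaves(CHILDREN, "3")))  # ['E', 'A', 'C']
```

Because both functions are generators, a caller can stop early without visiting the rest of the subtree. Wrapping either call in `list(...)` gives the list-returning behaviour of the `get_child_*` methods.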
- expam.tree.tree.Index.get_child_nodes(self, node_name)
Return a list of the nodes at or below node_name.
- Parameters
node_name (str) – name of node
- Returns
list of node names
- Return type
List[str]
>>> index.get_child_nodes('1')
['1', 'D', '2', 'B', '3', 'E', '4', 'A', 'C']
>>> index.get_child_nodes('E')
['E']
- expam.tree.tree.Index.get_child_leaves(self, node_name)
Get a list of the leaves at or below node_name.
- Parameters
node_name (str) – name of node
- Returns
list of leaf names
- Return type
List[str]