Data Reference
Import path
from phylognn.data import TreeFeatureEngineer, TreeToGraphConverter
Feature engineering
- class TreeFeatureEngineer(num_time_bins=101, extant_sampling_probability=1.0, custom_features=None, traversal_strategy='preorder', time_tolerance=1e-8)
Attach numeric node-level features to an ete3.Tree.
Important attributes are feature_names, an ordered immutable feature list, and available_features, an immutable membership set. Use feature_names as the column order for graph conversion.
Built-in features include node_time, time_bin, is_internal, is_tip, is_fossil, is_extant, sampled-ancestor indicators, branch_length, rescale_factor, and extant_sampling_probability.
Common failures include invalid constructor values, non-positive origin_time, duplicate feature requests, unknown feature names, and rescaling trees with no non-zero branch lengths.
- add_features(tree, origin_time, feature_names=None, rescale=True, inplace=True)
Compute requested node features and attach them to each node.
- rescale_tree(tree, inplace=True)
Rescale non-zero branch lengths so their mean becomes one.
- get_available_features()
Return the registered feature names in stable order.
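The rescaling rule described for rescale_tree can be illustrated without the library. The sketch below is not the phylognn implementation: it works on a plain list of branch lengths rather than an ete3.Tree, and the helper name rescale_branch_lengths and its return value are hypothetical. It assumes zero-length branches are left untouched and that a tree with no non-zero branch lengths is rejected with a ValueError, in line with the failure listed above; the exception type actually raised by the library is not documented here.

```python
def rescale_branch_lengths(lengths):
    """Divide every branch length by the mean of the non-zero lengths.

    Hypothetical helper for illustration only. Returns the rescaled
    lengths together with the mean that was used as the divisor.
    """
    nonzero = [b for b in lengths if b != 0]
    if not nonzero:
        # Assumed error type; there is nothing to rescale against.
        raise ValueError("tree has no non-zero branch lengths")
    mean_nonzero = sum(nonzero) / len(nonzero)
    # Zero-length branches stay zero because 0 / mean == 0.
    rescaled = [b / mean_nonzero for b in lengths]
    return rescaled, mean_nonzero


rescaled, mean_nonzero = rescale_branch_lengths([0.0, 2.0, 4.0, 6.0])
print(mean_nonzero)  # 4.0
print(rescaled)      # [0.0, 0.5, 1.0, 1.5]
```

After rescaling, the non-zero lengths 0.5, 1.0, and 1.5 average to one. How the library's rescale_factor feature relates to this divisor (the mean itself or its reciprocal) is not stated in this reference, so the sketch does not name its return value rescale_factor.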
Graph conversion
- class TreeToGraphConverter(feature_names=None, add_virtual_nodes=False, num_time_bins=None, traversal_strategy='preorder', bidirectional=True, connect_virtual_to_real=True, connect_virtual_chain=True, append_is_virtual_feature=True, preserve_node_names=True, copy_sampling_prob_to_virtual=True)
Convert a feature-bearing ete3.Tree to a PyTorch Geometric Data object. feature_names defines the feature column order. Every node must contain every requested feature and each feature value must be numeric.
Output fields include x, edge_index, edge_type, original_num_nodes, virtual_node_mask, node_type, optional node_names, optional num_time_bins, and user-provided graph-level attributes.
Edge type values are 0 for tree edges, 1 for virtual-to-real edges, and 2 for virtual-chain edges.
- convert(tree, graph_attrs=None)
Return graph data for one tree.
- convert_and_save(tree, path, graph_attrs=None, create_dirs=True)
Convert a tree, save the graph, and return it.
- save_data(data, path, create_dirs=True)
Save a PyG Data object with torch.save.
- static load_data(path, map_location=None)
Load a saved PyG Data object from a trusted project output.
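The conversion contract described for TreeToGraphConverter (column order taken from feature_names, every node carrying every requested numeric feature, tree edges tagged with edge type 0) can be sketched in plain Python. This is a rough illustration under stated assumptions, not the converter's code: the toy node list, the helper name build_graph, and the exception types are hypothetical, node ids are assumed to equal preorder row indices, and bidirectional edges are assumed to be emitted as a parent-to-child edge followed by its reverse. Virtual nodes and edge types 1 and 2 are left out because their construction is not specified in this reference.

```python
from numbers import Real

# Hypothetical toy tree in preorder: (node_id, parent_id or None, features).
NODES = [
    (0, None, {"node_time": 0.0, "is_tip": 0}),
    (1, 0, {"node_time": 1.0, "is_tip": 1}),
    (2, 0, {"node_time": 1.5, "is_tip": 1}),
]

TREE_EDGE = 0  # documented edge_type value for tree edges


def build_graph(nodes, feature_names, bidirectional=True):
    """Return (x, edge_index, edge_type) as plain Python lists."""
    x = []
    for node_id, _parent, feats in nodes:
        row = []
        for name in feature_names:  # column order follows feature_names
            if name not in feats:
                raise KeyError(f"node {node_id} is missing feature {name!r}")
            value = feats[name]
            if not isinstance(value, Real):
                raise TypeError(f"feature {name!r} on node {node_id} is not numeric")
            row.append(float(value))
        x.append(row)

    sources, targets = [], []
    for node_id, parent, _feats in nodes:
        if parent is None:
            continue  # the root has no incoming tree edge
        sources.append(parent)
        targets.append(node_id)
        if bidirectional:
            sources.append(node_id)
            targets.append(parent)

    edge_index = [sources, targets]  # 2 x num_edges layout, as in PyG
    edge_type = [TREE_EDGE] * len(sources)
    return x, edge_index, edge_type


x, edge_index, edge_type = build_graph(NODES, ["is_tip", "node_time"])
print(x)           # [[0.0, 0.0], [1.0, 1.0], [1.0, 1.5]]
print(edge_index)  # [[0, 1, 0, 2], [1, 0, 2, 0]]
print(edge_type)   # [0, 0, 0, 0]
```

In the real converter these lists would be tensors stored on a PyTorch Geometric Data object. Passing the feature list in a different order changes the columns of x, which is why the same feature_names should be used for feature engineering and for conversion.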