Feature Engineering
TreeFeatureEngineer computes numeric node attributes on an ete3.Tree. The
converter later reads those attributes into graph feature columns.
Basic workflow
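from ete3 import Tree
from phylognn import TreeFeatureEngineer

# Example tree for illustration only; substitute your own ete3.Tree.
tree = Tree("((A:1,B:1):1,C:2);")
engineer = TreeFeatureEngineer(num_time_bins=101)
tree = engineer.add_features(tree, origin_time=10.0, rescale=True)
feature_order = engineer.feature_names  # ordered tuple of feature names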
Built-in features include node_time, time_bin, tip/internal indicators,
fossil/extant indicators, sampled-ancestor indicators, branch_length,
rescale_factor, and extant_sampling_probability.
When to use it
Run feature engineering before graph conversion, so that the computed attributes are already on the tree when the converter reads them into graph feature columns. Custom features must be numeric and must be registered by name before they are requested.
Feature order and determinism
feature_names is a tuple, so its order is fixed and it cannot be modified in
place. Pass it to the converter when constructing it so feature columns stay
in the same order across runs. Custom features are appended after the
built-in features, in registration order.
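The ordering rule can be illustrated with a small stand-alone sketch. This is not the TreeFeatureEngineer implementation: the FeatureRegistry class, its register method, and the shortened BUILT_IN list are hypothetical names invented for the example. It only shows how built-in names followed by custom names in registration order gives a stable tuple.

```python
from typing import Callable, Dict, Tuple

# Hypothetical, shortened list of built-in names (illustration only).
BUILT_IN: Tuple[str, ...] = ("node_time", "time_bin", "branch_length")


class FeatureRegistry:
    """Toy registry: built-in names first, custom names in registration order."""

    def __init__(self) -> None:
        self._custom: Dict[str, Callable[[object], float]] = {}

    def register(self, name: str, func: Callable[[object], float]) -> None:
        # Reject duplicates so every feature name maps to exactly one column.
        if name in BUILT_IN or name in self._custom:
            raise ValueError(f"feature {name!r} is already registered")
        self._custom[name] = func  # dicts preserve insertion order

    @property
    def feature_names(self) -> Tuple[str, ...]:
        # A new tuple on each access; callers cannot reorder the registry.
        return BUILT_IN + tuple(self._custom)


registry = FeatureRegistry()
registry.register("clade_size", lambda node: 1.0)
registry.register("depth", lambda node: 0.0)
print(registry.feature_names)
# ('node_time', 'time_bin', 'branch_length', 'clade_size', 'depth')
```

Because the tuple is derived from insertion order rather than from a set or a sort, registering the same custom features in the same sequence yields the same column layout on every run.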
Validation
origin_time must be positive. Requested feature names must be unique and
must exist in available_features. num_time_bins must be at least two,
extant_sampling_probability must be in [0, 1], and the traversal strategy
must be one of preorder, postorder, or levelorder. These contracts are
checked before graph conversion so invalid requests fail early.
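The contracts above can be written out as a minimal stand-alone check. The function name validate_request, its parameter list, and the use of ValueError are assumptions made for this sketch; the library may structure its checks and exception types differently.

```python
from typing import Iterable, Sequence

TRAVERSALS = ("preorder", "postorder", "levelorder")


def validate_request(
    origin_time: float,
    requested: Sequence[str],
    available: Iterable[str],
    num_time_bins: int,
    extant_sampling_probability: float,
    strategy: str,
) -> None:
    """Hypothetical validator: raise ValueError on the first violated contract."""
    if origin_time <= 0:
        raise ValueError("origin_time must be positive")
    if len(set(requested)) != len(requested):
        raise ValueError("requested feature names must be unique")
    available_set = set(available)
    missing = [name for name in requested if name not in available_set]
    if missing:
        raise ValueError(f"unknown features: {missing}")
    if num_time_bins < 2:
        raise ValueError("num_time_bins must be at least two")
    if not 0.0 <= extant_sampling_probability <= 1.0:
        raise ValueError("extant_sampling_probability must be in [0, 1]")
    if strategy not in TRAVERSALS:
        raise ValueError(f"strategy must be one of {TRAVERSALS}")


# A valid request passes silently.
validate_request(10.0, ["node_time"], ["node_time", "time_bin"], 101, 0.5, "preorder")
```

Running every check before any tree traversal means a bad argument is reported immediately rather than after partial feature computation.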
Rescaling
When rescale=True, non-zero branch lengths are scaled so their mean becomes
one. The same factor is used for feature computation and is attached as
rescale_factor. Trees with no non-zero branch lengths cannot be rescaled.