Quickstart

This first tutorial creates an ete3.Tree, attaches node features, converts it to a PyTorch Geometric Data object, adds a target label, runs a tiny training smoke test, and prints a prediction.

Create a small tree

from ete3 import Tree

def build_tree() -> Tree:
    return Tree("((A:1.0,B:1.5)C:0.5,D:2.0)root:0.0;", format=1)

Attach node features

TreeFeatureEngineer writes numeric attributes to each tree node. Pass the same feature_names list to both the engineer and the converter so that the feature columns keep a stable order during graph conversion.

import torch
from torch_geometric.data import Data


def make_graph() -> Data:
    # Write numeric feature attributes onto each node of the tree.
    engineer = TreeFeatureEngineer(num_time_bins=6)
    tree = engineer.add_features(
        build_tree(),
        origin_time=4.0,
        feature_names=FEATURE_NAMES,
        rescale=False,
        inplace=True,
    )
    # Reuse the same feature_names so the feature columns follow a stable order.
    converter = TreeToGraphConverter(
        feature_names=FEATURE_NAMES,
        add_virtual_nodes=False,
        append_is_virtual_feature=False,
        traversal_strategy=engineer.traversal_strategy,
    )
    data = converter.convert(tree, graph_attrs={"sample_id": "quickstart"})
    # Dummy graph-level regression target for the smoke test.
    data.y = torch.tensor([1.0], dtype=torch.float32)
    return data
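To illustrate why a fixed feature_names list matters, here is a minimal sketch that does not use TreeFeatureEngineer or TreeToGraphConverter. The Node class, the attribute names depth and branch_length, and the helper feature_rows are hypothetical stand-ins. The point is only that reading attributes in one agreed order gives every row the same column layout, regardless of the order in which the attributes were written.

```python
from dataclasses import dataclass, field

# Hypothetical feature names, used only for this illustration.
FEATURE_ORDER = ["depth", "branch_length"]


@dataclass
class Node:
    """Stand-in for a tree node that carries numeric attributes."""
    name: str
    attrs: dict = field(default_factory=dict)


def feature_rows(nodes, feature_names):
    """Read each node's attributes in the given column order."""
    return [[float(node.attrs[name]) for name in feature_names] for node in nodes]


# The two nodes store their attributes in different insertion orders.
nodes = [
    Node("A", {"branch_length": 1.0, "depth": 2}),
    Node("D", {"depth": 1, "branch_length": 2.0}),
]
rows = feature_rows(nodes, FEATURE_ORDER)
print(rows)  # [[2.0, 1.0], [1.0, 2.0]]
```

Column 0 is always depth and column 1 is always branch_length, because the order comes from the list rather than from the nodes.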

Convert the tree to graph data

TreeToGraphConverter reads the node attributes into graph tensors. The make_graph snippet above performs this conversion and also sets a dummy graph-level target, data.y, which is the field the trainer expects during supervised training.
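As a rough picture of what a tree-to-graph conversion produces, the sketch below builds a 2 x E edge list for the quickstart tree in plain Python. It assumes a parent-first (preorder) node numbering and parent-to-child edges; the ordering and edge direction that TreeToGraphConverter actually uses are not shown here and may differ. The names CHILDREN, preorder, and build_edge_index are hypothetical.

```python
# Children of each node in the quickstart tree: ((A,B)C,D)root
CHILDREN = {"root": ["C", "D"], "C": ["A", "B"], "A": [], "B": [], "D": []}


def preorder(node, children):
    """Yield node names parent-first, children in listed order."""
    yield node
    for child in children[node]:
        yield from preorder(child, children)


def build_edge_index(root, children):
    """Return (node order, [sources, targets]) with parent -> child edges."""
    order = list(preorder(root, children))
    index = {name: i for i, name in enumerate(order)}
    sources, targets = [], []
    for parent in order:
        for child in children[parent]:
            sources.append(index[parent])
            targets.append(index[child])
    return order, [sources, targets]


order, edge_index = build_edge_index("root", CHILDREN)
print(order)       # ['root', 'C', 'A', 'B', 'D']
print(edge_index)  # [[0, 0, 1, 1], [1, 4, 2, 3]]
```

The two inner lists correspond to the two rows of an edge_index tensor: row 0 holds source node indices and row 1 holds target node indices.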

Add a target label

The smoke test uses a single regression target:

data.y = torch.tensor([1.0], dtype=torch.float32)

For real datasets, attach one target per graph and keep the target shape compatible with the selected model head and loss.
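The sketch below shows one common shape mismatch, using plain PyTorch only. It assumes per-graph targets of shape (1,) are concatenated along the first dimension when graphs are batched, and it uses a hypothetical head output of shape (B, 1); the project's model heads may already return a flat tensor.

```python
import torch
import torch.nn.functional as F

# Three graphs, each with one regression target of shape (1,).
targets = [torch.tensor([1.0]), torch.tensor([0.5]), torch.tensor([2.0])]

# Concatenating along dim 0 mimics how per-graph targets are stacked in a batch.
batch_y = torch.cat(targets, dim=0)
print(batch_y.shape)  # torch.Size([3])

# A hypothetical model head that emits one value per graph as a column.
head_output = torch.zeros(3, 1)

# Flatten the head output so it matches the target shape before the loss.
pred = head_output.view(-1)
loss = F.mse_loss(pred, batch_y)
print(pred.shape == batch_y.shape)  # True
print(loss.item())  # 1.75
```

Without the flattening step, a (3, 1) prediction and a (3,) target would broadcast against each other inside the loss instead of being compared element by element.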

Validate the graph fields

Before training, check the required tensor shapes and dtypes.

def validate_graph(data: Data) -> None:
    assert data.x.dim() == 2
    assert data.x.dtype == torch.float32
    assert data.edge_index.shape[0] == 2
    assert data.edge_index.dtype == torch.long
    assert data.y.shape == (1,)
    assert data.y.dtype == torch.float32
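One further check that can be useful, though it is not part of validate_graph above, is that every index in edge_index refers to an existing row of x. The helper check_edge_range below is a hypothetical sketch that takes the two tensors directly, so it runs with PyTorch alone.

```python
import torch


def check_edge_range(x: torch.Tensor, edge_index: torch.Tensor) -> bool:
    """Return True when every edge endpoint is a valid row index of x."""
    if edge_index.numel() == 0:
        return True  # a graph without edges has nothing out of range
    num_nodes = x.size(0)
    return bool(edge_index.min() >= 0 and edge_index.max() < num_nodes)


x = torch.zeros((5, 3), dtype=torch.float32)
good = torch.tensor([[0, 0, 1, 1], [1, 4, 2, 3]], dtype=torch.long)
bad = torch.tensor([[0, 5], [1, 2]], dtype=torch.long)  # node 5 does not exist

print(check_edge_range(x, good))  # True
print(check_edge_range(x, bad))   # False
```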

For complete field semantics, including data.x, data.edge_index, data.edge_type, data.time_bin, and deterministic node ordering, see Graph Conversion.

Run a tiny training smoke test

Run the maintained script from the repository root:

python examples/quickstart_training.py

The training function creates a temporary output directory, trains for two epochs on the one-graph dataset, and returns one prediction.

import tempfile
from pathlib import Path

def train_and_predict(data: Data) -> float:
    with tempfile.TemporaryDirectory(prefix="phylognn_quickstart_") as temp_dir:
        model = TinyGraphRegressor(input_dim=data.x.size(1))
        config = TrainingConfig(
            epochs=2,
            batch_size=1,
            learning_rate=1e-2,
            weight_decay=0.0,
            scheduler=None,
            early_stopping_patience=None,
            save_dir=str(Path(temp_dir)),
            save_best_only=False,
            verbose=False,
        )
        trainer = Trainer(model=model, config=config)
        trainer.fit(train_dataset=[data])
        prediction = trainer.predict(dataset=[data])
    return float(prediction[0].detach().cpu().item())
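Trainer, TrainingConfig, and TinyGraphRegressor are project classes. As a rough picture of what a two-epoch smoke test exercises, here is a plain PyTorch sketch with a hypothetical mean-pooling regressor and random node features. It is not the project's Trainer and ignores edges entirely; it only shows the forward pass, loss, backward pass, and optimizer step running end to end.

```python
import torch
from torch import nn


class MeanPoolRegressor(nn.Module):
    """Hypothetical stand-in: average node features, then one linear layer."""

    def __init__(self, input_dim: int) -> None:
        super().__init__()
        self.linear = nn.Linear(input_dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (num_nodes, input_dim) -> (1, input_dim) -> (1, 1) -> (1,)
        return self.linear(x.mean(dim=0, keepdim=True)).view(-1)


torch.manual_seed(0)
x = torch.rand(5, 3)      # five nodes, three made-up features
y = torch.tensor([1.0])   # one graph-level target

model = MeanPoolRegressor(input_dim=x.size(1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2, weight_decay=0.0)

losses = []
for _ in range(2):  # two epochs over a one-graph dataset
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

with torch.no_grad():
    prediction = float(model(x)[0])

print(len(losses))  # 2
```

A smoke test of this kind does not aim for a good fit. It passes when the loop completes, the losses are finite, and a single float prediction comes back.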

Expected output includes stable markers like these:

Quickstart training summary
x shape:
edge_index shape:
target shape:
batch ready: true
prediction:
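If you capture the script output in a test, a small helper can confirm that the markers are present. The sketch below is hypothetical and assumes each marker starts its own line, as in the listing above; the `<value>` placeholders stand in for whatever the script actually prints.

```python
EXPECTED_MARKERS = [
    "Quickstart training summary",
    "x shape:",
    "edge_index shape:",
    "target shape:",
    "batch ready: true",
    "prediction:",
]


def missing_markers(output: str, markers=EXPECTED_MARKERS):
    """Return the markers that no output line starts with."""
    lines = [line.strip() for line in output.splitlines()]
    return [
        marker
        for marker in markers
        if not any(line.startswith(marker) for line in lines)
    ]


# Placeholder output; real values follow each colon when the script runs.
sample = "\n".join([
    "Quickstart training summary",
    "x shape: <value>",
    "edge_index shape: <value>",
    "target shape: <value>",
    "batch ready: true",
    "prediction: <value>",
])

print(missing_markers(sample))  # []
print(missing_markers("edge_index shape: <value>"))
```

Matching on line prefixes rather than plain substrings matters here because "x shape:" is itself a substring of "edge_index shape:", so a substring check would report it as present even when that line is missing.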

Completion summary

At this point you have created a tree, attached deterministic features, converted it to graph data, validated the required fields, trained a tiny model, and printed a prediction.

Next steps

| Need | Go to |
| --- | --- |
| Prepare real trees and features | User Guide |
| Understand graph fields | Graph Conversion |
| Configure datasets, splits, and TOML training | Training Configuration |
| Run complete scripts | Examples |