tensor-group-sym

Implementation of the ★_G tensor algebra for group-equivariant tensor decompositions and machine learning.

Repository Structure

tensor-group-sym/
├── setup_paths.m                  MATLAB path setup (run once per session)
│
├── core/                          Core algebra
│   ├── StarGAlgebra.m             ★_G algebra class (groups, convolution tensor, generalized Fourier, SVD)
│   ├── NeuralStarGFramework.m     Neural network layers built on ★_G
│   └── extractStarGFeatures.m     Invariant features via generalized Fourier + ★_G-SVD
│
├── experiments/                   Reproduce paper results
│   ├── run_main_comparison.m      Table 1: ★_G-SVD vs Neural ★_G vs MLPs (synthetic)
│   ├── run_invariance_demo.m      Rotation invariance demonstration (synthetic)
│   ├── run_qm9_nature.m           ** Nature pipeline: QM9 + symmetry discovery **
│   ├── run_product_group.m        ** Product group Z_n1 x Z_n2 experiment **
│   ├── QM9_experiment.m           QM9 data loading, feature computation, 5-method comparison
│   ├── product_group_experiment.m Dual-axis rotation, 8-method comparison + factorization discovery
│   ├── symmetry_discovery.m       Discover best group from data (score landscape)
│   ├── diagnose_ridge.m           Quick diagnostic for debugging
│   ├── starG_helpers.m            Utility functions (R², plotting)
│   ├── starG_methods.m            Experiment method wrappers
│   └── starG_mlp.m                MLP training (Adam optimizer)
│
├── tests/                         Verification
│   ├── StarGTestSuite.m           Comprehensive algebra tests
│   ├── test_neural_starG.m        Neural framework tests
│   └── run_tests.m                Test runner
│
├── python/                        Python implementation
│   ├── StarGAlgebra.py            Python port of core algebra (NumPy/SciPy/CuPy)
│   ├── NeuralStarGFramework.py    Python port of neural framework
│   ├── starG_helpers.py           Helper utilities
│   └── large_scale/               GPU-ready PyTorch reimplementation
│       ├── starg_torch/           ★_G algebra in torch (groups, product, SVD, features, neural)
│       ├── data/                  QM9 loader and featurizers
│       ├── train_starg.py         Unified ★_G entry point (ridge | neural)
│       ├── train_baseline_*.py    MLP / SchNet / e3nn / MACE baselines
│       ├── eval_collect.py        Aggregate per-(method, target, seed) JSON results
│       └── bsub/                  IBM CCC LSF submission files (one per method)
│
├── lean/                          Lean 4 formalization (zero sorry, 5 axioms)
│   ├── StarG/{Basic,Algebra,ProductGroup,Equivariance,WignerEckart,SVD}.lean
│   ├── lakefile.lean              uses ../../mathlib4 (shared with sibling Lean projects)
│   └── lean-toolchain
│
├── latex/                         Manuscript and submission files
│   ├── main.tex                   Main paper
│   ├── supplementary.tex          Supplementary Information (algorithms, proofs, Lean status)
│   ├── cover.tex                  Cover letter
│   ├── references.bib
│   └── figures/                   All paper figures (PDF + PNG)
│
├── results/                       Saved experimental outputs
│   └── neural_vs_enn_results/     Figures from main comparison
│
└── exploratory/                   Demos and future work (not in paper)
    ├── group_irreps_demo.m        Irreps of Z_n, D_n, S_n, Q_8, SU(2), SO(3)
    ├── SO3_irrep_demo.m           SO(3) irrep demo
    ├── LatticeQCDAlgebra.m        SU(3) gauge field extension
    ├── RealWorldDemos.m           Lattice QCD and QM9 demo class
    └── qm9_*.m, load_qm9_*.m     QM9 molecular benchmark scripts

Quick Start

% 1. Add paths
run('setup_paths.m');

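% 2. Create a group algebra (each call overwrites G; keep only the one you need)
G = StarGAlgebra('cyclic', 12);       % Z_12
G = StarGAlgebra('dihedral', 6);      % D_6
G = StarGAlgebra('symmetric', 3);     % S_3

% 3. Compute the ★_G product of two tensors A and B
%    (A and B are your own data; they are not defined in this snippet)
C = G.starG(A, B);

% 4. Compute the ★_G-SVD of A
[U, S, V] = G.starG_SVD(A);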

% 5. Run tests
run('tests/run_tests.m');
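
For intuition about step 3, the sketch below works through a ★_G-style product for the cyclic group Z_n. It is written in plain Python (standard library only) so it runs without MATLAB. It assumes that for cyclic G the product is the group convolution C(g) = sum over h of A(h) B(g - h) of matrix-valued functions on Z_n, in the spirit of the t-product of Kilmer et al. The names matmul, matadd, star_cyclic, and identity_tensor are hypothetical and are not part of StarGAlgebra, which may use a different tensor layout or normalization.

```python
# Illustrative sketch only: NOT the StarGAlgebra implementation.
# Assumption: a tensor is stored as a list of n slices, one matrix per element
# of Z_n, and the product is group convolution over Z_n.

def matmul(X, Y):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    """Entry-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def star_cyclic(A, B):
    """C[g] = sum_h A[h] @ B[(g - h) mod n] for every g in Z_n."""
    n = len(A)
    rows, cols = len(A[0]), len(B[0][0])
    C = []
    for g in range(n):
        acc = [[0] * cols for _ in range(rows)]
        for h in range(n):
            acc = matadd(acc, matmul(A[h], B[(g - h) % n]))
        C.append(acc)
    return C

def identity_tensor(n, m):
    """Unit of the product: the m x m identity at g = 0, zero slices elsewhere."""
    eye = [[1 if i == j else 0 for j in range(m)] for i in range(m)]
    zero = [[0] * m for _ in range(m)]
    return [eye] + [zero for _ in range(n - 1)]

# Two small tensors with 2 x 2 slices over Z_3.
A = [[[1, 2], [0, 1]], [[0, 1], [1, 0]], [[2, 0], [0, 2]]]
B = [[[1, 0], [0, 1]], [[1, 1], [0, 0]], [[0, 0], [1, 1]]]

C = star_cyclic(A, B)
I = identity_tensor(3, 2)
```

Under this assumption the product is associative and has the unit I, so star_cyclic(A, I) returns A. These are the kinds of algebraic identities a test suite for such an algebra would typically check.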

Nature Paper Pipeline

The full pipeline for real-data experiments (QM9 + symmetry discovery):

run('setup_paths.m');

% With synthetic molecules (no external data needed):
run_qm9_nature

% With real QM9 data (download .xyz files from quantum-machine.org):
run_qm9_nature('qm9_dir', '/path/to/qm9/xyz', 'n_molecules', 5000)

% Skip the slow learned-algebra step:
run_qm9_nature('skip_part', 3)

This produces:

Part 1: 5-method comparison table (★_G-SVD, Neural ★_G, Standard MLP, Invariant MLP, Augmented MLP)
Part 2: Symmetry discovery score landscape over candidate groups
Part 3: Learned Fourier matrix and core tensor from data (experimental)

Results are saved to results/qm9_nature/.
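
To make the idea of a score landscape over candidate groups concrete, here is a toy sketch in plain Python. It is not the method used in symmetry_discovery.m. It assumes a hypothetical score: the fraction of a signal's energy that survives averaging over the cyclic group Z_n acting by rotation. The names symmetrize and score, the test signal, and the selection rule are all invented for illustration.

```python
# Toy illustration only: a hypothetical score, NOT the one in symmetry_discovery.m.
import math

def symmetrize(x, n):
    """Average x over the n rotations by multiples of len(x)/n (n must divide len(x))."""
    N = len(x)
    step = N // n
    return [sum(x[(i + k * step) % N] for k in range(n)) / n for i in range(N)]

def score(x, n):
    """Fraction of the signal's energy kept by Z_n averaging (1.0 means fully symmetric)."""
    p = symmetrize(x, n)
    return sum(v * v for v in p) / sum(v * v for v in x)

N = 24
# Signal on 24 angular samples with exact 3-fold rotational symmetry.
x = [1.0 + math.cos(3 * 2 * math.pi * i / N) for i in range(N)]

# Candidate groups: every Z_n whose order divides the number of samples.
candidates = [n for n in range(1, N + 1) if N % n == 0]
landscape = {n: score(x, n) for n in candidates}

# Hypothetical selection rule: the largest group that loses no energy.
best = max(n for n in candidates if landscape[n] > 1 - 1e-9)
```

For this signal the landscape equals 1 at n = 1 and n = 3 and drops to 2/3 for every other candidate, so the rule picks Z_3. A score based on regression accuracy would produce a landscape of the same shape: one value per candidate group, with the best-fitting group standing out.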

Product Group Experiment

The key theoretical advantage of ★_G is that it composes multiple symmetries.

run('setup_paths.m');

% Synthetic: Z_6 x Z_4 (rotation about two axes)
run_product_group

% Real QM9 data:
run_product_group('qm9_dir', '/path/to/xyz/', 'n_molecules', 1000)

% Different group sizes:
run_product_group('n1', 8, 'n2', 3)

This runs:

Part 1: 8-method comparison (product group vs each factor alone vs wrong cyclic vs baselines)
Part 2: Factorization discovery (which Z_a x Z_b decomposition best fits the data?)

Results are saved to results/product_group/.
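
The short Python sketch below shows why a product group such as Z_6 x Z_4 is a different object from a single cyclic group of the same order, and how candidate Z_a x Z_b factorizations of a given order can be enumerated. It relies only on the standard fact that Z_a x Z_b is cyclic exactly when gcd(a, b) = 1. The function names are hypothetical and the code is not taken from product_group_experiment.m.

```python
# Illustrative sketch of the product group Z_n1 x Z_n2; not code from the repository.
from math import gcd
from itertools import product

def compose(g, h, n1, n2):
    """Component-wise group law on Z_n1 x Z_n2."""
    return ((g[0] + h[0]) % n1, (g[1] + h[1]) % n2)

def element_order(g, n1, n2):
    """Smallest k >= 1 such that composing g with itself k times gives the identity."""
    k, acc = 1, g
    while acc != (0, 0):
        acc = compose(acc, g, n1, n2)
        k += 1
    return k

def max_order(n1, n2):
    """Largest element order in Z_n1 x Z_n2 (equals lcm(n1, n2))."""
    return max(element_order(g, n1, n2) for g in product(range(n1), range(n2)))

def factorizations(N):
    """All (a, b) with a * b == N and a <= b: candidate Z_a x Z_b decompositions."""
    return [(a, N // a) for a in range(1, int(N ** 0.5) + 1) if N % a == 0]

# Z_6 x Z_4 has 24 elements, but none of order 24, so it is not Z_24.
largest = max_order(6, 4)            # 12

# Which factorizations of order 24 are secretly cyclic?
pairs = factorizations(24)           # [(1, 24), (2, 12), (3, 8), (4, 6)]
is_cyclic = {ab: gcd(*ab) == 1 for ab in pairs}
```

Here is_cyclic marks (1, 24) and (3, 8) as cyclic and (2, 12) and (4, 6) as genuinely two-axis groups, which is the distinction a factorization search over Z_a x Z_b has to resolve.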

References

Kilmer et al., "Tensor-Tensor Algebra for Optimal Representation and Compression of Multiway Data," PNAS, 2021.
Huh, "Discovering Abstract Symbolic Relations by Learning Unitary Group Representations," 2024.