# tensor-group-sym

Implementation of the ★_G tensor algebra for group-equivariant tensor decompositions and machine learning.

## Repository Structure

```
tensor-group-sym/
├── setup_paths.m                   MATLAB path setup (run once per session)
│
├── core/                           Core algebra
│   ├── StarGAlgebra.m              ★_G algebra class (groups, convolution tensor, generalized Fourier, SVD)
│   ├── NeuralStarGFramework.m      Neural network layers built on ★_G
│   └── extractStarGFeatures.m      Invariant features via generalized Fourier + ★_G-SVD
│
├── experiments/                    Reproduce paper results
│   ├── run_main_comparison.m       Table 1: ★_G-SVD vs Neural ★_G vs MLPs (synthetic)
│   ├── run_invariance_demo.m       Rotation invariance demonstration (synthetic)
│   ├── run_qm9_nature.m            ** Nature pipeline: QM9 + symmetry discovery **
│   ├── run_product_group.m         ** Product group Z_n1 x Z_n2 experiment **
│   ├── QM9_experiment.m            QM9 data loading, feature computation, 5-method comparison
│   ├── product_group_experiment.m  Dual-axis rotation, 8-method comparison + factorization discovery
│   ├── symmetry_discovery.m        Discover best group from data (score landscape)
│   ├── diagnose_ridge.m            Quick diagnostic for debugging
│   ├── starG_helpers.m             Utility functions (R², plotting)
│   ├── starG_methods.m             Experiment method wrappers
│   └── starG_mlp.m                 MLP training (Adam optimizer)
│
├── tests/                          Verification
│   ├── StarGTestSuite.m            Comprehensive algebra tests
│   ├── test_neural_starG.m         Neural framework tests
│   └── run_tests.m                 Test runner
│
├── python/                         Python implementation
│   ├── StarGAlgebra.py             Python port of core algebra (NumPy/SciPy/CuPy)
│   ├── NeuralStarGFramework.py     Python port of neural framework
│   ├── starG_helpers.py            Helper utilities
│   └── large_scale/                GPU-ready PyTorch reimplementation
│       ├── starg_torch/            ★_G algebra in torch (groups, product, SVD, features, neural)
│       ├── data/                   QM9 loader and featurizers
│       ├── train_starg.py          Unified ★_G entry point (ridge | neural)
│       ├── train_baseline_*.py     MLP / SchNet / e3nn / MACE baselines
│       ├── eval_collect.py         Aggregate per-(method, target, seed) JSON results
│       └── bsub/                   IBM CCC LSF submission files (one per method)
│
├── lean/                           Lean 4 formalization (zero sorry, 5 axioms)
│   ├── StarG/{Basic,Algebra,ProductGroup,Equivariance,WignerEckart,SVD}.lean
│   ├── lakefile.lean               uses ../../mathlib4 (shared with sibling Lean projects)
│   └── lean-toolchain
│
├── latex/                          Manuscript and submission files
│   ├── main.tex                    Main paper
│   ├── supplementary.tex           Supplementary Information (algorithms, proofs, Lean status)
│   ├── cover.tex                   Cover letter
│   ├── references.bib
│   └── figures/                    All paper figures (PDF + PNG)
│
├── results/                        Saved experimental outputs
│   └── neural_vs_enn_results/      Figures from main comparison
│
└── exploratory/                    Demos and future work (not in paper)
    ├── group_irreps_demo.m         Irreps of Z_n, D_n, S_n, Q_8, SU(2), SO(3)
    ├── SO3_irrep_demo.m            SO(3) irrep demo
    ├── LatticeQCDAlgebra.m         SU(3) gauge field extension
    ├── RealWorldDemos.m            Lattice QCD and QM9 demo class
    └── qm9_*.m, load_qm9_*.m       QM9 molecular benchmark scripts
```

## Quick Start

```matlab
% 1. Add paths
run('setup_paths.m');

% 2. Create a group algebra
G = StarGAlgebra('cyclic', 12);     % Z_12
G = StarGAlgebra('dihedral', 6);    % D_6
G = StarGAlgebra('symmetric', 3);   % S_3

% 3. Compute ★_G product
C = G.starG(A, B);

% 4. ★_G-SVD
[U, S, V] = G.starG_SVD(A);

% 5. Run tests
run('tests/run_tests.m');
```

## Nature Paper Pipeline

The full pipeline for real-data experiments (QM9 + symmetry discovery):

```matlab
run('setup_paths.m');

% With synthetic molecules (no external data needed):
run_qm9_nature

% With real QM9 data (download .xyz files from quantum-machine.org):
run_qm9_nature('qm9_dir', '/path/to/qm9/xyz', 'n_molecules', 5000)

% Skip the slow learned-algebra step:
run_qm9_nature('skip_part', 3)
```

This produces:

- **Part 1**: 5-method comparison table (★_G-SVD, Neural ★_G, Standard MLP, Invariant MLP, Augmented MLP)
- **Part 2**: Symmetry discovery score landscape over candidate groups
- **Part 3**: Learned Fourier matrix and core tensor from data (experimental)

Results are saved to `results/qm9_nature/`.

## Product Group Experiment

The key theoretical advantage of ★_G: composing multiple symmetries.

```matlab
run('setup_paths.m');

% Synthetic: Z_6 x Z_4 (rotation about two axes)
run_product_group

% Real QM9 data:
run_product_group('qm9_dir', '/path/to/xyz/', 'n_molecules', 1000)

% Different group sizes:
run_product_group('n1', 8, 'n2', 3)
```

This runs:

- **Part 1**: 8-method comparison (product group vs each factor alone vs wrong cyclic vs baselines)
- **Part 2**: Factorization discovery (which Z_a x Z_b decomposition best fits the data?)

Results in `results/product_group/`.

## References

- Kilmer et al., "Tensor-Tensor Algebra for Optimal Representation and Compression of Multiway Data," PNAS, 2021.
- Huh, "Discovering Abstract Symbolic Relations by Learning Unitary Group Representations," 2024.
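
## Illustrative Sketch: Cyclic ★_G Product

The Quick Start calls `G.starG(A, B)` without showing what the product computes. The sketch below is not the repository's implementation. It assumes that, for the cyclic group Z_n, the ★_G product of two third-order tensors is the group-algebra (circular) convolution along the third mode, in the spirit of the tensor-tensor products of Kilmer et al. All function names here (`cyclic_star_product`, `dft_tubes`, `cyclic_star_product_fourier`) are hypothetical, and tensors are plain nested lists indexed as `T[row][col][group_element]`. It also checks that the same product can be computed in the Fourier domain, where it becomes one ordinary matrix product per frequency.

```python
import cmath


def cyclic_star_product(A, B, n):
    """Direct product: C[i][j][g] = sum_k sum_h A[i][k][h] * B[k][j][(g - h) % n]."""
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[[0.0] * n for _ in range(cols)] for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                for g in range(n):
                    for h in range(n):
                        C[i][j][g] += A[i][k][h] * B[k][j][(g - h) % n]
    return C


def dft_tubes(T, n, sign=-1):
    """Apply a length-n DFT along the last (group) axis of a nested-list tensor."""
    w = [cmath.exp(sign * 2j * cmath.pi * m / n) for m in range(n)]
    return [[[sum(tube[h] * w[(f * h) % n] for h in range(n)) for f in range(n)]
             for tube in row] for row in T]


def cyclic_star_product_fourier(A, B, n):
    """Same product via the Fourier domain: a matrix product at each frequency."""
    Ah, Bh = dft_tubes(A, n), dft_tubes(B, n)
    rows, inner, cols = len(A), len(B), len(B[0])
    Ch = [[[sum(Ah[i][k][f] * Bh[k][j][f] for k in range(inner)) for f in range(n)]
           for j in range(cols)] for i in range(rows)]
    # Inverse DFT: positive exponent, then divide by n. Inputs are real,
    # so only the real part of the result is kept.
    Cinv = dft_tubes(Ch, n, sign=+1)
    return [[[(v / n).real for v in tube] for tube in row] for row in Cinv]


# Small 2 x 2 x 4 example over Z_4 (values chosen arbitrarily for illustration).
n = 4
A = [[[1.0, 2.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
     [[0.0, 0.0, 3.0, 0.0], [1.0, 0.0, 0.0, 1.0]]]
B = [[[0.5, 0.0, 1.0, 0.0], [2.0, 0.0, 0.0, 0.0]],
     [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 4.0]]]

# Identity tensor: identity matrix in the slice of the group identity, zeros elsewhere.
identity = [[[1.0 if (i == j and g == 0) else 0.0 for g in range(n)]
             for j in range(2)] for i in range(2)]

C_direct = cyclic_star_product(A, B, n)
C_fourier = cyclic_star_product_fourier(A, B, n)
max_diff = max(abs(C_direct[i][j][g] - C_fourier[i][j][g])
               for i in range(2) for j in range(2) for g in range(n))
print("max |direct - fourier| =", max_diff)
```

For non-abelian groups such as D_n or S_n the plain DFT no longer applies. The repository's description mentions a generalized Fourier transform for that case, which this sketch does not attempt to reproduce.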