Contributing to the ★_G large-scale subtree

The ★_G algebra has a specific mathematical contract that distinguishes it from equivariant neural networks (ENNs). Code in this directory must preserve that contract; otherwise it is no longer ★_G, just another ENN. The four checks below are mandatory before any code change lands.

Pre-merge checklist

For every code change in this directory, answer all four:

1. Is the feature tensor shape (n_feat, |G|) per molecule, with no atom dimension?

★_G operates on a molecule-level summary that has been measured at every group element. The featurizer's output for one molecule is (n_feat, n_g), never (n_atoms, n_feat, n_g) or anything atom-keyed.

If your change introduces a per-atom dimension, you are reinventing a GNN: the comparison stops being "structure-vs-structure on identical input" and becomes a featurization comparison. Stop. If you genuinely want atom-level information, accept the change in scope and update the manuscript's Section 2.7 to reflect that the feature contract has changed.
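The shape contract in check 1 can be enforced with a small guard at the featurizer boundary. The following is a minimal sketch, not code from this repository: `check_feature_shape` is a hypothetical helper name, and the values 14 and 24 in the usage note are only illustrative (14 matches the existing feature count mentioned in this file; 24 is an arbitrary group order).

```python
def check_feature_shape(shape, n_g):
    """Return the shape as a tuple if it is (n_feat, n_g); raise ValueError otherwise.

    `shape` is anything tuple-like, e.g. `feat.shape` from a tensor or array.
    `n_g` is the group order |G|.
    """
    shape = tuple(shape)
    if len(shape) != 2:
        # A third axis is the typical symptom of an atom dimension creeping in.
        raise ValueError(
            f"expected a 2-D (n_feat, n_g) tensor per molecule, got shape {shape}"
        )
    if shape[1] != n_g:
        raise ValueError(
            f"last axis must have length |G| = {n_g}, got shape {shape}"
        )
    return shape
```

Calling `check_feature_shape(feat.shape, 24)` on a `(14, 24)` tensor returns `(14, 24)`, while an atom-keyed `(9, 14, 24)` tensor raises `ValueError`.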

2. Is the activation function ReLU/identity (not gated, not per-irrep)?

Neural ★_G layers use plain ReLU on hidden activations and a linear output. We do not add per-irrep gating, sigmoid gates on T$_1$ channels, or any equivariant non-linearity that mixes information between irreps in a learned way.

If your change introduces a learned per-irrep gate, you are reinventing e3nn-style gated equivariant networks. Stop.

3. Is the change free of any nn.Parameter indexed per irrep, per group element, or per subgroup-chain level?

★_G layer weights are flat (n_out, n_in, |G|) tensors. There is no learned per-irrep weight, no per-Clebsch-Gordan coefficient, no multi-channel multiplicity index that the model adjusts during training.

If your change introduces such a parameter, you are reinventing MACE-style irreducible-channel attention. Stop.
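As a coarse review aid for check 3, one can list every parameter whose shape is not the flat (n_out, n_in, |G|) layout described above. This is a sketch under assumptions: `nonconforming_parameters` is a hypothetical helper, and a flagged entry is not automatically a violation (this file does not say whether layers carry biases, and regression-head parameters would also be flagged). Shape alone also cannot prove that a conforming parameter is free of per-irrep structure, so the output is a list to inspect, not a verdict.

```python
def nonconforming_parameters(named_shapes, n_g):
    """Return (name, shape) pairs whose shape is not (n_out, n_in, n_g).

    `named_shapes` is an iterable of (name, shape) pairs; `n_g` is |G|.
    """
    flagged = []
    for name, shape in named_shapes:
        shape = tuple(shape)
        # Conforming weights are 3-D with the group axis last.
        if len(shape) != 3 or shape[-1] != n_g:
            flagged.append((name, shape))
    return flagged
```

With PyTorch, the input could be built as `[(n, p.shape) for n, p in layer.named_parameters()]`.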

4. Is the change free of any learnable aggregation/pooling step over an atom or edge dimension?

The only aggregation in Neural ★_G is the final invariant pool: average across the group dimension and the feature dimension to produce a scalar prediction. This is not learnable; it is a fixed average. There is no learned attention, no message passing, no edge weighting.

If your change introduces a learnable aggregation over atoms or edges, you are reinventing a graph neural network. Stop.
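The fixed pool in check 4 is small enough to write out. Below is a framework-free sketch using nested lists for a (n_feat, |G|) array; `invariant_pool` and `permute_group_axis` are hypothetical names. The invariance demonstration assumes that a group element acts on the features by permuting the group axis, which this file does not state explicitly.

```python
import math


def invariant_pool(h):
    """Fixed (non-learnable) mean over the feature axis and the group axis.

    `h` is a list of n_feat rows, each holding n_g values.
    """
    n_feat = len(h)
    n_g = len(h[0])
    # math.fsum is correctly rounded, so the result does not depend on order.
    total = math.fsum(v for row in h for v in row)
    return total / (n_feat * n_g)


def permute_group_axis(h, perm):
    """Reorder the group axis of every feature row by the index list `perm`."""
    return [[row[j] for j in perm] for row in h]
```

Because the pool has no parameters and sums over the whole group axis, `invariant_pool(permute_group_axis(h, perm))` equals `invariant_pool(h)` for any permutation. In PyTorch the same reduction would be `h.mean(dim=(-2, -1))`.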


What goes in if all four pass

  • New featurizers that produce (n_feat, |G|) molecule-level summaries (e.g. Coulomb-matrix eigenvalues, sorted distance histograms, spherical-harmonic expansions of charge density). Append rows to the existing (14, |G|) tensor; do not add new dimensions.
  • New finite groups in starg_torch/algebra.py. Provide the multiplication table, generalized Fourier matrix, and irrep dimensions; the existing starg_product and starg_svd will consume them.
  • New downstream regression heads (deeper MLPs, Gaussian processes, symbolic regression on the invariant features). The head sees invariant features only; it does not interact with the group axis.
  • New analysis scripts (per-irrep R² decomposition, isomer audits, Pareto-figure generation) that consume existing trained models.
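When contributing a new finite group, it helps to validate the multiplication table before handing it to the algebra code. The sketch below assumes elements are labeled 0..n-1 and the table is a nested list with `table[a][b]` equal to the index of the product a·b; the format actually expected by starg_torch/algebra.py is not shown in this file, so treat the layout and the helper names `cyclic_table` and `is_group_table` as hypothetical.

```python
from itertools import product


def cyclic_table(n):
    """Multiplication table of the cyclic group Z_n (addition modulo n)."""
    return [[(a + b) % n for b in range(n)] for a in range(n)]


def is_group_table(table):
    """Check closure, a unique two-sided identity, inverses, and associativity."""
    n = len(table)
    elems = range(n)
    # Closure: square table whose entries are all valid element indices.
    if any(len(row) != n for row in table):
        return False
    if any(x not in elems for row in table for x in row):
        return False
    # Identity: exactly one element e with e*a == a == a*e for all a.
    identities = [
        e for e in elems
        if all(table[e][a] == a and table[a][e] == a for a in elems)
    ]
    if len(identities) != 1:
        return False
    e = identities[0]
    # Inverses: every a needs some b with a*b == e.
    if any(all(table[a][b] != e for b in elems) for a in elems):
        return False
    # Associativity: (a*b)*c == a*(b*c); O(n^3), fine for small finite groups.
    return all(
        table[table[a][b]][c] == table[a][table[b][c]]
        for a, b, c in product(elems, repeat=3)
    )
```

For example, `is_group_table(cyclic_table(4))` holds, while the table `[[0, 1], [1, 1]]` is rejected because element 1 has no inverse.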

What stays out

  • nn.Module subclasses with forward(self, atomic_graph) signatures.
  • Anything that imports from e3nn.nn, e3nn.o3.experimental, or mace.modules for use inside a ★_G layer.
  • Test-time data augmentation, ensembling over rotations, or any other workaround that "fixes" insufficient input information by averaging. ★_G features are designed to be invariant by construction; if a property requires atom-level information, the right answer is to acknowledge the input contract and not regress that property here.
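The import rule above lends itself to a mechanical first pass. The following sketch uses only the standard-library `ast` module to flag imports from the three module paths named in the list; `forbidden_imports` is a hypothetical helper, not an existing script in this repository. It is deliberately coarser than the rule itself: it flags any such import in a file, whereas the rule concerns use inside a ★_G layer, so hits still need human review.

```python
import ast

# Module paths taken from the "What stays out" list above.
FORBIDDEN_PREFIXES = ("e3nn.nn", "e3nn.o3.experimental", "mace.modules")


def _is_forbidden(module):
    """True if `module` is a forbidden path or a submodule of one."""
    return any(module == p or module.startswith(p + ".") for p in FORBIDDEN_PREFIXES)


def forbidden_imports(source):
    """Return the sorted forbidden module names imported by Python source text."""
    hits = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # `from e3nn import nn` names the submodule through the alias,
            # so check both the base module and base.alias.
            names = [node.module] + [f"{node.module}.{a.name}" for a in node.names]
        else:
            continue
        hits.update(name for name in names if _is_forbidden(name))
    return sorted(hits)
```

A pre-merge script could run this over each changed file's text, for example `forbidden_imports(open(path).read())`, and fail when the returned list is non-empty.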

Why the discipline matters

The paper's claim is algebraic equivariance as an alternative to architectural equivariance, not "★_G with ENN-style improvements." Once you start adding per-irrep parameters or learnable aggregations, the comparison "★_G beats MLP on identical input" stops being clean, because you've changed what counts as ★_G.

The four checks above are how we keep the empirical claim in Section 2.7 honest. They are also how we keep the manuscript's title honest: this paper is about recovering Wigner-Eckart selection rules from data via the algebra, not about beating MACE on R². If we drift into reinventing MACE, we should rename the paper.