# Contributing to the ★_G large-scale subtree

The ★_G algebra has a specific mathematical contract that distinguishes
it from equivariant neural networks (ENNs). Code in this directory must
preserve that contract or it isn't ★_G anymore, it's just another
ENN. **The four checks below are mandatory before any code change
lands**.

## Pre-merge checklist

For every code change in this directory, answer all four:

### 1. Is the feature tensor shape `(n_feat, |G|)` per molecule, with **no** atom dimension?

★_G operates on a **molecule-level summary** that has been measured at
every group element. The featurizer's output for one molecule is
`(n_feat, n_g)`, never `(n_atoms, n_feat, n_g)` or anything atom-keyed.

If your change introduces a per-atom dimension, you are reinventing a
GNN, the comparison stops being "structure-vs-structure on identical
input" and becomes a featurization comparison. **Stop.** If you
genuinely want atom-level information, accept the change in scope and
update the manuscript's Section 2.7 to reflect that the feature contract
has changed.

### 2. Is the activation function ReLU/identity (not gated, not per-irrep)?

Neural ★_G layers use plain ReLU on hidden activations and a linear
output. We do **not** add per-irrep gating, sigmoid gates on T$_1$
channels, or any equivariant non-linearity that mixes information
between irreps in a learned way.

If your change introduces a learned per-irrep gate, you are reinventing
e3nn-style gated equivariant networks. **Stop.**

### 3. Is there any `nn.Parameter` indexed per irrep, per group element, or per subgroup-chain level?

★_G layer weights are flat `(n_out, n_in, |G|)` tensors. There is no
learned per-irrep weight, no per-Clebsch-Gordan coefficient, no
multi-channel multiplicity index that the model adjusts during training.

If your change introduces such a parameter, you are reinventing MACE-
style irreducible-channel attention. **Stop.**

### 4. Is there any aggregation/pooling step that is learnable over an atom or edge dimension?

The only aggregation in Neural ★_G is the final invariant pool: average
across the group dimension and the feature dimension to produce a scalar
prediction. This is **not learnable**, it is a fixed average. There
is no learned attention, no message-passing, no edge weighting.

If your change introduces a learnable aggregation over atoms or edges,
you are reinventing a graph neural network. **Stop.**

---

## What goes in if all four pass

- New featurizers that produce `(n_feat, |G|)` molecule-level summaries
  (e.g. Coulomb-matrix eigenvalues, sorted distance histograms,
  spherical-harmonic expansions of charge density). Append rows to
  the existing `(14, |G|)` tensor; do not add new dimensions.
- New finite groups in `starg_torch/algebra.py`. Provide the
  multiplication table, generalized Fourier matrix, and irrep
  dimensions; the existing `starg_product` and `starg_svd` will
  consume them.
- New downstream regression heads (deeper MLPs, Gaussian processes,
  symbolic regression on the invariant features). The head sees
  invariant features only; it does not interact with the group axis.
- New analysis scripts (per-irrep R² decomposition, isomer audits,
  Pareto-figure generation) that consume existing trained models.

## What stays out

- `nn.Module` subclasses with `forward(self, atomic_graph)` signatures.
- Anything that imports from `e3nn.nn`, `e3nn.o3.experimental`, or
  `mace.modules` for use inside a ★_G layer.
- Test-time data augmentation, ensembling over rotations, or any other
  workaround that "fixes" insufficient input information by averaging.
  ★_G features are designed to be invariant by construction; if a
  property requires atom-level information, the right answer is to
  acknowledge the input contract and not regress that property here.

## Why the discipline matters

The paper's claim is *algebraic equivariance as an alternative to
architectural equivariance*, not "★_G with ENN-style improvements."
Once you start adding per-irrep parameters or learnable aggregations,
the comparison "★_G beats MLP on identical input" stops being clean , 
because you've changed what counts as ★_G.

The four checks above are how we keep the empirical claim in
Section 2.7 honest. They are also how we keep the manuscript's title
honest: this paper is about recovering Wigner--Eckart selection rules
from data via the algebra, not about beating MACE on R². If we drift
into reinventing MACE, we should rename the paper.