\documentclass[11pt]{letter}
\usepackage[top=0.95in,bottom=0.95in,left=1in,right=1in]{geometry}
\usepackage{hyperref,amssymb}
\usepackage{enumitem}

% Letter-class boilerplate eats more vertical space than is necessary
% for a two-page cover. Tighten the open/close blocks and zero the
% block-quoted longindentation, but keep paragraph spacing relaxed so
% the page reads at normal breathing room.
\setlength{\parskip}{6pt}
\setlength{\longindentation}{0pt}
\renewcommand{\opening}[1]{\noindent #1\par\smallskip}
\renewcommand{\closing}[1]{\par\medskip\noindent #1\par\smallskip \noindent\fromsig\par}

\signature{Lior Horesh \\ (on behalf of all authors)}
\address{IBM Research \\ Yorktown Heights, NY, USA}

\begin{document}
\begin{letter}{} %{The Editors \\ \textit{Nature}}

\opening{Dear Editors,}

We submit for your consideration our manuscript entitled
\textbf{``Group-Algebraic Tensors: Provably-optimal Equivariant Learning and Physical Symmetry Discovery.''}

The dominant approach to incorporating symmetry in machine learning relies on equivariant neural networks (ENNs), which embed each symmetry into the network architecture. While powerful, this paradigm requires bespoke engineering for every new symmetry, offers no optimality guarantees, and cannot compose multiple symmetries without redesigning the entire architecture. We propose a fundamentally different approach: rather than constraining the architecture, we change the algebra.

The paper introduces the $\star_G$ tensor algebra, in which any finite group $G$ defines the multiplication rule. Our main contributions are:
\begin{enumerate}[leftmargin=1.8em,itemsep=4pt,topsep=4pt,parsep=2pt]
\item \textbf{Provably optimal decomposition.} We prove that the $\star_G$-SVD achieves Eckart--Young optimality, the first such result for symmetry-preserving tensor approximation.
\item \textbf{Compositional symmetry.} Product groups compose via Kronecker factorization of the Fourier transform with no architectural changes.
We demonstrate that $\mathbb{Z}_6 \times \mathbb{Z}_4$ achieves $R^2 = 1.000$ while each factor alone recovers at most 23\%.
\item \textbf{Physical symmetry discovery from data alone.} Without any quantum-mechanical theory as input, we decompose QM9 molecular geometry over the octahedral group and recover the Wigner--Eckart selection rules governing angular momentum coupling: the T$_1$/A$_1$ predictive-power ratio is $5\times$ larger for vector than scalar observables, and the isotropic polarizability is uniquely insensitive to the $l\!=\!1$ channel, exactly as the representation-theoretic decomposition of symmetric rank-2 tensors demands. These selection rules, cornerstones of atomic spectroscopy since 1931, emerge here as empirical consequences of an algebraic decomposition applied to molecular geometry data.
\item \textbf{Algebraic disentanglement of tensor components.} On the QM7-X tensorial polarizability benchmark, the same algebra delivers a result no equivariant neural network we tested can reproduce. Trained on a single octahedral irrep at a time, $\star_G$-SVD~+~Ridge (two parameters) achieves cross-selectivity between the $E_g$ and $T_{2g}$ components above $96\%$, while MACE, SchNet, and an e3nn-based SE(3)-equivariant network ($10^5$--$10^6$ parameters each) all sit below $1.1\%$ cross-selectivity despite achieving comparable per-component $R^2$. The algebra exposes a representation-theoretic structure that the neural architectures, by their end-to-end design, cannot.
\item \textbf{Parameter efficiency in the data-scarce regime.} On full QM9 (130{,}831 molecules), $\star_G$-SVD with ridge regression delivers $R^2 = 0.998$ on ZPVE and $R^2 = 0.909$ on isotropic polarizability with 144 parameters, comparable to MLP baselines at $20$--$40\times$ more parameters.
While equivariant neural networks achieve higher pooled $R^2$ at $10^5$--$10^6$ parameters, our within-isomer audit shows the gap is largely a size-prediction effect that vanishes once chemistry is controlled for; $\star_G$ retains its parameter-efficiency advantage in the data-scarce regime where neural training is infeasible.
\item \textbf{Machine-verified proofs.} All core algebraic results are formalized in Lean~4 (600 lines, zero unresolved goals, five standard axioms), to our knowledge the first machine-verified Eckart--Young-type optimality theorem for symmetry-preserving tensor approximation.
\end{enumerate}

Beyond the specific results reported, the framework opens capabilities that were not previously available. Because the irreducible representation decomposition reveals \emph{which} angular momentum channels carry information about \emph{which} observables, the $\star_G$ algebra functions as a symmetry spectroscope for empirical data. This makes it possible to (i)~identify the physical symmetry content of observables directly from measurements, without solving the Schr\"odinger equation or invoking any quantum-mechanical theory; (ii)~perform meaningful molecular property prediction from as few as 100 molecules, a regime relevant to rare materials, radioactive compounds, and exotic states of matter where large training sets are unavailable; and (iii)~test candidate symmetry groups against data to determine which group best describes a system, a capability that could accelerate the study of materials whose symmetries are unknown, approximate, or under debate (e.g., quasicrystals, frustrated magnets, and biological macromolecules with pseudo-symmetry).

The work bridges two classical results that share a common author in Carl Eckart: the Eckart--Young theorem (optimal low-rank matrix approximation, 1936) and the Wigner--Eckart theorem (angular momentum selection rules, 1931), unifying them through a single algebraic construction.
That the same mathematics which delivers provably optimal compression also recovers, without any physics input, the selection rules that govern atomic spectroscopy suggests that the $\star_G$ algebra captures something fundamental about how symmetry organizes physical information. We believe this connection, together with the practical demonstrations on molecular data and the machine-verified proofs, makes the paper suitable for \textit{Nature Communications'} broad readership across mathematics, physics, chemistry, and machine learning.

The manuscript has not been submitted elsewhere. All authors have reviewed the final version and approved its submission. We declare no competing interests.

We suggest the following potential referees:
\begin{itemize}[leftmargin=1.6em,itemsep=3pt,topsep=4pt,parsep=1pt]
\item \textbf{Michael W.\ Mahoney} (UC Berkeley), randomized linear algebra, matrix approximation theory, and scientific machine learning.
\item \textbf{Petros Drineas} (Purdue University), randomized matrix and tensor methods, low-rank approximation guarantees, and spectral algorithms.
\item \textbf{Lek-Heng Lim} (University of Chicago), tensor rank, tensor decomposition complexity, and algebraic aspects of multilinear approximation.
\item \textbf{Laurent Demanet} (MIT), computational harmonic analysis, Fourier methods, and spectral decomposition in scientific computing.
\item \textbf{Soledad Villar} (Johns Hopkins University), mathematical foundations of equivariant machine learning, invariant theory, and group symmetry in learning.
\item \textbf{Stefanie Jegelka} (MIT / TU Munich), geometric and combinatorial structure in machine learning, invariances, and generalization theory.
\item \textbf{Shaul Mukamel} (UC Irvine), theoretical chemical physics, nonlinear spectroscopy, and angular momentum selection rules.
\end{itemize}

\closing{Sincerely,}

\end{letter}
\end{document}