On Soft Clustering for Correlation Estimators

Edward Berman *

Northeastern University

Sneh Pandya

Northeastern University and IAIFI

Jacqueline McCleary

Northeastern University

Marko Shuntov

Cosmic Dawn Center and Niels Bohr Institute, University of Copenhagen

Caitlin Casey

Cosmic Dawn Center and University of Texas at Austin

Nicole Drakos

University of Hawaii, Hilo

Andreas Faisst

Caltech/IPAC

Steven Gillman

Cosmic Dawn Center and DTU-Space, Technical University of Denmark

Ghassem Gozaliasl

Aalto University and University of Helsinki

Natalie Hogg

Laboratoire Univers et Particules de Montpellier, CNRS & Université de Montpellier, Parvis Alexander Grothendieck, Montpellier

Jeyhan Kartaltepe

Rochester Institute of Technology

Anton Koekemoer

Space Telescope Science Institute

Wilfried Mercier

Aix Marseille University, CNRS, CNES, LAM

Diana Scognamiglio

NASA JPL

COSMOS-Web, The JWST Cosmic Origins Survey

(in prep)

*correspondence to berman.ed@northeastern.edu

Abstract

Properly estimating correlations between objects at different spatial scales necessitates $\mathcal{O}(n^2)$ distance calculations. For this reason, most widely adopted packages for estimating correlations use clustering algorithms to approximate local trends. Methods for quantifying the error introduced by clustering have been understudied. In response, we present an algorithm for estimating correlations that clusters objects probabilistically, enabling us to quantify the uncertainty caused by clustering simply through model inference. We also observe that these soft clustering assignments enable correlation estimators that are theoretically differentiable with respect to their input catalogs. We therefore build up a theoretical framework for differentiable correlation functions and describe their utility in comparison to existing surrogate models. Notably, we find that the repeated normalization and distance function calls make gradient calculations slow, and that sparsity patterns in the Jacobians propagated through the chain rule make the precision unstable, pointing towards approximate or surrogate methods as a necessary alternative to exact gradients from correlation functions. To that end, we close with a discussion of surrogate models as proxies for correlation functions. We provide an example that demonstrates the efficacy of surrogate models in enabling gradient-based optimization of astrophysical model parameters, successfully minimizing a correlation function output. Our numerical experiments cover science cases across cosmology, from point spread function (PSF) modeling efforts to gravitational simulations to intrinsic alignments (IA). We release the code used in this study at https://github.com/EdwardBerman/cosmo-corr and https://github.com/EdwardBerman/jax-cosmo-corr.

Model Uncertainty

Probabilistic clustering assignments enable us to study epistemic uncertainty in addition to aleatoric uncertainty.
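
One way to picture this is the minimal JAX sketch below. It is an illustration under stated assumptions, not the estimator in the released code: the softmax-over-distance assignment rule, the `temperature` parameter, the synthetic catalog, and the `toy_statistic` summary are all hypothetical stand-ins. The sketch draws many hard clusterings from the soft assignments and reads the spread of the resulting statistic as the uncertainty attributable to clustering.

```python
import jax
import jax.numpy as jnp

def assignment_logits(positions, centers, temperature):
    """Unnormalized log-probabilities that object i belongs to cluster k."""
    # Squared Euclidean distance from every object to every center: (n, K).
    d2 = jnp.sum((positions[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
    return -d2 / temperature

def toy_statistic(values, membership):
    """Hypothetical summary: mean product of cluster averages over distinct pairs."""
    weight = membership.sum(axis=0)
    # Empty clusters get a mean of zero instead of dividing by zero.
    means = (membership * values[:, None]).sum(axis=0) / jnp.maximum(weight, 1e-12)
    prod = jnp.outer(means, means)
    n_clusters = means.shape[0]
    return (prod.sum() - jnp.trace(prod)) / (n_clusters * (n_clusters - 1))

def sampled_statistic(key, values, logits):
    """Draw one hard clustering from the soft assignments, then summarize it."""
    labels = jax.random.categorical(key, logits, axis=1)
    membership = jax.nn.one_hot(labels, logits.shape[1])
    return toy_statistic(values, membership)

# Synthetic catalog (assumed data, not taken from the study).
k_pos, k_val, k_samp = jax.random.split(jax.random.PRNGKey(0), 3)
positions = jax.random.uniform(k_pos, (200, 2))
values = jax.random.normal(k_val, (200,))
centers = jnp.array([[0.25, 0.25], [0.25, 0.75], [0.75, 0.25], [0.75, 0.75]])

logits = assignment_logits(positions, centers, temperature=0.05)
keys = jax.random.split(k_samp, 64)
samples = jax.vmap(lambda k: sampled_statistic(k, values, logits))(keys)

# The spread across resampled clusterings is the clustering contribution.
clustering_mean, clustering_std = samples.mean(), samples.std()
```

In this sketch a bootstrap over the catalog would capture sampling noise separately, so the two error bars could be compared side by side.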

Figure: Outline of Model Uncertainty Experiment

Figure: Rho Statistic with error bars from bootstrap baseline and clustering uncertainty

Differentiability

We forward model galaxy evolution with ODEs and show how we can backpropagate a shear-shear correlation all the way through to the underlying physics. The key method is using skip gradients to pass information from the objects created by clustering to their constituent galaxies.
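
The general idea of differentiating a clustered correlation through a forward model can be sketched in JAX as follows. This is a simplified illustration and does not implement the skip-gradient method: it relies on soft assignment weights to carry gradients from cluster-level quantities back to individual objects. The toy ODE `dy/dt = -theta * y`, the forward-Euler integrator, the `temperature` parameter, and `toy_correlation` are all assumptions made for the example.

```python
import jax
import jax.numpy as jnp

def evolve(y0, theta, n_steps=50, dt=0.02):
    """Forward-Euler integration of the toy ODE dy/dt = -theta * y."""
    def step(y, _):
        return y + dt * (-theta * y), None
    y_final, _ = jax.lax.scan(step, y0, None, length=n_steps)
    return y_final

def soft_cluster_means(values, positions, centers, temperature):
    """Cluster averages weighted by differentiable soft assignments."""
    d2 = jnp.sum((positions[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
    weights = jax.nn.softmax(-d2 / temperature, axis=1)  # (n, K)
    return (weights * values[:, None]).sum(axis=0) / weights.sum(axis=0)

def toy_correlation(theta, y0, positions, centers, temperature=0.05):
    """Hypothetical statistic: mean product of cluster averages over distinct pairs."""
    values = evolve(y0, theta)
    means = soft_cluster_means(values, positions, centers, temperature)
    prod = jnp.outer(means, means)
    n_clusters = means.shape[0]
    return (prod.sum() - jnp.trace(prod)) / (n_clusters * (n_clusters - 1))

# Synthetic catalog (assumed data, not taken from the study).
k_pos, k_val = jax.random.split(jax.random.PRNGKey(1))
positions = jax.random.uniform(k_pos, (200, 2))
y0 = 1.0 + 0.1 * jax.random.normal(k_val, (200,))
centers = jnp.array([[0.25, 0.25], [0.25, 0.75], [0.75, 0.25], [0.75, 0.75]])
theta = 0.5

# Gradient with respect to the physical parameter of the forward model.
value, grad_theta = jax.value_and_grad(toy_correlation)(theta, y0, positions, centers)
# Gradient with respect to each object's initial state.
grad_y0 = jax.grad(toy_correlation, argnums=1)(theta, y0, positions, centers)
```

Because every object carries a nonzero soft weight in every cluster, `grad_y0` has an entry for each object, which is the property that hard assignments lack.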

Figure: Outline of Differentiability Experiment

Figure: Gradients in our experiment

Surrogates

Using surrogates, we have a differentiable function from the underlying physics straight to the correlation value. We leverage this differentiability to perform Hamiltonian Monte Carlo (HMC) and find posterior distributions over the IA parameters most likely to minimize a correlation function.
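
A minimal JAX sketch of HMC driven by a differentiable surrogate is given below. Everything specific in it is an assumption for illustration: `surrogate` is a closed-form placeholder for a trained model, the two-dimensional parameter vector and its `optimum` are invented, and the target density `exp(-surrogate / tau)` is one possible way to favor parameters with a small correlation value. The sampler itself is a plain leapfrog integrator with a Metropolis correction.

```python
import jax
import jax.numpy as jnp

def surrogate(params):
    """Hypothetical stand-in for a trained surrogate: parameters -> correlation."""
    optimum = jnp.array([1.0, -0.5])  # assumed location of the minimum
    return jnp.sum((params - optimum) ** 2)

def log_density(params, tau=0.05):
    """Assumed target: lower surrogate output means higher density."""
    return -surrogate(params) / tau

grad_log_density = jax.grad(log_density)

def leapfrog(q, p, step_size, n_leapfrog):
    """Leapfrog integration of Hamiltonian dynamics with unit mass."""
    p = p + 0.5 * step_size * grad_log_density(q)
    for _ in range(n_leapfrog - 1):
        q = q + step_size * p
        p = p + step_size * grad_log_density(q)
    q = q + step_size * p
    p = p + 0.5 * step_size * grad_log_density(q)
    return q, p

def hmc_step(q, key, step_size=0.05, n_leapfrog=5):
    """One HMC transition: resample momentum, integrate, accept or reject."""
    k_mom, k_acc = jax.random.split(key)
    p0 = jax.random.normal(k_mom, q.shape)
    q_new, p_new = leapfrog(q, p0, step_size, n_leapfrog)
    h_old = -log_density(q) + 0.5 * jnp.sum(p0 ** 2)
    h_new = -log_density(q_new) + 0.5 * jnp.sum(p_new ** 2)
    accept = jnp.log(jax.random.uniform(k_acc)) < h_old - h_new
    q_next = jnp.where(accept, q_new, q)
    return q_next, q_next

def run_hmc(key, q0, n_samples=2000):
    """Run the chain with lax.scan and return every visited state."""
    keys = jax.random.split(key, n_samples)
    _, chain = jax.lax.scan(hmc_step, q0, keys)
    return chain

chain = run_hmc(jax.random.PRNGKey(2), jnp.zeros(2))
posterior = chain[200:]  # drop an assumed burn-in of 200 steps
posterior_mean = posterior.mean(axis=0)
posterior_std = posterior.std(axis=0)
```

With a real surrogate, only `surrogate` would change; the step size and number of leapfrog steps here are tuned to this toy target and would need retuning.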

Figure: Outline of Surrogate Experiment

Figure: Posterior over IA parameters from HMC

BibTeX citation

  
@misc{CosmoCorr,
  author = {Edward Berman},
  title = {CosmoCorr: Cosmological Correlation Function Estimator},
  year = {2024},
  howpublished = {\url{https://github.com/EdwardBerman/CosmoCorr}},
  note = {Accessed: 2024-09-23}
}