What is NRE? Neural Ratio Estimation is the fifth inference method applied to the
Basener-Sanford mutation-selection model. While the other methods (ABC-SMC, BSL, NPE, FMPE) all
answer the question "what parameters fit the data?", NRE can also answer a deeper question:
"which model of mutation-selection best fits the data?"
How it works, for biologists: Imagine you observe a population's fitness trajectory
declining over 200 generations. You want to know the mutation rate, the distribution of fitness
effects (DFE), and the fraction of beneficial mutations. NRE approaches this by learning what makes
a particular fitness trajectory more or less probable under different parameter values. Technically,
it learns the likelihood-to-evidence ratio r(x,θ) = p(x|θ)/p(x), a measure of how much
more probable the observed data are under specific parameter values than on average over the
prior. This is subtly different from NPE/FMPE, which directly learn the posterior (which
parameters are most likely). The distinction matters because the likelihood ratio can be reused
across different models, enabling model comparison.
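The joint-versus-shuffled training trick behind this ratio can be sketched in a few lines of PyTorch. This is a minimal illustration, not the notebook's implementation: the simulator is a toy stand-in for the Basener-Sanford model, and the architecture choices (layer sizes, epoch count) are arbitrary.

```python
# Sketch of NRE's core idea: train a binary classifier to distinguish joint
# pairs (x, theta) from shuffled (marginal) pairs. At the optimum, the
# classifier's raw logit equals log r(x, theta) = log p(x|theta)/p(x).
# The simulator below is a toy stand-in, NOT the Basener-Sanford model.
import torch
import torch.nn as nn

torch.manual_seed(0)

def simulator(theta):
    # toy stand-in: x = theta + noise (real use: summaries of a fitness trajectory)
    return theta + 0.1 * torch.randn_like(theta)

n_train, dim = 5000, 5                     # five parameters, as in the model
theta = torch.rand(n_train, dim)           # draws from a uniform prior
x = simulator(theta)

classifier = nn.Sequential(
    nn.Linear(2 * dim, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),                      # raw logit = estimated log-ratio
)
opt = torch.optim.Adam(classifier.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for epoch in range(200):
    perm = torch.randperm(n_train)
    joint = torch.cat([x, theta], dim=1)            # label 1: theta that made x
    marginal = torch.cat([x, theta[perm]], dim=1)   # label 0: mismatched theta
    logits = classifier(torch.cat([joint, marginal]))
    labels = torch.cat([torch.ones(n_train, 1), torch.zeros(n_train, 1)])
    loss = bce(logits, labels)
    opt.zero_grad(); loss.backward(); opt.step()

def log_ratio(x_obs, theta):
    # estimated log r(x_obs, theta): the trained classifier's logit
    return classifier(torch.cat([x_obs.expand(theta.shape[0], -1), theta],
                                dim=1)).squeeze(-1)
```

After training, `log_ratio` can be evaluated at any (x, θ) pair without further simulation, which is exactly the property the MCMC step below exploits.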
After training, NRE draws posterior samples using MCMC (Markov Chain Monte Carlo)
— a guided random walk through the five-dimensional parameter space. This is conceptually
similar to BSL (notebook 03) but with a neural network replacing the expensive simulation-based
likelihood, making sampling orders of magnitude faster.
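The sampling step can be sketched as a random-walk Metropolis chain whose target is the learned logit plus the log-prior, since log p(θ|x) = log r(x,θ) + log p(θ) + const. Here `log_ratio_net` is a hypothetical analytic stand-in for the trained classifier, so the target is known exactly; the warmup discard mirrors the notebook's setup.

```python
# Random-walk Metropolis over the 5-D parameter space, with the trained
# classifier's logit replacing the intractable likelihood.
import numpy as np

rng = np.random.default_rng(0)
dim = 5

def log_ratio_net(theta):
    # stand-in for the trained logit log r(x_obs, theta):
    # a Gaussian bump around an arbitrary "true" parameter vector
    return -0.5 * np.sum((theta - 0.5) ** 2) / 0.05 ** 2

def log_prior(theta):
    # uniform prior on the unit hypercube
    return 0.0 if np.all((theta >= 0) & (theta <= 1)) else -np.inf

def metropolis(n_steps, step=0.05):
    theta = np.full(dim, 0.5)
    logp = log_ratio_net(theta) + log_prior(theta)
    samples = np.empty((n_steps, dim))
    for i in range(n_steps):
        prop = theta + step * rng.standard_normal(dim)
        logp_prop = log_ratio_net(prop) + log_prior(prop)
        if np.log(rng.uniform()) < logp_prop - logp:   # accept/reject
            theta, logp = prop, logp_prop
        samples[i] = theta
    return samples

chain = metropolis(5000)
posterior = chain[500:]       # discard warmup draws, as in the notebook
```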
Why Model Comparison Matters for Basener-Sanford: The original model assumes
mutations contribute independently to fitness (additive effects). But in real biology, mutations
interact — this is epistasis. Synergistic epistasis (where harmful mutations
are worse in combination) could accelerate meltdown, while antagonistic epistasis could slow it.
Kondrashov (1995) and Butcher (1995) showed that epistasis does not halt Muller's ratchet, but the
quantitative dynamics change significantly.
NRE enables Bayesian model comparison: given two competing models (e.g., additive
vs. epistatic fitness effects), the learned likelihood ratios can be combined to compute
Bayes factors — a principled measure of which model the data favor —
without retraining. This is a capability that NPE, FMPE, ABC-SMC, and BSL lack. It opens the door
to testing whether the assumptions built into the extended FTNS equation
(dm̄/dt ≈ Var(m) + μ·E_g[s]·b̄)
actually hold for real populations.
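A minimal sketch of how Bayes factors fall out of learned ratios, under the assumption that both models' ratio estimators were trained against the same reference marginal p_ref(x): then E over each model's prior of r(x,θ) equals p(x|M)/p_ref(x), and the reference cancels in the ratio. The two log-ratio functions below are toy analytic stand-ins, not trained networks.

```python
# Bayes factor from two learned likelihood-to-evidence ratios that share a
# common reference marginal: BF = E_prior1[r1(x,theta)] / E_prior2[r2(x,theta)].
import numpy as np

rng = np.random.default_rng(1)
n_prior = 20000

def log_r_additive(theta):
    # stand-in logit: pretends the data sit near theta = 0.5 under model 1
    return -0.5 * np.sum((theta - 0.5) ** 2, axis=1) / 0.1 ** 2

def log_r_epistatic(theta):
    # stand-in logit: model 2 places the data near a prior boundary, fitting worse
    return -0.5 * np.sum((theta - 0.9) ** 2, axis=1) / 0.1 ** 2

def log_evidence_ratio(log_r, theta_prior):
    # log E_prior[r(x, theta)] via a numerically stable log-sum-exp
    lr = log_r(theta_prior)
    return np.logaddexp.reduce(lr) - np.log(len(lr))

theta1 = rng.uniform(size=(n_prior, 5))   # prior draws for model 1
theta2 = rng.uniform(size=(n_prior, 5))   # prior draws for model 2
log_BF = log_evidence_ratio(log_r_additive, theta1) - \
         log_evidence_ratio(log_r_epistatic, theta2)
```

A positive log Bayes factor favors the first (additive) model; by the usual Jeffreys-style reading, values above about 3 on the log scale count as strong evidence.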
Figure 1: Marginal posteriors from NRE with MCMC sampling.
Red dashed lines show the true parameter values used to generate the synthetic fitness trajectory.
The NRE classifier was trained on 10,000 simulations drawn from the prior, then posterior samples
were drawn via MCMC (4 chains, 500 warmup steps each).
Biological interpretation: Each panel shows the range of plausible values for one
of the five mutation-selection parameters, given the observed fitness trajectory. Narrow, peaked
distributions indicate parameters that the fitness data strongly constrains. For example, if the
mutation rate (mu) posterior is narrow and centered on the true value, this means the shape of the
fitness trajectory alone is sufficient to estimate the mutation rate — a powerful result for
populations where direct mutation rate measurements are unavailable.
Technical note: Because NRE uses MCMC for sampling, the posterior comes with
standard convergence diagnostics (R-hat, effective sample size) that verify the samples are
trustworthy. NPE/FMPE, which draw independent samples directly from a learned flow, offer no
comparable convergence check.
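The R-hat diagnostic mentioned here is simple enough to sketch directly. This is the classic (non-split) Gelman-Rubin statistic, applied to synthetic stand-in chains rather than the notebook's actual MCMC output: it compares between-chain and within-chain variance, and values near 1.0 indicate the chains agree.

```python
# Gelman-Rubin R-hat over multiple chains of one parameter.
import numpy as np

rng = np.random.default_rng(2)

def r_hat(chains):
    # chains: array of shape (n_chains, n_draws), warmup already discarded
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()     # within-chain variance
    B = n * chain_means.var(ddof=1)           # between-chain variance
    var_plus = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_plus / W)

good = rng.standard_normal((4, 2000))         # four chains sampling the same target
bad = good + np.arange(4)[:, None]            # chains stuck at different modes
```

For the well-mixed chains `r_hat` sits just above 1.0; for the stuck chains it is well above the usual 1.01 warning threshold, which is how a failed run would show up in practice.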
2. NRE vs NPE Comparison
Figure 2: Head-to-head comparison of NRE and NPE posteriors.
Both methods were trained on 10,000 simulations from the same prior distribution. NRE (blue)
learns the likelihood ratio and samples via MCMC; NPE (orange) learns the posterior directly and
samples from the learned flow. Red dashed lines mark the true parameter values.
What agreement means biologically: Where the blue and orange histograms overlap
substantially, the parameter estimate is robust — it does not depend on
which neural inference strategy was used. This gives us confidence that if we applied these methods
to real fitness data from an evolving population, the inferred mutation rate, DFE parameters, and
beneficial fraction would reflect genuine biological signal, not computational artifacts.
What disagreement means biologically: Where the histograms diverge (different
peaks or different widths), the parameter is sensitive to the inference methodology.
This signals that the fitness trajectory alone may not contain enough information to pin down that
parameter. For the Basener-Sanford model, such parameters would need additional data —
direct mutation rate measurements, fitness assays of individual mutants, or longer time series
— before we could reliably determine whether the population sits in the selection-dominated
or meltdown regime.
Method differences:
- NPE (orange) learns the posterior density directly: sampling is fast, but it may exhibit
mode-covering behavior (wider distributions that hedge between multiple plausible solutions).
- NRE (blue) learns the likelihood ratio: MCMC sampling is slower, but it often produces
tighter posteriors for well-identified parameters and uniquely enables model comparison via
Bayes factors.
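One simple way to quantify the agreement described above is the overlap coefficient of the two marginal posteriors, estimated from samples via shared histogram bins (1.0 means identical distributions, 0.0 means disjoint). The two sample sets below are synthetic stand-ins for NRE and NPE draws of a single parameter.

```python
# Overlap coefficient of two sample-based marginals: the integral of
# min(p, q), approximated on a common histogram grid.
import numpy as np

rng = np.random.default_rng(3)

def overlap(samples_a, samples_b, bins=50):
    lo = min(samples_a.min(), samples_b.min())
    hi = max(samples_a.max(), samples_b.max())
    pa, _ = np.histogram(samples_a, bins=bins, range=(lo, hi), density=True)
    pb, _ = np.histogram(samples_b, bins=bins, range=(lo, hi), density=True)
    width = (hi - lo) / bins
    return np.sum(np.minimum(pa, pb)) * width

nre = rng.normal(0.0, 1.0, 10000)     # stand-in NRE marginal
npe = rng.normal(0.2, 1.2, 10000)     # stand-in NPE marginal, slightly wider
```

An overlap near 1 corresponds to the "robust parameter" case above; a low overlap flags a parameter whose estimate depends on the inference method and likely needs additional data.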