
// the find

pymc-labs/pymc-marketing

★ 1,166 · Python · Apache-2.0 · updated Jun 2026

Bayesian marketing toolbox in PyMC. Media Mix (MMM), customer lifetime value (CLV), buy-till-you-die (BTYD) models and more.

A Bayesian marketing analytics library built on top of PyMC, covering Media Mix Modeling, Customer Lifetime Value (BG/NBD, Pareto/NBD, Gamma-Gamma), and customer choice models. It's aimed at data scientists and marketing analysts who want uncertainty quantification in their marketing models rather than the point estimates produced by tools like Meta's Robyn.

- The MMM implementation is genuinely sophisticated: geometric and delayed adstock, multiple saturation functions, time-varying media contributions via GP approximations, lift test calibration, and budget optimization — all with proper posterior uncertainty rather than just optimization output.

- CLV model coverage is thorough: BG/NBD, Pareto/NBD, Modified BG/NBD, Gamma-Gamma, and shifted beta-geometric. The custom PyMC distributions (BetaGeoNBD, ParetoNBD, etc.) are real implementations of the underlying math, not wrappers around frequency tables.

- Notebook documentation is unusually dense and practical: 30+ notebooks covering migration guides, time-series cross-validation, causal identification with DAGs, and ROAS estimation under unobserved confounders. These are the questions practitioners actually ask.

- YAML config-driven model specification and model serialization to/from InferenceData mean you can version and deploy fitted models without re-running MCMC, which matters for production use.

- MCMC sampling is slow for MMM at realistic data sizes; the README mentions GPU support and alternative NUTS samplers but doesn't set expectations that even a modest dataset can take 10-30 minutes — newcomers will be confused when their first fit takes forever.

- The budget optimization code uses scipy optimizers on posterior means rather than propagating posterior uncertainty through the optimization, so the 'Bayesian' budget allocation outputs can convey false precision about optimal spend.

- The library has expanded into Bass diffusion, discrete choice (MNL, nested logit, mixed logit, MaxDiff), MVITS, and more — the scope is getting wide enough that some modules feel like proof-of-concept notebooks wrapped in a class rather than production-ready code.

- Dependency on PyMC/PyTensor means the install is heavy and version pinning is fragile; the conda-only recommendation in the README is a red flag for anyone trying to use this in a standard pip-based deployment pipeline.
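
For readers new to MMM, here is a minimal pure-Python sketch of the two transforms the first bullet mentions: geometric adstock (carryover of past spend) and a saturation curve (diminishing returns). This is not the library's code. The function names, the un-normalized recursive form of the adstock, the logistic form of the saturation, and the `alpha` and `lam` values are all assumptions chosen for illustration; the library fits such parameters as posterior distributions rather than fixing them.

```python
import math


def geometric_adstock(spend, alpha):
    """Illustrative geometric adstock (hypothetical helper, not the library API).

    Each period keeps a fraction `alpha` of the previous period's
    adstocked value: a[t] = spend[t] + alpha * a[t - 1].
    """
    adstocked = []
    carry = 0.0
    for x in spend:
        carry = x + alpha * carry
        adstocked.append(carry)
    return adstocked


def logistic_saturation(x, lam):
    """Illustrative saturation curve mapping [0, inf) into [0, 1).

    Larger `lam` means the channel saturates at lower spend.
    """
    return (1 - math.exp(-lam * x)) / (1 + math.exp(-lam * x))


# Made-up weekly spend for one channel.
spend = [100.0, 0.0, 0.0, 50.0]

# Carryover: weeks with zero spend still show a decaying effect.
adstocked = geometric_adstock(spend, alpha=0.5)
print(adstocked)  # [100.0, 50.0, 25.0, 62.5]

# Diminishing returns applied to the adstocked series.
saturated = [logistic_saturation(x, lam=0.01) for x in adstocked]
print(saturated)
```

In a Bayesian MMM, `alpha` and `lam` get priors and are estimated per channel, so the contribution curve for each channel comes with a credible interval instead of a single line.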

