Democratizing Marketing Mix Models (MMM) with Open Source and GenAI

This article is for technology-minded marketers! Data science and analytics professional Shakti Kothari takes a deep dive into constructing a Marketing Mix Model using freely available tools — giving you and your organization analytical insights regardless of your budget.

A practical system design combining open-source Bayesian MMM and GenAI for transparent, vendor independent marketing analytics insights.

Marketing Mix Models (MMM) have been in the industry for several years and recently they have experienced a renaissance. With digitally tracked signals being deprecated for increasing data privacy restrictions, marketers are turning back to MMMs for strategic, reliable, privacy-safe measurement and attribution framework.

Unlike user-level tracking tools, MMM uses aggregated time-series and cross-sectional data to estimate how marketing channels drive business KPIs. Advances in Bayesian modeling with enhanced computing power has pushed MMM back in the center of marketing analytics.

For years, advertisers and media agencies have used and relied on Bayesian MMM for understanding marketing channel contributions and marketing budget allocation.

Role of GenAI in Modern MMM

More companies are now utilizing GenAI features as an enhancement to MMM in several ways:

  1. Data preparation and feature engineering
  2. Pipeline automation: Generating code for MMM pipeline
  3. Insight explanation: Translating model insights into plain business language
  4. Scenario planning and budget optimization

While these capabilities are powerful, they rely on proprietary MMM engines.

The purpose of this article is not to showcase how Bayesian MMM works but to demonstrate a potential open-source and free system design that marketers can explore without the need of subscribing to black box MMM stack that vendors in the industry provide.

The approach combines:

  • Google Meridian as the open-source Bayesian MMM engine
  • Mistral 7B, as the large language model (LLM), providing an insight and interaction layer on top of Meridian’s output

Here is an architecture diagram representing the proposed open-source system design for marketers:

Open-Source Stack Workflow: Bayesian MMM + GenAI
This architecture diagram was created using Gen-AI assisted design tools for rapid prototyping.

This open-source workflow has several benefits:

  1. Democratization of Bayesian MMM: Eliminates black box problem of proprietary MMM tools.
  2. Cost efficiency: Reduces financial barriers for small/medium businesses to access advanced analytics.
  3. Preserves the statistical rigor required from MMM engines and makes it more accessible.
  4. With GenAI insights layer, the audience does not need to understand the Bayesian Math; instead they can interact using GenAI prompts to learn about model insights on channel contribution, ROI and possible budget allocation strategies.
  5. Adaptability to newer open-source tools: The GenAI layer can be replaced with newer LLMs as and when they are openly available to get enhanced insights.

Hands-on example of implementing Google Meridian MMM model with an LLM layer

For the purpose of this showcase, I have used Mistral 7B LLM which is an open-source LLM sourced locally from the Hugging Face platform hosted by the Llama engine.

This framework is supposed to be domain agnostic: Any alternative open-source MMM models such Meta’s Robyn or PyMC and LLM versions for GPT can be used, depending on the scale and scope of the insights desired.

Important notes:

  • A synthetic marketing dataset was created having KPI as conversions and marketing channels as TV, Search, Paid Social, Email and OOH (Out-of-Home media).
  • Google Meridian produces rich output such as ROI, channel coefficients and contributions in driving KPI, response curves, etc. While these output are statistically sound, they often require specialized expertise to interpret. This is where an LLM becomes valuable and can be used as an insight translator.
  • Google Meridian python code examples were used to run the Meridian MMM model on the synthetic marketing data created. For more information on how to run Meridian code, please refer to the Introduction to Meridian Demo.
  • An open-source LLM model Mistral 7B was utilized due to its compatibility with the free tier of Google Colab GPU resources and because it is an adequate model for generating instruction-based insights without relying on any API access requirements.

Example

This snippet of Python code was executed in Google Colab platform:

A synthetic marketing dataset (not shown in this code) was created as part of the Meridian workflow requirement; an input data builder instance was created, as show below:

Configure and execute the Meridian MMM model:

This code snippet runs the meridian model with defined priors for each channel on the input dataset generated. The next step is to assess model performance. While there are model output parameters such as R-squared, MAPE, P-Values, etc. that can be assessed, for the purpose of this article I am just including a visual assessment example:

Now that the Meridian MMM model has been executed, we have model output parameters for each media channel such as ROI, response curves, model coefficients, spend levels, etc. We can bring all this information into a single input JSON object that can be used directly as an input to LLM to generate insights:

Downloading Mistral 7B LLM from Hugging Face platform locally and installing the required Llama engine to execute the LLM:

Executing the Mistral LLM using the input JSON having Meridian MMM output and including the appropriate instructional prompt:

Example output:

Practical Considerations

  1. Model quality and insights is still dependent on input data quality.
  2. Prompt design is critical to avoid misleading insights.
  3. Automation for input data processing and model output reporting and visualization will help this stack to operate at scale.

Final Thoughts

This walkthrough illustrates how an open-source based Bayesian MMM augmented with GenAI workflow can translate complex Bayesian results into actionable insights for marketers and leaders.

This approach does not attempt to simplify the math behind Marketing Mix Models (MMMs). Instead, it preserves it and makes an attempt to make it more accessible for organizations with limited model knowledge and budget resources. 

As privacy-safe marketing analytics becomes a norm, open-source MMM systems with GenAI augmentation offers a sustainable path: transparent, adaptable and designed to evolve with business and underlying technology.

This article reflects an independent exploration of open-source tools. No commercial endorsements are implied. All code snippets and outputs shown in this article are illustrative and intended for educational purposes only.

Resources & References

Bayesian Marketing Mix Modeling (general methodology and industry application): https://towardsdatascience.com/understanding-bayesian-marketing-mix-modeling-a-deep-dive-into-prior-specifications-af400adb836e/

Google Colab (free GPU environment for prototyping): https://colab.google/

Google Meridian: https://developers.google.com/meridian/notebook/meridian-getting-started

Mistral 7B LLM (language model that provides efficiency and high performance to enable real-world applications): https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF

Submit Your Article

Submit your idea for one of our monthly guest writer spots.