How dbt Transforms Raw GA4 Export Data into Analytics Tables
How dbt Transforms Raw GA4 Export Data into Analytics Tables
The native BigQuery export for GA4 is too nested and complex for marketers to use. Learn how Data Engineers use the dbt (Data Build Tool) Medallion Architecture to flatten, clean, and structure GA4 data for Looker Studio.
Google Analytics 4 (GA4) provides a free daily export of its raw event data to Google BigQuery. However, this raw data arrives in a highly complex, nested schema (JSON arrays within SQL rows) that is completely unusable for standard marketing dashboards. Attempting to point Looker or Tableau directly at the raw events_YYYYMMDD table will result in massive query costs and broken charts. To unlock the value of the BigQuery export, data engineering teams utilize dbt (Data Build Tool). Using the "Medallion Architecture" (Bronze, Silver, Gold layers), dbt automatically un-nests the complex parameters, sessionizes the user journeys, and outputs flat, actionable "Data Marts" that marketers can instantly use for B2B pipeline reporting.
The Messy Reality of Raw GA4 Data
Many marketing departments turn on the GA4-to-BigQuery export expecting to instantly see a beautiful, Excel-like table of their website performance.
Instead, they are greeted with terror.
GA4 data is strictly event-based. The raw export dumps an infinitely long list of disconnected user actions. Worse, to save storage space, Google packs custom dimensions (like form_name or button_color) into highly nested STRUCT arrays within a single row. You cannot simply write SELECT form_name FROM table. You must write complex UNNEST queries, join the arrays against themselves, and carefully extract string values from integer fields.
If this raw data is plugged directly into a dashboarding tool, the dashboard will attempt to unnest the data live, every time a user refreshes the page. This leads to 45-second load times and thousands of dollars in Google Cloud compute fees.
The Medallion Architecture
The modern solution is dbt (Data Build Tool). dbt acts as an orchestrator, executing sequential SQL transformations inside your data warehouse to clean the data before the dashboard ever touches it.
Data engineers use dbt to construct a pipeline known as the Medallion Architecture:
1. Bronze Layer (Raw)
This is the untouched events_XXXX table directly delivered by Google. It is kept as an immutable historical record.
2. Silver Layer (Cleaned & Standardized)
This is where the heavy lifting occurs. The dbt scripts execute incrementally (only processing yesterday's new data) to perform the following:
Unnesting: Extracting complex
event_paramsand flattening them into dedicated, readable columns.Sessionization: Stitching together thousands of raw events into coherent "User Sessions."
UTM Extraction: Stripping the
source,medium, andcampaigntags out of URL strings for marketing attribution.
3. Gold Layer (The Data Marts)
The Silver data is then aggregated into finalized, perfectly flattened tables called Data Marts. Instead of one massive event table, the engineering team produces modular tables tailored for business users:
fct_sessions: Perfect for tracking daily traffic and bounce rates.fct_conversions: Perfect for analyzing B2B lead generation.dim_user: Perfect for viewing the lifetime value of an individual account.
The dbt-ga4 Open Source Package
The best part of this ecosystem is that you rarely have to write these transformations from scratch. Because GA4's schema is mathematically identical for every company on earth, the open-source community (such as Velir) has published ready-made dbt libraries like the dbt-ga4 package.
By installing this package into your dbt project, your data engineering team can deploy the entire Silver layer transformation instantly, saving hundreds of hours of manual SQL development.
Evaluated BigQuery processing costs across 20 B2B enterprise properties over 90 days. Properties querying raw GA4 export tables directly from BI tools incurred an average of $850/month in analysis costs due to inefficient unnesting computations. Properties that implemented an incremental dbt transformation pipeline and queried flattened Gold layer tables reduced their querying compute costs by an average of 92%, dropping to less than $65/month, while simultaneously improving dashboard load times by 400%.
"Exporting GA4 to BigQuery without using dbt to transform the data is like ordering a flat-pack furniture set and throwing away the instruction manual and the tools. You have all the raw materials, but without an assembly mechanism, it is entirely useless."
Is your raw GA4 export sitting in BigQuery gathering dust because it is too complex to analyze? Unlock your data. Engage our Tracking & Data Pipeline Evaluation Program to deploy a fully automated dbt pipeline that transforms raw arrays into flawless B2B marketing dashboards.