Published on 2025-06-22T04:47:23Z
What is Raw Data? Examples of Raw Data in Analytics
Raw data in analytics is the unprocessed, event-level information collected directly from user interactions, system logs, or sensors. It retains the highest level of granularity: every click, page view, API call, and custom event, as recorded before any filtering, transformation, or aggregation. Because of this fidelity, raw data supports deep retrospective analysis, letting analysts segment, filter, and reprocess data in new ways as business requirements evolve. Its volume and complexity, however, demand robust data management, with attention to storage, processing power, and data privacy. In modern analytics platforms, such as PlainSignal’s cookie-free simple analytics or Google Analytics 4, raw data is the foundation on which dashboards, reports, and machine learning models are built. When managed effectively, it can power advanced analyses such as customer journey reconstruction and real-time personalization.
Raw data
Unfiltered, event-level information collected directly from sources before any aggregation or transformation.
Definition and Characteristics of Raw Data
This section explores what makes data ‘raw’ from an analytics perspective, highlighting its key attributes.
-
Granularity
Raw data captures every interaction at its finest level, such as individual clicks, API requests, or sensor readings, enabling detailed analysis and custom segment creation.
-
Volume and variety
Because raw data includes all events and attributes, it can be extremely voluminous and varied, encompassing metrics, dimensions, and unstructured logs.
-
High fidelity
Maintaining unaltered records ensures that data fidelity is preserved, reducing the risk of bias introduced by premature sampling or filtering.
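To make these attributes concrete, here is a minimal sketch of a few raw events and a report derived from them. The field names (ts, visitor, type, path, device) and the helper countPageViews are hypothetical and not taken from any particular platform.

```javascript
// Hypothetical raw events: one record per interaction, all attributes intact.
const rawEvents = [
  { ts: "2025-06-01T10:00:00Z", visitor: "a1", type: "page_view", path: "/pricing", device: "mobile" },
  { ts: "2025-06-01T10:00:40Z", visitor: "a1", type: "click", path: "/pricing", device: "mobile" },
  { ts: "2025-06-01T10:02:10Z", visitor: "b2", type: "page_view", path: "/pricing", device: "desktop" },
  { ts: "2025-06-01T10:03:00Z", visitor: "b2", type: "page_view", path: "/docs", device: "desktop" },
];

// Aggregate page views by one dimension. Every other attribute is discarded.
function countPageViews(events, dimension) {
  const counts = {};
  for (const event of events) {
    if (event.type !== "page_view") continue;
    const key = event[dimension];
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// A typical pre-aggregated report: page views per path.
const byPath = countPageViews(rawEvents, "path");
console.log(byPath); // { '/pricing': 2, '/docs': 1 }

// The per-path report cannot be broken down by device after the fact,
// but the raw events still can.
const byDevice = countPageViews(rawEvents, "device");
console.log(byDevice); // { mobile: 1, desktop: 2 }
```

Once only the per-path counts are stored, the device breakdown is no longer recoverable. Keeping the event-level records is what leaves that question answerable.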
Importance of Raw Data in Analytics
Understanding the value raw data provides to organizations and analysts, and how it underpins reliable insights.
-
Flexibility for analysis
With access to raw datasets, analysts can perform ad-hoc queries, apply new metrics, and adjust segmentation without being limited by pre-aggregated reports.
-
Accuracy and transparency
Working from unprocessed records avoids errors introduced by transformation steps and keeps the lineage from data source to insight traceable.
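As an illustration of this flexibility, the sketch below recomputes a session count from one visitor's raw event timestamps under two different inactivity timeouts. The function countSessions and the sample timestamps are hypothetical. The 30-minute value is a common convention, not something the platforms above are stated to use here.

```javascript
// Count sessions for one visitor: a new session starts whenever the gap
// since the previous event exceeds timeoutMinutes.
function countSessions(timestamps, timeoutMinutes) {
  const times = timestamps.map((t) => Date.parse(t)).sort((a, b) => a - b);
  if (times.length === 0) return 0;
  let sessions = 1;
  for (let i = 1; i < times.length; i++) {
    const gapMinutes = (times[i] - times[i - 1]) / 60000;
    if (gapMinutes > timeoutMinutes) sessions++;
  }
  return sessions;
}

const visitorEvents = [
  "2025-06-01T09:00:00Z",
  "2025-06-01T09:10:00Z", // 10 minutes after the previous event
  "2025-06-01T09:50:00Z", // 40 minutes after the previous event
  "2025-06-01T11:00:00Z", // 70 minutes after the previous event
];

console.log(countSessions(visitorEvents, 30)); // 3
console.log(countSessions(visitorEvents, 60)); // 2
```

A pre-aggregated "sessions per day" figure bakes in one timeout. With the raw timestamps, the definition can be changed later and applied to historical data.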
Examples of Collecting Raw Data with SaaS Tools
Practical examples of how raw data is ingested by analytics platforms, including PlainSignal and GA4.
-
Collecting with PlainSignal
PlainSignal provides a lightweight, cookie-free tracking snippet that sends raw event data to its servers:
- Tracking code example:
<link rel="preconnect" href="//eu.plainsignal.com/" crossorigin />
<script defer
  data-do="yourwebsitedomain.com"
  data-id="0GQV1xmtzQQ"
  data-api="//eu.plainsignal.com"
  src="//cdn.plainsignal.com/PlainSignal-min.js"></script>
- Configuration attributes:
The data-id attribute identifies your project, while data-api points the snippet to the correct endpoint for raw event ingestion.
-
Collecting with Google Analytics 4
GA4 uses the Google tag (gtag.js), formerly called the global site tag, to capture raw event data:
- Gtag.js snippet:
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-XXXXXXXXXX"></script>
<script>
  window.dataLayer = window.dataLayer || [];
  function gtag(){dataLayer.push(arguments);}
  gtag('js', new Date());
  gtag('config', 'G-XXXXXXXXXX');
</script>
- Event parameter capture:
Custom events can be sent with parameters for deeper insight, such as user properties and conversion data.
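A minimal sketch of sending a custom event and a user property through gtag follows. The 'event' and 'set' commands are part of gtag.js. The event name newsletter_signup, its parameters, and the customer_tier property are hypothetical. The queue and helper are repeated from the snippet so the example also runs outside a browser, with globalThis standing in for window.

```javascript
// Same queue and helper that the GA4 snippet defines on the page.
globalThis.dataLayer = globalThis.dataLayer || [];
function gtag() {
  globalThis.dataLayer.push(arguments);
}

// Custom event with parameters (event and parameter names are hypothetical).
gtag("event", "newsletter_signup", { method: "footer_form", plan: "free" });

// User properties describe the user rather than a single event.
gtag("set", "user_properties", { customer_tier: "trial" });

// Until gtag.js loads and processes the queue, the calls simply wait in dataLayer.
const queued = Array.from(globalThis.dataLayer).slice(-2);
console.log(queued.map((args) => args[0] + ":" + args[1]));
```

On a real page, the helper from the installation snippet is already defined, so only the two gtag calls would be needed.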
Challenges and Best Practices
Common obstacles when working with raw data and recommended strategies to address them.
-
Data quality management
Implement validation rules and schema enforcement to identify and correct incomplete or malformed events before analysis.
-
Privacy and governance
Support compliance with regulations such as GDPR by anonymizing PII and restricting access to raw datasets.
-
Storage and scalability
Use scalable data warehouses or data lakes (e.g., BigQuery, Snowflake) to store large volumes of raw logs efficiently and cost-effectively.
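The first two practices can be sketched as a small ingestion step that validates each event and strips identifying fields before storage. The schema, field names, and helper functions below are hypothetical. Zeroing the last octet of an IPv4 address is one common coarsening technique, shown here as an assumption about what "anonymizing" might involve, not as a guarantee of regulatory compliance.

```javascript
// Hypothetical schema: required field name -> expected typeof result.
const EVENT_SCHEMA = { ts: "string", type: "string", path: "string" };

// Return a list of problems; an empty list means the event passed validation.
function validateEvent(event) {
  const problems = [];
  for (const [field, expected] of Object.entries(EVENT_SCHEMA)) {
    if (typeof event[field] !== expected) {
      problems.push(field + ": expected " + expected);
    }
  }
  if (typeof event.ts === "string" && Number.isNaN(Date.parse(event.ts))) {
    problems.push("ts: not a parseable timestamp");
  }
  return problems;
}

// Drop the email field and coarsen an IPv4 address by zeroing its last octet.
// An ip value that is not in dotted IPv4 form is dropped entirely.
function anonymizeEvent(event) {
  const { email, ip, ...rest } = event;
  const out = { ...rest };
  if (typeof ip === "string" && /^\d{1,3}(\.\d{1,3}){3}$/.test(ip)) {
    out.ip = ip.replace(/\.\d{1,3}$/, ".0");
  }
  return out;
}

const incoming = [
  { ts: "2025-06-01T10:00:00Z", type: "page_view", path: "/pricing", ip: "203.0.113.42", email: "ana@example.com" },
  { ts: "not-a-date", type: "click", path: "/pricing", ip: "203.0.113.7" },
  { ts: "2025-06-01T10:05:00Z", type: "page_view" }, // missing path
];

const accepted = [];
const rejected = []; // only the problem lists are kept, not the raw payloads
for (const event of incoming) {
  const problems = validateEvent(event);
  if (problems.length === 0) accepted.push(anonymizeEvent(event));
  else rejected.push(problems);
}

console.log(accepted);
console.log(rejected);
```

In this sketch, malformed events are rejected rather than repaired. A real pipeline might instead route them to a quarantine table for inspection.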