Published on 2025-06-27T20:59:54Z
What is an Outlier in Analytics? Examples of Outliers
An outlier is a data point in your analytics dataset that deviates significantly from the overall pattern of the data. It can arise from a tracking error, bot traffic, an abrupt shift in user behavior, or a genuine anomaly such as a viral content spike. Identifying outliers is crucial in analytics because they can skew averages, produce misleading trends, and lead to incorrect business decisions. By detecting and handling outliers properly, analysts can maintain data quality, improve reporting accuracy, and uncover insights into unusual events. In web analytics, examples include a sudden spike in pageviews from referral spam, an extreme session duration from a bot, or a conversion rate anomaly following a marketing campaign.
Outlier
A data point that significantly deviates from other observations, impacting analytics accuracy and requiring detection and handling.
Definition and Significance
This section defines what an outlier is in web analytics and explains why identifying outliers is critical for accurate insights and decision-making.
-
What is an outlier?
A single observation that lies an abnormal distance from other values in a dataset, indicating potential errors or significant events.
-
Why it matters
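Outliers can distort metrics like averages, totals, and conversion rates, leading to flawed interpretations and business strategies. Medians are far less sensitive to extreme values, which is why they are often reported alongside means.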
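A minimal sketch of this distortion, using made-up session durations (the values and the 4-hour "bot" session are assumptions for illustration only): a single extreme value drags the mean far away from typical sessions, while the median moves only slightly.

```python
from statistics import mean, median

# Hypothetical session durations in seconds; the last value mimics a bot
# that kept a session open for four hours.
durations = [42, 55, 61, 48, 70, 53, 66, 14400]
clean = durations[:-1]  # the same data without the extreme value

print(f"mean without outlier:   {mean(clean):.1f}")       # 56.4
print(f"mean with outlier:      {mean(durations):.1f}")   # 1849.4
print(f"median without outlier: {median(clean):.1f}")     # 55.0
print(f"median with outlier:    {median(durations):.1f}") # 58.0
```

Comparing the mean and the median of the same metric is a quick first check: a large gap between them often points to a few extreme values.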
Types of Outliers
Outliers can be classified by their characteristics and context. Understanding these types helps choose the right detection method.
-
Global (point) outliers
Observations that are inconsistent with the entire dataset, such as an unusually high session count on a low-traffic page.
-
Contextual (conditional) outliers
Data points that are unusual only within a specific subset or context, like a traffic level that would be normal during a marketing campaign but is abnormal in the middle of the night on an ordinary day.
-
Collective outliers
A group of observations that together deviate from the overall pattern, such as multiple referral spam hits in a short time.
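The difference between global and contextual outliers can be illustrated with a short sketch. It assumes each observation carries a context label (here a hypothetical "day" or "night" bucket) and applies the common 1.5×IQR fence within each context separately. The function name and the sample data are illustrative, not taken from any analytics tool.

```python
from collections import defaultdict
from statistics import quantiles


def contextual_outliers(records, k=1.5):
    """Flag (context, value) pairs outside the IQR fences of their own context."""
    by_context = defaultdict(list)
    for context, value in records:
        by_context[context].append(value)

    flagged = []
    for context, values in by_context.items():
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        spread = q3 - q1
        low, high = q1 - k * spread, q3 + k * spread
        flagged.extend((context, v) for v in values if v < low or v > high)
    return flagged


# Hypothetical hourly session counts labelled by time of day.
records = (
    [("day", v) for v in (190, 205, 198, 210, 202, 195, 208, 200)]
    + [("night", v) for v in (10, 12, 9, 11, 13, 10, 12, 200)]
)

print(contextual_outliers(records))  # [('night', 200)]
```

A count of 200 is ordinary for the daytime hours in this sample, so it only stands out once it is compared against other night-time values.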
Detection Methods
Common techniques to identify outliers in analytics data, ranging from simple statistical formulas to visual tools.
-
Statistical techniques
Methods that rely on mathematical formulas to flag outliers based on data distribution properties.
- Z-score method:
Calculates how many standard deviations a data point is from the mean; a common rule of thumb flags points with |Z| > 3 as outliers.
- Interquartile range (IQR):
Defines outliers as points lying below Q1 − 1.5×IQR or above Q3 + 1.5×IQR, where IQR = Q3 − Q1.
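Both statistical rules above can be sketched with Python's standard library. This is a minimal illustration rather than a production detector: the function names, the sample pageview counts, the use of the population standard deviation, and the "inclusive" quartile method are all assumptions made for the example.

```python
from statistics import mean, pstdev, quantiles


def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute Z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, nothing can be flagged
    return [v for v in values if abs((v - mu) / sigma) > threshold]


def iqr_outliers(values, k=1.5):
    """Return values below Q1 - k*IQR or above Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical daily pageviews; 980 mimics a referral-spam spike.
pageviews = [120, 132, 118, 125, 140, 129, 135, 122, 131, 127, 124, 138, 126, 980]

z_flagged = zscore_outliers(pageviews)
iqr_flagged = iqr_outliers(pageviews)
print(z_flagged)    # [980]
print(iqr_flagged)  # [980]
```

One caveat on the Z-score rule: with the population standard deviation, no value in a sample of n points can have |Z| above √(n − 1), so a threshold of 3 can never trigger on ten or fewer observations. For short series the IQR rule is usually the more practical choice.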
-
Visualization techniques
Graphical approaches that help spot outliers by eye.
- Box plot:
Displays data distribution and highlights outliers as points beyond whiskers.
- Scatter plot:
Plots two variables and reveals outliers as isolated points away from clusters.
Outlier Detection in SaaS Analytics Tools
How popular analytics platforms like GA4 and PlainSignal help detect anomalies and outliers, with examples.
-
Google Analytics 4 (GA4)
GA4 includes built-in anomaly detection in Explorations and in its automated insights, surfacing unusual metric changes without manual setup.
- Anomaly detection reports:
GA4 flags significant deviations in key metrics using machine learning.
- Custom insights:
Users can define conditions on a metric and receive notifications when it crosses a set threshold or changes unexpectedly.
-
PlainSignal
PlainSignal is a cookie-free, simple analytics tool that can surface sudden traffic changes and unusual patterns.
- Threshold alerts:
Configure custom thresholds to get notified when visits or events exceed expected ranges.
- Example tracking code:
<link rel="preconnect" href="//eu.plainsignal.com/" crossorigin /><script defer data-do="yourwebsitedomain.com" data-id="0GQV1xmtzQQ" data-api="//eu.plainsignal.com" src="//cdn.plainsignal.com/PlainSignal-min.js"></script>
Best Practices for Managing Outliers
Strategies to handle outliers after detection to ensure data integrity and meaningful analysis.
-
Validate and contextualize
Investigate outliers to determine whether they are due to errors, bots, or genuine events before deciding how to handle them.
-
Clean or adjust data
Filter known bot traffic, correct tracking errors, or apply transformations (like winsorizing) to mitigate the impact of outliers.
-
Document actions
Keep records of any changes or exclusions made due to outliers for transparency and reproducibility.
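The winsorizing mentioned under "Clean or adjust data" can be sketched as follows. This is one simple count-based variant, assumed here for illustration: the k smallest and k largest values are replaced by their nearest retained neighbours, so extreme points are capped rather than dropped. The function name and sample durations are hypothetical.

```python
def winsorize(values, k=1):
    """Cap the k smallest and k largest values at their nearest retained neighbours."""
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must be non-negative and leave at least one value uncapped")
    ordered = sorted(values)
    low, high = ordered[k], ordered[-k - 1]
    # Clamp each value into [low, high] while preserving the original order.
    return [min(max(v, low), high) for v in values]


# Hypothetical session durations in seconds, including one bot-like session.
durations = [42, 55, 61, 48, 70, 53, 66, 14400]

capped = winsorize(durations, k=1)
print(capped)  # [48, 55, 61, 48, 70, 53, 66, 70]
```

Unlike deleting rows, winsorizing keeps the sample size unchanged, which makes it easier to document: recording the value of k and the resulting caps is enough to reproduce the adjustment.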