As of Friday, March 1, Near transferred its operations to Azira, which you may access at https://azira.com.
Please review Azira’s website for more information about its policies and practices.

Chihiro Fukami

Sr. Product Director

Data Anomalies 101: How Near Detects and Eliminates Invalid Data to Ensure High Quality 

In today’s digital age, data drives decision-making processes across various industries. As we delve into the vast realms of consumer behavior data, it’s crucial to understand that not all data is created equal. Anomalies, or irregularities, often creep into datasets, distorting the accuracy and reliability of the information they provide. Here’s a brief description of the kinds of anomalies we’ve encountered, with some detail about how Near goes about identifying and eliminating them, ensuring the integrity of the consumer behavior data we share with our customers.

Understanding Data Anomalies

At Near, we gather data from diverse sources, offering valuable insights into consumer behavior. However, the ever-evolving mobile ecosystem brings with it an influx of anomalous data that can misrepresent real consumer actions. Some common anomalies we encounter include:

Pauli Anomaly: Data points with different device IDs but identical locations and timestamps.
Crosshairs Anomaly: Observations forming a crosshair pattern, though they do not fall on a grid.
Grid Anomalies: Data points forming grid patterns of various resolutions, often drawing from real device histories.
Hot Circle Anomaly: Data points filling large, circular areas in seemingly random locations.
Replay Anomaly: Sequences of observations replayed at later times, leading to unrealistic traffic patterns (such as high traffic at a store that closed).
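
To make the Pauli Anomaly definition concrete, here is a minimal Python sketch that flags it. It is an illustration only, not Near's production filter: the (device_id, lat, lon, timestamp) tuple layout, the function names, and the min_devices threshold are all assumptions made for this example.

```python
from collections import defaultdict


def find_pauli_groups(observations, min_devices=2):
    """Return {(lat, lon, ts): device_ids} for every exact location/time
    shared by at least `min_devices` distinct device IDs.

    `observations` is assumed to be an iterable of
    (device_id, lat, lon, timestamp) tuples (hypothetical layout).
    """
    devices_at = defaultdict(set)
    for device_id, lat, lon, ts in observations:
        devices_at[(lat, lon, ts)].add(device_id)
    return {key: ids for key, ids in devices_at.items() if len(ids) >= min_devices}


def drop_pauli(observations, min_devices=2):
    """Remove every observation that belongs to a flagged group."""
    flagged = find_pauli_groups(observations, min_devices)
    return [obs for obs in observations if (obs[1], obs[2], obs[3]) not in flagged]


# Made-up sample data: dev-a and dev-b report the same point at the same second.
observations = [
    ("dev-a", 40.7128, -74.0060, 1700000000),
    ("dev-b", 40.7128, -74.0060, 1700000000),
    ("dev-c", 34.0522, -118.2437, 1700000050),
    ("dev-a", 40.7130, -74.0061, 1700000060),
]

flagged = find_pauli_groups(observations)
clean = drop_pauli(observations)
```

Raising min_devices makes the filter less aggressive, which is one simple way to trade anomaly removal against retained data volume.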
Example of a Hot Circle Anomaly
Example of a Grid Anomaly
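
A grid pattern can also be checked numerically. The sketch below is a simple statistical heuristic, not the image-recognition approach Near describes: it measures what share of points sit exactly on a lattice of a given spacing. The function name, the step and tol parameters, and the sample coordinates are assumptions for illustration.

```python
def grid_fraction(points, step, tol=1e-9):
    """Fraction of (lat, lon) points whose coordinates both lie within
    `tol` degrees of a multiple of `step` degrees.

    A value near 1.0 suggests the points were snapped to a grid of that
    resolution. `step` and `tol` are hypothetical tuning parameters.
    """
    def on_lattice(value):
        # Distance, in degrees, from `value` to the nearest multiple of `step`.
        scaled = value / step
        return abs(scaled - round(scaled)) * step <= tol

    if not points:
        return 0.0
    hits = sum(1 for lat, lon in points if on_lattice(lat) and on_lattice(lon))
    return hits / len(points)


# Made-up sample data: four points on a 0.01-degree grid, two organic points.
grid_points = [(40.70, -74.00), (40.71, -74.00), (40.70, -74.01), (40.71, -74.01)]
organic_points = [(40.712776, -74.005974), (40.730610, -73.935242)]

grid_score = grid_fraction(grid_points, step=0.01)
organic_score = grid_fraction(organic_points, step=0.01)
mixed_score = grid_fraction(grid_points + organic_points, step=0.01)
```

Because grid anomalies appear at various resolutions, a check like this would need to be run over several candidate step values. Coordinates that were merely truncated to a fixed number of decimal places would also score highly, so the result is a signal to investigate rather than proof of an anomaly.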

Near’s Data Anomaly Filters

Near has developed sophisticated algorithms that incorporate image recognition and advanced statistical analyses to detect and remove these anomalies. Additionally, we are continuing to research and develop new AI-based methodologies. However, as human movement, mobile device usage, and the fragmented mobile data ecosystem continue to evolve, we face a variety of challenges:

Emergence of New Anomalies: Constant vigilance is required, as new anomalies and fraudulent data sources regularly appear.

Balancing Data Volume: Aggressively removing anomalies can impact data volume. We aim for a balanced approach to ensure accurate data points are not discarded via overzealous scrubbing.

Computational Complexity: Identifying complex anomalies at scale can be computationally intensive, demanding more computing resources and advanced techniques like image recognition and statistical analyses.

A Commitment to Data Quality

In the face of these challenges, Near remains steadfast in its commitment to maintaining high data quality standards. Our ongoing research and development efforts focus on innovative methodologies, incorporating cutting-edge technologies to safeguard the value of our data. As we continue to evolve alongside the dynamic mobile data landscape, Near persistently explores new avenues, ensuring the data you rely on is accurate, reliable, and free from anomalies.

For our latest anomaly updates, please review our topical data sheet, Understanding Data Anomalies in Consumer Behavior Data.

Want to learn more about how Near’s Consumer Behavior Data Intelligence platform ensures data quality? Check out our eBook: Not All Data Is Created Equal or request a demo.