I Spent 15 Years Building Geospatial Threat Detection Systems. Here's Everything I Learned.

It was 2:47 AM when my phone buzzed. Not the usual alert. This was the priority channel.

Our geospatial threat detection system had flagged something strange: a device registered to a senior finance executive was authenticating to our client's trading platform. Normal enough, except the device was simultaneously reporting GPS coordinates in Manhattan and cellular tower associations in Lagos.

Physics doesn't work that way.

Within three minutes, we confirmed the executive's actual phone was on his nightstand in Connecticut. Someone had cloned his credentials and was attempting to access the trading system from Nigeria. But they had made a mistake. They'd spoofed the GPS coordinates without realizing our system also triangulated cellular signals. The attack was blocked. No data exfiltrated. No trades executed.

Forty-seven minutes later, the SOC team's traditional SIEM finally flagged the login as suspicious based solely on IP geolocation, which had been VPN'd through a legitimate New York exit node. That 47-minute gap is why I've spent the last 12 years building geospatial intelligence systems for threat detection.

Here's an uncomfortable truth: nearly 80% of enterprise data contains a location component. Your security infrastructure uses almost none of it. Think about what your current stack actually knows about geography:

IP geolocation: Accurate to the city level, maybe. Trivially spoofed with any VPN.
Time zone from browser: Self-reported. Meaningless.
"Impossible travel" detection: Flags if someone logs in from London then Tokyo within an hour. Catches the laziest attackers.

Now think about what modern devices actually know about their location:

GPS coordinates (3-10 meter accuracy)
Cellular tower associations (50-300 meter triangulation)
Wi-Fi access point identifiers (15-40 meter positioning)
Bluetooth beacon proximity (1-5 meter range)
Barometric pressure (floor-level detection in buildings)

This data exists. It flows through your MDM, your asset tracking systems, your endpoint agents. It almost never reaches your security tools.

Attackers know this. They've built entire attack categories around it.

The Attacks You are Not Detecting

GPS Spoofing

For about $300 in hardware from Amazon, anyone can broadcast fake GPS signals that override legitimate satellite positioning within a 50-meter radius. Originally a concern for military and aviation, GPS spoofing has gone mainstream:

Geofenced malware activation: Malware that only executes when the device reports specific coordinates evading sandbox analysis that runs in known data center locations.
Location-based access control bypass: Systems that grant privileged access based on "being in the office" can be fooled by spoofed coordinates.
Fleet and logistics manipulation: Attackers redirecting delivery vehicles or manipulating location-verified transactions.

A GPS-only detection system would never catch this. The coordinates look legitimate.

Cellular Network Attacks

IMSI catchers (fake cell towers) aren't just for nation-states anymore. For under $2,000, you can intercept cellular traffic, track device movements, and inject malicious payloads. More sophisticated attacks:

Silent SMS triangulation: Locating a target device without any user-visible indication.
Tower spoofing for location falsification: Making a device appear to be in a location it isn't.
Man-in-the-middle on cellular data: Intercepting authentication tokens, API keys, or session data.

The Credential Cloning Problem

The Lagos attack I mentioned earlier is increasingly common. Attackers steal credentials, clone device identifiers, and attempt to authenticate from locations that should trigger alarms but don't, because enterprise security relies on easily-spoofed signals.

The common thread: location-based attacks exploit the assumption that security systems can't see the physical world.

How We Built a System That Actually Works

After watching organizations get burned by location-blind security, I set out to build something different. What followed was 12 years of iteration, failure, and gradual improvement. Here's what actually works.

Architecture Overview

The system has four layers.

┌─────────────────────────────────────────────────────────┐

│ CORRELATION ENGINE │

│ (Joins location intelligence with security telemetry) │

├─────────────────────────────────────────────────────────┤

│ BEHAVIORAL MODELING │

│ (GMMs for spatial patterns, anomaly scoring) │

├─────────────────────────────────────────────────────────┤

│ SENSOR FUSION │

│ (Extended Kalman Filter, multi-signal triangulation) │

├─────────────────────────────────────────────────────────┤

│ SIGNAL INGESTION │

│ (GPS, Cellular, Wi-Fi, Bluetooth, IP → normalized) │

└─────────────────────────────────────────────────────────┘

Each layer solves a specific problem. Skip one, and the system fails.

Layer 1: Signal Ingestion

The problem: Location data arrives from different sources with different schemas, accuracies, and failure modes. GPS is precise but spoofable. Cellular is harder to spoof but less precise. Wi-Fi depends on access point databases that may be stale.

The solution: Normalize everything to a common schema while preserving source-specific metadata.

@dataclass
class LocationSignal:
    device_id: str
    timestamp: datetime
    latitude: float
    longitude: float
    accuracy_meters: float
    source: Literal["gps", "cellular", "wifi", "bluetooth", "ip"]
    confidence: float  # 0.0 to 1.0
    raw_metadata: dict  # Source-specific: cell tower ID, BSSID, etc.

The raw_metadata field is critical. When the fusion layer detects something suspicious, you need the original data for investigation. Knowing that GPS said "Manhattan" while cellular said "Lagos" is more actionable than just knowing "location anomaly detected."

Implementation note: We use Kafka for stream ingestion with strict schema enforcement. Location data is high-volume and time-sensitive.

Layer 2: Sensor Fusion (Where the Magic Happens)

The problem: You have five location signals that disagree with each other. Which one is right? Or is one of them being attacked?

The solution: Extended Kalman Filter with adaptive process noise and attack detection on the innovation sequence.

This is the core of the system, so let me explain it in detail.

A Kalman Filter maintains a probabilistic estimate of a system's state (in our case: position and velocity) and updates that estimate as new observations arrive. The "extended" variant handles nonlinear measurement models, which we need for converting raw signals to position estimates.

The state vector:

x = [latitude, longitude, velocity_north, velocity_east]

Each signal source has a measurement model that maps observations to position estimates, along with a measurement noise covariance matrix that encodes how much we trust that source:

Signal	Typical Noise (meters)	Notes
GPS (open sky)	5-10	High confidence
GPS (urban canyon)	20-50	Multipath effects
Cellular	100-300	Tower density dependent
Wi-Fi	15-40	AP database quality dependent

The key insight: the Kalman Filter's innovation sequence reveals attacks.

The "innovation" is the difference between what the filter predicted and what it observed. Under normal conditions, innovations follow a known distribution. When an attacker spoofs one signal but not others, the innovations for that signal become statistical outliers.

def detect_signal_manipulation(innovation, expected_covariance):
    """
    Mahalanobis distance on innovation sequence.
    High values indicate the observation doesn't match the model—
    either the model is wrong, or the signal is being manipulated.
    """
    mahal_distance = np.sqrt(innovation.T @ np.linalg.inv(expected_covariance) @ innovation)
    
    # Threshold tuned empirically; 3.5 works well in practice
    if mahal_distance > 3.5:
        return SignalAnomaly(
            severity="high",
            signal_source=observation.source,
            mahalanobis_distance=mahal_distance,
            expected_position=predicted_state[:2],
            observed_position=observation.position
        )
    return None

This is how we caught the Lagos attack. The GPS signal said Manhattan. The cellular signal (which the attacker hadn't spoofed) said Lagos. The innovation on the GPS measurement was off the charts. Mahalanobis distance of 847, when our threshold is 3.5.

Lesson learned the hard way: Adaptive process noise matters enormously. A device sitting on a desk has different motion characteristics than one in a moving vehicle. We spent six months tuning the process noise model before it stopped generating false positives for people taking the subway.

Layer 3: Behavioral Modeling

The problem: A device in an unusual location isn't automatically suspicious. People travel. They work from coffee shops. They visit client sites. You need to distinguish "unusual but legitimate" from "unusual and concerning."

The solution: Gaussian Mixture Models for spatial behavior, with temporal and transition patterns.

For each device/user, we build a behavioral baseline:

Location clusters: K-means on historical positions. "Places this device normally goes."
Temporal patterns: When does the device appear at each cluster? An executive's laptop being in the office at 2 PM is normal; at 2 AM is notable.
Transition patterns: How does the device move between clusters? Typical commute routes, travel velocities, common sequences.

The model trains on 4-6 weeks of data per device. Less than that, and you don't capture enough variation. More, and you're modeling outdated patterns (someone who changed jobs, moved apartments, etc.).

Anomaly scoring combines all three factors:

def compute_anomaly_score(observation, baseline):
    # Spatial: how far from known locations?
    spatial_score = min_mahalanobis_to_clusters(
        observation.position, 
        baseline.clusters
    )
    
    # Temporal: how likely is this location at this time?
    temporal_score = -np.log(
        baseline.time_probability(
            observation.hour, 
            observation.day_of_week,
            nearest_cluster(observation.position, baseline.clusters)
        ) + 1e-10  # Avoid log(0)
    )
    
    # Transition: is the movement physically plausible?
    if baseline.last_observation:
        velocity = compute_velocity(baseline.last_observation, observation)
        transition_score = velocity_plausibility(velocity, baseline.typical_velocities)
    else:
        transition_score = 0
    
    # Weighted combination; weights tuned per deployment
    return 0.4 * spatial_score + 0.35 * temporal_score + 0.25 * transition_score

Critical implementation detail: Thresholds must be per-device, not global. A salesperson who travels constantly has different "normal" variance than an engineer who works from home. We set thresholds at the 99th percentile of each device's historical anomaly score distribution.

Layer 4: Correlation Engine

The problem: A location anomaly alone isn't actionable. You need context. What was the device doing when the anomaly occurred? What other signals support or contradict the alert?

The solution: Real-time joins between location intelligence and security telemetry.

We maintain materialized views in ClickHouse (columnar OLAP store optimized for real-time analytics) that join:

Fused position data with confidence scores
Behavioral anomaly scores
Network flows (source/destination, bytes, protocol)
Authentication events
Endpoint telemetry (process execution, file access)
Threat intelligence feeds

This enables investigation queries like:

-- Find authentication events where location contradicts user baseline
SELECT 
    auth.timestamp,
    auth.user_id,
    auth.resource,
    auth.outcome,
    loc.fused_position,
    loc.anomaly_score,
    baseline.nearest_cluster
FROM authentication_events auth
JOIN location_intelligence loc 
    ON auth.device_id = loc.device_id 
    AND abs(auth.timestamp - loc.timestamp) < interval '60 seconds'
JOIN user_baselines baseline 
    ON auth.user_id = baseline.user_id
WHERE loc.anomaly_score > baseline.alert_threshold
    AND auth.timestamp > now() - interval '1 hour'
ORDER BY loc.anomaly_score DESC

Alert rules fire on compound conditions:

Location anomaly AND connection to suspicious destination
Location anomaly AND authentication failure AND outside business hours
Signal manipulation detected AND privileged resource access

Single-factor alerts generate too much noise. Compound conditions are where the signal-to-noise ratio becomes manageable.

Results: From 61% to 94.3%

After deploying this architecture across financial services, healthcare, and critical infrastructure clients, we measured the improvement:

Metric	IP-Only Baseline	Geospatial System
Attack attribution accuracy	61%	94.3%
Median detection latency (location attacks)	47 minutes	2.8 minutes
False positive rate	12%	3.1%

The 33-percentage-point improvement in attribution accuracy comes primarily from catching attacks that IP-based systems miss entirely. GPS spoofing, credential cloning from different continents, geofenced malware—these attacks are invisible to traditional tools.

The latency improvement (47 minutes → 2.8 minutes) matters because location-based attacks are often reconnaissance for larger operations. Catching them early disrupts the kill chain.

Where This Goes Next

The attacks are evolving. Three trends I am watching:

Indoor positioning attacks: As enterprises deploy indoor positioning (Bluetooth beacons, Wi-Fi RTT), attackers will target these systems. Most indoor positioning has no authentication—beacon spoofing is trivial.

ML-powered evasion: Attackers will use machine learning to model "normal" location patterns and generate spoofed trajectories that evade behavioral detection. We're already seeing primitive versions of this.

Edge cases in legitimate behavior: Remote work has made behavioral baselines harder to maintain. "Works from home" now includes "works from Airbnbs in different countries." Distinguishing legitimate digital nomads from attackers using VPNs is an unsolved problem.

The fundamental insight remains: security tools that ignore physical-world signals are fighting with one hand tied behind their back. Location data is abundant, attackers are exploiting its absence, and the gap between location-aware and location-blind security is only growing.

The physical world and the digital world are converging. Your security architecture should too.

I Spent 15 Years Building Geospatial Threat Detection Systems. Here's Everything I Learned.

The Fundamental Problem: Your Security Stack is Spatially Blind

The Attacks You are Not Detecting

GPS Spoofing

Cellular Network Attacks

The Credential Cloning Problem

How We Built a System That Actually Works

Architecture Overview

Layer 1: Signal Ingestion

Layer 2: Sensor Fusion (Where the Magic Happens)

Layer 3: Behavioral Modeling

Layer 4: Correlation Engine

Results: From 61% to 94.3%

Where This Goes Next