
Detection Engine

This specification defines the detection engine: rule-based detectors vs. ML classifiers vs. reputation lookups, their inputs/outputs, confidence scoring, and evasion resistance.

Audience: Detection/ML engineers, security auditors.


| In-Scope | Out-of-Scope |
| --- | --- |
| Detector types and their inputs/outputs | ML model architectures and training data |
| Confidence scoring and calibration | Cloud Enrichment API internals |
| Evasion considerations and mitigations | Policy evaluation logic (see Policy Engine) |
| Detection pipeline stage mapping | Event persistence and delivery |

1. Rule-Based Detectors (Stage 1)

Runtime: Local device, deterministic
Target latency: < 10 ms per signal

| Input | Output | Examples |
| --- | --- | --- |
| Normalized signal (call metadata, URL, app signature) | Match/no-match + confidence (0.0 or 1.0) | Known fraud number database, malicious domain signatures, STIR/SHAKEN attestation failure |

Characteristics:

  • Deterministic: identical inputs → identical outputs
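  • No gradients: not exposed to gradient-based adversarial ML attacks (evasion by changing the matched pattern is still possible, see the evasion table)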
  • High precision, limited recall (only catches known patterns)
  • Updated via signed threat signature packages
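The characteristics above can be illustrated with a minimal sketch of an exact-match detector. The names `RuleResult`, `rule_based_detect`, and `KNOWN_FRAUD_NUMBERS` are hypothetical and are not taken from the workspace code. The sketch only assumes that a rule is a set-membership test on a normalized signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleResult:
    matched: bool
    confidence: float  # always 0.0 or 1.0 for a rule-based detector


def rule_based_detect(signal: str, signatures: frozenset) -> RuleResult:
    """Exact-match lookup: identical inputs always yield identical outputs."""
    matched = signal in signatures
    return RuleResult(matched=matched, confidence=1.0 if matched else 0.0)


# Hypothetical signature set; real entries would arrive in a signed package.
KNOWN_FRAUD_NUMBERS = frozenset({"+15550100", "+15550101"})

hit = rule_based_detect("+15550100", KNOWN_FRAUD_NUMBERS)
miss = rule_based_detect("+15550199", KNOWN_FRAUD_NUMBERS)
```

Because the lookup only recognizes entries already in the set, the sketch also shows why recall is limited to known patterns.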

2. ML Classifiers (Stage 2)

Runtime: Local device, neural network inference
Target latency: < 100 ms per signal
Model size: < 50 MB (quantized INT8/FP16)

| Input | Output | Examples |
| --- | --- | --- |
| Feature-extracted signal representation | Confidence score (0.0–1.0) + threat category | Social engineering NLP patterns in text, suspicious app behavior analysis, URL feature extraction (domain age, certificate status, brand similarity) |

Characteristics:

  • Probabilistic: outputs continuous confidence scores
  • Quantized models: reduced susceptibility to gradient-based adversarial attacks
  • Model integrity: signed delivery, signature verification before loading, no-downgrade (rollback protection)
  • Regular updates via Model Update Service (signed packages)
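The model integrity checks listed above (verification before loading, rollback protection) can be sketched as follows. This is an illustration under stated assumptions, not the delivery mechanism itself: an HMAC tag stands in for the asymmetric signature a real signed package would carry, and `verify_model_package` and `min_allowed_version` are hypothetical names.

```python
import hashlib
import hmac


def verify_model_package(payload: bytes, tag: bytes, key: bytes,
                         version: int, min_allowed_version: int) -> bool:
    """Accept a model only if its tag verifies and it is not a downgrade."""
    # Assumption: HMAC-SHA256 stands in for a real signature scheme.
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return False  # corrupted or tampered package
    # Rollback protection: refuse anything older than the recorded minimum.
    return version >= min_allowed_version


key = b"demo-key"
payload = b"model-bytes"
tag = hmac.new(key, payload, hashlib.sha256).digest()

accepted = verify_model_package(payload, tag, key, version=5, min_allowed_version=5)
downgrade = verify_model_package(payload, tag, key, version=4, min_allowed_version=5)
corrupted = verify_model_package(b"model-bytez", tag, key, version=6, min_allowed_version=5)
```

Checking integrity before the version keeps an attacker from using a forged version field to influence the decision.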

ML capabilities deployed:

| Capability | Method | Input |
| --- | --- | --- |
| Social engineering detection | NLP models | Text messages (in-memory, discarded after) |
| Suspicious attachment pre-analysis | Image classification | Attachment metadata and thumbnails |
| URL analysis | Feature extraction | Domain age, certificate status, brand similarity, homoglyph detection |
| Behavioral analysis | Pattern recognition | Communication patterns, app usage anomalies |
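As a rough illustration of the URL analysis row, the sketch below derives two of the listed features (brand similarity and homoglyph detection) from a domain label. The homoglyph map, the brand list, and the function name `url_brand_features` are hypothetical, and `difflib.SequenceMatcher` is used only as a simple stand-in for whatever similarity measure the product uses.

```python
from difflib import SequenceMatcher

# Hypothetical, deliberately tiny homoglyph map (digit look-alikes only).
HOMOGLYPHS = str.maketrans({"0": "o", "1": "l", "3": "e", "5": "s"})
BRANDS = ("paypal", "google")  # hypothetical brand list


def url_brand_features(domain_label: str) -> dict:
    """Return brand-similarity features for one domain label."""
    lowered = domain_label.lower()
    normalized = lowered.translate(HOMOGLYPHS)
    scores = {b: SequenceMatcher(None, normalized, b).ratio() for b in BRANDS}
    brand = max(scores, key=scores.get)
    return {
        "closest_brand": brand,
        "brand_similarity": scores[brand],
        # True when normalization changed the label, i.e. look-alikes were present.
        "homoglyph_substituted": normalized != lowered,
    }


lookalike = url_brand_features("paypa1")
genuine = url_brand_features("paypal")
```

These values are features for a classifier, not verdicts: a high similarity combined with a homoglyph substitution is suspicious, while a high similarity alone also describes the genuine brand domain.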

Documented product target: transformer/CNN model families, cross-platform export formats, and signed OTA model delivery. Not confirmed by the current workspace code.

3. Cloud Enrichment / Reputation Lookups (Stage 3)


Runtime: Cloud (optional, stateless)
Trigger: Only when local detection yields ambiguous confidence (0.3–0.7 range)

| Input | Output | Purpose |
| --- | --- | --- |
| SHA-256 hash of phone number | Risk assessment + campaign attribution | Known fraud number lookup |
| SHA-256 hash of domain/URL | Risk assessment + threat category | Known phishing/malware domain lookup |
| SHA-256 hash of app signature | Risk assessment + distribution info | Known malware signature lookup |
| Anonymized feature vector | Complex classification result | Deepfake detection, advanced NLP for novel attacks |

Characteristics:

  • Stateless: no device ID or user ID in requests
  • Plaintext is never transmitted: only SHA-256 hashes or transformed feature vectors leave the device
  • Feature vectors: dimensionality-reduced, irreversibly transformed before transmission
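A minimal sketch of how a stateless lookup request could be built from the properties above. The request shape (`kind`, `sha256`) and the function name `build_lookup_request` are assumptions for illustration. The Cloud Enrichment API itself is out of scope for this specification.

```python
import hashlib
import json


def build_lookup_request(kind: str, value: str) -> str:
    """Serialize a lookup that carries only a SHA-256 digest of the value."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    # No device ID, user ID, or plaintext value is included in the request.
    return json.dumps({"kind": kind, "sha256": digest}, sort_keys=True)


request_body = build_lookup_request("phone", "+15550100")
```

The value should be normalized before hashing (for example, one canonical phone number format); otherwise the same number produces different digests and the lookup misses.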

Documented product target: transformed feature-vector upload and low-latency cloud enrichment. Not confirmed by the current workspace code.


Escalation rule: Only signals that cannot be classified with sufficient confidence advance to the next stage. This minimizes latency and data exposure.
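The escalation rule can be sketched as a short-circuiting pipeline. Several details here are assumptions rather than documented behavior: stages are modeled as callables returning a confidence, a rule match (1.0) is treated as decisive, the ambiguous band is taken as the half-open interval [0.3, 0.7), and a cloud stage that returns `None` is treated as unreachable.

```python
AMBIGUOUS_LOW, AMBIGUOUS_HIGH = 0.3, 0.7  # ambiguous band from this spec


def is_ambiguous(confidence: float) -> bool:
    # Assumption: lower bound inclusive, upper bound exclusive.
    return AMBIGUOUS_LOW <= confidence < AMBIGUOUS_HIGH


def run_pipeline(signal, rule_stage, ml_stage, cloud_stage=None):
    """Return (stage_name, confidence), stopping at the earliest decisive stage."""
    confidence = rule_stage(signal)
    if confidence == 1.0:  # known pattern matched, no need to go further
        return "rules", confidence
    confidence = ml_stage(signal)
    if not is_ambiguous(confidence) or cloud_stage is None:
        return "ml", confidence
    cloud_confidence = cloud_stage(signal)
    if cloud_confidence is None:  # cloud unreachable, keep the local result
        return "ml", confidence
    return "cloud", cloud_confidence


decided_by_rules = run_pipeline("s", lambda s: 1.0, lambda s: 0.5)
decided_by_ml = run_pipeline("s", lambda s: 0.0, lambda s: 0.1, lambda s: 0.9)
decided_by_cloud = run_pipeline("s", lambda s: 0.0, lambda s: 0.5, lambda s: 0.9)
cloud_down = run_pipeline("s", lambda s: 0.0, lambda s: 0.5, lambda s: None)
```

In this sketch the cloud stage is never called unless the local result falls in the ambiguous band, which is what limits both latency and data exposure.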


| Range | Interpretation | Default Action |
| --- | --- | --- |
| 0.0–0.3 | No threat detected | Allow |
| 0.3–0.7 | Suspicious, insufficient confidence | Warn (user informed) |
| 0.7–1.0 | High confidence threat | Block (automatic protection) |
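The mapping from confidence to default action can be written as a small function. The ranges in the table share their boundary values, so this sketch has to pick a convention: it assumes that 0.3 maps to Warn and 0.7 maps to Block. The function name `default_action` is hypothetical.

```python
def default_action(confidence: float) -> str:
    """Map a confidence score to the default action from the threshold table."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be within [0.0, 1.0]")
    if confidence < 0.3:
        return "allow"
    if confidence < 0.7:  # assumption: 0.3 is inclusive, 0.7 is exclusive
        return "warn"
    return "block"


actions = [default_action(c) for c in (0.0, 0.29, 0.3, 0.69, 0.7, 1.0)]
```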

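Calibration: Confidence scores represent the model’s estimated probability that the signal is a true threat. A score of 0.8 means the model estimates an 80% probability of a threat, so for a well-calibrated model roughly 80% of all signals scored 0.8 should turn out to be true threats.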
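One common way to check this property against a labeled validation set is the expected calibration error (ECE), which compares the mean confidence with the observed threat rate per score bin. The sketch below is a generic ECE computation. The function name and the choice of ten equal-width bins are assumptions, and nothing here is taken from the workspace code.

```python
def expected_calibration_error(scores, labels, n_bins=10):
    """Weighted mean gap between mean confidence and observed threat rate."""
    bins = [[] for _ in range(n_bins)]
    for score, label in zip(scores, labels):
        index = min(int(score * n_bins), n_bins - 1)  # score 1.0 goes to last bin
        bins[index].append((score, label))
    total = len(scores)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_confidence = sum(s for s, _ in bucket) / len(bucket)
        threat_rate = sum(y for _, y in bucket) / len(bucket)
        ece += len(bucket) / total * abs(mean_confidence - threat_rate)
    return ece


# Five signals scored 0.8, four of which are true threats: well calibrated.
well_calibrated = expected_calibration_error([0.8] * 5, [1, 1, 1, 1, 0])
# Five signals scored 0.8, none of which is a threat: badly overconfident.
overconfident = expected_calibration_error([0.8] * 5, [0, 0, 0, 0, 0])
```

A value near zero indicates that scores can be read as probabilities; a rising value over time is one possible signal of calibration drift.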

Documented product target: calibrated confidence outputs and threshold configuration. The current workspace policy path uses simpler prototype logic instead.

When the Context Risk Engine correlates multiple signals, it produces a compound risk score that supersedes individual confidence scores. The compound score uses multiplicative combination, not additive.
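The specification states only that the combination is multiplicative and does not give the formula. As one possible reading, the sketch below uses a noisy-OR combination, which multiplies the per-signal probabilities of "no threat". This is an assumption for illustration, and `compound_risk` is a hypothetical name.

```python
import math


def compound_risk(confidences):
    """Noisy-OR: 1 minus the product of each signal's no-threat probability."""
    return 1.0 - math.prod(1.0 - c for c in confidences)


two_weak_signals = compound_risk([0.5, 0.5])
single_signal = compound_risk([0.4])
```

With two signals at 0.5, this yields 0.75, whereas a naive sum would already saturate at 1.0. A multiplicative form stays within [0.0, 1.0] and never falls below the strongest individual signal.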


| Evasion Technique | Affected Detector | Mitigation |
| --- | --- | --- |
| Known number rotation | Rule-based (fraud database) | Cloud Enrichment updates the database continuously. ML models detect behavioral patterns independent of the number. |
| Domain rotation (< 24 h) | Rule-based (domain reputation) | ML URL analysis uses feature extraction (domain age, certificate, brand similarity) independent of the reputation database. |
| Adversarial text perturbation | ML NLP classifiers | Quantized models are less susceptible to gradient attacks. Multiple feature families are used (not solely text embeddings). |
| Legitimate tool abuse (PowerShell, curl) | App signature matching | Behavioral analysis detects anomalous usage patterns of legitimate tools (e.g., PowerShell executing encoded commands during an active call). |
| Slow escalation (multi-day) | Temporal correlation | Context Risk Engine tracks risk accumulation over configurable time windows. |
| Encrypted channel attacks | Content-based detection | Metadata analysis (timing, patterns, contact reputation) provides partial detection without content access. |

| Failure | Impact | Mitigation |
| --- | --- | --- |
| Model file corrupted | ML detection unavailable | Signature verification rejects corrupted models. Agent falls back to rule-based detection only. |
| Model too old | Reduced detection of novel threats | No-downgrade policy enforces forward-only updates. Agent warns user about outdated models after configurable period. |
| Cloud unreachable | No Stage 3 enrichment | Ambiguous signals (0.3–0.7) default to Warn action instead of Block. User informed of reduced protection. |
| Rule database empty | No heuristic detection | ML detection still operational. Agent flags missing database as error condition. |
| Confidence miscalibration | Elevated false positives or false negatives | Continuous model evaluation against labeled validation sets. Calibration drift triggers model update. |
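Two rows of the failure table (model rejected, rule database empty) amount to choosing which local detectors remain usable. The sketch below shows one way to express that choice. `plan_detection` and its parameters are hypothetical, and the behavior when both detectors are unavailable is deliberately left to the caller.

```python
def plan_detection(model_verified: bool, rule_db_entries: int):
    """Return (usable detectors, error conditions to flag)."""
    detectors, errors = [], []
    if rule_db_entries > 0:
        detectors.append("rules")
    else:
        errors.append("rule database empty")  # ML detection stays operational
    if model_verified:
        detectors.append("ml")
    else:
        errors.append("model rejected")  # fall back to rule-based detection only
    return detectors, errors


healthy = plan_detection(model_verified=True, rule_db_entries=10)
rules_only = plan_detection(model_verified=False, rule_db_entries=10)
ml_only = plan_detection(model_verified=True, rule_db_entries=0)
```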

Documented product target: safe-mode fallback for simultaneous detector failure. Not confirmed by the current workspace code.