Architecture — idpishield
Design Philosophy
idpishield follows a tiered defense architecture that prioritizes speed and simplicity while enabling deep semantic analysis when needed.
Core principle: Add one library, get protection. Add the service for AI-grade protection.
System Overview
┌──────────────────────────────────────────────────────────────────┐│ YOUR APPLICATION ││ (Go · Node.js/TypeScript · Rust · Python · any language) ││ ││ ┌─────────────────────────────────────────┐ ││ │ idpishield CLIENT LIB │ ◄── Tier 1 ││ │ • Domain allowlist check │ ││ │ • Unicode normalization │ ││ │ • Pattern matching (88 patterns) │ ││ │ • Risk scoring (0–100) │ ││ │ • Multi-language (EN/FR/ES/DE/JA) │ ││ │ • Sub-millisecond response │ ││ │ • Bounded decoding + input limits │ ││ └───────────────┬─────────────────────────┘ ││ │ (optional, score ≥ 60) ││ ▼ ││ ┌─────────────────────────────────────────┐ ││ │ idpishield SERVICE │ ◄── Tier 2 ││ │ (Python microservice, runs separately) │ ││ │ • Semantic similarity detection │ ││ │ • LLM-based intent analysis │ ││ │ • REST API on localhost:7432 │ ││ └─────────────────────────────────────────┘ │└──────────────────────────────────────────────────────────────────┘Analysis Pipeline
Input Text │ ├── [Mode: Fast] ───────────────────────────┐ │ │ ├── [Mode: Balanced/Deep] │ │ │ │ │ ▼ │ │ ┌──────────┐ │ │ │Normalizer│ │ │ │• Strip zero-width chars │ │ │• Map homoglyphs (Cyrillic→Latin) │ │ │• Full-width → ASCII │ │ │• Collapse whitespace │ │ └────┬─────┘ │ │ │ │ ▼ ▼ │┌──────────────┐ ││ Scanner │ ◄──────────────────────────────┘│• 88 compiled regex patterns ││• 7 threat categories ││• 5 languages │└──────┬───────┘ │ │ matches[] │ ▼ │┌──────────────┐ ││ Scorer │ ││• Category-weighted scoring ││• Cross-category amplification ││• Attack chain combo bonuses ││• Score clamped [0, 100] │└──────┬───────┘ │ │ RiskResult (local) │ │ │ ├── [score < 60 OR mode ≠ deep] ───► Return local result │ ├── [score ≥ 60 AND mode = deep AND service configured] │ │ │ ▼ │ ┌──────────────┐ │ │Service Client│ │ │POST /assess │ │ └──────┬───────┘ │ │ │ ┌────┴────┐ │ │ success? │ │ ├─ yes ──► Return service result │ └─ no ──► Retry transient failures, then return local resultScoring Algorithm
Severity Weights
| Severity | Weight | Meaning |
|---|---|---|
| 1 | 10 | Weak signal, likely benign |
| 2 | 15 | Mild signal |
| 3 | 25 | Suspicious |
| 4 | 35 | Very likely attack |
| 5 | 45 | Almost certain attack |
Score Computation
- Category grouping: For each category, take the highest severity match as the primary weight.
- Diminishing returns: Additional matches in the same category add
weight / 5each (max 3 extra). - Cross-category bonus:
+15per additional category beyond the first. - Attack chain bonuses:
- instruction-override + exfiltration:
+20 - jailbreak + instruction-override:
+15 - role-hijack + exfiltration:
+15
- instruction-override + exfiltration:
- Clamp to
[0, 100].
Score → Level Mapping
| Range | Level |
|---|---|
| 0–19 | safe |
| 20–39 | low |
| 40–59 | medium |
| 60–79 | high |
| 80–100 | critical |
Blocking Thresholds
| Mode | Threshold |
|---|---|
| Normal | score ≥ 60 |
| Strict | score ≥ 40 |
Normalization Strategy
The normalizer defeats common obfuscation techniques used in prompt injection attacks:
-
Invisible character stripping: Removes 20+ zero-width and invisible Unicode characters (U+200B, U+200C, U+200D, U+FEFF, etc.) that attackers insert between letters to break pattern matching.
-
Homoglyph mapping: Maps 50+ visually similar characters from Cyrillic, Greek, and mathematical symbol ranges to their ASCII equivalents. Example: Cyrillic ‘а’ (U+0430) → Latin ‘a’.
-
Full-width normalization: Converts full-width ASCII variants (U+FF01–U+FF5E) to standard ASCII (U+0021–U+007E).
-
Whitespace collapsing: Multiple whitespace characters (spaces, tabs, newlines) are collapsed to a single space.
Thread Safety
All components are safe for concurrent use:
- Scanner: Read-only after construction (compiled regex patterns).
- Normalizer: Stateless — pure function.
- Domain checker: Read-only after construction.
- Service client: Uses
http.Clientwhich is concurrency-safe. - Client: Composes all of the above — no mutable state during operation.
Runtime Guardrails
The current implementation includes lightweight production guardrails:
- Input bounding:
MaxInputByteslimits analysis payload size. - Decode bounding:
MaxDecodeDepthandMaxDecodedVariantscap obfuscation expansion. - Retry controls:
ServiceRetriesretries transient deep-service failures. - Circuit breaker:
ServiceCircuitFailureThreshold+ServiceCircuitCooldownprevent repeated service stalls from impacting local scans. - MCP HTTP auth: Optional bearer/API-key auth (
--auth-token) for streamable HTTP transport.
Circuit breaker scope:
- Breaker state is process-local (per running
idpishieldinstance). - In horizontally scaled deployments, each instance tracks failures independently.
- For shared breaker behavior across replicas, use an external shared-state mechanism in your platform layer.
Profile safety note:
defaultprofile is intentionally lightweight for local/dev usage.- Use
--profile productionin production startup commands and deployment manifests.
Dependencies
The root library primarily uses Go standard library components plus targeted external packages where needed (for example Unicode normalization). CLI/MCP adapter components have additional dependencies.