Overview
Guardian API uses a multi-model ensemble architecture that combines machine-learning models with rule-based heuristics to provide comprehensive content moderation.

System Architecture
Components
1. Preprocessing Layer
The preprocessing layer prepares text for model inference.

Preprocessing Steps
Text Cleaning:
- Removes URLs (http://, https://, www.)
- Removes user mentions (@username)
- Converts emojis to text descriptions
- Normalizes whitespace
- Converts to lowercase
Feature Extraction:
- Text length
- Caps-abuse detection (>70% uppercase)
- Character-repetition detection (3+ repeated characters)
- Exclamation mark count
- Sentiment indicators
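The cleaning and feature-extraction steps above can be sketched as follows. This is an illustrative sketch, not the contents of `backend/app/core/preprocessing.py`; emoji conversion and sentiment indicators are omitted since they need external resources.

```python
import re

def clean_text(text: str) -> str:
    """Cleaning steps: strip URLs and mentions, normalize whitespace, lowercase."""
    text = re.sub(r"(?:https?://|www\.)\S+", "", text)  # remove URLs
    text = re.sub(r"@\w+", "", text)                    # remove user mentions
    text = re.sub(r"\s+", " ", text).strip()            # normalize whitespace
    return text.lower()                                 # lowercase

def extract_features(text: str) -> dict:
    """Simple signals computed on the raw (pre-cleaning) text."""
    letters = [c for c in text if c.isalpha()]
    caps_ratio = sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    return {
        "length": len(text),
        "caps_abuse": caps_ratio > 0.70,                        # >70% uppercase
        "char_repetition": bool(re.search(r"(.)\1{2,}", text)),  # 3+ repeated chars
        "exclamations": text.count("!"),
    }
```

Features are computed before lowercasing, since caps-abuse detection would otherwise always be false.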
backend/app/core/preprocessing.py

2. Model Layer
Three independent models analyze the content in parallel.

Sexism Classifier
- Type: LASSO Regression
- Training: ~40k tweets
- Features: 2,500 n-grams (1-2)
- Output: Binary classification + confidence score
- Threshold: 0.400
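A minimal scoring sketch for a classifier like this, assuming a pickled vectorizer and a LASSO-regularized linear model (the artifact paths and function names are hypothetical, not the actual API):

```python
import pickle

def load_sexism_classifier(vectorizer_path: str, model_path: str):
    """Load hypothetical pickled artifacts: an n-gram vectorizer and a LASSO model."""
    with open(vectorizer_path, "rb") as f:
        vectorizer = pickle.load(f)  # n-gram vectorizer over 2,500 (1-2)-gram features
    with open(model_path, "rb") as f:
        model = pickle.load(f)       # LASSO-regularized linear model
    return vectorizer, model

def score_sexism(text: str, vectorizer, model, threshold: float = 0.400) -> dict:
    """Vectorize, score, clamp to [0, 1], and apply the binary threshold."""
    score = float(model.predict(vectorizer.transform([text]))[0])
    score = min(max(score, 0.0), 1.0)  # clamp regression output to [0, 1]
    return {"label": score >= threshold, "confidence": score}
```

Because a LASSO model is a regressor, its raw output is clamped before being treated as a confidence score; the 0.400 threshold is the one documented above.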
Toxicity Model
- Type: HuggingFace Transformer
- Model: unitary/unbiased-toxic-roberta
- Labels: 6 toxicity categories
- Device: CUDA if available
- Fallback: CPU with warning
Rule Engine
- Type: Heuristic-based
- Rules: JSON configuration files
- Checks: Slurs, threats, self-harm, profanity
- Pattern Matching: Regex + exact matching
- Extensible: Easy to add new rules
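A minimal rule-engine sketch combining regex and exact-word matching. The rule schema below is an assumption for illustration, not the actual format of the JSON files in backend/app/models/rules/:

```python
import re

# In the real engine these rules are loaded from JSON configuration files;
# this inline schema is a hypothetical stand-in.
EXAMPLE_RULES = [
    {"category": "threat", "type": "regex", "pattern": r"\bgoing to hurt\b"},
    {"category": "profanity", "type": "exact", "pattern": "damn"},
]

def check_rules(text: str, rules=EXAMPLE_RULES) -> list:
    """Return the categories triggered by regex or exact-word matching."""
    hits = []
    words = set(text.lower().split())
    for rule in rules:
        if rule["type"] == "regex":
            if re.search(rule["pattern"], text, flags=re.IGNORECASE):
                hits.append(rule["category"])
        elif rule["type"] == "exact":
            if rule["pattern"] in words:
                hits.append(rule["category"])
    return hits
```

Adding a new check is then just a matter of dropping another rule object into a JSON file, with no code changes.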
3. Ensemble Layer
The ensemble layer combines outputs from all three models:

1. Score Aggregation
Weighted fusion of model scores:
- 35% Sexism classifier
- 35% Toxicity model
- 30% Rule engine
2. Conflict Resolution
Rule-based detections override low model scores for critical issues:
- Threat detected → Override ensemble
- Self-harm detected → Override ensemble
- Slur detected → Boost ensemble score
3. Severity Calculation
Maps scores to severity levels:
- 0.0 - 0.3: Low
- 0.3 - 0.6: Moderate
- 0.6 - 1.0: High
4. Primary Issue
Identifies the main concern based on the highest individual model score
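The four steps above can be sketched as a single function. The weights, severity bands, and override rules come from this page; the function and key names, and the slur boost amount, are illustrative assumptions rather than the actual ensemble.py API:

```python
# Documented fusion weights: 35% sexism, 35% toxicity, 30% rules.
WEIGHTS = {"sexism": 0.35, "toxicity": 0.35, "rules": 0.30}

def severity(score: float) -> str:
    """Map a score to the documented severity bands."""
    if score < 0.3:
        return "low"
    if score < 0.6:
        return "moderate"
    return "high"

def ensemble(scores: dict, rule_hits: list) -> dict:
    # 1. Score aggregation: weighted fusion of model scores
    fused = sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)
    # 2. Conflict resolution: critical rule detections take precedence
    if "threat" in rule_hits or "self_harm" in rule_hits:
        fused = 1.0                    # override ensemble
    elif "slur" in rule_hits:
        fused = min(1.0, fused + 0.2)  # boost ensemble score (amount is an assumption)
    # 3. Severity calculation, and 4. primary issue from the highest model score
    primary = max(scores, key=scores.get)
    return {"score": round(fused, 3), "severity": severity(fused), "primary_issue": primary}
```

For example, scores of 0.8/0.4/0.1 with no rule hits fuse to 0.35·0.8 + 0.35·0.4 + 0.30·0.1 = 0.45, i.e. "moderate" severity with "sexism" as the primary issue.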
backend/app/core/ensemble.py
4. Response Layer
Structured JSON response with comprehensive moderation data. See Response Structure for details.

Design Principles
Parallel Processing
All models run in parallel for optimal performance.

Fail-Safe Design
- Model failures don’t crash the API
- Toxicity model falls back to CPU if GPU unavailable
- Rate limiting fails open (allows requests if Redis down)
- Individual model errors logged but don’t block response
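The parallel execution and per-model error isolation described above can be sketched with asyncio. The runner names and the mapping of model name to scoring coroutine are illustrative assumptions:

```python
import asyncio

async def run_model(name, fn, text):
    """Run one model; failures are logged instead of propagating."""
    try:
        return name, await fn(text)
    except Exception as exc:  # individual model errors don't block the response
        print(f"model {name!r} failed: {exc}")
        return name, None

async def moderate(text, models):
    """models: mapping of model name -> async scoring coroutine function."""
    results = await asyncio.gather(
        *(run_model(name, fn, text) for name, fn in models.items())
    )
    # Drop failed models so one error never crashes the API.
    return {name: score for name, score in results if score is not None}
```

Because every model coroutine is wrapped in its own try/except, `asyncio.gather` never sees an exception, and a GPU failure in one model still yields a response built from the remaining scores.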
Extensibility
Easy to add new models or rules:
- New ML Model: Implement in backend/app/models/
- New Rules: Add a JSON file in backend/app/models/rules/
- New Ensemble Logic: Modify backend/app/core/ensemble.py
Performance
Response Time
- Average: 20-40ms
- With GPU: 15-25ms
- Batch requests: 5-10ms per text
Throughput
- Single instance: ~50 req/sec
- With rate limiting: Configurable
- Scalable: Horizontal scaling supported
Technology Stack
| Layer | Technology | Purpose |
|---|---|---|
| API Framework | FastAPI | REST API, async support |
| Server | Uvicorn | ASGI server |
| ML Framework | PyTorch, Scikit-learn | Model inference |
| NLP | HuggingFace Transformers | Toxicity detection |
| Rate Limiting | Redis (Upstash) | Optional rate limiting |
| Validation | Pydantic v2 | Request/response validation |
Deployment Architecture
- Single Instance
- Load Balanced
- Containerized