
System Requirements

Minimum Requirements

  • Python: 3.9 or higher
  • RAM: 4GB minimum (8GB recommended)
  • Storage: 2GB for models and dependencies
  • OS: Linux, macOS, or Windows

For optimal performance:
  • Python: 3.11
  • RAM: 8GB or more
  • GPU: CUDA-capable GPU (NVIDIA) for faster inference
  • CUDA: 12.1 or compatible version
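
The minimums above can be checked before installing anything. The following is a minimal sketch using only the standard library; the function names are illustrative and not part of Guardian API, and it checks only the Python version and free disk space (RAM and GPU are not inspected).

```python
import shutil
import sys

MIN_PYTHON = (3, 9)   # minimum Python version listed above
MIN_FREE_GB = 2.0     # storage listed above for models and dependencies


def meets_python_minimum(minimum=MIN_PYTHON):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info[:2] >= tuple(minimum)


def free_disk_gb(path="."):
    """Return free space, in GB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 1024**3


if __name__ == "__main__":
    print("Python OK:", meets_python_minimum())
    print("Disk OK:", free_disk_gb() >= MIN_FREE_GB)
```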

Installation Methods

Clone the Repository

git clone https://github.com/Ksmith18skc/GuardianAPI.git
cd GuardianAPI

Install Python Dependencies

cd backend
pip install -r requirements.txt

Train the Sexism Model

cd ..
python scripts/train_and_save_sexism_model.py
Installation complete! The API is ready to run.

Dependency Overview

Guardian API uses the following major dependencies:
Package       Purpose                   Version
fastapi       Web framework             Latest
uvicorn       ASGI server               Latest
torch         PyTorch for ML            2.1.0+
transformers  HuggingFace models        Latest
scikit-learn  LASSO classifier          Latest
redis         Rate limiting (optional)  Latest
pydantic      Request validation        v2+
A complete list of dependencies is available in backend/requirements.txt.
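
To see which of these packages are actually installed in the active environment, a small helper along the following lines may be useful. It is a sketch, not a project script: `installed_versions` is a hypothetical name, and it only reports versions without comparing them against the table.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the dependency table above.
PACKAGES = [
    "fastapi", "uvicorn", "torch", "transformers",
    "scikit-learn", "redis", "pydantic",
]


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if absent."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    for name, found in installed_versions().items():
        print(f"{name}: {found or 'not installed'}")
```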

Training the Sexism Model

The sexism classifier must be trained before running the API. The training script uses the dataset in data/train_sexism.csv.

Training Process

python scripts/train_and_save_sexism_model.py
This script:
  1. Loads training data (~40k tweets)
  2. Trains a LASSO regression model on CountVectorizer features
  3. Optimizes threshold for best F1 score
  4. Saves model files to backend/app/models/sexism/
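
Step 3 can be illustrated with a small threshold sweep. This is a pure-Python sketch of the general technique, not the code in train_and_save_sexism_model.py; the function names, the candidate grid, and the toy data are all assumptions made for illustration.

```python
def f1_score(y_true, y_pred):
    """Compute F1 for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true, scores, candidates=None):
    """Return (threshold, f1) for the candidate that maximizes F1.

    A score at or above the threshold is treated as a positive prediction.
    Ties keep the lowest threshold.
    """
    if candidates is None:
        candidates = [i / 100 for i in range(5, 100, 5)]  # 0.05 .. 0.95
    best_t, best_f1 = None, -1.0
    for t in candidates:
        preds = [1 if s >= t else 0 for s in scores]
        f1 = f1_score(y_true, preds)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


if __name__ == "__main__":
    # Toy validation labels and model scores, invented for this example.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.3, 0.6, 0.9]
    print(best_threshold(labels, scores))
```

Tuning the threshold on held-out data rather than on the training set avoids picking a cutoff that only fits the training examples.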

Model Files

After training, you’ll have:
  • classifier.pkl - LASSO regression model
  • vectorizer.pkl - CountVectorizer with 2500 features
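
A quick way to confirm that both files were written is to check for them by name. The sketch below assumes it is run from the repository root and uses the output directory named above; `missing_model_files` is a hypothetical helper, not part of the project.

```python
from pathlib import Path

# File names listed above.
EXPECTED_FILES = ("classifier.pkl", "vectorizer.pkl")


def missing_model_files(model_dir="backend/app/models/sexism",
                        expected=EXPECTED_FILES):
    """Return the expected file names that are not present in `model_dir`."""
    base = Path(model_dir)
    return [name for name in expected if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files present")
```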

Training Output

Training LASSO model...
Model trained with test F1 score: 0.82
Optimal threshold: 0.400
Files saved:
  - backend/app/models/sexism/classifier.pkl
  - backend/app/models/sexism/vectorizer.pkl

Verification

Verify Installation

Check that all models load correctly:
cd backend
python -c "from app.models.sexism_classifier import SexismClassifier; print('✓ Sexism classifier loads')"
python -c "from app.models.toxicity_model import ToxicityModel; print('✓ Toxicity model loads')"
python -c "from app.models.rule_engine import RuleEngine; print('✓ Rule engine loads')"
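
The three one-liners above can also be combined into a single script. The following sketch uses the module and class names from those commands and assumes it is run from the backend directory; `check_imports` is an illustrative helper, and it only confirms that each module imports and exposes the named class.

```python
import importlib

# (module, class) pairs taken from the verification commands above.
TARGETS = [
    ("app.models.sexism_classifier", "SexismClassifier"),
    ("app.models.toxicity_model", "ToxicityModel"),
    ("app.models.rule_engine", "RuleEngine"),
]


def check_imports(targets=TARGETS):
    """Return {"module.Class": bool} showing which targets import cleanly."""
    results = {}
    for module_name, attr in targets:
        key = f"{module_name}.{attr}"
        try:
            module = importlib.import_module(module_name)
            results[key] = hasattr(module, attr)
        except Exception:
            # Any failure during import counts as "does not load".
            results[key] = False
    return results


if __name__ == "__main__":
    for name, ok in check_imports().items():
        print(("OK   " if ok else "FAIL ") + name)
```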

Run Tests

cd backend
pytest -v
Expected output:
========== 72 passed in 12.34s ==========

Start the API

uvicorn app.main:app --reload
Run this command from the backend directory, then visit http://localhost:8000/docs to see the interactive API documentation.
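
If you prefer to confirm from a script that the server is up, a check like the following may help. It is a sketch using only the standard library; `docs_page_is_up` is a hypothetical name, and it tests only that the /docs URL answers with HTTP 200.

```python
import urllib.error
import urllib.request


def docs_page_is_up(url="http://localhost:8000/docs", timeout=3.0):
    """Return True if `url` responds with HTTP 200 within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("API docs reachable:", docs_page_is_up())
```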

Common Issues

Problem: Python can’t find installed packages
Solution:
  1. Ensure you’re in the correct directory
  2. Activate your virtual environment if using one
  3. Reinstall dependencies: pip install -r requirements.txt
Problem: API can’t load classifier.pkl or vectorizer.pkl
Solution:
  1. Ensure you’ve run the training script
  2. Check that files exist: ls backend/app/models/sexism/
  3. If missing, run: python scripts/train_and_save_sexism_model.py
Problem: GPU runs out of memory
Solution:
  1. Close other GPU-intensive applications
  2. Reduce batch size (if applicable)
  3. Fall back to CPU by installing the CPU-only build of PyTorch instead of the CUDA-enabled one
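
To see which device PyTorch would use in the current environment, a check such as the one below can help. It is a sketch: `pick_device` is a hypothetical helper, and how Guardian API itself selects a device is not shown here.

```python
def pick_device():
    """Return "cuda" if PyTorch is installed and sees a GPU, else "cpu"."""
    try:
        import torch  # imported lazily so the check works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Inference device:", pick_device())
```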
Problem: Port 8000 is already occupied
Solution:
# Use a different port
uvicorn app.main:app --reload --port 8080
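
To find out which port is free before choosing one, you can try binding to it. The sketch below uses the standard socket module; the function names and the candidate port list are assumptions for illustration.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(candidates=(8000, 8080, 8081)):
    """Return the first free port among `candidates`, or None if all are taken."""
    for port in candidates:
        if port_is_free(port):
            return port
    return None


if __name__ == "__main__":
    print("Suggested port:", first_free_port())
```

The result is only a snapshot: another process can claim the port between the check and the moment the server starts.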
Problem: Can’t connect to Redis for rate limiting
Solution:
  • Rate limiting is optional and will be disabled if Redis isn’t available
  • Check your REDIS_URL in .env
  • Verify Redis is running if using local Redis
  • For Upstash, ensure the URL is correct
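
A malformed REDIS_URL is a common cause of connection failures, so a quick format check can rule that out. The sketch below is an assumption about what a valid URL looks like (a redis:// or rediss:// scheme plus a host); it reads the process environment rather than the .env file, and `redis_url_problem` is a hypothetical helper.

```python
import os
from urllib.parse import urlparse


def redis_url_problem(url):
    """Return a short description of what is wrong with `url`, or None if it looks fine."""
    if not url:
        return "REDIS_URL is not set"
    parsed = urlparse(url)
    if parsed.scheme not in ("redis", "rediss"):  # rediss:// means TLS
        return f"unexpected scheme {parsed.scheme!r}"
    if not parsed.hostname:
        return "missing host"
    return None


if __name__ == "__main__":
    problem = redis_url_problem(os.environ.get("REDIS_URL", ""))
    print(problem or "REDIS_URL looks well-formed")
```

This checks only the shape of the URL; it does not open a connection or verify credentials.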

Next Steps

Quickstart

Follow the quickstart guide to run your first moderation request

Configuration

Configure environment variables and optional features

Architecture

Learn about the system architecture

API Reference

Explore the API endpoints