
System Requirements

Minimum Requirements

  • Python: 3.9 or higher
  • RAM: 4GB minimum (8GB recommended)
  • Storage: 2GB for models and dependencies
  • OS: Linux, macOS, or Windows

Recommended Requirements

For optimal performance:
  • Python: 3.11
  • RAM: 8GB or more
  • GPU: CUDA-capable GPU (NVIDIA) for faster inference
  • CUDA: 12.1 or compatible version
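
To check a machine against the requirements above before installing, a small script along these lines may help. It is a minimal sketch, not part of Guardian API: the function names `python_status` and `cuda_status` are hypothetical, and treating any version at or above 3.11 as "recommended" is an assumption.

```python
import sys

MIN_PYTHON = (3, 9)           # minimum version from the requirements list
RECOMMENDED_PYTHON = (3, 11)  # recommended version from the requirements list


def python_status(version=None):
    """Classify a (major, minor) version against the listed requirements."""
    major_minor = tuple((version or sys.version_info)[:2])
    if major_minor < MIN_PYTHON:
        return "unsupported"
    if major_minor < RECOMMENDED_PYTHON:
        return "supported"
    return "recommended"  # assumption: 3.11 or newer counts as recommended


def cuda_status():
    """Report whether PyTorch can see a CUDA device, if PyTorch is installed."""
    try:
        import torch  # only available after the dependencies are installed
    except ImportError:
        return "torch not installed"
    return "CUDA available" if torch.cuda.is_available() else "CPU only"


if __name__ == "__main__":
    print("Python:", python_status())
    print("GPU:", cuda_status())
```

Running it before `pip install` reports `torch not installed` for the GPU line, which is expected at that stage.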

Installation Methods

  • Standard Installation
  • With GPU Support
  • Development Setup

Clone the Repository

git clone https://github.com/Ksmith18skc/GuardianAPI.git
cd GuardianAPI

Install Python Dependencies

cd backend
pip install -r requirements.txt

Train the Sexism Model

cd ..
python scripts/train_and_save_sexism_model.py
Installation complete! The API is ready to run.

Dependency Overview

Guardian API uses the following major dependencies:
Package        Purpose                    Version
fastapi        Web framework              Latest
uvicorn        ASGI server                Latest
torch          PyTorch for ML             2.1.0+
transformers   HuggingFace models         Latest
scikit-learn   LASSO classifier           Latest
redis          Rate limiting (optional)   Latest
pydantic       Request validation         v2+
A complete list of dependencies is available in backend/requirements.txt.
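
To see which of these packages are installed in the active environment, and at what version, the standard library's `importlib.metadata` can be queried. This is a sketch; the helper name `installed_versions` is hypothetical, and the package list is copied from the table above rather than read from backend/requirements.txt.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the dependency table.
PACKAGES = [
    "fastapi",
    "uvicorn",
    "torch",
    "transformers",
    "scikit-learn",
    "redis",
    "pydantic",
]


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name:14} {ver or 'not installed'}")
```

Since redis is listed as optional, a `not installed` entry for it is not necessarily an error.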

Training the Sexism Model

The sexism classifier must be trained before running the API. The training script uses the dataset in data/train_sexism.csv.

Training Process

python scripts/train_and_save_sexism_model.py
This script:
  1. Loads training data (~40k tweets)
  2. Trains a LASSO regression model with CountVectorizer
  3. Optimizes threshold for best F1 score
  4. Saves model files to backend/app/models/sexism/
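
Step 3 can be pictured as a sweep over candidate thresholds that keeps the one with the highest F1 score. The sketch below uses plain Python on toy data to show the idea. It is not the code in scripts/train_and_save_sexism_model.py: the function names, the candidate grid, and the tie-breaking rule are all assumptions.

```python
def f1_score(labels, predictions):
    """F1 for binary labels and predictions (1 = positive class)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(labels, scores, candidates=None):
    """Return (threshold, f1) for the candidate that maximizes F1.

    Ties keep the lowest threshold, because only a strictly better
    score replaces the current best.
    """
    if candidates is None:
        candidates = [i / 100 for i in range(5, 96, 5)]  # 0.05 .. 0.95 (assumed grid)
    best_t, best_f1 = None, -1.0
    for threshold in candidates:
        predictions = [1 if s >= threshold else 0 for s in scores]
        score = f1_score(labels, predictions)
        if score > best_f1:
            best_t, best_f1 = threshold, score
    return best_t, best_f1


if __name__ == "__main__":
    # Toy regression scores standing in for the model's raw outputs.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.10, 0.30, 0.42, 0.80]
    print(best_threshold(toy_labels, toy_scores))
```

A regression model such as LASSO outputs continuous scores, so some cutoff is needed to turn them into class labels. Tuning that cutoff on held-out data is one common way to do it.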

Model Files

After training, you’ll have:
  • classifier.pkl - LASSO regression model
  • vectorizer.pkl - CountVectorizer with 2500 features
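
A quick way to confirm that both files are present and readable is sketched below. It assumes the files are ordinary pickles (suggested by the .pkl extension but not confirmed here) and that it runs from the repository root. The helper names are hypothetical. Unpickling the real files also needs scikit-learn installed, and pickle files should only be loaded if you created them yourself.

```python
import pickle
from pathlib import Path

MODEL_DIR = Path("backend/app/models/sexism")  # output directory from the steps above
EXPECTED_FILES = ("classifier.pkl", "vectorizer.pkl")


def missing_model_files(model_dir=MODEL_DIR):
    """List the expected model files that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (model_dir / name).is_file()]


def load_artifacts(model_dir=MODEL_DIR):
    """Unpickle both files and return them keyed by file stem.

    Assumption: the training script wrote them with pickle.
    """
    missing = missing_model_files(model_dir)
    if missing:
        raise FileNotFoundError(f"Missing model files: {', '.join(missing)}")
    artifacts = {}
    for name in EXPECTED_FILES:
        with open(Path(model_dir) / name, "rb") as handle:
            artifacts[Path(name).stem] = pickle.load(handle)
    return artifacts


if __name__ == "__main__":
    print("Missing files:", missing_model_files() or "none")
```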

Training Output

Training LASSO model...
Model trained with test F1 score: 0.82
Optimal threshold: 0.400
Files saved:
  - backend/app/models/sexism/classifier.pkl
  - backend/app/models/sexism/vectorizer.pkl

Verification

Verify Installation

Check that all models load correctly:
cd backend
python -c "from app.models.sexism_classifier import SexismClassifier; print('✓ Sexism classifier loads')"
python -c "from app.models.toxicity_model import ToxicityModel; print('✓ Toxicity model loads')"
python -c "from app.models.rule_engine import RuleEngine; print('✓ Rule engine loads')"
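
The three one-liners can also be combined into one script that reports every failure instead of stopping at the first. This is a sketch with a hypothetical helper name, `check_imports`. The module and class names are the ones used in the commands above, and it must be run from the backend directory. Like the one-liners, it only confirms that the imports resolve. Whether model weights are loaded at import time depends on the project's code.

```python
import importlib

# Module path -> class expected inside it (taken from the commands above).
MODULES = {
    "app.models.sexism_classifier": "SexismClassifier",
    "app.models.toxicity_model": "ToxicityModel",
    "app.models.rule_engine": "RuleEngine",
}


def check_imports(targets=MODULES):
    """Return {module: None} on success or {module: error message} on failure."""
    results = {}
    for module_name, attribute in targets.items():
        try:
            module = importlib.import_module(module_name)
            getattr(module, attribute)
            results[module_name] = None
        except (ImportError, AttributeError) as exc:
            results[module_name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    for name, error in check_imports().items():
        print(f"{'OK  ' if error is None else 'FAIL'} {name} {error or ''}")
```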

Run Tests

cd backend
pytest -v
Expected output:
========== 72 passed in 12.34s ==========

Start the API

uvicorn app.main:app --reload
Visit http://localhost:8000/docs to see the interactive API documentation.
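
To check from a script that the server is answering, the docs URL can be requested with the standard library. This is a minimal sketch. The helper name `api_is_up` is hypothetical, and it treats any response other than HTTP 200 as "not up".

```python
import urllib.error
import urllib.request


def api_is_up(url="http://localhost:8000/docs", timeout=3.0):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx HTTP error.
        return False


if __name__ == "__main__":
    print("API reachable:", api_is_up())
```

If you started uvicorn on a different port, pass the matching URL, for example `api_is_up("http://localhost:8080/docs")`.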

Common Issues

Problem: Python can’t find installed packages
Solution:
  1. Ensure you’re in the correct directory
  2. Activate your virtual environment if using one
  3. Reinstall dependencies: pip install -r requirements.txt
Problem: API can’t load classifier.pkl or vectorizer.pkl
Solution:
  1. Ensure you’ve run the training script
  2. Check that files exist: ls backend/app/models/sexism/
  3. If missing, run: python scripts/train_and_save_sexism_model.py
Problem: GPU runs out of memory
Solution:
  1. Close other GPU-intensive applications
  2. Reduce batch size (if applicable)
  3. Fall back to CPU by installing the CPU-only build of PyTorch instead of the CUDA-enabled one
Problem: Port 8000 is already occupied
Solution:
# Use a different port
uvicorn app.main:app --reload --port 8080
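
To find out whether a port is taken before starting the server, you can try to bind to it. The sketch below uses hypothetical helpers, `port_is_free` and `first_free_port`. It checks 127.0.0.1, which is uvicorn's default host.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start=8000, attempts=20):
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    return None


if __name__ == "__main__":
    print("Suggested port:", first_free_port())
```

The result is only a snapshot, because another process can still claim the port between the check and the moment uvicorn starts.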
Problem: Can’t connect to Redis for rate limiting
Solution:
  • Rate limiting is optional and will be disabled if Redis isn’t available
  • Check your REDIS_URL in .env
  • Verify Redis is running if using local Redis
  • For Upstash, ensure the URL is correct
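
To narrow down whether the problem is the URL or the network, the sketch below parses a Redis URL and attempts a plain TCP connection. The helper names are hypothetical. It assumes `REDIS_URL` is exported in the environment (it does not read .env itself), and it checks reachability only, not authentication or TLS.

```python
import os
import socket
from urllib.parse import urlparse


def redis_endpoint(url):
    """Return (host, port, uses_tls) parsed from a redis:// or rediss:// URL."""
    parsed = urlparse(url)
    if parsed.scheme not in ("redis", "rediss"):
        raise ValueError(f"Unexpected scheme in Redis URL: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError("Redis URL has no host")
    # 6379 is the default Redis port; rediss:// is the TLS variant.
    return parsed.hostname, parsed.port or 6379, parsed.scheme == "rediss"


def redis_reachable(url, timeout=2.0):
    """Return True if a TCP connection to the Redis host and port succeeds."""
    host, port, _ = redis_endpoint(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    url = os.environ.get("REDIS_URL")
    if url is None:
        print("REDIS_URL is not set; rate limiting would be disabled.")
    else:
        print("Endpoint:", redis_endpoint(url)[:2])
        print("Reachable:", redis_reachable(url))
```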
