System Requirements
Minimum Requirements
- Python: 3.9 or higher
- RAM: 4GB minimum (8GB recommended)
- Storage: 2GB for models and dependencies
- OS: Linux, macOS, or Windows
Recommended Requirements
For optimal performance:
- Python: 3.11
- RAM: 8GB or more
- GPU: CUDA-capable GPU (NVIDIA) for faster inference
- CUDA: 12.1 or compatible version
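The requirements above can be checked from Python before installing anything else. The following is a minimal sketch, not part of Guardian API: the helper names `python_status` and `cuda_available` are hypothetical, and treating any version from 3.11 upward as "recommended" is an assumption. The CUDA check only works once PyTorch is installed and reports `False` otherwise.

```python
import sys

MIN_PYTHON = (3, 9)           # minimum version listed above
RECOMMENDED_PYTHON = (3, 11)  # recommended version listed above


def python_status(version=None):
    """Classify a (major, minor) version against the listed requirements."""
    version = tuple(version or sys.version_info[:2])
    if version < MIN_PYTHON:
        return "unsupported"
    if version < RECOMMENDED_PYTHON:
        return "supported"
    # Assumption: anything at or above 3.11 counts as recommended.
    return "recommended"


def cuda_available():
    """Return True only if PyTorch is importable and reports a CUDA device."""
    try:
        import torch
    except (ImportError, OSError):
        # ImportError: PyTorch is not installed.
        # OSError: PyTorch is installed but a shared library failed to load.
        return False
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print("Python:", python_status())
    print("CUDA available:", cuda_available())
```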
Installation Methods
- Standard Installation
- With GPU Support
- Development Setup
Dependency Overview
Guardian API uses the following major dependencies:
| Package | Purpose | Version |
|---|---|---|
| fastapi | Web framework | Latest |
| uvicorn | ASGI server | Latest |
| torch | PyTorch for ML | 2.1.0+ |
| transformers | HuggingFace models | Latest |
| scikit-learn | LASSO classifier | Latest |
| redis | Rate limiting (optional) | Latest |
| pydantic | Request validation | v2+ |
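To see which of these packages are present in the current environment, and at which version, the standard library's `importlib.metadata` can be queried. This is a sketch for illustration only; the `installed_versions` helper is hypothetical, and the package list is copied from the table above.

```python
from importlib import metadata

# Distribution names as listed in the dependency table.
PACKAGES = [
    "fastapi",
    "uvicorn",
    "torch",
    "transformers",
    "scikit-learn",
    "redis",
    "pydantic",
]


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name:<14} {version or 'not installed'}")
```

A `None` entry for `redis` is not necessarily a problem, since rate limiting is optional.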
A complete list of dependencies is available in backend/requirements.txt.
Training the Sexism Model
The sexism classifier must be trained before running the API. The training script uses the dataset in data/train_sexism.csv.
Training Process
- Loads training data (~40k tweets)
- Trains a LASSO regression model with CountVectorizer
- Optimizes threshold for best F1 score
- Saves model files to backend/app/models/sexism/
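The threshold step is worth illustrating, because a LASSO regression produces continuous scores rather than class labels. The sketch below shows one common way to pick a cutoff that maximizes F1 on labeled data. It is written in plain Python under stated assumptions: the function names are hypothetical, labels are 0 or 1, and the actual training script may search thresholds differently.

```python
def f1_score(labels, predictions):
    """F1 for binary labels (1 = positive class)."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_f1_threshold(scores, labels):
    """Try every observed score as a cutoff and keep the one with the highest F1."""
    best_threshold, best_f1 = 0.5, 0.0  # default returned if no cutoff scores above 0
    for threshold in sorted(set(scores)):
        predictions = [1 if s >= threshold else 0 for s in scores]
        f1 = f1_score(labels, predictions)
        if f1 > best_f1:
            best_threshold, best_f1 = threshold, f1
    return best_threshold, best_f1


if __name__ == "__main__":
    # Toy scores and labels, for illustration only.
    print(best_f1_threshold([0.1, 0.2, 0.6, 0.8], [0, 0, 1, 1]))
```

In practice the threshold should be chosen on held-out data rather than on the training set, otherwise the reported F1 will be optimistic.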
Model Files
After training, you’ll have:
- classifier.pkl - LASSO regression model
- vectorizer.pkl - CountVectorizer with 2500 features
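A quick way to confirm that both files exist and can be deserialized is sketched below. It assumes the files are standard pickle files, as the `.pkl` extension suggests; if the training script saves them with another library, the loading call would need to change. The helper names are hypothetical, and loading the real files also requires scikit-learn to be installed.

```python
import pickle
from pathlib import Path

MODEL_DIR = Path("backend/app/models/sexism")
MODEL_FILES = ("classifier.pkl", "vectorizer.pkl")


def missing_model_files(model_dir=MODEL_DIR):
    """Return the expected model file names that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in MODEL_FILES if not (model_dir / name).is_file()]


def load_models(model_dir=MODEL_DIR):
    """Load both model files, failing early with a clear message if any is missing."""
    missing = missing_model_files(model_dir)
    if missing:
        raise FileNotFoundError(
            f"Missing model files in {model_dir}: {', '.join(missing)}"
        )
    loaded = {}
    for name in MODEL_FILES:
        # Only unpickle files you produced yourself; pickle can execute code.
        with open(Path(model_dir) / name, "rb") as handle:
            loaded[name] = pickle.load(handle)
    return loaded


if __name__ == "__main__":
    print("Missing:", missing_model_files() or "none")
```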
Training Output
Verification
Verify Installation
Check that all models load correctly:
Run Tests
Start the API
Common Issues
ModuleNotFoundError
Problem: Python can’t find installed packages
Solution:
- Ensure you’re in the correct directory
- Activate your virtual environment if using one
- Reinstall dependencies:
pip install -r requirements.txt
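When the cause is unclear, it helps to confirm which interpreter is running and whether a virtual environment is active, since packages installed into one environment are invisible to another. A small diagnostic sketch follows; the helper names are hypothetical, and the `sys.prefix` comparison detects environments created with `venv` or recent `virtualenv` releases.

```python
import importlib.util
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv-style virtual environment."""
    return sys.prefix != sys.base_prefix


def missing_modules(module_names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Virtual environment active:", in_virtualenv())
    # Import names, which can differ from package names (scikit-learn -> sklearn).
    print("Missing modules:", missing_modules(["fastapi", "uvicorn", "torch", "sklearn"]))
```

If the interpreter path printed here is not the one inside your virtual environment, activate the environment and run the check again before reinstalling.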
Model files not found
Problem: API can’t load classifier.pkl or vectorizer.pkl
Solution:
- Ensure you’ve run the training script
- Check that files exist: ls backend/app/models/sexism/
- If missing, run: python scripts/train_and_save_sexism_model.py
CUDA out of memory
Problem: GPU runs out of memory
Solution:
- Close other GPU-intensive applications
- Reduce batch size (if applicable)
- Fall back to CPU by not installing CUDA-enabled PyTorch
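As an alternative to reinstalling PyTorch, a CPU fallback can also be handled at runtime. The sketch below is not how Guardian API is implemented; it assumes a hypothetical inference callable that accepts a `device` keyword. It relies on the fact that PyTorch's CUDA out-of-memory error is a `RuntimeError` whose message contains "out of memory".

```python
def run_with_cpu_fallback(infer, *args, **kwargs):
    """Call infer on the GPU first and retry on the CPU if the GPU runs out of memory.

    `infer` is a hypothetical callable accepting a `device` keyword argument.
    """
    try:
        return infer(*args, device="cuda", **kwargs)
    except RuntimeError as error:
        # Re-raise anything that is not an out-of-memory condition.
        if "out of memory" not in str(error).lower():
            raise
    return infer(*args, device="cpu", **kwargs)


if __name__ == "__main__":
    def fake_infer(text, device):
        # Stand-in for a real model call; simulates a GPU that is always full.
        if device == "cuda":
            raise RuntimeError("CUDA out of memory.")
        return f"{text} scored on {device}"

    print(run_with_cpu_fallback(fake_infer, "example"))
```

With a real model, the weights and inputs would also need to be moved to the CPU before the retry.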
Port already in use
Problem: Port 8000 is already occupied
Solution:
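One way to confirm the conflict and pick an alternative is to probe ports from Python before starting the server. This is a minimal standard-library sketch; the helper names and the search range of 20 ports are assumptions, not part of Guardian API.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start=8000, attempts=20, host="127.0.0.1"):
    """Return the first free port in [start, start + attempts)."""
    for port in range(start, start + attempts):
        if port_is_free(port, host):
            return port
    raise RuntimeError(f"No free port found in {start}-{start + attempts - 1}")


if __name__ == "__main__":
    print("Port 8000 free:", port_is_free(8000))
    print("First free port from 8000:", first_free_port())
```

The result is only a snapshot: another process can claim the port between the check and the server start, so treat it as a hint rather than a guarantee.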
Redis connection errors
Problem: Can’t connect to Redis for rate limiting
Solution:
- Rate limiting is optional and will be disabled if Redis isn’t available
- Check your REDIS_URL in .env
- Verify Redis is running if using local Redis
- For Upstash, ensure the URL is correct
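To test the connection outside the API, you can try a ping with the `redis` Python client. The sketch below illustrates the "optional" behavior described above by returning `None` when Redis cannot be reached; it is an assumption about how such a fallback could look, not Guardian API's actual code, and `connect_redis_or_none` is a hypothetical name. It reads `REDIS_URL` from the process environment, so values stored in `.env` must already be exported or loaded.

```python
import os


def connect_redis_or_none(url=None):
    """Return a connected Redis client, or None if Redis is unavailable."""
    url = url or os.environ.get("REDIS_URL")
    if not url:
        return None
    try:
        import redis  # optional dependency

        client = redis.Redis.from_url(url, socket_connect_timeout=2)
        client.ping()  # forces a real connection attempt
    except Exception:
        # Broad on purpose: a missing package, a malformed URL, and an
        # unreachable server all mean rate limiting should stay disabled.
        return None
    return client


if __name__ == "__main__":
    client = connect_redis_or_none()
    print("Rate limiting:", "enabled" if client else "disabled (Redis unavailable)")
```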