liam-cawley/cawley
Published on 6/23/2025
Productization Rule

Rules

Build & Development Commands

  • Use docker-compose up --build to build and run the app locally.
  • Ensure requirements.txt is up to date with all Python dependencies.
  • Use python -m app.api to run the API locally for testing.
  • Swagger UI is served from openapi.yaml and should reflect all available endpoints.

API Design Guidelines

  • /predict: Accepts a test dataset (e.g., JSON or file upload) and returns model predictions.
  • /train: Accepts a dataset and optional parameters to train a new model.
  • /models:
    • GET: Lists available models in the local registry.
    • POST: Uploads a new model to the registry.
  • /status/<job_id>: Returns the status of a training job (e.g., pending, running, completed, failed).
  • All endpoints must be documented in openapi.yaml for DART UI integration.
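
The /status/<job_id> contract above could be sketched independently of any web framework, since the rules do not name one. This is a minimal sketch under assumptions: the JobStatus enum, the get_status function, the in-memory jobs mapping, and the 404 response for unknown IDs are all illustrative and not part of the original rules.

```python
from enum import Enum
from typing import Dict, Tuple


class JobStatus(str, Enum):
    """Example job states taken from the guideline's list."""
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


def get_status(job_id: str, jobs: Dict[str, JobStatus]) -> Tuple[int, dict]:
    """Return (HTTP status code, JSON-serializable body) for /status/<job_id>.

    `jobs` is a hypothetical in-memory mapping of job ID to state;
    answering 404 for an unknown ID is an assumption, not a stated rule.
    """
    status = jobs.get(job_id)
    if status is None:
        return 404, {"error": f"unknown job_id: {job_id}"}
    return 200, {"job_id": job_id, "status": status.value}
```

A route handler in whichever framework the app uses would call get_status and serialize the returned body as JSON; the same response shape would then need to be described in openapi.yaml.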

Model Registry Guidelines

  • Models are stored in the models/ directory with metadata (e.g., name, version, date).
  • model_registry.py must handle model lookup, registration, and versioning.
  • No cloud storage is used; all models are stored locally or on a private server.
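
One possible shape for model_registry.py, covering lookup, registration, and versioning on the local filesystem, is sketched below. The ModelRegistry class, the name/vN directory layout, and the model.bin and metadata.json filenames are assumptions made for illustration; the rules only require local storage with name, version, and date metadata.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import List, Optional


class ModelRegistry:
    """Hypothetical local registry: <root>/<name>/v<N>/ holds one model version."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def versions(self, name: str) -> List[int]:
        """Return the registered versions of `name`, sorted ascending."""
        model_dir = self.root / name
        if not model_dir.is_dir():
            return []
        return sorted(
            int(p.name[1:]) for p in model_dir.glob("v*") if p.name[1:].isdigit()
        )

    def register(self, name: str, payload: bytes) -> dict:
        """Store serialized model bytes as the next version and write metadata."""
        version = max(self.versions(name), default=0) + 1
        target = self.root / name / f"v{version}"
        target.mkdir(parents=True)
        (target / "model.bin").write_bytes(payload)
        metadata = {
            "name": name,
            "version": version,
            "date": datetime.now(timezone.utc).isoformat(),
        }
        (target / "metadata.json").write_text(json.dumps(metadata, indent=2))
        return metadata

    def lookup(self, name: str, version: Optional[int] = None) -> Path:
        """Return the model file for `version`, or for the latest version if None."""
        available = self.versions(name)
        if not available:
            raise KeyError(f"no model named {name!r}")
        chosen = available[-1] if version is None else version
        if chosen not in available:
            raise KeyError(f"{name!r} has no version {chosen}")
        return self.root / name / f"v{chosen}" / "model.bin"
```

With this layout, inference code can call lookup(name) for the latest model or lookup(name, version) for a specific one, and the registry root can come from configuration instead of a hardcoded path.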

Inference & Training Logic

  • inference.py should load the latest or specified model from the registry and run predictions.
  • training.py should support training from scratch or fine-tuning, saving the model to the registry.
  • Long-running training jobs should be handled asynchronously via jobs.py.

Job Management

  • Use jobs.py to manage background tasks (e.g., training).
  • Each job should have a unique ID and status tracking.
  • Consider using threading, multiprocessing, or a lightweight queue like RQ.
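
The threading option above might look roughly like the following. It is a sketch under assumptions: JobManager and its submit, status, and wait methods are hypothetical names, job records live only in memory, and a multiprocessing or RQ-based version would differ in how work is dispatched and where state is kept.

```python
import threading
import uuid
from typing import Any, Callable, Dict, Optional


class JobManager:
    """Hypothetical in-memory job tracker that runs each job in its own thread."""

    def __init__(self) -> None:
        self._jobs: Dict[str, Dict[str, Any]] = {}
        self._threads: Dict[str, threading.Thread] = {}
        self._lock = threading.Lock()

    def submit(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> str:
        """Start `func` in a background thread and return its unique job ID."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = {"status": "pending", "result": None, "error": None}
        thread = threading.Thread(
            target=self._run, args=(job_id, func, args, kwargs), daemon=True
        )
        self._threads[job_id] = thread
        thread.start()
        return job_id

    def _run(self, job_id: str, func: Callable[..., Any], args: tuple, kwargs: dict) -> None:
        """Execute the job and record its outcome as completed or failed."""
        self._update(job_id, status="running")
        try:
            result = func(*args, **kwargs)
        except Exception as exc:
            self._update(job_id, status="failed", error=str(exc))
        else:
            self._update(job_id, status="completed", result=result)

    def _update(self, job_id: str, **fields: Any) -> None:
        with self._lock:
            self._jobs[job_id].update(fields)

    def status(self, job_id: str) -> Dict[str, Any]:
        """Return a copy of the job record; raises KeyError for unknown IDs."""
        with self._lock:
            return dict(self._jobs[job_id])

    def wait(self, job_id: str, timeout: Optional[float] = None) -> None:
        """Block until the job's thread finishes (mainly useful in tests)."""
        self._threads[job_id].join(timeout)
```

Because the records are held in process memory, job state is lost on restart; persisting it under a configurable job-tracking path would address that.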

Documentation Guidelines

  • Keep README.md updated with setup, usage, and endpoint examples.
  • Ensure openapi.yaml is synchronized with actual API behavior.
  • Document model formats, expected input/output, and training parameters.

Containerization Guidelines

  • Use the provided Dockerfile and docker-compose.yml for reproducible builds.
  • Ensure all paths in config.py are relative or configurable via environment variables.
  • Avoid hardcoding file paths or secrets.

Configuration

  • Use config.py to load settings from a .env file or YAML/JSON config.
  • Include paths for model storage, logging, and job tracking.
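
A minimal sketch of config.py that follows the rules above (relative defaults, overridable through environment variables) could look like this. The Settings dataclass, the load_settings function, and the variable names MODEL_DIR, LOG_DIR, and JOBS_DIR are illustrative assumptions. Reading the .env file itself is left out; it is assumed to have been loaded into the environment beforehand, for example by Docker Compose or a dotenv loader.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Paths for model storage, logging, and job tracking."""
    model_dir: Path
    log_dir: Path
    jobs_dir: Path


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from environment variables, with relative fallbacks.

    The variable names here are hypothetical; use whatever the project defines.
    """
    return Settings(
        model_dir=Path(env.get("MODEL_DIR", "models")),
        log_dir=Path(env.get("LOG_DIR", "logs")),
        jobs_dir=Path(env.get("JOBS_DIR", "jobs")),
    )
```

Passing a plain mapping instead of os.environ makes the loader easy to exercise in tests without touching the real environment.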