Build & Development Commands
- Use docker-compose up --build to build and run the app locally.
- Ensure requirements.txt is up to date with all Python dependencies.
- Use python -m app.api to run the API locally for testing.
- Swagger UI is served from openapi.yaml and should reflect all available endpoints.
API Design Guidelines
- /predict: Accepts a test dataset (e.g., JSON or file upload) and returns model predictions.
- /train: Accepts a dataset and optional parameters to train a new model.
- /models:
  - GET: Lists available models in the local registry.
  - POST: Uploads a new model to the registry.
- /status/<job_id>: Returns the status of a training job (e.g., pending, running, completed, failed).
- All endpoints must be documented in openapi.yaml for DART UI integration.
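As an illustration of the /status/<job_id> contract, here is a minimal, framework-neutral sketch. The guidelines do not name a web framework, so the handler simply returns an (HTTP status, body) pair. The names JobStatus and status_response, the in-memory jobs mapping, and the 404 behaviour for unknown IDs are assumptions made for this example, not part of the project.

```python
from enum import Enum


class JobStatus(str, Enum):
    """Status values listed in the guidelines for training jobs."""
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


def status_response(job_id, jobs):
    """Build an (http_status, body) pair for GET /status/<job_id>.

    `jobs` is a hypothetical in-memory mapping of job ID -> JobStatus.
    Returning 404 for an unknown ID is an assumption of this sketch.
    """
    status = jobs.get(job_id)
    if status is None:
        return 404, {"job_id": job_id, "error": "unknown job"}
    return 200, {"job_id": job_id, "status": status.value}
```

Whatever framework is used, the response shape chosen here would also need to be described in openapi.yaml.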
Model Registry Guidelines
- Models are stored in the models/ directory with metadata (e.g., name, version, date).
- model_registry.py must handle model lookup, registration, and versioning.
- No cloud storage is used; all models are stored locally or on a private server.
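The three responsibilities above (lookup, registration, versioning) could be sketched as follows. This is one possible layout, not the project's actual model_registry.py: the ModelRegistry class, the models/<name>/v<N>/ directory scheme, and the model.bin and metadata.json file names are all assumptions made for illustration.

```python
import json
import time
from pathlib import Path


class ModelRegistry:
    """Local, file-based registry (hypothetical layout: <root>/<name>/v<N>/)."""

    def __init__(self, root="models"):
        self.root = Path(root)

    def versions(self, name):
        """Return the sorted integer versions registered for `name`."""
        model_dir = self.root / name
        if not model_dir.is_dir():
            return []
        return sorted(
            int(p.name[1:]) for p in model_dir.iterdir()
            if p.is_dir() and p.name.startswith("v") and p.name[1:].isdigit()
        )

    def register(self, name, model_bytes):
        """Store a serialized model under the next version number."""
        existing = self.versions(name)
        version = existing[-1] + 1 if existing else 1
        target = self.root / name / f"v{version}"
        target.mkdir(parents=True)
        (target / "model.bin").write_bytes(model_bytes)
        # Metadata fields follow the guideline: name, version, date.
        metadata = {"name": name, "version": version,
                    "date": time.strftime("%Y-%m-%d")}
        (target / "metadata.json").write_text(json.dumps(metadata))
        return metadata

    def lookup(self, name, version=None):
        """Return the model path for `version`, or the latest if None."""
        existing = self.versions(name)
        if not existing:
            raise KeyError(f"no model named {name!r}")
        version = existing[-1] if version is None else version
        if version not in existing:
            raise KeyError(f"{name!r} has no version {version}")
        return self.root / name / f"v{version}" / "model.bin"
```

Calling lookup(name) without a version resolves to the highest registered version, which matches the "latest or specified model" behaviour expected of inference.py.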
Inference & Training Logic
- inference.py should load the latest or specified model from the registry and run predictions.
- training.py should support training from scratch or fine-tuning, saving the model to the registry.
- Long-running training jobs should be handled asynchronously via jobs.py.
Job Management
- Use jobs.py to manage background tasks (e.g., training).
- Each job should have a unique ID and status tracking.
- Consider using threading, multiprocessing, or a lightweight queue like RQ.
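A minimal sketch of the threading option, with a unique ID and status tracking per job, might look like this. The JobManager class and its method names are hypothetical; a real jobs.py might instead use multiprocessing or RQ, and would likely persist job state rather than keep it in memory.

```python
import threading
import uuid


class JobManager:
    """Run callables in background threads and track their status in memory."""

    def __init__(self):
        self._jobs = {}
        self._threads = {}
        self._lock = threading.Lock()

    def submit(self, func, *args, **kwargs):
        """Start `func` in a background thread and return a unique job ID."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = {"status": "pending", "result": None, "error": None}
        thread = threading.Thread(
            target=self._run, args=(job_id, func, args, kwargs), daemon=True
        )
        self._threads[job_id] = thread
        thread.start()
        return job_id

    def _run(self, job_id, func, args, kwargs):
        """Worker body: mark running, then completed or failed."""
        self._update(job_id, status="running")
        try:
            result = func(*args, **kwargs)
        except Exception as exc:
            self._update(job_id, status="failed", error=str(exc))
        else:
            self._update(job_id, status="completed", result=result)

    def _update(self, job_id, **fields):
        with self._lock:
            self._jobs[job_id].update(fields)

    def status(self, job_id):
        """Return one of: pending, running, completed, failed."""
        with self._lock:
            return self._jobs[job_id]["status"]

    def result(self, job_id):
        with self._lock:
            return self._jobs[job_id]["result"]

    def wait(self, job_id, timeout=None):
        """Block until the job's thread finishes (mainly useful in tests)."""
        self._threads[job_id].join(timeout)
```

Threads keep the example dependency-free, but CPU-bound training in a thread still competes with the API for the interpreter; that trade-off is one reason to consider multiprocessing or a queue instead.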
Documentation Guidelines
- Keep README.md updated with setup, usage, and endpoint examples.
- Ensure openapi.yaml is synchronized with actual API behavior.
- Document model formats, expected input/output, and training parameters.
Containerization Guidelines
- Use the provided Dockerfile and docker-compose.yml for reproducible builds.
- Ensure all paths in config.py are relative or configurable via environment variables.
- Avoid hardcoding file paths or secrets.
Configuration
- Use config.py to load settings from a .env file or YAML/JSON config.
- Include paths for model storage, logging, and job tracking.
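One way to satisfy both points with only the standard library is sketched below. The setting names MODEL_DIR, LOG_DIR, and JOB_STORE, their defaults, and the rule that real environment variables override the .env file are assumptions for this example; the parser handles only simple KEY=VALUE lines and is not a full .env implementation.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines; return {} if the file is missing."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text().splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything without an '='.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(env_file=".env", environ=None):
    """Resolve settings: environment first, then the .env file, then defaults."""
    environ = os.environ if environ is None else environ
    file_values = load_env_file(env_file)

    def get(key, default):
        return environ.get(key, file_values.get(key, default))

    # Hypothetical setting names; defaults are relative paths so the
    # same config works inside and outside a container.
    return {
        "model_dir": Path(get("MODEL_DIR", "models")),
        "log_dir": Path(get("LOG_DIR", "logs")),
        "job_store": Path(get("JOB_STORE", "jobs/jobs.json")),
    }
```

Letting the process environment take precedence means docker-compose.yml can override any path without editing files inside the image.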