Coding in Python
You are a Python coding assistant. You should always try to:
- Use type hints consistently
- Write concise docstrings on functions and classes
- Follow the PEP 8 style guide
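For instance, a function written under these rules might look like this (the `mean` helper is purely illustrative):

```python
from collections.abc import Iterable


def mean(values: Iterable[float]) -> float:
    """Return the arithmetic mean of the given values.

    Raises:
        ValueError: If no values are given.
    """
    items = list(values)
    if not items:
        raise ValueError("mean() requires at least one value")
    return sum(items) / len(items)
```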
- You are a PyTorch ML engineer
- Use type hints consistently
- Prefer readability over premature optimization
- Write modular code, using separate files for models, data loading, training, and evaluation (see the sketch after this list)
- Follow the PEP 8 style guide for Python code
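A minimal sketch of that modular layout, collapsed into one file for readability; the file names in the comments, the toy model, and the random data are assumptions, not a fixed structure:

```python
# Single-file sketch; in a real project each section would live in the
# file named in its comment (models/net.py, data/loaders.py, train.py).
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


# models/net.py -- model definition
class Net(nn.Module):
    """A tiny classifier used only to illustrate the layout."""

    def __init__(self, in_features: int = 4, classes: int = 3) -> None:
        super().__init__()
        self.linear = nn.Linear(in_features, classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)


# data/loaders.py -- data loading
def get_loader(batch_size: int = 32) -> DataLoader:
    """Return a DataLoader over random toy data."""
    x = torch.randn(256, 4)
    y = torch.randint(0, 3, (256,))
    return DataLoader(TensorDataset(x, y), batch_size=batch_size)


# train.py -- training loop
def train(epochs: int = 1, lr: float = 1e-3) -> nn.Module:
    """Train Net on the toy data and return the trained model."""
    model, loader = Net(), get_loader()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for inputs, targets in loader:
            optimizer.zero_grad()
            loss_fn(model(inputs), targets).backward()
            optimizer.step()
    return model


if __name__ == "__main__":
    train()
```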
# dlt rules
## Basics
1. dlt means "data load tool". It is an open source Python library installable via `pip install dlt`.
2. To create a new pipeline, use `dlt init <source> <destination>`.
3. The dlt library comes with the `dlt` CLI. Add the `--help` flag to any command to verify its exact usage and options.
4. The preferred way to configure dlt (sources, resources, destinations, etc.) is to use `.dlt/config.toml` and `.dlt/secrets.toml`. Make sure to fill required fields when adding a source or resource.
5. During development, always set `dev_mode=True` when creating a dlt Pipeline: `pipeline = dlt.pipeline(..., dev_mode=True)`. This lets you reset the pipeline's schema and state between iterations (see the sketch below this list).
6. Use type annotations only if you're certain you're properly importing the types.
7. Use dlt's REST API source if loading data from the web.
8. Use dlt's SQL source when loading data from an SQL database or backend.
9. Use dlt's filesystem source if loading data from files (CSV, PDF, Parquet, JSON, and more). This works for local filesystems and cloud buckets (AWS, Azure, GCP, MinIO, etc.).
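A minimal sketch of rule 5 in practice, assuming a recent dlt release (where `dev_mode` replaced the older `full_refresh` flag) and `dlt[duckdb]` installed; the PokéAPI endpoint and table name are just examples:

```python
import dlt
from dlt.sources.helpers import requests  # dlt's requests wrapper with retries

# dev_mode=True (rule 5) resets the pipeline's schema and state on each run.
pipeline = dlt.pipeline(
    pipeline_name="pokemon",
    destination="duckdb",
    dataset_name="pokemon_data",
    dev_mode=True,
)

# Pull a small page of records from a public API and load it.
data = requests.get("https://pokeapi.co/api/v2/pokemon?limit=20").json()["results"]
load_info = pipeline.run(data, table_name="pokemon")
print(load_info)
```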
Use pytest to write a comprehensive suite of unit tests for this function
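A sketch of what such a suite might look like; `slugify` is a hypothetical helper standing in for the actual function under test:

```python
# test_slugify.py -- slugify() is a stand-in for the real function under test.
import pytest


def slugify(text: str) -> str:
    """Lowercase text and join whitespace-separated words with hyphens."""
    return "-".join(text.lower().split())


def test_basic() -> None:
    assert slugify("Hello World") == "hello-world"


def test_collapses_whitespace() -> None:
    assert slugify("a   b") == "a-b"


@pytest.mark.parametrize("text, expected", [("", ""), ("   ", ""), ("X", "x")])
def test_edge_cases(text: str, expected: str) -> None:
    assert slugify(text) == expected
```

Run it with `pytest -q`.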
Please create a new PyTorch module following these guidelines:
- Include docstrings for the model class and methods
- Add type hints for all parameters
- Add basic validation in `__init__`
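A sketch of a module meeting these guidelines; the architecture (a small MLP) and its dimensions are placeholders:

```python
import torch
from torch import nn


class MLP(nn.Module):
    """A small multilayer perceptron.

    Args:
        in_features: Size of each input sample.
        hidden: Width of the hidden layer.
        out_features: Size of each output sample.
    """

    def __init__(self, in_features: int, hidden: int, out_features: int) -> None:
        super().__init__()
        # Basic validation in __init__, as the guidelines require.
        for name, value in (("in_features", in_features),
                            ("hidden", hidden),
                            ("out_features", out_features)):
            if value <= 0:
                raise ValueError(f"{name} must be positive, got {value}")
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, out_features),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """Run the forward pass on a batch of shape (N, in_features)."""
        return self.net(x)
```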
Create an exploratory data analysis workflow that includes:
Data Overview (see the pandas sketch after this list):
- Basic statistics (mean, median, std, quartiles)
- Missing values and data types
- Unique value distributions
Visualizations:
- Numerical: histograms, box plots
- Categorical: bar charts, frequency plots
- Relationships: correlation matrices
- Temporal patterns (if applicable)
Quality Assessment:
- Outlier detection
- Data inconsistencies
- Value range validation
Insights & Documentation:
- Key findings summary
- Data quality issues
- Variable relationships
- Next steps recommendations
- Reproducible Jupyter notebook
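A sketch of the Data Overview step with pandas; the file path and the resulting columns depend on the dataset and are placeholders here:

```python
import pandas as pd

df = pd.read_csv("data.csv")  # hypothetical input file

# Basic statistics: mean, std, quartiles for numeric columns.
print(df.describe())

# Missing values, dtypes, and unique-value counts in one table.
overview = pd.DataFrame({
    "dtype": df.dtypes,
    "missing": df.isna().sum(),
    "missing_pct": df.isna().mean().round(3),
    "unique": df.nunique(),
})
print(overview)
```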
Generate a data processing pipeline with these requirements:
Input (a sketch of these stages follows this list):
- Data loading from multiple sources (CSV, SQL, APIs)
- Input validation and schema checks
- Error logging for data quality issues
Processing:
- Standardized cleaning (missing values, outliers, types)
- Memory-efficient operations for large datasets
- Numerical transformations using NumPy
- Feature engineering and aggregations
Quality & Monitoring:
- Data quality checks at key stages
- Validation visualizations with Matplotlib
- Performance monitoring
Structure:
- Modular, documented code with error handling
- Configuration management
- Reproducible in Jupyter notebooks
- Example usage and tests
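A condensed sketch of the load, validate, and clean stages with logging; the schema (`REQUIRED_COLUMNS`), column names, and outlier thresholds are assumptions:

```python
import logging

import numpy as np
import pandas as pd

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline")

REQUIRED_COLUMNS = {"id", "amount", "created_at"}  # hypothetical schema


def validate(df: pd.DataFrame) -> pd.DataFrame:
    """Schema check: fail fast and log data quality issues."""
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    n_null = int(df["amount"].isna().sum())
    if n_null:
        logger.warning("amount has %d missing values", n_null)
    return df


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Standardized cleaning: types, missing values, outliers."""
    df = df.copy()
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
    df = df.dropna(subset=["amount"])
    # Clip outliers to the 1st/99th percentile using NumPy.
    lo, hi = np.percentile(df["amount"], [1, 99])
    df["amount"] = df["amount"].clip(lo, hi)
    return df


def run(path: str) -> pd.DataFrame:
    """Load -> validate -> clean, with logging at each stage."""
    df = pd.read_csv(path)
    logger.info("loaded %d rows", len(df))
    return clean(validate(df))
```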