ppio/ppio-data-science-machine-learning-assistant
public
Published on 4/21/2025
PPIO Data Science & Machine Learning Assistant

Models

- baichuan2-13b-chat (PPIO)
- deepseek-r1-community (PPIO)
- bge-m3 (PPIO)
- deepseek-r1-distill-qwen-32b (PPIO)
- deepseek-r1-distill-llama-70b (PPIO)
- deepseek-r1-distill-llama-8b (PPIO)
- deepseek-r1-distill-qwen-14b (PPIO)
- deepseek-r1-turbo (PPIO)
- deepseek-v3-0324 (PPIO)
- deepseek-v3-community (PPIO)
- gemma-3-27b-it (PPIO)
- glm-4-9b-chat (PPIO)
- llama-3.1-8b-instruct (PPIO)
- llama-3.1-70b-instruct (PPIO)
- llama-3.2-3b-instruct (PPIO)
- llama-3.3-70b-instruct (PPIO)
- llama-4-maverick (PPIO)
- llama-4-scout (PPIO)
- qwen2.5-32b-instruct (PPIO)
- qwen-2.5-72b-instruct (PPIO)
- qwen2.5-7b-instruct (PPIO)
- qwen2.5-vl-72b-instruct (PPIO)
- qwq-32b (PPIO)
- yi-1.5-9b-chat (PPIO)
- yi-1.5-34b-chat (PPIO)

Rules

You are an experienced data scientist who specializes in Python-based
data science and machine learning. You use the following tools:
- Python 3 as the primary programming language
- PyTorch for deep learning and neural networks
- NumPy for numerical computing and array operations
- Pandas for data manipulation and analysis
- Jupyter for interactive development and visualization
- Conda for environment and package management
- Matplotlib for data visualization and plotting

Pandas: https://pandas.pydata.org/docs/
torch.nn Docs: https://pytorch.org/docs/stable/nn.html
NumPy: https://numpy.org/doc/stable/

Prompts

Exploratory Data Analysis
Initial data exploration and key insights
Create an exploratory data analysis workflow that includes:

Data Overview:
- Basic statistics (mean, median, std, quartiles)
- Missing values and data types
- Unique value distributions

Visualizations:
- Numerical: histograms, box plots
- Categorical: bar charts, frequency plots
- Relationships: correlation matrices
- Temporal patterns (if applicable)

Quality Assessment:
- Outlier detection
- Data inconsistencies
- Value range validation

Insights & Documentation:
- Key findings summary
- Data quality issues
- Variable relationships
- Next steps recommendations
- Reproducible Jupyter notebook

The user has provided the following information:
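
As a rough illustration of what the Data Overview and Quality Assessment items in the Exploratory Data Analysis prompt might produce, here is a minimal Pandas sketch. It is not part of the published prompt. The helper names `summarize` and `iqr_outliers`, the toy DataFrame, and the 1.5 × IQR outlier rule are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column data type, missing-value count, and unique-value count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "unique": df.nunique(),  # NaN/None are not counted as values
    })


def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Boolean mask of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


# Hypothetical toy data: one numerical and one categorical column.
df = pd.DataFrame({
    "age": [23, 25, 31, 29, 27, 120, np.nan],
    "city": ["Oslo", "Rome", "Oslo", None, "Rome", "Oslo", "Rome"],
})

summary = summarize(df)                           # missing values, types, uniques
stats = df["age"].describe()                      # mean, std, quartiles
outliers = df.loc[iqr_outliers(df["age"]), "age"]  # rows flagged by the IQR rule

print(summary)
print(stats)
print(outliers)
```

The IQR rule is only one common choice for outlier detection; z-scores or domain-specific value ranges may fit a given dataset better.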
Data Pipeline Development
Create robust and scalable data processing pipelines
Generate a data processing pipeline with these requirements:

Input:
- Data loading from multiple sources (CSV, SQL, APIs)
- Input validation and schema checks
- Error logging for data quality issues

Processing:
- Standardized cleaning (missing values, outliers, types)
- Memory-efficient operations for large datasets
- Numerical transformations using NumPy
- Feature engineering and aggregations

Quality & Monitoring:
- Data quality checks at key stages
- Validation visualizations with Matplotlib
- Performance monitoring

Structure:
- Modular, documented code with error handling
- Configuration management
- Reproducible in Jupyter notebooks
- Example usage and tests

The user has provided the following information:
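
To make the Input and Processing requirements of the Data Pipeline Development prompt more concrete, the sketch below loads a CSV, checks it against a schema, logs a data quality issue, and applies a NumPy transformation. It is an illustrative assumption, not output defined by the prompt. The `SCHEMA` dictionary, the function names `validate`, `clean`, and `run`, and the inline CSV text are all hypothetical.

```python
import io
import logging

import numpy as np
import pandas as pd

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline")

# Hypothetical schema: column name -> expected pandas dtype.
SCHEMA = {"id": "int64", "amount": "float64"}


def validate(df: pd.DataFrame, schema: dict) -> pd.DataFrame:
    """Check that required columns exist, then cast them to the expected dtypes."""
    missing = set(schema) - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return df.astype(schema)


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Fill missing amounts with the median and add a log-transformed feature."""
    out = df.copy()
    n_missing = int(out["amount"].isna().sum())
    if n_missing:
        # Error logging for a data quality issue.
        logger.warning("filling %d missing 'amount' values", n_missing)
    out["amount"] = out["amount"].fillna(out["amount"].median())
    out["log_amount"] = np.log1p(out["amount"])  # numerical transformation
    return out


def run(source) -> pd.DataFrame:
    """Load a CSV from a path or file-like object, then validate and clean it."""
    raw = pd.read_csv(source)
    return clean(validate(raw, SCHEMA))


# Example usage with an in-memory CSV standing in for a real source.
csv_text = "id,amount\n1,10.0\n2,\n3,30.0\n"
result = run(io.StringIO(csv_text))
print(result)
```

Keeping validation, cleaning, and loading in separate functions is one way to meet the "modular" requirement; for large files, `pd.read_csv` also accepts a `chunksize` argument so the same steps can run chunk by chunk.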

Context

@diff
Reference all of the changes you've made to your current branch
@codebase
Reference the most relevant snippets from your codebase
@url
Reference the Markdown-converted contents of a given URL
@folder
Uses the same retrieval mechanism as @codebase, but only on a single folder
@terminal
Reference the last command you ran in your IDE's terminal and its output
@code
Reference specific functions or classes from throughout your project
@file
Reference any file in your current workspace

No Data configured

MCP Servers

No MCP Servers configured