# ai_old

profile of ai_old (public prompt, published 4/13/2025)
This is a user profile (in prompt format) that captures behavioral traits across task and user dimensions. These traits describe the user’s technical skills, development habits, language and framework preferences, testing and build strategies, learning ability, etc. Always refer to this profile when answering questions, especially when giving code suggestions, recommending tools, proposing development strategies, or generating documentation. Ensure that your output is aligned with the user’s preferences and proficiency level.

# Basic

## 1. Language Usage & Proficiency
**Language Distribution:**

- **Python**: 210 files, 34,567 lines of code.  
- **R**: 47 files, 5,432 lines of code.

**Proficiency:**

- **Python**: Proficient in data analysis and modeling, using mainstream libraries such as `pandas`, `numpy`, and `scikit-learn`.
- **R**: Used for statistical analysis and visualization with libraries like `ggplot2`; proficiency is moderate compared to Python.

## 2. Tech Stack: Frequently Used Frameworks/Libraries
**Python:**

- **Data Analysis**: `pandas`, `numpy`
- **Visualization**: `matplotlib`, `seaborn`, `plotly`
- **Machine Learning**: `scikit-learn`, `xgboost`
- **Notebook Environment**: Jupyter

**R:**

- **Visualization**: `ggplot2`, `shiny`
- **Data Processing**: `dplyr`, `tidyr`
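
By way of illustration, here is a minimal sketch of how the Python side of this stack is typically combined in a notebook. The file `sales.csv` and every column name below are invented for the example:

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# "sales.csv" and all column names are hypothetical, for illustration only.
df = pd.read_csv("sales.csv")
df = df.dropna(subset=["revenue", "region", "units", "discount"])

# Quick visual check, in line with the notebook-first, visual-validation habit.
sns.histplot(df["revenue"], bins=30)
plt.show()

# One-hot encode the categorical feature, then fit a baseline classifier.
X = pd.get_dummies(df[["region", "units", "discount"]], columns=["region"])
y = (df["revenue"] > df["revenue"].median()).astype(int)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```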

## 3. Coding Behavior: Read / Develop / Fix Ratio
- **Reading**: Frequent use of `SelectText` and `GoToDefinition`; high code reuse within notebooks.
- **Development**: Prefers copying existing code blocks and adjusting parameters; frequent `copyPaste`, limited refactoring.
- **Fixing**: Focused on data dimension issues and model training failures (see the sketch after this list).  
  - Average Python error resolution time: 12.7 minutes  
  - Average Jupyter cell fix time: 8 minutes
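
The sketch below illustrates a typical data-dimension failure of the kind noted above. The numbers are invented, but the reshape fix is the standard remedy for scikit-learn's 2-D input requirement:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([1.0, 2.0, 3.0, 4.0])  # shape (4,): 1-D, which fit() rejects
y = np.array([2.1, 3.9, 6.2, 8.1])

model = LinearRegression()
# model.fit(x, y) raises ValueError: scikit-learn expects X to be 2-D,
# shaped (n_samples, n_features).
model.fit(x.reshape(-1, 1), y)  # reshaping to (4, 1) resolves the error
print(model.predict(np.array([[5.0]])))
```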

# Ability

## 1. AI Assistant Dependency
Highly dependent; frequently uses Copilot for model building and documentation prompts.

## 2. Language Proficiency
- **Python**: Advanced — skilled in major data science libraries and their combined use.
- **R**: Intermediate — used for statistical and visualization tasks.

## 3. Project Complexity
Projects involve data processing, analysis, and model training. Though language variety is limited, domains are diverse (finance, healthcare, retail).

## 4. Bug Fixing Efficiency
- **Python error diagnosis**: 12.7 minutes  
- **Jupyter cell debugging**: 8 minutes  
- **Conda environment setup failure recovery**: 35.4 minutes

## 5. Code Completion Dependence
High — especially relies on auto-completion for nested function calls and model parameter settings (avg. 1.3 minutes per use).

## 6. Shortcut Usage Preferences
Frequently uses: `Run Cell` (Jupyter), `undo`, `findReferences`.

## 7. IDE Feature Dependence
Strong dependence on the Jupyter environment; frequent notebook operations, minimal use of code refactoring or formatting tools.

## 8. Code Productivity
Peak productivity: 24.8 lines/hour.  
Low commit frequency but high-value commits (e.g., full pipelines or complete training workflows).

## 9. Build / Compile / Test Capability
No compilation needed — relies on notebook execution for fast feedback.  
Conda environment success rate: 82%.

## 10. Code Reuse Frequency
High — frequently reuses processing logic and model construction code from past notebooks or templates.

## 11. Unit Testing Frequency
Low — mainly relies on visual outputs and notebook execution for validation. Few projects use `pytest`.
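
Where tests do appear, they tend to be small `pytest` checks over data-processing helpers. A minimal, hypothetical sketch (the file, function, and column names are invented):

```python
# test_cleaning.py (hypothetical file): run with `pytest test_cleaning.py`
import pandas as pd


def drop_missing_revenue(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: drop rows whose revenue value is missing."""
    return df.dropna(subset=["revenue"])


def test_drop_missing_revenue():
    df = pd.DataFrame({"revenue": [100.0, None, 250.0]})
    cleaned = drop_missing_revenue(df)
    assert len(cleaned) == 2
    assert cleaned["revenue"].notna().all()
```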

## 12. Merge Conflict Frequency & Resolution
Low — mostly solo development. When collaborating, uses notebook-friendly version control tools (e.g., `nbdime`).

# Habit

## 1. Plugin Ecosystem
Frequently used plugins:

- **Jupyter**, **Python Extension**, **Pylance**
- **Visualization**: `ms-toolsai.jupyter-renderers`
- **Version Control**: `Git Graph`, `nbdime`

## 2. Activity Time Pattern
**Daily**:  
Low activity in the morning, peak from 13:00 to 18:00.

**Weekly**:  
Most active Tuesday to Friday; occasional notebook cleanup on weekends.

**Monthly**:  
Highest activity in the two weeks leading up to project deadlines.

## 3. Commenting Frequency
Uses markdown cells to explain data processing logic.  
Comment density is moderate; the comments themselves are clear.

## 4. Local vs Remote Preference
Mainly works locally; some notebooks deployed on JupyterHub or Kaggle.

## 5. Debugging Frequency
Relies on step-by-step execution in notebooks; does not use traditional breakpoints.
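
In practice this tends to look like lightweight inspection calls and inline assertions between cells rather than an attached debugger. A hypothetical sketch (file and column names invented):

```python
import pandas as pd

df = pd.read_csv("sales.csv")  # hypothetical file name

# Each of these would sit in its own notebook cell, with the output
# inspected by eye before moving on:
print(df.shape)   # confirm expected dimensions
print(df.dtypes)  # check column types
print(df.head())  # eyeball the first rows

# Inline assertions act as lightweight checkpoints between cells.
assert df["revenue"].notna().any(), "revenue column is entirely missing"
```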

# Learn

## 1. New Repo Exploration / Understanding Efficiency
Quick to ramp up on data-related projects; especially adept at understanding data structures and analysis logic via READMEs and notebooks.

## 2. New Plugin Usage Frequency
High — frequently tries new notebook and visualization-enhancing plugins.

## 3. New Framework / Library Adoption & Learning Efficiency
Very strong — quickly adopts and applies new libraries like `lightgbm`, `catboost`, `transformers`.