Contributing

Yasser Mustafa edited this page Feb 15, 2026 · 1 revision

Contributing to PipeFrame

First off, thank you for considering contributing to PipeFrame! It's people like you that make PipeFrame such a great tool for the data science community.


🌟 Ways to Contribute

🐛 Report Bugs

Found a bug? Please open an issue with:

  • Clear, descriptive title
  • Steps to reproduce
  • Expected vs actual behavior
  • Code samples
  • Environment details (Python version, OS, pipeframe version)
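
The environment details in the last bullet can be gathered with a short script. The sketch below uses only the standard library and assumes the installed distribution is named `pipeframe`; the helper name `collect_environment` is hypothetical and not part of PipeFrame.

```python
import platform
import sys
from importlib import metadata
from typing import Dict


def collect_environment(package: str = "pipeframe") -> Dict[str, str]:
    """Return the details a bug report asks for as a plain dict."""
    try:
        # Assumption: the distribution name matches the repository name.
        package_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        package_version = "not installed"
    return {
        "python": sys.version.split()[0],
        "os": f"{platform.system()} {platform.release()}",
        package: package_version,
    }


if __name__ == "__main__":
    # Print one "key: value" line per detail, ready to paste into an issue.
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into the issue covers the Python version, OS, and package version in one step.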

💡 Suggest Features

Have an idea? We'd love to hear it! Include:

  • Use case description
  • Proposed API (how it would work)
  • Example code showing the feature
  • Why it would be useful

📝 Improve Documentation

Help others learn PipeFrame:

  • Fix typos or clarify explanations
  • Add examples
  • Create tutorials
  • Improve docstrings

🔧 Submit Code

Ready to code? Awesome! See Development Setup below.


🚀 Development Setup

1. Fork and Clone

# Fork the repository on GitHub, then:
git clone https://github.com/YOUR_USERNAME/pipeframe.git
cd pipeframe

2. Create Virtual Environment

# Using venv
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Using conda
conda create -n pipeframe python=3.10
conda activate pipeframe

3. Install Development Dependencies

# Install package in editable mode with dev dependencies
pip install -e ".[dev,test]"

# Or install from requirements
pip install -r requirements-dev.txt

4. Set Up Pre-commit Hooks

pre-commit install

🔨 Development Workflow

1. Create a Branch

git checkout -b feature/your-feature-name
# or
git checkout -b fix/issue-number-description

2. Make Changes

Follow these guidelines:

  • Write clear, readable code
  • Add docstrings (Google style)
  • Include type hints
  • Add tests for new features
  • Update documentation

3. Run Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=pipeframe --cov-report=html

# Run specific test
pytest tests/test_dataframe.py::test_filter

4. Format Code

# Format with black
black pipeframe/

# Sort imports
isort pipeframe/

# Lint
flake8 pipeframe/

# Type check
mypy pipeframe/

5. Commit Changes

git add .
git commit -m "feat: add amazing new feature"

Commit Message Format:

  • feat: New feature
  • fix: Bug fix
  • docs: Documentation changes
  • test: Test additions/changes
  • refactor: Code refactoring
  • perf: Performance improvements
  • chore: Maintenance tasks
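
The prefixes above can be checked mechanically before committing. The following is a minimal sketch, not part of PipeFrame's tooling: the function name `is_valid_commit_message` is hypothetical, and it assumes the format is exactly `<type>: <summary>` with no optional scope such as `feat(core):`.

```python
import re

# Prefixes copied from the list above.
ALLOWED_TYPES = ("feat", "fix", "docs", "test", "refactor", "perf", "chore")

# Matches "<type>: <summary>", e.g. "feat: add amazing new feature".
_PATTERN = re.compile(r"^(" + "|".join(ALLOWED_TYPES) + r"): \S.*$")


def is_valid_commit_message(message: str) -> bool:
    """Check only the first line of the message against the format."""
    first_line = message.splitlines()[0] if message.strip() else ""
    return _PATTERN.match(first_line) is not None


if __name__ == "__main__":
    print(is_valid_commit_message("feat: add amazing new feature"))
    print(is_valid_commit_message("added stuff"))
```

A check like this could be wired into a `commit-msg` hook, though whether the project's pre-commit configuration already does so is not stated here.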

6. Push and Create PR

git push origin feature/your-feature-name

Then create a Pull Request on GitHub with:

  • Clear description of changes
  • Link to related issues
  • Screenshots/examples if applicable

📋 Code Style Guide

Python Style

  • Follow PEP 8
  • Use Black for formatting (line length: 100)
  • Use isort for import sorting
  • Type hints required for public APIs

Docstrings

Use Google style:

def awesome_function(param1: str, param2: int = 0) -> DataFrame:
    """
    Brief description of what this does.
    
    More detailed explanation if needed. Can span multiple
    lines and include usage notes.
    
    Args:
        param1: Description of param1
        param2: Description of param2. Defaults to 0.
    
    Returns:
        Description of return value
    
    Raises:
        ValueError: When param1 is empty
    
    Examples:
        >>> result = awesome_function("hello", 42)
        >>> print(result)
    """
    pass
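
An Examples section written with `>>>` prompts can double as a test through the standard-library `doctest` module. The sketch below uses a hypothetical helper, `clip_values`, that is not part of PipeFrame; whether the project runs doctests in CI is not stated here.

```python
import doctest
from typing import List


def clip_values(values: List[int], upper: int = 10) -> List[int]:
    """Cap every value at an upper bound.

    Args:
        values: Integers to clip.
        upper: Largest value allowed in the result. Defaults to 10.

    Returns:
        A new list with each element limited to ``upper``.

    Raises:
        ValueError: When ``values`` is empty.

    Examples:
        >>> clip_values([1, 20, 5], upper=10)
        [1, 10, 5]
    """
    if not values:
        raise ValueError("values must not be empty")
    return [min(v, upper) for v in values]


if __name__ == "__main__":
    # Runs every >>> example in this module and reports any mismatch.
    print(doctest.testmod())
```

Running `python -m doctest -v` on the module, or `pytest --doctest-modules`, executes the same examples, so a stale docstring fails instead of silently misleading readers.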

Type Hints

from typing import Any, List, Optional, Union

import pandas as pd  # needed for the pd.DataFrame annotation below

from pipeframe.core.dataframe import DataFrame

def process_data(
    df: Union[DataFrame, pd.DataFrame],
    columns: Optional[List[str]] = None,
    **kwargs: Any
) -> DataFrame:
    ...

🧪 Testing Guidelines

Writing Tests

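import pytest
from pipeframe import DataFrame, filter

# NOTE: PipeFrameColumnError (used below) also needs an import; its module
# path is not shown on this page, so import it from wherever the project defines it.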

class TestDataFrame:
    def test_filter_basic(self):
        """Test basic filtering functionality."""
        df = DataFrame({'x': [1, 2, 3, 4]})
        result = df >> filter('x > 2')
        assert len(result) == 2
    
    def test_filter_empty_result(self):
        """Test filtering that returns no rows."""
        df = DataFrame({'x': [1, 2, 3]})
        result = df >> filter('x > 10')
        assert len(result) == 0
    
    def test_filter_invalid_column(self):
        """Test error handling for invalid column."""
        df = DataFrame({'x': [1, 2, 3]})
        with pytest.raises(PipeFrameColumnError):
            df >> filter('y > 2')

Test Organization

  • One test file per module
  • Test classes for related tests
  • Clear test names describing what's tested
  • Test both success and error cases
  • Test edge cases

📚 Documentation Guidelines

README Updates

  • Keep examples simple and focused
  • Ensure all code examples actually work
  • Update table of contents if adding sections

API Documentation

  • Every public function/class needs a docstring
  • Include parameters, returns, raises
  • Add usage examples
  • Note any security considerations

Tutorial Notebooks

  • Start simple, build complexity
  • Explain the "why", not just the "how"
  • Include real-world examples
  • Test all code cells

🔍 Code Review Process

What We Look For

  • ✅ Tests pass
  • ✅ Code is formatted (black, isort)
  • ✅ Type hints present
  • ✅ Docstrings complete
  • ✅ No breaking changes (or clearly documented)
  • ✅ Performance impact considered
  • ✅ Security implications reviewed

Review Timeline

  • Initial response: Within 2 days
  • Full review: Within 1 week
  • Revisions: As needed

🎯 Priority Areas

We especially welcome contributions in:

  1. Performance Optimization

    • Profiling and benchmarking
    • Vectorization improvements
    • Memory efficiency
  2. Additional Verbs

    • Join operations
    • Window functions
    • Time series helpers
  3. Backend Support

    • Polars integration
    • DuckDB support
    • Arrow format
  4. Documentation

    • More examples
    • Video tutorials
    • Translation
  5. Testing

    • Edge case coverage
    • Performance tests
    • Integration tests

💬 Communication

Getting Help

Proposing Major Changes

For substantial changes:

  1. Open an issue first
  2. Discuss the approach
  3. Get feedback before coding
  4. Then submit a PR

📜 License

By contributing, you agree that your contributions will be licensed under the MIT License.


🙏 Recognition

Contributors are recognized in:

  • README.md contributors section
  • Release notes
  • Annual contributor highlight

❓ Questions?

Don't hesitate to ask! We're here to help.

Thank you for making PipeFrame better! 🎉