Apply black formatting to fix linting issues

jeongyoonlee · claude · jeongyoonlee · commit 426d8376e4e5 · 2025-07-15T15:32:49.000-07:00
- Add black as a dependency in pyproject.toml
- Format test_meta_learners.py with black
- Fix code style issues for CI/CD pipeline

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
diff --git a/.claude/commands/fix-github-issue.md b/.claude/commands/fix-github-issue.md
@@ -0,0 +1,14 @@
+Please analyze and fix the GitHub issue: $ARGUMENTS.
+
+Follow these steps:
+
+1. Use `gh issue view` to get the issue details
+2. Understand the problem described in the issue
+3. Search the codebase for relevant files
+4. Implement the necessary changes to fix the issue
+5. Write and run tests to verify the fix
+6. Ensure code passes linting and type checking
+7. Create a descriptive commit message
+8. Push and create a PR
+
+Remember to use the GitHub CLI (`gh`) for all GitHub-related tasks.
diff --git a/.claude/settings.local.json b/.claude/settings.local.json
@@ -0,0 +1,11 @@
+{
+  "permissions": {
+    "allow": [
+      "Bash(gh issue view:*)",
+      "Bash(uv run:*)",
+      "Bash(git checkout:*)",
+      "Bash(git add:*)"
+    ],
+    "deny": []
+  }
+}
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -0,0 +1,110 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+CausalML is a Python package for uplift modeling and causal inference with machine learning algorithms. It provides methods to estimate Conditional Average Treatment Effect (CATE) or Individual Treatment Effect (ITE) from experimental or observational data.
+
+## Development Setup
+
+### Environment Setup
+- Python 3.9+ required (supports 3.9-3.12)
+- Uses `uv` as the package manager (preferred) or `pip`
+- Install development dependencies with `make setup_local` (sets up pre-commit hooks)
+
+### Build Commands
+- `make build_ext`: Build Cython extensions (required before running code/tests)
+- `make build`: Build wheel distribution
+- `make install`: Install package locally
+- `make clean`: Clean build artifacts
+
+### Testing
+- `make test`: Run full test suite with coverage
+- `pytest -vs --cov causalml/`: Direct pytest command
+- `pytest tests/test_specific.py`: Run specific test file
+- Optional test flags:
+  - `pytest --runtf`: Include TensorFlow tests
+  - `pytest --runtorch`: Include PyTorch tests
+
+### Code Quality
+- Uses `black` for code formatting
+- Run `black .` before submitting PRs
+- Pre-commit hooks available via `make setup_local`
+- Flake8 configuration in tox.ini with max line length 120
+
+## Architecture
+
+### Core Module Structure
+```
+causalml/
+├── dataset/           # Synthetic data generation
+├── feature_selection/ # Feature selection utilities
+├── inference/         # Main inference algorithms
+│   ├── meta/         # Meta-learners (S, T, X, R, DR learners)
+│   ├── tree/         # Causal trees and uplift trees
+│   ├── tf/           # TensorFlow implementations (DragonNet)
+│   ├── torch/        # PyTorch implementations (CEVAE)
+│   └── iv/           # Instrumental variable methods
+├── metrics/          # Evaluation metrics
+├── optimize/         # Policy learning and optimization
+└── propensity.py     # Propensity score modeling
+```
+
+### Key Components
+
+#### Meta-Learners (`causalml/inference/meta/`)
+- **BaseLearner**: Abstract base class for all meta-learners
+- **S-Learner**: Single model approach
+- **T-Learner**: Two model approach
+- **X-Learner**: Cross-learner with propensity scores
+- **R-Learner**: Robinson's R-learner
+- **DR-Learner**: Doubly robust learner
+
+#### Tree-Based Methods (`causalml/inference/tree/`)
+- Causal trees and forests with Cython implementations
+- Uplift trees for classification problems
+- Custom splitting criteria for causal inference
+
+#### Propensity Score Models (`causalml/propensity.py`)
+- **PropensityModel**: Abstract base for propensity estimation
+- Built-in calibration support
+- Clipping bounds to avoid numerical issues
+
+### Cython Extensions
+The package includes Cython-compiled modules for performance:
+- Tree algorithms (`_tree`, `_criterion`, `_splitter`, `_utils`)
+- Causal tree components (`_builder`, causal trees)
+- Always run `make build_ext` after changes to .pyx files
+
+## Common Workflows
+
+### Adding New Meta-Learners
+1. Inherit from `BaseLearner` in `causalml/inference/meta/base.py`
+2. Implement `fit()` and `predict()` methods
+3. Add appropriate tests in `tests/test_meta_learners.py`
+
+### Working with Tree Methods
+1. Cython files are in `causalml/inference/tree/`
+2. Rebuild extensions with `make build_ext` after changes
+3. Test with synthetic data from `causalml.dataset`
+
+### Testing Different Backends
+- Core tests run without optional dependencies
+- TensorFlow tests: `pytest --runtf`
+- PyTorch tests: `pytest --runtorch`
+- Tests use fixtures from `tests/conftest.py` for data generation
+
+### Git Operations
+- **Pushing branches**: Use specific SSH key for authentication:
+  ```bash
+  GIT_SSH_COMMAND='ssh -i ~/.ssh/github_personal -o IdentitiesOnly=yes' git push -u origin branch_name
+  ```
+
+## Important Notes
+
+- The package uses both pandas DataFrames and numpy arrays internally
+- Propensity scores are clipped by default to avoid division by zero
+- Meta-learners support both single and multiple treatment scenarios
+- Tree methods include built-in visualization capabilities
+- Optional dependencies (TensorFlow, PyTorch) are marked clearly in tests
diff --git a/pyproject.toml b/pyproject.toml
@@ -40,6 +40,7 @@ dependencies = [
     "lightgbm",
     "packaging",
     "graphviz",
+    "black>=25.1.0",
 ]
 
 [project.optional-dependencies]
diff --git a/tests/test_meta_learners.py b/tests/test_meta_learners.py
@@ -1041,7 +1041,6 @@ def test_BaseDRLearner(generate_regression_data):
     assert auuc["cate_p"] > 0.5
 
 
-
 def test_BaseDRClassifier(generate_classification_data):
     np.random.seed(RANDOM_SEED)
 
@@ -1050,40 +1049,39 @@ def test_BaseDRClassifier(generate_classification_data):
     df["treatment_group_key"] = np.where(
         df["treatment_group_key"] == CONTROL_NAME, 0, 1
     )
-    
+
     # Extract features and outcome
     y = df[CONVERSION].values
     X = df[X_names].values
     treatment = df["treatment_group_key"].values
 
     learner = BaseDRClassifier(
-        learner=LogisticRegression(),
-        treatment_effect_learner=LinearRegression()
+        learner=LogisticRegression(), treatment_effect_learner=LinearRegression()
     )
 
     # Test fit and predict
     te = learner.fit_predict(X=X, treatment=treatment, y=y)
-    
+
     # Check that treatment effects are returned
     assert te.shape[0] == X.shape[0]
     assert te.shape[1] == len(np.unique(treatment[treatment != 0]))
-    
+
     # Test with return_components
     te, yhat_cs, yhat_ts = learner.fit_predict(
         X=X, treatment=treatment, y=y, return_components=True
     )
-    
+
     # Check that components are returned as probabilities
     for group in learner.t_groups:
         assert np.all((yhat_cs[group] >= 0) & (yhat_cs[group] <= 1))
         assert np.all((yhat_ts[group] >= 0) & (yhat_ts[group] <= 1))
-    
+
     # Test separate outcome and effect learners
     learner_separate = BaseDRClassifier(
         control_outcome_learner=LogisticRegression(),
         treatment_outcome_learner=LogisticRegression(),
-        treatment_effect_learner=LinearRegression()
+        treatment_effect_learner=LinearRegression(),
     )
-    
+
     te_separate = learner_separate.fit_predict(X=X, treatment=treatment, y=y)
     assert te_separate.shape == te.shape
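
The `tests/test_meta_learners.py` hunks above are meant to be formatting-only. One way to sanity-check that claim is to confirm that a snippet parses to the same abstract syntax tree before and after reformatting. The sketch below uses only the standard-library `ast` module; the `same_ast` helper and the `BEFORE`/`AFTER` constants are hypothetical names introduced for illustration, and the two strings are excerpts copied from the hunk rather than the full test file. Black's documentation describes a comparable equivalence check that it runs by default, but this sketch does not call black itself.

```python
import ast

# Excerpt of the BaseDRClassifier call as it appeared before formatting.
BEFORE = (
    "learner = BaseDRClassifier(\n"
    "    learner=LogisticRegression(),\n"
    "    treatment_effect_learner=LinearRegression()\n"
    ")\n"
)

# The same call after black joined the keyword arguments onto one line.
AFTER = (
    "learner = BaseDRClassifier(\n"
    "    learner=LogisticRegression(), treatment_effect_learner=LinearRegression()\n"
    ")\n"
)


def same_ast(before: str, after: str) -> bool:
    """Return True when both snippets parse to structurally identical ASTs.

    ast.dump() omits line and column attributes by default, so differences
    in whitespace, line breaks, and trailing commas do not affect the result.
    """
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))


if __name__ == "__main__":
    print("formatting-only change:", same_ast(BEFORE, AFTER))
```

Parsing does not resolve names, so the snippets need no imports of their own. A `True` result only shows that the two excerpts are structurally equivalent; it says nothing about whether the tests pass.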
