You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: .github/instructions/code-review.instructions.md
+27-25Lines changed: 27 additions & 25 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -10,45 +10,49 @@ Use these rules when reviewing pull requests to the Apache Airflow repository.
10
10
## Architecture Boundaries
11
11
12
12
-**Scheduler must never run user code.** It only processes serialized Dags. Flag any scheduler-path code that deserializes or executes Dag/task code.
13
-
-**Workers must not access the metadata DB directly.**Task execution communicates with the API server through the Execution API (`/execution` endpoints) only.
14
-
-**Dag Processor and Triggerer run user code in isolated processes.**Code in these components should maintain that isolation.
15
-
-**Providers must not import core internals** like `SUPERVISOR_COMMS` or task-runner plumbing. Providers interact through the public SDK and execution API only.
13
+
-**Flag any task execution code that accesses the metadata DB directly**instead of through the Execution API (`/execution` endpoints).
14
+
-**Flag any code in Dag Processor or Triggerer that breaks process isolation**— these components run user code in isolated processes.
15
+
-**Flag any provider importing core internals** like `SUPERVISOR_COMMS` or task-runner plumbing. Providers interact through the public SDK and execution API only.
16
16
17
17
## Database and Query Correctness
18
18
19
-
-**N+1 queries**: Flag SQLAlchemy queries that access relationships inside loops without `joinedload()` or `selectinload()`.
20
-
-**`run_id` is only unique per Dag.** Queries that group, partition, or join on `run_id` alone (without `dag_id`) will collide across Dags. Always require `(dag_id, run_id)` together.
21
-
-**Cross-database compatibility**: SQL changes must work on PostgreSQL, MySQL, and SQLite. Flag database-specific features (lateral joins, window functions) without cross-DB handling.
22
-
-**Session discipline**: In `airflow-core`, functions receiving a `session` parameter must not call `session.commit()`. Use keyword-only `session` parameters.
19
+
-**Flag any SQLAlchemy relationship access inside a loop** without `joinedload()` or `selectinload()` — this is an N+1 query.
20
+
-**Flag any query on `run_id` without `dag_id`.**`run_id` is only unique per Dag. Queries that filter, group, partition, or join on `run_id` alone will silently collide across Dags.
21
+
-**Flag any `session.commit()` call in `airflow-core`** code that receives a `session` parameter. Session lifecycle is managed by the caller, not the callee.
22
+
-**Flag any `session` parameter that is not keyword-only** (`*, session`) in `airflow-core`.
23
+
-**Flag any database-specific SQL** (e.g., `LATERAL` joins, PostgreSQL-only functions, MySQL-only syntax) without cross-DB handling. SQL must work on PostgreSQL, MySQL, and SQLite.
23
24
24
25
## Code Quality Rules
25
26
26
-
-No `assert` in production code (stripped in optimized Python).
27
-
-`time.monotonic()` for durations, not`time.time()`.
28
-
- Imports at top of file. Valid exceptions: circular imports, lazy loading for worker isolation, `TYPE_CHECKING` blocks.
29
-
-Guard heavy type-only imports (e.g., `kubernetes.client`) with `TYPE_CHECKING` in multi-process code paths.
30
-
-Unbounded caches are bugs: all `@lru_cache` must have `maxsize`.
31
-
-Resources (files, connections, sessions) must use context managers or `try/finally`.
27
+
-**Flag any `assert` in non-test code.**`assert` is stripped in optimized Python (`python -O`), making it a silent no-op in production.
28
+
-**Flag any `time.time()`used for measuring durations.** Use`time.monotonic()` instead — `time.time()` is affected by system clock adjustments.
29
+
-**Flag any `from` or `import` statement inside a function or method body.**Imports must be at the top of the file. The only valid exceptions are: (1) circular import avoidance, (2) lazy loading for worker isolation, (3) `TYPE_CHECKING` blocks. If the import is inside a function, ask the author to justify why it cannot be at module level.
30
+
-**Flag any `@lru_cache(maxsize=None)`.** This creates an unbounded cache — every unique argument set is cached forever. Note: `@lru_cache()` without arguments defaults to `maxsize=128` and is fine.
31
+
-**Flag any heavy import** (e.g., `kubernetes.client`) in multi-process code paths that is not behind a `TYPE_CHECKING` guard.
32
+
-**Flag any file, connection, or session opened without a context manager or `try/finally`.**
32
33
33
34
## Testing Requirements
34
35
35
-
- New behavior requires tests covering success, failure, and edge cases.
36
-
- Use pytest patterns, not `unittest.TestCase`.
37
-
- Use `spec`/`autospec` when mocking.
38
-
- Use `time_machine` for time-dependent tests.
39
-
- Imports belong at the top of test files, not inside test functions.
40
-
- Issue numbers do not belong in test docstrings.
36
+
-**Flag any new public method or behavior without corresponding tests.** Tests must cover success, failure, and edge cases.
37
+
-**Flag any `unittest.TestCase` subclass.** Use pytest patterns instead.
38
+
-**Flag any `mock.Mock()` or `mock.MagicMock()` without `spec` or `autospec`.** Unspec'd mocks silently accept any attribute access, hiding real bugs.
39
+
-**Flag any `time.sleep` or `datetime.now()` in tests.** Use `time_machine` for time-dependent tests.
40
+
-**Flag any issue number in test docstrings** (e.g., `"""Fix for #12345"""`) — test names should describe behavior, not track tickets.
41
41
42
42
## API Correctness
43
43
44
-
-`map_index` must be handled correctly for mapped tasks. Queries without `map_index` filtering may return arbitrary task instances.
45
-
- Execution API changes must follow Cadwyn versioning (CalVer format).
44
+
-**Flag any query on mapped task instances that does not filter on `map_index`.** Without it, queries may return arbitrary instances from the mapped set.
45
+
-**Flag any Execution API change without a Cadwyn version migration** (CalVer format).
46
46
47
47
## UI Code (React/TypeScript)
48
48
49
49
- Avoid `useState + useEffect` to sync derived state. Use nullish coalescing or nullable override patterns instead.
50
50
- Extract shared logic into custom hooks rather than copy-pasting across components.
51
51
52
+
## Generated Files
53
+
54
+
-**Flag any manual edits to files in `openapi-gen/`** or Task SDK generated models. These must be regenerated, not hand-edited.
55
+
52
56
## AI-Generated Code Signals
53
57
54
58
Flag these patterns that indicate low-quality AI-generated contributions:
@@ -63,7 +67,5 @@ Flag these patterns that indicate low-quality AI-generated contributions:
63
67
64
68
## Quality Signals to Check
65
69
66
-
The absence of these signals in a "fix" or "optimization" PR is itself a red flag:
67
-
68
-
-**Bug fixes need regression tests**: A test that fails without the fix and passes with it.
69
-
-**Existing tests must still pass without modification**: If existing tests need changes to pass, the PR may introduce a behavioral regression.
70
+
-**For bug-fix PRs, flag if there is no regression test** — a test that fails without the fix and passes with it.
71
+
-**Flag any existing test modified to accommodate new behavior** — this may indicate a behavioral regression rather than a genuine fix.
0 commit comments