Problem
The source-iterable connector (v0.6.53) can permanently lose events that fall near the trailing edge of an incremental sync time window. This happens because:
- The connector queries the Iterable `/api/export/data.json` endpoint with a time range from `cursor` to `now()`.
- The Iterable Export API is eventually consistent: events may not be immediately available in the export endpoint even after they are created.
- Once the sync completes, the cursor advances past the window. No future sync will re-query that time range.
This means events created near the end of a sync window that haven't yet been indexed by Iterable's export pipeline are silently and permanently dropped.
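The failure mode can be reproduced with a small simulation. This is a simplified sketch, not the connector's code: `run_sync` is a hypothetical helper, the window timestamps are taken from job 389585 with the date assumed to match the event's, and the cursor is assumed to advance exactly to the window end.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc


def run_sync(visible_events, cursor, now):
    """Hypothetical model of one incremental sync.

    Returns the events visible in the export within [cursor, now]
    and the new cursor, which is assumed to advance to `now`.
    """
    fetched = [e for e in visible_events if cursor <= e <= now]
    return fetched, now


cursor = datetime(2026, 2, 18, 18, 42, 52, tzinfo=UTC)
window_end = datetime(2026, 2, 18, 19, 53, 41, tzinfo=UTC)
late_event = datetime(2026, 2, 18, 19, 53, 24, tzinfo=UTC)

# Sync 1: the event exists but the export endpoint does not return it yet.
fetched_1, cursor = run_sync([], cursor, window_end)

# Sync 2: the event is now visible, but the window starts after it.
next_end = window_end + timedelta(hours=1)
fetched_2, cursor = run_sync([late_event], cursor, next_end)

# The event is in neither result set, and no later window will cover it.
lost = late_event not in fetched_1 + fetched_2
```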
Evidence
We observed this in production across three consecutive sync jobs for the email_send stream:
| Job | email_send Window (UTC) | Records |
|---|---|---|
| 389472 | 16:20:29 → ~17:31:42 | 240,169 |
| 389530 | 17:30:37 → ~18:42:59 | 28,906 |
| 389585 | 18:42:52 → ~19:53:41 | 25,039 |
A specific email_send event with createdAt = 2026-02-18 19:53:24 UTC was confirmed to exist via the Iterable API (direct event lookup), but was not returned by the export API during job 389585's sync, even though:
- The event's timestamp (19:53:24) falls within job 389585's query window (18:42:52 → ~19:53:41) — approximately 17 seconds before the window end.
- The sync ran at 21:53 UTC, a full 2 hours after the event was created.
The cursor then advanced past 19:53:41, so subsequent syncs never re-queried this time range, and the event was permanently lost from the pipeline.
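The timing claim above can be checked with a few lines of arithmetic. The table lists only times of day, so this sketch assumes the window for job 389585 falls on the same date as the event (2026-02-18) and treats the approximate window end as exact.

```python
from datetime import datetime, timezone

UTC = timezone.utc

# Window for job 389585 (date assumed, end time approximate in the source).
window_start = datetime(2026, 2, 18, 18, 42, 52, tzinfo=UTC)
window_end = datetime(2026, 2, 18, 19, 53, 41, tzinfo=UTC)

# createdAt of the missing email_send event.
event_created_at = datetime(2026, 2, 18, 19, 53, 24, tzinfo=UTC)

in_window = window_start <= event_created_at <= window_end
seconds_before_end = (window_end - event_created_at).total_seconds()

print(in_window, seconds_before_end)
```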
Affected Streams
All incremental streams that use the /api/export/data.json endpoint are affected:
email_send, email_open, email_click, email_bounce, email_complaint, email_subscribe, email_send_skip, custom_event
Proposed Solution
Option A: Add an end_datetime offset config parameter (preferred)
Add a user-configurable parameter (e.g., end_time_buffer_minutes, default 5) that shifts the end of each sync window backward:
end_datetime = now() - timedelta(minutes=end_time_buffer_minutes)
This gives events time to settle in the Iterable export pipeline before the sync window closes over them. The trade-off is a small delay in data freshness (e.g., 5 minutes), which is negligible for email event data.
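A minimal sketch of how the buffered window could be computed, assuming the parameter name `end_time_buffer_minutes` proposed above. The helper names `buffered_end_datetime` and `sync_window` are hypothetical and do not exist in the connector; the clamp to the cursor is an added assumption so the window never runs backward.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc


def buffered_end_datetime(now, end_time_buffer_minutes=5):
    """Shift the end of the sync window backward by the configured buffer."""
    if end_time_buffer_minutes < 0:
        raise ValueError("end_time_buffer_minutes must be non-negative")
    return now - timedelta(minutes=end_time_buffer_minutes)


def sync_window(cursor, now, end_time_buffer_minutes=5):
    """Return (start, end) for the next sync.

    If the buffer would place the end before the cursor, the window is
    clamped to an empty range so the cursor never moves backward.
    """
    end = buffered_end_datetime(now, end_time_buffer_minutes)
    return cursor, max(cursor, end)


cursor = datetime(2026, 2, 18, 18, 42, 52, tzinfo=UTC)
now = datetime(2026, 2, 18, 19, 53, 41, tzinfo=UTC)
event_created_at = datetime(2026, 2, 18, 19, 53, 24, tzinfo=UTC)

start, end = sync_window(cursor, now)

# With the default 5-minute buffer the event falls after `end`, so it is
# left for the next sync instead of being skipped by an advanced cursor.
deferred = event_created_at > end
```

Here the window closes at 19:48:41, so the 19:53:24 event is picked up by a later sync once the cursor reaches it. How large the buffer needs to be depends on how long the export endpoint takes to settle, which this report does not quantify.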
Connector Version
- source-iterable:0.6.53
- Airbyte OSS, container-orchestrator:1.8.1, destination-snowflake:3.15.5
Environment
- Airbyte OSS deployment
- Destination: Snowflake
- Sync frequency: ~1 hour
Internal Tracking: https://github.com/airbytehq/oncall/issues/11563