| 1 | +# Apple Store Source |
| 2 | + |
| 3 | +This is the repository for the Apple Store source connector, written in Python. |
| 4 | +For information on the App Store Connect API, [see here](https://developer.apple.com/documentation/AppStoreConnectAPI/downloading-analytics-reports). |
| 5 | + |
| 6 | +## App Store Connect API Reports |
| 7 | + |
| 8 | +This connector extracts reports from the App Store Connect API. |
| 9 | + |
| 10 | +### Report Structure |
| 11 | + |
| 12 | +- Each report type has a unique ID which can be found in `reports.csv` |
| 13 | +- Reports are generated daily by Apple and typically contain data from the current day and the previous 3 days. Because of this rolling window, reports from consecutive days may contain duplicate data, so for each report this connector loads only the data that corresponds to the day before the report was generated. |
| 14 | + |
| 15 | +### Sync Behavior |
| 16 | + |
| 17 | +- Each sync processes reports between `start_date` and `end_date` |
| 18 | + - `end_date` defaults to today |
| 19 | + - `start_date` defaults to 4 days ago |
| 20 | +- The connector downloads fresh copies of reports on each sync |
| 21 | +- If a sync is run multiple times in the same day, newer downloads will replace older ones |
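The default date window described above can be sketched roughly as follows. This is a minimal illustration, not the connector's actual code: the function name `resolve_sync_window` and its parameters are hypothetical, and the 4-day lookback is taken from the defaults listed in this section.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def resolve_sync_window(
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
    today: Optional[date] = None,
    default_lookback_days: int = 4,  # assumption: "4 days ago" from this section
) -> Tuple[date, date]:
    """Return the (start, end) dates of a sync, applying the documented defaults."""
    today = today or date.today()
    # end_date defaults to today
    end = date.fromisoformat(end_date) if end_date else today
    # start_date defaults to a fixed number of days before today
    if start_date:
        start = date.fromisoformat(start_date)
    else:
        start = today - timedelta(days=default_lookback_days)
    if start > end:
        raise ValueError(f"start_date {start} is after end_date {end}")
    return start, end
```

Calling `resolve_sync_window()` with no arguments would give a window from 4 days ago to today. Passing both dates as `YYYY-MM-DD` strings gives exactly that range.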
| 22 | + |
| 23 | +### Report Processing |
| 24 | + |
| 25 | +- For each report type, the connector finds available instances (a report instance represents a specific date) |
| 26 | +- The processing date in the report metadata is the date when Apple generated the report |
| 27 | +- Records are filtered to include only data matching the expected date (1 day before the day of the report) |
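The date filter in the last step could look something like the sketch below. It assumes each record is a dictionary with an ISO-formatted date column. The column name `Date` and the helper name `filter_to_expected_date` are assumptions made for illustration, not names confirmed by the connector.

```python
from datetime import date, timedelta
from typing import Dict, Iterable, Iterator


def filter_to_expected_date(
    records: Iterable[Dict[str, str]],
    processing_date: date,
    date_field: str = "Date",  # assumption: name of the date column in the report
) -> Iterator[Dict[str, str]]:
    """Yield only the records dated one day before the report's processing date."""
    expected = (processing_date - timedelta(days=1)).isoformat()
    for record in records:
        if record.get(date_field) == expected:
            yield record
```

For a report processed on 2025-04-02, only rows dated 2025-04-01 would pass, so the rows shared with neighboring reports in the rolling window are dropped.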
| 28 | + |
| 29 | +### Available Reports |
| 30 | + |
| 31 | +#### App Install Performance |
| 32 | +- Documents app installation performance metrics |
| 33 | +- Documentation: https://developer.apple.com/documentation/analytics-reports/app-installs-performance |
| 34 | +- ID: r5-1032fee7-dfb3-4a4a-b24d-e603c95f5b09 |
| 35 | + |
| 36 | +#### Additional Reports |
| 37 | +- App Downloads Detailed |
| 38 | +- App Installation and Deletion Detailed |
| 39 | +- App Sessions Detailed |
| 40 | +- App Discovery and Engagement Detailed |
| 41 | + |
| 42 | +### Technical Notes |
| 43 | + |
| 44 | +- The connection may take several minutes to complete due to the API response times |
| 45 | +- A deduplication mechanism is implemented in the `read_records` function to handle overlapping data |
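One simple way to deduplicate overlapping rows is to remember every record already emitted and skip exact repeats. The sketch below shows that general idea only. It is not the logic in `read_records`, whose details are not described here, and the helper name `deduplicate` is hypothetical.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple


def deduplicate(records: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    """Yield each distinct record once, preserving first-seen order."""
    seen: Set[Tuple[Tuple[str, str], ...]] = set()
    for record in records:
        # Build an order-independent, hashable key from the record's fields.
        key = tuple(sorted(record.items()))
        if key in seen:
            continue
        seen.add(key)
        yield record
```

This approach assumes all field values are hashable (for example, strings parsed from a CSV) and keeps every seen key in memory, which may matter for very large reports.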
| 46 | + |
| 47 | +## Overview |
| 48 | + |
| 49 | +This connector extracts data from different App Store reports using the App Store Connect API. |
| 50 | + |
| 51 | +### Output schema |
| 52 | + |
| 53 | +This connector outputs the following streams, each of which corresponds to a specific report: |
| 54 | + |
| 55 | +1. **App Installs Performance** |
| 56 | +2. **App Downloads Detailed** |
| 57 | +3. **App Installation and Deletion Detailed** |
| 58 | +4. **App Sessions Detailed** |
| 59 | +5. **App Discovery and Engagement Detailed** |
| 60 | + |
| 61 | +### Configuration |
| 62 | + |
| 63 | +The connector requires the following configuration parameters: |
| 64 | + |
| 65 | +```yaml |
| 66 | + key_id: |
| 67 | + description: Your App Store Connect API Key ID. |
| 68 | + issuer_id: |
| 69 | + description: Your App Store Connect API Issuer ID (found in the API Keys section of App Store Connect). |
| 70 | + private_key: |
| 71 | + description: The private key content for the App Store Connect API. Include the entire key, including the BEGIN and END lines. |
| 72 | + start_date (optional): |
| 73 | + description: The date to start syncing data from, in YYYY-MM-DD format. If not provided, defaults to 3 days ago. The oldest available date is 2025-04-02, because no report requests were made before that date. |
| 74 | + examples: |
| 75 | + - "2023-01-01" |
| 76 | + end_date (optional): |
| 77 | + description: The date to sync data until, in YYYY-MM-DD format. If not provided, defaults to the current date. |
| 78 | + examples: |
| 79 | + - "2023-04-30" |
| 80 | + report_ids (optional): |
| 81 | + description: Custom report IDs to use for fetching data. If not provided, default IDs will be used. |
| 82 | + |
| 83 | +``` |
| 84 | + |
| 85 | + |
| 86 | + |
| 87 | +## Local development |
| 88 | + |
| 89 | +### Prerequisites |
| 90 | + |
| 91 | +#### Activate Virtual Environment and install dependencies |
| 92 | +From this connector directory, create a virtual environment: |
| 93 | +``` |
| 94 | +python -m venv .venv |
| 95 | +``` |
| 96 | +``` |
| 97 | +source .venv/bin/activate |
| 98 | +pip install -r requirements.txt |
| 99 | +``` |
| 100 | + |
| 101 | +### Locally running the connector |
| 102 | +``` |
| 103 | +python main.py spec |
| 104 | +python main.py check --config sample_files/config-example.json |
| 105 | +python main.py discover --config sample_files/config-example.json |
| 106 | +python main.py read --config sample_files/config-example.json --catalog sample_files/configured_catalog.json |
| 107 | +``` |
| 108 | + |
| 109 | +### Locally running the connector docker image |
| 110 | + |
| 111 | +```bash |
| 112 | +docker build -t airbyte/source-app-store:dev . |
| 113 | +# Running the spec command against your patched connector |
| 114 | +docker run airbyte/source-app-store:dev spec |
| 115 | +``` |
| 116 | + |
| 117 | +#### Run |
| 118 | +Then run any of the connector commands as follows: |
| 119 | +``` |
| 120 | +docker run --rm airbyte/source-app-store:dev spec |
| 121 | +docker run --rm -v $(pwd)/sample_files:/sample_files airbyte/source-app-store:dev check --config /sample_files/config-example.json |
| 122 | +docker run --rm -v $(pwd)/sample_files:/sample_files airbyte/source-app-store:dev discover --config /sample_files/config-example.json |
| 123 | +docker run --rm -v $(pwd)/sample_files:/sample_files airbyte/source-app-store:dev read --config /sample_files/config-example.json --catalog /sample_files/configured_catalog.json |
| 124 | +``` |
| 125 | + |
| 126 | +## Notes |
| 127 | + |
| 128 | +- `reports.csv` lists all existing reports with their name, category, and ID. |
| 129 | +- Be aware that the scripts can take several minutes to execute. |
| 130 | + |
| 131 | +* App Install Performance |
| 132 | +A new report is generated every day and contains data from the current day and the previous 3 days, so consecutive reports (for example, the 2025-04-01 and 2025-04-02 reports) can contain duplicate data. |
| 133 | +This is why a deduplication mechanism is implemented in the `read_records` function. |
| 134 | + |
| 135 | +Doc about the report: https://developer.apple.com/documentation/analytics-reports/app-installs-performance |
| 136 | + |
| 137 | +If the sync is run twice on the same day, the second run overrides the first one (i.e., older downloads are replaced by the new ones). |
| 138 | + |
| 139 | +The connector processes every report with a date between `start_date` and `end_date` (`end_date` defaults to today; `start_date` defaults to 4 days ago). The processing date is the date of the report. |
| 140 | + |
| 141 | +Each sync downloads fresh copies of the reports. |
| 142 | +Old files are overwritten with new data. |
| 143 | + |
| 144 | +Report IDs can be found in the `reports.csv` file. This file is generated after making a report request API call with admin credentials. |
| 145 | + |
| 146 | +For each report, an instance corresponds to a report date, so there is one instance per report per day. |