From 54fc3a9d663f29ac06d6446d0d1ac87b9686ce72 Mon Sep 17 00:00:00 2001
From: Luke Chavers
Date: Thu, 12 Sep 2024 10:58:22 -0400
Subject: [PATCH 1/4] 1134: Create first round of DAP scan documentation

---
 pages/dap_scan.md | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)
 create mode 100644 pages/dap_scan.md

diff --git a/pages/dap_scan.md b/pages/dap_scan.md
new file mode 100644
index 0000000..4e36e52
--- /dev/null
+++ b/pages/dap_scan.md
@@ -0,0 +1,29 @@
+### Digital Analytics Program (DAP) Scan
+The [DAP scan](https://github.com/GSA/site-scanning-engine/blob/main/libs/core-scanner/src/scans/dap.ts) takes all outbound requests made from a page and analyzes them to determine if a valid DAP script candidate can be identified. While analyzing these requests we also collect some additional information such as `dap_parameters`, `dap_version`, and `ga_tag_ids`.
+
+#### Values Returned:
+- `dapDetected`: A true/false value that denotes whether a valid DAP script candidate could be located within the outbound requests.
+- `dapParameters`: All parameters being passed to the DAP script.
+- `dapVersion`: If a DAP script version can be identified it will be populated here.
+- `gaTagIds`: A comma delimited list of all GA tags found in the outbound requests.
+
+#### Scan Steps:
+- **Build GA Tag List:**
+  - All outbound requests are checked for the presence of Universal (UA) and G4 tags. We check the URL and also any POST request values. We then returns a comma delimited list of tags found.
+- **Identify DAP Script Candidates:**
+  - We iterate through the list of outbound requests and try to determine if a DAP script can be identified. We look for the presence of the `Universal-Federated-Analytics-Min.js` script file as well as the presence of the `G-CSLL4ZEK4L` property ID.
+  - When looking through the requests we check the URLS and POST data for any reference to the script or the GA tag. If either the script or tag are found we save these as a potential candidate and return this list of candidates for further analysis.
+- **Further Analyze Candidates:**
+  - Now that we have narrowed down the outbound requests to only those that are potential DAP script candidate we can further analyze the requests and pull out the data we need to narrow down our list to one `Best` candidate.
+  - During this analysis we collect the `request body`, `url`, `url parameters`, `postData`, and `dap version`. These values will be used in the next step to determine the best candidate.
+- **Determine Best Candidate:**
+  - All of the candidates are run through a list of checks to determine which one meets the requirements to be a `best` candidate. The checks become less strict as we go.
+  - Checks:
+    1. Does the candidate contain the `Universal-Federated-Analytics-Min.js` script and also have a version number?
+    2. Does the candidate contain the `G-CSLL4ZEK4L` property ID and a version number?
+    3. Does the candidate have a version number?
+    4. Does the candidate contain the `Universal-Federated-Analytics-Min.js` script, but NO version number?
+    5. Does the candidate contain any of the items above individually (version OR script OR GA tag)?
+  - Going through these checks in order we determine which candidate is our `best` match. Check 1 being the best match and 5 being the final match check.
+- **Return our results:**
+  - Once all of the above steps are complete we return the results.
\ No newline at end of file

From 69cc6e703e7ef1bcc15f5df21fb973b554c4bb65 Mon Sep 17 00:00:00 2001
From: Luke Chavers
Date: Thu, 12 Sep 2024 11:06:20 -0400
Subject: [PATCH 2/4] 1134: Create first round of DAP scan documentation

---
 pages/dap_scan.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/pages/dap_scan.md b/pages/dap_scan.md
index 4e36e52..8626428 100644
--- a/pages/dap_scan.md
+++ b/pages/dap_scan.md
@@ -9,12 +9,12 @@ The [DAP scan](https://github.com/GSA/site-scanning-engine/blob/main/libs/core-s
 
 #### Scan Steps:
 - **Build GA Tag List:**
-  - All outbound requests are checked for the presence of Universal (UA) and G4 tags. We check the URL and also any POST request values. We then returns a comma delimited list of tags found.
+  - All outbound requests are checked for the presence of Universal (UA) and G4 tags. We check the URL and also any POST request values. We then return a comma delimited list of tags found.
 - **Identify DAP Script Candidates:**
   - We iterate through the list of outbound requests and try to determine if a DAP script can be identified. We look for the presence of the `Universal-Federated-Analytics-Min.js` script file as well as the presence of the `G-CSLL4ZEK4L` property ID.
   - When looking through the requests we check the URLS and POST data for any reference to the script or the GA tag. If either the script or tag are found we save these as a potential candidate and return this list of candidates for further analysis.
 - **Further Analyze Candidates:**
-  - Now that we have narrowed down the outbound requests to only those that are potential DAP script candidate we can further analyze the requests and pull out the data we need to narrow down our list to one `Best` candidate.
+  - Now that we have narrowed down the outbound requests to only those that are potential DAP script candidates we can further analyze the requests and pull out the data we need to narrow down our list to one `Best` candidate.
   - During this analysis we collect the `request body`, `url`, `url parameters`, `postData`, and `dap version`. These values will be used in the next step to determine the best candidate.
 - **Determine Best Candidate:**
   - All of the candidates are run through a list of checks to determine which one meets the requirements to be a `best` candidate. The checks become less strict as we go.

From 8026864a1464e38002bd9be18b73d0b744cfa1c5 Mon Sep 17 00:00:00 2001
From: Luke Chavers
Date: Thu, 12 Sep 2024 13:51:58 -0400
Subject: [PATCH 3/4] 1134: Further refine DAP scan documentation

---
 pages/dap_scan.md | 41 ++++++++++++++++-------------------------
 1 file changed, 16 insertions(+), 25 deletions(-)

diff --git a/pages/dap_scan.md b/pages/dap_scan.md
index 8626428..5604675 100644
--- a/pages/dap_scan.md
+++ b/pages/dap_scan.md
@@ -1,29 +1,20 @@
 ### Digital Analytics Program (DAP) Scan
-The [DAP scan](https://github.com/GSA/site-scanning-engine/blob/main/libs/core-scanner/src/scans/dap.ts) takes all outbound requests made from a page and analyzes them to determine if a valid DAP script candidate can be identified. While analyzing these requests we also collect some additional information such as `dap_parameters`, `dap_version`, and `ga_tag_ids`.
+The [DAP scan](https://github.com/GSA/site-scanning-engine/blob/main/libs/core-scanner/src/scans/dap.ts) checks if a website reports its statistics to the federal Digital Analytics Program, which collects data from government sites. It also provides additional details like script settings and Google Analytics IDs.
 
-#### Values Returned:
-- `dapDetected`: A true/false value that denotes whether a valid DAP script candidate could be located within the outbound requests.
-- `dapParameters`: All parameters being passed to the DAP script.
-- `dapVersion`: If a DAP script version can be identified it will be populated here.
-- `gaTagIds`: A comma delimited list of all GA tags found in the outbound requests.
+#### What We Report:
+- `dapDetected`: Tells if the DAP script or a specific Google Analytics property was found.
+- `dapParameters`: Lists the settings used in the DAP script.
+- `dapVersion`: Shows the version of the DAP script, if detected.
+- `gaTagIds`: Lists all Google Analytics tags found in the site's outgoing requests.
 
-#### Scan Steps:
+#### How We Check:
 - **Build GA Tag List:**
-  - All outbound requests are checked for the presence of Universal (UA) and G4 tags. We check the URL and also any POST request values. We then return a comma delimited list of tags found.
-- **Identify DAP Script Candidates:**
-  - We iterate through the list of outbound requests and try to determine if a DAP script can be identified. We look for the presence of the `Universal-Federated-Analytics-Min.js` script file as well as the presence of the `G-CSLL4ZEK4L` property ID.
-  - When looking through the requests we check the URLS and POST data for any reference to the script or the GA tag. If either the script or tag are found we save these as a potential candidate and return this list of candidates for further analysis.
-- **Further Analyze Candidates:**
-  - Now that we have narrowed down the outbound requests to only those that are potential DAP script candidates we can further analyze the requests and pull out the data we need to narrow down our list to one `Best` candidate.
-  - During this analysis we collect the `request body`, `url`, `url parameters`, `postData`, and `dap version`. These values will be used in the next step to determine the best candidate.
-- **Determine Best Candidate:**
-  - All of the candidates are run through a list of checks to determine which one meets the requirements to be a `best` candidate. The checks become less strict as we go.
-  - Checks:
-    1. Does the candidate contain the `Universal-Federated-Analytics-Min.js` script and also have a version number?
-    2. Does the candidate contain the `G-CSLL4ZEK4L` property ID and a version number?
-    3. Does the candidate have a version number?
-    4. Does the candidate contain the `Universal-Federated-Analytics-Min.js` script, but NO version number?
-    5. Does the candidate contain any of the items above individually (version OR script OR GA tag)?
-  - Going through these checks in order we determine which candidate is our `best` match. Check 1 being the best match and 5 being the final match check.
-- **Return our results:**
-  - Once all of the above steps are complete we return the results.
\ No newline at end of file
+  - Look at all outgoing requests to find Google Analytics tags. We check URLs and any data sent through forms, then list the tags we find.
+- **Find DAP Script Candidates:**
+  - Search through outgoing requests to see if we can find the DAP script or a specific Google Analytics ID. Save potential matches for closer inspection.
+- **Analyze Candidates::**
+  - Examine the potential matches to get detailed information and find the best one.
+- **Choose the Best Match:**
+  - Check each candidate to see which one best fits the criteria, starting with the most specific and moving to broader checks.
+- **Report Results:**
+  - Once the analysis is complete, we provide the final results.
\ No newline at end of file

From 0bc1a9dc3963b4471bef616ee586f69efc99732e Mon Sep 17 00:00:00 2001
From: Luke Chavers
Date: Thu, 12 Sep 2024 13:54:06 -0400
Subject: [PATCH 4/4] 1134: Further refine DAP scan documentation

---
 pages/dap_scan.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pages/dap_scan.md b/pages/dap_scan.md
index 5604675..646c919 100644
--- a/pages/dap_scan.md
+++ b/pages/dap_scan.md
@@ -12,7 +12,7 @@ The [DAP scan](https://github.com/GSA/site-scanning-engine/blob/main/libs/core-s
   - Look at all outgoing requests to find Google Analytics tags. We check URLs and any data sent through forms, then list the tags we find.
 - **Find DAP Script Candidates:**
   - Search through outgoing requests to see if we can find the DAP script or a specific Google Analytics ID. Save potential matches for closer inspection.
-- **Analyze Candidates::**
+- **Analyze Candidates:**
   - Examine the potential matches to get detailed information and find the best one.
 - **Choose the Best Match:**
   - Check each candidate to see which one best fits the criteria, starting with the most specific and moving to broader checks.
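
The "Choose the Best Match" step keeps only the summary "starting with the most specific and moving to broader checks", while the first patch spelled out five ordered checks. A minimal TypeScript sketch of that ordering follows. It is an illustration under stated assumptions, not the code in `dap.ts`: the `DapCandidate` shape, its field names, the case-insensitive substring matching, and the `chooseBestCandidate` function are all hypothetical. Only the script file name, the property ID, and the order of the checks come from the documentation in this series.

```typescript
// Hypothetical candidate shape; the real scan's types may differ.
interface DapCandidate {
  url: string;
  postData: string | null;
  version: string | null; // DAP script version, if one was identified
}

// Values taken from the documentation in PATCH 1/4.
const DAP_SCRIPT_NAME = "Universal-Federated-Analytics-Min.js";
const DAP_GA_PROPERTY_ID = "G-CSLL4ZEK4L";

// True when the URL or POST body mentions the marker.
// Case-insensitive substring matching is an assumption of this sketch.
function mentions(candidate: DapCandidate, marker: string): boolean {
  const needle = marker.toLowerCase();
  const haystacks = [candidate.url, candidate.postData ?? ""];
  for (const text of haystacks) {
    if (text.toLowerCase().indexOf(needle) !== -1) {
      return true;
    }
  }
  return false;
}

// Checks ordered from most to least strict, mirroring checks 1 to 5.
const checks: Array<(c: DapCandidate) => boolean> = [
  (c) => mentions(c, DAP_SCRIPT_NAME) && c.version !== null,
  (c) => mentions(c, DAP_GA_PROPERTY_ID) && c.version !== null,
  (c) => c.version !== null,
  (c) => mentions(c, DAP_SCRIPT_NAME) && c.version === null,
  (c) =>
    c.version !== null ||
    mentions(c, DAP_SCRIPT_NAME) ||
    mentions(c, DAP_GA_PROPERTY_ID),
];

// Walks the checks in order and returns the first candidate that passes
// the strictest check any candidate can satisfy, or null if none pass.
function chooseBestCandidate(
  candidates: DapCandidate[]
): DapCandidate | null {
  for (const check of checks) {
    for (const candidate of candidates) {
      if (check(candidate)) {
        return candidate;
      }
    }
  }
  return null;
}
```

Keeping the checks in an ordered array means the priority is visible in one place: a candidate that carries the property ID plus a version outranks one that only loads the script without a version, because the outer loop exhausts each stricter check across all candidates before trying a looser one.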