-
Notifications
You must be signed in to change notification settings - Fork 35
remove dependency on {futile.logger} #260
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -39,6 +39,7 @@ jobs: | |
| uses: actions/checkout@v6 | ||
| with: | ||
| fetch-depth: 0 | ||
| persist-credentials: false | ||
| - uses: pre-commit/[email protected] | ||
| - name: set up R | ||
| uses: r-lib/actions/setup-r@v2 | ||
|
|
||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -5,17 +5,17 @@ exclude: | | |
| )$ | ||
| repos: | ||
| - repo: https://github.com/pre-commit/pre-commit-hooks | ||
| rev: v5.0.0 | ||
| rev: v6.0.0 | ||
|
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just ran |
||
| hooks: | ||
| - id: end-of-file-fixer | ||
| - id: trailing-whitespace | ||
| - repo: https://github.com/maxwinterstein/shfmt-py | ||
| rev: v3.11.0.2 | ||
| rev: v3.12.0.1 | ||
| hooks: | ||
| - id: shfmt | ||
| args: ["--indent=4", "--space-redirects", "--write"] | ||
| - repo: https://github.com/shellcheck-py/shellcheck-py | ||
| rev: v0.10.0.1 | ||
| rev: v0.11.0.1 | ||
| hooks: | ||
| - id: shellcheck | ||
| args: ["--exclude=SC2002"] | ||
|
|
||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -8,7 +8,7 @@ | |
| #' as it has pulled this many hits. Default is \code{Inf}, meaning that | ||
| #' all possible hits will be pulled. | ||
| #' @param size Number of records per page of results. | ||
| #' See \href{https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html#request-body-search-from-size}{Elasticsearch docs} for more. | ||
| #' See \href{https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-scroll}{Elasticsearch docs} for more. | ||
|
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
| #' Note that this will be reset to 0 if you submit a \code{query_body} with | ||
| #' an "aggs" request in it. Also see \code{max_hits}. | ||
| #' @param query_body String with a valid Elasticsearch query. Default is an empty query. | ||
|
|
@@ -17,7 +17,7 @@ | |
| #' The scroll context will be refreshed every time you ask Elasticsearch | ||
| #' for another record, so this parameter should just be the amount of | ||
| #' time you expect to pass between requests. See the | ||
| #' \href{https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html#request-body-search-scroll}{Elasticsearch scroll/pagination docs} | ||
| #' \href{https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-scroll}{Elasticsearch scroll/pagination docs} | ||
| #' for more information. | ||
| #' @param n_cores Number of cores to distribute fetching and processing over. | ||
| #' @param break_on_duplicates Boolean, defaults to TRUE. \code{es_search} uses the size of the | ||
|
|
@@ -84,6 +84,7 @@ es_search <- function(es_host | |
| , break_on_duplicates = TRUE | ||
| , ignore_scroll_restriction = FALSE | ||
| , intermediates_dir = getwd() | ||
| , verbose = FALSE | ||
| ) { | ||
|
|
||
| # Check if this is an aggs or straight-up search query | ||
|
|
@@ -135,6 +136,7 @@ es_search <- function(es_host | |
| , es_index = es_index | ||
| , trailing_args = "size=0" | ||
| , query_body = query_body | ||
| , verbose = verbose | ||
| ) | ||
|
|
||
| return(chomp_aggs(aggs_json = result)) | ||
|
|
@@ -151,7 +153,8 @@ es_search <- function(es_host | |
| , n_cores = n_cores | ||
| , break_on_duplicates = break_on_duplicates | ||
| , ignore_scroll_restriction = ignore_scroll_restriction | ||
| , intermediates_dir = intermediates_dir)) | ||
| , intermediates_dir = intermediates_dir | ||
| , verbose = verbose)) | ||
| } | ||
|
|
||
| # nolint start | ||
|
|
@@ -162,7 +165,7 @@ es_search <- function(es_host | |
| # [param] es_host A string identifying an Elasticsearch host. This should be of the form | ||
| # [transfer_protocol][hostname]:[port]. For example, 'http://myindex.thing.com:9200'. | ||
| # [param] es_index The name of an Elasticsearch index to be queried. | ||
| # [param] size Number of records per page of results. See \href{https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html#request-body-search-from-size}{Elasticsearch docs} for more | ||
| # [param] size Number of records per page of results. See \href{https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-scroll}{Elasticsearch docs} for more | ||
| # [param] query_body String with a valid Elasticsearch query to be passed to \code{\link[elastic]{Search}}. | ||
| # Default is an empty query. | ||
| # [param] scroll How long should the scroll context be held open? This should be a | ||
|
|
@@ -190,6 +193,7 @@ es_search <- function(es_host | |
| # value longer than an hour, set \code{ignore_scroll_restriction} | ||
| # to \code{TRUE}. | ||
| # [param] intermediates_dir passed through from es_search. See es_search docs. | ||
| # [param] verbose TRUE to print DEBUG-level logs. | ||
| # [examples] | ||
| # \dontrun{ | ||
| # | ||
|
|
@@ -224,6 +228,7 @@ es_search <- function(es_host | |
| , break_on_duplicates | ||
| , ignore_scroll_restriction | ||
| , intermediates_dir | ||
| , verbose | ||
| ) { | ||
|
|
||
| # Check es_host | ||
|
|
@@ -238,7 +243,7 @@ es_search <- function(es_host | |
| "\n\nIf you understand the costs and would like to make requests ", | ||
| "with a longer-lived context, re-run this function with ", | ||
| "ignore_scroll_restriction = TRUE.\n", | ||
| "\nPlease see https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html#request-body-search-scroll ", # nolint[line_length] | ||
| "\nPlease see https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-scroll ", # nolint[line_length] | ||
| "for more information.") | ||
| .log_fatal(msg) | ||
| } | ||
|
|
@@ -284,12 +289,13 @@ es_search <- function(es_host | |
| , es_index = es_index | ||
| , trailing_args = paste0("size=", size, "&scroll=", scroll) | ||
| , query_body = query_body | ||
| , verbose = verbose | ||
| ) | ||
|
|
||
| # Parse to JSON to get total number of documents matching the query | ||
| firstResult <- jsonlite::fromJSON(firstResultJSON, simplifyVector = FALSE) | ||
|
|
||
| major_version <- .get_es_version(es_host) | ||
| major_version <- .get_es_version(es_host, verbose = verbose) | ||
| if (as.integer(major_version) > 6) { | ||
| hits_to_pull <- min(firstResult[["hits"]][["total"]][["value"]], max_hits) | ||
| } else { | ||
|
|
@@ -458,6 +464,7 @@ es_search <- function(es_host | |
| # hits_to_pull - Total hits to be pulled (documents matching user's query). | ||
| # Or, in the case where max_hits < number of matching docs, | ||
| # max_hits. | ||
| # verbose - TRUE to print DEBUG-level logs. | ||
| #' @importFrom jsonlite fromJSON | ||
| .keep_on_pullin <- function(scroll_id | ||
| , out_path | ||
|
|
@@ -466,11 +473,12 @@ es_search <- function(es_host | |
| , scroll | ||
| , hits_pulled | ||
| , hits_to_pull | ||
| , verbose | ||
| ) { | ||
|
|
||
| # Note that the old scrolling strategy was deprecated in Elasticsearch 5.x and | ||
| # officially dropped in Elasticsearch 6.x. Need to grab the correct method here | ||
| major_version <- .get_es_version(es_host) | ||
| major_version <- .get_es_version(es_host, verbose = verbose) | ||
| scrolling_request <- switch( | ||
| major_version | ||
| , "1" = .legacy_scroll_request | ||
|
|
@@ -487,6 +495,7 @@ es_search <- function(es_host | |
| es_host = es_host | ||
| , scroll = scroll | ||
| , scroll_id = scroll_id | ||
| , verbose = verbose | ||
| ) | ||
| .stop_for_status(result) | ||
| resultJSON <- .content(result, as = "text") | ||
|
|
@@ -531,7 +540,7 @@ es_search <- function(es_host | |
| # [name] .new_scroll_request | ||
| # [description] Make a scrolling request and return the result | ||
| # [references] https://www.elastic.co/guide/en/elasticsearch/reference/6.7/search-request-scroll.html | ||
| .new_scroll_request <- function(es_host, scroll, scroll_id) { | ||
| .new_scroll_request <- function(es_host, scroll, scroll_id, verbose) { | ||
|
|
||
| # Set up scroll_url | ||
| scroll_url <- paste0(es_host, "/_search/scroll") # nolint[absolute_path,non_portable_path] | ||
|
|
@@ -541,14 +550,15 @@ es_search <- function(es_host | |
| verb = "POST" | ||
| , url = scroll_url | ||
| , body = sprintf('{"scroll": "%s", "scroll_id": "%s"}', scroll, scroll_id) | ||
| , verbose = verbose | ||
| ) | ||
| return(result) | ||
| } | ||
|
|
||
| # [title] Make a scroll request with the strategy supported by Elasticsearch 1.x and Elasticsearch 2.x | ||
| # [name] .legacy_scroll_request | ||
| # [description] Make a scrolling request and return the result | ||
| .legacy_scroll_request <- function(es_host, scroll, scroll_id) { | ||
| .legacy_scroll_request <- function(es_host, scroll, scroll_id, verbose) { | ||
|
|
||
| # Set up scroll_url | ||
| scroll_url <- paste0(es_host, "/_search/scroll?scroll=", scroll) | ||
|
|
@@ -558,6 +568,7 @@ es_search <- function(es_host | |
| verb = "POST" | ||
| , url = scroll_url | ||
| , body = scroll_id | ||
| , verbose = verbose | ||
| ) | ||
| return(result) | ||
| } | ||
|
|
@@ -594,7 +605,7 @@ es_search <- function(es_host | |
| portPattern <- ":[0-9]+$" | ||
| if (! grepl(portPattern, es_host) == 1) { | ||
| msg <- paste0("No port found in es_host! es_host should be a string of the" | ||
| , "form [transfer_protocol][hostname]:[port]). for " | ||
| , " form [transfer_protocol][hostname]:[port]. For " | ||
| , "example: 'http://myindex.mysite.com:9200'") | ||
| .log_fatal(msg) | ||
| } | ||
|
|
@@ -625,14 +636,15 @@ es_search <- function(es_host | |
| # version of Elasticsearch. | ||
| # [param] es_host A string identifying an Elasticsearch host. This should be of the form | ||
| # [transfer_protocol][hostname]:[port]. For example, 'http://myindex.thing.com:9200'. | ||
| .get_es_version <- function(es_host) { | ||
| .get_es_version <- function(es_host, verbose) { | ||
|
|
||
| # Hit the cluster root to get metadata | ||
| .log_info("Checking Elasticsearch version...") | ||
| result <- .request( | ||
| verb = "GET" | ||
| , url = es_host | ||
| , body = NULL | ||
| , verbose = verbose | ||
| ) | ||
| .stop_for_status(result) | ||
|
|
||
|
|
@@ -669,6 +681,7 @@ es_search <- function(es_host | |
| # For example, to limit the size of the returned results, you might pass | ||
| # "size=0". This can be a single string or a character vector of params, e.g. | ||
| # \code{c('size=0', 'scroll=5m')} | ||
| # [param] verbose TRUE to print DEBUG-level logs. | ||
| # [param] query_body A JSON string with valid Elasticsearch DSL | ||
| # [examples] | ||
| # \dontrun{ | ||
|
|
@@ -701,6 +714,7 @@ es_search <- function(es_host | |
| , es_index | ||
| , trailing_args = NULL | ||
| , query_body | ||
| , verbose | ||
| ) { | ||
|
|
||
| # Input checking | ||
|
|
@@ -717,6 +731,7 @@ es_search <- function(es_host | |
| verb = "POST" | ||
| , url = reqURL | ||
| , body = query_body | ||
| , verbose = verbose | ||
| ) | ||
| .stop_for_status(result) | ||
| result <- .content(result, as = "text") | ||
|
|
||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another tiny change I wanted to slip in here since we're touching the repo for the first time in months.
This is a small security patch that avoids storing the temporary git creds created by
actions/checkout, which makes it harder for injected code to do bad things.