ReadManyItems API #42167
Merged
+2,606
−4
32 commits
f494b52 feature:read_many_api first iteration (dibahlfi)
1d29d85 read_many_items - adding logic for chunking/semaphores (dibahlfi)
b73001f read_many_items - code refactor (dibahlfi)
00672ca read_many_items - created a new helper class for chunking/concurrency (dibahlfi)
2e8f35b fix: adding test cases (dibahlfi)
91529c2 read_many_items - refactoring (dibahlfi)
c7d79e3 read_many_items - refactoring (dibahlfi)
9be4c7d read_many_items - clean up (dibahlfi)
3f90167 Update sdk/cosmos/azure-cosmos/azure/cosmos/aio/_container.py (dibahlfi)
f9367d7 Update sdk/cosmos/azure-cosmos/tests/test_read_many_items_partition_s… (dibahlfi)
685084e read_many_items - adding code for the sync flow. (dibahlfi)
d8005e3 Merge branch 'users/dibahl/readmanyitems_api' of https://github.com/A… (dibahlfi)
b6fe1bc resolving conflicts (dibahlfi)
488d8b7 fix: addressing comments (dibahlfi)
d4b66e8 fix: add support for aggregated request charges in the header (dibahlfi)
03e0ff8 fix: fixing typos (dibahlfi)
1cebd0a fix: fixing split tests (dibahlfi)
d682db1 fix: fixing linting issues (dibahlfi)
3ac03b5 fix: addressing comments (dibahlfi)
f750525 fix: linting errors (dibahlfi)
3e780f8 fix: linting errors (dibahlfi)
a9c8d5f fix: adding order (dibahlfi)
285cbf2 fix: cleaning up (dibahlfi)
0b704d6 fix: cleaning up (dibahlfi)
f59a12b fix: cleaning up (dibahlfi)
c13d8eb fix: bug fixing in the chunking logic (dibahlfi)
396100b fix: addressing comments (dibahlfi)
4c763ae fixing pylink comments (dibahlfi)
dd3da39 resolving conflict in CHANGELOG.md (dibahlfi)
7ac6bb1 fixing pylint errors (dibahlfi)
750b0e6 fix: adding samples (dibahlfi)
7afdadf fix: fixing tests (dibahlfi)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,181 @@ | ||
# The MIT License (MIT) | ||
# Copyright (c) 2014 Microsoft Corporation | ||
|
||
# Permission is hereby granted, free of charge, to any person obtaining a copy | ||
# of this software and associated documentation files (the "Software"), to deal | ||
# in the Software without restriction, including without limitation the rights | ||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
# copies of the Software, and to permit persons to whom the Software is | ||
# furnished to do so, subject to the following conditions: | ||
|
||
# The above copyright notice and this permission notice shall be included in all | ||
# copies or substantial portions of the Software. | ||
|
||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
# SOFTWARE. | ||
|
||
"""Internal query builder for multi-item operations.""" | ||
|
||
from typing import Dict, Tuple, Any, TYPE_CHECKING, Sequence | ||
|
||
from azure.cosmos.partition_key import _Undefined, _Empty, NonePartitionKeyValue | ||
if TYPE_CHECKING: | ||
from azure.cosmos._cosmos_client_connection import _PartitionKeyType | ||
|
||
|
||
class _QueryBuilder: | ||
"""Internal class for building optimized queries for multi-item operations.""" | ||
|
||
@staticmethod | ||
def _get_field_expression(path: str) -> str: | ||
"""Converts a path string into a query field expression. | ||
|
||
:param str path: The path string to convert. | ||
:return: The query field expression. | ||
:rtype: str | ||
""" | ||
field_name = path.lstrip("/") | ||
if "/" in field_name: | ||
# Handle nested paths like "a/b" -> c["a"]["b"] | ||
field_parts = field_name.split("/") | ||
return "c" + "".join(f'["{part}"]' for part in field_parts) | ||
# Handle simple paths like "pk" -> c.pk or c["non-identifier-pk"] | ||
return f"c.{field_name}" if field_name.isidentifier() else f'c["{field_name}"]' | ||
|
||
@staticmethod | ||
def is_id_partition_key_query( | ||
items: Sequence[Tuple[str, "_PartitionKeyType"]], | ||
partition_key_definition: Dict[str, Any] | ||
) -> bool: | ||
"""Check if we can use the optimized ID IN query. | ||
|
||
:param Sequence[tuple[str, any]] items: The list of items to check. | ||
:param dict[str, any] partition_key_definition: The partition key definition of the container. | ||
:return: True if the optimized ID IN query can be used, False otherwise. | ||
:rtype: bool | ||
""" | ||
partition_key_paths = partition_key_definition.get("paths", []) | ||
if len(partition_key_paths) != 1 or partition_key_paths[0] != "/id": | ||
return False | ||
|
||
for item_id, partition_key_value in items: | ||
|
||
pk_val = partition_key_value[0] if isinstance(partition_key_value, list) else partition_key_value | ||
if pk_val != item_id: | ||
return False | ||
return True | ||
|
||
@staticmethod | ||
def is_single_logical_partition_query( | ||
items: Sequence[Tuple[str, "_PartitionKeyType"]] | ||
) -> bool: | ||
"""Check if all items in a chunk belong to the same logical partition. | ||
|
||
This is used to determine if an optimized query with an IN clause can be used. | ||
|
||
:param Sequence[tuple[str, any]] items: The list of items to check. | ||
:return: True if all items belong to the same logical partition, False otherwise. | ||
:rtype: bool | ||
""" | ||
if not items or len(items) <= 1: | ||
return False | ||
first_pk = items[0][1] | ||
return all(item[1] == first_pk for item in items) | ||
|
||
@staticmethod | ||
def build_pk_and_id_in_query( | ||
items: Sequence[Tuple[str, "_PartitionKeyType"]], | ||
partition_key_definition: Dict[str, Any] | ||
) -> Dict[str, Any]: | ||
"""Build a query for items in a single logical partition using an IN clause for IDs. | ||
|
||
e.g., SELECT * FROM c WHERE c.pk = @pk AND c.id IN (@id1, @id2) | ||
|
||
:param Sequence[tuple[str, any]] items: The list of items to build the query for. | ||
:param dict[str, any] partition_key_definition: The partition key definition of the container. | ||
:return: A dictionary containing the query text and parameters. | ||
:rtype: dict[str, any] | ||
""" | ||
# Use the shared helper so nested or non-identifier paths yield a valid field expression. | ||
partition_key_field = _QueryBuilder._get_field_expression(partition_key_definition['paths'][0]) | ||
partition_key_value = items[0][1] | ||
|
||
id_params = {f"@id{i}": item[0] for i, item in enumerate(items)} | ||
id_param_names = ", ".join(id_params.keys()) | ||
|
||
query_text = f"SELECT * FROM c WHERE {partition_key_field} = @pk AND c.id IN ({id_param_names})" | ||
|
||
parameters = [{"name": "@pk", "value": partition_key_value}] | ||
parameters.extend([{"name": name, "value": value} for name, value in id_params.items()]) | ||
|
||
return {"query": query_text, "parameters": parameters} | ||
|
||
@staticmethod | ||
def build_id_in_query(items: Sequence[Tuple[str, "_PartitionKeyType"]]) -> Dict[str, Any]: | ||
"""Build optimized query using ID IN clause when ID equals partition key. | ||
|
||
:param Sequence[tuple[str, any]] items: The list of items to build the query for. | ||
:return: A dictionary containing the query text and parameters. | ||
:rtype: dict[str, any] | ||
""" | ||
id_params = {f"@param_id{i}": item_id for i, (item_id, _) in enumerate(items)} | ||
param_names = ", ".join(id_params.keys()) | ||
parameters = [{"name": name, "value": value} for name, value in id_params.items()] | ||
|
||
query_string = f"SELECT * FROM c WHERE c.id IN ( {param_names} )" | ||
|
||
return {"query": query_string, "parameters": parameters} | ||
|
||
@staticmethod | ||
def build_parameterized_query_for_items( | ||
items_by_partition: Dict[str, Sequence[Tuple[str, "_PartitionKeyType"]]], | ||
partition_key_definition: Dict[str, Any] | ||
) -> Dict[str, Any]: | ||
"""Builds a parameterized SQL query for reading multiple items. | ||
|
||
:param dict[str, Sequence[tuple[str, any]]] items_by_partition: A dictionary of items grouped by partition key. | ||
:param dict[str, any] partition_key_definition: The partition key definition of the container. | ||
:return: A dictionary containing the query text and parameters. | ||
:rtype: dict[str, any] | ||
""" | ||
all_items = [item for partition_items in items_by_partition.values() for item in partition_items] | ||
|
||
if not all_items: | ||
return {"query": "SELECT * FROM c WHERE false", "parameters": []} | ||
|
||
partition_key_paths = partition_key_definition.get("paths", []) | ||
query_parts = [] | ||
parameters = [] | ||
|
||
for i, (item_id, partition_key_value) in enumerate(all_items): | ||
id_param_name = f"@param_id{i}" | ||
parameters.append({"name": id_param_name, "value": item_id}) | ||
condition_parts = [f"c.id = {id_param_name}"] | ||
|
||
pk_values = [] | ||
if partition_key_value is not None and not isinstance(partition_key_value, type(NonePartitionKeyValue)): | ||
pk_values = partition_key_value if isinstance(partition_key_value, list) else [partition_key_value] | ||
if len(pk_values) != len(partition_key_paths): | ||
raise ValueError( | ||
f"Number of components in partition key value ({len(pk_values)}) " | ||
f"does not match definition ({len(partition_key_paths)})" | ||
) | ||
|
||
for j, path in enumerate(partition_key_paths): | ||
field_expr = _QueryBuilder._get_field_expression(path) | ||
pk_value = pk_values[j] if j < len(pk_values) else None | ||
|
||
if pk_value is None or isinstance(pk_value, (_Undefined, _Empty)): | ||
condition_parts.append(f"IS_DEFINED({field_expr}) = false") | ||
else: | ||
pk_param_name = f"@param_pk{i}{j}" | ||
parameters.append({"name": pk_param_name, "value": pk_value}) | ||
condition_parts.append(f"{field_expr} = {pk_param_name}") | ||
|
||
query_parts.append(f"( {' AND '.join(condition_parts)} )") | ||
|
||
query_string = f"SELECT * FROM c WHERE ( {' OR '.join(query_parts)} )" | ||
return {"query": query_string, "parameters": parameters} |
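
Illustrative note (not part of the diff): the sketch below is a standalone, simplified approximation of the query shape that `build_parameterized_query_for_items` appears to produce, written so it can run without `azure-cosmos` installed. The names `field_expression` and `build_or_query` are hypothetical stand-ins, and the handling of `None` as "partition key not defined" is an assumption that leaves out the `_Undefined`, `_Empty`, and `NonePartitionKeyValue` sentinels used in the real code.

```python
from typing import Any, Dict, List, Sequence, Tuple


def field_expression(path: str) -> str:
    """Hypothetical stand-in for _QueryBuilder._get_field_expression."""
    name = path.lstrip("/")
    if "/" in name:
        # Nested path such as "/a/b" becomes c["a"]["b"].
        return "c" + "".join(f'["{part}"]' for part in name.split("/"))
    # Simple path: dotted access when the name is a valid identifier.
    return f"c.{name}" if name.isidentifier() else f'c["{name}"]'


def build_or_query(
    items: Sequence[Tuple[str, Any]], pk_paths: Sequence[str]
) -> Dict[str, Any]:
    """Simplified sketch: one (id AND pk...) clause per item, joined with OR."""
    if not items:
        return {"query": "SELECT * FROM c WHERE false", "parameters": []}

    clauses: List[str] = []
    parameters: List[Dict[str, Any]] = []
    for i, (item_id, pk) in enumerate(items):
        parameters.append({"name": f"@param_id{i}", "value": item_id})
        conditions = [f"c.id = @param_id{i}"]

        # Assumption: None means "no partition key value" for every path.
        if pk is None:
            pk_values = [None] * len(pk_paths)
        else:
            pk_values = pk if isinstance(pk, list) else [pk]
        if len(pk_values) != len(pk_paths):
            raise ValueError("partition key value does not match definition")

        for j, path in enumerate(pk_paths):
            expr = field_expression(path)
            if pk_values[j] is None:
                conditions.append(f"IS_DEFINED({expr}) = false")
            else:
                name = f"@param_pk{i}{j}"
                parameters.append({"name": name, "value": pk_values[j]})
                conditions.append(f"{expr} = {name}")
        clauses.append(f"( {' AND '.join(conditions)} )")

    query = f"SELECT * FROM c WHERE ( {' OR '.join(clauses)} )"
    return {"query": query, "parameters": parameters}


if __name__ == "__main__":
    result = build_or_query([("a", "x"), ("b", None)], ["/pk"])
    print(result["query"])
    print(result["parameters"])
```

Passing every value as a named parameter, rather than interpolating it into the query text, keeps item ids and partition key values out of the SQL string itself; only the field expressions derived from the container's partition key definition are formatted in.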