Reducing complexity of implementation in order to be able to add Atlas text search token based pagination #1046
base: main
Conversation
slice_size = num_params_min_chunk or 1
# If successful, continue with normal pagination
total_data = {"data": []}  # type: dict
total_data["data"].extend(data["data"])
should favor .append(...) w/itertools.chain.from_iterable(...) at the end rather than repeated calls to .extend (especially since there is a loop later).
lines: 656, 701, 732, 806
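For illustration, a minimal sketch of the suggested pattern, using placeholder data rather than the PR's actual variables and request calls:

from itertools import chain

# Stand-in for the per-chunk responses gathered inside the pagination loop.
responses = [{"data": [1, 2]}, {"data": [3]}, {"data": [4, 5, 6]}]

chunks = []
for resp in responses:
    chunks.append(resp["data"])  # store one reference per chunk instead of extending each time

# Flatten once at the end rather than calling .extend(...) on every iteration.
total_data = {"data": list(chain.from_iterable(chunks))}
print(total_data)  # {'data': [1, 2, 3, 4, 5, 6]}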
for i in range(0, len(split_values), batch_size):
    batch = split_values[i : i + batch_size]
Might be a ways off from being the minimum py version, but in 3.12 itertools introduced batched. I've used the approximate implementation from the docs before:
from itertools import islice

def batched(iterable, n, *, strict=False):
    # batched('ABCDEFG', 2) → AB CD EF G
    if n < 1:
        raise ValueError('n must be at least one')
    iterator = iter(iterable)
    while batch := tuple(islice(iterator, n)):
        if strict and len(batch) != n:
            raise ValueError('batched(): incomplete batch')
        yield batch
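As a rough usage sketch against the loop in the diff above (split_values and batch_size here are just example stand-ins), the helper replaces the manual index slicing:

split_values = list(range(10))
batch_size = 4

for batch in batched(split_values, batch_size):
    print(batch)  # (0, 1, 2, 3), then (4, 5, 6, 7), then (8, 9)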
tsmathis left a comment:
Not really much to say on my end; I am curious, though, about the performance/execution time of this implementation vs. the parallel approach.
Summary
Major changes: