Add support for processors for multi input types #52
Conversation
⚠️ Warning: Rate limit exceeded. @shouryashashank has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 16 minutes and 24 seconds before requesting another review. ⌛

How to resolve this issue? After the wait time has elapsed, a review can be triggered again. We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work? CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source, and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information.
Walkthrough

This change introduces processor-based generation support to the codebase. It adds methods for loading and using AutoProcessor objects from the transformers library, updates the generation logic to handle processor-based workflows alongside tokenizer-based ones, and exposes processor-related functions through the module's public API.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Predacons
    participant Generate
    participant Model
    participant Processor
    User->>Predacons: generate(model, processor, messages, ...)
    Predacons->>Generate: generate_output_with_processor(model, processor, messages, ...)
    Generate->>Processor: process(messages)
    Generate->>Model: generate(processed_inputs)
    Model-->>Generate: output
    Generate-->>Predacons: output
    Predacons-->>User: output
```
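The flow in the diagram can be sketched with minimal stand-in classes. Everything below is illustrative only — the class and method names are not the actual Predacons or transformers API, just a toy model of the process → generate → decode round trip:

```python
# Toy sketch of the processor-based generation flow from the sequence
# diagram. All classes here are hypothetical stand-ins, not real APIs.

class StubProcessor:
    """Turns chat messages into model-ready inputs and decodes outputs."""
    def process(self, messages):
        # A real AutoProcessor would tokenize text (and encode images) here.
        return " ".join(m["content"] for m in messages)

    def decode(self, output):
        return output.upper()

class StubModel:
    def generate(self, processed_inputs):
        # A real model would run inference; this stub just echoes its inputs.
        return processed_inputs

def generate_output_with_processor(model, processor, messages):
    processed = processor.process(messages)   # Generate -> Processor
    output = model.generate(processed)        # Generate -> Model
    return processor.decode(output)           # result flows back to the user

result = generate_output_with_processor(
    StubModel(), StubProcessor(),
    [{"role": "user", "content": "hello"}, {"role": "user", "content": "world"}],
)
print(result)
```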
Actionable comments posted: 4
🧹 Nitpick comments (2)
app/predacons/src/predacons.py (2)
446-447: Minor improvement: Remove redundant default values.

Since `None` is the default return value for `dict.get()`, you can simplify these lines.

```diff
- processor = kwargs.get('processor', None)
- tokenizer = kwargs.get('tokenizer', None)
+ processor = kwargs.get('processor')
+ tokenizer = kwargs.get('tokenizer')
```

🧰 Tools
🪛 Ruff (0.11.9)

446-446: Use `kwargs.get('processor')` instead of `kwargs.get('processor', None)` (SIM910)

447-447: Use `kwargs.get('tokenizer')` instead of `kwargs.get('tokenizer', None)` (SIM910)
743-756: Remove trailing whitespace.

The function implementation is correct and well-documented; there is just a minor formatting issue to fix.

```diff
     return Generate.load_processor(processor_path,use_fast=use_fast,gguf_file=gguf_file)
-    
+
```

🧰 Tools

🪛 Pylint (3.3.7)

[convention] 756-756: Trailing whitespace (C0303)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- `app/predacons/__init__.py` (1 hunks)
- `app/predacons/src/generate.py` (5 hunks)
- `app/predacons/src/predacons.py` (5 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
app/predacons/__init__.py (2)
- app/predacons/src/generate.py (1): `load_processor` (246-247)
- app/predacons/src/predacons.py (1): `load_processor` (743-755)

app/predacons/src/predacons.py (2)
- app/predacons/src/generate.py (8): `Generate` (6-274), `generate_output_with_processor_stream` (273-274), `generate_output_with_processor` (270-271), `generate_output_from_model_stream` (255-256), `generate_output_from_model` (252-253), `generate_chat_output_from_model_stream` (267-268), `generate_chat_output_from_model` (264-265), `load_processor` (246-247)
- app/predacons/src/speculative_fast_generation.py (2): `generate_output_from_model` (86-87), `GPTFast` (13-87)
🪛 Ruff (0.11.9)
app/predacons/__init__.py
18-18: .src.predacons.load_processor imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
app/predacons/src/predacons.py
446-446: Use kwargs.get('processor') instead of kwargs.get('processor', None)
Replace kwargs.get('processor', None) with kwargs.get('processor')
(SIM910)
447-447: Use kwargs.get('tokenizer') instead of kwargs.get('tokenizer', None)
Replace kwargs.get('tokenizer', None) with kwargs.get('tokenizer')
(SIM910)
468-468: Use is and is not for type comparisons, or isinstance() for isinstance checks
(E721)
475-475: Local variable e is assigned to but never used
Remove assignment to unused variable e
(F841)
app/predacons/src/generate.py
211-211: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
222-222: Local variable input_len is assigned to but never used
Remove assignment to unused variable input_len
(F841)
232-232: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
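The `raise ... from err` pattern flagged by B904 preserves the original exception as `__cause__`, so tracebacks distinguish the root failure from the re-raise. A small self-contained sketch (function and messages are illustrative, not the actual project code):

```python
def generate_or_raise():
    try:
        raise ValueError("bad input shape")
    except ValueError as e:
        # Explicit chaining: the ValueError survives as __cause__,
        # and the traceback reads "The above exception was the direct
        # cause of the following exception" instead of "During handling
        # of the above exception, another exception occurred".
        raise RuntimeError(f"Failed to generate output: {e}") from e

try:
    generate_or_raise()
except RuntimeError as err:
    # The root cause is still accessible programmatically.
    assert isinstance(err.__cause__, ValueError)
```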
🪛 Pylint (3.3.7)
app/predacons/src/predacons.py
[convention] 451-451: Line too long (138/100)
(C0301)
[convention] 452-452: Line too long (127/100)
(C0301)
[convention] 457-457: Line too long (149/100)
(C0301)
[convention] 458-458: Line too long (138/100)
(C0301)
[convention] 460-460: Line too long (116/100)
(C0301)
[convention] 465-465: Line too long (180/100)
(C0301)
[convention] 466-466: Line too long (169/100)
(C0301)
[convention] 470-470: Line too long (105/100)
(C0301)
[convention] 473-473: Line too long (153/100)
(C0301)
[convention] 474-474: Line too long (142/100)
(C0301)
[convention] 476-476: Line too long (102/100)
(C0301)
[convention] 479-479: Line too long (149/100)
(C0301)
[convention] 480-480: Line too long (138/100)
(C0301)
[refactor] 449-460: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
[refactor] 455-460: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
[warning] 475-475: Catching too general exception Exception
(W0718)
[refactor] 468-474: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
[convention] 468-468: Use isinstance() rather than type() for a typecheck.
(C0123)
[warning] 468-468: Access to a protected member _dynamo of a client class
(W0212)
[refactor] 438-484: Too many nested blocks (6/5)
(R1702)
[warning] 475-480: Unused variable 'e'
(W0612)
[convention] 520-520: Line too long (101/100)
(C0301)
[refactor] 524-528: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
[refactor] 531-535: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
[refactor] 625-630: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
[convention] 756-756: Trailing whitespace
(C0303)
app/predacons/src/generate.py
[convention] 1-1: Line too long (155/100)
(C0301)
[convention] 1-1: Missing module docstring
(C0114)
[error] 1-1: Unable to import 'transformers'
(E0401)
[convention] 52-52: Line too long (104/100)
(C0301)
[convention] 54-54: Trailing whitespace
(C0303)
[error] 51-51: Method '__load_processor' should have "self" as first argument
(E0213)
[convention] 197-197: Trailing whitespace
(C0303)
[convention] 201-201: Line too long (102/100)
(C0301)
[convention] 208-208: Line too long (125/100)
(C0301)
[convention] 213-213: Line too long (105/100)
(C0301)
[convention] 216-216: Line too long (102/100)
(C0301)
[convention] 228-228: Line too long (127/100)
(C0301)
[convention] 233-233: Trailing whitespace
(C0303)
[error] 198-198: Method '__generate_output_with_processor' should have "self" as first argument
(E0213)
[error] 206-206: Instance of 'Generate' has no 'device' member
(E1101)
[error] 208-208: Instance of 'Generate' has no 'generate' member
(E1101)
[warning] 211-211: Consider explicitly re-raising using 'raise RuntimeError(f'Failed to generate output with processor: {str(e)}') from e'
(W0707)
[error] 213-213: Method '__generate_output_with_processor_stream' should have "self" as first argument
(E0213)
[error] 221-221: Instance of 'Generate' has no 'device' member
(E1101)
[error] 229-229: Instance of 'Generate' has no 'generate' member
(E1101)
[warning] 232-232: Consider explicitly re-raising using 'raise RuntimeError(f'Failed to generate output with processor stream: {str(e)}') from e'
(W0707)
[warning] 222-222: Unused variable 'input_len'
(W0612)
[convention] 248-248: Trailing whitespace
(C0303)
[convention] 246-246: Missing function or method docstring
(C0116)
[error] 246-246: Method 'load_processor' should have "self" as first argument
(E0213)
[convention] 271-271: Line too long (109/100)
(C0301)
[convention] 273-273: Line too long (103/100)
(C0301)
[convention] 274-274: Final newline missing
(C0304)
[convention] 274-274: Line too long (116/100)
(C0301)
[convention] 270-270: Missing function or method docstring
(C0116)
[error] 270-270: Method 'generate_output_with_processor' should have "self" as first argument
(E0213)
[convention] 273-273: Missing function or method docstring
(C0116)
[error] 273-273: Method 'generate_output_with_processor_stream' should have "self" as first argument
(E0213)
🔇 Additional comments (6)
app/predacons/__init__.py (1)
18-18: LGTM! Correctly exposes the new processor functionality.

Adding `load_processor` to the public API is consistent with the PR's objective of introducing processor-based generation support. The static analysis warning about the unused import can be safely ignored, as re-exporting symbols in `__init__.py` is a standard pattern for defining a package's public API.

🧰 Tools
🪛 Ruff (0.11.9)
18-18: `.src.predacons.load_processor` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
app/predacons/src/predacons.py (3)
406-406: Good fix for the condition check.

The updated condition correctly allows either `'sequence'` or `'chat'` to be provided alongside `'model_path'`, which aligns with the function's flexible input requirements.
519-536: Well-structured handling of multiple return formats.

The updated logic correctly handles different return tuple formats and safely checks for the `decode` method before using it. This defensive approach ensures compatibility with both tokenizer and processor objects.

🧰 Tools
🪛 Pylint (3.3.7)
[convention] 520-520: Line too long (101/100)
(C0301)
[refactor] 524-528: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
[refactor] 531-535: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
622-631: Consistent implementation with text_generate.

The function correctly handles the 3-tuple return format expected from chat generation and safely checks for the `decode` method.

🧰 Tools
🪛 Pylint (3.3.7)
[refactor] 625-630: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
app/predacons/src/generate.py (2)
1-1: LGTM!

The import correctly adds `AutoProcessor` to support the new processor-based functionality.

🧰 Tools
🪛 Pylint (3.3.7)
[convention] 1-1: Line too long (155/100)
(C0301)
[convention] 1-1: Missing module docstring
(C0114)
[error] 1-1: Unable to import 'transformers'
(E0401)
198-212: Well-implemented processor-based generation method.

The method correctly:
- Handles missing chat templates with a sensible default
- Uses `torch.inference_mode()` for optimal inference performance
- Applies proper device placement and dtype conversion
- Includes comprehensive error handling
🧰 Tools
🪛 Ruff (0.11.9)
211-211: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
🪛 Pylint (3.3.7)
[convention] 201-201: Line too long (102/100)
(C0301)
[convention] 208-208: Line too long (125/100)
(C0301)
[error] 198-198: Method '__generate_output_with_processor' should have "self" as first argument
(E0213)
[error] 206-206: Instance of 'Generate' has no 'device' member
(E1101)
[error] 208-208: Instance of 'Generate' has no 'generate' member
(E1101)
[warning] 211-211: Consider explicitly re-raising using 'raise RuntimeError(f'Failed to generate output with processor: {str(e)}') from e'
(W0707)
```python
        if type(model) == torch._dynamo.eval_frame.OptimizedModule:
            print("generate_output using fast generation")
            return GPTFast.generate_output_from_model(model, tokenizer, sequence, max_length)
        else:
            if stream:
                return Generate.generate_output_from_model_stream(model, tokenizer, sequence, max_length,trust_remote_code=trust_remote_code)
            return Generate.generate_output_from_model(model, tokenizer, sequence, max_length,trust_remote_code=trust_remote_code)
    except Exception as e:
        print("Exception occurred while loading torch._dynamo.eval_frame.OptimizedModule")
        print("generate_output using default generation")
        if stream:
            return Generate.generate_output_from_model_stream(model, tokenizer, sequence, max_length,trust_remote_code=trust_remote_code)
        return Generate.generate_output_from_model(model, tokenizer, sequence, max_length,trust_remote_code=trust_remote_code)
```
Fix type checking and remove unused variable.

The code has two issues:
- Type comparison should use `isinstance()` instead of `==` for better practice
- The exception variable `e` is assigned but never used
```diff
-    try:
-        if type(model) == torch._dynamo.eval_frame.OptimizedModule:
-            print("generate_output using fast generation")
-            return GPTFast.generate_output_from_model(model, tokenizer, sequence, max_length)
-        else:
-            if stream:
-                return Generate.generate_output_from_model_stream(model, tokenizer, sequence, max_length,trust_remote_code=trust_remote_code)
-            return Generate.generate_output_from_model(model, tokenizer, sequence, max_length,trust_remote_code=trust_remote_code)
-    except Exception as e:
-        print("Exception occurred while loading torch._dynamo.eval_frame.OptimizedModule")
-        print("generate_output using default generation")
-        if stream:
-            return Generate.generate_output_from_model_stream(model, tokenizer, sequence, max_length,trust_remote_code=trust_remote_code)
-        return Generate.generate_output_from_model(model, tokenizer, sequence, max_length,trust_remote_code=trust_remote_code)
+    try:
+        if isinstance(model, torch._dynamo.eval_frame.OptimizedModule):
+            print("generate_output using fast generation")
+            return GPTFast.generate_output_from_model(model, tokenizer, sequence, max_length)
+        else:
+            if stream:
+                return Generate.generate_output_from_model_stream(model, tokenizer, sequence, max_length,trust_remote_code=trust_remote_code)
+            return Generate.generate_output_from_model(model, tokenizer, sequence, max_length,trust_remote_code=trust_remote_code)
+    except Exception:
+        print("Exception occurred while loading torch._dynamo.eval_frame.OptimizedModule")
+        print("generate_output using default generation")
+        if stream:
+            return Generate.generate_output_from_model_stream(model, tokenizer, sequence, max_length,trust_remote_code=trust_remote_code)
+        return Generate.generate_output_from_model(model, tokenizer, sequence, max_length,trust_remote_code=trust_remote_code)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
    try:
        if isinstance(model, torch._dynamo.eval_frame.OptimizedModule):
            print("generate_output using fast generation")
            return GPTFast.generate_output_from_model(model, tokenizer, sequence, max_length)
        else:
            if stream:
                return Generate.generate_output_from_model_stream(model, tokenizer, sequence, max_length, trust_remote_code=trust_remote_code)
            return Generate.generate_output_from_model(model, tokenizer, sequence, max_length, trust_remote_code=trust_remote_code)
    except Exception:
        print("Exception occurred while loading torch._dynamo.eval_frame.OptimizedModule")
        print("generate_output using default generation")
        if stream:
            return Generate.generate_output_from_model_stream(model, tokenizer, sequence, max_length, trust_remote_code=trust_remote_code)
        return Generate.generate_output_from_model(model, tokenizer, sequence, max_length, trust_remote_code=trust_remote_code)
```
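The reason `isinstance()` is preferred over `type(x) == SomeClass` goes beyond style: `isinstance()` also matches subclasses, which matters for wrapped or compiled model objects. A generic illustration (the class names are hypothetical, unrelated to torch):

```python
class BaseModel:
    pass

class OptimizedModel(BaseModel):
    # e.g. a wrapper produced by a compilation step
    pass

m = OptimizedModel()

assert isinstance(m, BaseModel)   # True: subclass instances match
assert type(m) is not BaseModel   # exact-type check misses the subclass
assert type(m) is OptimizedModel  # it only matches the concrete type
```

For exact-type comparisons, `type(x) is SomeClass` is the idiomatic form; `==` invites surprises with classes that override `__eq__`.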
🧰 Tools
🪛 Ruff (0.11.9)
468-468: Use is and is not for type comparisons, or isinstance() for isinstance checks
(E721)
475-475: Local variable e is assigned to but never used
Remove assignment to unused variable e
(F841)
🪛 Pylint (3.3.7)
[convention] 470-470: Line too long (105/100)
(C0301)
[convention] 473-473: Line too long (153/100)
(C0301)
[convention] 474-474: Line too long (142/100)
(C0301)
[convention] 476-476: Line too long (102/100)
(C0301)
[convention] 479-479: Line too long (149/100)
(C0301)
[convention] 480-480: Line too long (138/100)
(C0301)
[warning] 475-475: Catching too general exception Exception
(W0718)
[refactor] 468-474: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it
(R1705)
[convention] 468-468: Use isinstance() rather than type() for a typecheck.
(C0123)
[warning] 468-468: Access to a protected member _dynamo of a client class
(W0212)
[warning] 475-480: Unused variable 'e'
(W0612)
🤖 Prompt for AI Agents
In app/predacons/src/predacons.py around lines 468 to 480, replace the type
comparison using '==' with 'isinstance()' to properly check if 'model' is an
instance of 'torch._dynamo.eval_frame.OptimizedModule'. Also, remove the unused
exception variable 'e' from the except clause to clean up the code.
```python
def load_processor(tokenizer_path,use_fast=False,gguf_file=None):
    return Generate.__load_processor(tokenizer_path,use_fast=use_fast,gguf_file=gguf_file)
```
🛠️ Refactor suggestion
Add docstrings to public methods.
These public API methods should have docstrings to help users understand their purpose and parameters.
def load_processor(tokenizer_path,use_fast=False,gguf_file=None):
+ """
+ Load a processor from the specified path.
+
+ Args:
+ tokenizer_path (str): Path to the processor/tokenizer.
+ use_fast (bool): Whether to use fast processor. Defaults to False.
+ gguf_file (str, optional): Path to GGUF file. Defaults to None.
+
+ Returns:
+ AutoProcessor: The loaded processor instance.
+ """
return Generate.__load_processor(tokenizer_path,use_fast=use_fast,gguf_file=gguf_file)
# ... other code ...
def generate_output_with_processor(model, processor, messages, max_length, temperature=0.1):
+ """
+ Generate output using a processor for handling inputs.
+
+ Args:
+ model: The loaded model instance.
+ processor: The loaded processor instance.
+ messages: Input messages for generation.
+ max_length (int): Maximum length of generated output.
+ temperature (float): Sampling temperature. Defaults to 0.1.
+
+ Returns:
+ tuple: (inputs, outputs, processor)
+ """
return Generate.__generate_output_with_processor(model, processor, messages, max_length, temperature)
def generate_output_with_processor_stream(model, processor, messages, max_length, temperature=0.1):
+ """
+ Generate streaming output using a processor.
+
+ Args:
+ model: The loaded model instance.
+ processor: The loaded processor instance.
+ messages: Input messages for generation.
+ max_length (int): Maximum length of generated output.
+ temperature (float): Sampling temperature. Defaults to 0.1.
+
+ Returns:
+ tuple: (thread, streamer) for handling streaming output.
+ """
    return Generate.__generate_output_with_processor_stream(model, processor, messages, max_length, temperature)

Also applies to: 270-274
🧰 Tools
🪛 Pylint (3.3.7)
[convention] 248-248: Trailing whitespace
(C0303)
[convention] 246-246: Missing function or method docstring
(C0116)
[error] 246-246: Method 'load_processor' should have "self" as first argument
(E0213)
🤖 Prompt for AI Agents
In app/predacons/src/generate.py at lines 246 to 248 and also lines 270 to 274,
the public methods lack docstrings. Add clear and concise docstrings to these
methods describing their purpose, parameters, and return values to improve code
readability and usability for users.
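The `(thread, streamer)` return value documented above follows a common background-generation pattern: generation runs on a worker thread while the caller iterates tokens from a streamer. A library-agnostic sketch using a queue — all names here are illustrative; the real implementation would presumably use transformers' `TextIteratorStreamer` with `model.generate`:

```python
import queue
import threading

class SimpleStreamer:
    """Iterator that yields tokens as a producer thread pushes them."""
    _END = object()  # sentinel marking end of generation

    def __init__(self):
        self._q = queue.Queue()

    def put(self, token):
        self._q.put(token)

    def end(self):
        self._q.put(self._END)

    def __iter__(self):
        while True:
            item = self._q.get()
            if item is self._END:
                return
            yield item

def fake_generate(streamer, tokens):
    # Stands in for model.generate(..., streamer=streamer) on the worker.
    for tok in tokens:
        streamer.put(tok)
    streamer.end()

def generate_stream(tokens):
    streamer = SimpleStreamer()
    thread = threading.Thread(target=fake_generate, args=(streamer, tokens))
    thread.start()
    # Caller iterates the streamer as tokens arrive, then joins the thread.
    return thread, streamer

thread, streamer = generate_stream(["Hello", ", ", "world"])
text = "".join(streamer)
thread.join()
print(text)
```

Returning both handles lets the caller consume tokens incrementally and still join the worker cleanly when iteration finishes.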
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…pes`

Docstrings generation was requested by @shouryashashank.

* #52 (comment)

The following files were modified:
* `app/predacons/src/generate.py`
* `app/predacons/src/predacons.py`
Note: Generated docstrings for this pull request at #54
Summary by CodeRabbit