feat: Add support for reading whole text files to read_text #6354
plotor wants to merge 1 commit into Eventual-Inc:main
Force-pushed from 3abc79b to ecaddb9.
Greptile Summary
This PR adds a …
Key findings:
Confidence Score: 3/5
Last reviewed commit: ecaddb9
    Ok(try_stream! {
        let mut content = String::new();
        reader.read_to_string(&mut content).await?;

        // Apply skip_blank_lines if needed (for whole file, this means skip if entire content is blank)
        if convert_options.skip_blank_lines && content.trim().is_empty() {
            return;
        }

        yield content;
    })
Limit pushdown not respected in whole_text mode
convert_options.limit is ignored inside read_into_whole_text_stream. In the existing line-oriented path (read_into_line_chunk_stream), the limit is enforced via a remaining counter that short-circuits the loop when it reaches zero. Here, if a scan task is given limit = Some(0) (i.e., the overall query limit is already satisfied by prior scan tasks), this function will still read the entire file and yield one row, which produces incorrect results.
A minimal guard at the top of the try_stream! block would address this:
    Ok(try_stream! {
        // Respect limit pushdown: in whole-text mode each file is exactly one row.
        if convert_options.limit == Some(0) {
            return;
        }
        let mut content = String::new();
        reader.read_to_string(&mut content).await?;
        // Apply skip_blank_lines if needed (for whole file, this means skip if entire content is blank)
        if convert_options.skip_blank_lines && content.trim().is_empty() {
            return;
        }
        yield content;
    })
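
To make the difference between the two paths concrete, here is a minimal synchronous sketch using only the standard library. It is not the PR's code: `TextOptions`, `read_lines`, and `read_whole_text` are hypothetical names, and the real implementation is an async stream. The sketch only assumes the behavior described in the comment above, namely that the line-oriented reader counts down a `remaining` value and that whole-text mode produces at most one row per file.

```rust
use std::io::{self, BufRead, Read};

/// Hypothetical stand-in for the two convert options discussed in this review.
pub struct TextOptions {
    pub limit: Option<usize>,
    pub skip_blank_lines: bool,
}

/// Line-oriented mode: every line is a row, and `remaining` stops the loop
/// as soon as the pushed-down limit has been satisfied.
pub fn read_lines<R: BufRead>(reader: R, opts: &TextOptions) -> io::Result<Vec<String>> {
    let mut remaining = opts.limit;
    let mut rows = Vec::new();
    for line in reader.lines() {
        if remaining == Some(0) {
            break; // limit reached (or already zero on entry): read nothing more
        }
        let line = line?;
        if opts.skip_blank_lines && line.trim().is_empty() {
            continue; // blank lines do not count against the limit
        }
        rows.push(line);
        if let Some(n) = remaining.as_mut() {
            *n -= 1;
        }
    }
    Ok(rows)
}

/// Whole-text mode: the file is at most one row, so the only limit value
/// that changes the result is `Some(0)`, which must yield nothing.
pub fn read_whole_text<R: Read>(mut reader: R, opts: &TextOptions) -> io::Result<Option<String>> {
    if opts.limit == Some(0) {
        return Ok(None); // guard runs before any I/O
    }
    let mut content = String::new();
    reader.read_to_string(&mut content)?;
    if opts.skip_blank_lines && content.trim().is_empty() {
        return Ok(None); // an entirely blank file is skipped
    }
    Ok(Some(content))
}
```

Placing the guard before `read_to_string` means a scan task whose limit is already exhausted does no I/O at all, which mirrors where the suggested change puts the check inside `try_stream!`.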
Signed-off-by: plotor <zhenchao.wang@hotmail.com>
Force-pushed from ecaddb9 to 977676c.
This is a supplementary implementation for #6111, adding a …
Changes Made
Add a whole_text option to the read_text API to support reading the entire contents of a text file as a single row. This suits cases such as inference, where a text file may hold a complete prompt and should not be read line by line.
Related Issues