[native_doc_dartifier] Experiment usage of RAG in concising bindings context #2472
Retrieval-Augmented Generation (RAG) Experiment:
CI results from here: marshelino-maged#10
How does it work?
We have a large set of documents, in our case summaries of classes.
The documents are embedded with the gemini-embedding model.
We have a question (query), in our case the Java snippet that we want to translate.
The query is embedded with the gemini-embedding model, and the K documents closest to it are retrieved.
Those K documents will be our bindings context summary to give to the LLM when translating.
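
The retrieval steps above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `embed` is a toy bag-of-words stand-in for the gemini-embedding model, ranking by cosine similarity is an assumption (the description does not name the similarity measure), and `top_k`, `class_summaries`, and `snippet` are hypothetical names and data.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: counts of lowercase words.

    A real setup would call the embedding model here instead.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, documents: list[str], k: int) -> list[str]:
    """Return the k documents most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vector, embed(doc)),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical class summaries standing in for the real bindings summaries.
class_summaries = [
    "class ArrayList methods add get size",
    "class HashMap methods put get containsKey",
    "class File methods exists delete getName",
]

# Hypothetical Java snippet used as the query.
snippet = "ArrayList list = new ArrayList(); list.add(1); list.size();"

# The retrieved summaries would form the context passed to the LLM.
context = top_k(snippet, class_summaries, k=2)
print("\n".join(context))
```

In a real pipeline the document embeddings would typically be computed once and cached, so only the query needs to be embedded for each snippet.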

Experiment Results:
JNI bindings were generated for these classes: 286 classes, with a total of 35K tokens.
for this snippet
Number of Tokens in the RAG Summary: 851 tokens
Top 10 classes retrieved
for this snippet
Number of Tokens in the RAG Summary: 2009 tokens
Top 10 classes retrieved
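
For scale, the two RAG summary sizes can be compared against the full context. The sketch below assumes that the 35K figure is the token count of the complete bindings summary and treats it as exactly 35,000, which is an approximation; `percent_of_full` is a hypothetical helper name.

```python
FULL_CONTEXT_TOKENS = 35_000  # assumption: "35K" taken as exactly 35,000


def percent_of_full(summary_tokens: int, full_tokens: int = FULL_CONTEXT_TOKENS) -> float:
    """Size of a RAG summary as a percentage of the full context."""
    return 100.0 * summary_tokens / full_tokens


# Token counts reported for the two snippets above.
for summary_tokens in (851, 2009):
    share = percent_of_full(summary_tokens)
    print(f"{summary_tokens} tokens is about {share:.1f}% of the full context")
```

Under that assumption, the retrieved summaries come to roughly 2.4% and 5.7% of the full context.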