[native_doc_dartifier] Experiment usage of RAG in concising bindings context #2472

marshelino-maged · 2025-08-01T17:42:23Z

Retrieval-Augmented Generation (RAG) Experiment:
CI results are here: marshelino-maged#10

How does it work?

We start with a large set of documents; in our case, these are the class summaries.

We calculate an embedding for each class summary using the gemini-embedding model,
then store the embeddings in a vector database (ChromaDB).

Our question (query) is, in our case, the Java snippet that we want to translate.

We calculate an embedding for this query using the same gemini-embedding model.
We then query the vector database with it and ask it to return the K most relevant documents.

Those K documents become the bindings context summary that we give to the LLM when translating.
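
The steps above can be sketched in a few lines. This is only an illustration of the retrieval idea, not the code in this PR: `embed` is a hypothetical bag-of-words stand-in for the gemini-embedding model, a plain in-memory list stands in for ChromaDB, and `top_k` is a name made up for this example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase word counts instead of a real model."""
    return Counter(token.lower() for token in re.findall(r"[A-Za-z]+", text))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, documents: list[str], k: int) -> list[str]:
    """Return the k documents most similar to the query, best match first."""
    query_vector = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vector, embed(doc)),
        reverse=True,
    )
    return ranked[:k]


# Class summaries play the role of the stored documents.
class_summaries = [
    "class FileReader extends InputStreamReader",
    "class BufferedReader extends Reader",
    "class Example extends JObject",
    "abstract class FileFilter",
]

# The Java snippet to translate is the query.
query = "BufferedReader bufferedReader = new BufferedReader(fileReader)"
context = top_k(query, class_summaries, k=2)
print(context)
```

In the real pipeline, the embeddings would be computed once and stored, so only the query needs to be embedded at translation time.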

Experiment Results:

We generated JNI bindings for the classes below: 286 classes, with a total of 35K tokens.

classes:
    - "java.io"
    - "com"

For this snippet:

Boolean useEnums() {
    Example example = new Example();
    Boolean isTrueUsage = example.enumValueToString(Operation.ADD) == "Addition";
    return isTrueUsage;
}

Number of Tokens in the RAG Summary: 851 tokens

Top 10 classes retrieved

  Query Results:
  class Example extends jni$_.JObject 
  class Example$Operation extends jni$_.JObject 
  class $Example$Operation$Type extends jni$_.JObjType<Example$Operation> 
  class $Example$Operation$NullableType extends jni$_.JObjType<Example$Operation?> 
  class $Example$Type extends jni$_.JObjType<Example> 
  class $Example$NullableType extends jni$_.JObjType<Example?> 
  abstract class $ObjectInputValidation 
  class OptionalDataException extends ObjectStreamException 
  abstract class $FilenameFilter 
  abstract class $FileFilter

For this snippet:

public class ReadFile {
    public static void main(String[] args) {
        String filePath = "my-file.txt";
        try (
            FileReader fileReader = new FileReader(filePath);
            BufferedReader bufferedReader = new BufferedReader(fileReader)
        ) {
            String line = bufferedReader.readLine();
            System.out.println("The first line of the file is: " + line);
        } catch (IOException e) {
            System.err.println("An error occurred while reading the file: " + e.getMessage());
        }
    }
}

Number of Tokens in the RAG Summary: 2009 tokens

Top 10 classes retrieved

  Query Results:
  class FileReader extends InputStreamReader 
  class BufferedReader extends Reader 
  class FileInputStream extends InputStream 
  class RandomAccessFile extends jni$_.JObject 
  class LineNumberReader extends BufferedReader 
  class InputStreamReader extends Reader 
  class DataInputStream extends FilterInputStream 
  class StringReader extends Reader 
  class StringBufferInputStream extends InputStream 
  class ObjectInputStream extends InputStream
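
To put the two RAG summary sizes in proportion to the full context, here is a small worked calculation. It assumes "35K" means roughly 35,000 tokens, so the percentages are approximate; the variable names are made up for this example.

```python
# Assumption: "35K tokens" is treated as 35,000 tokens.
FULL_CONTEXT_TOKENS = 35_000

# RAG summary sizes reported above for the two snippets.
rag_summary_tokens = {"useEnums": 851, "ReadFile": 2009}

# Share of the full context that each RAG summary occupies.
shares = {
    name: tokens / FULL_CONTEXT_TOKENS
    for name, tokens in rag_summary_tokens.items()
}

for name, share in shares.items():
    print(f"{name}: {rag_summary_tokens[name]} tokens, {share:.1%} of the full context")
```

Under that assumption, the retrieved summaries come to roughly 2.4% and 5.7% of the full context.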

marshelino-maged added 9 commits August 1, 2025 16:52

add chromaDB to CI

2172efc

Fix CI

f54d70b

Add RAG with tests

0a4cbb4

format and analyze

dfb8738

skip regenerating bindings and Dart snippets

9d74285

make running the chromaDB and tests in one job

dcbba38

Fix CI

0bcb33f

solve analysis error

e0aeb84

fix analyze

c36f78e

github-actions bot added the type-infra and package:native_doc_dartifier labels Aug 1, 2025

marshelino-maged mentioned this pull request Aug 1, 2025

[native_doc_dartifier] Public Signature Extraction Can Be Too Large #2396

Open
