diff --git a/README.md b/README.md
index b50037a..9729c34 100644
--- a/README.md
+++ b/README.md
@@ -302,6 +302,11 @@ Adjusting all implementation to the same tokenization scheme, one my experience
 | | 86.80% collisions | 93.21% collisions |
 | | 0.9992 entropy | 0.9967 entropy |
 
+The trickiest part, however, is analyzing the retrieval quality of those fingerprints and comparing them to other approaches.
+So, how many bits per fingerprint are needed to achieve a specific recall rate for a given dataset?
+Or, how does the average Levenshtein distance among the top-k nearest neighbors change with the fingerprint size?
+It must clearly decrease, but how fast, and how does that compare to ground truth?
+
 ## Replicating the Results
 
 ### Replicating the Results in Rust 🦀
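The questions added above imply a concrete evaluation loop. Below is a minimal, self-contained Rust sketch of that loop, not the library's actual method: the `fingerprint` function here is a placeholder that hashes character trigrams into a 64-bit sketch, standing in for the real fingerprinting scheme. It retrieves the top-k candidates by Hamming distance between fingerprints, measures their average Levenshtein distance to the query, and compares that against the exhaustive ground-truth top-k.

```rust
/// Classic dynamic-programming Levenshtein distance.
fn levenshtein(a: &str, b: &str) -> usize {
    let (a, b): (Vec<char>, Vec<char>) = (a.chars().collect(), b.chars().collect());
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0usize; b.len() + 1];
    for i in 1..=a.len() {
        curr[0] = i;
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            curr[j] = (prev[j] + 1).min(curr[j - 1] + 1).min(prev[j - 1] + cost);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Placeholder fingerprint: hash every character trigram into a 64-bit sketch.
/// The real evaluation would plug in the library's fingerprints and sweep
/// the number of bits instead of fixing it at 64.
fn fingerprint(s: &str) -> u64 {
    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};
    let chars: Vec<char> = s.chars().collect();
    let mut bits = 0u64;
    for window in chars.windows(3) {
        let mut hasher = DefaultHasher::new();
        window.hash(&mut hasher);
        bits |= 1 << (hasher.finish() % 64);
    }
    bits
}

/// Indices of the `k` entries with the smallest score.
fn top_k_by<F: Fn(usize) -> usize>(n: usize, k: usize, score: F) -> Vec<usize> {
    let mut order: Vec<usize> = (0..n).collect();
    order.sort_by_key(|&i| score(i));
    order.truncate(k);
    order
}

fn main() {
    let dataset = ["fingerprint", "fingerprints", "finger", "footprint", "paperclip", "fungerprint"];
    let query = "fingerprnt";
    let k = 3;

    let sketches: Vec<u64> = dataset.iter().map(|s| fingerprint(s)).collect();
    let query_sketch = fingerprint(query);

    // Approximate retrieval: rank by Hamming distance between fingerprints.
    let approx = top_k_by(dataset.len(), k, |i| (sketches[i] ^ query_sketch).count_ones() as usize);
    // Ground truth: rank by exact Levenshtein distance.
    let exact = top_k_by(dataset.len(), k, |i| levenshtein(dataset[i], query));

    // Average Levenshtein distance of a retrieved set to the query.
    let mean = |ids: &[usize]| {
        ids.iter().map(|&i| levenshtein(dataset[i], query)).sum::<usize>() as f64 / ids.len() as f64
    };
    println!("avg Levenshtein of approximate top-{k}: {:.2}", mean(&approx));
    println!("avg Levenshtein of exact top-{k}:       {:.2}", mean(&exact));

    // Recall@k: how many of the true nearest neighbors the fingerprints found.
    let recall = approx.iter().filter(|&&i| exact.contains(&i)).count() as f64 / k as f64;
    println!("recall@{k}: {:.2}", recall);
}
```

Sweeping the sketch width (fixed at 64 bits here) over a real dataset would yield the recall-versus-bits and distance-versus-fingerprint-size curves that the added README text asks about.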