Fix: respect RoPE frequency metadata by default for llama.cpp models #166
Fixes #14593
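As background for this change, the sentinel behaviour it relies on can be sketched in a few lines: a value of 0.0 for either RoPE setting means "use what the model was trained with", while any other value overrides the metadata. The sketch below is illustrative only. `GGUFRopeMetadata` and `resolve_rope` are hypothetical names rather than part of llama.cpp or the wrapper, and the numeric values are made up for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class GGUFRopeMetadata:
    """Hypothetical stand-in for the RoPE values stored in a GGUF file."""
    freq_base_train: float
    freq_scale_train: float


def resolve_rope(rope_freq_base: float,
                 rope_freq_scale: float,
                 meta: GGUFRopeMetadata) -> Tuple[float, float]:
    """Return the effective (base, scale); 0.0 means 'defer to the metadata'."""
    base = meta.freq_base_train if rope_freq_base == 0.0 else rope_freq_base
    scale = meta.freq_scale_train if rope_freq_scale == 0.0 else rope_freq_scale
    return base, scale


# Illustrative metadata values, not taken from any specific model file.
meta = GGUFRopeMetadata(freq_base_train=500000.0, freq_scale_train=1.0)

# With 0.0 defaults, the values from the metadata are used.
print(resolve_rope(0.0, 0.0, meta))

# Explicit non-zero values override the metadata, even if they
# do not match what the model was trained with.
print(resolve_rope(10000.0, 1.0, meta))
```

The second call shows why hard-coded non-zero defaults are risky: they silently replace the trained values for any model whose metadata differs.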
Updates the default values of rope_freq_base and rope_freq_scale in the LlamaCpp wrapper to 0.0, which instructs llama.cpp to defer to the RoPE frequency values stored in the model’s GGUF metadata (freq_base_train and freq_scale_train).

Unit Test On CUDA
In the code above:
Question: Explain what happens in this code?
Answer: This Python script implements the quick sort algorithm, which is a divide and conquer algorithm. It works by selecting a 'pivot' element from the array and partitioning other elements into two sub-arrays, according to whether they are less than or greater than the pivot. The sub-arrays are then recursively sorted.
In this specific script:
- The pivot is the middle element (arr[len(arr) // 2]).
- The array is partitioned into left for numbers less than pivot, middle for equal to pivot and right for numbers greater than pivot.
- The sorted result is built recursively (quicksort(left) + middle + quicksort(right)).
The time complexity of quick sort in the average case is O(n log n), but it can degrade to O(n^2) if you have a list that's already sorted or nearly sorted. However, this scenario is not common with most real-world data. It also requires O(log n) space for recursion stack.
"""
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    else:
        pivot = arr[0]
        less = [x for x in arr[1:] if x <= pivot]
        greater = [x for x in arr[1:] if x > pivot]
        return quick_sort(less) + [pivot] + quick_sort(greater)
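
Note that the generated explanation describes a middle-pivot variant with left, middle and right lists, whereas the generated snippet above picks the first element as the pivot. For comparison, a minimal sketch of the variant the explanation describes might look like the following. It is reconstructed from the names in the explanation and is not part of the test output.

```python
def quicksort(arr):
    """Quicksort with a middle-element pivot and a three-way partition."""
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]     # strictly smaller than the pivot
    middle = [x for x in arr if x == pivot]  # the pivot and its duplicates
    right = [x for x in arr if x > pivot]    # strictly larger than the pivot
    return quicksort(left) + middle + quicksort(right)


print(quicksort([3, 6, 8, 10, 1, 2, 1]))  # [1, 1, 2, 3, 6, 8, 10]
```

Choosing the middle element avoids the quadratic behaviour that a first-element pivot shows on already sorted input, though adversarial inputs can still trigger it.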