GGUF: ggml backend support for writing tensor data #1033
JohannesGaessler wants to merge 2 commits into ggml-org:master
Conversation
It should be ok to store the tensor in

I did a refactor to store a
```c
/* if (info->n_dims > GGML_MAX_DIMS) { */
/*     fprintf(stderr, "%s: invalid number of dimensions (%" PRIu32 ")\n", __func__, info->n_dims); */
/*     return false; */
/* } */

/* if (info->type < 0 || info->type >= GGML_TYPE_COUNT) { */
/*     fprintf(stderr, "%s: invalid type (%d)\n", __func__, info->type); */
/*     return false; */
/* } */

/* if (strlen(info->name.data) >= GGML_MAX_NAME) { */
/*     fprintf(stderr, "%s: tensor '%s' name is too long\n", __func__, info->name.data); */
/*     return false; */
/* } */

/* for (uint32_t i = 0; i < info->n_dims; ++i) { */
/*     if (info->ne[i] <= 0) { */
/*         fprintf(stderr, "%s: invalid number of elements (%" PRIu64 ")\n", __func__, info->ne[i]); */
/*         return false; */
/*     } */
/* } */

/* // prevent overflow for total number of elements */
/* if (INT64_MAX/info->ne[1] <= info->ne[0]) { */
/*     fprintf(stderr, "%s: invalid number of elements (%" PRIu64 ")\n", __func__, info->ne[1]); */
/*     return false; */
/* } */

/* if (INT64_MAX/info->ne[2] <= info->ne[0]*info->ne[1]) { */
/*     fprintf(stderr, "%s: invalid number of elements (%" PRIu64 ")\n", __func__, info->ne[2]); */
/*     return false; */
/* } */

/* if (INT64_MAX/info->ne[3] <= info->ne[0]*info->ne[1]*info->ne[2]) { */
/*     fprintf(stderr, "%s: invalid number of elements (%" PRIu64 ")\n", __func__, info->ne[3]); */
/*     return false; */
/* } */
```
Why are these checks commented?
This was just something I did for a WIP version. I have a version with more changes and the checks re-enabled on my local machine. I'll make a PR to llama.cpp either today or tomorrow.
This PR adds ggml backend support for writing tensor data to a GGUF file. Currently a workaround is needed where the data is first copied to new tensors with data in RAM, which the GGUF code can then access via `memcpy`. This PR makes it so that instead a fake tensor is reconstructed from `gguf_tensor_info`, which can then be passed to `ggml_backend_tensor_get`. I'm not sure whether this is the best solution; a lot of the fields in `gguf_tensor_info` are the same as in `ggml_tensor`. Is there a reason why you couldn't just directly store a `ggml_tensor` as one of the fields in `gguf_tensor_info`?