Maybe not a general method #11

@Peacer68

I implemented SLOT in vLLM as follows: I train the model to obtain and save a $\delta$ that matches my system prompt, then load that $\delta$ in vllm.model_runner. I am sure SLOT is implemented correctly, but accuracy drops on my downstream task (factuality evaluation). I tried different hyperparameters, and none of them improved accuracy, so I now doubt the generality of this method. Here is my modified model_runner.py:

#### SLOT Begin Here
# Lazily load the trained delta the first time this code path runs.
if not hasattr(self, 'ptuning_params'):
    print("Initializing Delta for SLOT")
    import json
    import os
    delta_path = os.environ.get("delta_path", "code/slot/saved_delta/systemprompt_v4_5_0.1.json")
    tensor_parallel_size_my = int(os.environ.get("tensor_parallel_size_my", 1))
    with open(delta_path, "r") as f:
        delta = json.load(f)["delta"]
    # Take the first saved entry, match the device and dtype of the hidden states,
    # and divide by the tensor-parallel size.
    delta = torch.tensor(delta[0], device=sample_hidden_states.device, dtype=sample_hidden_states.dtype) / tensor_parallel_size_my
    self.ptuning_params = delta

    assert delta.shape == sample_hidden_states.shape, f"Delta Shape Mismatch!!! delta.shape: {delta.shape}, sample_hidden_states.shape: {sample_hidden_states.shape}"
    print("Initializing END!")

# Shift the final hidden states by delta before projecting them to logits.
logits = self.model.compute_logits(sample_hidden_states + self.ptuning_params, None)
#### SLOT End Here
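
One way to reason about what this patch does at inference time, as a sketch rather than a diagnosis: if `compute_logits` is a plain linear projection (an assumption; some models also scale or soft-cap their logits), then adding a fixed $\delta$ to every hidden state is the same as adding one fixed bias vector, the projection of $\delta$, to every logit row. The toy below checks that identity in plain Python. The weight matrix, hidden states, and delta values are made up for illustration and are not taken from the issue or from vLLM.

```python
# Toy check: W(h + delta) == W h + W delta for a linear output head.
# All numbers below are hypothetical and chosen only for illustration.

def matvec(weight, vec):
    """Multiply a [vocab, hidden] matrix by a [hidden] vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]

def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]

# Hypothetical output head: vocab size 3, hidden size 2.
lm_head = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical final hidden states for a batch of two sequences.
hidden_states = [[0.5, -1.0], [2.0, 0.25]]
# One delta shared by every sequence in the batch.
delta = [0.5, 0.25]

# The bias that a fixed delta induces on every row of logits.
logit_bias = matvec(lm_head, delta)

for h in hidden_states:
    shifted = matvec(lm_head, add(h, delta))       # project(h + delta)
    biased = add(matvec(lm_head, h), logit_bias)   # project(h) + project(delta)
    assert all(abs(s - b) < 1e-9 for s, b in zip(shifted, biased))
```

Under that assumption, a $\delta$ that is loaded once and reused shifts the logits of every request by the same amount, regardless of the prompt.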
