Maybe not a general method

I implemented SLOT in vLLM as follows: I train the model to obtain and save a $\delta$ that matches my system prompt, then load $\delta$ in vllm.model_runner. I am confident that SLOT is implemented correctly, but accuracy drops on my downstream task (factuality evaluation). I tried different hyperparameters and none of them improved accuracy, so I now doubt the generality of this method. Here is my modified model_runner.py:
```python
#### SLOT Begin Here
if not hasattr(self, 'ptuning_params'):
    # One-time setup: load the trained delta on the first forward pass.
    print("Initializing Delta for SLOT")
    import os, json
    delta_path = os.environ.get("delta_path", "code/slot/saved_delta/systemprompt_v4_5_0.1.json")
    tensor_parallel_size_my = int(os.environ.get("tensor_parallel_size_my", 1))
    with open(delta_path, "r") as f:
        delta = json.load(f)["delta"]
    # Match the device and dtype of the hidden states, then scale by 1 / tensor-parallel size.
    delta = torch.tensor(delta[0], device=sample_hidden_states.device, dtype=sample_hidden_states.dtype) / tensor_parallel_size_my
    self.ptuning_params = delta

    # Checked only once, against the shape of the first batch seen.
    assert delta.shape == sample_hidden_states.shape, f"Delta Shape Mismatch!!! delta.shape: {delta.shape}, sample_hidden_states.shape: {sample_hidden_states.shape}"
    print("Initializing END!")

# Add the delta to the final hidden states before computing logits.
logits = self.model.compute_logits(sample_hidden_states + self.ptuning_params, None)
#### SLOT End Here
```
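
One thing that may be worth checking in isolation is the division by `tensor_parallel_size_my`. The toy sketch below is not vLLM code: it uses plain Python lists and hypothetical helpers (`matvec`, `sharded_logits`), and it assumes the LM head is sharded along the vocabulary dimension, so that every rank sees the full hidden state and the per-rank logits are concatenated. That layout is an assumption and should be verified against the installed vLLM version. Under that assumption, dividing $\delta$ by the number of ranks shrinks the shift instead of reproducing the single-GPU result.

```python
def matvec(weight, hidden):
    """Multiply a [vocab, hidden] weight (list of rows) by a hidden vector."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in weight]


def sharded_logits(weight, hidden, delta, tp, divide_delta):
    """Simulate a vocab-sharded LM head (assumed layout, not vLLM's actual code).

    Each of the `tp` ranks holds a contiguous slice of the weight rows,
    sees the full hidden state, and the per-rank logits are concatenated.
    """
    scale = 1.0 / tp if divide_delta else 1.0
    shifted = [h + scale * d for h, d in zip(hidden, delta)]
    rows_per_rank = len(weight) // tp
    logits = []
    for rank in range(tp):
        shard = weight[rank * rows_per_rank:(rank + 1) * rows_per_rank]
        logits.extend(matvec(shard, shifted))  # gather along the vocab axis
    return logits


# Toy numbers chosen for illustration only.
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
hidden = [0.5, -0.5]
delta = [0.25, 0.75]

# Single-device reference: apply the full delta, then the full LM head.
reference = matvec(weight, [h + d for h, d in zip(hidden, delta)])
full = sharded_logits(weight, hidden, delta, tp=2, divide_delta=False)
scaled = sharded_logits(weight, hidden, delta, tp=2, divide_delta=True)

print("reference:", reference)
print("full delta, tp=2:", full)
print("delta / tp, tp=2:", scaled)
```

If the real layout matches this assumption, comparing logits from a `tensor_parallel_size_my=1` run against a multi-GPU run on the same prompt would show whether the scaling changes the effective $\delta$.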
