[BUG] Template alignments skipped after processing.

[4f799d6c-5eee-43ed-90e4-54a958c69647.zip](https://github.com/user-attachments/files/23556900/4f799d6c-5eee-43ed-90e4-54a958c69647.zip)

Label: `OpenFold Consortium Member`

**Describe the bug**

Templates are processed, but template features are populated for only some chains. My leading suspicion is that gaps in a template alignment cause that template to be skipped entirely (see below).


**To Reproduce**
If you are able to re-run the OF3 inference query, you can inspect the saved `_batch.pt` batch and focus on the template feature with key `template_backbone_frame_mask`. Inspecting this tensor shows that the first two chains (A and B in this case) are not read, e.g.

```
import torch

batch = torch.load(
    f"{your_path_to_file}/4f799d6c-5eee-43ed-90e4-54a958c69647/immrep00000/seed_2746317213/immrep00000_seed_2746317213_batch.pt"
)  # change file path
batch["template_backbone_frame_mask"][0][0][:231]  # this will be all zeros
```
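To see which chains are affected at a glance, a small helper along the following lines could summarise mask coverage per chain. This is only a sketch: `template_mask_coverage` and `chain_lengths` are hypothetical names, the toy lengths are made up, and it assumes the mask for a single template has been flattened to a plain list (for example with `.tolist()`).

```python
def template_mask_coverage(mask_row, chain_lengths):
    """Return {chain_id: fraction of tokens with a non-zero template mask}.

    mask_row: flat sequence of 0/1 values for one template.
    chain_lengths: dict of chain_id -> token count, in token order
                   (hypothetical input; take the real lengths from your query).
    """
    coverage = {}
    start = 0
    for chain_id, length in chain_lengths.items():
        segment = mask_row[start:start + length]
        covered = sum(1 for value in segment if value)
        coverage[chain_id] = covered / length if length else 0.0
        start += length
    return coverage


# Toy data only: chains A and B have no template features, chain C mostly does.
toy_mask = [0] * 5 + [1, 1, 0, 1]
toy_lengths = {"A": 3, "B": 2, "C": 4}
print(template_mask_coverage(toy_mask, toy_lengths))
```

On the real batch, the input would be something like `batch["template_backbone_frame_mask"][0][0].tolist()`, with the chain lengths taken from the query JSON.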

I've attached the query JSON used to generate the saved batch above (see `query_inputs/tcrpmhc_query.json`), along with the alignment files in `alignments`. Together these should provide the inputs needed to run OF3 and regenerate the template features.

I ran it with this YAML:

```
experiment_settings:
  mode: predict
  seed: 42
  pytorch_ckpt_path: /mnt/inputs/of3_params/of3_ft3_v1.pt
  query_json: /mnt/inputs/dataset/queries/tcrpmhc_query.json
  # output_dir gets set by update_yaml(); keep it null here
  output_dir: null
  # use_templates is now always set by the workflow command line inputs; omit here

pl_trainer_args:
  devices: 4
  num_nodes: 1
  precision: bf16-mixed
  kubeflow: true
  mpi_plugin: false

model_update:
  presets: [predict] # pae and classifier are enabled in the base config

output_writer_settings:
  structure_format: pdb
  write_features: true # true for debugging, generally set to false
  write_latent_outputs: true # true for debugging, generally set to false
```

with these arguments:
```
        "--query_json",
        qjson,
        "--inference_ckpt_path",
        ckpt, # see check point above
        "--use_msa_server",
        "False",
        "--use_templates",
        "True",
```

**Expected behavior**
I expect template features to be loaded for all chains.

**Stack trace**
In case it is helpful, I've looked deeper into the codebase and found a candidate for the source of the bug. I think templates are skipped when the alignment contains query positions that do not map to any template residue (indicated by a -1 in `template_cache_entry.idx_map` in `map_token_pos_to_template_residues`).


This causes `has_multioccupancy_residue` to be set to `True` here:
```
    # Skip template if query and template are still misaligned, this can happen due to
    # unhandled multi-occupancy residues or author annotation errors
    # TODO: add fixes and logging for these cases
    has_multioccupancy_residue = (
        struc.get_residue_starts(atom_array_cropped_template).shape != repeats.shape
    )
```

which then triggers the return of an empty template here:

```
   if has_multioccupancy_residue:
        template_slice = TemplateSlice(
            atom_array=AtomArray(0),
            query_token_positions=np.array([]),
            template_residue_repeats=np.array([]),
        )
```

In my example, a slice of `idx_map` looks like this:

```
[ 89  89]
[ 90  90]
[ 91  91]
[ 92  -1]
[ 93  -1]
[ 94  -1]
[ 95  92]
[ 96  93]
```

which leads me to believe the -1 entries may be causing the early return of an empty `TemplateSlice`.
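For anyone trying to confirm this on other templates, the unmapped stretches can be listed with a short helper like the one below. It is a sketch under assumptions: `find_unmapped_runs` is a hypothetical name, and it assumes the first column of `idx_map` is the query index and the second is the template index, with -1 meaning "no template residue".

```python
def find_unmapped_runs(idx_map):
    """Return (first_query_idx, last_query_idx) for each consecutive run of
    rows whose template index is -1."""
    runs = []
    run_start = None
    prev_query = None
    for query_idx, template_idx in idx_map:
        if template_idx == -1:
            if run_start is None:
                run_start = query_idx  # a new unmapped run begins here
        elif run_start is not None:
            runs.append((run_start, prev_query))  # run ended on the previous row
            run_start = None
        prev_query = query_idx
    if run_start is not None:
        runs.append((run_start, prev_query))  # run extends to the last row
    return runs


# The slice of idx_map shown above.
idx_map_slice = [
    (89, 89), (90, 90), (91, 91),
    (92, -1), (93, -1), (94, -1),
    (95, 92), (96, 93),
]
print(find_unmapped_runs(idx_map_slice))  # [(92, 94)]
```

If `idx_map` is a NumPy array, passing `idx_map.tolist()` should work with the same function.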

**Configuration (please complete the following information):**
 - GPU A100 node (96 cpus)
 - Installation from repo

**Additional context**
This may be related to another issue, posted by my colleague, about properly formatting `.sto` files: https://github.com/aqlaboratory/openfold-3/issues/42
I used `.sto` files generated by the OpenFold2 pipeline. Please let me know if that is not valid.

