@Muedi Muedi commented Mar 25, 2024

Hi,

I had this lying around for some time and finally wanted to open a draft pull request.

I added a script containing functions that read in a FASTA file and compute the Shannon entropy and KL divergence per sequence, based on the sequences in that file.
It always builds a dict containing the amino acid (AA) frequencies to work with.
The frequencies in question are OVERALL and not based on an alignment. This was by choice, as I think it's much faster and I don't think aligning multiple millions of sequences is practicable :D
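
To illustrate the idea, here is a minimal sketch of alignment-free, per-sequence metrics. It is not the code from this PR: the function names (`read_fasta`, `background_frequencies`, `shannon_entropy`, `kl_divergence`) are hypothetical, and it assumes that entropy is computed from each sequence's own AA composition and that KL divergence compares that composition against the overall frequencies of the file, both in bits.

```python
from collections import Counter
from math import log2


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def frequencies(counts):
    """Turn raw counts into relative frequencies (empty input gives {})."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {aa: n / total for aa, n in counts.items()}


def background_frequencies(sequences):
    """Overall AA frequencies across all sequences, ignoring position."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq)
    return frequencies(counts)


def shannon_entropy(seq):
    """Shannon entropy (bits) of the AA composition of one sequence."""
    p = frequencies(Counter(seq))
    return -sum(v * log2(v) for v in p.values())


def kl_divergence(seq, background):
    """KL divergence (bits) of a sequence's composition from the background.

    Assumes every residue in seq also occurs in the background, which holds
    when the background is built from the same file.
    """
    p = frequencies(Counter(seq))
    return sum(v * log2(v / background[aa]) for aa, v in p.items())
```

Usage would be roughly: collect the sequences once with `read_fasta`, build the background dict with `background_frequencies`, then call the two metric functions per sequence. Since only counting is involved, the cost grows linearly with the total number of residues, which is why this scales to millions of sequences where an alignment would not.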

There are old commits shown as not integrated, I think because they were merged into one last time. I kept everything as is, because there are some changes in the scripts folders (unifying scripts and script.py).

I also planned to write a function that takes the UniProt accession from the FASTA files and gets the PPL metrics of AF2 from Google Cloud, but I did not have an example FASTA file.

Best,
Max
