
Bgen #227

Open
SantiDu wants to merge 19 commits into hakyimlab:master from SantiDu:bgen

@SantiDu
Contributor

@SantiDu SantiDu commented May 23, 2026

Three changes were made here:

  1. Conda/Mamba and Pixi end up installing numpy>=2 (2.0.2 in the log below) even though the yaml file specifies numpy=1.26, but the installed scipy requires numpy<2:
/home/j/jidu/MetaXcan-0.8.1/software/metax/gwas/GWAS.py:5: UserWarning: A NumPy version >=1.22.4 and <1.29.0 is required for this version of SciPy (detected version 2.0.2)
  import scipy.stats as stats
Traceback (most recent call last):
  File "/home/j/jidu/MetaXcan-0.8.1/software/Predict.py", line 16, in <module>
    from metax.genotype import Genotype
...
    from ._kdtree import *
  File "/scratch/tmp/jidu/predict/.pixi/envs/imlabtools/lib/python3.9/site-packages/scipy/spatial/_kdtree.py", line 4, in <module>
    from ._ckdtree import cKDTree, cKDTreeNode
  File "_ckdtree.pyx", line 1, in init scipy.spatial._ckdtree
ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

The proposed change to the yaml file installs scipy and numpy with pip instead, which makes this incompatibility go away.

  2. Iterating over a bgen file with bgen-reader takes forever, so I switched to the bgen package. Prediction from bgen input is now noticeably faster than from vcf.

  3. Updated the version from 0.7.5 to 0.8.2.
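
As a rough illustration of the NumPy/SciPy mismatch described in the first item, the version range from the SciPy warning can be checked before any heavy imports run. This is only a sketch: the helper names `parse_version` and `numpy_in_range` are hypothetical and not part of MetaXcan, and the default bounds are copied from the warning text in the log rather than from SciPy's packaging metadata.

```python
def parse_version(version):
    """Turn a string such as '1.26.4' or '1.26.0rc1' into a tuple like (1, 26, 4).

    Only the leading digits of the first three dot-separated pieces are used.
    """
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    # Pad short versions such as '2.0' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def numpy_in_range(version, lower="1.22.4", upper="1.29.0"):
    """Return True if lower <= version < upper.

    The default bounds are the ones quoted in the SciPy warning above
    (an assumption: other SciPy builds state different bounds).
    """
    return parse_version(lower) <= parse_version(version) < parse_version(upper)
```

In an environment built from the yaml file, calling `numpy_in_range(numpy.__version__)` at startup would flag the numpy 2.0.2 install from the log before the `ValueError` is raised deep inside `scipy.spatial`.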

Please ignore all the commits that try out different combinations of dependency versions, hahaha.

@SantiDu SantiDu closed this May 23, 2026
@SantiDu SantiDu reopened this May 23, 2026
@SantiDu SantiDu closed this Jun 7, 2026
@SantiDu SantiDu reopened this Jun 7, 2026
@hakyim
Contributor

hakyim commented Jun 7, 2026

Thanks for the PR! Before merging, one question: did you test with a phased BGEN file? The dosage formula for phased genotypes (x[1] + x[3]) assumes the same column ordering as bgen_reader — if bgen's variant.probabilities arranges phased probabilities differently, the dosage would be silently wrong. If your test data was unphased, it would be good to confirm the phased path is correct before we merge.
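
To make the concern concrete, here is a minimal sketch of the two dosage formulas and a cheap layout check. It assumes that a phased, biallelic, diploid row is laid out as `[hap1_ref, hap1_alt, hap2_ref, hap2_alt]`, which is the ordering that `x[1] + x[3]` relies on. Whether bgen's `variant.probabilities` actually uses that ordering is exactly what needs confirming. The function names are hypothetical and are not taken from MetaXcan or from either library.

```python
def unphased_dosage(probs):
    """Expected alt-allele count from [P(ref/ref), P(ref/alt), P(alt/alt)]."""
    p_rr, p_ra, p_aa = probs
    return p_ra + 2.0 * p_aa


def phased_dosage(probs):
    """Expected alt-allele count, i.e. x[1] + x[3].

    Assumes the layout [hap1_ref, hap1_alt, hap2_ref, hap2_alt].
    """
    h1_ref, h1_alt, h2_ref, h2_alt = probs
    return h1_alt + h2_alt


def matches_assumed_phased_layout(probs, tol=1e-6):
    """Check that each assumed haplotype pair sums to 1.

    This is necessary but not sufficient: under an interleaved layout,
    heterozygous rows still pass, while homozygous rows fail.
    """
    h1_ref, h1_alt, h2_ref, h2_alt = probs
    return (abs(h1_ref + h1_alt - 1.0) <= tol
            and abs(h2_ref + h2_alt - 1.0) <= tol)
```

Running `matches_assumed_phased_layout` over every sample of a few variants from a phased test file would surface a different column ordering quickly, since a homozygous-alt sample stored as `[0, 0, 1, 1]` fails the check.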
