Commit 1eb2e64

Minor improvements to BiocFrame class (#129).

- Major update to type hints throughout the module for better type safety and consistency.
- Fixed bug in slice operations where column indices might be incorrectly initialized.
- Added missing index validation in `get_row()` for integer row indices, and similar validation of out-of-range indices in `remove_columns()` and `remove_rows()`.
- Accept a list of column values and column names to initialize a `BiocFrame` object.
- Implement empty, contains, head, tail.
- Coercion to list and `NamedList` from biocutils.
1 parent 1a6a92d commit 1eb2e64
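
The commit message mentions index validation for integer row indices. The library's actual implementation is not shown on this page, so the following is only a minimal sketch of what such a check commonly looks like. The helper name `normalize_row_index` is hypothetical, and support for negative indices is an assumption borrowed from Python's usual sequence convention.

```python
def normalize_row_index(index: int, number_of_rows: int) -> int:
    """Validate an integer row index and resolve negative values.

    Hypothetical helper for illustration only; not part of biocframe.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(index, bool) or not isinstance(index, int):
        raise TypeError(f"row index must be an integer, got {type(index).__name__}")

    # Accept indices in [-number_of_rows, number_of_rows), like a Python list.
    if not -number_of_rows <= index < number_of_rows:
        raise IndexError(f"row index {index} is out of range for {number_of_rows} rows")

    # Map negative indices onto the equivalent non-negative position.
    return index + number_of_rows if index < 0 else index
```

Failing early with an `IndexError` keeps an out-of-range request from silently wrapping around or surfacing later as a less readable error from the underlying column containers.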

16 files changed (+1284, -164 lines changed)

.github/workflows/publish-pypi.yml

Lines changed: 10 additions & 13 deletions
@@ -1,6 +1,3 @@
-# This workflow will install Python dependencies, run tests and lint with a single version of Python
-# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions
-
 name: Publish to PyPI
 
 on:
@@ -19,10 +16,10 @@ jobs:
     steps:
       - uses: actions/checkout@v4
 
-      - name: Set up Python 3.11
+      - name: Set up Python 3.12
         uses: actions/setup-python@v5
         with:
-          python-version: 3.11
+          python-version: 3.12
 
       - name: Install dependencies
         run: |
@@ -33,6 +30,14 @@ jobs:
         run: |
           tox
 
+      - name: Build Project and Publish
+        run: |
+          python -m tox -e clean,build
+
+      # This uses the trusted publisher workflow so no token is required.
+      - name: Publish to PyPI
+        uses: pypa/gh-action-pypi-publish@release/v1
+
       - name: Build docs
         run: |
           tox -e docs
@@ -45,11 +50,3 @@ jobs:
           branch: gh-pages # The branch the action should deploy to.
           folder: ./docs/_build/html
           clean: true # Automatically remove deleted files from the deploy branch
-
-      - name: Build Project and Publish
-        run: |
-          python -m tox -e clean,build
-
-      # This uses the trusted publisher workflow so no token is required.
-      - name: Publish to PyPI
-        uses: pypa/gh-action-pypi-publish@release/v1

.github/workflows/run-tests.yml

Lines changed: 55 additions & 15 deletions
@@ -1,33 +1,73 @@
-name: Run tests
+name: Test the library
 
 on:
   push:
-    branches: [master]
+    branches:
+      - master # for legacy repos
+      - main
   pull_request:
-    branches: [master]
+    branches:
+      - master # for legacy repos
+      - main
+  workflow_dispatch: # Allow manually triggering the workflow
+  schedule:
+    # Run roughly every 15 days at 00:00 UTC
+    # (useful to check if updates on dependencies break the package)
+    - cron: "0 0 1,16 * *"
+
+permissions:
+  contents: read
+
+concurrency:
+  group: >-
+    ${{ github.workflow }}-${{ github.ref_type }}-
+    ${{ github.event.pull_request.number || github.sha }}
+  cancel-in-progress: true
 
 jobs:
-  build:
-    runs-on: ubuntu-latest
+  test:
     strategy:
       matrix:
-        python-version: ["3.9", "3.10", "3.11", "3.12", "3.13"]
-
-    name: Python ${{ matrix.python-version }}
+        python: ["3.9", "3.10", "3.11", "3.12", "3.13", "3.14"]
+        platform:
+          - ubuntu-latest
+          - macos-latest
+          - windows-latest
+    runs-on: ${{ matrix.platform }}
+    name: Python ${{ matrix.python }}, ${{ matrix.platform }}
     steps:
       - uses: actions/checkout@v4
 
-      - name: Setup Python
-        uses: actions/setup-python@v5
+      - uses: actions/setup-python@v5
+        id: setup-python
         with:
-          python-version: ${{ matrix.python-version }}
-          cache: "pip"
+          python-version: ${{ matrix.python }}
 
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
-          pip install tox
+          pip install tox coverage
 
-      - name: Test with tox
-        run: |
+      - name: Run tests
+        run: >-
+          pipx run --python '${{ steps.setup-python.outputs.python-path }}'
           tox
+          -- -rFEx --durations 10 --color yes --cov --cov-branch --cov-report=xml # pytest args
+
+      - name: Check for codecov token availability
+        id: codecov-check
+        shell: bash
+        run: |
+          if [ ${{ secrets.CODECOV_TOKEN }} != '' ]; then
+            echo "codecov=true" >> $GITHUB_OUTPUT;
+          else
+            echo "codecov=false" >> $GITHUB_OUTPUT;
+          fi
+
+      - name: Upload coverage reports to Codecov with GitHub Action
+        uses: codecov/codecov-action@v5
+        if: ${{ steps.codecov-check.outputs.codecov == 'true' }}
+        env:
+          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
+          slug: ${{ github.repository }}
+          flags: ${{ matrix.platform }} - py${{ matrix.python }}
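
The `schedule` trigger in the workflow above is commented as running "roughly every 15 days". As a quick check of what the cron expression `0 0 1,16 * *` implies, the Python sketch below lists the 1st and 16th of every month for one year and measures the gaps between consecutive runs. The helper name `run_dates` and the chosen year are illustrative and not part of the workflow.

```python
from datetime import date


def run_dates(year: int) -> list[date]:
    """Dates on which `0 0 1,16 * *` fires during the given year (00:00 UTC)."""
    return [date(year, month, day) for month in range(1, 13) for day in (1, 16)]


dates = run_dates(2025)

# Distinct gaps, in days, between consecutive scheduled runs.
gaps = sorted({(later - earlier).days for earlier, later in zip(dates, dates[1:])})
print(gaps)
```

For a non-leap year such as 2025 this prints `[13, 15, 16]`. The gap from the 1st to the 16th is always 15 days, while the gap from the 16th to the next 1st depends on the length of the month.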

CHANGELOG.md

Lines changed: 9 additions & 0 deletions
@@ -1,5 +1,14 @@
 # Changelog
 
+## Version 0.7.0
+
+- Major update to type hints throughout the module for better type safety and consistency.
+- Fixed bug in slice operations where column indices might be incorrectly initialized.
+- Added missing index validation in `get_row()` for integer row indices. Similar index validation in `remove_columns()` and `remove_rows()` for out-of-range indices.
+- Accept a list of column values and column names to initialize a biocframe object.
+- Implement empty, contains, head, tail,
+- Coercions to list and `NamedList` from bioctuls.
+
 ## Version 0.6.3
 - Implement `remove_rows()`.
 - Implement `has_row()`.

README.md

Lines changed: 100 additions & 0 deletions
@@ -32,6 +32,106 @@ pip install biocframe
 pip install biocframe[optional]
 ```
 
+## Quick Examples
+
+### Genomic Annotation Data
+
+Genomic data often requires storing coordinates, annotations, and metadata together:
+
+```python
+# Gene annotation with nested structures
+gene_annotations = BiocFrame({
+    "gene_id": ["GENE1", "GENE2", "GENE3"],
+    "symbol": ["BRCA1", "TP53", "EGFR"],
+    "location": BiocFrame({
+        "chromosome": ["chr17", "chr17", "chr7"],
+        "start": [43044295, 7668422, 55019017],
+        "end": [43125483, 7687550, 55211628],
+        "strand": ["-", "-", "+"],
+    }),
+    "transcripts": [
+        ["NM_007294", "NM_007297", "NM_007300"],
+        ["NM_000546"],
+        ["NM_005228", "NM_201282"],
+    ],
+    "pathways": [
+        ["DNA repair", "Cell cycle"],
+        ["Apoptosis", "Cell cycle", "DNA repair"],
+        ["Cell growth", "Signal transduction"],
+    ],
+}, row_names=["ENSG00000012048", "ENSG00000141510", "ENSG00000146648"])
+
+print(gene_annotations)
+```
+
+### Multi-Omics Data Integration
+
+When combining different types of omics data with varying structures:
+
+```python
+# Multi-omics data with different measurement types
+multi_omics = BiocFrame({
+    "sample_id": ["S1", "S2", "S3"],
+    "rna_seq": np.array([
+        [100, 200, 150],
+        [300, 250, 180],
+        [120, 220, 160],
+    ], dtype=np.float32),
+    "methylation": BiocFrame({
+        "cg0001": [0.85, 0.92, 0.78],
+        "cg0002": [0.45, 0.38, 0.52],
+        "cg0003": [0.12, 0.15, 0.10],
+    }),
+    "clinical": BiocFrame({
+        "age": [45, 52, 38],
+        "gender": ["M", "F", "F"],
+        "diagnosis": ["Type A", "Type B", "Type A"],
+    }),
+}, column_data=BiocFrame({
+    "data_type": ["identifier", "expression", "epigenetic", "clinical"],
+    "source": ["lab", "sequencer", "array", "EHR"],
+}))
+
+print(multi_omics)
+print("\nColumn metadata:")
+print(multi_omics.get_column_data())
+```
+
+### Hierarchical Data Structures
+
+For data with natural hierarchies (e.g., samples → patients → cohorts):
+
+```python
+# Hierarchical clinical trial data
+clinical_trial = BiocFrame({
+    "patient_id": ["P001", "P002", "P003"],
+    "cohort": ["A", "A", "B"],
+    "samples": [
+        BiocFrame({
+            "sample_id": ["S001", "S002"],
+            "collection_date": ["2024-01-01", "2024-01-15"],
+            "vital_status": ["alive", "alive"],
+        }),
+        BiocFrame({
+            "sample_id": ["S003", "S004", "S005"],
+            "collection_date": ["2024-01-02", "2024-01-16", "2024-01-30"],
+            "vital_status": ["alive", "alive", "deceased"],
+        }),
+        BiocFrame({
+            "sample_id": ["S006"],
+            "collection_date": ["2024-01-03"],
+            "vital_status": ["alive"],
+        }),
+    ],
+}, metadata={
+    "trial_name": "PHASE_III_STUDY",
+    "start_date": "2024-01-01",
+    "status": "ongoing",
+})
+
+print(clinical_trial)
+```
+
 ## Construction
 
 To construct a `BiocFrame` object, simply provide the data as a dictionary.
