Skip to content

Conversation

@jhlegarreta
Copy link
Contributor

Do not remove fMRI notebook from list of discovered notebooks: this will allow running the notebook at the time scheduled by the notebooks.yml GHA workflow file.

@codecov
Copy link

codecov bot commented May 8, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.19%. Comparing base (ebfcbd8) to head (e5f6d0f).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #117   +/-   ##
=======================================
  Coverage   71.19%   71.19%           
=======================================
  Files          23       23           
  Lines        1045     1045           
  Branches      121      121           
=======================================
  Hits          744      744           
  Misses        259      259           
  Partials       42       42           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jhlegarreta
Copy link
Contributor Author

Looks like SynthStrip is not being found:
https://github.com/jhlegarreta/nifreeze/actions/runs/14897027210/job/41841440531#step:11:102

as it fails at:

copy(bmsk_results.outputs.out_mask, bmask_path)

It also happened when I tried to run the notebook locally. Did not investigate further.

@jhlegarreta jhlegarreta force-pushed the DoNotRemovefMRINotebookFromList branch from 2f60d2e to f0569bc Compare May 11, 2025 18:25
@jhlegarreta
Copy link
Contributor Author

@oesteban Re: f0569bc not sure where the model needs to be downloaded. Guidance is welcome.

@oesteban
Copy link
Member

I just filed an issue reporting what you see nipreps/synthstrip#11

Now, you can configure the model to reside under a FreeSurfer home directory (which can be empty except for the model itself) https://github.com/nipreps/synthstrip/blob/dd29a6c751f6c9cc8a31983d8bfbc82d1d97c4a6/nipreps/synthstrip/wrappers/nipype.py#L38

or, you can set an input on the interface:

" bmsk_results = SynthStrip(\n",
" in_file=str(avg_path),\n",
" use_gpu=True,\n",
" ).run(cwd=str(WORKDIR))\n",

Instead of:

        bmsk_results = SynthStrip(
            in_file=str(avg_path),
            use_gpu=True,
        ).run(cwd=str(WORKDIR))

you add a model argument pointing at the file you are downloading

        bmsk_results = SynthStrip(
            in_file=str(avg_path),
            use_gpu=True,
            model='/path/to/synthstripmodel.pt',
        ).run(cwd=str(WORKDIR))

@jhlegarreta jhlegarreta force-pushed the DoNotRemovefMRINotebookFromList branch from f0569bc to e894222 Compare May 12, 2025 01:38
@jhlegarreta
Copy link
Contributor Author

@jhlegarreta jhlegarreta force-pushed the DoNotRemovefMRINotebookFromList branch from e894222 to 0135322 Compare May 12, 2025 03:18
@jhlegarreta
Copy link
Contributor Author

@oesteban Seems we are still missing something:
https://github.com/jhlegarreta/nifreeze/actions/runs/14963050429/job/42028116883#step:12:124

Suggestions are welcome.

@oesteban
Copy link
Member

Suggestions are welcome.

I guess we'll need to close nipreps/synthstrip#11 to print out where the file is being sought

@jhlegarreta jhlegarreta force-pushed the DoNotRemovefMRINotebookFromList branch 3 times, most recently from 16d4ee7 to 83f248d Compare May 12, 2025 23:34
@jhlegarreta
Copy link
Contributor Author

I must be missing something very evident, but the model weights file is still not being found. The file is being downloaded properly at the testing data location:
https://github.com/jhlegarreta/nifreeze/actions/runs/14984758619/job/42096449224#step:6:632
but the nipreps interface is not being able to load it:
https://github.com/jhlegarreta/nifreeze/actions/runs/14984758619/job/42096449224#step:13:120

@oesteban Sorry, I am at a loss here about what else to check. Suggestions are welcome.

@oesteban
Copy link
Member

oesteban commented May 13, 2025

Okay, now we are getting a different error. Still FileNotFound, but now the interface crashes (as it should) and the not found file is the model. Reading the error it's clear we need to pass a directory not a model to the interface.

Okay you added a specific test before entering the interface. However, the test is failing because it expects a folder.

@jhlegarreta jhlegarreta force-pushed the DoNotRemovefMRINotebookFromList branch 6 times, most recently from 34d7b98 to a65c768 Compare May 13, 2025 17:56
@jhlegarreta
Copy link
Contributor Author

This is proving time-consuming and challenging to debug 😔. My last bet would be to convert the notebook to a python script and to have it tested as such in the CI. This fsiled locally because Python didn't like asyncio being outside a function. Did not investigate further.

@jhlegarreta jhlegarreta force-pushed the DoNotRemovefMRINotebookFromList branch 3 times, most recently from cca464f to af23400 Compare May 13, 2025 20:57
@jhlegarreta
Copy link
Contributor Author

jhlegarreta commented May 13, 2025

@oesteban OK, now errors make sense and I should have realized earlier: the issue is CUDA drivers:

250513-21:05:56,214 nipype.interface INFO:
	 stderr 2025-05-13T21:05:56.214307:    torch._C._cuda_init()
250513-21:05:56,214 nipype.interface INFO:
	 stderr 2025-05-13T21:05:56.214399:RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx

https://github.com/nipreps/nifreeze/actions/runs/15006593539/job/42166634496?pr=117#step:14:1341

Thoughts:

  • AFAIK, GPU access is not offered for free on GHA, so we can:
    • Tell SynthStrip to use the CPU if that will not take too long.
    • Compute the masks locally, upload them to somewhere, download to the appropriate locations, strip that code from the notebook to a python script.

@jhlegarreta jhlegarreta force-pushed the DoNotRemovefMRINotebookFromList branch from af23400 to e2ae7d0 Compare May 13, 2025 21:31
@jhlegarreta
Copy link
Contributor Author

After discussing this offline, it looks like some commit was missed when contributing the notebook and some cell is missing. I will open a separate PR to:

  • Install the necessary parts in the GHA workflow file.
  • Uncomment the fMRI notebook from the list of discovered notebooks.
  • Make the use_gpu flag depend on whether CUDA drivers have been found.

I will go ahead and merge the new PR as soon as I see that we are stuck in the same point as here. I will open a separate PR to try to install the CUDA drivers so that the notebook can run more quickly. We will then continue here or close this PR.

@jhlegarreta jhlegarreta force-pushed the DoNotRemovefMRINotebookFromList branch 6 times, most recently from 76832fb to 0ca1a16 Compare May 19, 2025 02:15
@jhlegarreta
Copy link
Contributor Author

jhlegarreta commented May 20, 2025

I will open a separate PR to try to install the CUDA drivers so that the notebook can run more quickly.

I did it here for practical reasons and the conclusion is here:
https://github.com/nipreps/nifreeze/actions/runs/15102809502/job/42446319442#step:17:1525

 250519-02:25:37,434 nipype.interface INFO:
	 stderr 2025-05-19T02:25:37.434007:    torch._C._cuda_init()
250519-02:25:37,434 nipype.interface INFO:
	 stderr 2025-05-19T02:25:37.434058:RuntimeError: Found no NVIDIA driver on your system.
 Please check that you have an NVIDIA GPU and installed a driver from
 http://www.nvidia.com/Download/index.aspx

So unless there is something wrong with the installation, and after reading more about it, we will not be able to make this happen without a GPU @oesteban. If you happen to have a workaround, please go ahead and try it; otherwise, my proposal is to

  • See if the notebook execution completes running on a CPU when the notebook gets fixed with the missing cells that you have in some local version.
  • If it does, problem solved. If it does not, reduce the number of BOLD runs.

As opposed to the Python script that was run on a CPU:
https://github.com/nipreps/nifreeze/actions/runs/15030922733/job/42243003382

Running the code on Jupyter triggers a cell timeout:
https://github.com/nipreps/nifreeze/actions/runs/15102456036/job/42445432169

So I will push a separate PR to increase the timeout value so that we can eventually run the notebook to completion.

@jhlegarreta jhlegarreta force-pushed the DoNotRemovefMRINotebookFromList branch from 0ca1a16 to 0f4730d Compare May 20, 2025 13:22
@jhlegarreta
Copy link
Contributor Author

So I will push a separate PR to increase the timeout value so that we can eventually run the notebook to completion.

Done in PR #126.

So hopefully the only thing missing here is to recover those cells from somewhere in your local machine @oesteban and then rebase this branch on main.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants