Conversation

@pauladkisson pauladkisson commented Aug 29, 2025

Fixes #152

@pauladkisson pauladkisson marked this pull request as ready for review September 2, 2025 23:49
@pauladkisson pauladkisson marked this pull request as draft September 3, 2025 00:15
@pauladkisson pauladkisson marked this pull request as ready for review September 3, 2025 20:00
@luiztauffer
Collaborator

@pauladkisson, what do you think of adding Codecov here? It's difficult to see where we might be missing coverage otherwise.

pauladkisson commented Sep 19, 2025

> @pauladkisson, what do you think of adding Codecov here? It's difficult to see where we might be missing coverage otherwise.

I'd really rather add Codecov, along with various other more sophisticated CI and tests, in a separate PR. This initial testing suite isn't supposed to be comprehensive.

Base automatically changed from packaging to dev October 14, 2025 19:15

pauladkisson commented Oct 30, 2025

In sampleData_NPM_1, I get an error in step 3 when I try to read the raw data:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/readTevTsq.py", line 567, in <module>
    main(input_parameters=input_parameters)
    ~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/readTevTsq.py", line 563, in main
    raise e
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/readTevTsq.py", line 557, in main
    readRawData(input_parameters)
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/readTevTsq.py", line 541, in readRawData
    execute_import_doric(filepath, storesList, flag, op)
    ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/readTevTsq.py", line 460, in execute_import_doric
    raise Exception('More than one Doric csv file present at the location')
Exception: More than one Doric csv file present at the location

I tracked it down to readTevTsq.py, line 99, where the following code:

df = pd.read_csv(path[i], index_col=False, dtype=float)

raises the following error:

Traceback (most recent call last):
  File "pandas/_libs/parsers.pyx", line 1161, in pandas._libs.parsers.TextReader._convert_tokens
TypeError: Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/readTevTsq.py", line 576, in <module>
    main(input_parameters=input_parameters)
    ~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/readTevTsq.py", line 572, in main
    raise e
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/readTevTsq.py", line 566, in main
    readRawData(input_parameters)
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/readTevTsq.py", line 533, in readRawData
    flag = check_doric(filepath)
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/readTevTsq.py", line 104, in check_doric
    raise(e)
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/readTevTsq.py", line 102, in check_doric
    df = pd.read_csv(path[i], index_col=False, dtype=float)
  File "/opt/anaconda3/envs/guppy_env/lib/python3.13/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/opt/anaconda3/envs/guppy_env/lib/python3.13/site-packages/pandas/io/parsers/readers.py", line 626, in _read
    return parser.read(nrows)
           ~~~~~~~~~~~^^^^^^^
  File "/opt/anaconda3/envs/guppy_env/lib/python3.13/site-packages/pandas/io/parsers/readers.py", line 1923, in read
    ) = self._engine.read(  # type: ignore[attr-defined]
        ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        nrows
        ^^^^^
    )
    ^
  File "/opt/anaconda3/envs/guppy_env/lib/python3.13/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 234, in read
    chunks = self._reader.read_low_memory(nrows)
  File "pandas/_libs/parsers.pyx", line 838, in pandas._libs.parsers.TextReader.read_low_memory
  File "pandas/_libs/parsers.pyx", line 921, in pandas._libs.parsers.TextReader._read_rows
  File "pandas/_libs/parsers.pyx", line 1066, in pandas._libs.parsers.TextReader._convert_column_data
  File "pandas/_libs/parsers.pyx", line 1167, in pandas._libs.parsers.TextReader._convert_tokens
ValueError: could not convert string to float: 'pinknoise'
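
For context, the failure reduces to a minimal reproduction: `pd.read_csv(..., dtype=float)` raises as soon as any column contains non-numeric text. This is a sketch with a hypothetical inline CSV (the column names are made up); only the `pinknoise` string matches the real file.

```python
import io

import pandas as pd

# Hypothetical two-column CSV; 'pinknoise' mimics the string value
# found in the real NPM file.
csv_text = "Region,Value\npinknoise,1.0\n"

try:
    # Same call pattern as check_doric in readTevTsq.py.
    df = pd.read_csv(io.StringIO(csv_text), index_col=False, dtype=float)
except ValueError as e:
    print(e)  # could not convert string to float: 'pinknoise'
```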

I don't think this bug is within the scope of this PR, since it involves a deep dive into finding a more robust way to tell Doric and NPM .csv files apart. In fact, we might be able to sidestep the issue entirely by just having users specify whether their data is coming from Doric, NPM, TDT, or whatever.

So my plan is to just skip this file in the tests for now.
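
If we do go the "users specify their source" route, a minimal sketch could look like the following. None of these names (`read_raw_csv`, `data_format`) exist in GuPPy today; they are illustrative only, and the per-format read options are assumptions based on the behavior above.

```python
import pandas as pd


def read_raw_csv(path: str, data_format: str) -> pd.DataFrame:
    """Read a raw photometry CSV, dispatching on a user-declared format."""
    if data_format == "doric":
        # Doric CSVs are assumed to be all-numeric, so dtype=float is safe.
        return pd.read_csv(path, index_col=False, dtype=float)
    elif data_format == "npm":
        # NPM CSVs can contain string columns (e.g. 'pinknoise'),
        # so let pandas infer dtypes per column instead of forcing float.
        return pd.read_csv(path, index_col=False)
    else:
        raise ValueError(f"Unsupported data_format: {data_format!r}")
```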

@pauladkisson
Collaborator Author

In sampleData_NPM_2, FiberData470.csv had one extra, duplicated row relative to FiberData415.csv, which caused the following error in step 4:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/preprocess.py", line 1245, in <module>
    main(input_parameters=input_parameters)
    ~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/preprocess.py", line 1241, in main
    raise e
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/preprocess.py", line 1235, in main
    extractTsAndSignal(input_parameters)
    ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/preprocess.py", line 1222, in extractTsAndSignal
    execute_zscore(folderNames, inputParameters)
    ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/preprocess.py", line 1171, in execute_zscore
    compute_z_score(filepath, inputParameters)
    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/preprocess.py", line 1005, in compute_z_score
    z_score, dff, control_fit = helper_z_score(control, signal, filepath, name, inputParameters)
                                ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/preprocess.py", line 948, in helper_z_score
    norm_data, control_fit = execute_controlFit_dff(control, signal,
                             ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
                                                    isosbestic_control, filter_window)
                                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/preprocess.py", line 862, in execute_controlFit_dff
    control_fit = controlFit(control_smooth, signal_smooth)
  File "/Users/pauladkisson/Documents/CatalystNeuro/Guppy/GuPPy/src/guppy/preprocess.py", line 835, in controlFit
    p = np.polyfit(control, signal, 1)
  File "/opt/anaconda3/envs/guppy_env/lib/python3.13/site-packages/numpy/lib/_polynomial_impl.py", line 649, in polyfit
    raise TypeError("expected x and y to have same length")
TypeError: expected x and y to have same length

To fix this problem, I deleted the extra row, which allowed everything to run smoothly. If we want to fix it in a more robust, long-term way, we'll need a follow-up; one possible approach is sketched below.
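
As a rough sketch of what that follow-up could do, the fit could guard against mismatched lengths instead of letting np.polyfit raise. Trimming to the shorter trace (with a warning) is just one possible policy; raising a clearer error is another. The fit itself assumes controlFit does a linear fit of control to signal, as preprocess.py line 835 suggests.

```python
import warnings

import numpy as np


def controlFit(control: np.ndarray, signal: np.ndarray) -> np.ndarray:
    # Guard: trim both traces to a common length rather than erroring out.
    if len(control) != len(signal):
        n = min(len(control), len(signal))
        warnings.warn(
            f"control ({len(control)}) and signal ({len(signal)}) lengths "
            f"differ; truncating both to {n} samples."
        )
        control, signal = control[:n], signal[:n]
    # Linear least-squares fit of signal onto control (degree 1).
    p = np.polyfit(control, signal, 1)
    return p[0] * control + p[1]
```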

@venus-sherathiya, could you update the file accordingly and upload a fixed version to the GDrive?
