Replace passivepy with a call to an LLM #147
nonprofittechy merged 10 commits into migrate-from-spaCy-and-nltk
Conversation
Pull Request Overview
This PR replaces PassivePy (a Python library for passive voice detection) with a call to OpenAI's LLM (gpt-5-nano) for passive voice detection in text analysis, moving from a local library to an AI-powered cloud solution.
- Removes dependency on PassivePy and tools.suffolklitlab.org API for passive voice detection
- Implements new LLM-based passive voice detection using OpenAI's gpt-5-nano model
- Replaces NLTK sentence tokenization with a regex-based approach to reduce dependencies
Reviewed Changes
Copilot reviewed 11 out of 13 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| formfyxer/passive_voice_detection.py | New module implementing LLM-based passive voice detection with OpenAI API |
| formfyxer/lit_explorer.py | Updated to use new passive voice detection module instead of tools API |
| formfyxer/tests/test_passive_voice_detection.py | Comprehensive unit tests for the new passive voice detection functionality |
| formfyxer/prompts/passive_voice.txt | Prompt template for LLM passive voice classification |
| promptfooconfig.yaml | Configuration for evaluating the LLM passive voice detector |
| test_passive_voice_detection.py | Integration test script for the passive voice detection module |
| formfyxer/tests/passive_voice_test_dataset.csv | Test dataset for passive voice evaluation |
@BryceStevenWilley this turned out to take much more testing than I expected--I thought it would be the easier drop-in replacement, lol. But I'll do a future PR off of this branch, since we already replace sentence tokenization in this PR. The CI failures are the same errors with the old ML dependencies; going to ignore those for now.
BryceStevenWilley left a comment:
LGTM! My only nit is that we should rename + move the integration test file.
Co-authored-by: Bryce Willey <bryce.willey@suffolk.edu>
Merged commit c08bfd4 into migrate-from-spaCy-and-nltk
This replaces the sentence tokenization we used in a few places with a regular expression (instead of NLTK) and replaces the use of PassivePy (via tools.suffolklitlab.org) with a call to an LLM.
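For illustration, the regex approach to sentence splitting can be sketched like this (a minimal example of the idea, not the exact pattern used in formfyxer):

```python
import re

def split_sentences(text: str) -> list[str]:
    """Naive regex sentence splitter: break after ., !, or ?
    followed by whitespace. A sketch of the approach, not
    formfyxer's actual implementation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

print(split_sentences("The form was signed. Please file it today!"))
# → ['The form was signed.', 'Please file it today!']
```

A pattern like this mis-splits on abbreviations ("Mr. Smith"), which is the usual tradeoff for dropping the NLTK dependency.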
PassivePy reports 98% accuracy on its test dataset; the gpt-5-nano LLM scores 95.65% on the same dataset of about 1,100 sentences in a promptfoo evaluation. I spent a lot of time on multiple rounds of tests and tweaks, comparing few-shot prompts with extremely detailed instructions against zero-shot classification, and something closer to zero-shot with fewer rules in the prompt performs best for gpt-5-nano. When I looked closely at the failures, they mostly stem from ambiguous sentences that have a valid passive-voice interpretation but were marked as active by PassivePy's human annotators. I feel confident that the LLM's current performance is good enough to capture confusing sentences, as the sentences our prompt marked "passive" but the humans marked "active" confused me too!
Some of the "weird" sentences where we disagreed with human annotators:
Some show adjective-vs-verb confusion--after looking closely I agree with the human annotators, but the errors are on weird or ungrammatical sentences, close calls with two valid readings (one passive and one active), or sentences with genuinely ambiguous usage.
Note that gpt-5-nano is extremely inexpensive, and our prompt caches well. Testing 1,100 sentences cost 12.5 cents.
If this lets us power off tools.suffolklitlab.org, that would be a significant savings, as this is likely to cost less than a dollar a month even at quite high usage.
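The cost claim follows from simple arithmetic on the numbers above:

```python
# Back-of-the-envelope check of the cost numbers above.
batch_cost_dollars = 0.125   # 12.5 cents for the promptfoo run
batch_sentences = 1100       # ~1,100 sentences in the test dataset

per_sentence = batch_cost_dollars / batch_sentences
print(f"${per_sentence:.6f} per sentence")                 # ≈ $0.000114
print(f"{1.00 / per_sentence:,.0f} sentences per dollar")  # = 8,800
```

So a dollar a month covers roughly 8,800 classified sentences, before even counting prompt-caching discounts.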
Additionally, I explored the new Responses API extensively but ultimately stuck with the tried-and-true ChatCompletion API: Responses cannot be tested in the current version of promptfoo, and its performance seemed worse than ChatCompletion's (though again, that's hard to verify with promptfoo; any gains would be a slight cost reduction, fractions of a penny per thousand uses).
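For reference, the ChatCompletion approach looks roughly like this. This is a sketch: the model name comes from this PR, but the prompt text and function names here are hypothetical stand-ins, not the contents of formfyxer/prompts/passive_voice.txt or the module's real API.

```python
# Hypothetical stand-in for the real prompt in formfyxer/prompts/passive_voice.txt.
PROMPT = (
    "Decide whether the sentence below is in the passive voice. "
    "Answer with exactly one word: passive or active.\n\n"
    "Sentence: {sentence}"
)

def parse_label(raw: str) -> bool:
    """Interpret the model's one-word reply as a boolean."""
    return raw.strip().lower() == "passive"

def is_passive(sentence: str) -> bool:
    # Deferred import so the module loads even without the OpenAI SDK installed.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
    response = client.chat.completions.create(
        model="gpt-5-nano",
        messages=[{"role": "user", "content": PROMPT.format(sentence=sentence)}],
    )
    return parse_label(response.choices[0].message.content)
```

Keeping the prompt prefix identical across calls is what lets provider-side prompt caching kick in, which is part of why the per-sentence cost stays so low.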
Progress toward #145