Transcriptions V1 Test Plan
Mephistic edited this page Apr 17, 2025 · 3 revisions
The following instructions let you test the scraping half of the implementation. The webhook half of the implementation needs to be on the development project for an end-to-end test.
To perform this test you will need:
- An assembly API key in your local environment (Ask Nathan Sanders or Matt Victor for this)
- IAM permissions for the maple dev project in the Google Cloud Console (Ask Matt King for this)
- A logged in and working firebase-admin session
- A local copy of github.com/codeforboston/maple with installed deps
Instructions
- Set up an ngrok tunnel for the local copy of the transcription webhook
- Get an account at https://dashboard.ngrok.com
- Connect your ngrok account
ngrok config add-authtoken your-auth-token
- Start the local function emulator with
yarn run dev:functions
- Open an ngrok tunnel:
ngrok http http://localhost:5001
- Find a hearing that has a video within the last 8 days. In this example we’ll use hearing-5091.
- Modify functions/src/events/scrapeEvents.ts to create a local testing state:
- Hardcode
const EventId = 5091
into the beginning of getEvent
- Change the webhook endpoint in the assembly api call to the ngrok endpoint:
https://put-your-ngrok-tunnel-id-here.ngrok-free.app/demo-dtp/us-central1/transcription
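To avoid hand-editing the webhook URL each time ngrok issues a new tunnel, the endpoint can be assembled from the tunnel host. This is a minimal sketch, not code from the maple repo: the helper name buildWebhookUrl and the TRANSCRIPTION_PATH constant are hypothetical, and the path is copied from the example URL in the step above.

```typescript
// Path of the transcription function, copied from the example URL in this test plan.
const TRANSCRIPTION_PATH = "/demo-dtp/us-central1/transcription"

// Hypothetical helper: accepts either a bare ngrok host or the full forwarding URL
// printed by ngrok and returns the webhook endpoint to use in the assembly api call.
function buildWebhookUrl(tunnel: string): string {
  const host = tunnel
    .trim()
    .replace(/^https?:\/\//, "") // drop the scheme if the full URL was pasted
    .replace(/\/+$/, "") // drop any trailing slashes
  if (host.length === 0) {
    throw new Error("ngrok tunnel host is empty")
  }
  return `https://${host}${TRANSCRIPTION_PATH}`
}
```

For example, buildWebhookUrl("put-your-ngrok-tunnel-id-here.ngrok-free.app") yields the URL shown in the step above.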
- Run the firebase admin console on the dev project with
yarn firebase-admin -e dev console
- From inside the admin repl run the following commands with the id of your hearing (using 5091 in this example):
const resp = await db.collection("events").doc("hearing-5091").get()
const data = resp.data()
JSON.stringify(data)
- Copy the output of JSON.stringify(data) without the surrounding quotes (so you have a JS object) and save it somewhere for the next step
- Exit the repl with .exit
- Start your local firebase emulator with
yarn dev:functions
- While that emulator is running, open a second shell
- Connect to the local firebase admin with
yarn firebase-admin -e local console
- Load the data into your local instance with
await db.collection('events').doc('hearing-5091').set(data)
where data is the JS object on your clipboard. It should look something like
await db.collection('events').doc('hearing-5091').set({"startsAt":{"_seconds":1741881600,"_nanoseconds":0},"id":"hearing-5091","type":"hearing","content":{"EventDate":"2025-03-13T12:00:00","HearingAgendas":[{"DocumentsInAgenda":[],"Topic":"Annual Hearing on a Potential Modification of the Health Care Cost Growth Benchmark","StartTime":"2025-03-13T12:00:00","EndTime":"2025-03-13T15:00:00"}],"RescheduledHearing":null,"StartTime":"2025-03-13T12:00:00","EventId":5091,"HearingHost":{"Details":"http://malegislature.gov/api/GeneralCourts/194/Committees/J24","CommitteeCode":"J24","GeneralCourtNumber":194},"Name":"Joint Committee on Health Care Financing","Description":"Joint Committee on Health Care Financing & Health Policy Commission Public Hearing on a Potential Modification of the CY 2026 Health Care Cost Growth Benchmark","Location":{"AddressLine2":null,"AddressLine1":"24 Beacon Street","State":"MA","ZipCode":"02133","City":"Boston","LocationName":"Gardner Auditorium"},"Status":"Completed"},"fetchedAt":{"_seconds":1741967707,"_nanoseconds":728000000}})
- Exit the repl with .exit
- In the local emulator UI edit hearing-5091 (http://localhost:3010/firestore/data/events/hearing-5091)
- Delete startsAt and fetchedAt, and replace them with timestamp type fields of the same property names, set to the current time.
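The manual edit is needed because JSON.stringify turns Firestore timestamps into plain {"_seconds", "_nanoseconds"} objects, as in the example data above. As a possible alternative to editing in the emulator UI, the fields could be swapped for the current time before calling set(). The sketch below is an assumption-laden illustration: refreshTimestamps and isSerializedTimestamp are hypothetical names, only top-level fields are handled, and it relies on the Firestore Node.js SDK storing JS Date values as timestamps.

```typescript
// Shape a Firestore timestamp takes after JSON.stringify in the admin repl.
type SerializedTimestamp = { _seconds: number; _nanoseconds: number }

// Type guard: true only for plain objects carrying both numeric timestamp parts.
function isSerializedTimestamp(value: unknown): value is SerializedTimestamp {
  if (typeof value !== "object" || value === null) return false
  const v = value as Record<string, unknown>
  return typeof v._seconds === "number" && typeof v._nanoseconds === "number"
}

// Hypothetical helper: returns a shallow copy of the document data in which every
// top-level serialized timestamp is replaced by `now`. Other fields are copied as-is.
function refreshTimestamps(
  data: Record<string, unknown>,
  now: Date = new Date()
): Record<string, unknown> {
  const result: Record<string, unknown> = {}
  for (const [key, value] of Object.entries(data)) {
    result[key] = isSerializedTimestamp(value) ? now : value
  }
  return result
}
```

With the type annotations removed so it can be pasted into the JS repl, this would be used as await db.collection('events').doc('hearing-5091').set(refreshTimestamps(data)). Check the result in the emulator UI to confirm the fields show up as timestamp type.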
- Back in the yarn dev:functions emulator, run scrapeHearings()
- Check the local hearing-5091 document with
const resp = await db.collection("events").doc("hearing-5091").get()
resp.data()
- Look for the following props on the hearing-5091 doc:
- videoURL: location of the video hosted by the MA Legislature
- videoFetchedAt: timestamp of when the video url was fetched
- videoAssemblyId: id of the transcript in the Assembly SaaS API
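Rather than scanning the resp.data() output by eye, the three props listed above can be checked programmatically. This is only a sketch: missingVideoProps is a hypothetical helper, not something in the maple codebase, and it treats a prop as missing when it is undefined or null.

```typescript
// Props the scraper is expected to add to the hearing doc, per this test plan.
const EXPECTED_VIDEO_PROPS = ["videoURL", "videoFetchedAt", "videoAssemblyId"] as const

// Hypothetical helper: returns the names of expected props that are absent
// (undefined or null) on the document data. An empty array means the check passed.
function missingVideoProps(doc: Record<string, unknown> | undefined): string[] {
  if (!doc) return [...EXPECTED_VIDEO_PROPS]
  return EXPECTED_VIDEO_PROPS.filter(
    prop => doc[prop] === undefined || doc[prop] === null
  )
}
```

In the local admin repl (with the type annotations removed), missingVideoProps(resp.data()) returning an empty array suggests the scrape populated all three fields.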
- Wait for the assembly job to finish (5-10 minutes usually)
- Check the transcriptions collection and look for a new transcript for hearing-5091: