Transcriptions V1 Test Plan

Mephistic edited this page Apr 17, 2025 · 3 revisions

Local testing guide

The following instructions let you test the scraping half of the implementation. The webhook half of the implementation needs to be on the development project for an end-to-end test.

To perform this test you will need:

  • An Assembly API key in your local environment (ask Nathan Sanders or Matt Victor for this)
  • IAM permissions for the maple dev project in the Google Cloud Console (ask Matt King for this)
  • A logged-in, working firebase-admin session
  • A local copy of github.com/codeforboston/maple with installed deps

Instructions

  1. Set up an ngrok tunnel for the local copy of the transcription webhook
    1. Get an account at https://dashboard.ngrok.com
    2. Connect your ngrok account: ngrok config add-authtoken your-auth-token
    3. Start the local function emulator with yarn run dev:functions
    4. Open an ngrok tunnel: ngrok http http://localhost:5001
  2. Find a hearing that has a video within the last 8 days. In this example we’ll use hearing-5091.
  3. Modify functions/src/events/scrapeEvents.ts to create a local testing state
    1. Hardcode const EventId = 5091 at the beginning of getEvent
    2. Change the webhook endpoint in the Assembly API call to the ngrok endpoint: https://put-your-ngrok-tunnel-id-here.ngrok-free.app/demo-dtp/us-central1/transcription
  4. Run the firebase admin console on the dev project with yarn firebase-admin -e dev console
    1. From inside the admin repl, run the following commands with the id of your hearing (using 5091 in this example):
      1. const resp = await db.collection("events").doc("hearing-5091").get()
      2. const data = resp.data()
      3. JSON.stringify(data)
    2. Copy the output of JSON.stringify(data) without the surrounding quotes (so you have a JS object) and save it somewhere for the next step
    3. Exit the repl with .exit
  5. Start your local firebase emulator with yarn dev:functions
    1. While that emulator is running, open a second shell
      1. Connect to the local firebase admin with yarn firebase-admin -e local console
      2. Load the data into your local instance with await db.collection('events').doc('hearing-5091').set(data) where data is the js object on your clipboard. It should look something like await db.collection('events').doc('hearing-5091').set({"startsAt":{"_seconds":1741881600,"_nanoseconds":0},"id":"hearing-5091","type":"hearing","content":{"EventDate":"2025-03-13T12:00:00","HearingAgendas":[{"DocumentsInAgenda":[],"Topic":"Annual Hearing on a Potential Modification of the Health Care Cost Growth Benchmark","StartTime":"2025-03-13T12:00:00","EndTime":"2025-03-13T15:00:00"}],"RescheduledHearing":null,"StartTime":"2025-03-13T12:00:00","EventId":5091,"HearingHost":{"Details":"http://malegislature.gov/api/GeneralCourts/194/Committees/J24","CommitteeCode":"J24","GeneralCourtNumber":194},"Name":"Joint Committee on Health Care Financing","Description":"Joint Committee on Health Care Financing & Health Policy Commission Public Hearing on a Potential Modification of the CY 2026 Health Care Cost Growth Benchmark","Location":{"AddressLine2":null,"AddressLine1":"24 Beacon Street","State":"MA","ZipCode":"02133","City":"Boston","LocationName":"Gardner Auditorium"},"Status":"Completed"},"fetchedAt":{"_seconds":1741967707,"_nanoseconds":728000000}})
      3. Exit the repl with .exit
    2. In the local emulator UI, edit hearing-5091 (http://localhost:3010/firestore/data/events/hearing-5091)
      1. Delete startsAt and fetchedAt, and replace them with timestamp-type fields that have the same property names and are set to the current time.
    3. Back in the yarn dev:functions emulator, run scrapeHearings()
    4. Check the local hearing-5091 document with
      1. const resp = await db.collection("events").doc("hearing-5091").get()
      2. resp.data()
    5. Look for the following props on the hearing-5091 doc:
      1. videoURL: location of the video hosted by the MA Legislature
      2. videoFetchedAt: timestamp of when the video url was fetched
      3. videoAssemblyId: id of the transcript in the Assembly SaaS API
  6. Wait for the Assembly job to finish (usually 5-10 minutes)
  7. Check the transcriptions collection and look for a new transcript for hearing-5091:
    1. https://console.cloud.google.com/firestore/databases/-default-/data/panel/transcriptions/17c91397-c023-4f28-a621-4cef45c70749?authuser=1&project=digital-testimony-dev
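Two of the manual steps above (recreating the timestamp fields in the emulator UI, and checking the hearing doc for the new video props) can be sketched as small helpers. This is only an illustration: withFreshTimestamps and missingVideoProps are hypothetical names that do not exist in the maple repo, and the assumption that the two timestamp fields are the top-level startsAt and fetchedAt comes from the sample document in step 5.

```typescript
// Hypothetical helpers for the local test; neither function is part of maple.

type EventDoc = Record<string, unknown>

// Assumption: the timestamp fields to recreate are the top-level
// startsAt and fetchedAt fields seen in the sample hearing-5091 document.
const TIMESTAMP_FIELDS = ["startsAt", "fetchedAt"] as const

/** Returns a shallow copy of the pasted document with its timestamp fields set to `now`. */
function withFreshTimestamps(doc: EventDoc, now: Date = new Date()): EventDoc {
  const copy: EventDoc = { ...doc }
  for (const field of TIMESTAMP_FIELDS) {
    // Overwrites the {"_seconds", "_nanoseconds"} map left behind by JSON.stringify.
    copy[field] = now
  }
  return copy
}

// Props the checklist above says to look for after scrapeHearings() runs.
const VIDEO_PROPS = ["videoURL", "videoFetchedAt", "videoAssemblyId"] as const

/** Lists which of the expected video props are absent (undefined or null) on a document. */
function missingVideoProps(doc: EventDoc | undefined): string[] {
  return VIDEO_PROPS.filter(prop => doc?.[prop] == null)
}
```

In the local admin repl, and with the type annotations removed if the repl only accepts plain JavaScript, this might be used as await db.collection('events').doc('hearing-5091').set(withFreshTimestamps(data)), since the Firestore Node.js SDK stores JavaScript Date values as timestamp fields. After the scrape, missingVideoProps(resp.data()) returning an empty array would mean all three props are present.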