
feat: refactor recomendations to use user taste directly #435

Open
iFraan wants to merge 28 commits into fredrikburmester:main from iFraan:feat/refactor-recommendation-system

Conversation

@iFraan
Contributor

@iFraan iFraan commented Feb 27, 2026

Refactor recommendations to use new "user taste" embedding

Instead of building a pseudo user taste from the most recent watches, this calculates an embedding from the full session history.
Older watches still matter, just less than recent ones. Sessions can also carry a negative weight if the user abandons them quickly.

It generates a user taste embedding and compares it directly to item embeddings.
For me it seems to improve recommendations, but basedOn items are lost (since it compares embeddings directly).

Summary by Sourcery

Refactor personalized recommendations to use precomputed user taste embeddings and add infrastructure to compute, store, and schedule these user embedding profiles.

New Features:

  • Introduce a user_embeddings table and relations to store per-user taste profile vectors with metadata.
  • Add a job and scheduler endpoints to compute and regularly sync user embedding profiles per server using watch history.
  • Provide a shared recommendation engine that retrieves recommendations via a single vector search against user taste profiles for movies and series.

Enhancements:

  • Simplify movie and series recommendation queries to delegate to the centralized profile-based recommendation engine, improving performance and consistency.
  • Adjust recommendation API and types so recommendations can omit basedOn data while remaining backward compatible.
  • Normalize and share vector utility helpers across job-server and Next.js app codebases.

Build:

  • Register the user embedding calculation job with the job queue and cron defaults so it runs nightly and can be manually triggered per server.

Summary by Sourcery

Refactor personalized recommendations to use precomputed per-user taste embeddings and introduce infrastructure to compute, store, and schedule these profiles while adapting APIs and UI to the new engine.

New Features:

  • Add a user_embeddings table and relations to persist per-user taste profile vectors with metadata.
  • Introduce a background job, scheduler wiring, and HTTP endpoint to calculate and sync user taste embeddings per server using watch history.
  • Expose a server settings control to manually trigger user embedding generation from the UI.

Enhancements:

  • Replace ad-hoc per-item similarity logic with a centralized recommendation engine that queries user taste profiles via a single vector search for movies and series.
  • Update recommendation APIs, chat tools, and dashboard widgets to consume the new profile-based recommendations while allowing optional basedOn data.
  • Share vector utilities for normalization and pgvector literal creation across database and job server code.
  • Clear user embeddings when item embeddings are reset to avoid stale taste profiles.

Build:

  • Register the user-embeddings-sync cron job and pg-boss worker so user profiles are computed on a schedule.

iFraan and others added 10 commits February 23, 2026 22:42
…mmendations

Replace per-request N+1 similarity searches with batch-precomputed user
taste profiles. Each user now has a single unified embedding vector that
combines their movie and series watch history, enabling single HNSW-indexed
vector searches for recommendations.

- Add user_embeddings table to store pre-computed taste profiles
- Add nightly batch job to compute user embeddings from watch history
- Create recommendation-engine.ts with profile-based recommendation logic
- Refactor movie and series recommendations to use pre-computed profiles
- Apply recency decay and bounce penalty weights in profile computation
- Add freshness boost for recently added items in recommendations
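
The freshness boost in the last bullet can be pictured as a small bonus added on top of the cosine similarity. A minimal sketch, assuming a hypothetical 30-day window and a flat 0.05 bonus; neither the values nor the function names come from the PR:

```typescript
// Hypothetical constants: the PR does not state the actual window or boost size.
const FRESHNESS_WINDOW_DAYS = 30;
const FRESHNESS_BOOST = 0.05;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

interface Candidate {
  id: string;
  similarity: number; // cosine similarity against the user taste vector
  createdAt: Date; // when the item was added to the library
}

// Adds a flat bonus to items that were added within the freshness window.
function scoreWithFreshness(candidate: Candidate, now: Date): number {
  const ageDays = (now.getTime() - candidate.createdAt.getTime()) / MS_PER_DAY;
  const isFresh = ageDays >= 0 && ageDays <= FRESHNESS_WINDOW_DAYS;
  return candidate.similarity + (isFresh ? FRESHNESS_BOOST : 0);
}

// Ranks candidates by boosted score, highest first, without mutating the input.
function rankCandidates(candidates: Candidate[], now: Date): Candidate[] {
  return [...candidates].sort(
    (a, b) => scoreWithFreshness(b, now) - scoreWithFreshness(a, now),
  );
}
```

With these assumed values, a recently added item at similarity 0.60 would rank above an older item at 0.62.
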

Add migration to create user_embeddings table with vector column for storing
pre-computed user preference embeddings. Includes foreign keys to users and
servers tables, unique constraint on user-server pairs, and index on server_id
for efficient queries.

Make the select shape internal to the module as it's only used within
the recommendation engine and doesn't need to be part of the public API.

Add manual trigger for user embeddings calculation per server:
- New triggerServerUserEmbeddingsSync method in SyncScheduler class
- POST /scheduler/trigger-user-embeddings-sync endpoint
- Validates server existence before queueing job

…agement

Replace maxPercentComplete metric with totalPlayDuration and expectedRuntime
for more accurate engagement measurement in user embedding calculations.
Also adjust recency decay half-life from ~70 days to ~200 days for more
forgiving recommendations.

The user embeddings feature has been implemented with the user_embeddings
table, sync job, and scheduler endpoint. This planning document is no
longer needed.

- Move vector math functions (`normalizeVector`, `toPgVectorLiteral`) from embedding jobs to shared `utils/vector.ts` module
- Update both job-server and nextjs-app to import from the new shared location
- Add dimension mismatch logging and configurable series engagement threshold
- Improve recommendation engine with similarity filtering and exclusion list capping

Add minimum completion thresholds to avoid penalizing series and movies
incorrectly. Introduce SERIES_BOUNCE_THRESHOLD (0.15) and
MOVIES_BOUNCE_THRESHOLD (0.10) constants, replacing hardcoded values.

For series, calculate average completion ratio using total play duration
and expected runtime. Apply bounce penalty (-0.3) when average completion
falls below threshold, preventing low-engagement samples from inflating
user profile weights. This aligns series engagement logic with existing
movie bounce detection.

Query additional fields (totalPlayDuration, avgEpisodeRuntimeTicks) to
support runtime calculations for weight determination.
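
The weighting described in these commit messages can be sketched as follows. The two thresholds and the -0.3 penalty are the values named above; the exact half-life constant, the function names, and the way completion and recency are multiplied together are assumptions made for illustration:

```typescript
// Values named in the commit messages above.
const SERIES_BOUNCE_THRESHOLD = 0.15;
const MOVIES_BOUNCE_THRESHOLD = 0.1;
const BOUNCE_PENALTY = -0.3;
// "~200 days" per the commit message; the exact constant is an assumption.
const RECENCY_HALF_LIFE_DAYS = 200;

type MediaType = "Movie" | "Series";

// Exponential decay: a watch loses half of its weight every half-life.
function recencyDecay(ageDays: number, halfLifeDays = RECENCY_HALF_LIFE_DAYS): number {
  return Math.pow(0.5, Math.max(0, ageDays) / halfLifeDays);
}

// Completion ratio from total play time and expected runtime (same unit).
function completionRatio(totalPlayDuration: number, expectedRuntime: number): number {
  return expectedRuntime > 0 ? totalPlayDuration / expectedRuntime : 0;
}

// Hypothetical combination: bounced watches get the fixed negative weight,
// everything else gets its completion ratio scaled by recency.
function watchWeight(
  type: MediaType,
  totalPlayDuration: number,
  expectedRuntime: number,
  ageDays: number,
): number {
  const ratio = completionRatio(totalPlayDuration, expectedRuntime);
  const threshold = type === "Series" ? SERIES_BOUNCE_THRESHOLD : MOVIES_BOUNCE_THRESHOLD;
  if (ratio < threshold) return BOUNCE_PENALTY;
  return ratio * recencyDecay(ageDays);
}
```

Under these assumptions a fully watched movie from 200 days ago contributes half the weight of one watched today, while a movie abandoned at 5% contributes -0.3 regardless of age.
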
@sourcery-ai
Contributor

sourcery-ai bot commented Feb 27, 2026

Reviewer's Guide

Refactors personalized recommendations to use precomputed per-user taste embeddings stored in a new user_embeddings table, introduces a scheduled job and manual trigger to compute/update these embeddings from watch history, centralizes profile-based recommendation queries in a shared engine, and updates APIs, UI, and AI tools to consume the new profile-based recommendations while keeping item-to-item similarity and hiding logic intact.

Sequence diagram for profile-based personalized recommendations

sequenceDiagram
  actor User
  participant Client as ClientApp
  participant API as NextjsAPI_recommendations
  participant Stats as SimilarStatistics_getSimilarStatistics
  participant Engine as RecommendationEngine_getProfileRecommendations
  participant DB as Database

  User->>Client: Request personalized recommendations
  Client->>API: GET /api/recommendations
  API->>Stats: getSimilarStatistics({ serverId, userId, type, limit, offset })

  alt userId missing
    Stats->>API: []
    API->>Client: Empty recommendations
  else userId resolved
    Stats->>Engine: getProfileRecommendations(serverId, userId, type, limit, offset)

    Engine->>DB: SELECT embedding FROM user_embeddings WHERE userId AND serverId
    alt no user embedding profile
      DB-->>Engine: []
      Engine-->>Stats: []
      Stats-->>API: []
      API-->>Client: Empty recommendations
    else profile exists
      DB-->>Engine: user embedding vector
      par load exclusions
        Engine->>DB: SELECT hiddenRecommendations WHERE userId AND serverId
        Engine->>DB: SELECT DISTINCT sessions.itemId (movies)
        Engine->>DB: SELECT DISTINCT sessions.seriesId (series)
      and vector search
        Engine->>DB: Vector search on items.embedding using user profile
      end
      DB-->>Engine: Candidate items with similarity
      Engine-->>Stats: RecommendationResult[]
      Stats-->>API: RecommendationResult[]
      API-->>Client: Recommendations with similarity and reasons
    end
  end
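
The control flow in the diagram can be sketched in TypeScript with the database access hidden behind an interface. `RecommendationStore` and `getProfileRecommendationsSketch` are hypothetical names used only for this illustration; the sketch also loads the exclusion lists first and then passes them to the search, which is an assumption about ordering:

```typescript
interface RecommendationResult {
  itemId: string;
  similarity: number;
}

// Hypothetical data-access interface; the real engine queries Postgres directly.
interface RecommendationStore {
  loadUserEmbedding(serverId: number, userId: string): Promise<number[] | null>;
  loadHiddenIds(serverId: number, userId: string): Promise<string[]>;
  loadWatchedIds(serverId: number, userId: string): Promise<string[]>;
  vectorSearch(
    serverId: number,
    profile: number[],
    excludeIds: string[],
    limit: number,
    offset: number,
  ): Promise<RecommendationResult[]>;
}

async function getProfileRecommendationsSketch(
  store: RecommendationStore,
  serverId: number,
  userId: string,
  limit: number,
  offset: number,
): Promise<RecommendationResult[]> {
  // No taste profile yet (for example too little watch history): nothing to recommend.
  const profile = await store.loadUserEmbedding(serverId, userId);
  if (!profile || profile.length === 0) return [];

  // Hidden and watched IDs are independent lookups, so fetch them concurrently.
  const [hiddenIds, watchedIds] = await Promise.all([
    store.loadHiddenIds(serverId, userId),
    store.loadWatchedIds(serverId, userId),
  ]);
  const excludeIds = Array.from(new Set([...hiddenIds, ...watchedIds]));

  return store.vectorSearch(serverId, profile, excludeIds, limit, offset);
}
```
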

ER diagram for new user_embeddings taste profile table

erDiagram
  users {
    text id PK
    integer serverId FK
    timestamp lastActivityDate
  }

  servers {
    integer id PK
    integer embeddingDimensions
  }

  userEmbeddings {
    serial id PK
    text userId FK
    integer serverId FK
    vector embedding
    integer itemCount
    timestamp lastCalculatedAt
    timestamp createdAt
    timestamp updatedAt
  }

  items {
    text id PK
    integer serverId FK
    vector embedding
    text type
    timestamp createdAt
  }

  sessions {
    integer id PK
    integer serverId FK
    text userId FK
    text itemId FK
    text seriesId
    integer playDuration
    integer runtimeTicks
    timestamp endTime
  }

  hiddenRecommendations {
    integer id PK
    integer serverId FK
    text userId FK
    text itemId FK
  }

  users ||--o{ sessions : has
  servers ||--o{ sessions : has

  users ||--o{ userEmbeddings : has
  servers ||--o{ userEmbeddings : has

  servers ||--o{ items : has

  users ||--o{ hiddenRecommendations : has
  servers ||--o{ hiddenRecommendations : has
  items ||--o{ hiddenRecommendations : has

Class diagram for recommendation engine and user embedding job

classDiagram
  class RecommendationCardItem {
    +string id
    +string name
    +string type
    +number productionYear
    +number runtimeTicks
    +string[] genres
    +number communityRating
    +string primaryImageTag
    +string primaryImageThumbTag
    +string primaryImageLogoTag
    +string[] backdropImageTags
    +string seriesId
    +string seriesPrimaryImageTag
    +string parentBackdropItemId
    +string[] parentBackdropImageTags
    +string parentThumbItemId
    +string parentThumbImageTag
  }

  class RecommendationResult {
    +RecommendationCardItem item
    +number similarity
    +RecommendationCardItem[] basedOn
  }

  class RecommendationEngine {
    +getProfileRecommendations(serverId number, userId string, targetType string, limit number, offset number) RecommendationResult[]
  }

  class UserEmbeddingRecord {
    +number id
    +string userId
    +number serverId
    +number[] embedding
    +number itemCount
    +Date lastCalculatedAt
    +Date createdAt
    +Date updatedAt
  }

  class CalculateUserEmbeddingsJobData {
    +number serverId
  }

  class WatchedItemForProfile {
    +string itemId
    +number[] embedding
    +string type
    +string seriesId
    +number totalPlayDuration
    +number expectedRuntime
    +number lastWatchedMs
    +number episodeCount
  }

  class UserEmbeddingsJob {
    +calculateUserEmbeddingsJob(job PgBossJob) Promise
    -ensureUserEmbeddingIndex(dimensions number) Promise
    -computeUserProfile(serverId number, userId string, now number) object
  }

  class VectorUtils {
    +normalizeVector(vec number[]) number[]
    +toPgVectorLiteral(value number[]) string
  }

  class DatabaseTables {
    +userEmbeddings
    +items
    +sessions
    +hiddenRecommendations
  }

  RecommendationEngine --> RecommendationResult
  RecommendationResult --> RecommendationCardItem
  RecommendationEngine --> DatabaseTables
  RecommendationEngine --> VectorUtils

  UserEmbeddingsJob --> UserEmbeddingRecord
  UserEmbeddingsJob --> WatchedItemForProfile
  UserEmbeddingsJob --> DatabaseTables
  UserEmbeddingsJob --> VectorUtils

  DatabaseTables --> UserEmbeddingRecord

File-Level Changes

Change: Replace per-request watch-history-based movie recommendations with profile-based recommendations backed by precomputed user embeddings.
Details:
  • Remove getUserSpecificRecommendations and related watch-history aggregation from similar-statistics.ts, delegating user recommendations to a new getProfileRecommendations helper
  • Change getSimilarStatistics to accept a parameter object including type (Movie/Series/all) and call the profile-based recommender, falling back to current user via getMe when userId is omitted
  • Update error handling in recommendation flows to use console.error instead of debugLog and simplify similar-items query logic
Files:
  • apps/nextjs-app/lib/db/similar-statistics.ts

Change: Replace per-request watch-history-based series recommendations with profile-based recommendations and simplify series similarity endpoints.
Details:
  • Remove getUserSpecificSeriesRecommendations and series-specific watch-history aggregation from similar-series-statistics.ts
  • Drop debug logging machinery and reuse shared RecommendationCardItem/RecommendationResult types from the new recommendation engine
  • Keep getSimilarSeriesForItem but simplify similarity query and error logging
Files:
  • apps/nextjs-app/lib/db/similar-series-statistics.ts

Change: Introduce a shared recommendation engine that performs single vector searches against user taste embeddings with freshness boosting and watched/hidden exclusions.
Details:
  • Add recommendation-engine.ts with a shared RecommendationCardItem/RecommendationResult model and getProfileRecommendations function
  • Implement a single pgvector cosine search against userEmbeddings.embedding joined to items, with optional Movie/Series filtering
  • Apply freshness boost to recently added items and enforce minimum similarity and capped exclusion list sizes for performance
Files:
  • apps/nextjs-app/lib/db/recommendation-engine.ts

Change: Add a user_embeddings table with relations and vector utilities to support per-user taste profiles.
Details:
  • Create userEmbeddings pgTable with embedding vector, itemCount, lastCalculatedAt, timestamps, unique (userId, serverId) constraint, and index on serverId
  • Wire up relations from servers and users to userEmbeddings and define userEmbeddingsRelations
  • Export UserEmbedding/NewUserEmbedding types and share vector utilities (normalizeVector, toPgVectorLiteral) via a new vector.ts module and database package exports
Files:
  • packages/database/src/schema.ts
  • packages/database/src/vector.ts
  • packages/database/src/index.ts
  • packages/database/package.json
  • packages/database/drizzle/0042_deep_grandmaster.sql
  • packages/database/drizzle/meta/0042_snapshot.json
  • packages/database/drizzle/meta/_journal.json
  • packages/database/dist/*

Change: Implement a scheduled/background job pipeline to compute and maintain user embeddings per server, including manual triggering from the UI and API.
Details:
  • Add calculateUserEmbeddingsJob in user-embedding-job.ts that aggregates movies and series sessions into weighted, recency-decayed user taste vectors with bounce penalties and HNSW index management
  • Register USER_EMBEDDING_JOB_NAME with the job queue and worker registration, and add a cron-based user-embeddings-sync job default
  • Extend the scheduler and HTTP routes to enqueue user-embeddings-sync per server, and expose a Next.js server helper triggerUserEmbeddingsSync plus an EmbeddingsManager UI button to manually trigger user embeddings recomputation
Files:
  • apps/job-server/src/jobs/user-embedding-job.ts
  • apps/job-server/src/jobs/queue.ts
  • apps/job-server/src/jobs/index.ts
  • packages/database/src/job-defaults.ts
  • apps/job-server/src/jobs/scheduler.ts
  • apps/job-server/src/routes/jobs/scheduler.ts
  • apps/nextjs-app/lib/db/server.ts
  • apps/nextjs-app/app/(app)/servers/[id]/(auth)/settings/EmbeddingsManager.tsx

Change: Update recommendation APIs, dashboard, and chat tools to consume the new profile-based recommender and relaxed basedOn semantics.
Details:
  • Change API route for /api/recommendations to call getSimilarStatistics with the new options object and use profile-based results for both movies and series, guarding against missing basedOn data
  • Update dashboard components to fetch movie/series recommendations via the new getSimilarStatistics signature
  • Adjust recommendation list types so basedOn is optional and ensure AI chat tools use getSimilarStatistics({ serverId, userId, limit, type }) and reason generation that does not rely on always-present basedOn items
Files:
  • apps/nextjs-app/app/api/recommendations/route.ts
  • apps/nextjs-app/app/(app)/servers/[id]/(auth)/dashboard/SimilarStatistics.tsx
  • apps/nextjs-app/app/(app)/servers/[id]/(auth)/dashboard/SimilarSeriesStatistics.tsx
  • apps/nextjs-app/app/(app)/servers/[id]/(auth)/dashboard/page.tsx
  • apps/nextjs-app/app/(app)/servers/[id]/(auth)/dashboard/recommendation-types.ts
  • apps/nextjs-app/lib/ai/tools.ts

Change: Keep item embedding management consistent with user embeddings and improve error logging.
Details:
  • Ensure clearEmbeddings also deletes userEmbeddings rows for the server to avoid stale/mismatched dimensional spaces when item embeddings are reset
  • Replace custom debugLog usage with console.error in recommendation hiding flows
  • Rename the EmbeddingsManager section title from "Movie Embeddings" to "Item Embeddings" for clarity
Files:
  • apps/nextjs-app/lib/db/server.ts
  • apps/nextjs-app/lib/db/similar-statistics.ts
  • apps/nextjs-app/lib/db/similar-series-statistics.ts
  • apps/job-server/src/jobs/embedding-jobs.ts
  • apps/nextjs-app/app/(app)/servers/[id]/(auth)/settings/EmbeddingsManager.tsx


@fredrikburmester
Owner

fredrikburmester commented Feb 27, 2026

How can you compare an arbitrary "user watch" with items? Those vectors won't be similar at all, no? The point of the current system is that we pull items deterministically based on user sessions and then find matching items with embeddings, comparing items with items, giving good results.

I think the improvement should be to filter out the list of user watched items before doing the embedding comparison, excluding items not watched fully or similar.

But I'm very open to improvements so if this actually works, I'd like to know more about how! Can you explain it a bit more in detail?

Edit: Maybe comparing user sessions to items actually works and I'm just too restrictive in what I think embeddings can do haha.

Convert freshness threshold to ISO string for proper date comparison in SQL query, ensuring consistent behavior across different database systems.
@iFraan
Contributor Author

iFraan commented Feb 28, 2026

@fredrikburmester

I was having a problem where all recommended items felt 'generic' and 'not for me'.
I tried a lot of embedding models, even local ones that 'beat' OpenAI.
I've even opened a PR with an implementation for Gemini embeddings at #419.
Look at this screenshot:

image

They have all the same basedOn items.

Now onto the implementation

So, to be clear, I'm not comparing session data against items directly.
The user vector is actually just a weighted average of the item embeddings they've watched, so it lives in the exact same vector space. Think of it as the "center of gravity" of their taste. Comparing it via cosine similarity to other items works well because it's the same vector space.

The user embedding is not an exact average, since we weight several factors that affect the final result, like completion and recency decay.
For completion I opted for total playtime / avg runtime ticks to account for partial watches (now that I think about it, this can also be useful for rewatches, to boost them more).
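
A minimal sketch of the "center of gravity" idea, assuming each watch has already been reduced to one weight (completion and recency combined, negative for bounces). The names are hypothetical and this is not the code from the PR:

```typescript
interface WeightedWatch {
  embedding: number[]; // item embedding of the watched title
  weight: number; // completion and recency combined; negative for bounces
}

// Weighted mean of item embeddings: the "center of gravity" of a user's taste.
function computeTasteVector(watches: WeightedWatch[], dimensions: number): number[] | null {
  const sum = new Array<number>(dimensions).fill(0);
  let totalWeight = 0;
  for (const { embedding, weight } of watches) {
    if (embedding.length !== dimensions) continue; // skip mismatched vectors
    embedding.forEach((value, i) => {
      sum[i] = (sum[i] ?? 0) + value * weight;
    });
    totalWeight += Math.abs(weight);
  }
  return totalWeight === 0 ? null : sum.map((value) => value / totalWeight);
}

// Cosine similarity is meaningful here because the taste vector is built
// from item embeddings and therefore lives in the items' vector space.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}
```

A watch with a larger weight pulls the taste vector toward its item, and a negatively weighted watch pushes it away, so similar items score lower.
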

With that we fabricate a unique user taste thingy. Having it this way gives us a 'true' taste since we are not omitting any watch, and we can filter hidden recommendations or watched items in the final query only. No more post-fetch sort/limit hacks, since it's all handled by the Postgres engine.

Oh, and did I mention it's super fast?

Writing this comment gave me some ideas like hidden recommendations giving negative weight and considering an n+1 query to return the 'basedOn' items.

I would love to hear a review on how it works for you! Really.

@fredrikburmester
Owner

Yeah I agree, I also feel like the recommendations are too generic.. The pre-computed user profile approach seems like a solid improvement. Just a few things:

  • The vector utils (normalizeVector, toPgVectorLiteral) are duplicated identically in both job-server/src/utils/vector.ts and nextjs-app/lib/utils/vector.ts, would be nice to have these in the shared packages/ somewhere instead
  • _timeWindow parameter in getSimilarStatistics is silently ignored which is a breaking change for any callers relying on it
  • There's some dead code left over: the debugLog no-ops, itemCardWithEmbeddingColumns, etc. Worth a cleanup pass
  • Can we align these types: SeriesRecommendationItem[] and as RecommendationItem[]? I don't like casts, because they could hide type mismatches
  • What happens when no user profile exists yet?
  • Small thing: parseInt(m[1]) should be Number.parseInt(m[1], 10) per our conventions
  • Don't do // ─── Constants ─── style heading comments, try to keep comments as "why" explanations rather than this
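
As an illustration of the `Number.parseInt(m[1], 10)` convention from the list above, here is a small sketch. The helper name and the `vector(N)` input format are assumptions; the review does not show where `m[1]` comes from:

```typescript
// Hypothetical helper: extracts N from a Postgres type string such as "vector(1536)".
function parseVectorDimensions(columnType: string): number | null {
  const m = /^vector\((\d+)\)$/.exec(columnType.trim());
  const digits = m?.[1];
  if (digits === undefined) return null;
  // Explicit radix 10: a prefix such as "0x" can never switch the base.
  return Number.parseInt(digits, 10);
}
```

The explicit radix matters for inputs like `"0x10"`, which parse as 16 without a radix and as 0 with radix 10.
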

Other than that i think it's a solid improvement we can merge!

@iFraan
Contributor Author

iFraan commented Mar 3, 2026

@fredrikburmester
In the current implementation users with fewer than 3 watches don't get an embedding, so no recommendations for them.

What if we average all user embeddings to create a 'popular on the server' list or similar? Just for these no-user-data cases.
Maybe it's worth a shot.

iFraan added 6 commits March 3, 2026 15:19
Moves vector normalization and pgvector literal conversion utilities from
individual apps to the shared @streamystats/database package. Also removes
duplicate type definitions in similar-statistics modules by reusing types
from recommendation-engine, removes unused time window parameters, and
cleans up debug logging.

Adds a new database table to store user embeddings with vector data,
supporting the refactored recommendation system. Includes foreign keys
to users and servers tables, a unique constraint on user-server pairs,
and an index on server_id for efficient queries.

…rofile

Add server-wide average embeddings fallback when user has no watch history.
The recommendation system now returns a source field ("user", "server", or "none")
to indicate whether recommendations are personalized or based on server popularity.
Components display appropriate titles and descriptions based on source.
@fredrikburmester
Owner

> @fredrikburmester
>
> In the current implementation users with fewer than 3 watches don't get an embedding, so no recommendations for them.
>
> What if we average all user embeddings to create a 'popular on the server' list or similar? Just for these no-user-data cases.
>
> Maybe it's worth a shot.

Yeah sure, but let's put that in another PR to not bloat this one.

@iFraan
Contributor Author

iFraan commented Mar 3, 2026

> Yeah sure, but let's put that in another PR to not bloat this one.

Already too late haha
What happens now for cold-start users:

≥ 3 users with embeddings → they see "Popular Movies/Series on This Server" (server-average fallback)
< 3 users → section stays hidden (same as before)
API consumers → get an additive source ("user" / "server" / "none") field in the response, no breaking changes
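
The fallback rules listed above can be sketched like this. The minimum of 3 users and the three `source` values come from the comment; the type and function names, and the use of a plain element-wise mean for the server profile, are assumptions:

```typescript
type RecommendationSource = "user" | "server" | "none";

// From the comment above: the fallback needs at least 3 users with embeddings.
const MIN_USERS_FOR_SERVER_FALLBACK = 3;

interface ResolvedProfile {
  source: RecommendationSource;
  embedding: number[] | null;
}

// Element-wise mean of equally sized vectors.
function averageEmbeddings(embeddings: number[][]): number[] | null {
  const first = embeddings[0];
  if (first === undefined) return null;
  const sum = new Array<number>(first.length).fill(0);
  for (const embedding of embeddings) {
    if (embedding.length !== first.length) return null; // refuse mixed dimensions
    embedding.forEach((value, i) => {
      sum[i] = (sum[i] ?? 0) + value;
    });
  }
  return sum.map((value) => value / embeddings.length);
}

// Picks the personal profile when present, otherwise the server average,
// otherwise nothing (the section stays hidden).
function resolveProfile(
  userEmbedding: number[] | null,
  serverUserEmbeddings: number[][],
): ResolvedProfile {
  if (userEmbedding) return { source: "user", embedding: userEmbedding };
  if (serverUserEmbeddings.length >= MIN_USERS_FOR_SERVER_FALLBACK) {
    const average = averageEmbeddings(serverUserEmbeddings);
    if (average) return { source: "server", embedding: average };
  }
  return { source: "none", embedding: null };
}
```
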

Let me just test this a bit and I'll mark the pr as ready to review

@fredrikburmester
Owner

> > Yeah sure, but let's put that in another PR to not bloat this one.
>
> Already too late haha
>
> What happens now for cold-start users:
>
> ≥ 3 users with embeddings → they see "Popular Movies/Series on This Server" (server-average fallback)
>
> < 3 users → section stays hidden (same as before)
>
> API consumers → get an additive source ("user" / "server" / "none") field in the response, no breaking changes
>
> Let me just test this a bit and I'll mark the pr as ready to review

The reason I pushed for that to be a separate PR is because it probably needs more discussion.

  • Why don't users always have access to top items? Why only show it for users with <3 sessions?
  • If there is only one other user on the server, this basically reveals that user's top items lol.

Might be more things I'm not thinking of rn.

Moved recommendation-related interfaces and types from recommendation-engine.ts to a dedicated recommendation-types.ts file to improve code organization and reduce circular dependencies. This change centralizes all recommendation type definitions in one location, making the codebase more maintainable and easier to understand.

The refactoring affects multiple files including the recommendation engine, similar statistics modules, and API routes, which now import types from the new centralized location.
@iFraan
Contributor Author

iFraan commented Mar 4, 2026

> > > Yeah sure, but let's put that in another PR to not bloat this one.
> >
> > Already too late haha
> > What happens now for cold-start users:
> > ≥ 3 users with embeddings → they see "Popular Movies/Series on This Server" (server-average fallback)
> > < 3 users → section stays hidden (same as before)
> > API consumers → get an additive source ("user" / "server" / "none") field in the response, no breaking changes
> > Let me just test this a bit and I'll mark the pr as ready to review
>
> The reason I pushed for that to be a separate PR is because it probably needs more discussion.
>
>   • Why don't users always have access to top items? Why only show it for users with <3 sessions?
>   • If there is only one other user on the server, this basically reveals that user's top items lol.
>
> Might be more things I'm not thinking of rn.

I meant servers with ≥ 3 users, and servers with < 3 users.
The idea behind minimum users was to prevent revealing. Can be set to a higher number.

But you are right. Since we have it, we can also use it to show popular shows to all users.

It deserves its own PR. Reverting the last bit.

@iFraan iFraan marked this pull request as ready for review March 4, 2026 02:06
Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 1 issue, and left some high level feedback:

  • In computeUserProfile's series aggregation, avgEpisodeRuntimeTicks is computed from sessions.runtimeTicks, but runtimeTicks is a property of items, so this should be switched to AVG(items.runtimeTicks) to avoid referencing a non-existent column and to get correct duration data.
  • In getProfileRecommendations, capping the NOT IN exclusion list at MAX_EXCLUSION_LIST_SIZE can silently drop some hidden/watched IDs from the filter; consider switching to a NOT EXISTS join or a temp table approach so you can exclude all relevant IDs without truncation.
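
The `NOT EXISTS` suggestion could look roughly like the query below. This is a sketch only: the snake_case table and column names are guesses based on the ER diagram, and `buildRecommendationQuery` is a hypothetical helper. The `<=>` operator is pgvector's cosine distance, so `1 - distance` gives the cosine similarity:

```typescript
interface SqlQuery {
  text: string;
  values: unknown[];
}

// Builds a parameterized query that excludes hidden and watched items with
// NOT EXISTS subqueries instead of a capped NOT IN list.
// Table and column names are assumptions, not taken from the PR.
function buildRecommendationQuery(
  profileLiteral: string, // e.g. "[0.1,0.2,0.3]"
  serverId: number,
  userId: string,
  limit: number,
): SqlQuery {
  const text = `
    SELECT i.id, 1 - (i.embedding <=> $1::vector) AS similarity
    FROM items i
    WHERE i.server_id = $2
      AND i.embedding IS NOT NULL
      AND NOT EXISTS (
        SELECT 1 FROM hidden_recommendations h
        WHERE h.item_id = i.id AND h.user_id = $3 AND h.server_id = $2
      )
      AND NOT EXISTS (
        SELECT 1 FROM sessions s
        WHERE s.item_id = i.id AND s.user_id = $3 AND s.server_id = $2
      )
    ORDER BY i.embedding <=> $1::vector
    LIMIT $4`;
  return { text, values: [profileLiteral, serverId, userId, limit] };
}
```

Because the exclusions live in the query itself, no ID list is passed in and nothing can be truncated.
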
Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="apps/job-server/src/jobs/embedding-jobs.ts" line_range="149-150" />
<code_context>
   };
 }

-function toPgVectorLiteral(value: number[]): string {
-  return `[${value.join(",")}]`;
-}

</code_context>
<issue_to_address>
**issue (bug_risk):** toPgVectorLiteral is removed here but not re-imported, which will break existing usages in this file.

The helper now lives in `@streamystats/database/vector`, but this file doesn’t import it even though it’s still used later (e.g. for embedding upserts), which will cause a compile-time error. Add the appropriate import (for example `import { toPgVectorLiteral } from "@streamystats/database";`) and rely on that instead of an inline implementation.
</issue_to_address>
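
For reference, the shared helpers could look roughly like this. `toPgVectorLiteral` has the same body as the inline helper removed in the diff above; `normalizeVector` is assumed here to be plain L2 normalization, which the thread does not confirm:

```typescript
// Assumed behavior: scale the vector to unit length (L2 norm).
// The zero vector is returned as an unchanged copy to avoid dividing by zero.
function normalizeVector(vec: number[]): number[] {
  const norm = Math.sqrt(vec.reduce((sum, v) => sum + v * v, 0));
  return norm === 0 ? [...vec] : vec.map((v) => v / norm);
}

// Same body as the inline helper removed in the diff above:
// formats a number array as a pgvector text literal such as "[1,2.5,-3]".
function toPgVectorLiteral(value: number[]): string {
  return `[${value.join(",")}]`;
}
```
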


@quillfires
Contributor

Please view PR #455 before merging this, because I think it is relevant: this PR will change how the recommendation engine behaves.

Inside the SQL queries the system already does:
• vector similarity search
• threshold filtering
• ranking

Meaning only the top matches per base item survive.
So the similarity list passed to the formula is already something like:
[0.71, 0.65, 0.12]
not something huge like:
[0.71, 0.65, 0.40, 0.35, 0.32, 0.18, 0.15, 0.12...]
That means it is good and already behaves like a Top-K mean. The dilution problem mostly happens during cross-genre averaging, not because of noise from dozens of weak items. I have addressed this in my PR.

1 - It fixes cross-genre dilution
2 - It keeps multi-genre matches strong
3 - It requires no schema changes
4 - It is computationally trivial

The real gain will come from raising the candidate threshold from 0.1 to 0.3.
That is actually the most important change.
Cosine similarity usually behaves like this:
0.80+ extremely similar
0.60 strong
0.40 moderate
0.25 weak
<0.20 noise

So 0.1 was letting in a lot of garbage candidates.
0.3 is much healthier.
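
To make the threshold and Top-K mean ideas concrete, here is a small sketch using the similarity list from the example above. It illustrates the two ideas only and is not the formula from #455; the function names are hypothetical:

```typescript
// Threshold proposed in the comment above (previously 0.1).
const CANDIDATE_THRESHOLD = 0.3;

// Drops candidates whose cosine similarity falls below the threshold.
function filterCandidates(similarities: number[], threshold = CANDIDATE_THRESHOLD): number[] {
  return similarities.filter((s) => s >= threshold);
}

// Mean of the k highest similarities, so weak tail values cannot dilute the score.
function topKMean(similarities: number[], k: number): number {
  const top = [...similarities].sort((a, b) => b - a).slice(0, Math.max(0, k));
  if (top.length === 0) return 0;
  return top.reduce((sum, s) => sum + s, 0) / top.length;
}
```

For `[0.71, 0.65, 0.12]`, a threshold of 0.3 drops the 0.12 entry, and the mean of the top two values is about 0.68, compared with about 0.49 for the plain mean of all three.
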

Please review before making drastic changes to the recommendation engine. Keeping a separate table for user taste would still work even with the formula that I have proposed. I also plan to open a new PR with a long-running series bias fix (for users who have watched a series with hundreds of episodes, like The Simpsons, which pollutes the base list with long shows).

@quillfires
Contributor

Before my PR:
image

After my PR:
image

same model, same embeddings, same user

and I believe after long-running series bias fix it would be much better

@fredrikburmester
Owner

A last idea. Would it be possible to have a setting (either personal or system wide) to select which method we use to deliver recommendations?

@iFraan
Contributor Author

iFraan commented Mar 6, 2026

> A last idea. Would it be possible to have a setting (either personal or system wide) to select which method we use to deliver recommendations?

Do you mean pre-pr recommendations or server-wide recommendations vs user-wide recommendations?

For server-wide vs user-wide, we should create a 'server-embedding', both to avoid running the average every time and to maybe weight in other signals like recently added content, or even an admin-curated list. Maybe repurpose the user-embeddings table to hold both.

For pre-PR, I guess it can be done; the question is, should we?

@iFraan
Contributor Author

iFraan commented Mar 6, 2026

@quillfires

Can you try how this PR works for your case?

It avoids comparing "weak similarity" items because it no longer calculates the cosine similarity of recent watches against the entire item catalog.

The new system averages the entire watch history rather than limiting it to the top or last 15 entries.
Naturally, not all history should carry equal weight, so we tinker a bit on how individual sessions are weighted.

> long-running series bias fix (for users who have watched a series with hundreds of episodes, like The Simpsons, which pollutes the base list with long shows)

To fix long-running series bias, this PR measures "engagement" by episodes watched, capped at five, and uses runtime to scale the weights. Beyond five episodes, the episode count no longer matters; only the runtime averages do.

For example, if a user watches less than 15% of an episode on average, it suggests they didn't enjoy it. Instead of just ignoring that episode, the system now actively steers recommendations away from similar series. On the other hand, if a user watches an episode twice (as an average across the series, not just a single re-watch), we amplify the weight to push even more content like it.
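The weighting described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's actual code: `SERIES_FULL_ENGAGEMENT_EPISODES` is named later in this thread, and the 15% and "watched twice" cut-offs come from the comment above, but the half-life, the negative weight, the rewatch boost, and the `SeriesHistory` shape are all assumed values chosen for the example.

```typescript
const SERIES_FULL_ENGAGEMENT_EPISODES = 5; // cap named in the thread
const BOUNCE_COMPLETION = 0.15; // from the thread: under 15% watched on average
const REWATCH_COMPLETION = 2.0; // from the thread: watched twice on average
const HALF_LIFE_DAYS = 90; // assumption: recency half-life
const BOUNCE_WEIGHT = -0.5; // assumption: strength of the negative signal
const REWATCH_BOOST = 1.5; // assumption: amplification for rewatched series

// Hypothetical per-series aggregate of a user's sessions.
interface SeriesHistory {
  embedding: number[];
  episodesWatched: number;
  avgCompletion: number; // watched time / runtime, averaged over episodes (2.0 = watched twice)
  daysSinceLastWatch: number;
}

function seriesWeight(h: SeriesHistory): number {
  // Older watches still count, just less than recent ones.
  const recency = Math.pow(0.5, h.daysSinceLastWatch / HALF_LIFE_DAYS);
  // Abandoned quickly: steer away instead of ignoring.
  if (h.avgCompletion < BOUNCE_COMPLETION) return BOUNCE_WEIGHT * recency;
  // Episode count stops mattering once the cap is reached.
  const engagement =
    Math.min(h.episodesWatched, SERIES_FULL_ENGAGEMENT_EPISODES) /
    SERIES_FULL_ENGAGEMENT_EPISODES;
  const completion = Math.min(h.avgCompletion, 1);
  const boost = h.avgCompletion >= REWATCH_COMPLETION ? REWATCH_BOOST : 1;
  return engagement * completion * boost * recency;
}

// Weighted sum of item embeddings, normalized to unit length.
function tasteProfile(history: SeriesHistory[]): number[] | null {
  if (history.length === 0) return null;
  const dims = history[0].embedding.length;
  const sum = new Array<number>(dims).fill(0);
  for (const h of history) {
    const w = seriesWeight(h);
    for (let i = 0; i < dims; i++) sum[i] += w * h.embedding[i];
  }
  const norm = Math.sqrt(sum.reduce((acc, x) => acc + x * x, 0));
  return norm === 0 ? null : sum.map((x) => x / norm);
}
```

Under these assumptions, a series with 300 watched episodes gets the same weight as one with five, which is how the cap keeps very long shows from dominating the profile.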

The averaging algorithm can be tweaked even more (add admin curated lists, genre-average weight up, etc.) but I personally think moving towards a user taste is a step in the right direction. We can change some details later down the line if needed.

@quillfires
Contributor

As mentioned before, I was working on a separate PR (#455) to fix cross-genre score dilution in the existing N+1 recommendation logic, but after reviewing your approach I'm closing mine in favour of this.

The pre-computed taste profile with recency decay, bounce detection, and series engagement normalization is the right architectural direction. It also naturally solves the long-running series bias (Simpsons/Family Guy dominating the base list) which I was planning to address in a follow-up PR. Your SERIES_FULL_ENGAGEMENT_EPISODES cap handles it cleanly.

One thing I noticed: if a user has no profile yet in user_embeddings, getProfileRecommendations returns an empty array with no fallback. New users or users whose background job hasn't run yet would see no recommendations at all. It is worth considering the old per-item query as a fallback for that case.
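A minimal sketch of such a fallback, assuming both strategies can be expressed as functions with the same shape. The `Fetcher` type, the `Recommendation` shape, and `recommendWithFallback` are hypothetical names for illustration; the real getProfileRecommendations signature takes different arguments.

```typescript
// Hypothetical shapes; the real functions take more parameters.
interface Recommendation {
  itemId: string;
  similarity: number;
}

type Fetcher = (userId: string, limit: number) => Promise<Recommendation[]>;

// Try the profile-based query first; fall back only when it returns nothing,
// e.g. for a new user whose profile has not been computed yet.
async function recommendWithFallback(
  userId: string,
  limit: number,
  fromProfile: Fetcher,
  fallback: Fetcher,
): Promise<Recommendation[]> {
  const primary = await fromProfile(userId, limit);
  if (primary.length > 0) return primary;
  return fallback(userId, limit);
}
```

Passing the two strategies in as arguments keeps the wrapper independent of how either query is implemented.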

Great work on this.

@quillfires
Contributor

One more thing: if a user clears item embeddings to switch models or
dimensions, the user_embeddings table should also be cleared. Otherwise the
taste profile vector will be in a different dimensional space than the item
embeddings it is being compared against, and cosine similarity results will
be meaningless or error out entirely.

This could be handled wherever the embedding reset/clear action is triggered, a simple DELETE FROM user_embeddings WHERE server_id = ? alongside the
item embedding clear would be sufficient.

When item embeddings are cleared for a server, user embeddings that derive from them
must also be cleared. Otherwise they remain in a stale dimensional space, causing
mismatched recommendation calculations until recalculated.
@iFraan
Contributor Author

iFraan commented Mar 6, 2026

@quillfires

> One thing I noticed: if a user has no profile yet in user_embeddings, getProfileRecommendations returns an empty array with no fallback.

We evaluated using a server-average embedding to present "popular on server" but decided it was out of scope for this PR. We will revisit it later.

> One more thing: if a user clears item embeddings to switch models or
> dimensions, the user_embeddings table should also be cleared.

You are right! There's a window until the job is re-run that will break the recommendations. Deleting the current embedding should be enough. The user-embedding-job will see the dimension mismatch and recreate the index if necessary.

I think we should also put a button on settings/ai to trigger the job manually.

Adds the ability to manually trigger user taste embeddings generation from the settings UI. This includes:
- New `triggerUserEmbeddingsSync` function in the server library to call the job server
- New UI section in EmbeddingsManager for manually generating user embeddings
- Updated label from "Movie Embeddings" to "Item Embeddings" for broader terminology
@quillfires
Contributor

quillfires commented Mar 6, 2026

Some models share the same output dimensions but produce embeddings in a completely different vector space, depending on their training data. In this case no error occurs, but the cosine similarity between the stale profile vector and the freshly re-embedded items will be semantically meaningless, producing nonsensical recommendations with no visible indication that anything is wrong. This is the more dangerous case. So if the item embeddings are cleared, it's better to automatically clear the user embeddings as well (in case the model changed but the dimensions stayed the same).
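One way to guard against the silent case is to record which model produced a profile and compare it before use. The sketch below assumes the stored profile carries a model name next to the vector; whether the user_embeddings metadata actually includes this is not confirmed in this thread, and `StoredProfile`, `EmbeddingConfig`, and `isProfileStale` are hypothetical names.

```typescript
// Assumption: the profile row stores the name of the model that produced it.
interface StoredProfile {
  embedding: number[];
  model: string;
}

interface EmbeddingConfig {
  model: string;
  dimensions: number;
}

// A profile is unusable if either the dimensions or the model changed.
// The model check catches the silent case where dimensions happen to match.
function isProfileStale(profile: StoredProfile, current: EmbeddingConfig): boolean {
  return (
    profile.embedding.length !== current.dimensions ||
    profile.model !== current.model
  );
}
```

A stale profile would then be skipped or recomputed rather than compared against item embeddings from a different vector space.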

@quillfires
Contributor

> We evaluated using server-average to present "popular on server" but decided it was out of scope for this pr, but will revisit later.

This will be a nice touch, but I think you are overthinking it. "Popular on server" or "trending on server" can simply be the most played movies and series across all users during the last x days. There is no need to hook the recommendation engine into it; it would be an in-house trending generator.
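A minimal sketch of that kind of trending list, counting plays per item inside a time window. The `PlaySession` shape and the `trendingItems` helper are hypothetical; in the real application this would more likely be a grouped SQL query over the sessions table.

```typescript
// Hypothetical session shape; only the item and start time matter here.
interface PlaySession {
  itemId: string;
  startedAt: Date;
}

// Count plays per item over the last `days` days and return the top `limit`.
function trendingItems(
  sessions: PlaySession[],
  days: number,
  limit: number,
  now: Date = new Date(),
): { itemId: string; plays: number }[] {
  const cutoff = now.getTime() - days * 24 * 60 * 60 * 1000;
  const counts = new Map<string, number>();
  for (const s of sessions) {
    if (s.startedAt.getTime() < cutoff) continue; // outside the window
    counts.set(s.itemId, (counts.get(s.itemId) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([itemId, plays]) => ({ itemId, plays }));
}
```

No embeddings are involved, so this works for new users and is unaffected by model changes.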

@quillfires
Contributor

One more issue I noticed while looking at the code:
getSimilarStatistics() has "Movie" hardcoded as the type passed to
getProfileRecommendations():

    return await getProfileRecommendations(
      serverIdNum,
      targetUserId,
      "Movie", // hardcoded
      limit,
      offset,
    );

The getPersonalizedRecommendations tool in tools.ts accepts a type parameter
("Movie", "Series", "all") and filters results after the fact, but by then
getSimilarStatistics has already returned only movies. Asking the AI chat to
recommend a series will always come back empty.

The fix needs two changes:

  1. getSimilarStatistics (in apps/nextjs-app/lib/db/similar-statistics.ts) should accept and forward a type parameter to getProfileRecommendations instead of hardcoding "Movie".
  2. getPersonalizedRecommendations (in apps/nextjs-app/lib/ai/tools.ts) should pass its type argument down to getSimilarStatistics rather than post-filtering a movies-only result.
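The two changes can be illustrated with simplified stand-ins. Nothing below is the project's real code: `profileRecommendations` and `similarStatistics` are hypothetical in-memory versions of getProfileRecommendations and getSimilarStatistics, and defaulting to "Movie" is an assumption made here so that callers which pass no type keep their current behaviour.

```typescript
type MediaType = "Movie" | "Series" | "all";

interface Item {
  id: string;
  type: "Movie" | "Series";
  score: number;
}

// Stand-in for getProfileRecommendations: filters by type before ranking.
function profileRecommendations(catalog: Item[], type: MediaType, limit: number): Item[] {
  return catalog
    .filter((i) => type === "all" || i.type === type)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Stand-in for getSimilarStatistics: forwards the type instead of hardcoding it.
function similarStatistics(
  catalog: Item[],
  opts: { type?: MediaType; limit?: number } = {},
): Item[] {
  const { type = "Movie", limit = 10 } = opts;
  return profileRecommendations(catalog, type, limit);
}
```

Calling `similarStatistics(catalog, { type: "Series" })` returns series directly, whereas post-filtering the default movies-only result for series yields an empty list, which is the bug described above.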

@iFraan
Contributor Author

iFraan commented Mar 9, 2026

> One more issue I noticed while looking at the code:
> getSimilarStatistics() has "Movie" hardcoded as the type passed to
> getProfileRecommendations():

The original implementation had 'Movie' hardcoded too, and the component that uses it expects it.
I think it is a simple enough fix though; we can just default it.

Now that getSimilarSeries in @db/similar-series-statistics.ts looks very similar to getSimilarStatistics, I can just delete it and update the references.

Add support for filtering recommendations by media type (Movie, Series, or all) across the recommendation system. This refactors the query logic in `getProfileRecommendations` and `getSimilarStatistics` to accept a type parameter instead of filtering results post-query, improving efficiency.
@iFraan iFraan marked this pull request as draft March 9, 2026 15:01
iFraan added 2 commits March 9, 2026 12:14
…s into single function

Replaced the separate `getSimilarSeries` function with a unified `getSimilarStatistics` function that accepts a `type` parameter. Updated all callers across the codebase (dashboard components, API routes, and AI tools) to use the new object parameter pattern. Removed the now-redundant `similar-series-statistics` module.
@iFraan iFraan marked this pull request as ready for review March 9, 2026 16:30
Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey - I've left some high level feedback:

  • In calculateUserEmbeddingsJob.computeUserProfile the series query uses AVG(${sessions.runtimeTicks}), but runtimeTicks appears to be an items field elsewhere; this likely should be AVG(${items.runtimeTicks}) to avoid a broken query or incorrect runtime aggregation.
  • The recommendation types are now treated as if basedOn can be absent (e.g. using r.basedOn ?? [] and basedOn?: in dashboard types), but RecommendationResult in recommendation-engine.ts still types basedOn as a required array; consider making it optional there too to keep the type system aligned with the new behavior.

@quillfires
Contributor

> The original implementation had 'Movie' hardcoded too

Yes, confirmed. In the current main branch, the AI chat can't recommend series via embeddings at all. When asked, it falls back to a genre-based keyword search using watch history instead of the actual embedding similarity. The user just sees an "embeddings not configured" message with no indication of why.
I noticed the bug, but instead of opening a separate PR, I thought I'd mention it here since it touches the same recommendation engine you're refactoring. Better to have it land in one place.
Good call on consolidating getSimilarSeries into getSimilarStatistics with a type parameter. That cleans up the duplication nicely.
