Evaluate user deletion strategy: soft delete vs hard delete + anonymization #453

@tompscanlan

Problem

We currently soft-delete users by setting a deletedAt timestamp, but this causes several issues:

  1. Orphaned records - GroupMember/EventAttendee records reference soft-deleted users, causing null reference crashes (fixed in #452, "fix(group-member): filter out members with soft-deleted users", but symptomatic of a deeper issue)

  2. Query complexity - Every query touching users needs a WHERE deletedAt IS NULL filter

  3. GDPR compliance - Soft delete doesn't satisfy the "right to erasure" (GDPR Article 17) - the personal data is still stored

  4. Growing cruft - Database accumulates "deleted" data forever
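
To make items 1 and 2 concrete, here is a minimal in-memory sketch. The User and GroupMember shapes and the function names are assumptions for illustration only; the real entities and queries are not shown in this issue. It shows how every lookup has to repeat the deletedAt filter and then cope with the orphaned rows that remain.

```typescript
// Hypothetical shapes; the real entity definitions are not shown in this issue.
interface User {
  id: number;
  name: string;
  deletedAt: Date | null;
}

interface GroupMember {
  groupId: number;
  userId: number;
}

// Mimics a lookup that applies the "deletedAt IS NULL" filter.
function findActiveUser(users: User[], id: number): User | undefined {
  return users.find((u) => u.id === id && u.deletedAt === null);
}

// Returns display names for a group's members, skipping rows whose user
// is soft-deleted instead of dereferencing an undefined user.
function memberNames(users: User[], members: GroupMember[], groupId: number): string[] {
  return members
    .filter((m) => m.groupId === groupId)
    .map((m) => findActiveUser(users, m.userId))
    .filter((u): u is User => u !== undefined)
    .map((u) => u.name);
}

const users: User[] = [
  { id: 1, name: "Ada", deletedAt: null },
  { id: 2, name: "Bob", deletedAt: new Date("2024-01-01") },
];
const members: GroupMember[] = [
  { groupId: 10, userId: 1 },
  { groupId: 10, userId: 2 }, // orphaned: its user is soft-deleted
];

console.log(memberNames(users, members, 10));
```

Any call site that forgets the undefined check is a potential crash, which is the "deeper issue" behind #452.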

Current Behavior

When a user is soft-deleted:

  • User record gets deletedAt timestamp
  • GroupMember records remain (orphaned)
  • EventAttendee records remain (orphaned)
  • Events they created still reference them as creator
  • Groups they own still reference them as owner

Questions to Resolve

1. What happens to groups when the owner is deleted?

Options:

  • a) Transfer ownership to another admin/member
  • b) Transfer to a system "orphaned" owner
  • c) Delete the group entirely
  • d) Prevent deletion until ownership is transferred
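
One way these options could combine is option (a) with a fallback to option (d). The sketch below is only an illustration under assumed names: the Membership shape, the role values, and the "admins first, then longest-standing member" ordering are all hypothetical and would need to be decided as part of this issue.

```typescript
// Hypothetical membership shape and role names, for illustration only.
type Role = "owner" | "admin" | "member";

interface Membership {
  userId: number;
  role: Role;
  joinedAt: Date;
}

type OwnerResolution =
  | { kind: "transfer"; newOwnerId: number }
  | { kind: "blocked"; reason: string };

// Option (a): pick a successor, preferring admins, then the earliest joiner.
// Option (d) as fallback: block the deletion when nobody else is in the group.
function resolveOwnerDeletion(memberships: Membership[], ownerId: number): OwnerResolution {
  const rank = (m: Membership): number => (m.role === "admin" ? 0 : 1);
  const candidates = memberships
    .filter((m) => m.userId !== ownerId)
    .sort((a, b) => rank(a) - rank(b) || a.joinedAt.getTime() - b.joinedAt.getTime());

  const [next] = candidates;
  if (next === undefined) {
    return { kind: "blocked", reason: "no eligible successor" };
  }
  return { kind: "transfer", newOwnerId: next.userId };
}

const groupMemberships: Membership[] = [
  { userId: 1, role: "owner", joinedAt: new Date("2022-01-01") },
  { userId: 2, role: "member", joinedAt: new Date("2023-01-01") },
  { userId: 3, role: "admin", joinedAt: new Date("2024-01-01") },
];

console.log(resolveOwnerDeletion(groupMemberships, 1));
```

Returning a tagged result rather than throwing lets the caller decide whether a blocked deletion becomes a UI prompt or falls through to option (b) or (c).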

2. What happens to events when the creator is deleted?

Options:

  • a) Anonymize creator: show "Deleted User" or null
  • b) Transfer to group owner (for group events)
  • c) Keep creator reference but handle null gracefully
  • d) Delete the event

3. What happens to event series when the creator is deleted?

Same options as events, but also consider ongoing/future occurrences.

4. What about the user's attendee/member records?

Options:

  • a) Hard delete (cascade) - they're no longer attending/members
  • b) Anonymize - keep record but remove PII
  • c) Current: soft delete leaves orphans

5. What about content they created?

  • Comments on events
  • Messages in groups
  • Profile information

6. Bluesky/ATProto considerations

  • Records published to ATProto are immutable
  • Can tombstone but not truly delete
  • How does this affect our deletion strategy?

Possible Approaches

A. Keep Soft Delete, Fix Orphans

  • Add cascade soft-delete to related records
  • Add null checks everywhere (whack-a-mole)
  • Doesn't solve GDPR - personal data is still retained

B. Hard Delete + Anonymization

  • Hard delete user record
  • Set foreign keys to null or sentinel value
  • Cascade delete memberships/attendances
  • Keep content but anonymize authorship
  • Simpler queries (no deletedAt filtering), and closer to GDPR compliance as long as retained content carries no PII
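
A rough sketch of approach B against an in-memory store follows. The Db shape, the field names, and the "Deleted User" label are assumptions for illustration; a real implementation would presumably run these steps inside a single database transaction, or express them as ON DELETE CASCADE / SET NULL foreign key rules.

```typescript
// Hypothetical in-memory tables; the real schema is not shown in this issue.
interface UserRow { id: number; name: string; email: string; }
interface EventRow { id: number; title: string; creatorId: number | null; }

interface Db {
  users: UserRow[];
  groupMembers: { groupId: number; userId: number }[];
  eventAttendees: { eventId: number; userId: number }[];
  events: EventRow[];
}

function hardDeleteUser(db: Db, userId: number): void {
  // Cascade delete memberships and attendances.
  db.groupMembers = db.groupMembers.filter((m) => m.userId !== userId);
  db.eventAttendees = db.eventAttendees.filter((a) => a.userId !== userId);

  // Keep content but anonymize authorship by nulling the foreign key.
  for (const e of db.events) {
    if (e.creatorId === userId) e.creatorId = null;
  }

  // Finally remove the user row itself, including its PII.
  db.users = db.users.filter((u) => u.id !== userId);
}

// Display helper: a null creator renders as a placeholder instead of crashing.
function creatorLabel(db: Db, event: EventRow): string {
  if (event.creatorId === null) return "Deleted User";
  return db.users.find((u) => u.id === event.creatorId)?.name ?? "Deleted User";
}

const db: Db = {
  users: [
    { id: 1, name: "Ada", email: "ada@example.com" },
    { id: 2, name: "Bob", email: "bob@example.com" },
  ],
  groupMembers: [{ groupId: 10, userId: 1 }, { groupId: 10, userId: 2 }],
  eventAttendees: [{ eventId: 100, userId: 1 }],
  events: [{ id: 100, title: "Meetup", creatorId: 1 }],
};

hardDeleteUser(db, 1);
console.log(db.events.map((e) => creatorLabel(db, e)));
```

With a nullable creatorId the only place that needs a null check is the display helper, rather than every query that touches users.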

C. Hard Delete + Ownership Transfer

  • Require ownership transfer before deletion allowed
  • Then hard delete everything else
  • Most data-preserving for communities
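
The precondition in approach C could be sketched as a check that runs before any deletion work starts. The Group shape and the checkUserDeletable name are hypothetical; the sketch only shows the gating logic, not the deletion itself.

```typescript
// Hypothetical group shape, for illustration only.
interface Group {
  id: number;
  name: string;
  ownerId: number;
}

type DeletionCheck =
  | { ok: true }
  | { ok: false; ownedGroupIds: number[] };

// Deletion is allowed only when the user no longer owns any group.
function checkUserDeletable(groups: Group[], userId: number): DeletionCheck {
  const ownedGroupIds = groups.filter((g) => g.ownerId === userId).map((g) => g.id);
  return ownedGroupIds.length === 0 ? { ok: true } : { ok: false, ownedGroupIds };
}

const groups: Group[] = [
  { id: 10, name: "Hikers", ownerId: 1 },
  { id: 11, name: "Readers", ownerId: 2 },
];

console.log(checkUserDeletable(groups, 1)); // user 1 still owns group 10
console.log(checkUserDeletable(groups, 3)); // user 3 owns nothing
```

Returning the blocking group IDs gives the UI enough information to prompt the user to transfer each group before retrying.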

Related

Next Steps

  1. Decide on desired behavior for each question above
  2. Document the user deletion flow
  3. Implement chosen approach
  4. Migration plan for existing soft-deleted users
