
Implement fast track deletion for unused wikis #605

Open
pixDeVl wants to merge 18 commits into miraheze:main from pixDeVl:master

@pixDeVl pixDeVl commented Oct 22, 2024

This was requested and approved by the community in https://meta.miraheze.org/wiki/Requests_for_Comment/Content_and_Dormancy_Policies back in March (over half a year ago, so long overdue and then some).
Many, many wiki requests unfortunately never really get into editing: the requesters give up on or forget about the wiki entirely, leaving swarms of ghost wikis on the site taking up space.
This pull request implements 'fast-track' deletion, where wikis that have no edits (besides those by MediaWiki default) are marked as inactive after 15 days and deleted after 30.
Rest in peace, our 10k badge.

Summary by CodeRabbit

  • Refactor
    • Restructured configuration settings to distinguish between standard activity and no-edit scenarios, resulting in refined thresholds for inactivity, closure, removal, and deletion.
    • Adjusted deletion grace period logic to enforce consistent behavior based on the updated configurations.
    • Improved activity log filtering by excluding default system entries.
    • Strengthened error handling for missing configurations, promoting smoother operation overall.

@Universal-Omega Universal-Omega self-requested a review October 22, 2024 04:26

coderabbitai bot commented Oct 27, 2024

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

This pull request restructures the configuration for wiki state days by nesting the previous flat structure into two objects: default and no-edits. The change affects how inactivity, closure, removal, and deletion periods are determined. The deletion process in the Wiki Manager now uses the no-edits configuration for the deletion grace period. Additionally, a filtering condition is added in the wiki activity check to exclude log entries from the default actor. The inactive wiki management script is updated to retrieve and validate configuration values more strictly, throwing an error if critical configuration is missing.

Changes

| File(s) | Change Summary |
|---|---|
| extension.json | Restructured CreateWikiStateDays configuration from direct integer values into a nested structure with default and no-edits objects holding the respective values. |
| includes/Services/WikiManagerFactory.php and maintenance/ManageInactiveWikis.php | Updated the retrieval of configuration values to use the nested no-edits object (e.g., deletion grace period and inactive days). In ManageInactiveWikis.php, added error handling (via ConfigException) and fallback logic to choose between default and no-edits based on activity status. |
| maintenance/CheckLastWikiActivity.php | Introduced a new variable to fetch the default actor's ID and modified the timestamp query to exclude log entries associated with this actor by adding an extra filtering condition to the query. |

Sequence Diagram(s)

sequenceDiagram
    participant Cron as Cron Job
    participant MIW as ManageInactiveWikis
    participant Config as Configuration
    participant Wiki as Wiki Data
    Cron->>MIW: execute()
    MIW->>MIW: checkLastActivity()
    MIW->>Config: Retrieve 'default' state config
    alt 'default' config exists
        MIW->>MIW: Determine config (default/no-edits) based on last activity
    else 'default' missing
        MIW->>MIW: Throw ConfigException
    end
    MIW->>Wiki: Update inactivity, closure, and removal statuses
sequenceDiagram
    participant User as User Action
    participant WMF as WikiManagerFactory
    participant Config as Configuration
    participant Wiki as Wiki Data
    User->>WMF: Request deletion
    WMF->>Config: Retrieve deletion grace period from 'no-edits' config
    WMF->>WMF: Evaluate deletion conditions
    alt Conditions met
        WMF->>Wiki: Perform deletion
    else Conditions not met
        WMF->>User: Return error/skip deletion
    end

Suggested reviewers

  • Universal-Omega

Poem

I'm a bunny with a code-hop beat,
New nests for settings now look neat,
With default and no-edits in a clever twist,
Logs and grace periods all coexist,
Hopping through changes with a joyful leap,
CodeRabbit cheers—these changes run deep!
🐇✨


@pixDeVl pixDeVl requested a review from BlankEclair October 27, 2024 23:34
Member

@BlankEclair BlankEclair left a comment

Apologies for the delay--was busy setting up my new phone, and you know how that is... I have to copy everything, then configure all my settings, then reverse engineer one of my banking apps to figure out how it detects root. Y'know, all those typical things

  $dbName = $row->wiki_dbname;
  $remoteWiki = $remoteWikiFactory->newInstance( $dbName );
- $inactiveDays = (int)$this->getConfig()->get( ConfigNames::StateDays )['inactive'];
+ $inactiveDays = (int)$this->getConfig()->get( ConfigNames::StateDays )['default']['inactive'];
Member

I wonder if this should be no-edits?

Author

I think so? I'll do some thinking, but also need to look to make sure we're not breaking any previously implemented assumptions about the state from when there was only one track.

Member

If the no_edits_inactivity < default_inactivity, but we only do checking for when it's default, then the no edits stuff won't trigger, right?

Author

Updated to no-edits, but we should check that it doesn't break any assumptions and cause it to close any wikis early that it shouldn't. It's a bit late for me to take a thoughtful look right now, though.

@pixDeVl pixDeVl requested a review from BlankEclair November 19, 2024 03:00
Member

BlankEclair commented Nov 19, 2024 via email

Member

@Universal-Omega Universal-Omega left a comment


Mostly very minor nitpick stuff, but one critical issue as well.


$lastActivityTimestamp = $activity->getTimestamp();

$track = ( $lastActivityTimestamp !== 0 ) ? 'default': 'no-edits'
Member

Unnecessary parentheses and a syntax error (missing semicolon).

$track = $lastActivityTimestamp ? 'default' : 'no-edits';

Also, we need some validation here in case this is configured incorrectly. If no-edits is not set at all, always use default; if neither is set, use the old behavior and add null checks with ??. Maybe this should be split into a separate method? I guess we could also just throw a ConfigException if it is configured incorrectly...

Author

@Universal-Omega By "configured incorrectly", do you mean defining neither no-edits nor default, or just not defining no-edits?

Member

Not defining both I believe
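
The fallback rules discussed in this thread could be sketched roughly as below. This is a minimal sketch, not code from this PR: `selectStateDays()` is a hypothetical helper, and `InvalidArgumentException` stands in for MediaWiki's `ConfigException` so that the snippet runs on its own. It also assumes the legacy flat structure can be recognised by a top-level `inactive` key.

```php
<?php
// Hypothetical helper, not part of this PR.

/**
 * @param array $stateDays Value of the StateDays config (nested, or legacy flat).
 * @param int $lastActivityTimestamp 0 when the wiki has no non-default activity.
 * @return array The thresholds (in days) to apply to this wiki.
 */
function selectStateDays( array $stateDays, int $lastActivityTimestamp ): array {
	$default = $stateDays['default'] ?? null;
	$noEdits = $stateDays['no-edits'] ?? null;

	if ( $default === null && $noEdits === null ) {
		// Neither track is defined: treat the array as the old flat structure.
		if ( !isset( $stateDays['inactive'] ) ) {
			// Stand-in for ConfigException.
			throw new InvalidArgumentException( 'StateDays is configured incorrectly.' );
		}
		return $stateDays;
	}

	if ( $default === null ) {
		throw new InvalidArgumentException( 'StateDays is missing the default track.' );
	}

	// Without a no-edits track, every wiki uses default.
	if ( $noEdits === null || $lastActivityTimestamp ) {
		return $default;
	}

	return $noEdits;
}
```

Splitting this into its own method keeps the ternary at the call site short and puts all of the misconfiguration handling in one place.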


codecov bot commented Mar 9, 2025

Codecov Report

Attention: Patch coverage is 40.00000% with 6 lines in your changes missing coverage. Please review.

Project coverage is 23.47%. Comparing base (b0cae5c) to head (45a7c37).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| maintenance/ManageInactiveWikis.php | 0.00% | 6 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##             master     #605      +/-   ##
============================================
- Coverage     26.55%   23.47%   -3.08%     
- Complexity      819      820       +1     
============================================
  Files            50       50              
  Lines          4135     4140       +5     
============================================
- Hits           1098      972     -126     
- Misses         3037     3168     +131     


}

public function getTimestamp(): int {
$defaultActor = $this->getServiceContainer()->getUserFactory()
Member

Probably should have a test case in CheckLastWikiActivityTest to make sure it properly excludes MediaWiki default edits.

$this->fatalError( 'Enable $wgCreateWikiEnableManageInactiveWikis to run this script.' );
}

$inactiveDays = (int)$this->getConfig()->get( ConfigNames::StateDays )['no-edits']['inactive'];
Member

Is there a reason we hardcode no-edits here? I'm uncertain about that.

Author

Performance-wise, this check is used to tell whether it's worth doing more processing. Using the function to get the proper threshold would mean that code (querying the database) runs for every wiki, even active ones, whereas if we leave it here, checkLastActivity() would run on every wiki that hasn't been edited for at least 7 days under the default config. So I guess it depends on which performance hit we think will hurt less? I feel it would probably be activity->getTimestamp(), in which case we should change it.

Member

This also needs tests in ManageInactiveWikisTest but I can probably put some work into doing that at some point.
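
One possible middle ground for the trade-off raised in this thread is a two-stage check: pre-filter on the smallest `inactive` threshold of either track using a timestamp the script already has, and only run the expensive activity lookup for wikis that pass. The sketch below is illustrative only. `inactiveDaysFor()`, `$cheapTimestamp` and the `$getLastActivity` callback are hypothetical names, not code from this PR, and the sketch assumes both tracks are configured.

```php
<?php
// Hypothetical sketch; none of these names come from the PR.

const SECONDS_PER_DAY = 86400;

/**
 * @param array $stateDays Nested StateDays config with both tracks present.
 * @param int $cheapTimestamp Unix timestamp available without an extra query.
 * @param int $now Current Unix timestamp.
 * @param callable $getLastActivity Expensive lookup; returns 0 for no real activity.
 * @return int|null Inactive threshold in days, or null if the wiki is skipped.
 */
function inactiveDaysFor(
	array $stateDays, int $cheapTimestamp, int $now, callable $getLastActivity
): ?int {
	$minInactiveDays = min(
		(int)$stateDays['default']['inactive'],
		(int)$stateDays['no-edits']['inactive']
	);

	// Stage 1: skip wikis too recent for any track, without the expensive lookup.
	if ( $cheapTimestamp > $now - $minInactiveDays * SECONDS_PER_DAY ) {
		return null;
	}

	// Stage 2: only now pay for the lookup and pick the track.
	$track = $getLastActivity() ? 'default' : 'no-edits';
	return (int)$stateDays[$track]['inactive'];
}
```

With this shape, the hardcoded no-edits lookup becomes a computed lower bound, so the pre-filter stays correct even if someone configures the no-edits threshold above the default one.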

Member

@Universal-Omega Universal-Omega left a comment


@pixDeVl pixDeVl requested a review from Universal-Omega March 17, 2025 03:01
Author

pixDeVl commented Mar 17, 2025

@CodeRabbit review


coderabbitai bot commented Mar 17, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bb4d447 and 8f0edcf.

📒 Files selected for processing (4)
  • extension.json (1 hunks)
  • includes/Services/WikiManagerFactory.php (1 hunks)
  • maintenance/CheckLastWikiActivity.php (2 hunks)
  • maintenance/ManageInactiveWikis.php (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.php: Follow MediaWiki PHP Coding Standards:

Indentation:

  • Use tabs for indentation, not spaces.

Spaces:

  • Use spaces on either side of binary operators.
    Example: $a = $b + $c; (Correct) $a=$b+$c; (Incorrect)
  • Add spaces inside parentheses, except where they are empty.
    Example: $a = getFoo( $b ); (Correct) $a = getFoo($b); (Incorrect)
  • Do not add a space after a function name.
    Example: $c = getBar(); (Correct) $c = getBar (); (Incorrect)

Function Return Type Hint:

  • Add a space after the colon in the function return type hint.
    Example: function square( int $x ): int { return $x * $x; }

Arrays:

  • Use spaces in brackets when declaring an array, except where the array is empty.
    Example: $a = [ 'foo', 'bar' ]; (Correct) $a = ['foo', 'bar']; (Incorrect)
  • Do not put spaces in brackets when accessing array elements.
    Example: $c = $a[0]; (Correct) $c = $a[ 0 ]; (Incorrect)

Control Structures:

  • Add a space after control structure keywords.
    Example: if ( isFoo() ) { ... } (Correct) if(isFoo()){ ... } (Incorrect)

Type Casting:

  • Do not use a space within or after the cast operator.
    Example: (int)$foo; (Correct) (int) $bar; (Incorrect)

Comments:

  • In comments, there should be one space between the # or // and the comment text.
    Example: // This is a comment. (Correct) //This is a comment. (Incorrect)

Ternary Operator:

  • Keep ternary operator expressions short and obvious.
  • For multi-line ternary expressions, place the '?' and ':' at the beginning of the new lines.

File Formatting:

  • Avoid trailing whitespace on lines.

  • maintenance/CheckLastWikiActivity.php
  • maintenance/ManageInactiveWikis.php
  • includes/Services/WikiManagerFactory.php
🔇 Additional comments (7)
maintenance/CheckLastWikiActivity.php (1)

44-44: Good implementation of additional filtering condition

The new condition successfully extends the existing filters to exclude the default actor's actions from activity calculations.

includes/Services/WikiManagerFactory.php (1)

314-314: Updated to use the new configuration structure for deletion grace period

❓ Verification inconclusive

This change properly aligns the deletion logic with the restructured configuration by accessing the 'deleted' value from the 'no-edits' section instead of from the root level.

Let's verify that the configuration access is consistent throughout the codebase:


🏁 Script executed:

#!/bin/bash
# Check for any other direct references to StateDays that might need updating
rg -t php "StateDays\)\[\"" --include="*.php" -A 1 -B 1

Length of output: 498


Configuration access verification for deletion grace period update

File: includes/Services/WikiManagerFactory.php (Line 314)

The deletion grace period is now calculated using the new configuration structure. The code:

$deletionGracePeriod = (int)$this->options->get(ConfigNames::StateDays)["no-edits"]['deleted'] * 86400;

correctly retrieves the ‘deleted’ value from the nested 'no-edits' section, aligning with the updated configuration.

Action Required:
Please verify that this revised configuration access pattern is applied consistently throughout the codebase. To assist with this, run the following command which uses the corrected glob flag to scan PHP files for additional references:

#!/bin/bash
# Check for any other direct references to StateDays that might need updating
rg -t php "StateDays\)\[\"" --glob "*.php" -A 1 -B 1

extension.json (1)

385-396: Restructured configuration to support fast-track deletion

The configuration has been refactored from a flat structure to a nested one with 'default' and 'no-edits' objects. This allows for different inactivity/deletion timelines based on whether a wiki has been edited.

The 'no-edits' configuration accelerates the process:

  • Inactive after 15 days (vs 45 days in default)
  • Closed after 30 days (vs 15 days after inactivity in default)
  • Immediate removal and deletion (0 days) once closed

This implementation aligns with the PR objective to implement fast-track deletion for unused wikis.
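
For illustration, the nested shape described above might look like the following when set from PHP. This is a sketch under assumptions: the variable name `$wgCreateWikiStateDays` and the `closed` and `removed` key names are guesses (only `inactive` and `deleted` appear in code quoted in this review), and the default track's removal and deletion values are not stated here, so they are left out rather than invented.

```php
<?php
// Illustrative only; values are the ones quoted in this review comment.
$wgCreateWikiStateDays = [
	'default' => [
		'inactive' => 45,
		// Days after being marked inactive.
		'closed' => 15,
		// 'removed' and 'deleted' are not stated in this review.
	],
	'no-edits' => [
		'inactive' => 15,
		'closed' => 30,
		'removed' => 0,
		'deleted' => 0,
	],
];

// Reading one value for a track; a missing key yields null instead of a warning.
$track = 'no-edits';
$deletedDays = $wgCreateWikiStateDays[$track]['deleted'] ?? null;
```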

maintenance/ManageInactiveWikis.php (4)

6-6: Added ConfigException import for error handling

Good addition of the ConfigException import to properly handle configuration errors.


36-37: Optimized inactiveDays retrieval

Moving the inactiveDays retrieval outside the loop is a good optimization, as it avoids repeatedly retrieving the same configuration value for each wiki.


92-92: Logical track selection based on activity

Good implementation of determining which configuration track to use based on last activity timestamp. This is a key part of the fast-track deletion logic.


94-97: Updated configuration access to use track-specific values

The code now properly retrieves configuration values from the appropriate track (default or no-edits), ensuring the correct timing rules are applied based on wiki activity.

Comment on lines +22 to +24
$defaultActor = $this->getServiceContainer()->getUserFactory()
->newFromName( 'MediaWiki default' )
->getActorId();

💡 Verification agent

❓ Verification inconclusive

Improvement to exclude MediaWiki default actor from activity checks

This change adds logic to filter out log entries from the MediaWiki default actor, ensuring that only meaningful user activities are counted when determining wiki activity.

A test case in CheckLastWikiActivityTest would be valuable to verify that this properly excludes MediaWiki default edits, as mentioned in a previous review.


🏁 Script executed:

#!/bin/bash
# Check if any test exists for this functionality
rg -t php "MediaWiki default" tests/phpunit

Length of output: 45


Action Required: Add a test to verify MediaWiki default actor exclusion

The logic in maintenance/CheckLastWikiActivity.php (lines 22–24) correctly retrieves the actor ID for "MediaWiki default" to later exclude its log entries from wiki activity checks. However, our search in the test suite didn’t find any test case covering this functionality. To fully validate that log entries from the MediaWiki default actor are excluded, please add a test case (preferably in CheckLastWikiActivityTest.php) that explicitly checks for this behavior.
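
As a starting point for such a test, the behaviour to pin down can be modelled in plain PHP. This is only a sketch of the expected semantics and does not use MediaWiki's test harness or the real query: the function name, the row shape and the actor IDs are all hypothetical.

```php
<?php
// Plain-PHP model of the expected behaviour; not the PR's implementation.

/**
 * @param array[] $logRows Rows with integer 'actor' and 'timestamp' keys.
 * @param int $defaultActorId Actor ID of the "MediaWiki default" user.
 * @return int Latest timestamp from any other actor, or 0 if there is none.
 */
function latestNonDefaultTimestamp( array $logRows, int $defaultActorId ): int {
	$latest = 0;
	foreach ( $logRows as $row ) {
		// Entries by the default actor never count as wiki activity.
		if ( $row['actor'] !== $defaultActorId ) {
			$latest = max( $latest, $row['timestamp'] );
		}
	}
	return $latest;
}

// Example data: actor 1 plays the role of "MediaWiki default".
$rows = [
	[ 'actor' => 1, 'timestamp' => 300 ],
	[ 'actor' => 2, 'timestamp' => 200 ],
];
```

A real test in CheckLastWikiActivityTest would need to cover the same two cases: a wiki whose only entries come from the default actor should report a timestamp of 0, and a newer default-actor entry should not mask an older entry from a real user.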
