
My site is being spammed by search bots #10287

@guymass


Here is a complete analysis of a section of my avideo.log done by ChatGPT which highlights some important issues.

Here’s what that log chunk is telling you, in plain terms.

  1. Someone is hammering /page/2939 with insane pagination
    getCurrentPage current>1000 ERROR NOT LOGGED die [2939]
    /page/2939?rowCount=1000&...&current=2939
    HTTP_ORIGIN=https://www.bing.com
    REMOTE_ADDR=136.118.105.78

AVideo is killing the request because current is huge (2939) and rowCount=1000. That is effectively “give me 1000 items per page, page 2939”, which is abusive or bot-like behavior. The Origin header points at Bing, so it is likely Bing-driven, though it could also be a user clicking a stale Bing cached link. It is not your Scheduler.

Action (optional): block or rate-limit /page/* for obvious bots, or clamp rowCount and current server-side.
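
As a rough illustration of the "clamp rowCount and current server-side" option, here is a minimal sketch in Python. It is not AVideo code: the function name, the default values, and the `MAX_ROW_COUNT` cap are assumptions made for the example. Only the 1000-page threshold is taken from the `current>1000` log line above.

```python
MAX_ROW_COUNT = 100   # hypothetical cap on items per page
MAX_PAGE = 1000       # mirrors the "current>1000" threshold seen in the log


def clamp_pagination(params):
    """Return (row_count, current) forced into a sane range.

    `params` is a dict of raw query-string values, for example the parsed
    form of ?rowCount=1000&current=2939.
    """
    def to_int(value, default):
        # Missing or non-numeric values fall back to the default.
        try:
            return int(value)
        except (TypeError, ValueError):
            return default

    row_count = min(max(to_int(params.get("rowCount"), 10), 1), MAX_ROW_COUNT)
    current = min(max(to_int(params.get("current"), 1), 1), MAX_PAGE)
    return row_count, current
```

With these assumed limits, the request from the log (`rowCount=1000`, `current=2939`) would be served as 100 rows on page 1000 instead of being passed through as-is.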

  2. Bingbot is crawling you and triggering slow queries

You have multiple lines like:

BOT USER_AGENT=bingbot/2.0
PERFORMANCE: Time 0.925s | mysql_dal [SELECT ...]

So Bingbot is hitting pages that cause expensive Video::getAllVideos queries. This is real load.

Action: consider tuning the cache, restricting bots, or reducing the work that index pages do for bots.
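
One way to "restrict bots" is a per-client rate limit applied only to requests whose User-Agent looks like a crawler. The sketch below is a generic sliding-window limiter in Python, not something AVideo ships. The marker list, class name, and limits are all assumptions for illustration.

```python
import time
from collections import defaultdict, deque

# Illustrative substrings only; a real deployment would need a fuller list.
BOT_MARKERS = ("bingbot", "googlebot", "crawler", "spider")


def is_bot(user_agent):
    """Very rough User-Agent check based on known substrings."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in BOT_MARKERS)


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each key."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock
        self.hits = defaultdict(deque)  # key -> timestamps of recent requests

    def allow(self, key):
        now = self.clock()
        recent = self.hits[key]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True
```

A request handler could call `limiter.allow(remote_addr)` only when `is_bot(user_agent)` is true and answer with HTTP 429 when it returns `False`, so regular visitors are unaffected.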

  3. Your Scheduler cron is actually doing work every minute

These are from cron:

SCRIPT_NAME: /var/www/html/AVideo/plugin/Scheduler/run.php
Live::getStatsObject ... forced to be recreated
Cache::executeEveryMinute takes 0.8371 seconds

So AVideoPlugin::executeEveryMinute() is running, and at least:

Live plugin minute job is recreating stats objects

Cache plugin minute job is taking ~0.84s (slow for “every minute”)

That’s not “broken”, but it’s heavy and can snowball when the site is also under crawl.

  4. The MP3 conversion queue is involved, and it’s creating/seeing .mp3.lock files

You have repeated:

convertVideoFileWithFFMPEGIsLockedInfo(...mp3) ... "file_exists":false
"localFileLock": "...mp3.lock"
"isOld":true
"isUnlocked":true
AI::processTranscription is processing transcription

Meaning:

Scheduler is trying to ensure MP3 versions exist (and likely transcription)

The MP3 doesn’t exist yet (file_exists=false)

There is (or was) a lock file path (*.mp3.lock)

AVideo considers it “old” and “unlocked”, so it thinks it’s allowed to proceed

If you’re seeing tons of .lock files piling up, this is one of the producers (MP3/transcription pipeline), separate from your older .move_v1.lock files.

Action: check if ffmpeg jobs are actually running and finishing, or if the output path is on a flaky mount/NFS so it never successfully writes the mp3 and keeps retrying.
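
To see whether lock files really are piling up, a small script can list `.lock` files older than some cutoff. This is a sketch under assumptions: the one-hour threshold and the function name are made up for the example, and it only reports files. It does not reproduce how AVideo itself decides `isOld`.

```python
import os
import time


def find_stale_locks(root, max_age_seconds=3600, now=None):
    """Return sorted paths of *.lock files under `root` older than the cutoff."""
    now = time.time() if now is None else now
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".lock"):
                continue
            path = os.path.join(dirpath, name)
            try:
                age = now - os.path.getmtime(path)
            except OSError:
                continue  # file vanished between listing and stat
            if age > max_age_seconds:
                stale.append(path)
    return sorted(stale)
```

Running it against the videos directory a few minutes apart would show whether the same lock files persist (stuck jobs) or new ones keep appearing (constant retries).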

  5. HLS-to-MP4 auto-convert is failing for at least one video
    VideoHLS::autoConvert 9388 to MP4 ...
    convertM3U8ToMP4: M3U8 file not found for video ID 9388.

So the DB says it should have an HLS playlist, but the .m3u8 isn’t there where AVideo expects it.

That usually means one of:

storage path changed (YPTStorage / NFS / symlink / mount)

files were moved/cleaned but DB still references them

the video was imported as “HLS” but only MP4 exists (or vice versa)
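
A quick way to narrow down which of these cases applies is to check, for each video directory the DB marks as HLS, whether any `.m3u8` file exists on disk. The sketch below assumes you already have that list of directories (for example, exported from the database). The function name is hypothetical and nothing here queries AVideo directly.

```python
from pathlib import Path


def dirs_missing_playlist(video_dirs):
    """Return the directories from `video_dirs` that contain no .m3u8 file.

    A directory that does not exist at all is also reported, since that
    usually points at a changed mount or storage path.
    """
    missing = []
    for d in video_dirs:
        path = Path(d)
        if not path.is_dir() or not any(path.rglob("*.m3u8")):
            missing.append(str(d))
    return missing
```

If most HLS directories show up as missing, a path or mount change is the likelier cause. If only a handful do, moved or cleaned files are more plausible.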

  6. “DiskUploadQuota denied user 0” is noise but tells you requests are anonymous
    userCanWatchVideoWithAds: DENIED by plugin (DiskUploadQuota) for user 0 and video 799

“user 0” is guest/anonymous. Not an error; it’s a plugin rule blocking some behavior for guests.

  7. That giant “warning: … git … permission denied …” inside your SQL is coming from user search text

You have getAllVideos queries where the search term is literally a pasted log/terminal output:

t.name LIKE '%warning: in the working copy of logs/... fatal: unable to process path temp/... permission denied ...%'

That means a bot/user requested a search query containing that garbage string, and AVideo is dutifully searching it → expensive query.

Action: add a sanity filter on search length and allowed characters, and rate-limit search for bots. This one alone can wreck the DB.
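
A sanity filter of this kind could look roughly like the following Python sketch. The 100-character limit, the allowed-character set, and the function name are assumptions chosen for illustration. The right values depend on what real video titles on the site look like.

```python
import re

MAX_SEARCH_LENGTH = 100  # hypothetical limit
# Letters, digits, underscore, whitespace, hyphen, apostrophe, and dot.
# \w is Unicode-aware in Python 3, so non-ASCII titles still match.
_ALLOWED = re.compile(r"[\w\s\-'.]+")


def sanitize_search(term):
    """Return a cleaned search term, or None if the request should be rejected."""
    if term is None:
        return None
    term = " ".join(term.split())  # trim and collapse runs of whitespace
    if not term or len(term) > MAX_SEARCH_LENGTH:
        return None
    if not _ALLOWED.fullmatch(term):
        return None
    return term
```

Under these rules the pasted git warning from the log would be rejected before reaching the database, because it contains `:` and `/`, which are outside the assumed allowed set.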

What matters most (the “so what”)

Your cron Scheduler is running heavy background tasks (Live stats rebuild, Cache minute task, MP3 + transcription processing, HLS auto-convert checks).

At the same time, bots (bingbot + others) are crawling deep pages and triggering slow SQL repeatedly.

You also have at least one storage inconsistency (M3U8 not found) which screams “path/mount/storage mapping problem” — consistent with your earlier worry about NFS/storage changes.

If you want, paste:

the lines around the first convertVideoFileWithFFMPEGIsLockedInfo where it actually tries to run ffmpeg (or fails), and

the output of ls -lah /var/www/html/AVideo/videos/video_211019221607_v55ec/ for that HLS video (9388 path shown),

and I’ll tell you whether it’s a storage-path issue or a stuck-job issue.
