Skip to content

Conversation

@KyriosGN0
Copy link
Contributor

@KyriosGN0 KyriosGN0 commented Oct 24, 2025

User description

Signed-off-by: AvivGuiser [email protected]

Thanks for contributing to the Docker-Selenium project!
A PR well described will help maintainers to quickly review and merge it

Before submitting your PR, please check our contributing guidelines, applied for this repository.
Avoid large PRs, help reviewers by making them as simple and short as possible.

Description

Part if #2650

Motivation and Context

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • I have read the contributing document.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

PR Type

Enhancement


Description

  • Migrate shell script to Python for better maintainability

  • Implement GraphQL endpoint polling with retry logic

  • Add capability extraction and video filename normalization

  • Maintain backward compatibility with environment variables


Diagram Walkthrough

flowchart LR
  A["video_graphQLQuery.sh<br/>Bash Script"] -->|"Replaced by"| B["video_graphQLQuery.py<br/>Python Script"]
  B --> C["Get GraphQL Endpoint"]
  B --> D["Poll Session Data"]
  B --> E["Extract Capabilities"]
  B --> F["Normalize Filename"]
  C --> G["Output: RECORD_VIDEO<br/>TEST_NAME ENDPOINT"]
  D --> G
  E --> G
  F --> G
Loading

File Walkthrough

Relevant files
Migration
video_graphQLQuery.sh
Remove original Bash script implementation                             

Video/video_graphQLQuery.sh

  • Entire bash script removed (85 lines deleted)
  • Functionality migrated to Python implementation
+0/-85   
Enhancement
video_graphQLQuery.py
Add Python implementation of GraphQL query script               

Video/video_graphQLQuery.py

  • New Python implementation with 269 lines
  • Implements GraphQL endpoint retrieval with environment variable
    fallback
  • Adds session polling with retry logic and timeout handling
  • Extracts session capabilities from JSON response
  • Normalizes video filenames with character filtering and truncation
  • Maintains full backward compatibility with bash version via
    environment variables
+269/-0 

@qodo-merge-pro
Copy link
Contributor

qodo-merge-pro bot commented Oct 24, 2025

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
- [ ] Create ticket/issue <!-- /create_ticket --create_ticket=true -->

</details></td></tr>
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
No custom compliance provided

Follow the guide to enable custom compliance check.

  • Update
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-merge-pro
Copy link
Contributor

qodo-merge-pro bot commented Oct 24, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
High-level
Eliminate temporary file for communication

Refactor the script to pass data between functions directly through memory as
arguments and return values, instead of using a temporary file. This change will
improve efficiency and robustness by removing unnecessary file I/O.

Examples:

Video/video_graphQLQuery.py [238-243]
    poll_session(graphql_endpoint, session_id, poll_interval)

    # Extract capabilities
    record_video_raw, test_name_raw, video_name_raw = extract_capabilities(
        session_id, video_cap_name, test_name_cap, video_name_cap
    )

Solution Walkthrough:

Before:

def poll_session(session_id, ...):
    # ... polls endpoint ...
    if successful_poll:
        _persist_body(session_id, body_text) # Writes to /tmp/graphQL_...json
    return response_data # This return value is ignored by the caller

def extract_capabilities(session_id, ...):
    path = f"/tmp/graphQL_{session_id}.json"
    with open(path, "r") as f:
        data = json.load(f)
    # ... extracts capabilities from data ...
    return ...

def main():
    # ...
    poll_session(session_id, ...)
    record_video, ... = extract_capabilities(session_id, ...)
    # ...

After:

def poll_session(session_id, ...):
    # ... polls endpoint ...
    if successful_poll:
        response_data = json.loads(body_text)
    # ...
    return response_data # Return the final data dictionary

def extract_capabilities(session_data, ...):
    if not session_data:
        return None, None, None
    # ... extracts capabilities directly from session_data dictionary ...
    return ...

def main():
    # ...
    session_data = poll_session(session_id, ...)
    record_video, ... = extract_capabilities(session_data, ...)
    # ...
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies an anti-pattern where a temporary file is used for communication between poll_session and extract_capabilities, proposing a much cleaner in-memory data flow that improves design, efficiency, and robustness.

Medium
General
Decouple auth header generation logic

Refactor build_basic_auth_header to return a dictionary instead of a string, and
update poll_session to use it, decoupling the two functions.

Video/video_graphQLQuery.py [54-79]

+def build_basic_auth_header() -> dict[str, str] | None:
+    username = os.getenv("SE_ROUTER_USERNAME")
+    password = os.getenv("SE_ROUTER_PASSWORD")
+    if username and password:
+        token = base64.b64encode(f"{username}:{password}".encode()).decode()
+        return {"Authorization": f"Basic {token}"}
+    return None
+
+
 def poll_session(endpoint: str, session_id: str, poll_interval: float) -> dict | None:
     """Poll the GraphQL endpoint for the session.
 
     Returns full parsed response dict if any request succeeded (HTTP 200) else None.
     Saves last successful body to /tmp/graphQL_<session_id>.json (for parity).
     """
     if not endpoint:
         return None
 
     query_obj = {
         "query": (
             f"{{ session (id: \"{session_id}\") {{ id, capabilities, startTime, uri, nodeId, nodeUri, "
             "sessionDurationMillis, slot { id, stereotype, lastStarted } }} }} "
         )
     }
     headers = {
         "Content-Type": "application/json",
     }
-    basic_auth_header = build_basic_auth_header()
-    if basic_auth_header:
-        # urllib expects header name:value separately; we split at first space after name for compatibility.
-        # Our header already includes 'Authorization: Basic <token>' so we parse.
-        name, value = basic_auth_header.split(": ", 1)
-        headers[name] = value
+    auth_header = build_basic_auth_header()
+    if auth_header:
+        headers.update(auth_header)
 
     response_data: dict | None = None
     ...

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies tight coupling and proposes a good refactoring that improves code quality, robustness, and maintainability by returning a dictionary instead of a formatted string.

Low
Learned
best practice
Replace magic numbers with env-config

Read these values from environment with sane bounds-checking to mirror
configurable behavior and avoid hard-coded magic numbers.

Video/video_graphQLQuery.py [17-18]

-MAX_TIME_SECONDS = 1
-RETRY_TIME = 3
+def _get_int_env(name: str, default: int, min_value: int = 1, max_value: int = 60) -> int:
+    try:
+        v = int(os.getenv(name, str(default)))
+        return max(min_value, min(v, max_value))
+    except Exception:
+        return default
 
+MAX_TIME_SECONDS = _get_int_env("SE_GRAPHQL_MAX_TIME", 1)
+RETRY_TIME = _get_int_env("SE_GRAPHQL_RETRY_TIME", 3)
+
  • Apply / Chat
Suggestion importance[1-10]: 6

__

Why:
Relevant best practice - Harden input handling and polling defaults; avoid magic numbers by using env-configured constants with validation.

Low
  • Update

Signed-off-by: AvivGuiser <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant