@6-dehan 6-dehan commented Dec 30, 2025

The func is used to verify screen content by
using Google Gemini API

ID: 4900
Signed-off-by: Dehan Meng <demeng@redhat.com>

Summary by CodeRabbit

Release Notes

  • New Features
    • Integrated Google Gemini AI for intelligent screen verification and comprehensive image analysis
    • Supports flexible AI model selection and configurable image processing parameters
    • Includes optional failed image capture and logging to results directory for debugging
    • Gracefully handles missing or unavailable AI dependencies with appropriate warnings
    • Added supporting dependencies for AI-powered analysis features

coderabbitai bot commented Dec 30, 2025

Walkthrough

This pull request adds Google Generative AI (Gemini) integration to the project. Three dependencies are introduced: google-generativeai for API access, Pillow for image processing, and importlib-metadata for Python versions below 3.10. A compatibility layer is added for importlib.metadata.packages_distributions on older Python versions. A new public function verify_screen_with_gemini() is introduced in virttest/ppm_utils.py that validates Gemini API availability, loads and converts PPM images to JPEG format, iterates through candidate Gemini models with retry logic, handles API failures, and optionally saves failed images with timestamps.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks

✅ Passed checks (3 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title accurately reflects the main change: adding a new function 'verify_screen_with_gemini' to ppm_utils.py. It is concise and specific about the primary contribution.
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
virttest/ppm_utils.py (1)

1-211: Fix formatting issues flagged by pipeline.

The pre-commit hooks detected formatting issues (trailing whitespace, isort, black). These must be fixed before the PR can be merged.

Run the following commands locally to fix these issues:

#!/bin/bash
# Fix formatting issues
pip install pre-commit
pre-commit run --all-files

After running these commands, commit the changes and push.

🧹 Nitpick comments (6)
virttest/ppm_utils.py (6)

74-86: Document exceptions in docstring.

The function docstring should document the exceptions that can be raised (ImportError, ValueError).

🔎 Suggested addition to docstring
     :param resize_max_dim: Max dimension to resize image to (maintains aspect ratio).
                            Set to None to disable resizing.
     :return: The text response from Gemini (stripped).
+    :raises ImportError: If google-generativeai or Pillow is not installed.
+    :raises ValueError: If Gemini API key is not provided.
+    :raises Exception: If all Gemini models fail or other API errors occur.
     """

119-123: Add explicit seek(0) for BytesIO robustness.

After writing to img_byte_arr, the file position is at the end. While Image.open likely handles this, explicitly seeking to the start improves clarity and robustness.

🔎 Suggested fix
             img_byte_arr = io.BytesIO()
             img.convert("RGB").save(img_byte_arr, format="JPEG", quality=85)
+            img_byte_arr.seek(0)
             img_jpeg = Image.open(img_byte_arr)
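The stream-position behaviour behind this suggestion can be checked with the standard library alone. The sketch below uses a plain bytes payload as a stand-in for the JPEG data that img.save() would write, so it runs without Pillow.

```python
import io

buf = io.BytesIO()
buf.write(b"example image bytes")  # stand-in for img.save(buf, format="JPEG")

end_position = buf.tell()  # the position now sits at the end of the data
unread = buf.read()        # reading from the end yields no bytes

buf.seek(0)                # rewind before handing the buffer to a reader
data = buf.read()          # the full payload is available again

# getvalue() returns the whole buffer regardless of the current position,
# which is why writing the debug image from getvalue() does not need a rewind.
whole = buf.getvalue()
```

Any consumer that reads from the current position benefits from the explicit seek(0); consumers that rewind on their own are unaffected by it.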

158-164: Use bare raise statements within except blocks.

When re-raising the same exception within an except block, use a bare raise statement instead of raise e to preserve the original traceback.

🔎 Suggested fix
                         except Exception as e:
                             if "404" in str(e) or "not found" in str(e).lower():
                                 # Model not found, break inner retry to try next model
-                                raise e 
+                                raise
                             if attempt == max_retries - 1:
-                                raise e
+                                raise
                             LOG.warning("Gemini API call failed (attempt %d/%d) for model %s: %s. Retrying...", attempt + 1, max_retries, model_candidate, e)
                             time.sleep(2)

Based on static analysis hints (TRY201).
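As a standalone illustration of the pattern, here is a minimal retry helper that uses bare raise in both places. ModelNotFoundError and call_with_retries are hypothetical names for this sketch; the PR itself detects the not-found case by inspecting the exception message rather than by exception type.

```python
import time


class ModelNotFoundError(Exception):
    """Hypothetical error standing in for a 404 / 'not found' API failure."""


def call_with_retries(func, max_retries=3, delay=0.0):
    """Call func(), retrying on transient errors up to max_retries attempts."""
    for attempt in range(max_retries):
        try:
            return func()
        except ModelNotFoundError:
            # Retrying cannot help here; propagate unchanged so the caller
            # can move on to the next candidate model.
            raise
        except Exception:
            if attempt == max_retries - 1:
                # Out of attempts: re-raise the active exception as-is.
                raise
            time.sleep(delay)
```

The caller would wrap each candidate model in its own call_with_retries invocation and catch ModelNotFoundError to continue with the next candidate.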


175-183: Use logging.exception for better error diagnostics.

When logging errors from exception handlers, use logging.exception instead of logging.error to include the full traceback, which aids debugging.

🔎 Suggested fix
                 try:
                     for m in genai.list_models():
                         LOG.info("Available model: %s (methods: %s)", m.name, m.supported_generation_methods)
                 except Exception as list_e:
-                    LOG.error("Failed to list models: %s", list_e)
+                    LOG.exception("Failed to list models: %s", list_e)

Based on static analysis hints (TRY400).
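The difference between the two calls can be seen in a small self-contained sketch. The logger name gemini_demo and the in-memory stream are assumptions made only so the output can be inspected; they are not part of the PR.

```python
import io
import logging

stream = io.StringIO()
log = logging.getLogger("gemini_demo")  # hypothetical logger name
log.addHandler(logging.StreamHandler(stream))
log.setLevel(logging.INFO)
log.propagate = False  # keep the demo output out of the root logger

try:
    raise RuntimeError("boom")
except RuntimeError as err:
    # error() records only the formatted message.
    log.error("Failed to list models: %s", err)
    error_output = stream.getvalue()
    # exception() logs at ERROR level and appends the active traceback.
    log.exception("Failed to list models")
    full_output = stream.getvalue()
```

Because logging.exception already attaches the exception details, passing the exception object as a format argument is optional.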


202-203: Use logging.exception for exception handlers.

Replace logging.error with logging.exception in exception handlers to include tracebacks for better debugging.

🔎 Suggested fix
                     except Exception as e:
-                        LOG.error("Failed to save debug image: %s", e)
+                        LOG.exception("Failed to save debug image: %s", e)
 
             return result_text
 
     except Exception as e:
-        LOG.error("Gemini API call failed: %s", e)
+        LOG.exception("Gemini API call failed: %s", e)
         # We re-raise to let the test fail with ERROR status
         raise

Based on static analysis hints (TRY400).

Also applies to: 207-208


65-211: Consider refactoring this large function into smaller functions.

The verify_screen_with_gemini function is 146 lines long and handles multiple concerns: validation, image processing, API calls with retries, and result processing. Consider extracting helper functions for:

  • Image loading and preprocessing
  • Model iteration with retry logic
  • Response processing and failure image saving

This would improve readability, testability, and maintainability.

Example structure:

def _load_and_prepare_image(image_path, resize_max_dim):
    """Load image and convert to JPEG in memory."""
    ...

def _try_gemini_models(img_jpeg, prompt, candidate_models):
    """Try multiple Gemini models with retry logic."""
    ...

def _save_failure_image(img_byte_arr, results_dir):
    """Save failed validation image with timestamp."""
    ...

def verify_screen_with_gemini(...):
    """Main function that orchestrates the workflow."""
    # Validation
    ...
    # Process image
    img_jpeg, img_byte_arr = _load_and_prepare_image(...)
    # Try models
    response = _try_gemini_models(...)
    # Process response
    ...
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 363ce48 and ce8ec63.

📒 Files selected for processing (2)
  • requirements.txt
  • virttest/ppm_utils.py
🧰 Additional context used
🪛 GitHub Actions: pre-commit
virttest/ppm_utils.py

[error] 1-1: Trailing whitespace check failed. Files were modified by this hook. Command: 'pre-commit run --show-diff-on-failure --color=always --all-files'


[error] 1-1: isort formatting check failed. Files were modified by this hook. Command: 'pre-commit run --show-diff-on-failure --color=always --all-files'


[error] 1-1: Black formatting check failed. File reformatted by this hook. Command: 'pre-commit run --show-diff-on-failure --color=always --all-files'

🪛 Ruff (0.14.10)
virttest/ppm_utils.py

88-91: Avoid specifying long messages outside the exception class

(TRY003)


97-97: Avoid specifying long messages outside the exception class

(TRY003)


108-108: Avoid specifying long messages outside the exception class

(TRY003)


161-161: Use raise without specifying exception name

Remove exception name

(TRY201)


163-163: Use raise without specifying exception name

Remove exception name

(TRY201)


170-170: Do not catch blind exception: Exception

(BLE001)


180-180: Do not catch blind exception: Exception

(BLE001)


181-181: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


183-183: Abstract raise to an inner function

(TRY301)


202-202: Do not catch blind exception: Exception

(BLE001)


203-203: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


208-208: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🔇 Additional comments (2)
virttest/ppm_utils.py (2)

52-58: LGTM: Proper optional import handling.

The try-except pattern correctly handles the optional google-generativeai dependency, setting genai = None and logging an appropriate warning.
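For reference, the pattern described here looks roughly like the sketch below. The warning text and the require_genai helper are assumptions for illustration and may differ from what the PR actually contains.

```python
import logging

LOG = logging.getLogger(__name__)

try:
    import google.generativeai as genai
except ImportError:
    # Optional dependency: keep the module importable and defer the failure
    # to the point where the Gemini feature is actually used.
    genai = None
    LOG.warning("google-generativeai is not installed; Gemini checks are unavailable")


def require_genai():
    """Return the genai module, or raise ImportError if it is missing.

    Hypothetical helper; the PR performs an equivalent check inline.
    """
    if genai is None:
        raise ImportError("google-generativeai is required for Gemini screen verification")
    return genai
```

Deferring the error this way lets the rest of ppm_utils.py keep working on hosts where the AI dependencies are not installed.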


39-50: The monkey-patch is necessary to support google-generativeai on Python < 3.10. The google-generativeai library (or its dependencies) requires packages_distributions to exist in importlib.metadata, which was only added in Python 3.10. The implementation is defensive—it checks for the attribute before patching and gracefully handles import errors—and is time-limited to versions below 3.10. No changes needed.
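A defensive shim of the kind described could look like the following sketch. It assumes the importlib-metadata backport exposes packages_distributions (consistent with the >=4.6 pin in requirements.txt); the function name ensure_packages_distributions is hypothetical, and the PR's actual code may be structured differently.

```python
import importlib.metadata
import sys


def ensure_packages_distributions():
    """Make importlib.metadata.packages_distributions available if possible.

    Returns True when the attribute exists after the call, False otherwise.
    """
    if hasattr(importlib.metadata, "packages_distributions"):
        # Python 3.10+ (or an earlier patch) already provides it.
        return True
    if sys.version_info < (3, 10):
        try:
            import importlib_metadata  # backport package from PyPI
        except ImportError:
            return False
        importlib.metadata.packages_distributions = (
            importlib_metadata.packages_distributions
        )
        return True
    return False
```

Checking for the attribute first keeps the patch a no-op on interpreters that already ship the function.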

Comment on lines +5 to +7
google-generativeai
Pillow
importlib-metadata>=4.6; python_version < '3.10'

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check latest versions of the dependencies
echo "=== Checking google-generativeai latest version ==="
pip index versions google-generativeai 2>/dev/null | head -20 || curl -s https://pypi.org/pypi/google-generativeai/json | python3 -c "import sys, json; data = json.load(sys.stdin); print('Latest version:', data['info']['version']); print('All recent versions:'); versions = sorted(data['releases'].keys(), reverse=True)[:10]; print('\n'.join(versions))"

echo ""
echo "=== Checking Pillow latest version ==="
curl -s https://pypi.org/pypi/Pillow/json | python3 -c "import sys, json; data = json.load(sys.stdin); print('Latest version:', data['info']['version']); print('All recent versions:'); versions = sorted(data['releases'].keys(), reverse=True)[:10]; print('\n'.join(versions))"

Repository: avocado-framework/avocado-vt

Length of output: 200


🌐 Web query:

latest stable version google-generativeai PyPI 2025

💡 Result:

The latest stable PyPI release of google-generativeai is version 0.8.6 — uploaded Dec 16, 2025. [1][2]

Note: the package is marked legacy/deprecated with support ended Nov 30, 2025. [1]

Sources:
[1] PyPI project page for google-generativeai.
[2] Safety DB package entry for google-generativeai.


🌐 Web query:

latest stable version Pillow PyPI 2025

💡 Result:

Latest stable Pillow on PyPI (as of Dec 30, 2025): 12.0.0 — released/uploaded Oct 15, 2025. [1]

Sources
[1] PyPI project page for Pillow (release files and upload date).


🌐 Web query:

google-generativeai deprecated EOL security November 2025

💡 Result:

Short summary — relevant deprecations / EOL dates in Nov 2025:

  • Google Cloud: the Generative AI module in the Vertex AI SDK was deprecated on June 24, 2025 and is scheduled for removal on June 24, 2026 (so not EOL in Nov 2025). [1]
  • Imagen 4 preview models are retiring on November 30, 2025 (migrate to the Imagen 4 GA models). [1][2]
  • Multiple Gemini / Gemini‑related preview models were announced for shutdown in November–December 2025 (example: several Gemini and Veo preview models listed with Nov/Dec shutdown dates). [3]
  • Google Security (Chronicle) also lists some feature/log connector removals on/around November 30, 2025 (e.g., NETSKOPE_ALERT v1 removal). If you mean security tooling EOLs, check Chronicle deprecations for exact items/dates. [4]

If you want, I can fetch the full deprecation pages and list every affected model/feature and exact shutdown dates for your account/region.


🌐 Web query:

Pillow CVE security vulnerabilities 2025

💡 Result:

Summary — Pillow CVE(s) in 2025

  • CVE-2025-48379: heap-based buffer overflow when writing large (>64 KB encoded) DDS/BCn images. Affects Pillow >= 11.2.0 and < 11.3.0; patched in Pillow 11.3.0 (published July 1, 2025). Avoid saving untrusted DDS images and upgrade to 11.3.0+ immediately. [1][2][3]

References

  • NVD entry for CVE-2025-48379. [1]
  • Rapid7 / security advisories summarizing the issue and fix (Pillow 11.3.0). [2]
  • Wiz vulnerability page / additional technical details and mitigation. [3]

🌐 Web query:

google-generativeai CVE security vulnerabilities 2025

💡 Result:

Here are notable CVEs and security bulletins (2025) related to Google’s generative-AI products (Vertex AI / Gemini) and adjacent Google AI incidents:

  • CVE-2025-11915 — Connection desynchronization (HTTP request/response smuggling) in Vertex AI model proxying; fixed/rolled out by Google (GCP-2025-059). Published/linked in Google Cloud security bulletins (published 2025-10-21; CVE entry in NVD). [1][2]

  • GCP-2025-059 (Vertex AI security bulletin) — streaming requests misrouting between recipients for certain third‑party and self‑deployed models; fixes rolled out Sept 26–28, 2025. (Google’s bulletin page lists the incident and CVE mapping). [1]

  • CVE-2025-6965 / related Google incident — media coverage reporting a patched memory‑corruption vuln (patched June 2025) where Google said an AI agent helped thwart exploitation; reported coverage (SecurityWeek). (Note: follow-up technical details limited in public reporting). [3]

  • Related research / disclosure discussions — public researcher posts and press coverage in 2025 raised issues with model input handling (prompt/ASCII smuggling and prompt‑injection risks affecting Gemini and integrated agents); Google’s position varied by report. See coverage/summaries. [4]

If you want, I can:

  • pull the full NVD/CVE entries and vendor advisories for each CVE (with direct references), or
  • compile a chronological list of all Google Cloud/Vertex AI CVEs in 2025 with NVD links.

Sources: Google Cloud Vertex AI security bulletins (GCP-2025-059) [1]; NVD CVE entry for CVE‑2025‑11915 (NIST) [2]; SecurityWeek reporting on CVE‑2025‑6965 and Google AI agent use [3]; reporting on Gemini/input‑handling research [4].


Pin dependency versions and address end-of-life package.

The new dependencies lack version specifications. Additionally, google-generativeai is marked as deprecated/legacy, with support ended November 30, 2025, and should not be used for new development. For Pillow, version 12.0.0 is the current stable release, and versions 11.3.0 and later include the fix for CVE-2025-48379 (heap-based buffer overflow when writing DDS/BCn images).

Either replace google-generativeai with an alternative maintained package, or if it must be retained, document the EOL status and migration plan. For Pillow, pin to the latest stable release:

-google-generativeai
-Pillow
+Pillow>=12.0.0,<13.0.0

(Remove google-generativeai or provide justification for using an EOL package.)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-google-generativeai
-Pillow
+Pillow>=12.0.0,<13.0.0
 importlib-metadata>=4.6; python_version < '3.10'
🤖 Prompt for AI Agents
In requirements.txt lines 5-7: the dependencies are unpinned and
`google-generativeai` is EOL/legacy; replace or remove it (or keep only with an
inline comment documenting the EOL status and a concrete migration plan to a
maintained alternative), and pin versions for the other packages to safe stable
releases (e.g. set Pillow to the latest stable release that includes the
CVE-2025-48379 fix, and pin importlib-metadata to a specific supported range) so
installs are reproducible and secure; update the file with those version
constraints and add a short comment explaining any retained EOL package and its
replacement plan.

Comment on lines +188 to +203
if save_failed_image and results_dir:
    # This logic is loose; caller should decide pass/fail, but we help debug here.
    # If the response starts with "No" (case insensitive), we treat it as suspicious.
    if result_text.lower().startswith("no"):
        try:
            timestamp = time.strftime("%Y%m%d_%H%M%S")
            fail_filename = "gemini_fail_%s.jpg" % timestamp
            fail_path = os.path.join(results_dir, fail_filename)
            if not os.path.exists(results_dir):
                os.makedirs(results_dir)
            # Save the compressed/resized version we actually sent
            with open(fail_path, "wb") as f:
                f.write(img_byte_arr.getvalue())
            LOG.info("Saved failed visual check image to: %s", fail_path)
        except Exception as e:
            LOG.error("Failed to save debug image: %s", e)

⚠️ Potential issue | 🟠 Major

Reconsider the failure detection logic.

The logic that treats responses starting with "no" as failures is too simplistic and prompt-dependent. For example:

  • Prompt: "Is this screen corrupted?" → "No" means success
  • Prompt: "Does this screen show the login page?" → "No" means failure

This heuristic will produce incorrect results depending on how the prompt is phrased.

Consider these alternatives:

  1. Remove the automatic saving logic and let the caller decide based on the response:

    result_text = response.text.strip()
    return result_text
    # Caller can then save image if needed based on their logic
  2. Add an explicit parameter for expected response or failure keywords:

    def verify_screen_with_gemini(
        ...
        failure_keywords=None,  # e.g., ["no", "incorrect", "failed"]
        ...
    ):
  3. Document the assumption clearly in the docstring that this function expects yes/no prompts where "no" indicates failure, and add a parameter to disable this behavior.

The current implementation may surprise users and lead to incorrect behavior.
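One way alternative 2 could be realised is sketched below. The helper names matches_failure and save_debug_image are hypothetical, and the prefix-matching rule is an assumption carried over from the existing startswith("no") check rather than a recommendation.

```python
import os
import time


def matches_failure(result_text, failure_keywords=None):
    """Return True if the response starts with any caller-supplied keyword.

    With no keywords the function never reports a failure, so nothing is
    saved unless the caller opts in.
    """
    if not failure_keywords:
        return False
    normalized = result_text.strip().lower()
    return any(normalized.startswith(keyword.lower()) for keyword in failure_keywords)


def save_debug_image(image_bytes, results_dir):
    """Write the bytes that were sent to the API into results_dir."""
    os.makedirs(results_dir, exist_ok=True)
    timestamp = time.strftime("%Y%m%d_%H%M%S")
    path = os.path.join(results_dir, "gemini_fail_%s.jpg" % timestamp)
    with open(path, "wb") as handle:
        handle.write(image_bytes)
    return path
```

A caller asking "Does this screen show the login page?" would pass failure_keywords=["no"], while a caller asking "Is this screen corrupted?" would pass ["yes"]. Prefix matching remains coarse (a reply beginning with "Nothing" also matches "no"), so callers needing stricter checks should parse the response themselves.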

🧰 Tools
🪛 Ruff (0.14.10)

202-202: Do not catch blind exception: Exception

(BLE001)


203-203: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🤖 Prompt for AI Agents
In virttest/ppm_utils.py around lines 188-203, the current heuristic that treats
any response starting with "no" as a failure is too brittle and
prompt-dependent; update the function to remove this hardcoded behavior and
instead (a) add a configurable parameter (e.g., failure_keywords or an explicit
boolean like auto_save_on_no with default False) so callers control what
constitutes failure, (b) only save the debug image when the caller-supplied
failure condition is met, and (c) update the docstring to document the default
behavior and the new parameter; ensure any file writes remain wrapped in the
existing try/except and keep logging of success/failure paths.
