
Sharp predict detects the same file twice #44

@nabagaca

System Information

  • OS: Windows 11
  • Python: 3.13

Problem Description

When I run sharp predict, it appears to detect and process the same image twice. Below is a sample of the output demonstrating this:

sharp predict -i input_images -o output_splats
| INFO | Processing 2 valid image files.
| INFO | Using device cuda
| INFO | No checkpoint provided. Downloading default model from https://ml-site.cdn-apple.com/models/sharp/sharp_2572gikvuh.pt
| INFO | Using preset ViT dinov2l16_384.
| INFO | Using preset ViT dinov2l16_384.
| INFO | Processing input_images\PXL_20250803_095635868.MP.jpg
| INFO | Running preprocessing.
| INFO | Running inference.
| INFO | Running postprocessing.
| INFO | Saving 3DGS to output_splats
| INFO | Processing input_images\PXL_20250803_095635868.MP.jpg
| INFO | Running preprocessing.
| INFO | Running inference.
| INFO | Running postprocessing.
| INFO | Saving 3DGS to output_splats

I have removed the timestamps from the logs for brevity, but the first "Processing" line is timestamped 20:11:03,618 and the second 2025-12-24 20:11:10,267. That gap of roughly 7 seconds leads me to believe that sharp is reprocessing the same image, rather than the logs or output simply being duplicated.
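One way to check whether the duplication is in the collected file list rather than in the logging is to count how often each resolved path appears. The helper below is a minimal sketch; `find_duplicate_paths` is a hypothetical name and not part of sharp.

```python
from collections import Counter
from pathlib import Path


def find_duplicate_paths(paths):
    """Return the resolved paths that occur more than once in `paths`."""
    # Resolve first so that differently spelled references to the same
    # location (for example relative vs. absolute) are counted together.
    counts = Counter(Path(p).resolve() for p in paths)
    return sorted(p for p, n in counts.items() if n > 1)
```

Calling it on the list that `predict_cli` builds (the `image_paths` variable in the diff) should return an empty list if every file was collected only once.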

I threw OpenAI Codex at the issue, and it came up with a patch that, as far as I can tell, has fixed the issue for me. It seems to solve it by walking the input directory once, matching file extensions case-insensitively, and keeping a set of resolved paths so that each file is added only once.

diff --git a/src/sharp/cli/predict.py b/src/sharp/cli/predict.py
index 8914bb5..bab30a8 100644
--- a/src/sharp/cli/predict.py
+++ b/src/sharp/cli/predict.py
@@ -84,15 +84,26 @@ def predict_cli(
     """Predict Gaussians from input images."""
     logging_utils.configure(logging.DEBUG if verbose else logging.INFO)
 
-    extensions = io.get_supported_image_extensions()
+    extensions = {ext.lower() for ext in io.get_supported_image_extensions()}
 
     image_paths = []
     if input_path.is_file():
-        if input_path.suffix in extensions:
+        if input_path.suffix.lower() in extensions:
             image_paths = [input_path]
     else:
-        for ext in extensions:
-            image_paths.extend(list(input_path.glob(f"**/*{ext}")))
+        seen = set()
+        for candidate_path in input_path.rglob("*"):
+            if not candidate_path.is_file():
+                continue
+            if candidate_path.suffix.lower() not in extensions:
+                continue
+
+            resolved = candidate_path.resolve()
+            if resolved in seen:
+                continue
+
+            seen.add(resolved)
+            image_paths.append(candidate_path)
 
     if len(image_paths) == 0:
         LOGGER.info("No valid images found. Input was %s.", input_path)
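For reference, the collection logic from the patch can be sketched as a standalone function, which makes it easier to test outside the CLI. `collect_image_paths` is a hypothetical name, and the extension list is passed in as a parameter rather than taken from `io.get_supported_image_extensions()`. One plausible cause this would cover, though it is an assumption I have not confirmed, is an extension list containing both lower- and upper-case variants, so that each glob matches the same file on a case-insensitive filesystem.

```python
from pathlib import Path


def collect_image_paths(input_path, extensions):
    """Collect each matching image file under `input_path` exactly once."""
    # Normalise extensions so ".JPG" and ".jpg" count as the same entry.
    wanted = {ext.lower() for ext in extensions}
    input_path = Path(input_path)

    if input_path.is_file():
        return [input_path] if input_path.suffix.lower() in wanted else []

    seen = set()
    image_paths = []
    # Walk the tree once instead of globbing once per extension.
    # Sorting is an addition here (not in the patch) for a stable order.
    for candidate in sorted(input_path.rglob("*")):
        if not candidate.is_file() or candidate.suffix.lower() not in wanted:
            continue
        resolved = candidate.resolve()
        if resolved in seen:
            continue
        seen.add(resolved)
        image_paths.append(candidate)
    return image_paths
```

For example, `collect_image_paths("input_images", [".jpg", ".JPG"])` should return each JPEG under `input_images` once, regardless of how many case variants of the extension are listed.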

A sample of the output after this patch:

sharp predict -i input_images -o output_splats
| INFO | Processing 1 valid image files.
| INFO | Using device cuda
| INFO | No checkpoint provided. Downloading default model from https://ml-site.cdn-apple.com/models/sharp/sharp_2572gikvuh.pt
| INFO | Using preset ViT dinov2l16_384.
| INFO | Using preset ViT dinov2l16_384.
| INFO | Processing input_images\PXL_20250803_095635868.MP.jpg
| INFO | Running preprocessing.
| INFO | Running inference.
| INFO | Running postprocessing.
| INFO | Saving 3DGS to output_splats

As you can see, it now correctly detects and processes the image only once.
