Patch/restore mdf client #475

blaiszik · 2026-01-23T05:15:57Z

No description provided.

This file was part of PR #469 but was not included in the merge, causing ModuleNotFoundError when importing foundry. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The forge DOI search can return multiple results where only one actually has the matching DOI. Previously, get_metadata_by_doi() blindly returned the first result, which often didn't have the requested DOI. Now it iterates through results to find the one with the exact DOI match, fixing test_dataframe_search_by_doi and test_dataframe_download_by_doi tests. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
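The exact-match selection described above could be sketched as follows. This is a minimal illustration, not the actual `get_metadata_by_doi()` code: the helper name `pick_exact_doi`, the result layout (DOI stored under `dc.identifier.identifier`), and the placeholder DOI `10.0000/other` are all assumptions.

```python
from typing import Any, Dict, List, Optional


def pick_exact_doi(results: List[Dict[str, Any]], doi: str) -> Optional[Dict[str, Any]]:
    """Return the first result whose DOI equals `doi`, or None if none match.

    Assumes (hypothetically) that each result stores its DOI at
    result["dc"]["identifier"]["identifier"].
    """
    wanted = doi.strip().lower()  # DOI names are case-insensitive
    for result in results:
        found = result.get("dc", {}).get("identifier", {}).get("identifier", "")
        if isinstance(found, str) and found.strip().lower() == wanted:
            return result
    return None


# The search may return several hits; only one carries the requested DOI.
results = [
    {"dc": {"identifier": {"identifier": "10.0000/other"}}, "name": "unrelated"},
    {"dc": {"identifier": {"identifier": "10.18126/8p6m-e135"}}, "name": "target"},
]
match = pick_exact_doi(results, "10.18126/8p6m-e135")
```

Returning `None` when nothing matches keeps the "no exact match" case distinct from "first result happens to be wrong", which is the failure mode the commit describes.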

The combined size of torch, tensorflow, and NVIDIA CUDA dependencies exceeded GitHub Actions runner disk space (~4GB+). These ML frameworks are now available as optional extras via pip install .[torch] or pip install .[tensorflow]. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
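Once the ML frameworks are optional, code that needs them typically has to guard the import. The sketch below shows one common pattern; the helper name `require_extra` is hypothetical and is not taken from the foundry codebase.

```python
import importlib
from types import ModuleType


def require_extra(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or raise an error naming the extra to install."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is not installed. "
            f"Install it with: pip install .[{extra}]"
        ) from exc
```

A caller would write something like `torch = require_extra("torch", "torch")` inside the function that needs it, so that importing the package itself never pulls in the heavy dependency.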

- Remove unused imports (sys, rprint, Optional, pandas, numpy)
- Fix unused exception variable
- Remove f-string without placeholders
- Split long line in MCP server description
- Add noqa comment for intentional re-export

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Update test imports to use foundry.mdf_client.MDFClient instead of mdf_forge.Forge, which is no longer a required dependency. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Move heavy ML dependencies to optional extras to reduce default install size:
- pip install foundry-ml[torch]
- pip install foundry-ml[tensorflow]
- pip install foundry-ml[huggingface]
- pip install foundry-ml[excel]
- pip install foundry-ml[examples]
- pip install foundry-ml[dev]

Update README with extras install instructions and NumPy 2.0 compatibility note.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

MDFClient improvements:
- Add Globus Search index ID constants (MDF_INDEX_ID, MDF_TEST_INDEX_ID)
- Add match_source_names() method with automatic version suffix stripping
- Add _has_field_filters property for elegant advanced mode detection
- Use advanced=True automatically for DOI and source_name searches (required for exact field matching in Globus Search)
- Add try/finally to ensure query state is always reset after search

Foundry search fix:
- Pass free-text query to Globus Search for server-side filtering instead of fetching 10 results and filtering client-side
- This fixes searches like f.search("Computational Band Gaps") that were failing when the target dataset wasn't in the first 10 results

Test additions:
- Add test_load_mp_band_gaps_dataset to verify DOI-based dataset loading

Re-rendered example notebooks with updated outputs.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
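Two of the MDFClient changes above (version suffix stripping and the try/finally reset) can be illustrated with a toy stand-in. This is a sketch under stated assumptions, not the real MDFClient: the `_v<number>` suffix format, the `QueryBuilder` class, and the `run_query` callback are all hypothetical.

```python
import re
from typing import Callable, List

# Assumption: version suffixes look like "_v1", "_v1.2", "_v2.0.1" at the end of a name.
_VERSION_SUFFIX = re.compile(r"_v\d+(\.\d+)*$")


def strip_version_suffix(source_name: str) -> str:
    """Drop a trailing version suffix, leaving unversioned names unchanged."""
    return _VERSION_SUFFIX.sub("", source_name)


class QueryBuilder:
    """Toy client that accumulates query terms between calls."""

    def __init__(self) -> None:
        self.terms: List[str] = []

    def match_source_names(self, *names: str) -> "QueryBuilder":
        self.terms.extend(strip_version_suffix(name) for name in names)
        return self

    def search(self, run_query: Callable[[List[str]], list]) -> list:
        try:
            return run_query(list(self.terms))
        finally:
            # Runs on success and on error, so stale terms never leak
            # into the next search.
            self.terms.clear()
```

The `finally` clause is what guarantees the reset: without it, an exception raised by the backend would leave the previous filters attached to the next query.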

Progress bar improvements:
- Enable progress bars by default during dataset downloads
- Show "Finding files" progress while discovering files on server
- Show "Downloading" progress with file count (e.g., 5/10 files)
- Add per-file progress bar for files > 1MB with speed and ETA
- Uses tqdm.auto for automatic Jupyter/terminal detection

README improvements:
- Add "Export to HuggingFace Hub" section with CLI and Python examples
- Document the huggingface extra installation
- Mention auto-generated Dataset Cards feature

Test updates:
- Update download tests to mock response.headers for content-length

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
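The per-file threshold and the content-length mocking mentioned above suggest logic along these lines. This is only a sketch: the helper names are hypothetical, and whether "1MB" means 1024 * 1024 bytes or 10**6 bytes in the real code is an assumption.

```python
from typing import Mapping, Optional

ONE_MB = 1024 * 1024  # assumption: the real threshold may be 10**6 bytes


def content_length(headers: Mapping[str, str]) -> Optional[int]:
    """Parse the Content-Length header; return None if absent or malformed."""
    # Look the key up case-insensitively, since a plain dict used as a mock
    # does not normalise header names.
    for key, value in headers.items():
        if key.lower() == "content-length":
            try:
                return int(value)
            except (TypeError, ValueError):
                return None
    return None


def wants_per_file_bar(headers: Mapping[str, str]) -> bool:
    """Show a per-file bar only when the size is known and above the threshold."""
    size = content_length(headers)
    return size is not None and size > ONE_MB
```

Because the decision depends only on a headers mapping, a test can pass a plain dict in place of `response.headers` without performing a real download.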

…zations

Features:
- Add dataset.preview(n=5) method to show actual data samples as DataFrame
- Add "Open in Colab" badges to all example notebooks (12 notebooks)

Improved error messages with actionable hints:
- get_dataset() now raises DatasetNotFoundError instead of returning None
- DownloadError includes contextual recovery hints based on error type
- Better messages for missing files, unsupported data types, failed loads

Test optimizations:
- Add pytest fixtures (scope=module) to share Foundry client across tests
- Add downloaded_dataset fixture to download once, share across 4 tests
- Use small dataset (10.18126/8p6m-e135) for all tests
- Enable HTTPS download tests to run on GitHub Actions
- Reduces test time by avoiding repeated client creation and downloads

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
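The switch from returning None to raising an error with a hint might look roughly like this. The exception name comes from the commit message, but its base class, constructor signature, hint text, and the dictionary-backed `get_dataset` below are illustrative assumptions rather than the foundry implementation.

```python
from typing import Dict, Optional


class DatasetNotFoundError(LookupError):
    """Raised when a lookup finds no dataset (illustrative stand-in)."""

    def __init__(self, query: str, hint: Optional[str] = None) -> None:
        self.query = query
        self.hint = hint
        message = f"No dataset found for '{query}'."
        if hint:
            message += f" Hint: {hint}"
        super().__init__(message)


def get_dataset(catalog: Dict[str, dict], doi: str) -> dict:
    """Return the dataset for `doi`, raising instead of returning None."""
    try:
        return catalog[doi]
    except KeyError:
        raise DatasetNotFoundError(
            doi, hint="Check the DOI for typos or try a free-text search."
        ) from None


catalog = {"10.18126/8p6m-e135": {"name": "small test dataset"}}
dataset = get_dataset(catalog, "10.18126/8p6m-e135")
```

Raising at the lookup site means a caller fails immediately with an actionable message, rather than later with an unrelated error on a `None` value.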

blaiszik and others added 10 commits January 13, 2026 22:13

Restore missing mdf_client.py from design-renaissance branch

e645c82

This file was part of PR #469 but was not included in the merge, causing ModuleNotFoundError when importing foundry. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Replace mdf_forge with internal MDFClient in tests

c25d88a

Update test imports to use foundry.mdf_client.MDFClient instead of mdf_forge.Forge, which is no longer a required dependency. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Merge branch 'main' into patch/restore-mdf-client

ffbc370
