Patch/restore mdf client #473

blaiszik · 2026-01-22T19:40:27Z

No description provided.

This file was part of PR #469 but was not included in the merge, causing ModuleNotFoundError when importing foundry. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The forge DOI search can return multiple results where only one actually has the matching DOI. Previously, get_metadata_by_doi() blindly returned the first result, which often didn't have the requested DOI. Now it iterates through results to find the one with the exact DOI match, fixing test_dataframe_search_by_doi and test_dataframe_download_by_doi tests. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
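The exact-match selection described in this commit can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper name `pick_exact_doi_match` is hypothetical, and the assumption that each result stores its DOI under `dc` → `identifier` → `identifier` is made for the example only. Comparing case-insensitively is also a choice made here, since DOIs are case-insensitive.

```python
from typing import Any, Optional


def pick_exact_doi_match(results: list[dict[str, Any]], doi: str) -> Optional[dict[str, Any]]:
    """Return the first result whose DOI equals `doi`, or None if none match.

    Hypothetical helper: the nested key layout below is an assumption.
    """
    wanted = doi.strip().lower()
    for entry in results:
        # Assumed layout: entry["dc"]["identifier"]["identifier"] holds the DOI string.
        found = entry.get("dc", {}).get("identifier", {}).get("identifier", "")
        if isinstance(found, str) and found.strip().lower() == wanted:
            return entry
    # No result carried the requested DOI; the caller decides how to report that.
    return None
```

Returning `None` instead of falling back to the first result avoids the earlier failure mode, where a non-matching record was silently returned.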

The combined size of torch, tensorflow, and NVIDIA CUDA dependencies exceeded GitHub Actions runner disk space (~4GB+). These ML frameworks are now available as optional extras via pip install .[torch] or pip install .[tensorflow]. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Remove unused imports (sys, rprint, Optional, pandas, numpy)
- Fix unused exception variable
- Remove f-string without placeholders
- Split long line in MCP server description
- Add noqa comment for intentional re-export

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Update test imports to use foundry.mdf_client.MDFClient instead of mdf_forge.Forge, which is no longer a required dependency. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Move heavy ML dependencies to optional extras to reduce default install size:
- pip install foundry-ml[torch]
- pip install foundry-ml[tensorflow]
- pip install foundry-ml[huggingface]
- pip install foundry-ml[excel]
- pip install foundry-ml[examples]
- pip install foundry-ml[dev]

Update README with extras install instructions and NumPy 2.0 compatibility note.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
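Once a framework becomes an optional extra, code that needs it usually imports it lazily and tells the user which extra to install. The sketch below shows one common pattern under that assumption. `require_extra` is a hypothetical helper, not a function from foundry; only the `foundry-ml[...]` extras names come from the commit message above.

```python
import importlib
from types import ModuleType


def require_extra(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or raise an error naming the extra to install.

    Hypothetical helper for illustration; not part of the foundry package.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with an actionable message that points at the matching extra.
        raise ImportError(
            f"'{module_name}' is not installed. "
            f"Install it with: pip install foundry-ml[{extra}]"
        ) from exc
```

A call such as `torch = require_extra("torch", "torch")` would then sit inside the function that actually needs the framework, so importing the package itself stays lightweight.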

MDFClient improvements:
- Add Globus Search index ID constants (MDF_INDEX_ID, MDF_TEST_INDEX_ID)
- Add match_source_names() method with automatic version suffix stripping
- Add _has_field_filters property for elegant advanced mode detection
- Use advanced=True automatically for DOI and source_name searches (required for exact field matching in Globus Search)
- Add try/finally to ensure query state is always reset after search

Foundry search fix:
- Pass free-text query to Globus Search for server-side filtering instead of fetching 10 results and filtering client-side
- This fixes searches like f.search("Computational Band Gaps") that were failing when the target dataset wasn't in the first 10 results

Test additions:
- Add test_load_mp_band_gaps_dataset to verify DOI-based dataset loading

Re-rendered example notebooks with updated outputs.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
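Three of the MDFClient changes listed above (version suffix stripping, field-filter detection, and resetting query state in a `finally` block) can be illustrated with a small stand-in class. This is a sketch under stated assumptions, not the actual `mdf_client.py`: `TinyQueryClient` and its `backend` callable are hypothetical, the suffix pattern (`_v1`, `_v1.2`, ...) is assumed, and the field name `mdf.source_name` is used for illustration only.

```python
import re

# Assumption: version suffixes look like "_v1", "_v1.2", or "_v1.2.3" at the end of a name.
_VERSION_SUFFIX = re.compile(r"_v\d+(?:\.\d+)*$")


def strip_version_suffix(source_name: str) -> str:
    """Drop a trailing version suffix, if present, from a source name."""
    return _VERSION_SUFFIX.sub("", source_name)


class TinyQueryClient:
    """Hypothetical stand-in for a client that accumulates query state."""

    def __init__(self, backend):
        # backend: callable taking (filters, advanced) and returning a list of results.
        self._backend = backend
        self._filters = []

    @property
    def _has_field_filters(self) -> bool:
        # Field-specific filters need advanced query mode for exact matching.
        return bool(self._filters)

    def match_source_names(self, name: str) -> "TinyQueryClient":
        self._filters.append(("mdf.source_name", strip_version_suffix(name)))
        return self

    def search(self):
        try:
            return self._backend(list(self._filters), self._has_field_filters)
        finally:
            # Runs on success and on error, so stale filters never leak into the next search.
            self._filters = []
```

The `finally` clause is the important part: if the backend raises, the accumulated filters are still cleared, so a failed search cannot contaminate the one that follows.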

what-the-diff · 2026-01-22T19:42:34Z

PR Summary

Enhancements to working_with_data.ipynb
- The code has been updated to install and manage a specific version of pyarrow, a software library, more efficiently. This change also takes care of potential conflicts with other software.
- For easier understanding, the code's format was improved and supplemented with extra comments and printed outputs.
Adjustments to oqmd.ipynb
- Certain parts of the code are now run in a specific order to ensure the whole code functions properly.
- Previous output messages, which might confuse reviewers, have been removed to make the code easier to read.
Optimizations in foundry.py
- A new approach in the get_metadata_by_query process allows for quicker and more efficient data filtering on the server-side.
- The search feature received an upgrade, now allowing filtering by field-specific attributes like source_name.
Updates in mdf_client.py
- Introduced constants for identifying specific Globus Search Indexes. This addition improves the organization and readability of the code.
- Also included is a new method to filter data sets by source_name, enhancing navigation.
- Error handling and management of dataset search were improved, reducing the chance of errors and facilitating smoother searches.
Improvements to test_foundry.py
- A test was added to verify loading of a specific dataset by its DOI (or Digital Object Identifier), offering better quality control.
- General cleanup of the test code and removal of unnecessary comments provide greater clarity for anyone reviewing or using these tests.

blaiszik and others added 8 commits January 13, 2026 22:13

Restore missing mdf_client.py from design-renaissance branch

e645c82

This file was part of PR #469 but was not included in the merge, causing ModuleNotFoundError when importing foundry. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Replace mdf_forge with internal MDFClient in tests

c25d88a

Update test imports to use foundry.mdf_client.MDFClient instead of mdf_forge.Forge, which is no longer a required dependency. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Merge branch 'main' into patch/restore-mdf-client

ffbc370

blaiszik merged commit 5327b2b into main Jan 22, 2026
4 checks passed
