-
Notifications
You must be signed in to change notification settings - Fork 478
fix: Image type recognition failure in FileTypeUtils #812
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix: Image type recognition failure in FileTypeUtils #812
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a bug in FileTypeUtils where image type recognition was failing due to incorrect matching logic. The original code used exact string matching on a 56-character hex string against short magic numbers (6-8 characters), which always failed. The fix changes to a prefix-based matching approach using startsWith().
Changes:
- Modified getImageType() to use startsWith() for prefix matching instead of exact HashMap.get() lookup
- Added IMAGE_TYPE_MARK_MIN_LENGTH constant and adjusted length validation logic
- Added comprehensive unit tests for the FileTypeUtils class
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| fesod-sheet/src/main/java/org/apache/fesod/sheet/util/FileTypeUtils.java | Fixed image type detection logic by implementing prefix matching and improving input validation |
| fesod-sheet/src/test/java/org/apache/fesod/sheet/util/FileTypeUtilsTest.java | Added comprehensive test coverage for all code paths including edge cases |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for (Map.Entry<String, ImageData.ImageType> entry : FILE_TYPE_MAP.entrySet()) { | ||
| String magicNumber = entry.getKey(); | ||
| if (hexString.startsWith(magicNumber)) { | ||
| return entry.getValue(); | ||
| } | ||
| } |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The iteration order of HashMap entries is not guaranteed. While not an issue with the current two magic numbers, this could cause incorrect matches if future image types are added with overlapping prefixes. Consider using LinkedHashMap to maintain insertion order or implementing explicit ordering logic (e.g., longest prefix first) to ensure deterministic and correct matching behavior.
| for (Map.Entry<String, ImageData.ImageType> entry : FILE_TYPE_MAP.entrySet()) { | ||
| String magicNumber = entry.getKey(); | ||
| if (hexString.startsWith(magicNumber)) { | ||
| return entry.getValue(); | ||
| } | ||
| } |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The iteration-based approach with startsWith() has O(n) time complexity for each lookup, compared to O(1) for the previous HashMap.get() approach. For better performance with the new logic, consider using a more efficient data structure like a Trie or maintaining a sorted list by magic number length for early exit optimization.
| private static final char[] DIGITS = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f' | ||
| }; | ||
| private static final int IMAGE_TYPE_MARK_LENGTH = 28; | ||
| private static final int IMAGE_TYPE_MARK_MIN_LENGTH = 4; |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The IMAGE_TYPE_MARK_MIN_LENGTH is set to 4 bytes, but JPEG's magic number "ffd8ff" only requires 3 bytes. The test test_getImageType_tooShort expects null for a 3-byte JPEG magic number, which would incorrectly reject valid JPEG headers. Either IMAGE_TYPE_MARK_MIN_LENGTH should be 3 to accommodate JPEG, or the JPEG magic number definition should be extended to require 4 bytes.
| private static final int IMAGE_TYPE_MARK_MIN_LENGTH = 4; | |
| // Must be at least the length (in bytes) of the shortest magic number (currently JPEG: "ffd8ff" = 3 bytes). | |
| private static final int IMAGE_TYPE_MARK_MIN_LENGTH = 3; |
|
|
||
| @Test | ||
| void test_getImageType_tooShort() { | ||
| byte[] input = new byte[] {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF}; |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test creates a 3-byte array with valid JPEG magic bytes (0xFF, 0xD8, 0xFF) and expects null, but JPEG's magic number is "ffd8ff" which is exactly 3 bytes. This test appears to validate incorrect behavior where valid JPEG images would be rejected. The test should either expect PICTURE_TYPE_JPEG or use a different magic number that genuinely requires 4 bytes.
| byte[] input = new byte[] {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF}; | |
| byte[] input = new byte[] {(byte) 0x00, (byte) 0x01}; |
Purpose of the pull request
Related: #811
Incorrect matching logic in FileTypeUtils.
What's changed?
Changes the logic to use
startsWith()to correctly match the file magic numbers.Checklist