@mmathieum mmathieum commented Nov 20, 2025

@mmathieum
Member Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a string indexing mechanism to optimize database storage by replacing repeated strings with integer IDs. A new T_STRINGS table and a GTFSStringsUtils helper class are added to manage this. The changes are well-implemented, using a feature flag for a controlled rollout. My review includes a couple of suggestions to improve code clarity and maintainability by using modern language features and removing redundancy.
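As a rough illustration of the idea summarized above (replacing repeated strings with integer IDs), here is a minimal in-memory sketch. It is not the PR's implementation: the class `StringIndexSketch` and its methods are hypothetical names, and the real change stores the mapping in a database table rather than in Java collections.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: interns repeated strings as integer IDs. */
final class StringIndexSketch {

    // Map from each distinct string to the ID assigned to it.
    private final Map<String, Integer> idsByString = new LinkedHashMap<>();
    // Reverse lookup: the list position is the ID.
    private final List<String> stringsById = new ArrayList<>();

    /** Returns the ID for the string, assigning the next free ID on first sight. */
    int idOf(String value) {
        Integer existing = idsByString.get(value);
        if (existing != null) {
            return existing;
        }
        int id = stringsById.size();
        idsByString.put(value, id);
        stringsById.add(value);
        return id;
    }

    /** Resolves an ID back to its string, or null if the ID is unknown. */
    String stringOf(int id) {
        return (id >= 0 && id < stringsById.size()) ? stringsById.get(id) : null;
    }

    /** Number of distinct strings stored so far. */
    int distinctCount() {
        return stringsById.size();
    }
}
```

With this shape, a value that appears on thousands of rows is stored once, and each row only carries a small integer.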

@mmathieum mmathieum marked this pull request as ready for review November 24, 2025 18:47
@mmathieum mmathieum requested a review from Copilot November 24, 2025 18:55

Copilot AI left a comment

Pull request overview

This PR implements an "Index Strings" feature that optimizes storage by replacing repeated string values with integer identifiers. The changes introduce a new STRINGS table and supporting infrastructure to handle string indexing across GTFS data tables.

Key Changes

  • Added GTFSStringsUtils utility class to manage string indexing operations including database loading and string replacement
  • Integrated string replacement logic into database initialization and schedule timestamp processing
  • Refactored return types from HashSet to Set interface for better flexibility
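The last point, returning `Set` rather than `HashSet`, can be sketched as follows. The method name `loadServiceIds` and its contents are hypothetical and only illustrate why the interface type leaves the implementation free to change without touching callers.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: a method that exposes Set instead of HashSet. */
final class SetReturnTypeSketch {

    // Because the declared return type is Set, the method may pick any
    // implementation (hashed, sorted, read-only wrapper) without breaking callers.
    static Set<String> loadServiceIds(boolean sorted) {
        Set<String> ids = sorted ? new TreeSet<>() : new HashSet<>();
        ids.add("weekday");
        ids.add("weekend");
        return Collections.unmodifiableSet(ids);
    }
}
```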

Reviewed changes

Copilot reviewed 7 out of 10 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| GTFSStringsUtils.kt | New utility class providing string indexing operations including database queries, file parsing, and string replacement functionality |
| GTFSStatusProvider.java | Updated to use Set interface and integrated string replacement in schedule loading when feature flag is enabled |
| GTFSScheduleTimestampsProvider.java | Changed from HashSet to Set interface for consistency |
| GTFSRDSProvider.java | Renamed constant from TAG to LOG_TAG for consistency and made cursor final |
| GTFSProviderDbHelper.java | Major changes to support strings table including new initialization logic, string file reading, and integration of string replacement during table initialization |
| GTFSPOIProvider.java | Renamed constant from TAG to LOG_TAG for consistency |
| Schedule.java | Added discouraged accessor methods for direct headsign value manipulation |

@mmathieum
Member Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a string indexing mechanism to optimize data storage, which is a solid improvement. The changes involve creating a new strings table in the database, pre-processing data files to replace strings with IDs during DB creation, and resolving these IDs back to strings at runtime. The implementation is well-structured, with a new GTFSStringsUtils class for handling string operations and modifications to the GTFSProviderDbHelper for database setup. I've found one area for improvement regarding error handling in the new utility class to make it more robust against malformed data files. Overall, this is a good step towards optimizing the application's data layer.
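One way to make file parsing tolerant of malformed input is sketched below. The line format `id,text` is an assumption made for illustration, since the actual strings file format is not shown in this conversation, and `StringsFileParserSketch` is a hypothetical name rather than part of the PR.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: tolerant parsing of an assumed "id,text" strings file. */
final class StringsFileParserSketch {

    /**
     * Parses lines of the assumed form "id,text" into an ID-to-string map.
     * Malformed lines are skipped instead of aborting the whole load.
     */
    static Map<Integer, String> parse(List<String> lines) {
        Map<Integer, String> stringsById = new HashMap<>();
        for (String line : lines) {
            if (line == null || line.trim().isEmpty()) {
                continue; // blank line
            }
            int comma = line.indexOf(',');
            if (comma <= 0) {
                continue; // no separator, or nothing before it
            }
            try {
                int id = Integer.parseInt(line.substring(0, comma).trim());
                // Everything after the first comma is the text, so the text
                // itself may contain commas.
                stringsById.put(id, line.substring(comma + 1));
            } catch (NumberFormatException e) {
                // Non-numeric ID: skip this line and keep going.
            }
        }
        return stringsById;
    }
}
```

Whether a bad line should be skipped, logged, or treated as fatal is a design choice; skipping is used here only to keep the sketch short.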

@mmathieum
Member Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces an optimization to reduce the database size by indexing repeating strings, which is a great improvement. The implementation is mostly solid, with new Kotlin utility classes for handling string lookups and database initialization.

My review focuses on a few key areas:

  • A significant performance issue in GTFSStatusProvider where database queries are made inside a loop.
  • Several opportunities to improve code readability and maintainability in the new Kotlin files and the database helper.
  • A minor bug in a log message.

I've provided detailed comments and suggestions for each of these points.
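The first point, queries issued inside a loop, can be illustrated with a small sketch that contrasts per-item lookups with a single bulk load. `FakeStringsTable` is a hypothetical in-memory stand-in that merely counts calls; it is not the provider's actual code or an Android database API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: per-item queries versus one batched load. */
final class BatchLookupSketch {

    /** In-memory stand-in for a strings table; counts how often it is queried. */
    static final class FakeStringsTable {
        private final Map<Integer, String> rows = new HashMap<>();
        int queryCount = 0;

        void insert(int id, String value) {
            rows.put(id, value);
        }

        /** Simulates one query that returns a single row. */
        String queryOne(int id) {
            queryCount++;
            return rows.get(id);
        }

        /** Simulates one query that returns every row. */
        Map<Integer, String> queryAll() {
            queryCount++;
            return new HashMap<>(rows);
        }
    }

    /** Per-item lookup: issues one query for every ID in the list. */
    static List<String> resolveInLoop(FakeStringsTable table, List<Integer> ids) {
        List<String> out = new ArrayList<>();
        for (int id : ids) {
            out.add(table.queryOne(id));
        }
        return out;
    }

    /** Batched lookup: loads the table once, then resolves from memory. */
    static List<String> resolveBatched(FakeStringsTable table, List<Integer> ids) {
        Map<Integer, String> cache = table.queryAll();
        List<String> out = new ArrayList<>();
        for (int id : ids) {
            out.add(cache.get(id));
        }
        return out;
    }
}
```

Both methods return the same strings; the difference is that the looped version scales its query count with the number of IDs, while the batched version trades that for holding the table in memory.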

@mmathieum mmathieum merged commit 310f151 into master Nov 24, 2025
4 checks passed
@mmathieum mmathieum deleted the mm/strings branch November 24, 2025 20:16