
@mmathieum mmathieum commented Nov 20, 2025


Strings count:

  • exo 🚆 : 64
  • STM 🚌 : 3,089

@mmathieum
Member Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a string indexing mechanism to optimize storage, which should reduce file sizes and memory consumption. The implementation is centered around a new MStrings manager for string-to-ID mapping, and the data classes are updated to use these IDs. The changes are well-implemented, but I found a critical copy-paste error in MGenerator.java that would cause database insertion to fail. With that fix, this is a solid improvement.

@mmathieum mmathieum marked this pull request as ready for review November 24, 2025 18:47
@mmathieum mmathieum requested a review from Copilot November 24, 2025 18:55
Contributor

Copilot AI left a comment

Pull request overview

This PR implements string indexing to optimize storage by replacing frequently used strings with integer references. It introduces a new MStrings object that maintains a bidirectional mapping between strings and integer IDs, so each distinct string is stored once and referenced by ID everywhere else. The author's counts above report 64 indexed strings for exo and 3,089 for STM.

Key changes:

  • New MStrings and MString classes to manage string-to-integer mappings
  • Updated data classes to convert strings to IDs via .toStringIds() extension function
  • Migration from Constants.COLUMN_SEPARATOR_ to SQLUtils.COLUMN_SEPARATOR for consistency
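The indexing idea summarized above can be sketched in a few lines. This is a minimal illustration, not the PR's MStrings implementation: the class name `StringIndex` and its methods `idOf`, `stringOf`, and `count` are hypothetical, and the dense zero-based ID scheme is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical string index; illustrates the idea only, not the PR's MStrings. */
final class StringIndex {
    // Forward map: string -> ID.
    private final Map<String, Integer> idsByString = new HashMap<>();
    // Reverse map: the list position is the ID.
    private final List<String> stringsById = new ArrayList<>();

    /** Returns the existing ID for the value, or assigns the next free one. */
    synchronized int idOf(String value) {
        Integer existing = idsByString.get(value);
        if (existing != null) {
            return existing;
        }
        int id = stringsById.size(); // assumed scheme: dense IDs 0, 1, 2, ...
        stringsById.add(value);
        idsByString.put(value, id);
        return id;
    }

    /** Reverse lookup; throws IndexOutOfBoundsException for an unknown ID. */
    synchronized String stringOf(int id) {
        return stringsById.get(id);
    }

    /** Number of distinct strings indexed so far. */
    synchronized int count() {
        return stringsById.size();
    }
}
```

A data class would then store `idOf(name)` instead of the name itself, and a repeated name costs one integer per occurrence rather than one full string. The `synchronized` methods are one simple way to get thread safety; a concurrent map would be another.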

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| MStrings.kt | New singleton object managing bidirectional string-to-ID mappings with thread-safe operations |
| MString.kt | New data class representing a string with its integer ID and serialization logic |
| MStop.kt | Updated to index stop names and use consistent column separator |
| MRoute.kt | Updated to index route short/long names and use consistent column separator |
| MSchedule.kt | Updated to index headsign values, replace Constants.EMPTY with .orEmpty(), and use consistent separator |
| MDirection.kt | Updated to index headsign values, introduce HEADSIGN_DEFAULT_VALUE constant |
| MServiceIds.kt | Added count() method for consistency with MStrings |
| MServiceId.kt | Updated to use unquotesUnescape() and consistent separator |
| MServiceDate.kt | Updated to use consistent column separator |
| MFrequency.kt | Updated to use consistent column separator |
| MDirectionStop.kt | Updated to use consistent column separator |
| MReader.kt | Added loadStrings() method to load persisted string mappings |
| MGenerator.java | Added dumpStrings() method and logging for string count |
| GSpec.java | Replaced separator usage with string literals for toString() |
| SQLUtils.kt | Exposed COLUMN_SEPARATOR constant |
| DumpDbUtils.kt | Added string table creation/dropping in database setup |
| DefaultAgencyTools.java | Added loading of last strings on startup |
| Constants.java | Removed now-unused separator constants |
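The table mentions both dumping the string mappings and loading the last ones on startup, which suggests that IDs are meant to stay stable between runs. The round trip might look roughly like the sketch below. Everything here is an assumption made for illustration: the class `StringsFile`, the one-line-per-entry `id,string` layout, and the comma separator are not taken from the PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical persistence helper; the real file layout is not shown in the PR discussion. */
final class StringsFile {
    static final String SEPARATOR = ","; // assumed column separator

    /** Writes one "id,string" line per mapping, in map iteration order. */
    static List<String> toLines(Map<Integer, String> stringsById) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<Integer, String> entry : stringsById.entrySet()) {
            lines.add(entry.getKey() + SEPARATOR + entry.getValue());
        }
        return lines;
    }

    /** Reads the lines back; a line without a separator is rejected with an exception. */
    static Map<Integer, String> fromLines(List<String> lines) {
        Map<Integer, String> stringsById = new LinkedHashMap<>();
        for (String line : lines) {
            // Split on the first separator only, so the string may itself contain one.
            String[] parts = line.split(SEPARATOR, 2);
            stringsById.put(Integer.parseInt(parts[0]), parts[1]);
        }
        return stringsById;
    }
}
```

Reloading the previous mappings before indexing new data lets a string that already had an ID keep it. This sketch does not escape line breaks inside strings, which a real format would need to handle.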

@mmathieum
Member Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a string indexing mechanism to optimize storage, which is a great improvement. The implementation in MStrings and its usage across data classes are well done. I've found a couple of areas for improvement:

  • One file uses a hardcoded separator instead of the new constant.
  • The new dumpStrings method could use a PreparedStatement, for both security and performance, and could manage its resources better.
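The second suggestion can be illustrated with standard JDBC. This is a sketch under stated assumptions, not the code in MGenerator.java: the class `StringsDumper`, the table name `strings`, and its columns `id` and `string` are hypothetical. Only the `java.sql` calls are real API.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

/** Hypothetical helper showing a parameterized, batched insert. */
final class StringsDumper {
    // Assumed table and column names, chosen for illustration only.
    static final String INSERT_SQL = "INSERT INTO strings (id, string) VALUES (?, ?)";

    /** Inserts every mapping in one batch and returns the number of rows queued. */
    static int dumpStrings(Connection connection, Map<Integer, String> stringsById) throws SQLException {
        int rows = 0;
        // try-with-resources closes the statement even if an insert fails.
        try (PreparedStatement statement = connection.prepareStatement(INSERT_SQL)) {
            for (Map.Entry<Integer, String> entry : stringsById.entrySet()) {
                // Bound parameters need no manual quoting, so a value such as "O'Brien" is safe.
                statement.setInt(1, entry.getKey());
                statement.setString(2, entry.getValue());
                statement.addBatch();
                rows++;
            }
            statement.executeBatch();
        }
        return rows;
    }
}
```

Binding values as parameters avoids building SQL by string concatenation, and reusing one statement for the whole batch lets the driver parse the SQL once. Whether to wrap the batch in an explicit transaction is left to the caller in this sketch.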

@mmathieum mmathieum requested a review from Copilot November 24, 2025 19:15
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.


@mmathieum mmathieum merged commit d202476 into master Nov 24, 2025
4 checks passed
@mmathieum mmathieum deleted the mm/strings branch November 24, 2025 20:16