sctg-development/latex2mathml

latex2mathml

latex2mathml converts LaTeX math equations to MathML. The crate is implemented in pure Rust, so it works in any environment where Rust works (including WebAssembly).

This crate is an extension of https://github.com/osanshouo/latex2mathml, which is 6 years old and no longer maintained.

Supported LaTeX commands

Core Elements

  • Numbers, e.g. 0, 3.14, ...
  • ASCII and Greek (and more) letters, e.g. x, \alpha, \pi, \aleph, ...
  • Symbols, e.g., \infty, \dagger, \angle, \Box, \partial, ...
  • Binary relations, e.g. =, >, <, \ll, :=, ...
  • Binary operations, e.g. +, -, *, /, \times, \otimes, ...

Functions & Operators

  • Basic LaTeX commands, e.g. \sqrt, \frac, \sin, \binom, ...
  • Parentheses and delimiters, e.g., \left\{ .. \middle| .. \right], ...
  • Integrals, e.g., \int_0^\infty, \iint, \oint, ...
  • Big operators, e.g., \sum, \prod, \bigcup_{i = 0}^\infty, ...
  • Limits and overset/underset, e.g., \lim, \overset{}{}, \overbrace{}{}, ...

Styling

  • Font styles, e.g. \mathrm, \mathbf, \bm, \mathit, \mathsf, \mathscr, \mathbb, \mathfrak, \texttt.
  • White spaces, e.g., \!, \,, \:, \;, \ , \quad, \qquad.
  • Color support with \color{colorname} and \textcolor{colorname}{text}.

Matrices & Multi-line Equations

  • Matrix environments, e.g. \begin{matrix}, \begin{pmatrix}, \begin{bmatrix}, \begin{vmatrix}.
  • Aligned equations: \begin{align} ... \end{align} with newlines \\ and alignment &.
  • Case-wise definitions: \begin{cases} ... \end{cases}.
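As an illustration of the case-wise syntax, a piecewise definition might be written as below. This is a sketch that assumes cases uses the same \\ row separator and & column separator as the matrix and align environments, and it sticks to relations from the Core Elements list:

```latex
f(x) = \begin{cases}
x  & x > 0 \\
0  & x = 0 \\
-x & x < 0
\end{cases}
```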

Advanced Features

  • Multiple subscripts and superscripts, e.g., x_{a}^{b}, T^{ij}_{kl}.
  • Feynman slash notation: \slashed{\partial}.
  • Strikethrough and phantom text: \cancel{}, \phantom{}.

See examples/equations.rs for examples. Note that all supported commands are defined in src/token.rs.

Implementation Notes

Newline & Alignment

Newline \\ and alignment & work in matrix and align environments to create multi-line equations. These features are fully integrated with the matrix rendering system.

Align Environment

The \begin{align} .. \end{align} environment (supported since version 0.2.1) creates aligned multi-line equations. Use \\ to separate lines and & to align equations at a specific point:

x + y &= 1 \\
2x - y &= 2
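To make the row and column structure concrete, the sketch below splits an align body into rows (at \\) and cells (at &) using only the standard library. The function name split_align_body is hypothetical, and this is not how the crate's parser works: a real parser has to tokenize first so that braces, escaped \& and nested environments are respected.

```rust
/// Splits the body of an align-style environment into rows and cells.
/// Hypothetical helper for illustration only; not part of latex2mathml.
fn split_align_body(body: &str) -> Vec<Vec<String>> {
    body.split(r"\\") // `\\` ends a row
        .map(|row| {
            row.split('&') // `&` starts a new aligned cell
                .map(|cell| cell.trim().to_string())
                .collect()
        })
        .collect()
}
```

Calling split_align_body on the two-line example above gives two rows of two cells each, which is the shape a MathML table with one row per line and one cell per aligned part would take. A trailing \\ produces an extra row holding a single empty cell in this naive version.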

Tensor Notation

Complex subscript/superscript combinations are supported for tensor notation:

T^{ij}_{kl}              % Tensor with superscripts then subscripts
\Gamma^{\lambda}_{\mu\nu} % Christoffel symbol

These render with appropriate MathML msubsup and similar structures for correct mathematical typesetting.
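One detail worth keeping in mind: MathML's msubsup element always lists its children as base, subscript, superscript, whatever order ^ and _ appear in the LaTeX source. The sketch below shows that ordering with a hypothetical helper named msubsup. It only concatenates already-rendered children and is not the crate's renderer, whose exact output may differ.

```rust
/// Builds a MathML `<msubsup>` element from already-rendered children.
/// Hypothetical helper for illustration only; not part of latex2mathml.
/// The child order is fixed by MathML: base, then subscript, then superscript.
fn msubsup(base: &str, sub: &str, sup: &str) -> String {
    format!("<msubsup>{}{}{}</msubsup>", base, sub, sup)
}
```

For T^{ij}_{kl}, the subscript group kl would therefore be passed before the superscript group ij, for example msubsup("<mi>T</mi>", "<mrow><mi>k</mi><mi>l</mi></mrow>", "<mrow><mi>i</mi><mi>j</mi></mrow>").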

Recent Improvements (v0.4.0)

Code Quality & Test Coverage

  • Comprehensive test suite: 600+ test cases covering edge cases, complex equations, and MathML compliance
  • Test coverage: 82.54% of source code (1456/1764 lines)
  • MathML specification compliance: Full support for MathML rendering with proper element nesting
  • Real-world formulas tested: Einstein's mass-energy equation, Schrödinger equation, Maxwell-Boltzmann distribution, and more

New Test Suites Added

  • complex_equations_rendering.rs: 55 tests for deeply nested equations and complex rendering
  • comprehensive_coverage.rs: 99 tests for edge cases and LaTeX syntax variations
  • mathml_spec_compliance.rs: 50 tests validating MathML specification compliance
  • advanced_mathml_compliance.rs: 46 tests for MathML element structure and attributes
  • error_and_attributes_coverage.rs: 74 tests for error handling and style attributes
  • newline_alignment_multiscripts.rs: 43 tests for multi-line equations and tensor notation

Parser Robustness

  • Improved error messages with clear token expectations
  • Better handling of edge cases in subscript/superscript combinations
  • Validation of environment matching and nesting

Known Limitations

The latex_to_mathml function accepts an escaped dollar sign \$, but the replace function does not. This is because replace assumes that every dollar sign marks the boundary of a LaTeX equation.
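The sketch below illustrates why that assumption conflicts with \$. It is a deliberately naive, std-only splitter with a hypothetical name (split_inline_math), not the crate's replace implementation, and it handles only inline $...$ pairs, not $$...$$.

```rust
/// Naively splits text on every `$`, treating odd-numbered pieces as math.
/// Hypothetical helper for illustration only; not part of latex2mathml.
/// Handles inline `$...$` only, not display `$$...$$`.
fn split_inline_math(text: &str) -> Result<Vec<(bool, &str)>, String> {
    let pieces: Vec<&str> = text.split('$').collect();
    // N dollar signs yield N + 1 pieces, so an even piece count means
    // an odd number of `$`, i.e. an equation that is never closed.
    if pieces.len() % 2 == 0 {
        return Err("unbalanced `$`".to_string());
    }
    Ok(pieces
        .into_iter()
        .enumerate()
        .map(|(i, piece)| (i % 2 == 1, piece)) // true = math segment
        .collect())
}
```

With this approach, a single \$ in the text leaves the delimiters unbalanced, and a pair such as "\$5 or \$6" is silently misread: the text between the two escaped dollar signs is classified as an equation.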

If a feature you need is lacking, feel free to open an issue on GitHub.

Usage

For a single LaTeX equation:

use latex2mathml::{latex_to_mathml, DisplayStyle};

let latex = r#"\erf ( x ) = \frac{ 2 }{ \sqrt{ \pi } } \int_0^x e^{- t^2} \, dt"#;
let mathml = latex_to_mathml(latex, DisplayStyle::Block).unwrap();
println!("{}", mathml);

Multi-line Equations

For aligned equations with newlines and alignment:

use latex2mathml::{latex_to_mathml, DisplayStyle};

let latex = r#"
\begin{align}
x + y &= 1 \\
2x - y &= 2
\end{align}
"#;
let mathml = latex_to_mathml(latex, DisplayStyle::Block).unwrap();

Or using simple newlines in display mode:

use latex2mathml::{latex_to_mathml, DisplayStyle};

let latex = r#"a + b \\ c + d"#;
let mathml = latex_to_mathml(latex, DisplayStyle::Block).unwrap();

Tensor Notation

For complex sub/superscript combinations:

use latex2mathml::{latex_to_mathml, DisplayStyle};

let latex = r#"\Gamma^{\lambda}_{\mu\nu}"#;
let mathml = latex_to_mathml(latex, DisplayStyle::Inline).unwrap();

Document Conversion

For a document that includes LaTeX equations:

let text = r#"
Let us consider a rigid sphere (i.e., one having a spherical 
figure when tested in the stationary system) of radius $R$ 
which is at rest relative to the system ($K$), and whose centre 
coincides with the origin of $K$ then the equation of the 
surface of this sphere, which is moving with a velocity $v$ 
relative to $K$, is
$$\xi^2 + \eta^2 + \zeta^2 = R^2$$
"#;
let mathml = latex2mathml::replace(text).unwrap();
println!("{}", mathml);

HTML Conversion

To convert HTML files in a directory recursively, use latex2mathml::convert_html. This function is intended for HTML files generated by cargo doc.

See also examples/equations.rs and examples/document.rs for more comprehensive examples.
