
[IMPROVEMENT] Detect singular matrices on a per-element basis #1217

@mgovers

Description


This issue is a continuation of the investigation in #1205.

The issue

In the current internal matrix solver of the Power Grid Model, there is a mechanism to detect probable singularities in the matrix equation using a condition-number-like check. See, for example, the work done in #853, #862 and the open issue in #864.

The singularities are detected by doing a (partially pre-solved) LU decomposition, potentially with pivot perturbation to catch small pivot elements caused by not doing full pivoting for topological reasons. After the decomposition is done, each pivot element is compared to the maximum pivot element of the decomposed matrix, which maps onto the condition number (see e.g. the workout in #1125 (comment) and #1162).
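In rough terms, the posterior check can be sketched as follows (a simplified pure-Python illustration, not the actual C++ implementation; the Doolittle elimination, the helper names, and the `epsilon` threshold are assumptions for illustration):

```python
def lu_decompose(m):
    """In-place Doolittle LU decomposition without pivoting.

    Afterwards, the strict lower triangle of ``m`` holds L (unit diagonal
    implied) and the upper triangle, including the diagonal, holds U.
    """
    n = len(m)
    for k in range(n):
        for i in range(k + 1, n):
            m[i][k] /= m[k][k]  # multiplier l_ik
            for j in range(k + 1, n):
                m[i][j] -= m[i][k] * m[k][j]
    return m


def posterior_singularity_check(m, epsilon=1e-13):
    """Condition-number-like check: compare each pivot of U against the
    largest pivot. A tiny ratio is taken as a sign of (near-)singularity."""
    lu = lu_decompose([row[:] for row in m])
    pivots = [abs(lu[i][i]) for i in range(len(lu))]
    return min(pivots) / max(pivots) > epsilon
```

The check only looks at the decomposition after the fact: it cannot distinguish a genuinely unstable elimination from a matrix that merely spans many orders of magnitude, which is exactly the problem described below.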

However, in power systems, the matrix equations can contain elements that span many orders of magnitude (see e.g. #1125). That is likely to result in an extremely large condition number, causing the check to fail, even though the calculation is perfectly stable because none of the individual operations contains unstable subtractions or additions.

Example

As an example, consider the following matrix with elements spanning multiple orders of magnitude, such that $A \gg \hat{a}, B \gg \hat{b}, C$, with capitalization indicating very large (e.g. $\sim 10^{15}$), regular font indicating "mid-range" (e.g. $\sim 1$), and hatted letters indicating very small (e.g. $\sim 10^{-15}$).

$$ M = \begin{pmatrix} 10^{15} & 1 & 0 \\ 1 & 1 & 10^{-15} \\ 0 & 10^{-15} & 10^{-15} \end{pmatrix} $$

Its LU-decomposition is

$$ L = \begin{pmatrix} 1 & 0 & 0 \\ 10^{-15} \times 1 & 1 & 0 \\ 0 & (1 - 10^{-15} \times 1 \times 1)^{-1} \times 10^{-15} & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 10^{-15} & 1 & 0 \\ 0 & (1 - 10^{-15})^{-1} \times 10^{-15} & 1 \end{pmatrix} \approx \begin{pmatrix} 1 & 0 & 0 \\ 10^{-15} & 1 & 0 \\ 0 & 1^{-1} \times 10^{-15} & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 10^{-15} & 1 & 0 \\ 0 & 10^{-15} & 1 \end{pmatrix} $$

$$ U = \begin{pmatrix} 10^{15} & 1 & 0 \\ 0 & 1 - 10^{-15} \times 1 \times 1 & 10^{-15} \\ 0 & 0 & 10^{-15} - (1 - 10^{-15} \times 1 \times 1)^{-1} \times 10^{-15} \times 10^{-15} \end{pmatrix} = \begin{pmatrix} 10^{15} & 1 & 0 \\ 0 & 1 - 10^{-15} & 10^{-15} \\ 0 & 0 & 10^{-15} - (1 - 10^{-15})^{-1} \times 10^{-30} \end{pmatrix} \approx \begin{pmatrix} 10^{15} & 1 & 0 \\ 0 & 1 & 10^{-15} \\ 0 & 0 & 10^{-15} - 1^{-1} \times 10^{-30} \end{pmatrix} = \begin{pmatrix} 10^{15} & 1 & 0 \\ 0 & 1 & 10^{-15} \\ 0 & 0 & 10^{-15} - 10^{-30} \end{pmatrix} \approx \begin{pmatrix} 10^{15} & 1 & 0 \\ 0 & 1 & 10^{-15} \\ 0 & 0 & 10^{-15} \end{pmatrix} $$

If we estimate the condition number from the pivot elements (the diagonal of $U$), it is $\kappa \approx \frac{\max_i |u_{ii}|}{\min_j |u_{jj}|} = \frac{10^{15}}{10^{-15}} = 10^{30}$, which suggests that the matrix equation is extremely ill-conditioned (remember that 64-bit floating point operations typically have a precision of the order of magnitude of $10^{-16}$).

Closer inspection, however, tells a different story: the potentially unstable operations are the subtractions in the Gaussian elimination procedure. In the example, the pivot elements obtain a quadratic suppression from the off-diagonal factors during the process. As a result, all of the operations in the above example are perfectly stable and no ill-conditioning is observed, even though the matrix spans 30 orders of magnitude.
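This can be reproduced numerically: a plain double-precision unpivoted LU solve of the example system is accurate to near machine precision even though the pivot ratio is $10^{30}$ (a pure-Python sketch; the helper name and the chosen right-hand side are illustrative):

```python
def solve_via_lu(m, b):
    """Solve m @ x = b via unpivoted Doolittle LU, in pure Python.

    Returns the solution vector and the absolute pivot values of U.
    """
    n = len(m)
    a = [row[:] for row in m]
    # Elimination: multipliers stored in the lower triangle.
    for k in range(n):
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
    # Forward substitution (L y = b, unit diagonal).
    y = b[:]
    for i in range(n):
        for j in range(i):
            y[i] -= a[i][j] * y[j]
    # Back substitution (U x = y).
    x = y[:]
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            x[i] -= a[i][j] * x[j]
        x[i] /= a[i][i]
    return x, [abs(a[i][i]) for i in range(n)]


M = [[1e15, 1.0, 0.0],
     [1.0, 1.0, 1e-15],
     [0.0, 1e-15, 1e-15]]
# Right-hand side chosen so that the exact solution is x = (1, 1, 1).
b = [sum(row) for row in M]
x, pivots = solve_via_lu(M, b)
ratio = max(pivots) / min(pivots)  # ~1e30: looks hopelessly ill-conditioned
errors = [abs(xi - 1.0) for xi in x]  # yet the recovered solution is accurate
```

The pivot-ratio check would reject this system outright, while the actual solve loses essentially no precision.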

Clearly, the current way of doing things is sub-optimal in certain cases, and that is exactly what we find in reality; see e.g. #1125.

Improvement proposal

Instead of doing an a posteriori analysis, detect instabilities by taking into account the actual operations that may cause them. For example, one could keep track of the accumulated error tolerance. One would then compare the magnitude of a (component of a?) (pivot?) element to the accumulated error times an acceptable tolerance level (e.g. $10^{-5}$, or maybe $10^{-8}$, or even simply $1$; TBD). If the test fails, the calculation is considered unstable and significantly affected, or even dominated, by the accumulation of numerical errors.
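One possible shape of such an on-the-fly check (entirely illustrative: the error model, the running-bound update, and the default tolerance are assumptions of this sketch, not the scheme of #1205) tracks a crude upper bound on the rounding error accumulated in each matrix entry, and flags a pivot only when its magnitude is no longer safely above its own accumulated error:

```python
import sys

EPS = sys.float_info.epsilon  # unit roundoff for 64-bit floats


def lu_with_error_tracking(m, tolerance=1e-8):
    """Unpivoted Doolittle LU with a crude running bound on the absolute
    rounding error accumulated in each entry.

    A pivot is flagged as unstable when its magnitude does not exceed its
    accumulated error bound by the safety factor 1/tolerance, rather than
    being compared against the globally largest pivot.
    """
    n = len(m)
    a = [row[:] for row in m]
    # err[i][j]: running absolute-error bound for entry a[i][j].
    err = [[abs(v) * EPS for v in row] for row in a]
    unstable = []
    for k in range(n):
        # Per-element check: pivot against its own accumulated error.
        if abs(a[k][k]) * tolerance <= err[k][k]:
            unstable.append(k)
            continue  # do not eliminate with an unreliable pivot
        for i in range(k + 1, n):
            l_ik = a[i][k] / a[k][k]
            a[i][k] = l_ik
            for j in range(k + 1, n):
                update = l_ik * a[k][j]
                a[i][j] -= update
                # Crude model: inherit the operand's bound plus the roundoff
                # of the multiply and of the subtraction itself.
                err[i][j] += (abs(l_ik) * err[k][j]
                              + abs(update) * EPS
                              + abs(a[i][j]) * EPS)
    return a, unstable
```

Under this scheme, the example matrix above passes cleanly (every pivot, including the $10^{-15}$ one, is many orders of magnitude above its accumulated error), while a genuinely singular matrix is flagged at the pivot where cancellation actually happens.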

#1205 was an experiment to see whether such an on-the-fly detection approach works, and it appears to have been successful. However, some care is needed: a few edge cases listed in the validation cases currently produce correct output on all platforms, even though #1205 considers their calculations unstable. We have to be extremely careful if we want to reject those edge cases, and the implications need to be thoroughly investigated.

This issue follows in the footsteps of #1205 to:

  1. See whether this approach can be worked out into a sustainable solution.
  2. Actually implement that sustainable solution.
  3. [optional] Maybe we can once and for all replace the vague `SparseMatrixError` with more concrete errors for the different error types, e.g. `SingularMatrixError` or `UnstableCalculationError`.
