fumoboy007

Fixes #225.

Background

The optimization algorithm has three main calculations:

  1. Select the working set {i, j} that yields the largest estimated decrease in the objective function.
  2. Change alpha[i] and alpha[j] to minimize the objective function over those two variables while respecting constraints.
  3. Update the gradient of the objective function according to the changes to alpha[i] and alpha[j].
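To make the third calculation concrete, here is a minimal sketch of the gradient update, assuming the usual dual objective whose gradient changes by Q times the change in alpha. The function name `update_gradient` and the use of `std::vector` are illustrative assumptions, not the code in this pull request.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative sketch only: applies G[k] += Q_i[k] * delta_i + Q_j[k] * delta_j,
// where Q_i and Q_j are columns i and j of Q, and delta_i and delta_j are the
// changes just made to alpha[i] and alpha[j].
void update_gradient(std::vector<double> &G,
                     const std::vector<float> &Q_i,
                     const std::vector<float> &Q_j,
                     double delta_i,
                     double delta_j) {
    assert(G.size() == Q_i.size() && G.size() == Q_j.size());
    for (std::size_t k = 0; k < G.size(); ++k) {
        G[k] += Q_i[k] * delta_i + Q_j[k] * delta_j;
    }
}
```

Note that the update for k = i reads the diagonal element Q_i[i] from the column itself.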

All three calculations make use of the matrix Q, which is represented by the QMatrix class. The QMatrix class has two main methods:

  • get_Q, which returns an array of values for a single column of the matrix; and
  • get_QD, which returns an array of diagonal values.
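As a rough picture of that interface, the sketch below declares a simplified `QMatrix` with the two methods and a toy implementation. The signatures are simplified, and `ToyQ` (a scalar linear kernel) is entirely hypothetical; it exists only to show one method producing `Qfloat` values and the other producing `double` values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef float Qfloat;  // per the description, Qfloat is currently float

// Simplified sketch of the interface; not the exact declarations in the code base.
class QMatrix {
public:
    virtual const Qfloat *get_Q(int column, int len) const = 0;  // one column of Q
    virtual const double *get_QD() const = 0;                    // the diagonal of Q
    virtual ~QMatrix() {}
};

// Hypothetical toy matrix with Q[i][j] = x[i] * x[j].
class ToyQ : public QMatrix {
public:
    explicit ToyQ(const std::vector<double> &x)
        : x_(x), column_(x.size()), qd_(x.size()) {
        for (std::size_t i = 0; i < x.size(); ++i) qd_[i] = x[i] * x[i];
    }
    const Qfloat *get_Q(int column, int len) const override {
        assert(column >= 0 && column < (int)x_.size() && len <= (int)x_.size());
        // Values are computed in double and then narrowed to Qfloat.
        for (int k = 0; k < len; ++k) column_[k] = (Qfloat)(x_[column] * x_[k]);
        return column_.data();
    }
    const double *get_QD() const override { return qd_.data(); }

private:
    std::vector<double> x_;
    mutable std::vector<Qfloat> column_;  // scratch buffer reused by get_Q
    std::vector<double> qd_;
};
```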

Problem

Q values are of type Qfloat while QD values are of type double. Qfloat is currently defined as float, so get_Q and get_QD can return different values for the same diagonal element. For example, in #225, one of the diagonal values is 181.05748749793070829 as double and 180.99411909539512067 as float.

The first two calculations of the optimization algorithm access the diagonal values via get_QD. However, the third calculation accesses the diagonal values via get_Q. This inconsistency between the minimization calculations and the gradient update can cause the optimization algorithm to oscillate, as demonstrated by #225.
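The snippet below sketches one way the two views of a diagonal element can disagree: narrowing a `double` to `Qfloat` can change its value. The helper name `diagonal_gap` is an assumption made for illustration, and the snippet does not attempt to reproduce the specific values from #225.

```cpp
#include <cassert>

typedef float Qfloat;  // per the description, Qfloat is currently float

// Illustrative helper: the difference between a diagonal value held as double
// (as get_QD would return it) and the same value after narrowing to Qfloat
// (as it would appear inside a column returned by get_Q).
double diagonal_gap(double qd_value) {
    Qfloat q_value = (Qfloat)qd_value;
    return qd_value - (double)q_value;
}
```

A value such as 0.5 is exactly representable in both types, so the gap is zero, while a value such as 0.1 is not, so the two calculations would see slightly different diagonals.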

Solution

We change get_Q to return a new class called QColumn instead of a plain array of values. The QColumn class overloads the subscript operator, so accessing individual elements is the same as before. Internally though, the QColumn class will return the QD value when the diagonal element is accessed. This guarantees that all calculations are using the same values for the diagonal elements, eliminating the inconsistency.
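A minimal sketch of the idea follows, assuming a wrapper that remembers which row is the diagonal and holds the matching QD value. The constructor arguments and member names are hypothetical, and the real class may differ (for example, in how it tracks the diagonal position or what type the subscript operator returns).

```cpp
#include <cassert>

typedef float Qfloat;  // per the description, Qfloat is currently float

// Hypothetical sketch of a QColumn-like wrapper; names are assumptions.
class QColumn {
public:
    QColumn(const Qfloat *values, int column, double diagonal)
        : values_(values), column_(column), diagonal_(diagonal) {
        assert(values_ != nullptr);
    }

    // Reads look like plain array access, but the diagonal element comes
    // from the QD value rather than from the Qfloat array.
    double operator[](int row) const {
        return row == column_ ? diagonal_ : (double)values_[row];
    }

private:
    const Qfloat *values_;  // off-diagonal values, stored as Qfloat
    int column_;            // index of this column, i.e. the diagonal row
    double diagonal_;       // the QD value for this column
};
```

With a wrapper like this, code that indexes the column with the subscript operator keeps working unchanged while reading the same diagonal value that get_QD provides.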

Alternatives Considered

Alternatively, we could change Qfloat to be defined as double. This would also eliminate the inconsistency; however, each cached value would take twice the memory, so it would reduce the cache capacity by half.
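The trade-off can be estimated with simple arithmetic. The helper below is a hypothetical back-of-the-envelope calculation, assuming the cache holds whole columns and has a fixed size in bytes; it is not how the cache itself is implemented.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative estimate: how many full columns of `len` elements fit in a
// cache of `cache_bytes` bytes when each element takes `element_size` bytes.
std::size_t cached_columns(std::size_t cache_bytes,
                           std::size_t len,
                           std::size_t element_size) {
    assert(len > 0 && element_size > 0);
    return cache_bytes / (len * element_size);
}
```

For a 100 MiB cache and columns of 10,000 elements, this gives 2621 columns at 4 bytes per element and 1310 columns at 8 bytes per element.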

Future Changes

The Java code will be updated similarly in a separate commit.

fumoboy007 added a commit to fumoboy007/scikit-learn that referenced this pull request Dec 21, 2024
…to oscillate.

See more details in the upstream pull request: cjlin1/libsvm#228.

Fixes scikit-learn#30353.

Development

Successfully merging this pull request may close these issues.

Training gets stuck on a specific dataset
