
0.5 is special for delta in keras.losses.Huber() or not #21804

@ILCSFNO

Description


Bug Issue

The docs for keras.losses.Huber() describe the delta argument as follows:

delta: A float, the point where the Huber loss function changes from a
quadratic to linear.
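
For reference, that piecewise behavior can be written out as a small NumPy sketch (the helper name below is mine, not part of Keras):

import numpy as np

# Elementwise Huber loss: quadratic inside [-delta, delta], linear outside.
def huber_elementwise(error, delta):
    abs_error = np.abs(error)
    return np.where(
        abs_error <= delta,
        0.5 * np.square(error),              # quadratic branch for |error| <= delta
        delta * abs_error - 0.5 * delta**2,  # linear branch for |error| > delta
    )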

From the repros below, we can see that delta=0.5 appears to be special with respect to GPU memory usage, using tf 2.19.0 and the latest keras:

Repro 1 (delta == 0.5)

import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
# Main Code -->
import numpy as np
import keras
x = np.random.rand(1000, 1)
y = (((3 * x) + 2) + np.random.randn(1000, 1))
huber_loss = keras.losses.Huber(delta=0.5)
loss = huber_loss(y, x)
print('Huber loss:', loss.numpy())
# Main Code <--
memory = 0
for i in range(len(gpus)):
    memory += tf.config.experimental.get_memory_usage('GPU:%d' % i)
print(memory)

Output 1

Huber loss: 1.3573407
1792

Repro 2 (delta == 0.1 / 0.3 / 1.0 / 10000000.0)

import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
# Main Code -->
import numpy as np
import keras
x = np.random.rand(1000, 1)
y = (((3 * x) + 2) + np.random.randn(1000, 1))
huber_loss = keras.losses.Huber(delta=0.1) # choices: 0.1 / 0.3 / 1.0 / 10000000.0
loss = huber_loss(y, x)
print('Huber loss:', loss.numpy())
# Main Code <--
memory = 0
for i in range(len(gpus)):
    memory += tf.config.experimental.get_memory_usage('GPU:%d' % i)
print(memory)

Output 2

For each of these choices, the output is shown below; the memory usage is the same across all of them:

Huber loss: 0.29216948
2048
Huber loss: 0.8585064
2048
Huber loss: 2.5695057
2048
Huber loss: 5.2090187
2048

The related code is here:

super().__init__(
    huber,
    name=name,
    reduction=reduction,
    dtype=dtype,
    delta=delta,
)

y_pred = ops.convert_to_tensor(y_pred)
y_true = ops.convert_to_tensor(y_true, dtype=y_pred.dtype)
y_true, y_pred = squeeze_or_expand_to_same_rank(y_true, y_pred)
delta = ops.convert_to_tensor(delta, dtype=y_pred.dtype)
error = ops.subtract(y_pred, y_true)
abs_error = ops.abs(error)
half = ops.convert_to_tensor(0.5, dtype=abs_error.dtype)
return ops.mean(
    ops.where(
        abs_error <= delta,
        half * ops.square(error),
        delta * abs_error - half * ops.square(delta),
    ),
    axis=-1,
)
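
As a quick consistency check (my own sketch, not from the Keras source), recomputing the loss from the formula above with plain NumPy gives values that agree with keras.losses.Huber (up to float32 rounding) for every delta, so the loss math itself does not appear to treat 0.5 specially:

import numpy as np
import keras

x = np.random.rand(1000, 1)
y = 3 * x + 2 + np.random.randn(1000, 1)

for delta in (0.1, 0.3, 0.5, 1.0, 10000000.0):
    error = x - y                      # y_pred - y_true, as in the Keras code
    abs_error = np.abs(error)
    manual = np.mean(
        np.where(
            abs_error <= delta,
            0.5 * np.square(error),
            delta * abs_error - 0.5 * delta**2,
        )
    )
    keras_value = keras.losses.Huber(delta=delta)(y, x)
    print(delta, manual, float(keras_value))  # the two values should match for each delta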

But I didn't find anything in it that relates to 0.5, and I don't know whether it is expected that the result for delta == 0.5 is special.

Thanks for looking into this!
