BC1 and BC4 rounding behavior doesn't match DirectXTex #17

@RunDevelopment

Description

Hi! I'm currently implementing a new DDS decoder for the image crate and wanted to use bcdec_rs for decoding BC1-7 images. Unfortunately, bcdec_rs does not decode BC1-5 images correctly. Due to rounding errors, colors can be off by up to one for both BC1 and BC4.

Example

While the error is only a single bit, it is noticeable. In BC1, the error affects G differently than R and B. This can lead to situations where G is one more than it should be while R and B are one less than they should be, resulting in a noticeable green tint. Here's an example BC1 image that should have the color RGB=47 47 47 everywhere (this is also what image programs like Paint.NET and GIMP will show). Using bcdec_rs to decode this image will yield the color RGB=46 48 46 instead.

BC1_UNORM_SRGB-47.zip

I also want to point out that these off-by-one errors aren't rare. I magnified the error of the following image, so we can see it:

Image: (attachment)

Error: (attachment)

BC1

Before I explain the cause of the issue, I quickly want to point out that this bug is not in the source translation of bcdec: the original bcdec C library has the same bug, and bcdec_rs simply reproduces it faithfully.

The bug itself is quite simple: when calculating color2 and color3, bcdec interpolates the already-rounded 8-bit values of color0 and color1 instead of the original 5/6-bit values.

Using the 8-bit values for color0/1 is incorrect because the conversion from 5/6 bits to 8 bits introduces a small rounding error (e.g. 30 as a 5-bit value converts to exactly 246.774 in 8 bits, which is rounded to 247). That error is then passed along to color2/3, which are rounded again, compounding the error.
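To make the double rounding concrete, here is a small standalone sketch (the endpoint values and the expand5 helper are illustrative, not bcdec_rs's actual code):

```rust
// Illustrative sketch: interpolating the already-rounded 8-bit endpoints
// can differ by 1 from interpolating the raw 5-bit values.
fn expand5(x: u32) -> u32 {
    (x * 527 + 23) >> 6 // bcdec's 5-bit -> 8-bit expansion, i.e. round(x / 31 * 255)
}

fn main() {
    let (r0_5, r1_5) = (3u32, 0u32); // 5-bit red endpoints (illustrative values)

    // Double rounding: expand both endpoints to 8 bits first, then
    // interpolate color2 = 2/3*color0 + 1/3*color1 with rounding.
    let double_rounded = (2 * expand5(r0_5) + expand5(r1_5) + 1) / 3;

    // Single rounding: interpolate the raw 5-bit values, then convert to
    // 8 bits with one final rounding step.
    let single_rounded = ((2 * r0_5 + r1_5) * 351 + 61) >> 7;

    // The exact value is (2*3 + 0) / (3*31) * 255 = 16.45..., so 16 is correct.
    assert_eq!(double_rounded, 17);
    assert_eq!(single_rounded, 16);
}
```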

The fix is to do the interpolation with the original values and then convert to 8 bit. In code, this can be done like this:

let c0 = u16::from_le_bytes(compressed_block[0..2].try_into().unwrap());
let c1 = u16::from_le_bytes(compressed_block[2..4].try_into().unwrap());

// separate 565 colors
let r0_5 = (c0 >> 11) & 0x1F;
let g0_6 = (c0 >> 5) & 0x3F;
let b0_5 = c0 & 0x1F;

let r1_5 = (c1 >> 11) & 0x1F;
let g1_6 = (c1 >> 5) & 0x3F;
let b1_5 = c1 & 0x1F;

// Expand 565 ref colors to 888
let r0 = (r0_5 * 527 + 23) >> 6;
let g0 = (g0_6 * 259 + 33) >> 6;
let b0 = (b0_5 * 527 + 23) >> 6;
ref_colors[0] = [r0 as u8, g0 as u8, b0 as u8, 255u8];

let r1 = (r1_5 * 527 + 23) >> 6;
let g1 = (g1_6 * 259 + 33) >> 6;
let b1 = (b1_5 * 527 + 23) >> 6;
ref_colors[1] = [r1 as u8, g1 as u8, b1 as u8, 255u8];

if c0 > c1 || only_opaque_mode {
    // Standard BC1 mode (also BC3 color block uses ONLY this mode)
    // color_2 = 2/3*color_0 + 1/3*color_1
    // color_3 = 1/3*color_0 + 2/3*color_1
    let r = 2 * r0_5 + r1_5;
    let g = 2 * g0_6 + g1_6;
    let b = 2 * b0_5 + b1_5;
    let r = (r * 351 + 61) >> 7;
    let g = (g as u32 * 2763 + 1039) >> 11;
    let b = (b * 351 + 61) >> 7;
    ref_colors[2] = [r as u8, g as u8, b as u8, 255u8];

    let r = r0_5 + 2 * r1_5;
    let g = g0_6 + 2 * g1_6;
    let b = b0_5 + 2 * b1_5;
    let r = (r * 351 + 61) >> 7;
    let g = (g as u32 * 2763 + 1039) >> 11;
    let b = (b * 351 + 61) >> 7;
    ref_colors[3] = [r as u8, g as u8, b as u8, 255u8];
} else {
    // Quite rare BC1A mode
    // color_2 = 1/2*color_0 + 1/2*color_1;
    // color_3 = 0;
    let r = r0_5 + r1_5;
    let g = g0_6 + g1_6;
    let b = b0_5 + b1_5;
    let r = (r * 1053 + 125) >> 8;
    let g = (g as u32 * 4145 + 1019) >> 11;
    let b = (b * 1053 + 125) >> 8;
    ref_colors[2] = [r as u8, g as u8, b as u8, 255u8];

    ref_colors[3] = [0u8; 4];
}

In case you're curious about the crazy conversions like (r * 351 + 61) >> 7: they use the same trick as bcdec's 5/6-bit to 8-bit conversion. E.g. (r * 351 + 61) >> 7 is equivalent to (r as f64 / (3.0 * 31.0) * 255.0).round() for values 0 <= r <= 3*31 (and likewise (g * 2763 + 1039) >> 11 covers the 6-bit range 0 <= g <= 3*63). I got all of these constants using a brute-force script that verified each one correctly maps its entire input range.
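A sketch of that brute-force check (the input ranges follow from interpolating 2*a + b over 5-bit and 6-bit values):

```rust
// Verify the multiply-add-shift constants against exact rounded division
// over the entire input range of each interpolation.
fn main() {
    // 5-bit channels: 2*r0 + r1 (or r0 + 2*r1) ranges over 0..=3*31.
    for r in 0u32..=93 {
        let fast = (r * 351 + 61) >> 7;
        let exact = (r as f64 / 93.0 * 255.0).round() as u32;
        assert_eq!(fast, exact, "mismatch at r = {r}");
    }
    // 6-bit green: 2*g0 + g1 ranges over 0..=3*63.
    for g in 0u32..=189 {
        let fast = (g * 2763 + 1039) >> 11;
        let exact = (g as f64 / 189.0 * 255.0).round() as u32;
        assert_eq!(fast, exact, "mismatch at g = {g}");
    }
    println!("all constants verified");
}
```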

BC4

Here the bug is simpler: the rounding is wrong. E.g.

alpha[2] = (6 * alpha[0] + alpha[1] + 1) / 7;

That + 1 should have been + 3.

Similarly:

alpha[2] = (4 * alpha[0] + alpha[1] + 1) / 5;

That + 1 should have been + 2.

Why is +1 wrong for the /7 values? Say the interpolation numerator (e.g. 6 * alpha[0] + alpha[1]) comes out to 5. Then 5/7 ≈ 0.71 should round to 1, but (5 + 1) / 7 == 0 because integer division is floor division.
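That example can be checked directly (the endpoint values here are chosen just so the numerator comes out to 5):

```rust
fn main() {
    // Endpoints chosen so the interpolation numerator 6*alpha0 + alpha1 is 5.
    let (alpha0, alpha1) = (0u32, 5u32);
    let numerator = 6 * alpha0 + alpha1; // = 5; exact 5/7 = 0.714... rounds to 1

    let wrong = (numerator + 1) / 7; // floor(6/7) = 0, off by one
    let right = (numerator + 3) / 7; // floor(8/7) = 1, correctly rounded
    assert_eq!(wrong, 0);
    assert_eq!(right, 1);
}
```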

In general, given two unsigned integers x and n, the rounded result of x/n can be computed as (x + (n >> 1)) / n. That is why the bias added to the interpolated value must be 3 for the /7 values (7 >> 1 == 3) and 2 for the /5 values (5 >> 1 == 2).
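A minimal sketch of that identity (div_round is a hypothetical helper name, not part of bcdec_rs):

```rust
// Rounded unsigned division: add half the divisor before flooring.
fn div_round(x: u32, n: u32) -> u32 {
    (x + (n >> 1)) / n
}

fn main() {
    assert_eq!(div_round(5, 7), 1);  // bias 7 >> 1 == 3: the BC4 /7 fix
    assert_eq!(div_round(7, 5), 1);  // bias 5 >> 1 == 2: the BC4 /5 fix; 1.4 rounds to 1
    assert_eq!(div_round(13, 5), 3); // 2.6 rounds up to 3
}
```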


Would you like me to make a PR?
