Visual Perception
Color Perception
Informative · Practical
We can build an understanding of color perception by studying its physical and perceptual meaning. This understanding also informs how color relates to technologies and devices, including displays, cameras, sensors, communication devices, computers, and computer graphics.
Color, a perceptual phenomenon, can be explained in both physical and perceptual terms. In the physical sense, color is a quantity representing the response to the wavelength of light. The human visual system can perceive colors within a certain range of the electromagnetic spectrum, from around 400 nanometers to 700 nanometers. For more detail on the electromagnetic spectrum and the concept of wavelength, we recommend revisiting the Light, Computation, and Computational Light section of our course. For the human visual system, color is a perceptual phenomenon created by our brain when specific wavelengths of light are emitted, reflected, or transmitted by objects. The perception of color originates from the absorption of light by photoreceptors in the eye. These photoreceptor cells convert the light into electrical signals to be interpreted by the brain [1]. Here, you can see a close-up photograph of these photoreceptor cells found in the eye.
The photoreceptors, where color perception originates, are called rods and cones [2]. Here, we provide a sketch showing where these rods and cones are located inside the eye. By closely observing this sketch, you can also understand the basic geometry of an average human eye and the parts that redirect light from a scene toward the retinal cells.
Rods, which are relatively more common in the periphery, help people see in low-light (scotopic) conditions.
The current understanding is that rods convey only greyscale (luminance) information.
Cones, which are more dense in the fovea, are pivotal in color perception in brighter (photopic) environments.
We highlight the distribution of these photoreceptor cells, rods and cones, as eccentricity changes in the eye.
Here, the word eccentricity refers to the angle with respect to our gaze direction.
For instance, if a person is not directly gazing at a location or an object in a given scene, that location or object lies at some angle to the person's gaze.
In other words, there is some eccentricity between the gaze of that person and that location or object in the scene.
In the above sketch, we introduced various regions of the retina, including the fovea, parafovea, perifovea, and peripheral vision. Note that these regions are defined by angles, in other words, eccentricities. Please also note that there is a region on the retina where no rods or cones are available. This region is found in every human eye and is known as the blind spot. Visual acuity and contrast sensitivity decrease progressively across these regions, with the most detail in the fovea, diminishing toward the periphery.
The cones are categorized into three types based on their sensitivity to specific wavelengths of light, corresponding to long (L), medium (M), and short (S) wavelength cones. These three types of cones [3] allow us to better understand the trichromatic theory [4], which suggests that human color perception stems from combining the stimulations of the LMS cones. How sensitive each type of cone is to different wavelengths of light is represented graphically by the spectral sensitivity function [5]. In practical applications such as display technologies and computational imaging, the LMS cone response can be modeled as a summation over display primaries and wavelengths:

\[
LMS_c = \sum_{i \in \{R, G, B\}} \sum_{\lambda} RGB_i \times Spectrum_i(\lambda) \times Sensitivity_c(\lambda), \qquad c \in \{L, M, S\},
\]

where:

- \(RGB_i\): the \(i\)-th color channel (red, green, or blue) of the image,
- \(Spectrum_i(\lambda)\): the spectral power distribution of the corresponding display primary,
- \(Sensitivity_c(\lambda)\): the sensitivity of the L, M, or S cone at wavelength \(\lambda\).
This formula gives us more insight into how we perceive colors from different digital and physical inputs.
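To make the summation concrete, here is a minimal, pure-Python sketch of how LMS sensations accumulate over primaries and wavelengths. This is an illustration under our own naming, not the odak implementation:

```python
def lms_response(rgb, spectra, sensitivities):
    """
    rgb           : [R, G, B] intensities of one pixel.
    spectra       : per-primary emission spectrum, sampled per wavelength (3 x N).
    sensitivities : L, M, S cone sensitivity per wavelength (3 x N).
    Returns [L, M, S] sensation values.
    """
    n = len(spectra[0])
    lms = [0.0, 0.0, 0.0]
    for cone in range(3):                    # L, M, S
        for primary in range(3):             # R, G, B
            for w in range(n):               # wavelength samples, e.g., 400..700 nm
                lms[cone] += rgb[primary] * spectra[primary][w] * sensitivities[cone][w]
    return lms


# Toy check: two narrow-band primaries, each overlapping exactly one cone.
spectra = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
sens = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
print(lms_response([1.0, 1.0, 0.0], spectra, sens))  # -> [1.0, 1.0, 0.0]
```

The toy spectra above are hypothetical two-sample vectors chosen only to make the arithmetic easy to follow.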
Looking for more reading to expand your understanding of the human visual system?
We recommend these papers, which we find insightful:
- B. P. Schmidt, M. Neitz, and J. Neitz, "Neurobiological hypothesis of color appearance and hue perception," J. Opt. Soc. Am. A 31(4), A195–207 (2014)
- Biomimetic Eye Modeling & Deep Neuromuscular Oculomotor Control
The story of color perception only deepens with the concept of color opponency [6]. This theory reveals that our perception of color is not just a matter of additive combinations of primary colors but also involves a dynamic interplay of opposing colors: red versus green, blue versus yellow. This phenomenon is rooted in the neural pathways of the eye and brain, where certain cells are excited or inhibited by specific wavelengths, enhancing our ability to distinguish between subtle shades and contrasts. Below is a mathematical formulation for the color opponency model proposed by Schmidt et al. [3]:

\[
\begin{aligned}
O_{luminance} &= I_L + I_M + I_S, \\
O_{red-green} &= I_L - I_M, \\
O_{blue-yellow} &= I_S - (I_L + I_M).
\end{aligned}
\]

In this equation, \(I_L\), \(I_M\), and \(I_S\) represent the intensities received by the long, medium, and short cone cells, respectively. Opponent signals are represented by the differences between combinations of cone responses.
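As a quick sketch, opponent signals can be computed from LMS intensities as below. This is one common illustrative formulation of cone-difference signals, not necessarily the exact set used by Schmidt et al.:

```python
def opponent_channels(i_l, i_m, i_s):
    # Illustrative opponent signals built from cone intensities:
    red_green = i_l - i_m            # L vs. M comparison
    blue_yellow = i_s - (i_l + i_m)  # S vs. combined L + M
    luminance = i_l + i_m + i_s      # achromatic (luminance) channel
    return red_green, blue_yellow, luminance


# Equal cone stimulation yields no red-green opponent response:
print(opponent_channels(1.0, 1.0, 1.0))  # -> (0.0, -1.0, 3.0)
```

Note how the chromatic channels carry only differences between cone responses, while the luminance channel sums them.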
We can exercise our understanding of trichromat sensation with LMS cones and the concept of color opponency by visiting the functions available in our toolkit, odak.
The utility function we will review is odak.learn.perception.display_color_hvs.primaries_to_lms() from odak.learn.perception.
Let us use the test below to demonstrate how we can obtain the LMS sensation from the color primaries of an image.
```python
import odak # (1)
import torch
import sys
from odak.learn.perception.color_conversion import display_color_hvs


header = "test/test_learn_perception_display_color_hvs.py"


def test(device=torch.device("cpu"), output_directory="test_output"):
    odak.tools.check_directory(output_directory)
    torch.manual_seed(0)
    image_rgb = (
        odak.learn.tools.load_image(
            "test/data/fruit_lady.png", normalizeby=255.0, torch_style=True
        )
        .unsqueeze(0)
        .to(device)
    ) # (2)
    the_number_of_primaries = 3
    multi_spectrum = torch.zeros(the_number_of_primaries, 301) # (3)
    multi_spectrum[0, 200:250] = 1.0
    multi_spectrum[1, 130:145] = 1.0
    multi_spectrum[2, 0:50] = 1.0
    display_color = display_color_hvs(
        read_spectrum="tensor", primaries_spectrum=multi_spectrum, device=device
    ) # (4)
    image_lms_second_stage = display_color.primaries_to_lms(image_rgb) # (5)
    image_lms_third_stage = display_color.second_to_third_stage(
        image_lms_second_stage
    ) # (6)
    odak.learn.tools.save_image(
        "{}/image_rgb.png".format(output_directory),
        image_rgb,
        cmin=0.0,
        cmax=image_rgb.max(),
    )
    odak.learn.tools.save_image(
        "{}/image_lms_second_stage.png".format(output_directory),
        image_lms_second_stage,
        cmin=0.0,
        cmax=image_lms_second_stage.max(),
    )
    odak.learn.tools.save_image(
        "{}/image_lms_third_stage.png".format(output_directory),
        image_lms_third_stage,
        cmin=0.0,
        cmax=image_lms_third_stage.max(),
    )
    image_rgb_noisy = image_rgb * 0.6 + torch.rand_like(image_rgb) * 0.4 # (7)
    loss_lms = display_color(image_rgb, image_rgb_noisy) # (8)
    odak.log.logger.info(
        "{} -> The third stage LMS sensation difference between two input images is {:.10f}.".format(
            header, loss_lms
        )
    )
    assert True == True


if __name__ == "__main__":
    sys.exit(test())
```
- Adding `odak` to our imports.
- Loading an existing RGB image.
- Defining the spectrum of each primary of our imaginary display. These values are defined per primary for each wavelength from 400 nm to 700 nm (301 elements).
- Obtaining the LMS cone sensations for the primaries of our imaginary display.
- Calculating the LMS sensation of our input RGB image at the second stage of color perception using our imaginary display.
- Calculating the LMS sensation of our input RGB image at the third stage of color perception using our imaginary display.
- Intentionally adding some noise to the input RGB image.
- Calculating the perceptual loss/difference between the two input images (original RGB vs. noisy RGB).
This is a visualization of a randomly generated image and its LMS cone sensation.
Our code above saves three different images. The very first saved image is the ground truth RGB image, as depicted below.
We process this ground truth image by accounting for the human visual system's cone sensitivities and the display backlight spectrum. This way, we can calculate how our ground truth image is sensed by the LMS cones. The LMS sensation, in other words, the ground truth image in LMS color space, is provided below. Note that each color channel here represents a different cone; for instance, the green channel of the image below represents the medium cones and the blue channel the short cones. Keep in mind that LMS sensation is also known as trichromat sensation in the literature.
Earlier, we discussed the color opponency theory. Following this theory, our code utilizes the trichromat values to derive the image representation below.
Lab work: Observing the effect of display spectrum
We introduce our unit test, test_learn_perception_display_color_hvs.py, to provide an example of how to convert an RGB image to trichromat values as sensed by the retinal cone cells.
Note that during this exercise, we define a variable named multi_spectrum to represent the spectrum of each of our color primaries.
These values are stored in a vector per primary, providing the intensity at each wavelength from 400 nm to 700 nm.
The trichromat values that we derive from our original ground truth RGB image are strongly shaped by these spectrum values.
To observe this correlation, we encourage you to find the spectra of actual display types (e.g., OLEDs, LEDs, LCDs) and map multi_spectrum to their spectra to observe the difference in color perception across various display technologies.
In addition, we believe this will also give you a practical sandbox to examine the correlation between wavelengths and trichromat values.
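As a starting point for this exercise, here is a small sketch that builds smoother, Gaussian-shaped primary spectra you could substitute into multi_spectrum. The peak wavelengths and widths below are illustrative assumptions, not measured display data:

```python
import math


def gaussian_spectrum(center_nm, width_nm, start_nm=400, n=301):
    # One smooth, LED-like primary sampled at 1 nm steps starting at start_nm.
    return [math.exp(-((start_nm + i - center_nm) ** 2) / (2.0 * width_nm ** 2))
            for i in range(n)]


red = gaussian_spectrum(630.0, 15.0)
green = gaussian_spectrum(530.0, 20.0)
blue = gaussian_spectrum(460.0, 12.0)
# Each list can replace a row of multi_spectrum in the unit test above.
print(red.index(max(red)))  # -> 230, i.e., the peak sits at 630 nm
```

Narrowing or widening width_nm mimics displays with purer or broader primaries, which changes the resulting trichromat values.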
Reminder
We host a Slack group with more than 300 members. This Slack group focuses on the topics of rendering, perception, displays, and cameras. The group is open to the public, and you can become a member by following this link. Readers can get in touch with the wider community using this public group.
Quantitative Contrast Metrics for Image Quality Assessment
Informative · Practical
Contrast is a fundamental property of visual perception that describes the difference in luminance or color that makes an object distinguishable from other objects and its background. In computational imaging and visual perception research, quantifying contrast is essential for evaluating image quality, designing display systems, and developing accessibility features.
The odak.learn.perception.contrast module provides three complementary contrast metrics that capture different aspects of image contrast:
Weber Contrast
Weber contrast is one of the oldest and most intuitive contrast measures, defined as:

\[
C_{Weber} = \frac{I_{max} - I_{min}}{I_{min}},
\]

where \(I_{max}\) and \(I_{min}\) are the mean intensities of the foreground (target) and background regions, respectively. Weber contrast is particularly useful for:

- Images with uniform backgrounds and localized features
- Situations where the background intensity is well-defined
- Applications where relative changes in intensity matter
The Weber contrast can take values from 0 (no contrast) to infinity, with higher values indicating greater contrast.
```python
from odak.learn.perception import weber_contrast
import torch

# Load or create an image (2D, 3D, or 4D tensor)
image = torch.ones(1, 1, 64, 64) * 0.3  # Dark background
image[:, :, 0:20, 0:20] = 0.9           # Bright target region

# Define regions of interest (row_start, row_end, col_start, col_end)
roi_high = [0, 20, 0, 20]    # Bright region
roi_low = [32, 52, 32, 52]   # Background region

# Compute Weber contrast
contrast = weber_contrast(image, roi_high, roi_low)
print(f"Weber contrast: {contrast.item():.4f}")
# Expected: (0.9 - 0.3) / 0.3 = 2.0
```
Michelson Contrast
Michelson contrast is defined as:

\[
C_{Michelson} = \frac{I_{max} - I_{min}}{I_{max} + I_{min}}.
\]

This metric produces values in the bounded range \([0, 1]\), making it particularly suitable for:

- Periodic patterns and sinusoidal gratings
- Comparing contrast across different images or conditions
- Applications where normalized contrast values are preferred
- Situations where both bright and dark regions are equally important
A Michelson contrast of 0 indicates no contrast (uniform intensity), while 1 represents maximum contrast (one region is completely black).
```python
from odak.learn.perception import michelson_contrast
import torch

# Create a pattern with bright and dark regions
image = torch.zeros(1, 3, 128, 128)
image[:, :, 0:64, :] = 0.8    # Bright half
image[:, :, 64:128, :] = 0.2  # Dark half

# Define regions
roi_high = [0, 64, 0, 128]    # Bright region
roi_low = [64, 128, 0, 128]   # Dark region

# Compute Michelson contrast
contrast = michelson_contrast(image, roi_high, roi_low)
print(f"Michelson contrast: {contrast.item():.4f}")
# Expected: (0.8 - 0.2) / (0.8 + 0.2) = 0.6
```
Content-Aware Contrast Ratio (CWMC)
The Content-Aware Contrast Ratio Measure (CWMC) is a more sophisticated metric that evaluates contrast locally across the entire image using a sliding window approach. Based on the methodology by Ortiz-Jaramillo et al. (2018), CWMC:
- Extracts overlapping patches across the image
- Applies ISODATA clustering to find optimal thresholds for each patch
- Partitions pixels into foreground and background within each patch
- Computes Weber contrast for each local region
- Aggregates results using percentile pooling and harmonic mean
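The thresholding in step 2 above can be sketched with the classic ISODATA (Ridler-Calvard) iteration, which alternates between splitting values at a threshold and recentering the threshold between the two class means. This is a minimal illustration, not the odak implementation:

```python
def isodata_threshold(values, tol=1e-6, max_iter=100):
    # ISODATA (Ridler-Calvard) iterative threshold on a flat list of intensities.
    t = sum(values) / len(values)  # start from the global mean
    for _ in range(max_iter):
        low = [v for v in values if v <= t]
        high = [v for v in values if v > t]
        if not low or not high:
            break  # degenerate patch: all values fall on one side
        t_new = (sum(low) / len(low) + sum(high) / len(high)) / 2.0
        if abs(t_new - t) < tol:
            break
        t = t_new
    return t


# A bimodal patch settles on the midpoint between the two modes:
print(isodata_threshold([0.1] * 5 + [0.9] * 5))  # -> 0.5
```

In CWMC, a threshold like this is computed per patch to split pixels into foreground and background before measuring Weber contrast.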
CWMC is particularly useful for:

- Natural images with varying local contrast
- Evaluating overall image quality without predefined regions
- Applications requiring a single global contrast score that accounts for content distribution
```python
from odak.learn.perception import content_aware_contrast_ratio
import torch

# Create a test image with varying contrast regions
image = torch.rand(1, 1, 256, 256)

# Compute CWMC with default parameters
result = content_aware_contrast_ratio(
    image,
    window_size=15,           # Patch size
    step=3,                   # Sliding window step
    pooling_percentile=75.0   # Keep top 25% of local contrasts
)

print(f"Global CWMC score: {result['pooled_harmonic_mean']:.4f}")
print(f"Maximum local contrast: {result['max_contrast_ratio']:.4f}")
print(f"Number of patches evaluated: {result['n_patches']}")

# Access contrast maps
contrast_map = result['contrast_map']    # Full image contrast map
threshold_map = result['threshold_map']  # Optimal thresholds per patch
```
Choosing the Right Contrast Metric
| Metric | Range | Best For | Complexity |
|---|---|---|---|
| Weber Contrast | [0, ∞) | Local regions, uniform backgrounds | Low |
| Michelson Contrast | [0, 1] | Periodic patterns, normalized comparison | Low |
| CWMC | [0, 1] | Natural images, global assessment | Medium |
Practical considerations:

- Weber contrast is sensitive to low background values (it can produce very high values)
- Michelson contrast is bounded and symmetric but requires both bright and dark regions
- CWMC provides a comprehensive assessment but is computationally more intensive
All three functions support batch processing, GPU acceleration, and comprehensive error handling for invalid inputs.
How do these metrics relate to human perception?
Each contrast metric captures different aspects of human visual perception:
- Weber's Law: Human contrast sensitivity follows Weber's law for many conditions, making Weber contrast perceptually meaningful.
- Spatial frequency: Michelson contrast is particularly relevant for understanding sensitivity to sinusoidal gratings, a classic stimulus in vision science.
- Natural scenes: CWMC approximates how humans assess overall image quality by considering local variations in contrast.
For accessibility applications, combining these metrics can provide a more complete picture of visual information availability.
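To make the distinction between the two simple metrics concrete, here is a small pure-Python sketch comparing Weber and Michelson contrast on the same luminance pair. These are illustrative formulas, not the odak functions:

```python
def weber(i_target, i_background):
    # Background-relative contrast; unbounded above.
    return (i_target - i_background) / i_background


def michelson(i_max, i_min):
    # Symmetric, normalized contrast; bounded in [0, 1].
    return (i_max - i_min) / (i_max + i_min)


# Same luminance pair, two metric views:
print(round(weber(0.9, 0.3), 4))      # -> 2.0
print(round(michelson(0.9, 0.3), 4))  # -> 0.5
```

The same pair of intensities yields very different numbers, which is why a bounded metric is preferred when comparing across images.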
Loss Functions for Color Vision Deficiency Simulation
Informative · Practical
Understanding Color Vision Deficiency
Color Vision Deficiency (CVD), commonly referred to as color blindness, affects approximately 1 in 12 men (8%) and 1 in 200 women worldwide. It arises when one or more types of cone cells in the retina are missing or have altered spectral sensitivity. The most common forms include:
- Protanopia/Protanomaly (red-blind/red-weak): Reduced sensitivity to red light
- Deuteranopia/Deuteranomaly (green-blind/green-weak): Reduced sensitivity to green light
- Tritanopia/Tritanomaly (blue-blind/blue-weak): Reduced sensitivity to blue light (much rarer)
Individuals with CVD perceive colors differently, often experiencing reduced color discrimination or complete loss of certain color distinctions. This has significant implications for visual information access in digital content, where color is frequently used to convey meaning.
When designing personalized image generation systems for individuals with CVD, we need quantitative metrics to evaluate how well our generated images preserve visual information under CVD conditions. The odak.learn.perception module provides specialized loss functions based on the methodology from the paper "Personalized Image Generation for Color Vision Deficiency Population" (ICCV 2023).
These loss functions measure two critical aspects:

1. Local Contrast Loss (\(L_{LC}\)): measures the decay of local contrast between an original image and its CVD simulation. This ensures that edges and boundaries remain distinguishable under CVD conditions.
2. Color Information Loss (\(L_{CI}\)): measures the L1 distance between primary colors after applying Gaussian blur to both images. This focuses on preserving the main color information while avoiding excessive detail.
The combined CVD loss is formulated as:

\[
L_{CVD} = \alpha L_{LC} + \beta L_{CI},
\]

where \(\alpha = 15.0\) and \(\beta = 1.0\) are the default weighting parameters determined by the authors.
"""
Unit tests for cvd_loss function.
Tests the combined CVD loss computation including local contrast and color information losses.
"""
import torch
import sys
import pytest
def test_cvd_loss_basic():
"""Test basic functionality with simple images."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
# Create two images
image = torch.rand(2, 3, 64, 64)
simulated_image = image * 0.8
total_loss, lc_loss, ci_loss = cvd_loss(image, simulated_image)
# All losses should be positive
assert isinstance(total_loss, torch.Tensor)
assert isinstance(lc_loss, torch.Tensor)
assert isinstance(ci_loss, torch.Tensor)
assert total_loss.item() > 0.0
assert lc_loss.item() > 0.0
assert ci_loss.item() > 0.0
def test_cvd_loss_default_weights():
"""Test with default weights (alpha=15.0, beta=1.0)."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(2, 3, 64, 64)
simulated_image = image * 0.8
total_loss, lc_loss, ci_loss = cvd_loss(image, simulated_image)
# Total loss should be: alpha * lc_loss + beta * ci_loss
expected_total = 15.0 * lc_loss.item() + 1.0 * ci_loss.item()
assert abs(total_loss.item() - expected_total) < 1e-5
def test_cvd_loss_custom_weights():
"""Test with custom weights."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(2, 3, 64, 64)
simulated_image = image * 0.8
total_loss, lc_loss, ci_loss = cvd_loss(
image, simulated_image,
alpha=10.0,
beta=2.0
)
# Total loss should be: alpha * lc_loss + beta * ci_loss
expected_total = 10.0 * lc_loss.item() + 2.0 * ci_loss.item()
assert abs(total_loss.item() - expected_total) < 1e-5
def test_cvd_loss_alpha_zero():
"""Test with alpha=0 (only color information loss)."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(2, 3, 64, 64)
simulated_image = image * 0.8
total_loss, lc_loss, ci_loss = cvd_loss(image, simulated_image, alpha=0.0, beta=1.0)
# Total loss should equal ci_loss when alpha=0
assert abs(total_loss.item() - ci_loss.item()) < 1e-5
def test_cvd_loss_beta_zero():
"""Test with beta=0 (only local contrast loss)."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(2, 3, 64, 64)
simulated_image = image * 0.8
total_loss, lc_loss, ci_loss = cvd_loss(image, simulated_image, alpha=1.0, beta=0.0)
# Total loss should equal lc_loss when beta=0
assert abs(total_loss.item() - lc_loss.item()) < 1e-5
def test_cvd_loss_identical_images():
"""Test with identical images (loss should be zero)."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(2, 3, 64, 64)
simulated_image = image.clone()
total_loss, lc_loss, ci_loss = cvd_loss(image, simulated_image)
assert total_loss.item() == 0.0
assert lc_loss.item() == 0.0
assert ci_loss.item() == 0.0
def test_cvd_loss_gradient_flow():
"""Test that gradients flow through the combined loss."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(2, 3, 64, 64, requires_grad=True)
simulated_image = image.detach() * 0.8
total_loss, lc_loss, ci_loss = cvd_loss(image, simulated_image)
total_loss.backward()
assert image.grad is not None
assert image.grad.shape == image.shape
def test_cvd_loss_different_parameters():
"""Test with different parameter configurations."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(2, 3, 64, 64)
simulated_image = image * 0.8
# Default parameters
total_loss_default, _, _ = cvd_loss(image, simulated_image)
# Custom patch sizes and kernel sizes
total_loss_custom, _, _ = cvd_loss(
image, simulated_image,
lc_patch_size=8,
lc_stride=4,
ci_kernel_size=7,
ci_sigma=2.0
)
# Both should be valid positive losses
assert total_loss_default.item() > 0.0
assert total_loss_custom.item() > 0.0
def test_cvd_loss_minimum_image_size():
"""Test with minimum allowed image size (4x4)."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(1, 3, 4, 4)
simulated_image = image * 0.9
total_loss, lc_loss, ci_loss = cvd_loss(image, simulated_image)
assert total_loss.item() >= 0.0
assert lc_loss.item() >= 0.0
assert ci_loss.item() >= 0.0
def test_cvd_loss_large_batch():
"""Test with larger batch size."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(8, 3, 64, 64)
simulated_image = image * 0.75
total_loss, lc_loss, ci_loss = cvd_loss(image, simulated_image)
assert total_loss.item() > 0.0
assert lc_loss.item() > 0.0
assert ci_loss.item() > 0.0
def test_cvd_loss_type_error_image():
"""Test that TypeError is raised for non-tensor image."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
with pytest.raises(TypeError):
cvd_loss("not_a_tensor", torch.rand(2, 3, 64, 64))
def test_cvd_loss_type_error_simulated():
"""Test that TypeError is raised for non-tensor simulated_image."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
with pytest.raises(TypeError):
cvd_loss(torch.rand(2, 3, 64, 64), "not_a_tensor")
def test_cvd_loss_shape_mismatch():
"""Test that ValueError is raised for shape mismatch."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(2, 3, 64, 64)
simulated_image = torch.rand(2, 3, 32, 32)
with pytest.raises(ValueError):
cvd_loss(image, simulated_image)
def test_cvd_loss_image_too_small():
"""Test that ValueError is raised for images smaller than 4x4."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(1, 3, 3, 3)
simulated_image = image * 0.9
with pytest.raises(ValueError):
cvd_loss(image, simulated_image)
def test_cvd_loss_wrong_dimensions():
"""Test that ValueError is raised for wrong tensor dimensions."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(3, 64, 64) # 3D instead of 4D
simulated_image = image * 0.9
with pytest.raises(ValueError):
cvd_loss(image, simulated_image)
def test_cvd_loss_cuda():
"""Test compatibility with CUDA tensors if available."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
if not torch.cuda.is_available():
pytest.skip("CUDA not available")
image = torch.rand(2, 3, 64, 64, device='cuda')
simulated_image = image.detach() * 0.8
total_loss, lc_loss, ci_loss = cvd_loss(image, simulated_image)
assert total_loss.device.type == 'cuda'
assert lc_loss.device.type == 'cuda'
assert ci_loss.device.type == 'cuda'
assert total_loss.item() > 0.0
def test_cvd_loss_grayscale_images():
"""Test with grayscale images (1 channel)."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(2, 1, 64, 64)
simulated_image = image * 0.8
total_loss, lc_loss, ci_loss = cvd_loss(image, simulated_image)
assert total_loss.item() > 0.0
assert lc_loss.item() > 0.0
assert ci_loss.item() > 0.0
def test_cvd_loss_extreme_weights():
"""Test with extreme weight combinations."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(2, 3, 64, 64)
simulated_image = image * 0.8
# Very high alpha
total_loss_high_alpha, lc_loss_alpha, _ = cvd_loss(image, simulated_image, alpha=100.0, beta=1.0)
# Very high beta
total_loss_high_beta, _, ci_loss_beta = cvd_loss(image, simulated_image, alpha=1.0, beta=100.0)
# Both should be valid and positive
assert total_loss_high_alpha.item() > 0.0
assert total_loss_high_beta.item() > 0.0
# High alpha should primarily scale lc_loss, high beta should scale ci_loss
# The comparison depends on which base loss is larger, so just verify both work
expected_alpha = 100.0 * lc_loss_alpha.item() + 1.0 * ci_loss_beta.item() # approximate
assert abs(total_loss_high_alpha.item() - 100.0 * lc_loss_alpha.item()) < 1.0 + ci_loss_beta.item()
def test_cvd_loss_different_sigma_values():
"""Test with different sigma values for color information loss."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(2, 3, 64, 64)
simulated_image = image * 0.8
# Small sigma
loss_small_sigma, _, _ = cvd_loss(
image, simulated_image,
ci_sigma=0.5
)
# Large sigma
loss_large_sigma, _, _ = cvd_loss(
image, simulated_image,
ci_sigma=3.0
)
# Both should be valid positive losses
assert loss_small_sigma.item() > 0.0
assert loss_large_sigma.item() > 0.0
def test_cvd_loss_return_values_consistency():
"""Test that return values are always in correct order."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(2, 3, 64, 64)
simulated_image = image * 0.8
# Run multiple times to ensure consistency
for _ in range(5):
total_loss, lc_loss, ci_loss = cvd_loss(image, simulated_image)
# Verify relationship
expected_total = 15.0 * lc_loss.item() + 1.0 * ci_loss.item()
assert abs(total_loss.item() - expected_total) < 1e-5
def test_cvd_loss_inverted_colors():
"""Test with inverted colors (extreme case)."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(2, 3, 64, 64)
simulated_image = 1.0 - image # Invert
total_loss, lc_loss, ci_loss = cvd_loss(image, simulated_image)
# Total loss should be positive due to color information differences
# Note: inverted colors preserve contrast, so lc_loss may be 0
assert total_loss.item() > 0.0
assert ci_loss.item() > 0.0 # Color information loss should be positive
assert lc_loss.item() >= 0.0 # Contrast may be preserved
def test_cvd_loss_very_similar_images():
"""Test with very similar images."""
from odak.learn.perception.cvd_loss_functions import cvd_loss
image = torch.rand(2, 3, 64, 64)
# Very slight difference
simulated_image = image * 0.999
total_loss, lc_loss, ci_loss = cvd_loss(image, simulated_image)
# Loss should be very small but positive
assert total_loss.item() > 0.0
assert total_loss.item() < 0.1 # Should be very small
assert lc_loss.item() < 0.1
assert ci_loss.item() < 0.1
def run_all_tests():
"""Run all tests and return True if all pass."""
try:
test_cvd_loss_basic()
test_cvd_loss_default_weights()
test_cvd_loss_custom_weights()
test_cvd_loss_alpha_zero()
test_cvd_loss_beta_zero()
test_cvd_loss_identical_images()
test_cvd_loss_gradient_flow()
test_cvd_loss_different_parameters()
test_cvd_loss_minimum_image_size()
test_cvd_loss_large_batch()
test_cvd_loss_type_error_image()
test_cvd_loss_type_error_simulated()
test_cvd_loss_shape_mismatch()
test_cvd_loss_image_too_small()
test_cvd_loss_wrong_dimensions()
test_cvd_loss_grayscale_images()
test_cvd_loss_extreme_weights()
test_cvd_loss_different_sigma_values()
test_cvd_loss_return_values_consistency()
test_cvd_loss_inverted_colors()
test_cvd_loss_very_similar_images()
# Test CUDA if available
if torch.cuda.is_available():
test_cvd_loss_cuda()
print("All tests passed!")
return True
except AssertionError as e:
print(f"Test failed: {e}")
import traceback
traceback.print_exc()
return False
except Exception as e:
print(f"Error: {e}")
import traceback
traceback.print_exc()
return False
if __name__ == "__main__":
success = run_all_tests()
sys.exit(0 if success else 1)
- Required imports for CVD loss computation.
- Generate synthetic RGB images [B, C, H, W] for testing.
- Create a CVD-simulated version of the image (scaled by 0.8).
- Compute the combined CVD loss with default weights (α=15.0, β=1.0).
- Access individual component losses: local contrast and color information.
- Verify gradient flow through the loss computation for optimization.
In practice, you would use these losses during training to optimize image generation networks. The local contrast loss encourages preservation of structural information, while the color information loss ensures that primary colors remain distinguishable under CVD simulation.
Local Contrast Loss
The local contrast loss evaluates how well local contrast patterns are preserved between the original image and its CVD simulation:

\[
L_{LC} = \frac{1}{N} \sum_{p=1}^{N} \left| C_{ori}^{(p)} - C_{sim}^{(p)} \right|,
\]

where \(C_{ori}\) and \(C_{sim}\) are the contrasts of corresponding patches in the original and simulated images, respectively, and \(N\) is the number of patches. A value close to 0 indicates good contrast preservation.
```python
from odak.learn.perception import local_contrast_loss
import torch

# Original and CVD-simulated images
original = torch.rand(1, 3, 128, 128)
simulated = torch.rand_like(original) * 0.9  # Simulated CVD version

# Compute local contrast loss
loss = local_contrast_loss(original, simulated)
print(f"Local contrast loss: {loss.item():.4f}")
```
Color Information Loss
The color information loss focuses on primary colors by applying Gaussian blur before comparison:

\[
L_{CI} = \left\lVert \Phi(I_{ori}) - \Phi(I_{sim}) \right\rVert_1,
\]

where \(\Phi(\cdot)\) denotes a Gaussian blur operator that extracts the primary color information while suppressing excessive detail.
```python
from odak.learn.perception import color_information_loss
import torch

# Original and CVD-simulated images
original = torch.rand(2, 3, 256, 256)
simulated = torch.rand_like(original) * 0.85

# Compute color information loss with custom blur parameters
loss = color_information_loss(original, simulated, kernel_size=7, sigma=1.5)
print(f"Color information loss: {loss.item():.4f}")
```
Combined CVD Loss
For practical applications, the combined loss allows balancing between contrast preservation and color information retention:
```python
from odak.learn.perception import cvd_loss
import torch

# Training data
original_images = torch.rand(4, 3, 128, 128)
cvd_simulated = original_images * 0.8  # Simplified CVD simulation

# Compute combined loss
total_loss, local_contrast, color_info = cvd_loss(
    original_images,
    cvd_simulated,
    alpha=15.0,  # Weight for local contrast
    beta=1.0     # Weight for color information
)

print(f"Total CVD loss: {total_loss.item():.4f}")
print(f"  Local contrast component: {local_contrast.item():.4f}")
print(f"  Color information component: {color_info.item():.4f}")
```
How are the weighting parameters (α, β) determined?
The default values α=15.0 and β=1.0 were empirically determined by the authors to balance the importance of contrast preservation against color information retention. In practice, you may need to adjust these weights based on:
- The severity of CVD (protanopia, deuteranopia, tritanopia)
- The specific application (medical imaging, display design, accessibility testing)
- Subjective user studies with CVD individuals
Challenge: Implement CVD-aware image generation
Using the loss functions provided, implement a complete image generation pipeline optimized for CVD accessibility:
- Create a simple generator network that transforms input images
- Use the combined CVD loss as your objective function
- Train your generator to produce images that maintain information under CVD simulation
- Evaluate your results using standard metrics and potentially user studies
To add these to odak, you can rely on the pull request feature on GitHub. You can also create a new engineering note for CVD-aware generation in docs/notes/cvd_image_generation.md.
Lab Exercise: Optimizing Display Colors for CVD
As a practical exercise, try the following:
- Generate a set of test images with various color palettes
- Apply CVD simulation to each image
- Compute the CVD loss for different color combinations
- Identify which color combinations minimize the loss (are most CVD-friendly)
- Experiment with adjusting the loss weights (α, β) to prioritize different aspects
This exercise will give you hands-on experience with perceptual optimization and help you understand the trade-offs in designing CVD-accessible content.
Closing Remarks
In exploring color perception through both biological and computational lenses, we have journeyed from the photoreceptor cells in our eyes to the algorithms that simulate how people with Color Vision Deficiency experience the world.
The loss functions we've introduced—local contrast loss and color information loss—represent a bridge between theoretical understanding and practical application. They enable us to quantify how well digital content preserves information for individuals with CVD, making our technology more inclusive and accessible.
This exploration into the nature of color and perception sets the stage for deeper examination of how we can create technology that serves a diverse population. Whether you're designing displays, developing accessibility tools, or researching computational imaging, understanding these principles will help you create more inclusive solutions.
Consider revisiting this chapter
Remember that you can always revisit this chapter as you progress with the course and as you need it. This chapter is vital for establishing a means to complete your assignments and could help form a suitable base for collaborating with my research group or other experts in the field in the future.
[1] Jeremy Freeman and Eero P. Simoncelli. Metamers of the ventral stream. Nature Neuroscience, 14:1195–1201, 2011. doi:10.1038/nn.2889.
[2] Trevor D. Lamb. Why rods and cones? Eye, 30:179–185, 2015. doi:10.1038/eye.2015.236.
[3] Brian P. Schmidt, Maureen Neitz, and Jay Neitz. Neurobiological hypothesis of color appearance and hue perception. Journal of the Optical Society of America A, 31(4):A195–A207, 2014. doi:10.1364/JOSAA.31.00A195.
[4] H. V. Walters. Some experiments on the trichromatic theory of vision. Proceedings of the Royal Society of London. Series B - Biological Sciences, 131:27–50, 1942. doi:10.1098/rspb.1942.0016.
[5] Andrew Stockman and Lindsay T. Sharpe. The spectral sensitivities of the middle- and long-wavelength-sensitive cones derived from measurements in observers of known genotype. Vision Research, 40:1711–1737, 2000. doi:10.1016/S0042-6989(00)00021-3.
[6] Steven K. Shevell and Paul R. Martin. Color opponency: tutorial. Journal of the Optical Society of America A, 34(8):1099–1110, 2017. doi:10.1364/JOSAA.34.001099.