odak.learn.tools

Provides necessary definitions for general tools used across the library.

PerspectiveCamera

A lightweight perspective camera model.

Stores camera intrinsics and extrinsics and provides coordinate-transform utilities.

Parameters:

  • R (Tensor) –
             Rotation matrix, shape ``(3, 3)`` or ``(1, 3, 3)``.
    
  • T (Tensor) –
             Translation vector, shape ``(3,)`` or ``(1, 3)``.
    
  • focal_length (Tensor) –
             Focal lengths ``(fx, fy)``, shape ``(2,)`` or ``(1, 2)``.
    
  • principal_point (Tensor) –
             Principal point ``(px, py)``, shape ``(2,)`` or ``(1, 2)``.
    
  • device (device or str, default: 'cpu' ) –
             Device for all tensors (default: ``"cpu"``).
    
Source code in odak/learn/tools/camera.py
class PerspectiveCamera:
    """
    A lightweight perspective camera model.

    Stores camera intrinsics and extrinsics and provides
    coordinate-transform utilities.

    Parameters
    ----------
    R              : torch.Tensor
                     Rotation matrix, shape ``(3, 3)`` or ``(1, 3, 3)``.
    T              : torch.Tensor
                     Translation vector, shape ``(3,)`` or ``(1, 3)``.
    focal_length   : torch.Tensor
                     Focal lengths ``(fx, fy)``, shape ``(2,)`` or ``(1, 2)``.
    principal_point: torch.Tensor
                     Principal point ``(px, py)``, shape ``(2,)`` or ``(1, 2)``.
    device         : torch.device or str, optional
                     Device for all tensors (default: ``"cpu"``).
    """

    def __init__(self, R, T, focal_length, principal_point, device="cpu"):
        self.device = torch.device(device)
        self.R = (
            R.to(self.device)
            if isinstance(R, torch.Tensor)
            else torch.tensor(R, dtype=torch.float32, device=self.device)
        )
        self.T = (
            T.to(self.device)
            if isinstance(T, torch.Tensor)
            else torch.tensor(T, dtype=torch.float32, device=self.device)
        )
        self.focal_length = (
            focal_length.to(self.device)
            if isinstance(focal_length, torch.Tensor)
            else torch.tensor(focal_length, dtype=torch.float32, device=self.device)
        )
        self.principal_point = (
            principal_point.to(self.device)
            if isinstance(principal_point, torch.Tensor)
            else torch.tensor(principal_point, dtype=torch.float32, device=self.device)
        )

    def transform_world_to_camera_space(self, points):
        """
        Transform world-space points into camera space.

        Follows the convention: ``X_cam = X_world @ R + T``.

        Parameters
        ----------
        points : torch.Tensor
                 World-space points, shape ``(N, 3)``.

        Returns
        -------
        cam_points : torch.Tensor
                     Camera-space points, shape ``(N, 3)``.
        """
        R = self.R[0] if self.R.dim() == 3 else self.R
        T = self.T[0] if self.T.dim() == 2 else self.T
        return points @ R + T

    def get_camera_center(self):
        """
        Compute the camera centre in world coordinates.

        Returns
        -------
        center : torch.Tensor
                 Camera centre, shape ``(1, 3)``.
        """
        R = self.R[0] if self.R.dim() == 3 else self.R
        T = self.T[0] if self.T.dim() == 2 else self.T
        center = -T @ R.transpose(0, 1)
        return center.unsqueeze(0)
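The convention above can be checked with a few lines of plain PyTorch (no odak required); the values here are illustrative only:

```python
import torch

# World-to-camera convention used by PerspectiveCamera: X_cam = X_world @ R + T.
R = torch.eye(3)                        # identity rotation
T = torch.tensor([0.0, 0.0, 5.0])       # translate 5 units along z

# The camera centre is the world point that maps to the camera-space origin;
# solving center @ R + T = 0 gives center = -T @ R^T, as get_camera_center() does.
center = -T @ R.transpose(0, 1)
assert torch.allclose(center @ R + T, torch.zeros(3))
```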

get_camera_center()

Compute the camera centre in world coordinates.

Returns:

  • center ( Tensor ) –

    Camera centre, shape (1, 3).

Source code in odak/learn/tools/camera.py
def get_camera_center(self):
    """
    Compute the camera centre in world coordinates.

    Returns
    -------
    center : torch.Tensor
             Camera centre, shape ``(1, 3)``.
    """
    R = self.R[0] if self.R.dim() == 3 else self.R
    T = self.T[0] if self.T.dim() == 2 else self.T
    center = -T @ R.transpose(0, 1)
    return center.unsqueeze(0)

transform_world_to_camera_space(points)

Transform world-space points into camera space.

Follows the convention: X_cam = X_world @ R + T.

Parameters:

  • points (Tensor) –
     World-space points, shape ``(N, 3)``.
    

Returns:

  • cam_points ( Tensor ) –

    Camera-space points, shape (N, 3).

Source code in odak/learn/tools/camera.py
def transform_world_to_camera_space(self, points):
    """
    Transform world-space points into camera space.

    Follows the convention: ``X_cam = X_world @ R + T``.

    Parameters
    ----------
    points : torch.Tensor
             World-space points, shape ``(N, 3)``.

    Returns
    -------
    cam_points : torch.Tensor
                 Camera-space points, shape ``(N, 3)``.
    """
    R = self.R[0] if self.R.dim() == 3 else self.R
    T = self.T[0] if self.T.dim() == 2 else self.T
    return points @ R + T

blur_gaussian(field, kernel_length=[21, 21], nsigma=[3, 3], padding='same')

Blur a field using a Gaussian kernel.

This function applies Gaussian blur to the input field by convolving it with a Gaussian kernel.

Parameters:

  • field (tensor) –
            MxN field to be blurred.
    
  • kernel_length (list, default: [21, 21] ) –
            Length of the Gaussian kernel along X and Y axes.
    
  • nsigma (list, default: [3, 3] ) –
            Sigma of the Gaussian kernel along X and Y axes.
    
  • padding (int or str, default: 'same' ) –
            Padding value, see torch.nn.functional.conv2d() for more.
    

Returns:

  • blurred_field ( tensor ) –

    Blurred field.

Source code in odak/learn/tools/matrix.py
def blur_gaussian(field, kernel_length=[21, 21], nsigma=[3, 3], padding="same"):
    """
    Blur a field using a Gaussian kernel.

    This function applies Gaussian blur to the input field using convolution with
    a Gaussian kernel.

    Parameters
    ----------
    field         : torch.tensor
                    MxN field to be blurred.
    kernel_length : list
                    Length of the Gaussian kernel along X and Y axes.
    nsigma        : list
                    Sigma of the Gaussian kernel along X and Y axes.
    padding       : int or string
                    Padding value, see torch.nn.functional.conv2d() for more.

    Returns
    ----------
    blurred_field : torch.tensor
                    Blurred field.
    """
    kernel = generate_2d_gaussian(kernel_length, nsigma).to(field.device)
    kernel = kernel.unsqueeze(0).unsqueeze(0)
    if len(field.shape) == 2:
        field = field.view(1, 1, field.shape[-2], field.shape[-1])
    blurred_field = torch.nn.functional.conv2d(field, kernel, padding=padding)
    if field.shape[1] == 1:
        blurred_field = blurred_field.view(
            blurred_field.shape[-2], blurred_field.shape[-1]
        )
    return blurred_field
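A minimal, self-contained sketch of the same blur, building a normalised Gaussian kernel by hand instead of calling generate_2d_gaussian:

```python
import torch

# Build a 21 x 21 Gaussian kernel and normalise it so the blur preserves energy.
x = torch.linspace(-10.0, 10.0, 21)
X, Y = torch.meshgrid(x, x, indexing='ij')
kernel = torch.exp(-(X ** 2 + Y ** 2) / (2.0 * 3.0 ** 2))
kernel = (kernel / kernel.sum()).unsqueeze(0).unsqueeze(0)

# A constant field stays constant under a normalised blur (away from borders).
field = torch.ones(1, 1, 64, 64)
blurred = torch.nn.functional.conv2d(field, kernel, padding='same')
assert abs(blurred[0, 0, 32, 32].item() - 1.0) < 1e-4
```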

center_of_triangle(triangle)

Definition to calculate center of a triangle.

Parameters:

  • triangle (ndarray) –
            An array that contains three points defining a triangle (Mx3). It can also parallel process many triangles (NxMx3).
    
Source code in odak/raytracing/primitives.py
def center_of_triangle(triangle):
    """
    Definition to calculate center of a triangle.

    Parameters
    ----------
    triangle      : ndarray
                    An array that contains three points defining a triangle (Mx3). It can also parallel process many triangles (NxMx3).
    """
    if len(triangle.shape) == 2:
        triangle = triangle.reshape((1, 3, 3))
    center = np.mean(triangle, axis=1)
    return center
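For instance, the centroid of a right triangle with legs of length 3 is simply the vertex mean, reproduced here without odak:

```python
import numpy as np

triangle = np.array([[0.0, 0.0, 0.0],
                     [3.0, 0.0, 0.0],
                     [0.0, 3.0, 0.0]])
# Same computation as center_of_triangle: reshape to (N, 3, 3), average vertices.
center = np.mean(triangle.reshape(1, 3, 3), axis=1)
assert np.allclose(center, [[1.0, 1.0, 0.0]])
```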

circular_binary_mask(px, py, r)

Generate a 2D circular binary mask.

Parameters:

  • px (int) –

    Pixel count in x dimension.

  • py (int) –

    Pixel count in y dimension.

  • r (Union[int, float]) –

    Radius of the circle.

Returns:

  • Tensor

    Binary mask of shape [1, 1, px, py].

Source code in odak/learn/tools/mask.py
def circular_binary_mask(px: int, py: int, r: Union[int, float]) -> torch.Tensor:
    """
    Generate a 2D circular binary mask.

    Parameters
    ----------
    px : int
        Pixel count in x dimension.
    py : int
        Pixel count in y dimension.
    r : Union[int, float]
        Radius of the circle.

    Returns
    -------
    torch.Tensor
        Binary mask of shape [1, 1, px, py].
    """
    x = torch.linspace(-px / 2.0, px / 2.0, px)
    y = torch.linspace(-py / 2.0, py / 2.0, py)
    X, Y = torch.meshgrid(x, y, indexing="ij")
    Z = (X**2 + Y**2) ** 0.5
    mask = torch.zeros_like(Z)
    mask[Z < r] = 1
    return mask.unsqueeze(0).unsqueeze(0)

convolve2d(field, kernel)

Convolve a field with a kernel using frequency domain multiplication.

This function performs 2D convolution by transforming both the field and kernel to frequency domain, multiplying them, and transforming back to spatial domain.

Parameters:

  • field (tensor) –
          Input field with MxN shape.
    
  • kernel (tensor) –
          Input kernel with MxN shape.
    

Returns:

  • convolved_field ( tensor ) –

    Convolved field.

Source code in odak/learn/tools/matrix.py
def convolve2d(field, kernel):
    """
    Convolve a field with a kernel using frequency domain multiplication.

    This function performs 2D convolution by transforming both the field and kernel 
    to frequency domain, multiplying them, and transforming back to spatial domain.

    Parameters
    ----------
    field       : torch.tensor
                  Input field with MxN shape.
    kernel      : torch.tensor
                  Input kernel with MxN shape.

    Returns
    ----------
    convolved_field   : torch.tensor
                        Convolved field.
    """
    fr = torch.fft.fft2(field)
    fr2 = torch.fft.fft2(torch.flip(torch.flip(kernel, [1, 0]), [0, 1]))
    m, n = fr.shape
    convolved_field = torch.real(torch.fft.ifft2(fr * fr2))
    convolved_field = torch.roll(convolved_field, shifts=(int(n / 2 + 1), 0), dims=(1, 0))
    convolved_field = torch.roll(convolved_field, shifts=(int(m / 2 + 1), 0), dims=(0, 1))
    return convolved_field
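The underlying identity is the convolution theorem: multiplying two FFTs performs a circular convolution. A quick check with a Dirac impulse (the identity element of convolution), assuming only torch:

```python
import torch

field = torch.rand(8, 8)
delta = torch.zeros(8, 8)
delta[0, 0] = 1.0                        # impulse at the origin
# fft2(delta) is all ones, so the product is fft2(field) and ifft2 recovers field.
out = torch.real(torch.fft.ifft2(torch.fft.fft2(field) * torch.fft.fft2(delta)))
assert torch.allclose(out, field, atol=1e-5)
```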

correlation_2d(first_tensor, second_tensor)

Calculate the correlation between two tensors using FFT.

This function computes the 2D correlation between two tensors using frequency domain multiplication. It's equivalent to computing cross-correlation using FFT techniques.

Parameters:

  • first_tensor (tensor) –
            First tensor.
    
  • second_tensor (tensor) –
            Second tensor.
    

Returns:

  • correlation ( tensor ) –

    Correlation between the two tensors.

Source code in odak/learn/tools/matrix.py
def correlation_2d(first_tensor, second_tensor):
    """
    Calculate the correlation between two tensors using FFT.

    This function computes the 2D correlation between two tensors using 
    frequency domain multiplication. It's equivalent to computing 
    cross-correlation using FFT techniques.

    Parameters
    ----------
    first_tensor  : torch.tensor
                    First tensor.
    second_tensor : torch.tensor
                    Second tensor.

    Returns
    ----------
    correlation   : torch.tensor
                    Correlation between the two tensors.
    """
    fft_first_tensor = torch.fft.fft2(first_tensor)
    fft_second_tensor = torch.fft.fft2(second_tensor)
    conjugate_second_tensor = torch.conj(fft_second_tensor)
    correlation = torch.fft.ifftshift(
        torch.fft.ifft2(fft_first_tensor * conjugate_second_tensor)
    )
    return correlation
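A small check of the FFT cross-correlation recipe (taking the real part for the comparison): autocorrelating a single spike gives a flat power spectrum, so the zero-lag peak lands at the centre after ifftshift:

```python
import torch

signal = torch.zeros(16, 16)
signal[3, 5] = 1.0                        # a single spike
f = torch.fft.fft2(signal)
corr = torch.real(torch.fft.ifftshift(torch.fft.ifft2(f * torch.conj(f))))
assert torch.argmax(corr).item() == 16 * 8 + 8   # zero lag at the centre pixel
```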

crop_center(field, size=None)

Crop the center of a field to specified size or half of current size.

This function crops the center of a field to either half of its current size (default) or to a specified size. The input can be 2D, 3D or 4D tensors.

Parameters:

  • field (tensor) –
          Input field 2M x 2N or K x L x 2M x 2N or K x 2M x 2N x L array.
    
  • size (list, default: None ) –
          Dimensions to crop with respect to center of the image (e.g., M x N or 1 x 1 x M x N).
          If None, crops to half of the current size.
    

Returns:

  • cropped ( tensor ) –

    Cropped version of the input field.

Source code in odak/learn/tools/matrix.py
def crop_center(field, size=None):
    """
    Crop the center of a field to specified size or half of current size.

    This function crops the center of a field to either half of its current size (default) 
    or to a specified size. The input can be 2D, 3D or 4D tensors.

    Parameters
    ----------
    field       : torch.tensor
                  Input field 2M x 2N or K x L x 2M x 2N or K x 2M x 2N x L array.
    size        : list
                  Dimensions to crop with respect to center of the image (e.g., M x N or 1 x 1 x M x N).
                  If None, crops to half of the current size.

    Returns
    ----------
    cropped     : torch.tensor
                  Cropped version of the input field.
    """
    orig_resolution = field.shape
    if len(field.shape) < 3:
        field = field.unsqueeze(0)
    if len(field.shape) < 4:
        field = field.unsqueeze(0)
    permute_flag = False
    if field.shape[-1] < 5:
        permute_flag = True
        field = field.permute(0, 3, 1, 2)
    if size is None:
        qx = int(field.shape[-2] // 4)
        qy = int(field.shape[-1] // 4)
        cropped_padded = field[
            :, :, qx : qx + field.shape[-2] // 2, qy : qy + field.shape[-1] // 2
        ]
    else:
        cx = int(field.shape[-2] // 2)
        cy = int(field.shape[-1] // 2)
        hx = int(size[-2] // 2)
        hy = int(size[-1] // 2)
        cropped_padded = field[:, :, cx - hx : cx + hx, cy - hy : cy + hy]
    cropped = cropped_padded
    if permute_flag:
        cropped = cropped.permute(0, 2, 3, 1)
    if len(orig_resolution) == 2:
        cropped = cropped.squeeze(0).squeeze(0)
    if len(orig_resolution) == 3:
        cropped = cropped.squeeze(0)
    return cropped
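The default (size=None) branch keeps the central half along each axis; on a plain 2D tensor it reduces to a quarter-offset slice:

```python
import torch

field = torch.arange(64.0).reshape(8, 8)
qx, qy = 8 // 4, 8 // 4                  # quarter offsets, as in the size=None branch
cropped = field[qx:qx + 8 // 2, qy:qy + 8 // 2]
assert cropped.shape == (4, 4)
assert cropped[0, 0].item() == field[2, 2].item()
```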

cross_product(vector1, vector2)

Definition to compute the cross product of two vectors and return the resultant vector, using the method described at http://en.wikipedia.org/wiki/Cross_product

Parameters:

  • vector1 (tensor) –
           A vector/ray.
    
  • vector2 (tensor) –
           A vector/ray.
    

Returns:

  • ray ( tensor ) –

    Array that contains starting points and cosines of a created ray.

Source code in odak/learn/tools/vector.py
def cross_product(vector1, vector2):
    """
    Definition to compute the cross product of two vectors and return the resultant vector, using the method described at http://en.wikipedia.org/wiki/Cross_product

    Parameters
    ----------
    vector1      : torch.tensor
                   A vector/ray.
    vector2      : torch.tensor
                   A vector/ray.

    Returns
    ----------
    ray          : torch.tensor
                   Array that contains starting points and cosines of a created ray.
    """
    angle = torch.cross(vector1[1].T, vector2[1].T)
    ray = torch.stack((vector1[0], angle)).to(torch.float32)
    return ray

distance_between_two_points(point1, point2)

Definition to calculate distance between two given points.

Parameters:

  • point1 (Tensor) –
          First point in X,Y,Z.
    
  • point2 (Tensor) –
          Second point in X,Y,Z.
    

Returns:

  • distance ( Tensor ) –

    Distance in between given two points.

Source code in odak/learn/tools/vector.py
def distance_between_two_points(point1, point2):
    """
    Definition to calculate distance between two given points.

    Parameters
    ----------
    point1      : torch.Tensor
                  First point in X,Y,Z.
    point2      : torch.Tensor
                  Second point in X,Y,Z.

    Returns
    ----------
    distance    : torch.Tensor
                  Distance in between given two points.
    """
    point1 = torch.tensor(point1) if not isinstance(point1, torch.Tensor) else point1
    point2 = torch.tensor(point2) if not isinstance(point2, torch.Tensor) else point2

    if len(point1.shape) == 1 and len(point2.shape) == 1:
        distance = torch.sqrt(torch.sum((point1 - point2) ** 2))
    elif len(point1.shape) == 2 or len(point2.shape) == 2:
        distance = torch.sqrt(torch.sum((point1 - point2) ** 2, dim=-1))

    return distance
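For two single points this is the familiar Euclidean distance; a 3-4-5 triangle makes a quick check:

```python
import torch

point1 = torch.tensor([0.0, 0.0, 0.0])
point2 = torch.tensor([3.0, 4.0, 0.0])
# Same computation as the single-point branch above.
distance = torch.sqrt(torch.sum((point1 - point2) ** 2))
assert torch.isclose(distance, torch.tensor(5.0))
```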

evaluate_3d_gaussians(points, centers=torch.zeros(1, 3), scales=torch.ones(1, 3), angles=torch.zeros(1, 3), opacity=torch.ones(1, 1))

Evaluate 3D Gaussian functions at given points, with optional rotation.

Parameters:

  • points (Tensor, shape [n, 3]) –
          The 3D points at which to evaluate the Gaussians.
    
  • centers (Tensor, shape [n, 3], default: torch.zeros(1, 3) ) –
          The centers of the Gaussians.
    
  • scales (Tensor, shape [n, 3], default: torch.ones(1, 3) ) –
          The standard deviations (spread) of the Gaussians along each axis.
    
  • angles (Tensor, shape [n, 3], default: torch.zeros(1, 3) ) –
          The rotation angles (in radians) for each Gaussian, applied to the points.
    
  • opacity (Tensor, shape [n, 1], default: torch.ones(1, 1) ) –
          Opacity of the Gaussians.
    

Returns:

  • intensities ( (Tensor, shape[n, 1]) ) –

    The evaluated Gaussian intensities at each point.

Source code in odak/learn/tools/function.py
def evaluate_3d_gaussians(
    points,
    centers=torch.zeros(1, 3),
    scales=torch.ones(1, 3),
    angles=torch.zeros(1, 3),
    opacity=torch.ones(1, 1),
) -> torch.Tensor:
    """
    Evaluate 3D Gaussian functions at given points, with optional rotation.

    Parameters
    ----------
    points      : torch.Tensor, shape [n, 3]
                  The 3D points at which to evaluate the Gaussians.
    centers     : torch.Tensor, shape [n, 3]
                  The centers of the Gaussians.
    scales      : torch.Tensor, shape [n, 3]
                  The standard deviations (spread) of the Gaussians along each axis.
    angles      : torch.Tensor, shape [n, 3]
                  The rotation angles (in radians) for each Gaussian, applied to the points.
    opacity     : torch.Tensor, shape [n, 1]
                  Opacity of the Gaussians.

    Returns
    -------
    intensities : torch.Tensor, shape [n, 1]
                  The evaluated Gaussian intensities at each point.
    """
    points_rotated, _, _, _ = rotate_points(point=points, angles=angles, origin=centers)
    points_rotated = points_rotated - centers.unsqueeze(0)
    scales = scales.unsqueeze(0)
    exponent = torch.sum(-0.5 * (points_rotated / scales) ** 2, dim=-1)
    divider = (scales[:, :, 0] * scales[:, :, 1] * scales[:, :, 2]) * (
        2.0 * torch.pi
    ) ** (3.0 / 2.0)
    exponential = torch.exp(exponent)
    intensities = exponential / divider
    intensities = opacity.T * intensities
    return intensities
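In the unrotated, unit-scale case the formula reduces to the standard 3D normal density; evaluating it at its own centre should give (2 * pi)^(-3/2):

```python
import torch

scales = torch.ones(1, 3)
point = torch.zeros(1, 3)                 # the point coincides with the centre
# Same exponent and divider as in evaluate_3d_gaussians, without rotation.
exponent = torch.sum(-0.5 * (point / scales) ** 2, dim=-1)
divider = (scales[0, 0] * scales[0, 1] * scales[0, 2]) * (2.0 * torch.pi) ** 1.5
intensity = torch.exp(exponent) / divider
assert torch.isclose(intensity[0], torch.tensor((2.0 * torch.pi) ** -1.5))
```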

freeze(model)

A utility function to freeze the parameters of a provided model.

This function sets requires_grad to False for all parameters in the model, effectively freezing them during training.

Parameters:

  • model (Module) –

    Model whose parameters are to be frozen. This should be a PyTorch model instance.

Returns:

  • None

    The function modifies the model in-place.

Source code in odak/learn/tools/models.py
def freeze(model):
    """
    A utility function to freeze the parameters of a provided model.

    This function sets `requires_grad` to `False` for all parameters in the model,
    effectively freezing them during training.

    Parameters
    ----------
    model : torch.nn.Module
        Model whose parameters are to be frozen. This should be a PyTorch model instance.

    Returns
    -------
    None
        The function modifies the model in-place.
    """
    for parameter in model.parameters():
        parameter.requires_grad = False
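Usage is a one-liner; the loop below inlines what freeze does so the effect can be verified:

```python
import torch

model = torch.nn.Linear(4, 2)
for parameter in model.parameters():      # what freeze(model) does in-place
    parameter.requires_grad = False
assert all(not p.requires_grad for p in model.parameters())
```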

generate_2d_dirac_delta(kernel_length=[21, 21], a=[3, 3], mu=[0, 0], theta=0, normalize=False)

Generate 2D Dirac delta function using Gaussian approximation.

This function creates a 2D Dirac delta function by using a Gaussian distribution with very small standard deviations (a values) to approximate the behavior. Inspired from https://en.wikipedia.org/wiki/Dirac_delta_function

Parameters:

  • kernel_length (list, default: [21, 21] ) –
            Length of the Dirac delta function along X and Y axes.
    
  • a (list, default: [3, 3] ) –
            The scale factor in Gaussian distribution to approximate the Dirac delta function.
            As a approaches zero, the Gaussian distribution becomes infinitely narrow and tall at the center (x=0), approaching the Dirac delta function.
    
  • mu (list, default: [0, 0] ) –
            Mu of the Gaussian kernel along X and Y axes.
    
  • theta (float, default: 0 ) –
            The rotation angle of the 2D Dirac delta function.
    
  • normalize (bool, default: False ) –
            If set True, normalize the output to maximum value of 1.
    

Returns:

  • kernel_2d ( tensor ) –

    Generated 2D Dirac delta function.

Source code in odak/learn/tools/matrix.py
def generate_2d_dirac_delta(
    kernel_length=[21, 21], a=[3, 3], mu=[0, 0], theta=0, normalize=False
):
    """
    Generate 2D Dirac delta function using Gaussian approximation.

    This function creates a 2D Dirac delta function by using a Gaussian distribution 
    with very small standard deviations (a values) to approximate the behavior.
    Inspired from https://en.wikipedia.org/wiki/Dirac_delta_function

    Parameters
    ----------
    kernel_length : list
                    Length of the Dirac delta function along X and Y axes.
    a             : list
                    The scale factor in Gaussian distribution to approximate the Dirac delta function.
                    As a approaches zero, the Gaussian distribution becomes infinitely narrow and tall at the center (x=0), approaching the Dirac delta function.
    mu            : list
                    Mu of the Gaussian kernel along X and Y axes.
    theta         : float
                    The rotation angle of the 2D Dirac delta function.
    normalize     : bool
                    If set True, normalize the output to maximum value of 1.

    Returns
    ----------
    kernel_2d     : torch.tensor
                    Generated 2D Dirac delta function.
    """
    x = torch.linspace(
        -kernel_length[0] / 2.0, kernel_length[0] / 2.0, kernel_length[0]
    )
    y = torch.linspace(
        -kernel_length[1] / 2.0, kernel_length[1] / 2.0, kernel_length[1]
    )
    X, Y = torch.meshgrid(x, y, indexing="ij")
    X = X - mu[0]
    Y = Y - mu[1]
    theta = torch.as_tensor(theta)
    X_rot = X * torch.cos(theta) - Y * torch.sin(theta)
    Y_rot = X * torch.sin(theta) + Y * torch.cos(theta)
    kernel_2d = (1 / (abs(a[0] * a[1]) * torch.pi)) * torch.exp(
        -((X_rot / a[0]) ** 2 + (Y_rot / a[1]) ** 2)
    )
    if normalize:
        kernel_2d = kernel_2d / kernel_2d.max()
    return kernel_2d
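The approximation behaves as the docstring describes: shrinking a concentrates the mass into a taller, narrower peak. A quick comparison using the same formula (theta = 0, mu = [0, 0]):

```python
import torch

x = torch.linspace(-10.5, 10.5, 21)
X, Y = torch.meshgrid(x, x, indexing='ij')

def dirac_approx(a):
    # Same expression as generate_2d_dirac_delta with theta = 0, mu = [0, 0].
    return (1 / (abs(a * a) * torch.pi)) * torch.exp(-((X / a) ** 2 + (Y / a) ** 2))

narrow, wide = dirac_approx(0.5), dirac_approx(3.0)
assert narrow.max() > wide.max()          # smaller a peaks higher
```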

generate_2d_gaussian(kernel_length=[21, 21], nsigma=[3, 3], mu=[0, 0], normalize=False)

Generate 2D Gaussian kernel.

This function creates a 2D Gaussian kernel with specified dimensions and parameters. Inspired from https://stackoverflow.com/questions/29731726/how-to-calculate-a-gaussian-kernel-matrix-efficiently-in-numpy

Parameters:

  • kernel_length (list, default: [21, 21] ) –
            Length of the Gaussian kernel along X and Y axes.
    
  • nsigma (list, default: [3, 3] ) –
            Sigma of the Gaussian kernel along X and Y axes.
    
  • mu (list, default: [0, 0] ) –
            Mu of the Gaussian kernel along X and Y axes.
    
  • normalize (bool, default: False ) –
            If set True, normalize the output to maximum value of 1.
    

Returns:

  • kernel_2d ( tensor ) –

    Generated Gaussian kernel.

Source code in odak/learn/tools/matrix.py
def generate_2d_gaussian(
    kernel_length=[21, 21], nsigma=[3, 3], mu=[0, 0], normalize=False
):
    """
    Generate 2D Gaussian kernel.

    This function creates a 2D Gaussian kernel with specified dimensions and parameters.
    Inspired from https://stackoverflow.com/questions/29731726/how-to-calculate-a-gaussian-kernel-matrix-efficiently-in-numpy

    Parameters
    ----------
    kernel_length : list
                    Length of the Gaussian kernel along X and Y axes.
    nsigma        : list
                    Sigma of the Gaussian kernel along X and Y axes.
    mu            : list
                    Mu of the Gaussian kernel along X and Y axes.
    normalize     : bool
                    If set True, normalize the output to maximum value of 1.

    Returns
    ----------
    kernel_2d     : torch.tensor
                    Generated Gaussian kernel.
    """
    x = torch.linspace(
        -kernel_length[0] / 2.0, kernel_length[0] / 2.0, kernel_length[0]
    )
    y = torch.linspace(
        -kernel_length[1] / 2.0, kernel_length[1] / 2.0, kernel_length[1]
    )
    X, Y = torch.meshgrid(x, y, indexing="ij")
    if nsigma[0] == 0:
        nsigma[0] = 1e-5
    if nsigma[1] == 0:
        nsigma[1] = 1e-5
    kernel_2d = (
        1.0
        / (2.0 * torch.pi * nsigma[0] * nsigma[1])
        * torch.exp(
            -(
                (X - mu[0]) ** 2.0 / (2.0 * nsigma[0] ** 2.0)
                + (Y - mu[1]) ** 2.0 / (2.0 * nsigma[1] ** 2.0)
            )
        )
    )
    if normalize:
        kernel_2d = kernel_2d / kernel_2d.max()
    return kernel_2d
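The kernel peaks at the grid centre, and normalize=True simply rescales that peak to 1. A standalone reproduction of the formula with zero mu and equal sigmas:

```python
import torch

sigma = 3.0
x = torch.linspace(-10.5, 10.5, 21)       # 21 samples, centre sample at exactly 0
X, Y = torch.meshgrid(x, x, indexing='ij')
kernel = (1.0 / (2.0 * torch.pi * sigma ** 2)) * torch.exp(
    -((X ** 2 + Y ** 2) / (2.0 * sigma ** 2))
)
assert torch.argmax(kernel).item() == 21 * 10 + 10   # peak at the centre
normalized = kernel / kernel.max()
assert normalized.max().item() == 1.0
```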

get_rotation_matrix(tilt_angles=[0.0, 0.0, 0.0], tilt_order='XYZ')

Generate rotation matrix for given tilt angles and tilt order.

Parameters:

  • tilt_angles (list, default: [0.0, 0.0, 0.0] ) –

    Tilt angles in degrees along XYZ axes.

  • tilt_order (str, default: 'XYZ' ) –

    Rotation order (e.g., XYZ, XZY, ZXY, YXZ, ZYX).

Returns:

  • Tensor

    Rotation matrix.

Source code in odak/learn/tools/transformation.py
def get_rotation_matrix(tilt_angles=[0.0, 0.0, 0.0], tilt_order="XYZ"):
    """
    Generate rotation matrix for given tilt angles and tilt order.

    Parameters
    ----------
    tilt_angles : list
        Tilt angles in degrees along XYZ axes.
    tilt_order : str
        Rotation order (e.g., XYZ, XZY, ZXY, YXZ, ZYX).

    Returns
    -------
    torch.Tensor
        Rotation matrix.
    """
    rotx = rotmatx(tilt_angles[0])
    roty = rotmaty(tilt_angles[1])
    rotz = rotmatz(tilt_angles[2])
    if tilt_order == "XYZ":
        rotmat = torch.mm(rotz, torch.mm(roty, rotx))
    elif tilt_order == "XZY":
        rotmat = torch.mm(roty, torch.mm(rotz, rotx))
    elif tilt_order == "ZXY":
        rotmat = torch.mm(roty, torch.mm(rotx, rotz))
    elif tilt_order == "YXZ":
        rotmat = torch.mm(rotz, torch.mm(rotx, roty))
    elif tilt_order == "ZYX":
        rotmat = torch.mm(rotx, torch.mm(roty, rotz))
    else:
        raise ValueError("Unsupported tilt_order: {}".format(tilt_order))
    return rotmat
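With zero tilt around X and Y, every order reduces to a single-axis rotation; a 90-degree Z tilt should map the x axis onto the y axis (rotmatz built inline here, since it lives elsewhere in the module):

```python
import torch

angle = torch.deg2rad(torch.tensor(90.0))
c, s = torch.cos(angle).item(), torch.sin(angle).item()
rotz = torch.tensor([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])
v = torch.tensor([1.0, 0.0, 0.0])
assert torch.allclose(rotz @ v, torch.tensor([0.0, 1.0, 0.0]), atol=1e-6)
```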

grid_sample(no=[10, 10], size=[100.0, 100.0], center=[0.0, 0.0, 0.0], angles=[0.0, 0.0, 0.0])

Generate samples over a surface.

Parameters:

  • no (list, default: [10, 10] ) –

    Number of samples along each dimension.

  • size (list, default: [100.0, 100.0] ) –

    Physical size of the surface along each dimension.

  • center (list, default: [0.0, 0.0, 0.0] ) –

    Center location of the surface.

  • angles (list, default: [0.0, 0.0, 0.0] ) –

    Tilt angles of the surface around X, Y, and Z axes.

Returns:

  • samples ( tensor ) –

    Generated samples.

  • rotx ( tensor ) –

    Rotation matrix around X axis.

  • roty ( tensor ) –

    Rotation matrix around Y axis.

  • rotz ( tensor ) –

    Rotation matrix around Z axis.

Source code in odak/learn/tools/sample.py
def grid_sample(
    no=[10, 10], size=[100.0, 100.0], center=[0.0, 0.0, 0.0], angles=[0.0, 0.0, 0.0]
):
    """
    Generate samples over a surface.

    Parameters
    ----------
    no : list
        Number of samples along each dimension.
    size : list
        Physical size of the surface along each dimension.
    center : list
        Center location of the surface.
    angles : list
        Tilt angles of the surface around X, Y, and Z axes.

    Returns
    -------
    samples : torch.tensor
        Generated samples.
    rotx : torch.tensor
        Rotation matrix around X axis.
    roty : torch.tensor
        Rotation matrix around Y axis.
    rotz : torch.tensor
        Rotation matrix around Z axis.
    """
    center = torch.tensor(center, dtype=torch.float32)
    angles = torch.tensor(angles, dtype=torch.float32)
    size = torch.tensor(size, dtype=torch.float32)
    samples = torch.zeros((no[0], no[1], 3), dtype=torch.float32)
    x = torch.linspace(-size[0] / 2.0, size[0] / 2.0, no[0])
    y = torch.linspace(-size[1] / 2.0, size[1] / 2.0, no[1])
    X, Y = torch.meshgrid(x, y, indexing="ij")
    samples[:, :, 0] = X
    samples[:, :, 1] = Y
    samples = samples.reshape((-1, 3))
    samples, rotx, roty, rotz = rotate_points(samples, angles=angles, offset=center)
    return samples, rotx, roty, rotz
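Before rotation and offset, the samples form a flat z = 0 grid; a standalone reproduction of that first stage:

```python
import torch

no, size = [10, 10], [100.0, 100.0]
samples = torch.zeros((no[0], no[1], 3))
x = torch.linspace(-size[0] / 2.0, size[0] / 2.0, no[0])
y = torch.linspace(-size[1] / 2.0, size[1] / 2.0, no[1])
X, Y = torch.meshgrid(x, y, indexing='ij')
samples[:, :, 0] = X
samples[:, :, 1] = Y
samples = samples.reshape(-1, 3)
assert samples.shape == (100, 3)
assert samples[:, 2].abs().max().item() == 0.0    # plane sits at z = 0
```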

histogram_loss(frame, ground_truth, bins=32, limits=[0.0, 1.0])

Calculates histogram loss between input frame and ground truth.

This function computes the MSE loss between histograms of the input frame and ground truth images, divided into specified number of bins.

Parameters:

  • frame (Tensor) –

    Input frame with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].

  • ground_truth (Tensor) –

    Ground truth with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].

  • bins (int, default: 32 ) –

    Number of bins for histogram calculation (default: 32).

  • limits (list, default: [0.0, 1.0] ) –

    Histogram limits as [min, max] (default: [0.0, 1.0]).

Returns:

  • Tensor

    Histogram loss value.

Source code in odak/learn/tools/loss.py
def histogram_loss(frame, ground_truth, bins=32, limits=[0.0, 1.0]):
    """
    Calculates histogram loss between input frame and ground truth.

    This function computes the MSE loss between histograms of the input frame
    and ground truth images, divided into specified number of bins.

    Parameters
    ----------
    frame : torch.Tensor
        Input frame with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
    ground_truth : torch.Tensor
        Ground truth with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
    bins : int, optional
        Number of bins for histogram calculation (default: 32).
    limits : list, optional
        Histogram limits as [min, max] (default: [0.0, 1.0]).

    Returns
    -------
    torch.Tensor
        Histogram loss value.
    """
    if len(frame.shape) == 2:
        frame = frame.unsqueeze(0).unsqueeze(0)
    elif len(frame.shape) == 3:
        frame = frame.unsqueeze(0)

    if len(ground_truth.shape) == 2:
        ground_truth = ground_truth.unsqueeze(0).unsqueeze(0)
    elif len(ground_truth.shape) == 3:
        ground_truth = ground_truth.unsqueeze(0)

    histogram_frame = torch.zeros(frame.shape[1], bins).to(frame.device)
    histogram_ground_truth = torch.zeros(ground_truth.shape[1], bins).to(frame.device)

    l2 = torch.nn.MSELoss()

    for i in range(frame.shape[1]):
        histogram_frame[i] = torch.histc(
            frame[:, i].flatten(), bins=bins, min=limits[0], max=limits[1]
        )
        histogram_ground_truth[i] = torch.histc(
            ground_truth[:, i].flatten(), bins=bins, min=limits[0], max=limits[1]
        )

    loss = l2(histogram_frame, histogram_ground_truth)

    return loss
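Identical images have identical histograms, so the loss is exactly zero in that case; an inline check of the same per-channel recipe:

```python
import torch

frame = torch.rand(1, 3, 32, 32)
# Histogram of the same channel computed twice, as histogram_loss would.
h1 = torch.histc(frame[:, 0].flatten(), bins=32, min=0.0, max=1.0)
h2 = torch.histc(frame[:, 0].flatten(), bins=32, min=0.0, max=1.0)
assert torch.nn.MSELoss()(h1, h2).item() == 0.0
```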

load_image(fn, normalizeby=0.0, torch_style=False)

Definition to load an image from a given location as a torch tensor.

Parameters:

  • fn (str) –
           Filename.
    
  • normalizeby (float, default: 0.0 ) –
           Value to normalize images with. Default value of zero will lead to no normalization.
    
  • torch_style (bool, default: False ) –
           If set True, it will load an image mxnx3 as 3xmxn.
    

Returns:

  • image ( tensor ) –

    Image loaded as a torch tensor.

Source code in odak/learn/tools/file.py
def load_image(fn, normalizeby=0.0, torch_style=False):
    """
    Definition to load an image from a given location as a torch tensor.

    Parameters
    ----------
    fn           : str
                   Filename.
    normalizeby  : float, optional
                   Value to normalize images with. Default value of zero will lead to no normalization.
    torch_style  : bool, optional
                   If set True, it will load an image mxnx3 as 3xmxn.

    Returns
    -------
    image        : torch.tensor
                   Image loaded as a torch tensor.
    """
    image = tools_load_image(fn, normalizeby=normalizeby, torch_style=torch_style)
    image = torch.from_numpy(image).float()
    return image

load_voxelized_PLY(ply_filename, voxel_size=[0.05, 0.05, 0.05], device=torch.device('cpu'))

Load a point cloud from a PLY file and convert it into a voxel grid representation.

Parameters:

  • ply_filename (str or Path) –

    The path to the input PLY file containing triangle data.

  • voxel_size ((list or tuple, shape(3)), default: [0.05, 0.05, 0.05] ) –

    The size of each voxel in the x, y, and z directions. Default is [0.05, 0.05, 0.05].

  • device (device, default: device('cpu') ) –

    The device on which to perform computations. Default is CPU.

Returns:

  • points ( (Tensor, shape(N, 3)) ) –

    A tensor containing the coordinates of the voxel centers.

  • ground_truth ( (Tensor, shape(Gx * Gy * Gz)) ) –

    A binary tensor where each element indicates whether a corresponding voxel contains at least one point.

Notes
  • The function reads triangle data from the PLY file and computes the center points of these triangles.
  • These points are then processed to create a normalized point cloud, which is converted into a voxel grid.
  • Only voxels containing at least one point are marked as 1 in ground_truth.
  • All operations are performed on the specified device for efficiency.
Source code in odak/learn/tools/transformation.py
def load_voxelized_PLY(
    ply_filename,
    voxel_size=[0.05, 0.05, 0.05],
    device=torch.device("cpu"),
):
    """
    Load a point cloud from a PLY file and convert it into a voxel grid representation.

    Parameters
    ----------
    ply_filename : str or Path
        The path to the input PLY file containing triangle data.
    voxel_size : list or tuple, shape (3,), optional
        The size of each voxel in the x, y, and z directions. Default is [0.05, 0.05, 0.05].
    device : torch.device, optional
        The device on which to perform computations. Default is CPU.

    Returns
    -------
    points : torch.Tensor, shape (N, 3)
        A tensor containing the coordinates of the voxel centers.
    ground_truth : torch.Tensor, shape (Gx * Gy * Gz,)
        A binary tensor where each element indicates whether a corresponding voxel contains at least one point.

    Notes
    -----
    - The function reads triangle data from the PLY file and computes the center points of these triangles.
    - These points are then processed to create a normalized point cloud, which is converted into a voxel grid.
    - Only voxels containing at least one point are marked as 1 in `ground_truth`.
    - All operations are performed on the specified device for efficiency.
    """
    triangles = read_PLY(ply_filename)
    points = center_of_triangle(triangles)
    points = torch.as_tensor(points, device=device)
    points = points - points.mean()
    points = points / torch.amax(points)
    ground_truth = torch.ones(points.shape[0], device=device)
    voxel_locations, voxel_grid = point_cloud_to_voxel(
        points=points,
        voxel_size=voxel_size,
    )
    points = voxel_locations.reshape(-1, 3)
    ground_truth = voxel_grid.reshape(-1)
    return points, ground_truth

michelson_contrast(image, roi_high, roi_low)

Calculates Michelson contrast ratio for given regions of an image.

This function computes the Michelson contrast ratio for high and low intensity regions using the formula: (mean_high - mean_low) / (mean_high + mean_low).

Parameters:

  • image (Tensor) –

    Input image with shape [1 x 3 x m x n], [3 x m x n], or [m x n].

  • roi_high (Tensor) –

    Corner locations of the high intensity region [m_start, m_end, n_start, n_end].

  • roi_low (Tensor) –

    Corner locations of the low intensity region [m_start, m_end, n_start, n_end].

Returns:

  • Tensor

    Michelson contrast for the given regions. Shape is [1] or [3] depending on input.

Source code in odak/learn/tools/loss.py
def michelson_contrast(image, roi_high, roi_low):
    """
    Calculates Michelson contrast ratio for given regions of an image.

    This function computes the Michelson contrast ratio for high and low intensity regions
    using the formula: (mean_high - mean_low) / (mean_high + mean_low).

    Parameters
    ----------
    image : torch.Tensor
        Input image with shape [1 x 3 x m x n], [3 x m x n], or [m x n].
    roi_high : torch.Tensor
        Corner locations of the high intensity region [m_start, m_end, n_start, n_end].
    roi_low : torch.Tensor
        Corner locations of the low intensity region [m_start, m_end, n_start, n_end].

    Returns
    -------
    torch.Tensor
        Michelson contrast for the given regions. Shape is [1] or [3] depending on input.
    """
    if len(image.shape) == 2:
        image = image.unsqueeze(0)
    if len(image.shape) == 3:
        image = image.unsqueeze(0)
    region_low = image[:, :, roi_low[0] : roi_low[1], roi_low[2] : roi_low[3]]
    region_high = image[:, :, roi_high[0] : roi_high[1], roi_high[2] : roi_high[3]]
    high = torch.mean(region_high, dim=(2, 3))
    low = torch.mean(region_low, dim=(2, 3))
    result = (high - low) / (high + low)
    return result.squeeze(0)
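As a quick sanity check of the formula above, here is a self-contained computation of Michelson contrast on a synthetic two-region image (plain torch, not the odak function):

```python
import torch

# Synthetic image: bright upper half at 0.8, dark lower half at 0.2.
image = torch.zeros(64, 64)
image[:32] = 0.8
image[32:] = 0.2

high = image[:32].mean()
low = image[32:].mean()
contrast = (high - low) / (high + low)  # (0.8 - 0.2) / (0.8 + 0.2) = 0.6
```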

multi_scale_total_variation_loss(frame, levels=3)

Calculates multi-scale total variation loss for an input frame.

This function computes the total variation loss at multiple scales by creating an image pyramid where each level has half the resolution of the previous level.

Parameters:

  • frame (Tensor) –

    Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].

  • levels (int, default: 3 ) –

    Number of scales in the image pyramid (default: 3).

Returns:

  • Tensor

    Total variation loss value.

Source code in odak/learn/tools/loss.py
def multi_scale_total_variation_loss(frame, levels=3):
    """
    Calculates multi-scale total variation loss for an input frame.

    This function computes the total variation loss at multiple scales by creating
    an image pyramid where each level has half the resolution of the previous level.

    Parameters
    ----------
    frame : torch.Tensor
        Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].
    levels : int, optional
        Number of scales in the image pyramid (default: 3).

    Returns
    -------
    torch.Tensor
        Total variation loss value.
    """
    if len(frame.shape) == 2:
        frame = frame.unsqueeze(0)
    if len(frame.shape) == 3:
        frame = frame.unsqueeze(0)
    scale = torch.nn.Upsample(scale_factor=0.5, mode="nearest")
    level = frame
    loss = 0
    for i in range(levels):
        if i != 0:
            level = scale(level)
        loss += total_variation_loss(level)
    return loss
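The pyramid construction can be sketched without the library. The snippet below pairs a simple anisotropic total-variation term (absolute neighbour differences, which may differ from odak's exact `total_variation_loss`) with nearest-neighbour downsampling:

```python
import torch

def tv(frame):
    # Sum of absolute horizontal and vertical neighbour differences.
    dx = (frame[..., :, 1:] - frame[..., :, :-1]).abs().sum()
    dy = (frame[..., 1:, :] - frame[..., :-1, :]).abs().sum()
    return dx + dy

def multi_scale_tv(frame, levels=3):
    loss = torch.tensor(0.0)
    level = frame
    for i in range(levels):
        if i != 0:  # halve the resolution at every level after the first
            level = torch.nn.functional.interpolate(
                level, scale_factor=0.5, mode="nearest"
            )
        loss = loss + tv(level)
    return loss

flat = torch.ones(1, 1, 16, 16)
loss = multi_scale_tv(flat)  # a constant image has zero variation at every scale
```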

point_cloud_to_voxel(points, voxel_size=[0.1, 0.1, 0.1])

Convert a point cloud to a voxel grid representation.

Parameters:

  • points ((Tensor, shape(N, 3))) –

    The input point cloud, where each row is a 3D point.

  • voxel_size ((list or Tensor, shape(3)), default: [0.1, 0.1, 0.1] ) –

    The size of each voxel in the x, y, and z directions. Default is [0.1, 0.1, 0.1].

Returns:

  • locations ( (Tensor, shape(Gx, Gy, Gz, 3)) ) –

    The coordinates of each voxel center in the grid.

  • grid ( (Tensor, shape(Gx, Gy, Gz)) ) –

    A binary voxel grid where 1 indicates the presence of at least one point.

Notes
  • The voxel grid is constructed by discretizing the space between the minimum and maximum coordinates of the point cloud.
  • Only voxels containing at least one point are marked as 1.
  • The output grid is of type float32 and resides on the same device as the input points.
Source code in odak/learn/tools/transformation.py
def point_cloud_to_voxel(
    points,
    voxel_size=[0.1, 0.1, 0.1],
):
    """
    Convert a point cloud to a voxel grid representation.

    Parameters
    ----------
    points : torch.Tensor, shape (N, 3)
        The input point cloud, where each row is a 3D point.
    voxel_size : list or torch.Tensor, shape (3,), optional
        The size of each voxel in the x, y, and z directions. Default is [0.1, 0.1, 0.1].

    Returns
    -------
    locations : torch.Tensor, shape (Gx, Gy, Gz, 3)
        The coordinates of each voxel center in the grid.
    grid : torch.Tensor, shape (Gx, Gy, Gz)
        A binary voxel grid where 1 indicates the presence of at least one point.

    Notes
    -----
    - The voxel grid is constructed by discretizing the space between the minimum and maximum
      coordinates of the point cloud.
    - Only voxels containing at least one point are marked as 1.
    - The output grid is of type float32 and resides on the same device as the input points.
    """
    voxel_size = torch.as_tensor(voxel_size, device=points.device)

    min_coords = points.min(dim=0).values
    max_coords = points.max(dim=0).values
    grid_size = ((max_coords - min_coords) / voxel_size).ceil().int()
    points = points - min_coords

    x = torch.linspace(min_coords[0], max_coords[0], grid_size[0], device=points.device)
    y = torch.linspace(min_coords[1], max_coords[1], grid_size[1], device=points.device)
    z = torch.linspace(min_coords[2], max_coords[2], grid_size[2], device=points.device)
    X, Y, Z = torch.meshgrid(x, y, z, indexing="ij")
    locations = torch.stack([X, Y, Z], dim=-1)

    voxel_indices = (points / voxel_size).floor().int()
    mask = (voxel_indices >= 0).all(dim=1) & (voxel_indices < grid_size).all(dim=1)
    voxel_indices = voxel_indices[mask]
    grid = torch.zeros(grid_size.tolist(), dtype=torch.float32, device=points.device)
    grid[voxel_indices[:, 0], voxel_indices[:, 1], voxel_indices[:, 2]] = 1.0

    return locations, grid
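A stripped-down version of the binning logic above (illustrative names; boundary points are clamped into the last cell here, whereas the library drops out-of-range indices via its mask):

```python
import torch

def voxelize(points, voxel_size):
    # Shift points to the origin, bin them by voxel size, mark occupied cells.
    voxel_size = torch.as_tensor(voxel_size)
    mins = points.min(dim=0).values
    maxs = points.max(dim=0).values
    grid_size = ((maxs - mins) / voxel_size).ceil().int()
    indices = ((points - mins) / voxel_size).floor().int()
    indices = torch.minimum(indices, grid_size - 1)  # keep boundary points inside
    grid = torch.zeros(*grid_size.tolist())
    grid[indices[:, 0], indices[:, 1], indices[:, 2]] = 1.0
    return grid

points = torch.tensor([[0.0, 0.0, 0.0], [0.95, 0.95, 0.95]])
grid = voxelize(points, [0.5, 0.5, 0.5])  # 2 x 2 x 2 grid, two corners occupied
```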

quantize(image_field, bits=8, limits=[0.0, 1.0])

Quantize an image field to a specified number of bits.

This function maps the input image field from its original range to a quantized representation with the specified number of bits.

Parameters:

  • image_field (tensor) –
          Input image field in an arbitrary range.
    
  • bits
          Number of bits for quantization (1-8).
    
  • limits
          The minimum and maximum of the image_field variable.
    

Returns:

  • quantized_field ( tensor ) –

    Quantized image field.

Source code in odak/learn/tools/matrix.py
def quantize(image_field, bits=8, limits=[0.0, 1.0]):
    """
    Quantize an image field to a specified number of bits.

    This function maps the input image field from its original range to a quantized 
    representation with the specified number of bits.

    Parameters
    ----------
    image_field : torch.tensor
                  Input image field in an arbitrary range.
    bits        : int
                  Number of bits for quantization (1-8).
    limits      : list
                  The minimum and maximum of the image_field variable.

    Returns
    ----------
    quantized_field   : torch.tensor
                        Quantized image field.
    """
    normalized_field = (image_field - limits[0]) / (limits[1] - limits[0])
    divider = 2**bits
    quantized_field = normalized_field * divider
    quantized_field = quantized_field.int()
    return quantized_field
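Note that scaling the normalized field straight by `2**bits` can yield the out-of-range value `2**bits` itself for inputs at the upper limit. A common variant, sketched below (not the library's exact behaviour), clamps to the highest representable level:

```python
import torch

def quantize_sketch(image_field, bits=8, limits=(0.0, 1.0)):
    # Normalize to [0, 1], scale to the number of levels, clamp into range.
    normalized = (image_field - limits[0]) / (limits[1] - limits[0])
    levels = 2 ** bits
    return (normalized * levels).int().clamp(0, levels - 1)

field = torch.tensor([0.0, 0.5, 1.0])
quantized = quantize_sketch(field)  # tensor([0, 128, 255])
```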

quaternion_to_rotation_matrix(quaternions)

Convert rotations given as unit quaternions to rotation matrices.

Parameters:

  • quaternions (Tensor) –
          Quaternions with real part first, shape ``(*, 4)``
          in ``(w, x, y, z)`` convention.
    

Returns:

  • rotation_matrices ( Tensor ) –

    Rotation matrices, shape (*, 3, 3).

Source code in odak/learn/tools/transformation.py
def quaternion_to_rotation_matrix(quaternions):
    """
    Convert rotations given as unit quaternions to rotation matrices.

    Parameters
    ----------
    quaternions : torch.Tensor
                  Quaternions with real part first, shape ``(*, 4)``
                  in ``(w, x, y, z)`` convention.

    Returns
    -------
    rotation_matrices : torch.Tensor
                        Rotation matrices, shape ``(*, 3, 3)``.
    """
    quaternions = F.normalize(quaternions, dim=-1)
    w, x, y, z = quaternions.unbind(-1)

    two_s = 2.0 / (quaternions * quaternions).sum(-1)

    rotation_matrices = torch.stack(
        [
            1 - two_s * (y * y + z * z),
            two_s * (x * y - w * z),
            two_s * (x * z + w * y),
            two_s * (x * y + w * z),
            1 - two_s * (x * x + z * z),
            two_s * (y * z - w * x),
            two_s * (x * z - w * y),
            two_s * (y * z + w * x),
            1 - two_s * (x * x + y * y),
        ],
        dim=-1,
    )

    return rotation_matrices.reshape(quaternions.shape[:-1] + (3, 3))
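Two quick checks of the conversion: the identity quaternion ``(1, 0, 0, 0)`` must map to the identity matrix, and a 180-degree rotation about z, ``(0, 0, 0, 1)``, to ``diag(-1, -1, 1)``. The helper below mirrors the formula above using only torch:

```python
import torch
import torch.nn.functional as F

def quat_to_rotmat(q):
    # q in (w, x, y, z) convention; normalize to a unit quaternion first.
    q = F.normalize(q, dim=-1)
    w, x, y, z = q.unbind(-1)
    s = 2.0 / (q * q).sum(-1)  # equals 2 for unit quaternions
    rows = torch.stack([
        1 - s * (y * y + z * z), s * (x * y - w * z),     s * (x * z + w * y),
        s * (x * y + w * z),     1 - s * (x * x + z * z), s * (y * z - w * x),
        s * (x * z - w * y),     s * (y * z + w * x),     1 - s * (x * x + y * y),
    ], dim=-1)
    return rows.reshape(q.shape[:-1] + (3, 3))

identity = quat_to_rotmat(torch.tensor([1.0, 0.0, 0.0, 0.0]))
half_turn_z = quat_to_rotmat(torch.tensor([0.0, 0.0, 0.0, 1.0]))
```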

radial_basis_function(value, epsilon=0.5)

Applies radial basis function with Gaussian description to input values.

This function applies the Gaussian radial basis function: y = e^(-ε² * x²)

Parameters:

  • value (Tensor) –

    Value(s) to pass to the radial basis function.

  • epsilon (float, default: 0.5 ) –

    Epsilon parameter used in the Gaussian radial basis function (default: 0.5).

Returns:

  • Tensor

    Output values after applying the radial basis function.

Source code in odak/learn/tools/loss.py
def radial_basis_function(value, epsilon=0.5):
    """
    Applies radial basis function with Gaussian description to input values.

    This function applies the Gaussian radial basis function: y = e^(-ε² * x²)

    Parameters
    ----------
    value : torch.Tensor
        Value(s) to pass to the radial basis function.
    epsilon : float, optional
        Epsilon parameter used in the Gaussian radial basis function (default: 0.5).

    Returns
    -------
    torch.Tensor
        Output values after applying the radial basis function.
    """
    output = torch.exp((-((epsilon * value) ** 2)))
    return output
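Two reference points for the Gaussian RBF: it peaks at 1 when the input is 0 and decays to e^(-1) ≈ 0.368 when ``epsilon * value = 1``. A one-line sketch:

```python
import torch

def gaussian_rbf(value, epsilon=0.5):
    # y = exp(-(epsilon * x)^2): 1 at x = 0, decaying with |x|.
    return torch.exp(-((epsilon * value) ** 2))

y = gaussian_rbf(torch.tensor([0.0, 2.0]))  # [1.0, exp(-1)]
```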

read_PLY(fn, offset=[0, 0, 0], angles=[0.0, 0.0, 0.0], mode='XYZ')

Definition to read a PLY file and extract its meshes. Note that rotation is always with respect to 0,0,0.

Parameters:

  • fn
           Filename of a PLY file.
    
  • offset
           Offset in X,Y,Z.
    
  • angles
           Rotation angles in degrees.
    
  • mode
           Rotation mode determines ordering of the rotations at each axis. There are XYZ,YXZ,ZXY and ZYX modes.
    

Returns:

  • triangles ( ndarray ) –

    Triangles from a given PLY file. Note that the triangles coming out of this function aren't always structured in the right order and have a size of (MxN)x3. You can use numpy's reshape to restructure them to mxnx3 if you know what you are doing.

Raises:

  • ValueError : If path validation fails or extension is not allowed.
  • TypeError : If fn is not a string.
Source code in odak/tools/asset.py
def read_PLY(fn, offset=[0, 0, 0], angles=[0.0, 0.0, 0.0], mode="XYZ"):
    """
    Definition to read a PLY file and extract its meshes. Note that rotation is always with respect to 0,0,0.

    Parameters
    ----------
    fn           : string
                   Filename of a PLY file.
    offset       : ndarray
                   Offset in X,Y,Z.
    angles       : list
                   Rotation angles in degrees.
    mode         : str
                   Rotation mode determines ordering of the rotations at each axis. There are XYZ,YXZ,ZXY and ZYX modes.

    Returns
    ----------
    triangles    : ndarray
                   Triangles from a given PLY file. Note that the triangles coming out of this function aren't always structured in the right order and have a size of (MxN)x3. You can use numpy's reshape to restructure them to mxnx3 if you know what you are doing.

    Raises
    ------
    ValueError   : If path validation fails or extension is not allowed.
    TypeError    : If fn is not a string.
    """
    if np.__name__ != "numpy":
        import numpy as np_ply
    else:
        np_ply = np
    safe_path = validate_path(fn, allowed_extensions=[".ply"])
    with open(safe_path, "rb") as f:
        plydata = PlyData.read(f)
    triangle_ids = np_ply.vstack(plydata["face"].data["vertex_indices"])
    triangles = []
    for vertex_ids in triangle_ids:
        triangle = [
            rotate_point(
                plydata["vertex"][int(vertex_ids[0])].tolist(),
                angles=angles,
                offset=offset,
            )[0],
            rotate_point(
                plydata["vertex"][int(vertex_ids[1])].tolist(),
                angles=angles,
                offset=offset,
            )[0],
            rotate_point(
                plydata["vertex"][int(vertex_ids[2])].tolist(),
                angles=angles,
                offset=offset,
            )[0],
        ]
        triangle = np_ply.asarray(triangle)
        triangles.append(triangle)
    triangles = np_ply.array(triangles)
    triangles = np.asarray(triangles, dtype=np.float32)
    return triangles

resize(image, multiplier=0.5, mode='nearest')

Definition to resize an image.

Parameters:

  • image
          Image with MxNx3 resolution.
    
  • multiplier
          Multiplier used in resizing operation (e.g., 0.5 is half size in one axis).
    
  • mode
          Mode to be used in scaling, nearest, bilinear, etc.
    

Returns:

  • new_image ( tensor ) –

    Resized image.

Source code in odak/learn/tools/file.py
def resize(image, multiplier=0.5, mode="nearest"):
    """
    Definition to resize an image.

    Parameters
    ----------
    image       : torch.tensor
                  Image with MxNx3 resolution.
    multiplier  : float
                  Multiplier used in resizing operation (e.g., 0.5 is half size in one axis).
    mode        : str
                  Mode to be used in scaling, nearest, bilinear, etc.

    Returns
    -------
    new_image   : torch.tensor
                  Resized image.
    """
    # Handle the case where image needs to be in the right format for torch.nn.Upsample
    if len(image.shape) == 3:
        # Add batch dimension: (H, W, C) -> (1, H, W, C)
        image = image.unsqueeze(0)
    elif len(image.shape) == 4:
        # Image is already in batch format
        pass
    else:
        raise ValueError("Image must have 3 or 4 dimensions")

    # Use torch.nn.functional.interpolate for resizing
    if mode not in ["nearest", "bilinear", "bicubic", "area"]:
        raise ValueError("Mode must be one of: nearest, bilinear, bicubic, area")

    # Resize the image
    new_image = torch.nn.functional.interpolate(
        image,
        scale_factor=multiplier,
        mode=mode,
        align_corners=None if mode in ["nearest", "area"] else False,
    )

    # Remove batch dimension if it was added
    if new_image.shape[0] == 1:
        new_image = new_image.squeeze(0)

    return new_image
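The batching in the body above matters because `torch.nn.functional.interpolate` expects an `(N, C, H, W)` input; a single `(C, H, W)` image needs a temporary batch dimension:

```python
import torch

image = torch.rand(3, 8, 8)  # (C, H, W) image without a batch dimension
resized = torch.nn.functional.interpolate(
    image.unsqueeze(0),      # (1, C, H, W)
    scale_factor=0.5,
    mode="nearest",
).squeeze(0)                 # back to (C, H, W), now 3 x 4 x 4
```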

rotate_points(point, angles=torch.zeros(1, 3), mode='XYZ', origin=torch.zeros(1, 3), offset=torch.zeros(1, 3))

Rotate a given point and return the result along with rotation matrices.

Note that rotation is always with respect to 0,0,0.

Parameters:

  • point (Tensor) –

    A point with size of [3] or [1, 3] or [m, 3].

  • angles (Tensor, default: zeros(1, 3) ) –

    Rotation angles in degrees.

  • mode (str, default: 'XYZ' ) –

    Rotation mode determines ordering of the rotations at each axis. There are XYZ,YXZ,ZXY and ZYX modes.

  • origin (Tensor, default: zeros(1, 3) ) –

    Reference point for a rotation. Expected size is [3] or [1, 3].

  • offset (Tensor, default: zeros(1, 3) ) –

    Shift with the given offset. Expected size is [3] or [1, 3] or [m, 3].

Returns:

  • tuple

    Result of the rotation [1 x 3] or [m x 3], and rotation matrices along each axis.

Source code in odak/learn/tools/transformation.py
def rotate_points(
    point,
    angles=torch.zeros(1, 3),
    mode="XYZ",
    origin=torch.zeros(1, 3),
    offset=torch.zeros(1, 3),
):
    """
    Rotate a given point and return the result along with rotation matrices.

    Note that rotation is always with respect to 0,0,0.

    Parameters
    ----------
    point : torch.Tensor
        A point with size of [3] or [1, 3] or [m, 3].
    angles : torch.Tensor
        Rotation angles in degrees.
    mode : str
        Rotation mode determines ordering of the rotations at each axis.
        There are XYZ,YXZ,ZXY and ZYX modes.
    origin : torch.Tensor
        Reference point for a rotation.
        Expected size is [3] or [1, 3].
    offset : torch.Tensor
        Shift with the given offset.
        Expected size is [3] or [1, 3] or [m, 3].

    Returns
    -------
    tuple
        Result of the rotation [1 x 3] or [m x 3], and rotation matrices along each axis.
    """
    origin = origin.to(point.device)
    offset = offset.to(point.device)
    angles = angles.to(point.device)

    if len(point.shape) == 1:
        point = point.unsqueeze(0)
    if len(angles.shape) == 1:
        angles = angles.unsqueeze(0)
    if len(origin.shape) == 1:
        origin = origin.unsqueeze(0)
    if len(offset.shape) == 1:
        offset = offset.unsqueeze(0)

    rotx = rotmatx(angles[:, 0]).unsqueeze(0)
    roty = rotmaty(angles[:, 1]).unsqueeze(0)
    rotz = rotmatz(angles[:, 2]).unsqueeze(0)

    new_points = (point.unsqueeze(1) - origin.unsqueeze(0)).unsqueeze(-1)

    if mode == "XYZ":
        result = rotz @ (roty @ (rotx @ new_points))
    elif mode == "XZY":
        result = roty @ (rotz @ (rotx @ new_points))
    elif mode == "YXZ":
        result = rotz @ (rotx @ (roty @ new_points))
    elif mode == "ZXY":
        result = roty @ (rotx @ (rotz @ new_points))
    elif mode == "ZYX":
        result = rotx @ (roty @ (rotz @ new_points))

    result = result.squeeze(-1)
    result = result + origin.unsqueeze(0)
    result = result + offset.unsqueeze(0)
    if result.shape[1] == 1:
        result = result.squeeze(1)
    return result, rotx, roty, rotz

rotmatx(angle)

Generate a rotation matrix along the X axis.

Parameters:

  • angle (Tensor) –

    Rotation angles in degrees.

Returns:

  • Tensor

    Rotation matrix along the X axis.

Source code in odak/learn/tools/transformation.py
def rotmatx(angle):
    """
    Generate a rotation matrix along the X axis.

    Parameters
    ----------
    angle : torch.Tensor
        Rotation angles in degrees.

    Returns
    -------
    torch.Tensor
        Rotation matrix along the X axis.
    """
    angle = torch.deg2rad(angle)
    rotx = torch.zeros(angle.shape[0], 3, 3, device=angle.device)
    rotx[:, 0, 0] = 1.0
    rotx[:, 1, 1] = torch.cos(angle)
    rotx[:, 1, 2] = -torch.sin(angle)
    rotx[:, 2, 1] = torch.sin(angle)
    rotx[:, 2, 2] = torch.cos(angle)
    if rotx.shape[0] == 1:
        rotx = rotx.squeeze(0)
    return rotx

rotmaty(angle)

Generate a rotation matrix along the Y axis.

Parameters:

  • angle (Tensor) –

    Rotation angles in degrees.

Returns:

  • Tensor

    Rotation matrix along the Y axis.

Source code in odak/learn/tools/transformation.py
def rotmaty(angle):
    """
    Generate a rotation matrix along the Y axis.

    Parameters
    ----------
    angle : torch.Tensor
        Rotation angles in degrees.

    Returns
    -------
    torch.Tensor
        Rotation matrix along the Y axis.
    """
    angle = torch.deg2rad(angle)
    roty = torch.zeros(angle.shape[0], 3, 3, device=angle.device)
    roty[:, 0, 0] = torch.cos(angle)
    roty[:, 0, 2] = torch.sin(angle)
    roty[:, 1, 1] = 1.0
    roty[:, 2, 0] = -torch.sin(angle)
    roty[:, 2, 2] = torch.cos(angle)
    if roty.shape[0] == 1:
        roty = roty.squeeze(0)
    return roty

rotmatz(angle)

Generate a rotation matrix along the Z axis.

Parameters:

  • angle (Tensor) –

    Rotation angles in degrees.

Returns:

  • Tensor

    Rotation matrix along the Z axis.

Source code in odak/learn/tools/transformation.py
def rotmatz(angle):
    """
    Generate a rotation matrix along the Z axis.

    Parameters
    ----------
    angle : torch.Tensor
        Rotation angles in degrees.

    Returns
    -------
    torch.Tensor
        Rotation matrix along the Z axis.
    """
    angle = torch.deg2rad(angle)
    rotz = torch.zeros(angle.shape[0], 3, 3, device=angle.device)
    rotz[:, 0, 0] = torch.cos(angle)
    rotz[:, 0, 1] = -torch.sin(angle)
    rotz[:, 1, 0] = torch.sin(angle)
    rotz[:, 1, 1] = torch.cos(angle)
    rotz[:, 2, 2] = 1.0
    if rotz.shape[0] == 1:
        rotz = rotz.squeeze(0)
    return rotz
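A sanity check on the matrices above: rotating the unit x-axis by 90 degrees about z should produce the unit y-axis. The matrix below is built directly from the same trigonometric entries:

```python
import math
import torch

theta = math.radians(90.0)
rotz = torch.tensor([
    [math.cos(theta), -math.sin(theta), 0.0],
    [math.sin(theta),  math.cos(theta), 0.0],
    [0.0,              0.0,             1.0],
])
rotated = rotz @ torch.tensor([1.0, 0.0, 0.0])  # approximately [0, 1, 0]
```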

same_side(p1, p2, a, b)

Definition to figure out which side a point is on with respect to a line and a point. See http://www.blackpawn.com/texts/pointinpoly/ for more. If p1 and p2 are on the same side, this definition returns True.

Parameters:

  • p1
          Point(s) to check.
    
  • p2
           The point to check against.
    
  • a
          First point that forms the line.
    
  • b
          Second point that forms the line.
    
Source code in odak/learn/tools/vector.py
def same_side(p1, p2, a, b):
    """
    Definition to figure out which side a point is on with respect to a line and a point. See http://www.blackpawn.com/texts/pointinpoly/ for more. If p1 and p2 are on the same side, this definition returns True.

    Parameters
    ----------
    p1          : list
                  Point(s) to check.
    p2          : list
                  The point to check against.
    a           : list
                  First point that forms the line.
    b           : list
                  Second point that forms the line.
    """
    ba = torch.subtract(b, a)
    p1a = torch.subtract(p1, a)
    p2a = torch.subtract(p2, a)
    cp1 = torch.cross(ba, p1a)
    cp2 = torch.cross(ba, p2a)
    test = torch.dot(cp1, cp2)
    if len(p1.shape) > 1:
        return test >= 0
    if test >= 0:
        return True
    return False

save_image(fn, img, cmin=0, cmax=255, color_depth=8)

Definition to save a torch tensor as an image.

Parameters:

  • fn
           Filename.
    
  • img
           A torch tensor with NxMx3 or NxMx1 shapes.
    
  • cmin
           Minimum value that will be interpreted as 0 level in the final image.
    
  • cmax
           Maximum value that will be interpreted as 255 level in the final image.
    
  • color_depth
           Color depth of an image. Default is eight.
    

Returns:

  • bool ( bool ) –

    True if successful.

Source code in odak/learn/tools/file.py
def save_image(fn, img, cmin=0, cmax=255, color_depth=8):
    """
    Definition to save a torch tensor as an image.

    Parameters
    ----------
    fn           : str
                   Filename.
    img          : torch.tensor
                   A torch tensor with NxMx3 or NxMx1 shapes.
    cmin         : int
                   Minimum value that will be interpreted as 0 level in the final image.
    cmax         : int
                   Maximum value that will be interpreted as 255 level in the final image.
    color_depth  : int
                   Color depth of an image. Default is eight.

    Returns
    -------
    bool         : bool
                   True if successful.
    """
    if len(img.shape) == 4:
        img = img.squeeze(0)
    if len(img.shape) > 2 and torch.argmin(torch.tensor(img.shape)) == 0:
        # Transpose from (C, H, W) to (H, W, C)
        new_img = torch.zeros(img.shape[1], img.shape[2], img.shape[0]).to(img.device)
        for i in range(img.shape[0]):
            new_img[:, :, i] = img[i].detach().clone()
        img = new_img.detach().clone()
    img = img.cpu().detach().numpy()
    return tools_save_image(fn, img, cmin=cmin, cmax=cmax, color_depth=color_depth)

save_torch_tensor(fn, tensor)

Definition to save a torch tensor or dictionary.

Parameters:

  • fn
        Filename.
    
  • tensor
        Torch tensor or dictionary to be saved.
    

Raises:

  • ValueError : If path validation fails or extension is not allowed.
Source code in odak/learn/tools/file.py
def save_torch_tensor(fn, tensor):
    """
    Definition to save a torch tensor or dictionary.

    Parameters
    ----------
    fn           : str
                   Filename.
    tensor       : torch.tensor or dict
                   Torch tensor or dictionary to be saved.

    Raises
    ------
    ValueError : If path validation fails or extension is not allowed.
    """
    safe_path = validate_path(fn, allowed_extensions=[".pt", ".pth", ".pkl"])
    torch.save(tensor, safe_path)
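A round trip through this helper reduces to `torch.save` / `torch.load`; the sketch below uses a temporary directory instead of a fixed path:

```python
import os
import tempfile

import torch

tensor = torch.arange(6.0).reshape(2, 3)
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "state.pt")  # .pt is among the allowed extensions
    torch.save(tensor, path)
    restored = torch.load(path)
```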

spatial_gradient(frame)

Calculates the spatial gradient of a given frame.

This function computes the gradient of the input frame in both x and y directions by differencing adjacent pixels.

Parameters:

  • frame (Tensor) –

    Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].

Returns:

  • tuple

    Tuple of (diff_x, diff_y) representing spatial gradients along x and y axes.

Source code in odak/learn/tools/loss.py
def spatial_gradient(frame):
    """
    Calculates the spatial gradient of a given frame.

    This function computes the gradient of the input frame in both x and y directions
    by differencing adjacent pixels.

    Parameters
    ----------
    frame : torch.Tensor
        Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].

    Returns
    -------
    tuple
        Tuple of (diff_x, diff_y) representing spatial gradients along x and y axes.
    """
    if len(frame.shape) == 2:
        frame = frame.unsqueeze(0)
    if len(frame.shape) == 3:
        frame = frame.unsqueeze(0)
    diff_x = frame[:, :, :, 1:] - frame[:, :, :, :-1]
    diff_y = frame[:, :, 1:, :] - frame[:, :, :-1, :]
    return diff_x, diff_y
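On a horizontal ramp each pixel exceeds its left neighbour by one, so the x-gradient is constantly 1 and the y-gradient vanishes. Differencing adjacent pixels directly:

```python
import torch

frame = torch.arange(4.0).repeat(4, 1)   # 4 x 4 ramp increasing along x
diff_x = frame[:, 1:] - frame[:, :-1]    # gradient along x: all ones
diff_y = frame[1:, :] - frame[:-1, :]    # gradient along y: all zeros
```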

tilt_towards(location, lookat)

Tilt surface normal of a plane towards a point.

Parameters:

  • location (list) –

    Center of the plane to be tilted.

  • lookat (list) –

    Tilt towards this point.

Returns:

  • list

    Rotation angles in degrees.

Source code in odak/learn/tools/transformation.py
def tilt_towards(location, lookat):
    """
    Tilt surface normal of a plane towards a point.

    Parameters
    ----------
    location : list
        Center of the plane to be tilted.
    lookat : list
        Tilt towards this point.

    Returns
    -------
    list
        Rotation angles in degrees.
    """
    dx = location[0] - lookat[0]
    dy = location[1] - lookat[1]
    dz = location[2] - lookat[2]
    dist = torch.sqrt(torch.tensor(dx**2 + dy**2 + dz**2))
    phi = torch.atan2(torch.tensor(dy), torch.tensor(dx))
    theta = torch.arccos(dz / dist)
    angles = [0, float(torch.rad2deg(theta)), float(torch.rad2deg(phi))]
    return angles
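The angles come from the spherical coordinates of the direction vector: theta from the arccosine of the z-component over the distance, phi from the arctangent of y over x. A plain-Python check (a plane directly above its target tilts by zero):

```python
import math

location = [0.0, 0.0, 1.0]
lookat = [0.0, 0.0, 0.0]
dx, dy, dz = (location[i] - lookat[i] for i in range(3))
distance = math.sqrt(dx * dx + dy * dy + dz * dz)
theta = math.degrees(math.acos(dz / distance))  # polar angle: 0 degrees here
phi = math.degrees(math.atan2(dy, dx))          # azimuth: 0 degrees here
angles = [0.0, theta, phi]
```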

tools_load_image(fn, normalizeby=0.0, torch_style=False)

Definition to load an image from a given location as a Numpy array.

Parameters:

  • fn
            Filename.
    
  • normalizeby
            Value to normalize images with. Default value of zero will lead to no normalization.
    
  • torch_style
            If set to True, it will load an mxnx3 image as 3xmxn.
    

Returns:

  • image ( ndarray ) –

    Image loaded as a Numpy array.

Source code in odak/tools/file.py
def load_image(fn, normalizeby=0.0, torch_style=False):
    """
    Definition to load an image from a given location as a Numpy array.


    Parameters
    ----------
    fn           : str
                    Filename.
    normalizeby  : float
                    Value to normalize images with. Default value of zero will lead to no normalization.
    torch_style  : bool
                    If set to True, it will load an mxnx3 image as 3xmxn.


    Returns
    ----------
    image        :  ndarray
                    Image loaded as a Numpy array.

    """
    logger.info("Loading image: {}".format(fn))
    safe_path = validate_path(
        fn,
        allowed_extensions=[
            ".png",
            ".jpg",
            ".jpeg",
            ".bmp",
            ".tiff",
            ".tif",
            ".gif",
            ".webp",
            ".pbm",
            ".pgm",
            ".ppm",
            ".sr",
            ".ras",
        ],
    )
    image = cv2.imread(safe_path, cv2.IMREAD_UNCHANGED)
    if isinstance(image, type(None)):
        raise ValueError(
            f"Failed to load image from '{safe_path}'. "
            f"Check file format, permissions, and that the file exists."
        )
    if len(image.shape) > 2:
        new_image = np.copy(image)
        new_image[:, :, 0] = image[:, :, 2]
        new_image[:, :, 2] = image[:, :, 0]
        image = new_image
    if normalizeby != 0.0:
        image = image * 1.0 / normalizeby
    if torch_style and len(image.shape) > 2:
        image = np.moveaxis(image, -1, 0)
    logger.info("Loaded image: {}".format(safe_path))
    return image.astype(float)
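For illustration, the channel handling above (the BGR-to-RGB swap, plus the optional channels-first move when `torch_style` is set) can be sketched in NumPy. This is an illustrative equivalent for three-channel images, not part of odak:

```python
import numpy as np

def to_rgb_channel_first(image, torch_style=False):
    # Mirror load_image's channel handling: reverse BGR -> RGB,
    # then optionally move channels first (m x n x 3 -> 3 x m x n).
    if image.ndim > 2:
        image = image[..., ::-1]  # for 3 channels, same as swapping 0 and 2
    if torch_style and image.ndim > 2:
        image = np.moveaxis(image, -1, 0)
    return image.astype(float)

bgr = np.zeros((4, 5, 3))
bgr[..., 0] = 1.0  # blue channel in OpenCV's BGR layout
rgb = to_rgb_channel_first(bgr, torch_style=True)
```

After the swap, the blue data ends up in the last (R, G, B) channel, and the array is channel-first as PyTorch expects.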

tools_save_image(fn, img, cmin=0, cmax=255, color_depth=8)

Definition to save a Numpy array as an image.

Parameters:

  • fn
            Filename.
    
  • img
            A numpy array with NxMx3 or NxMx1 shapes.
    
  • cmin
            Minimum value that will be interpreted as 0 level in the final image.
    
  • cmax
            Maximum value that will be interpreted as 255 level in the final image.
    
  • color_depth
            Pixel color depth in bits, default is eight bits.
    

Returns:

  • bool ( bool ) –

    True if successful.

Source code in odak/tools/file.py
def save_image(fn, img, cmin=0, cmax=255, color_depth=8):
    """
    Definition to save a Numpy array as an image.


    Parameters
    ----------
    fn           : str
                    Filename.
    img          : ndarray
                    A numpy array with NxMx3 or NxMx1 shapes.
    cmin         : int
                    Minimum value that will be interpreted as 0 level in the final image.
    cmax         : int
                    Maximum value that will be interpreted as 255 level in the final image.
    color_depth  : int
                    Pixel color depth in bits, default is eight bits.


    Returns
    ----------
    bool         :  bool
                    True if successful.

    """
    logger.info("Saving image: {}".format(fn))
    input_img = np.copy(img).astype(np.float32)
    cmin = float(cmin)
    cmax = float(cmax)
    input_img[input_img < cmin] = cmin
    input_img[input_img > cmax] = cmax
    input_img /= cmax
    input_img = input_img * 1.0 * (2**color_depth - 1)
    if color_depth == 8:
        input_img = input_img.astype(np.uint8)
    elif color_depth == 16:
        input_img = input_img.astype(np.uint16)
    if len(input_img.shape) > 2:
        if input_img.shape[2] > 1:
            cache_img = np.copy(input_img)
            cache_img[:, :, 0] = input_img[:, :, 2]
            cache_img[:, :, 2] = input_img[:, :, 0]
            input_img = cache_img
    safe_path = validate_path(
        fn, allowed_extensions=[".png", ".jpg", ".jpeg", ".bmp", ".tiff", ".tif"]
    )
    cv2.imwrite(safe_path, input_img)
    logger.info("Saved image: {}".format(safe_path))
    return True
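The intensity mapping in `save_image` (clip to `[cmin, cmax]`, normalize by `cmax`, scale to the color depth) can be sketched in NumPy; `quantize` is an illustrative helper, not part of odak:

```python
import numpy as np

def quantize(img, cmin=0, cmax=255, color_depth=8):
    # Mirror save_image's intensity mapping: clip to [cmin, cmax],
    # normalize by cmax, then scale to the full color depth.
    img = np.clip(np.asarray(img, dtype=np.float32), float(cmin), float(cmax))
    img = img / float(cmax) * (2 ** color_depth - 1)
    return img.astype(np.uint8 if color_depth == 8 else np.uint16)

levels = quantize(np.array([-10.0, 0.0, 255.0, 300.0]))
```

Out-of-range values saturate at the extremes rather than wrapping around.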

torch_load(fn, weights_only=True, map_location=None)

Definition to load a torch file (*.pt).

Parameters:

  • fn
           Filename.
    
  • weights_only (bool, default: True ) –
           See torch.load() for details.
    
  • map_location (str, default: None ) –
           The device location to place data (e.g., `cuda`, `cpu`, etc.).
           The default is None.
    

Returns:

  • data ( any ) –

    See torch.load() for more.

Raises:

  • ValueError : If path validation fails or unsafe characters detected.
Source code in odak/learn/tools/file.py
def torch_load(fn, weights_only=True, map_location=None):
    """
    Definition to load a torch file (*.pt).

    Parameters
    ----------
    fn           : str
                   Filename.
    weights_only : bool
                   See torch.load() for details.
    map_location : str
                   The device location to place data (e.g., `cuda`, `cpu`, etc.).
                   The default is None.

    Returns
    -------
    data         : any
                   See torch.load() for more.

    Raises
    ------
    ValueError   : If path validation fails or unsafe characters detected.
    """
    safe_path = validate_path(fn, allowed_extensions=[".pt", ".pth", ".pkl"])
    data = torch.load(
        safe_path,
        weights_only=weights_only,
        map_location=map_location,
    )
    return data

total_variation_loss(frame)

Calculates total variation loss for an input frame.

This function computes the total variation loss by calculating spatial gradients in both x and y directions and averaging their squared values.

Parameters:

  • frame (Tensor) –

    Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].

Returns:

  • Tensor

    Total variation loss value.

Source code in odak/learn/tools/loss.py
def total_variation_loss(frame):
    """
    Calculates total variation loss for an input frame.

    This function computes the total variation loss by calculating spatial gradients
    in both x and y directions and averaging their squared values.

    Parameters
    ----------
    frame : torch.Tensor
        Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].

    Returns
    -------
    torch.Tensor
        Total variation loss value.
    """
    if len(frame.shape) == 2:
        frame = frame.unsqueeze(0)
    if len(frame.shape) == 3:
        frame = frame.unsqueeze(0)
    diff_x, diff_y = spatial_gradient(frame)
    pixel_count = frame.shape[0] * frame.shape[1] * frame.shape[2] * frame.shape[3]
    loss = ((diff_x**2).sum() + (diff_y**2).sum()) / pixel_count
    return loss
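The loss above can be sketched in NumPy for a single 2D frame; `tv_loss` is an illustrative stand-in for the torch implementation:

```python
import numpy as np

def tv_loss(frame):
    # NumPy sketch of total_variation_loss for a 2D frame: squared
    # horizontal and vertical differences, normalized by pixel count.
    diff_x = frame[:, 1:] - frame[:, :-1]
    diff_y = frame[1:, :] - frame[:-1, :]
    return ((diff_x ** 2).sum() + (diff_y ** 2).sum()) / frame.size

flat = tv_loss(np.ones((8, 8)))                  # constant image -> zero loss
ramp = tv_loss(np.tile(np.arange(4.0), (4, 1)))  # horizontal ramp
```

A constant image has no spatial gradients, so its loss is exactly zero.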

unfreeze(model)

A utility function to unfreeze the parameters of a provided model.

This function sets requires_grad to True for all parameters in the model, effectively allowing them to be updated during training.

Parameters:

  • model (Module) –

    Model whose parameters are to be unfrozen. This should be a PyTorch model instance.

Returns:

  • None

    The function modifies the model in-place.

Source code in odak/learn/tools/models.py
def unfreeze(model):
    """
    A utility function to unfreeze the parameters of a provided model.

    This function sets `requires_grad` to `True` for all parameters in the model,
    effectively allowing them to be updated during training.

    Parameters
    ----------
    model : torch.nn.Module
        Model whose parameters are to be unfrozen. This should be a PyTorch model instance.

    Returns
    -------
    None
        The function modifies the model in-place.
    """
    for parameter in model.parameters():
        parameter.requires_grad = True

validate_path(path, allowed_extensions=None)

Validates a file path for security safety.

Parameters:

  • path
              Path to validate.
    
  • allowed_extensions (list, default: None ) –
                  List of allowed extensions (e.g., ['.png', '.jpg']).
                  If None, all extensions are allowed.
    

Returns:

  • safe_path ( str ) –

    The validated and secured path (with tilde expanded).

Raises:

  • ValueError : If path traversal attempt detected or extension not allowed.
  • TypeError : If path is not a string.
Source code in odak/tools/file.py
def validate_path(path, allowed_extensions=None):
    """
    Validates a file path for security safety.

    Parameters
    ----------
    path            : str
                      Path to validate.
    allowed_extensions : list, optional
                          List of allowed extensions (e.g., ['.png', '.jpg']).
                          If None, all extensions are allowed.

    Returns
    -------
    safe_path       : str
                      The validated and secured path (with tilde expanded).

    Raises
    ------
    ValueError      : If path traversal attempt detected or extension not allowed.
    TypeError       : If path is not a string.
    """
    if not isinstance(path, str):
        raise TypeError(f"Path must be a string, got {type(path).__name__}")

    # Check for null bytes before expanding user (Windows path injection)
    if "\x00" in path:
        raise ValueError("Null bytes not allowed in path")

    # Check for path traversal patterns BEFORE expanding
    if ".." in path.split(os.sep) or ".." in path.replace(os.sep, "/").split("/"):
        if re.search(r"(^|[/\\])\.\.([/\\]|$)", path):
            raise ValueError("Path traversal detected: '..' not allowed in path")

    # Check for URL protocols before expanding
    path_lower = path.lower()
    if re.search(r"https?://|ftp://", path_lower):
        raise ValueError("URL protocols not allowed in file paths")

    path = os.path.expanduser(path)
    resolved_path = os.path.abspath(path)

    # Check for UNC or device paths on Windows (e.g., \\server, \\?\, \\.\)
    if re.match(r"^\\\\", path) or path.startswith("//."):
        raise ValueError("UNC/device paths not allowed")

    if len(resolved_path) > 260:  # Windows MAX_PATH limit
        raise ValueError("Path exceeds maximum allowed length (260 characters)")

    if allowed_extensions is not None:
        _, file_ext = os.path.splitext(path)
        ext_lower = file_ext.lower()
        allowed_normalized = [
            (ext if ext.startswith(".") else f".{ext}").lower()
            for ext in allowed_extensions
        ]
        if ext_lower not in allowed_normalized:
            raise ValueError(
                f"File extension '{file_ext}' is not allowed. "
                f"Allowed: {allowed_extensions}"
            )

    logger.debug(f"Path validated: {path}")
    return resolved_path
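The `'..'` component check used above can be isolated as a standalone predicate; `has_traversal` is an illustrative name, not part of odak:

```python
import re

def has_traversal(path):
    # True when '..' appears as a whole path component, i.e. preceded by
    # the string start or a separator and followed by a separator or the end.
    return bool(re.search(r"(^|[/\\])\.\.([/\\]|$)", path))

unsafe = has_traversal("../etc/passwd")
safe = has_traversal("data/my..file.png")  # '..' inside a filename is fine
```

Note that the pattern only rejects `..` as a full component, so filenames that merely contain consecutive dots pass validation.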

weber_contrast(image, roi_high, roi_low)

Calculates Weber contrast ratio for given regions of an image.

This function computes the Weber contrast ratio for high and low intensity regions using the formula: (mean_high - mean_low) / mean_low.

Parameters:

  • image (Tensor) –

    Input image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].

  • roi_high (Tensor) –

    Corner locations of the high intensity region [m_start, m_end, n_start, n_end].

  • roi_low (Tensor) –

    Corner locations of the low intensity region [m_start, m_end, n_start, n_end].

Returns:

  • Tensor

    Weber contrast for the given regions. Shape is [1] or [3] depending on input.

Source code in odak/learn/tools/loss.py
def weber_contrast(image, roi_high, roi_low):
    """
    Calculates Weber contrast ratio for given regions of an image.

    This function computes the Weber contrast ratio for high and low intensity regions
    using the formula: (mean_high - mean_low) / mean_low.

    Parameters
    ----------
    image : torch.Tensor
        Input image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
    roi_high : torch.Tensor
        Corner locations of the high intensity region [m_start, m_end, n_start, n_end].
    roi_low : torch.Tensor
        Corner locations of the low intensity region [m_start, m_end, n_start, n_end].

    Returns
    -------
    torch.Tensor
        Weber contrast for the given regions. Shape is [1] or [3] depending on input.
    """
    if len(image.shape) == 2:
        image = image.unsqueeze(0)
    if len(image.shape) == 3:
        image = image.unsqueeze(0)
    region_low = image[:, :, roi_low[0] : roi_low[1], roi_low[2] : roi_low[3]]
    region_high = image[:, :, roi_high[0] : roi_high[1], roi_high[2] : roi_high[3]]
    high = torch.mean(region_high, dim=(2, 3))
    low = torch.mean(region_low, dim=(2, 3))
    result = (high - low) / low
    return result.squeeze(0)
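For a single-channel 2D image, the computation reduces to two region means; `weber` below is an illustrative NumPy sketch, not the odak implementation:

```python
import numpy as np

def weber(image, roi_high, roi_low):
    # NumPy sketch of weber_contrast for a 2D image:
    # (mean_high - mean_low) / mean_low over the two regions of interest.
    high = image[roi_high[0]:roi_high[1], roi_high[2]:roi_high[3]].mean()
    low = image[roi_low[0]:roi_low[1], roi_low[2]:roi_low[3]].mean()
    return (high - low) / low

img = np.full((10, 10), 0.2)
img[:5, :5] = 0.8  # bright patch in the top-left quadrant
contrast = weber(img, roi_high=[0, 5, 0, 5], roi_low=[5, 10, 5, 10])
```

Here the bright region is four times the mean of the dark one, giving a Weber contrast of 3.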

wrapped_mean_squared_error(image, ground_truth, reduction='mean')

Calculates wrapped mean squared error between predicted and target angles.

This function computes the mean squared error for angular data, accounting for the wrap-around property of angles (e.g., 359° and 1° are close).

Parameters:

  • image (Tensor) –

    Predicted image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].

  • ground_truth (Tensor) –

    Ground truth image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].

  • reduction (str, default: 'mean' ) –

    Specifies the reduction to apply to the output: 'mean' (default) or 'sum'.

Returns:

  • Tensor

    The calculated wrapped mean squared error.

Raises:

  • ValueError

    If an invalid reduction type is specified.

Source code in odak/learn/tools/loss.py
def wrapped_mean_squared_error(image, ground_truth, reduction="mean"):
    """
    Calculates wrapped mean squared error between predicted and target angles.

    This function computes the mean squared error for angular data, accounting for
    the wrap-around property of angles (e.g., 359° and 1° are close).

    Parameters
    ----------
    image : torch.Tensor
        Predicted image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
    ground_truth : torch.Tensor
        Ground truth image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
    reduction : str, optional
        Specifies the reduction to apply to the output: 'mean' (default) or 'sum'.

    Returns
    -------
    torch.Tensor
        The calculated wrapped mean squared error.

    Raises
    ------
    ValueError
        If an invalid reduction type is specified.
    """
    sin_diff = torch.sin(image) - torch.sin(ground_truth)
    cos_diff = torch.cos(image) - torch.cos(ground_truth)
    loss = sin_diff**2 + cos_diff**2

    if reduction == "mean":
        return loss.mean()
    elif reduction == "sum":
        return loss.sum()
    else:
        raise ValueError("Invalid reduction type. Choose 'mean' or 'sum'.")
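Comparing angles through their sine and cosine makes values that differ by a full turn count as identical; `wrapped_mse` is an illustrative NumPy sketch of the same formula:

```python
import numpy as np

def wrapped_mse(pred, target):
    # NumPy sketch of wrapped_mean_squared_error with 'mean' reduction:
    # compare angles via sin/cos so 0 and 2*pi are treated as equal.
    loss = (np.sin(pred) - np.sin(target)) ** 2 \
         + (np.cos(pred) - np.cos(target)) ** 2
    return loss.mean()

angles = np.array([0.1, 1.0, 3.0])
same = wrapped_mse(angles, angles + 2 * np.pi)         # wrapped copies -> ~0
plain = ((angles - (angles + 2 * np.pi)) ** 2).mean()  # naive MSE is large
```

A plain MSE on the same inputs is about (2π)² ≈ 39.5, while the wrapped loss is numerically zero.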

zernike_polynomial(n, m, rho, theta)

Compute the 2D Zernike polynomial Z_n^m(rho, theta).

Parameters:

  • n
         Radial degree of the polynomial (n >= 0).
    
  • m
         Azimuthal frequency of the polynomial. Must satisfy |m| <= n and (n - |m|) % 2 == 0.
    
  • rho
         Radial distance from the origin (0 to 1). Shape (H, W).
    
  • theta
         Azimuthal angle in radians. Shape (H, W).
    

Returns:

  • zernike ( Tensor ) –

    The computed 2D Zernike polynomial. Values are zero where rho > 1.

Source code in odak/learn/tools/function.py
def zernike_polynomial(
    n,
    m,
    rho,
    theta,
):
    """
    Compute the 2D Zernike polynomial Z_n^m(rho, theta).

    Parameters
    ----------
    n          : int
                 Radial degree of the polynomial (n >= 0).
    m          : int
                 Azimuthal frequency of the polynomial. Must satisfy |m| <= n and (n - |m|) % 2 == 0.
    rho        : torch.Tensor
                 Radial distance from the origin (0 to 1). Shape (H, W).
    theta      : torch.Tensor
                 Azimuthal angle in radians. Shape (H, W).


    Returns
    -------
    zernike    : torch.Tensor
                 The computed 2D Zernike polynomial.
                 Values are zero where rho > 1.
    """
    m_abs = abs(m)
    if m_abs > n or (n - m_abs) % 2 != 0:
        return torch.zeros(rho.shape, dtype=torch.complex64, device=rho.device)

    radial = torch.zeros_like(rho)

    for k in range((n - m_abs) // 2 + 1):
        num = (-1) ** k * torch.exp(torch.lgamma(torch.tensor(n - k + 1.0)))
        den = (
            torch.exp(torch.lgamma(torch.tensor(k + 1.0)))
            * torch.exp(torch.lgamma(torch.tensor((n + m_abs) // 2 - k + 1.0)))
            * torch.exp(torch.lgamma(torch.tensor((n - m_abs) // 2 - k + 1.0)))
        )
        radial += (num / den) * torch.pow(rho, n - 2 * k)

    if m >= 0:
        zernike = radial * torch.cos(m * theta)
    else:
        zernike = radial * torch.sin(m_abs * theta)
    zernike[rho > 1] = 0

    return zernike
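The radial sum inside the function uses `torch.lgamma` to evaluate factorials; for small `n` the same radial polynomial can be sketched directly with `math.factorial`. `radial_zernike` is an illustrative helper, not part of odak:

```python
import math
import numpy as np

def radial_zernike(n, m, rho):
    # NumPy sketch of the radial part of Z_n^m, using factorials
    # instead of torch.lgamma.
    m = abs(m)
    radial = np.zeros_like(rho, dtype=float)
    for k in range((n - m) // 2 + 1):
        coeff = ((-1) ** k * math.factorial(n - k)
                 / (math.factorial(k)
                    * math.factorial((n + m) // 2 - k)
                    * math.factorial((n - m) // 2 - k)))
        radial += coeff * rho ** (n - 2 * k)
    return radial

rho = np.array([0.0, 0.5, 1.0])
defocus = radial_zernike(2, 0, rho)  # R_2^0(rho) = 2*rho**2 - 1
```

The defocus term R_2^0 evaluates to -1 at the pupil center and 1 at the edge, as expected.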

zero_pad(field, size=None, method='center')

Zero pad a field to double its size or specified size.

This function pads a field with zeros to either double its size (default) or to a specified size. The input can be 2D, 3D or 4D tensors.

Parameters:

  • field
                Input field MxN or KxJxMxN or KxMxNxJ array.
    
  • size
                Size to zero pad to (e.g., [m, n]; applies to the last two dimensions only).
                If None, doubles the last two dimensions.
    
  • method
                Zeropad either by placing the content to center or to the left.
    

Returns:

  • field_zero_padded ( tensor ) –

    Zeropadded version of the input field.

Source code in odak/learn/tools/matrix.py
def zero_pad(field, size=None, method="center"):
    """
    Zero pad a field to double its size or specified size.

    This function pads a field with zeros to either double its size (default) 
    or to a specified size. The input can be 2D, 3D or 4D tensors.

    Parameters
    ----------
    field             : torch.tensor
                        Input field MxN or KxJxMxN or KxMxNxJ array.
    size              : list
                        Size to zero pad to (e.g., [m, n]; applies to the last two dimensions only).
                        If None, doubles the last two dimensions.
    method            : str
                        Zeropad either by placing the content to center or to the left.

    Returns
    ----------
    field_zero_padded : torch.tensor
                        Zeropadded version of the input field.
    """
    orig_resolution = field.shape
    if len(field.shape) < 3:
        field = field.unsqueeze(0)
    if len(field.shape) < 4:
        field = field.unsqueeze(0)
    permute_flag = False
    if field.shape[-1] < 5:
        permute_flag = True
        field = field.permute(0, 3, 1, 2)
    if size is None:
        resolution = [
            field.shape[0],
            field.shape[1],
            2 * field.shape[-2],
            2 * field.shape[-1],
        ]
    else:
        resolution = [field.shape[0], field.shape[1], size[0], size[1]]
    field_zero_padded = torch.zeros(resolution, device=field.device, dtype=field.dtype)
    if method == "center":
        start = [
            resolution[-2] // 2 - field.shape[-2] // 2,
            resolution[-1] // 2 - field.shape[-1] // 2,
        ]
        field_zero_padded[
            :,
            :,
            start[0] : start[0] + field.shape[-2],
            start[1] : start[1] + field.shape[-1],
        ] = field
    elif method == "left":
        field_zero_padded[:, :, 0 : field.shape[-2], 0 : field.shape[-1]] = field
    if permute_flag:
        field_zero_padded = field_zero_padded.permute(0, 2, 3, 1)
    if len(orig_resolution) == 2:
        field_zero_padded = field_zero_padded.squeeze(0).squeeze(0)
    if len(orig_resolution) == 3:
        field_zero_padded = field_zero_padded.squeeze(0)
    return field_zero_padded
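The "center" placement above can be sketched in NumPy for a 2D field; `zero_pad_center` is an illustrative equivalent, not the odak function:

```python
import numpy as np

def zero_pad_center(field, size=None):
    # NumPy sketch of zero_pad's "center" method for a 2D field.
    if size is None:
        size = [2 * field.shape[0], 2 * field.shape[1]]
    padded = np.zeros(size, dtype=field.dtype)
    start = [size[0] // 2 - field.shape[0] // 2,
             size[1] // 2 - field.shape[1] // 2]
    padded[start[0]:start[0] + field.shape[0],
           start[1]:start[1] + field.shape[1]] = field
    return padded

padded = zero_pad_center(np.ones((2, 2)))  # 2x2 -> centered in 4x4
```

Doubling the field size this way is the usual preparation step before FFT-based convolutions, which is why it is the default.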

load_image(fn, normalizeby=0.0, torch_style=False)

Definition to load an image from a given location as a torch tensor.

Parameters:

  • fn
           Filename.
    
  • normalizeby
           Value to normalize images with. Default value of zero will lead to no normalization.
    
  • torch_style
           If set True, it will load an image mxnx3 as 3xmxn.
    

Returns:

  • image ( tensor ) –

    Image loaded as a torch tensor.

Source code in odak/learn/tools/file.py
def load_image(fn, normalizeby=0.0, torch_style=False):
    """
    Definition to load an image from a given location as a torch tensor.

    Parameters
    ----------
    fn           : str
                   Filename.
    normalizeby  : float, optional
                   Value to normalize images with. Default value of zero will lead to no normalization.
    torch_style  : bool, optional
                   If set True, it will load an image mxnx3 as 3xmxn.

    Returns
    -------
    image        : torch.tensor
                   Image loaded as a torch tensor.
    """
    image = tools_load_image(fn, normalizeby=normalizeby, torch_style=torch_style)
    image = torch.from_numpy(image).float()
    return image

resize(image, multiplier=0.5, mode='nearest')

Definition to resize an image.

Parameters:

  • image
          Image with MxNx3 resolution.
    
  • multiplier
          Multiplier used in resizing operation (e.g., 0.5 is half size in one axis).
    
  • mode
          Mode to be used in scaling, nearest, bilinear, etc.
    

Returns:

  • new_image ( tensor ) –

    Resized image.

Source code in odak/learn/tools/file.py
def resize(image, multiplier=0.5, mode="nearest"):
    """
    Definition to resize an image.

    Parameters
    ----------
    image       : torch.tensor
                  Image with MxNx3 resolution.
    multiplier  : float
                  Multiplier used in resizing operation (e.g., 0.5 is half size in one axis).
    mode        : str
                  Mode to be used in scaling, nearest, bilinear, etc.

    Returns
    -------
    new_image   : torch.tensor
                  Resized image.
    """
    # Handle the case where image needs to be in the right format for torch.nn.Upsample
    if len(image.shape) == 3:
        # Add batch dimension: (H, W, C) -> (1, H, W, C)
        image = image.unsqueeze(0)
    elif len(image.shape) == 4:
        # Image is already in batch format
        pass
    else:
        raise ValueError("Image must have 3 or 4 dimensions")

    # Use torch.nn.functional.interpolate for resizing
    if mode not in ["nearest", "bilinear", "bicubic", "area"]:
        raise ValueError("Mode must be one of: nearest, bilinear, bicubic, area")

    # Resize the image
    new_image = torch.nn.functional.interpolate(
        image,
        scale_factor=multiplier,
        mode=mode,
        align_corners=None if mode in ["nearest", "area"] else False,
    )

    # Remove batch dimension if it was added
    if new_image.shape[0] == 1:
        new_image = new_image.squeeze(0)

    return new_image
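For integer downscaling factors, nearest-neighbour resizing amounts to strided sampling. The sketch below mirrors that behaviour in NumPy for an H x W image and assumes `multiplier` is the reciprocal of an integer (e.g., 0.5, 0.25); it is illustrative only, not the torch-based implementation:

```python
import numpy as np

def resize_nearest(image, multiplier=0.5):
    # NumPy sketch of nearest-neighbour downscaling: take every
    # (1 / multiplier)-th sample along each spatial axis.
    step = int(round(1 / multiplier))
    return image[::step, ::step]

img = np.arange(16.0).reshape(4, 4)
half = resize_nearest(img, multiplier=0.5)  # 4x4 -> 2x2
```

For arbitrary or fractional factors and other modes (bilinear, bicubic), use `torch.nn.functional.interpolate` as in the source above.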

save_image(fn, img, cmin=0, cmax=255, color_depth=8)

Definition to save a torch tensor as an image.

Parameters:

  • fn
           Filename.
    
  • img
           A torch tensor with NxMx3 or NxMx1 shapes.
    
  • cmin
           Minimum value that will be interpreted as 0 level in the final image.
    
  • cmax
           Maximum value that will be interpreted as 255 level in the final image.
    
  • color_depth
           Color depth of an image. Default is eight.
    

Returns:

  • bool ( bool ) –

    True if successful.

Source code in odak/learn/tools/file.py
def save_image(fn, img, cmin=0, cmax=255, color_depth=8):
    """
    Definition to save a torch tensor as an image.

    Parameters
    ----------
    fn           : str
                   Filename.
    img          : torch.tensor
                   A torch tensor with NxMx3 or NxMx1 shapes.
    cmin         : int
                   Minimum value that will be interpreted as 0 level in the final image.
    cmax         : int
                   Maximum value that will be interpreted as 255 level in the final image.
    color_depth  : int
                   Color depth of an image. Default is eight.

    Returns
    -------
    bool         : bool
                   True if successful.
    """
    if len(img.shape) == 4:
        img = img.squeeze(0)
    if len(img.shape) > 2 and torch.argmin(torch.tensor(img.shape)) == 0:
        # Transpose from (C, H, W) to (H, W, C)
        img = img.permute(1, 2, 0).detach().clone()
    img = img.cpu().detach().numpy()
    return tools_save_image(fn, img, cmin=cmin, cmax=cmax, color_depth=color_depth)

save_torch_tensor(fn, tensor)

Definition to save a torch tensor or dictionary.

Parameters:

  • fn
        Filename.
    
  • tensor
        Torch tensor or dictionary to be saved.
    

Raises:

  • ValueError : If path validation fails or extension is not allowed.
Source code in odak/learn/tools/file.py
def save_torch_tensor(fn, tensor):
    """
    Definition to save a torch tensor or dictionary.

    Parameters
    ----------
    fn           : str
                   Filename.
    tensor       : torch.tensor or dict
                   Torch tensor or dictionary to be saved.

    Raises
    ------
    ValueError : If path validation fails or extension is not allowed.
    """
    safe_path = validate_path(fn, allowed_extensions=[".pt", ".pth", ".pkl"])
    torch.save(tensor, safe_path)

histogram_loss(frame, ground_truth, bins=32, limits=[0.0, 1.0])

Calculates histogram loss between input frame and ground truth.

This function computes the MSE loss between histograms of the input frame and ground truth images, divided into specified number of bins.

Parameters:

  • frame (Tensor) –

    Input frame with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].

  • ground_truth (Tensor) –

    Ground truth with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].

  • bins (int, default: 32 ) –

    Number of bins for histogram calculation (default: 32).

  • limits (list, default: [0.0, 1.0] ) –

    Histogram limits as [min, max] (default: [0.0, 1.0]).

Returns:

  • Tensor

    Histogram loss value.

Source code in odak/learn/tools/loss.py
def histogram_loss(frame, ground_truth, bins=32, limits=[0.0, 1.0]):
    """
    Calculates histogram loss between input frame and ground truth.

    This function computes the MSE loss between histograms of the input frame
    and ground truth images, divided into specified number of bins.

    Parameters
    ----------
    frame : torch.Tensor
        Input frame with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
    ground_truth : torch.Tensor
        Ground truth with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
    bins : int, optional
        Number of bins for histogram calculation (default: 32).
    limits : list, optional
        Histogram limits as [min, max] (default: [0.0, 1.0]).

    Returns
    -------
    torch.Tensor
        Histogram loss value.
    """
    if len(frame.shape) == 2:
        frame = frame.unsqueeze(0).unsqueeze(0)
    elif len(frame.shape) == 3:
        frame = frame.unsqueeze(0)

    if len(ground_truth.shape) == 2:
        ground_truth = ground_truth.unsqueeze(0).unsqueeze(0)
    elif len(ground_truth.shape) == 3:
        ground_truth = ground_truth.unsqueeze(0)

    histogram_frame = torch.zeros(frame.shape[1], bins).to(frame.device)
    histogram_ground_truth = torch.zeros(ground_truth.shape[1], bins).to(frame.device)

    l2 = torch.nn.MSELoss()

    for i in range(frame.shape[1]):
        histogram_frame[i] = torch.histc(
            frame[:, i].flatten(), bins=bins, min=limits[0], max=limits[1]
        )
        histogram_ground_truth[i] = torch.histc(
            ground_truth[:, i].flatten(), bins=bins, min=limits[0], max=limits[1]
        )

    loss = l2(histogram_frame, histogram_ground_truth)

    return loss
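For a single-channel image the computation reduces to an MSE between two intensity histograms. `hist_loss` below is an illustrative NumPy sketch using `np.histogram` (bin-edge handling may differ slightly from `torch.histc`):

```python
import numpy as np

def hist_loss(frame, ground_truth, bins=32, limits=(0.0, 1.0)):
    # NumPy sketch of histogram_loss for single-channel images:
    # MSE between the two intensity histograms.
    h_f, _ = np.histogram(frame.ravel(), bins=bins, range=limits)
    h_g, _ = np.histogram(ground_truth.ravel(), bins=bins, range=limits)
    return ((h_f.astype(float) - h_g.astype(float)) ** 2).mean()

rng = np.random.default_rng(0)
a = rng.random((16, 16))
zero = hist_loss(a, a.copy())                         # identical -> zero loss
shifted = hist_loss(a, np.clip(a + 0.4, 0.0, 1.0))    # shifted -> positive
```

Because the loss only compares histograms, it is insensitive to where pixels sit spatially; it penalizes differences in the intensity distribution alone.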

michelson_contrast(image, roi_high, roi_low)

Calculates Michelson contrast ratio for given regions of an image.

This function computes the Michelson contrast ratio for high and low intensity regions using the formula: (mean_high - mean_low) / (mean_high + mean_low).

Parameters:

  • image (Tensor) –

    Input image with shape [1 x 3 x m x n], [3 x m x n], or [m x n].

  • roi_high (Tensor) –

    Corner locations of the high intensity region [m_start, m_end, n_start, n_end].

  • roi_low (Tensor) –

    Corner locations of the low intensity region [m_start, m_end, n_start, n_end].

Returns:

  • Tensor

    Michelson contrast for the given regions. Shape is [1] or [3] depending on input.

Source code in odak/learn/tools/loss.py
def michelson_contrast(image, roi_high, roi_low):
    """
    Calculates Michelson contrast ratio for given regions of an image.

    This function computes the Michelson contrast ratio for high and low intensity regions
    using the formula: (mean_high - mean_low) / (mean_high + mean_low).

    Parameters
    ----------
    image : torch.Tensor
        Input image with shape [1 x 3 x m x n], [3 x m x n], or [m x n].
    roi_high : torch.Tensor
        Corner locations of the high intensity region [m_start, m_end, n_start, n_end].
    roi_low : torch.Tensor
        Corner locations of the low intensity region [m_start, m_end, n_start, n_end].

    Returns
    -------
    torch.Tensor
        Michelson contrast for the given regions. Shape is [1] or [3] depending on input.
    """
    if len(image.shape) == 2:
        image = image.unsqueeze(0)
    if len(image.shape) == 3:
        image = image.unsqueeze(0)
    region_low = image[:, :, roi_low[0] : roi_low[1], roi_low[2] : roi_low[3]]
    region_high = image[:, :, roi_high[0] : roi_high[1], roi_high[2] : roi_high[3]]
    high = torch.mean(region_high, dim=(2, 3))
    low = torch.mean(region_low, dim=(2, 3))
    result = (high - low) / (high + low)
    return result.squeeze(0)

multi_scale_total_variation_loss(frame, levels=3)

Calculates multi-scale total variation loss for an input frame.

This function computes the total variation loss at multiple scales by creating an image pyramid where each level has half the resolution of the previous level.

Parameters:

  • frame (Tensor) –

    Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].

  • levels (int, default: 3 ) –

    Number of scales in the image pyramid (default: 3).

Returns:

  • Tensor

    Total variation loss value.

Source code in odak/learn/tools/loss.py
def multi_scale_total_variation_loss(frame, levels=3):
    """
    Calculates multi-scale total variation loss for an input frame.

    This function computes the total variation loss at multiple scales by creating
    an image pyramid where each level has half the resolution of the previous level.

    Parameters
    ----------
    frame : torch.Tensor
        Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].
    levels : int, optional
        Number of scales in the image pyramid (default: 3).

    Returns
    -------
    torch.Tensor
        Total variation loss value.
    """
    if len(frame.shape) == 2:
        frame = frame.unsqueeze(0)
    if len(frame.shape) == 3:
        frame = frame.unsqueeze(0)
    scale = torch.nn.Upsample(scale_factor=0.5, mode="nearest")
    level = frame
    loss = 0
    for i in range(levels):
        if i != 0:
            level = scale(level)
        loss += total_variation_loss(level)
    return loss

radial_basis_function(value, epsilon=0.5)

Applies radial basis function with Gaussian description to input values.

This function applies the Gaussian radial basis function: y = e^(-ε² * x²)

Parameters:

  • value (Tensor) –

    Value(s) to pass to the radial basis function.

  • epsilon (float, default: 0.5 ) –

    Epsilon parameter used in the Gaussian radial basis function (default: 0.5).

Returns:

  • Tensor

    Output values after applying the radial basis function.

Source code in odak/learn/tools/loss.py
def radial_basis_function(value, epsilon=0.5):
    """
    Applies radial basis function with Gaussian description to input values.

    This function applies the Gaussian radial basis function: y = e^(-ε² * x²)

    Parameters
    ----------
    value : torch.Tensor
        Value(s) to pass to the radial basis function.
    epsilon : float, optional
        Epsilon parameter used in the Gaussian radial basis function (default: 0.5).

    Returns
    -------
    torch.Tensor
        Output values after applying the radial basis function.
    """
    output = torch.exp((-((epsilon * value) ** 2)))
    return output
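A minimal numeric check of the formula above, written with stdlib `math` rather than torch; `gaussian_rbf` is a hypothetical stand-in for `radial_basis_function`:

```python
import math

def gaussian_rbf(value, epsilon=0.5):
    # y = exp(-(epsilon * x)^2), the same formula as above
    return math.exp(-((epsilon * value) ** 2))

peak = gaussian_rbf(0.0)   # 1.0 at the origin
tail = gaussian_rbf(2.0)   # exp(-1) with the default epsilon = 0.5
```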

spatial_gradient(frame)

Calculates the spatial gradient of a given frame.

This function computes the gradient of the input frame in both x and y directions by differencing adjacent pixels.

Parameters:

  • frame (Tensor) –

    Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].

Returns:

  • tuple

    Tuple of (diff_x, diff_y) representing spatial gradients along x and y axes.

Source code in odak/learn/tools/loss.py
def spatial_gradient(frame):
    """
    Calculates the spatial gradient of a given frame.

    This function computes the gradient of the input frame in both x and y directions
    by differencing adjacent pixels.

    Parameters
    ----------
    frame : torch.Tensor
        Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].

    Returns
    -------
    tuple
        Tuple of (diff_x, diff_y) representing spatial gradients along x and y axes.
    """
    if len(frame.shape) == 2:
        frame = frame.unsqueeze(0)
    if len(frame.shape) == 3:
        frame = frame.unsqueeze(0)
    diff_x = frame[:, :, :, 1:] - frame[:, :, :, :-1]
    diff_y = frame[:, :, 1:, :] - frame[:, :, :-1, :]
    return diff_x, diff_y
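The adjacent-pixel differencing above can be mirrored on plain Python lists (a toy sketch, independent of torch), which also shows the output shapes `m x (n-1)` and `(m-1) x n`:

```python
frame = [[0.0, 1.0, 3.0],
         [2.0, 2.0, 2.0]]
# diff_x: horizontal neighbour differences, shape m x (n - 1)
diff_x = [[row[j + 1] - row[j] for j in range(len(row) - 1)] for row in frame]
# diff_y: vertical neighbour differences, shape (m - 1) x n
diff_y = [[frame[i + 1][j] - frame[i][j] for j in range(len(frame[0]))]
          for i in range(len(frame) - 1)]
print(diff_x)  # [[1.0, 2.0], [0.0, 0.0]]
print(diff_y)  # [[2.0, 1.0, -1.0]]
```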

total_variation_loss(frame)

Calculates total variation loss for an input frame.

This function computes the total variation loss by calculating spatial gradients in both x and y directions and averaging their squared values.

Parameters:

  • frame (Tensor) –

    Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].

Returns:

  • Tensor

    Total variation loss value.

Source code in odak/learn/tools/loss.py
def total_variation_loss(frame):
    """
    Calculates total variation loss for an input frame.

    This function computes the total variation loss by calculating spatial gradients
    in both x and y directions and averaging their squared values.

    Parameters
    ----------
    frame : torch.Tensor
        Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].

    Returns
    -------
    torch.Tensor
        Total variation loss value.
    """
    if len(frame.shape) == 2:
        frame = frame.unsqueeze(0)
    if len(frame.shape) == 3:
        frame = frame.unsqueeze(0)
    diff_x, diff_y = spatial_gradient(frame)
    pixel_count = frame.shape[0] * frame.shape[1] * frame.shape[2] * frame.shape[3]
    loss = ((diff_x**2).sum() + (diff_y**2).sum()) / pixel_count
    return loss
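A worked example of the same arithmetic on a tiny 2 x 3 frame (plain Python, illustrative only): the squared horizontal differences sum to 5, the vertical ones to 6, and the pixel count is 6:

```python
frame = [[0.0, 1.0, 3.0],
         [2.0, 2.0, 2.0]]
m, n = len(frame), len(frame[0])
sum_dx2 = sum((frame[i][j + 1] - frame[i][j]) ** 2
              for i in range(m) for j in range(n - 1))   # 1 + 4 + 0 + 0 = 5
sum_dy2 = sum((frame[i + 1][j] - frame[i][j]) ** 2
              for i in range(m - 1) for j in range(n))   # 4 + 1 + 1 = 6
loss = (sum_dx2 + sum_dy2) / (m * n)                     # 11 / 6
```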

weber_contrast(image, roi_high, roi_low)

Calculates Weber contrast ratio for given regions of an image.

This function computes the Weber contrast ratio for high and low intensity regions using the formula: (mean_high - mean_low) / mean_low.

Parameters:

  • image (Tensor) –

    Input image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].

  • roi_high (Tensor) –

    Corner locations of the high intensity region [m_start, m_end, n_start, n_end].

  • roi_low (Tensor) –

    Corner locations of the low intensity region [m_start, m_end, n_start, n_end].

Returns:

  • Tensor

    Weber contrast for the given regions. Shape is [1] or [3] depending on input.

Source code in odak/learn/tools/loss.py
def weber_contrast(image, roi_high, roi_low):
    """
    Calculates Weber contrast ratio for given regions of an image.

    This function computes the Weber contrast ratio for high and low intensity regions
    using the formula: (mean_high - mean_low) / mean_low.

    Parameters
    ----------
    image : torch.Tensor
        Input image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
    roi_high : torch.Tensor
        Corner locations of the high intensity region [m_start, m_end, n_start, n_end].
    roi_low : torch.Tensor
        Corner locations of the low intensity region [m_start, m_end, n_start, n_end].

    Returns
    -------
    torch.Tensor
        Weber contrast for the given regions. Shape is [1] or [3] depending on input.
    """
    if len(image.shape) == 2:
        image = image.unsqueeze(0)
    if len(image.shape) == 3:
        image = image.unsqueeze(0)
    region_low = image[:, :, roi_low[0] : roi_low[1], roi_low[2] : roi_low[3]]
    region_high = image[:, :, roi_high[0] : roi_high[1], roi_high[2] : roi_high[3]]
    high = torch.mean(region_high, dim=(2, 3))
    low = torch.mean(region_low, dim=(2, 3))
    result = (high - low) / low
    return result.squeeze(0)
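A small plain-Python sketch of the same ratio, with the ROIs reduced to explicit column ranges on a toy image (hypothetical values, not odak code):

```python
image = [[0.1, 0.1, 0.8, 0.9],
         [0.1, 0.1, 0.7, 0.8]]
# roi_low covers columns 0-1, roi_high covers columns 2-3, over both rows
low = sum(image[i][j] for i in range(2) for j in range(2)) / 4      # mean 0.1
high = sum(image[i][j] for i in range(2) for j in range(2, 4)) / 4  # mean 0.8
contrast = (high - low) / low  # (mean_high - mean_low) / mean_low, ≈ 7.0
```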

wrapped_mean_squared_error(image, ground_truth, reduction='mean')

Calculates wrapped mean squared error between predicted and target angles.

This function computes the mean squared error for angular data, accounting for the wrap-around property of angles (e.g., 359° and 1° are close).

Parameters:

  • image (Tensor) –

    Predicted image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].

  • ground_truth (Tensor) –

    Ground truth image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].

  • reduction (str, default: 'mean' ) –

    Specifies the reduction to apply to the output: 'mean' (default) or 'sum'.

Returns:

  • Tensor

    The calculated wrapped mean squared error.

Raises:

  • ValueError

    If an invalid reduction type is specified.

Source code in odak/learn/tools/loss.py
def wrapped_mean_squared_error(image, ground_truth, reduction="mean"):
    """
    Calculates wrapped mean squared error between predicted and target angles.

    This function computes the mean squared error for angular data, accounting for
    the wrap-around property of angles (e.g., 359° and 1° are close).

    Parameters
    ----------
    image : torch.Tensor
        Predicted image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
    ground_truth : torch.Tensor
        Ground truth image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
    reduction : str, optional
        Specifies the reduction to apply to the output: 'mean' (default) or 'sum'.

    Returns
    -------
    torch.Tensor
        The calculated wrapped mean squared error.

    Raises
    ------
    ValueError
        If an invalid reduction type is specified.
    """
    sin_diff = torch.sin(image) - torch.sin(ground_truth)
    cos_diff = torch.cos(image) - torch.cos(ground_truth)
    loss = sin_diff**2 + cos_diff**2

    if reduction == "mean":
        return loss.mean()
    elif reduction == "sum":
        return loss.sum()
    else:
        raise ValueError("Invalid reduction type. Choose 'mean' or 'sum'.")
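The per-element loss above expands algebraically to `2 - 2*cos(a - b)`, so it depends only on the angular distance. A stdlib-`math` check of the 359° vs. 1° example (illustrative sketch, not odak code):

```python
import math

def wrapped_se(a, b):
    # per-element wrapped squared error, as in the function above
    return (math.sin(a) - math.sin(b)) ** 2 + (math.cos(a) - math.cos(b)) ** 2

a, b = math.radians(359.0), math.radians(1.0)
naive = (a - b) ** 2        # large: treats the angles as ~6.25 rad apart
wrapped = wrapped_se(a, b)  # small: the true angular distance is 2 degrees
```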

blur_gaussian(field, kernel_length=[21, 21], nsigma=[3, 3], padding='same')

Blur a field using a Gaussian kernel.

This function applies Gaussian blur to the input field by convolving it with a Gaussian kernel.

Parameters:

  • field (tensor) –

    MxN field to be blurred.

  • kernel_length (list, default: [21, 21] ) –

    Length of the Gaussian kernel along X and Y axes.

  • nsigma (list, default: [3, 3] ) –

    Sigma of the Gaussian kernel along X and Y axes.

  • padding (int or str, default: 'same' ) –

    Padding value; see torch.nn.functional.conv2d() for more.

Returns:

  • blurred_field ( tensor ) –

    Blurred field.

Source code in odak/learn/tools/matrix.py
def blur_gaussian(field, kernel_length=[21, 21], nsigma=[3, 3], padding="same"):
    """
    Blur a field using a Gaussian kernel.

    This function applies Gaussian blur to the input field by convolving it
    with a Gaussian kernel.

    Parameters
    ----------
    field         : torch.tensor
                    MxN field to be blurred.
    kernel_length : list
                    Length of the Gaussian kernel along X and Y axes.
    nsigma        : list
                    Sigma of the Gaussian kernel along X and Y axes.
    padding       : int or string
                    Padding value, see torch.nn.functional.conv2d() for more.

    Returns
    -------
    blurred_field : torch.tensor
                    Blurred field.
    """
    kernel = generate_2d_gaussian(kernel_length, nsigma).to(field.device)
    kernel = kernel.unsqueeze(0).unsqueeze(0)
    if len(field.shape) == 2:
        field = field.view(1, 1, field.shape[-2], field.shape[-1])
    blurred_field = torch.nn.functional.conv2d(field, kernel, padding=padding)
    if field.shape[1] == 1:
        blurred_field = blurred_field.view(
            blurred_field.shape[-2], blurred_field.shape[-1]
        )
    return blurred_field

convolve2d(field, kernel)

Convolve a field with a kernel using frequency domain multiplication.

This function performs 2D convolution by transforming both the field and kernel to frequency domain, multiplying them, and transforming back to spatial domain.

Parameters:

  • field (tensor) –

    Input field with MxN shape.

  • kernel (tensor) –

    Input kernel with MxN shape.

Returns:

  • convolved_field ( tensor ) –

    Convolved field.

Source code in odak/learn/tools/matrix.py
def convolve2d(field, kernel):
    """
    Convolve a field with a kernel using frequency domain multiplication.

    This function performs 2D convolution by transforming both the field and kernel 
    to frequency domain, multiplying them, and transforming back to spatial domain.

    Parameters
    ----------
    field       : torch.tensor
                  Input field with MxN shape.
    kernel      : torch.tensor
                  Input kernel with MxN shape.

    Returns
    -------
    convolved_field   : torch.tensor
                        Convolved field.
    """
    fr = torch.fft.fft2(field)
    fr2 = torch.fft.fft2(torch.flip(torch.flip(kernel, [1, 0]), [0, 1]))
    m, n = fr.shape
    convolved_field = torch.real(torch.fft.ifft2(fr * fr2))
    convolved_field = torch.roll(convolved_field, shifts=(int(n / 2 + 1), 0), dims=(1, 0))
    convolved_field = torch.roll(convolved_field, shifts=(int(m / 2 + 1), 0), dims=(0, 1))
    return convolved_field
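The frequency-domain route taken above relies on the convolution theorem: pointwise multiplication of spectra equals circular convolution in space. A tiny 1D sanity check with a naive DFT (pure Python, no torch; illustrative only):

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def circular_convolve(a, b):
    n = len(a)
    return [sum(a[t] * b[(s - t) % n] for t in range(n)) for s in range(n)]

a = [1.0, 2.0, 3.0, 4.0]
b = [0.5, 0.0, -0.5, 1.0]
# pointwise product in the frequency domain == circular convolution in space
via_fft = [v.real for v in idft([p * q for p, q in zip(dft(a), dft(b))])]
direct = circular_convolve(a, b)
```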

correlation_2d(first_tensor, second_tensor)

Calculate the correlation between two tensors using FFT.

This function computes the 2D correlation between two tensors using frequency domain multiplication. It's equivalent to computing cross-correlation using FFT techniques.

Parameters:

  • first_tensor (tensor) –

    First tensor.

  • second_tensor (tensor) –

    Second tensor.

Returns:

  • correlation ( tensor ) –

    Correlation between the two tensors.

Source code in odak/learn/tools/matrix.py
def correlation_2d(first_tensor, second_tensor):
    """
    Calculate the correlation between two tensors using FFT.

    This function computes the 2D correlation between two tensors using 
    frequency domain multiplication. It's equivalent to computing 
    cross-correlation using FFT techniques.

    Parameters
    ----------
    first_tensor  : torch.tensor
                    First tensor.
    second_tensor : torch.tensor
                    Second tensor.

    Returns
    -------
    correlation   : torch.tensor
                    Correlation between the two tensors.
    """
    fft_first_tensor = torch.fft.fft2(first_tensor)
    fft_second_tensor = torch.fft.fft2(second_tensor)
    conjugate_second_tensor = torch.conj(fft_second_tensor)
    correlation = torch.fft.ifftshift(
        torch.fft.ifft2(fft_first_tensor * conjugate_second_tensor)
    )
    return correlation

crop_center(field, size=None)

Crop the center of a field to specified size or half of current size.

This function crops the center of a field to either half of its current size (default) or to a specified size. The input can be 2D, 3D or 4D tensors.

Parameters:

  • field (tensor) –

    Input field 2M x 2N or K x L x 2M x 2N or K x 2M x 2N x L array.

  • size (list, default: None ) –

    Dimensions to crop with respect to the center of the image (e.g., M x N or 1 x 1 x M x N).
    If None, crops to half of the current size.

Returns:

  • cropped ( tensor ) –

    Cropped version of the input field.

Source code in odak/learn/tools/matrix.py
def crop_center(field, size=None):
    """
    Crop the center of a field to specified size or half of current size.

    This function crops the center of a field to either half of its current size (default) 
    or to a specified size. The input can be 2D, 3D or 4D tensors.

    Parameters
    ----------
    field       : torch.tensor
                  Input field 2M x 2N or K x L x 2M x 2N or K x 2M x 2N x L array.
    size        : list
                  Dimensions to crop with respect to center of the image (e.g., M x N or 1 x 1 x M x N).
                  If None, crops to half of the current size.

    Returns
    -------
    cropped     : torch.tensor
                  Cropped version of the input field.
    """
    orig_resolution = field.shape
    if len(field.shape) < 3:
        field = field.unsqueeze(0)
    if len(field.shape) < 4:
        field = field.unsqueeze(0)
    permute_flag = False
    if field.shape[-1] < 5:
        permute_flag = True
        field = field.permute(0, 3, 1, 2)
    if size is None:
        qx = int(field.shape[-2] // 4)
        qy = int(field.shape[-1] // 4)
        cropped_padded = field[
            :, :, qx : qx + field.shape[-2] // 2, qy : qy + field.shape[-1] // 2
        ]
    else:
        cx = int(field.shape[-2] // 2)
        cy = int(field.shape[-1] // 2)
        hx = int(size[-2] // 2)
        hy = int(size[-1] // 2)
        cropped_padded = field[:, :, cx - hx : cx + hx, cy - hy : cy + hy]
    cropped = cropped_padded
    if permute_flag:
        cropped = cropped.permute(0, 2, 3, 1)
    if len(orig_resolution) == 2:
        cropped = cropped_padded.squeeze(0).squeeze(0)
    if len(orig_resolution) == 3:
        cropped = cropped_padded.squeeze(0)
    return cropped

generate_2d_dirac_delta(kernel_length=[21, 21], a=[3, 3], mu=[0, 0], theta=0, normalize=False)

Generate 2D Dirac delta function using Gaussian approximation.

This function creates a 2D Dirac delta function by using a Gaussian distribution with very small standard deviations (the a values) to approximate its behavior. Inspired by https://en.wikipedia.org/wiki/Dirac_delta_function

Parameters:

  • kernel_length (list, default: [21, 21] ) –

    Length of the Dirac delta function along X and Y axes.

  • a (list, default: [3, 3] ) –

    The scale factor in the Gaussian distribution used to approximate the Dirac delta function.
    As a approaches zero, the Gaussian becomes infinitely narrow and tall at the center (x = 0), approaching the Dirac delta function.

  • mu (list, default: [0, 0] ) –

    Mu of the Gaussian kernel along X and Y axes.

  • theta (float, default: 0 ) –

    The rotation angle of the 2D Dirac delta function.

  • normalize (bool, default: False ) –

    If set True, normalize the output to a maximum value of 1.

Returns:

  • kernel_2d ( tensor ) –

    Generated 2D Dirac delta function.

Source code in odak/learn/tools/matrix.py
def generate_2d_dirac_delta(
    kernel_length=[21, 21], a=[3, 3], mu=[0, 0], theta=0, normalize=False
):
    """
    Generate 2D Dirac delta function using Gaussian approximation.

    This function creates a 2D Dirac delta function by using a Gaussian distribution 
    with very small standard deviations (a values) to approximate the behavior.
    Inspired by https://en.wikipedia.org/wiki/Dirac_delta_function

    Parameters
    ----------
    kernel_length : list
                    Length of the Dirac delta function along X and Y axes.
    a             : list
                    The scale factor in Gaussian distribution to approximate the Dirac delta function.
                    As a approaches zero, the Gaussian distribution becomes infinitely narrow and tall at the center (x=0), approaching the Dirac delta function.
    mu            : list
                    Mu of the Gaussian kernel along X and Y axes.
    theta         : float
                    The rotation angle of the 2D Dirac delta function.
    normalize     : bool
                    If set True, normalize the output to maximum value of 1.

    Returns
    -------
    kernel_2d     : torch.tensor
                    Generated 2D Dirac delta function.
    """
    x = torch.linspace(
        -kernel_length[0] / 2.0, kernel_length[0] / 2.0, kernel_length[0]
    )
    y = torch.linspace(
        -kernel_length[1] / 2.0, kernel_length[1] / 2.0, kernel_length[1]
    )
    X, Y = torch.meshgrid(x, y, indexing="ij")
    X = X - mu[0]
    Y = Y - mu[1]
    theta = torch.as_tensor(theta)
    X_rot = X * torch.cos(theta) - Y * torch.sin(theta)
    Y_rot = X * torch.sin(theta) + Y * torch.cos(theta)
    kernel_2d = (1 / (abs(a[0] * a[1]) * torch.pi)) * torch.exp(
        -((X_rot / a[0]) ** 2 + (Y_rot / a[1]) ** 2)
    )
    if normalize:
        kernel_2d = kernel_2d / kernel_2d.max()
    return kernel_2d

generate_2d_gaussian(kernel_length=[21, 21], nsigma=[3, 3], mu=[0, 0], normalize=False)

Generate 2D Gaussian kernel.

This function creates a 2D Gaussian kernel with specified dimensions and parameters. Inspired by https://stackoverflow.com/questions/29731726/how-to-calculate-a-gaussian-kernel-matrix-efficiently-in-numpy

Parameters:

  • kernel_length (list, default: [21, 21] ) –

    Length of the Gaussian kernel along X and Y axes.

  • nsigma (list, default: [3, 3] ) –

    Sigma of the Gaussian kernel along X and Y axes.

  • mu (list, default: [0, 0] ) –

    Mu of the Gaussian kernel along X and Y axes.

  • normalize (bool, default: False ) –

    If set True, normalize the output to a maximum value of 1.

Returns:

  • kernel_2d ( tensor ) –

    Generated Gaussian kernel.

Source code in odak/learn/tools/matrix.py
def generate_2d_gaussian(
    kernel_length=[21, 21], nsigma=[3, 3], mu=[0, 0], normalize=False
):
    """
    Generate 2D Gaussian kernel.

    This function creates a 2D Gaussian kernel with specified dimensions and parameters.
    Inspired by https://stackoverflow.com/questions/29731726/how-to-calculate-a-gaussian-kernel-matrix-efficiently-in-numpy

    Parameters
    ----------
    kernel_length : list
                    Length of the Gaussian kernel along X and Y axes.
    nsigma        : list
                    Sigma of the Gaussian kernel along X and Y axes.
    mu            : list
                    Mu of the Gaussian kernel along X and Y axes.
    normalize     : bool
                    If set True, normalize the output to maximum value of 1.

    Returns
    -------
    kernel_2d     : torch.tensor
                    Generated Gaussian kernel.
    """
    x = torch.linspace(
        -kernel_length[0] / 2.0, kernel_length[0] / 2.0, kernel_length[0]
    )
    y = torch.linspace(
        -kernel_length[1] / 2.0, kernel_length[1] / 2.0, kernel_length[1]
    )
    X, Y = torch.meshgrid(x, y, indexing="ij")
    if nsigma[0] == 0:
        nsigma[0] = 1e-5
    if nsigma[1] == 0:
        nsigma[1] = 1e-5
    kernel_2d = (
        1.0
        / (2.0 * torch.pi * nsigma[0] * nsigma[1])
        * torch.exp(
            -(
                (X - mu[0]) ** 2.0 / (2.0 * nsigma[0] ** 2.0)
                + (Y - mu[1]) ** 2.0 / (2.0 * nsigma[1] ** 2.0)
            )
        )
    )
    if normalize:
        kernel_2d = kernel_2d / kernel_2d.max()
    return kernel_2d
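A pointwise stdlib-`math` analog of the kernel formula above (an illustrative sketch; `gaussian_2d` is a hypothetical helper, not odak's): the unnormalized kernel peaks at `1 / (2π σx σy)`, and one sigma from the center the value drops by a factor of `exp(-0.5)`:

```python
import math

def gaussian_2d(x, y, sx, sy, mux=0.0, muy=0.0):
    # same expression as kernel_2d above, evaluated at a single point
    return (1.0 / (2.0 * math.pi * sx * sy)
            * math.exp(-((x - mux) ** 2 / (2.0 * sx ** 2)
                         + (y - muy) ** 2 / (2.0 * sy ** 2))))

peak = gaussian_2d(0.0, 0.0, 3.0, 3.0)  # 1 / (2π · 9) at the center
off = gaussian_2d(3.0, 0.0, 3.0, 3.0)   # one sigma away along X
```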

quantize(image_field, bits=8, limits=[0.0, 1.0])

Quantize an image field to a specified number of bits.

This function maps the input image field from its original range to a quantized representation with the specified number of bits.

Parameters:

  • image_field (tensor) –

    Input image field in any range.

  • bits (int, default: 8 ) –

    Number of bits for quantization (1-8).

  • limits (list, default: [0.0, 1.0] ) –

    The minimum and maximum of the image_field variable.

Returns:

  • quantized_field ( tensor ) –

    Quantized image field.

Source code in odak/learn/tools/matrix.py
def quantize(image_field, bits=8, limits=[0.0, 1.0]):
    """
    Quantize an image field to a specified number of bits.

    This function maps the input image field from its original range to a quantized 
    representation with the specified number of bits.

    Parameters
    ----------
    image_field : torch.tensor
                  Input image field in any range.
    bits        : int
                  Number of bits for quantization (1-8).
    limits      : list
                  The minimum and maximum of the image_field variable.

    Returns
    -------
    quantized_field   : torch.tensor
                        Quantized image field.
    """
    normalized_field = (image_field - limits[0]) / (limits[1] - limits[0])
    divider = 2**bits
    quantized_field = normalized_field * divider
    quantized_field = quantized_field.int()
    return quantized_field
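A worked scalar example of the same mapping (plain Python; `quantize_scalar` is a hypothetical stand-in, not odak's function):

```python
def quantize_scalar(x, bits=8, limits=(0.0, 1.0)):
    lo, hi = limits
    normalized = (x - lo) / (hi - lo)
    return int(normalized * (2 ** bits))   # same arithmetic as above

mid = quantize_scalar(0.5)            # 0.5 * 256 -> 128
low = quantize_scalar(0.25, bits=4)   # 0.25 * 16 -> 4
```

Note that an input exactly at the upper limit maps to 2 ** bits (e.g., 256 at 8 bits), one above the top code of a bits-wide integer; clamping beforehand keeps results in [0, 2 ** bits - 1].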

zero_pad(field, size=None, method='center')

Zero pad a field to double its size or specified size.

This function pads a field with zeros to either double its size (default) or to a specified size. The input can be 2D, 3D or 4D tensors.

Parameters:

  • field (tensor) –

    Input field MxN or KxJxMxN or KxMxNxJ array.

  • size (list, default: None ) –

    Size to be zero-padded (e.g., [m, n], last two dimensions only).
    If None, doubles the last two dimensions.

  • method (str, default: 'center' ) –

    Zero-pad either by placing the content at the center or to the left.

Returns:

  • field_zero_padded ( tensor ) –

    Zeropadded version of the input field.

Source code in odak/learn/tools/matrix.py
def zero_pad(field, size=None, method="center"):
    """
    Zero pad a field to double its size or specified size.

    This function pads a field with zeros to either double its size (default) 
    or to a specified size. The input can be 2D, 3D or 4D tensors.

    Parameters
    ----------
    field             : torch.tensor
                        Input field MxN or KxJxMxN or KxMxNxJ array.
    size              : list
                        Size to be zeropadded (e.g., [m, n], last two dimensions only). 
                        If None, doubles the last two dimensions.
    method            : str
                        Zeropad either by placing the content to center or to the left.

    Returns
    -------
    field_zero_padded : torch.tensor
                        Zeropadded version of the input field.
    """
    orig_resolution = field.shape
    if len(field.shape) < 3:
        field = field.unsqueeze(0)
    if len(field.shape) < 4:
        field = field.unsqueeze(0)
    permute_flag = False
    if field.shape[-1] < 5:
        permute_flag = True
        field = field.permute(0, 3, 1, 2)
    if size is None:
        resolution = [
            field.shape[0],
            field.shape[1],
            2 * field.shape[-2],
            2 * field.shape[-1],
        ]
    else:
        resolution = [field.shape[0], field.shape[1], size[0], size[1]]
    field_zero_padded = torch.zeros(resolution, device=field.device, dtype=field.dtype)
    if method == "center":
        start = [
            resolution[-2] // 2 - field.shape[-2] // 2,
            resolution[-1] // 2 - field.shape[-1] // 2,
        ]
        field_zero_padded[
            :,
            :,
            start[0] : start[0] + field.shape[-2],
            start[1] : start[1] + field.shape[-1],
        ] = field
    elif method == "left":
        field_zero_padded[:, :, 0 : field.shape[-2], 0 : field.shape[-1]] = field
    if permute_flag:
        field_zero_padded = field_zero_padded.permute(0, 2, 3, 1)
    if len(orig_resolution) == 2:
        field_zero_padded = field_zero_padded.squeeze(0).squeeze(0)
    if len(orig_resolution) == 3:
        field_zero_padded = field_zero_padded.squeeze(0)
    return field_zero_padded
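The centered placement above can be sketched on plain Python lists (illustrative only): a 2 x 2 field padded to the default 4 x 4 lands at offset `(out // 2 - in // 2)` on each axis:

```python
field = [[1, 2],
         [3, 4]]
m, n = len(field), len(field[0])
out_m, out_n = 2 * m, 2 * n                         # default: double each axis
start = (out_m // 2 - m // 2, out_n // 2 - n // 2)  # centering offsets, as above
padded = [[0] * out_n for _ in range(out_m)]
for i in range(m):
    for j in range(n):
        padded[start[0] + i][start[1] + j] = field[i][j]
print(padded)  # [[0, 0, 0, 0], [0, 1, 2, 0], [0, 3, 4, 0], [0, 0, 0, 0]]
```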

grid_sample(no=[10, 10], size=[100.0, 100.0], center=[0.0, 0.0, 0.0], angles=[0.0, 0.0, 0.0])

Generate samples over a surface.

Parameters:

  • no (list, default: [10, 10] ) –

    Number of samples along each dimension.

  • size (list, default: [100.0, 100.0] ) –

    Physical size of the surface along each dimension.

  • center (list, default: [0.0, 0.0, 0.0] ) –

    Center location of the surface.

  • angles (list, default: [0.0, 0.0, 0.0] ) –

    Tilt angles of the surface around X, Y, and Z axes.

Returns:

  • samples ( tensor ) –

    Generated samples.

  • rotx ( tensor ) –

    Rotation matrix around X axis.

  • roty ( tensor ) –

    Rotation matrix around Y axis.

  • rotz ( tensor ) –

    Rotation matrix around Z axis.

Source code in odak/learn/tools/sample.py
def grid_sample(
    no=[10, 10], size=[100.0, 100.0], center=[0.0, 0.0, 0.0], angles=[0.0, 0.0, 0.0]
):
    """
    Generate samples over a surface.

    Parameters
    ----------
    no : list
        Number of samples along each dimension.
    size : list
        Physical size of the surface along each dimension.
    center : list
        Center location of the surface.
    angles : list
        Tilt angles of the surface around X, Y, and Z axes.

    Returns
    -------
    samples : torch.tensor
        Generated samples.
    rotx : torch.tensor
        Rotation matrix around X axis.
    roty : torch.tensor
        Rotation matrix around Y axis.
    rotz : torch.tensor
        Rotation matrix around Z axis.
    """
    center = torch.tensor(center, dtype=torch.float32)
    angles = torch.tensor(angles, dtype=torch.float32)
    size = torch.tensor(size, dtype=torch.float32)
    samples = torch.zeros((no[0], no[1], 3), dtype=torch.float32)
    x = torch.linspace(-size[0] / 2.0, size[0] / 2.0, no[0])
    y = torch.linspace(-size[1] / 2.0, size[1] / 2.0, no[1])
    X, Y = torch.meshgrid(x, y, indexing="ij")
    samples[:, :, 0] = X
    samples[:, :, 1] = Y
    samples = samples.reshape((-1, 3))
    samples, rotx, roty, rotz = rotate_points(samples, angles=angles, offset=center)
    return samples, rotx, roty, rotz

get_rotation_matrix(tilt_angles=[0.0, 0.0, 0.0], tilt_order='XYZ')

Generate rotation matrix for given tilt angles and tilt order.

Parameters:

  • tilt_angles (list, default: [0.0, 0.0, 0.0] ) –

    Tilt angles in degrees along XYZ axes.

  • tilt_order (str, default: 'XYZ' ) –

    Rotation order (e.g., XYZ, XZY, ZXY, YXZ, ZYX).

Returns:

  • Tensor

    Rotation matrix.

Source code in odak/learn/tools/transformation.py
def get_rotation_matrix(tilt_angles=[0.0, 0.0, 0.0], tilt_order="XYZ"):
    """
    Generate rotation matrix for given tilt angles and tilt order.

    Parameters
    ----------
    tilt_angles : list
        Tilt angles in degrees along XYZ axes.
    tilt_order : str
        Rotation order (e.g., XYZ, XZY, ZXY, YXZ, ZYX).

    Returns
    -------
    torch.Tensor
        Rotation matrix.
    """
    rotx = rotmatx(tilt_angles[0])
    roty = rotmaty(tilt_angles[1])
    rotz = rotmatz(tilt_angles[2])
    if tilt_order == "XYZ":
        rotmat = torch.mm(rotz, torch.mm(roty, rotx))
    elif tilt_order == "XZY":
        rotmat = torch.mm(roty, torch.mm(rotz, rotx))
    elif tilt_order == "ZXY":
        rotmat = torch.mm(roty, torch.mm(rotx, rotz))
    elif tilt_order == "YXZ":
        rotmat = torch.mm(rotz, torch.mm(rotx, roty))
    elif tilt_order == "ZYX":
        rotmat = torch.mm(rotx, torch.mm(roty, rotz))
    return rotmat
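The `rotmatx`/`rotmaty`/`rotmatz` helpers used above are not shown here; the sketch below assumes the standard right-handed Z-rotation matrix (which may differ from odak's convention) and checks that a 90° rotation about Z carries the X axis onto the Y axis:

```python
import math

def rotz(angle_degrees):
    # standard right-handed rotation about the Z axis (assumed convention)
    a = math.radians(angle_degrees)
    return [[math.cos(a), -math.sin(a), 0.0],
            [math.sin(a),  math.cos(a), 0.0],
            [0.0,          0.0,         1.0]]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

rotated = matvec(rotz(90.0), [1.0, 0.0, 0.0])  # x-axis -> y-axis
```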

load_voxelized_PLY(ply_filename, voxel_size=[0.05, 0.05, 0.05], device=torch.device('cpu'))

Load a point cloud from a PLY file and convert it into a voxel grid representation.

Parameters:

  • ply_filename (str or Path) –

    The path to the input PLY file containing triangle data.

  • voxel_size ((list or tuple, shape(3)), default: [0.05, 0.05, 0.05] ) –

    The size of each voxel in the x, y, and z directions. Default is [0.05, 0.05, 0.05].

  • device (device, default: device('cpu') ) –

    The device on which to perform computations. Default is CPU.

Returns:

  • points ( (Tensor, shape(N, 3)) ) –

    A tensor containing the coordinates of the voxel centers.

  • ground_truth ( (Tensor, shape(Gx * Gy * Gz)) ) –

    A binary tensor where each element indicates whether a corresponding voxel contains at least one point.

Notes
  • The function reads triangle data from the PLY file and computes the center points of these triangles.
  • These points are then processed to create a normalized point cloud, which is converted into a voxel grid.
  • Only voxels containing at least one point are marked as 1 in ground_truth.
  • All operations are performed on the specified device for efficiency.
Source code in odak/learn/tools/transformation.py
def load_voxelized_PLY(
    ply_filename,
    voxel_size=[0.05, 0.05, 0.05],
    device=torch.device("cpu"),
):
    """
    Load a point cloud from a PLY file and convert it into a voxel grid representation.

    Parameters
    ----------
    ply_filename : str or Path
        The path to the input PLY file containing triangle data.
    voxel_size : list or tuple, shape (3,), optional
        The size of each voxel in the x, y, and z directions. Default is [0.05, 0.05, 0.05].
    device : torch.device, optional
        The device on which to perform computations. Default is CPU.

    Returns
    -------
    points : torch.Tensor, shape (N, 3)
        A tensor containing the coordinates of the voxel centers.
    ground_truth : torch.Tensor, shape (Gx * Gy * Gz,)
        A binary tensor where each element indicates whether a corresponding voxel contains at least one point.

    Notes
    -----
    - The function reads triangle data from the PLY file and computes the center points of these triangles.
    - These points are then processed to create a normalized point cloud, which is converted into a voxel grid.
    - Only voxels containing at least one point are marked as 1 in `ground_truth`.
    - All operations are performed on the specified device for efficiency.
    """
    triangles = read_PLY(ply_filename)
    points = center_of_triangle(triangles)
    points = torch.as_tensor(points, device=device)
    points = points - points.mean()
    points = points / torch.amax(points)
    ground_truth = torch.ones(points.shape[0], device=device)
    voxel_locations, voxel_grid = point_cloud_to_voxel(
        points=points,
        voxel_size=voxel_size,
    )
    points = voxel_locations.reshape(-1, 3)
    ground_truth = voxel_grid.reshape(-1)
    return points, ground_truth

point_cloud_to_voxel(points, voxel_size=[0.1, 0.1, 0.1])

Convert a point cloud to a voxel grid representation.

Parameters:

  • points ((Tensor, shape(N, 3))) –

    The input point cloud, where each row is a 3D point.

  • voxel_size ((list or Tensor, shape(3)), default: [0.1, 0.1, 0.1] ) –

    The size of each voxel in the x, y, and z directions. Default is [0.1, 0.1, 0.1].

Returns:

  • locations ( (Tensor, shape(Gx, Gy, Gz, 3)) ) –

    The coordinates of each voxel center in the grid.

  • grid ( (Tensor, shape(Gx, Gy, Gz)) ) –

    A binary voxel grid where 1 indicates the presence of at least one point.

Notes
  • The voxel grid is constructed by discretizing the space between the minimum and maximum coordinates of the point cloud.
  • Only voxels containing at least one point are marked as 1.
  • The output grid is of type float32 and resides on the same device as the input points.
Source code in odak/learn/tools/transformation.py
def point_cloud_to_voxel(
    points,
    voxel_size=[0.1, 0.1, 0.1],
):
    """
    Convert a point cloud to a voxel grid representation.

    Parameters
    ----------
    points : torch.Tensor, shape (N, 3)
        The input point cloud, where each row is a 3D point.
    voxel_size : list or torch.Tensor, shape (3,), optional
        The size of each voxel in the x, y, and z directions. Default is [0.1, 0.1, 0.1].

    Returns
    -------
    locations : torch.Tensor, shape (Gx, Gy, Gz, 3)
        The coordinates of each voxel center in the grid.
    grid : torch.Tensor, shape (Gx, Gy, Gz)
        A binary voxel grid where 1 indicates the presence of at least one point.

    Notes
    -----
    - The voxel grid is constructed by discretizing the space between the minimum and maximum
      coordinates of the point cloud.
    - Only voxels containing at least one point are marked as 1.
    - The output grid is of type float32 and resides on the same device as the input points.
    """
    voxel_size = torch.as_tensor(voxel_size, device=points.device)

    min_coords = points.min(dim=0).values
    max_coords = points.max(dim=0).values
    grid_size = ((max_coords - min_coords) / voxel_size).ceil().int()
    points = points - min_coords

    x = torch.linspace(min_coords[0], max_coords[0], grid_size[0], device=points.device)
    y = torch.linspace(min_coords[1], max_coords[1], grid_size[1], device=points.device)
    z = torch.linspace(min_coords[2], max_coords[2], grid_size[2], device=points.device)
    X, Y, Z = torch.meshgrid(x, y, z, indexing="ij")
    locations = torch.stack([X, Y, Z], dim=-1)

    voxel_indices = (points / voxel_size).floor().int()
    mask = (voxel_indices >= 0).all(dim=1) & (voxel_indices < grid_size).all(dim=1)
    voxel_indices = voxel_indices[mask]
    grid = torch.zeros(grid_size.tolist(), dtype=torch.float32, device=points.device)
    grid[voxel_indices[:, 0], voxel_indices[:, 1], voxel_indices[:, 2]] = 1.0

    return locations, grid
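As a small worked example, the bucketing rule used above — voxel index equals ``floor((point - min) / voxel_size)`` — can be reproduced directly. The point values and voxel size below are illustrative:

```python
import torch

# Two points near opposite corners of a cube, bucketed with the same
# rule as point_cloud_to_voxel: index = floor((point - min) / size).
points = torch.tensor([[0.1, 0.1, 0.1],
                       [0.9, 0.9, 0.9]])
voxel_size = torch.tensor([0.25, 0.25, 0.25])
min_coords = points.min(dim=0).values
grid_size = ((points.max(dim=0).values - min_coords) / voxel_size).ceil().int()
indices = ((points - min_coords) / voxel_size).floor().int()
grid = torch.zeros(grid_size.tolist())
grid[indices[:, 0], indices[:, 1], indices[:, 2]] = 1.0
# The grid is 4 x 4 x 4 and only the two voxels containing a point are marked.
```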

quaternion_to_rotation_matrix(quaternions)

Convert rotations given as unit quaternions to rotation matrices.

Parameters:

  • quaternions (Tensor) –

    Quaternions with real part first, shape ``(*, 4)`` in ``(w, x, y, z)`` convention.

Returns:

  • rotation_matrices ( Tensor ) –

    Rotation matrices, shape (*, 3, 3).

Source code in odak/learn/tools/transformation.py
def quaternion_to_rotation_matrix(quaternions):
    """
    Convert rotations given as unit quaternions to rotation matrices.

    Parameters
    ----------
    quaternions : torch.Tensor
                  Quaternions with real part first, shape ``(*, 4)``
                  in ``(w, x, y, z)`` convention.

    Returns
    -------
    rotation_matrices : torch.Tensor
                        Rotation matrices, shape ``(*, 3, 3)``.
    """
    quaternions = F.normalize(quaternions, dim=-1)
    w, x, y, z = quaternions.unbind(-1)

    two_s = 2.0 / (quaternions * quaternions).sum(-1)

    rotation_matrices = torch.stack(
        [
            1 - two_s * (y * y + z * z),
            two_s * (x * y - w * z),
            two_s * (x * z + w * y),
            two_s * (x * y + w * z),
            1 - two_s * (x * x + z * z),
            two_s * (y * z - w * x),
            two_s * (x * z - w * y),
            two_s * (y * z + w * x),
            1 - two_s * (x * x + y * y),
        ],
        dim=-1,
    )

    return rotation_matrices.reshape(quaternions.shape[:-1] + (3, 3))
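To sanity-check the conversion, the same formula can be evaluated by hand for a 90-degree rotation about +z — the quaternion ``(cos 45°, 0, 0, sin 45°)`` — which should map +x to +y:

```python
import math
import torch

# 90-degree rotation about +z as a (w, x, y, z) quaternion.
q = torch.tensor([math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)])
w, x, y, z = q.unbind(-1)
two_s = 2.0 / (q * q).sum(-1)  # equals 2 for a unit quaternion
R = torch.stack([
    1 - two_s * (y * y + z * z), two_s * (x * y - w * z), two_s * (x * z + w * y),
    two_s * (x * y + w * z), 1 - two_s * (x * x + z * z), two_s * (y * z - w * x),
    two_s * (x * z - w * y), two_s * (y * z + w * x), 1 - two_s * (x * x + y * y),
], dim=-1).reshape(3, 3)
rotated = R @ torch.tensor([1.0, 0.0, 0.0])
# rotated is approximately (0, 1, 0)
```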

rotate_points(point, angles=torch.zeros(1, 3), mode='XYZ', origin=torch.zeros(1, 3), offset=torch.zeros(1, 3))

Rotate a given point and return the result along with rotation matrices.

Note that rotation is performed about the given origin, which defaults to (0, 0, 0).

Parameters:

  • point (Tensor) –

    A point with size of [3] or [1, 3] or [m, 3].

  • angles (Tensor, default: zeros(1, 3) ) –

    Rotation angles in degrees.

  • mode (str, default: 'XYZ' ) –

    Rotation mode determines the ordering of rotations about each axis. Supported modes are XYZ, XZY, YXZ, ZXY and ZYX.

  • origin (Tensor, default: zeros(1, 3) ) –

    Reference point for a rotation. Expected size is [3] or [1, 3].

  • offset (Tensor, default: zeros(1, 3) ) –

    Shift with the given offset. Expected size is [3] or [1, 3] or [m, 3].

Returns:

  • tuple

    Result of the rotation [1 x 3] or [m x 3], and rotation matrices along each axis.

Source code in odak/learn/tools/transformation.py
def rotate_points(
    point,
    angles=torch.zeros(1, 3),
    mode="XYZ",
    origin=torch.zeros(1, 3),
    offset=torch.zeros(1, 3),
):
    """
    Rotate a given point and return the result along with rotation matrices.

    Note that rotation is performed about the given origin, which defaults to (0, 0, 0).

    Parameters
    ----------
    point : torch.Tensor
        A point with size of [3] or [1, 3] or [m, 3].
    angles : torch.Tensor
        Rotation angles in degrees.
    mode : str
        Rotation mode determines the ordering of rotations about each axis.
        Supported modes are XYZ, XZY, YXZ, ZXY and ZYX.
    origin : torch.Tensor
        Reference point for a rotation.
        Expected size is [3] or [1, 3].
    offset : torch.Tensor
        Shift with the given offset.
        Expected size is [3] or [1, 3] or [m, 3].

    Returns
    -------
    tuple
        Result of the rotation [1 x 3] or [m x 3], and rotation matrices along each axis.
    """
    origin = origin.to(point.device)
    offset = offset.to(point.device)
    angles = angles.to(point.device)

    if len(point.shape) == 1:
        point = point.unsqueeze(0)
    if len(angles.shape) == 1:
        angles = angles.unsqueeze(0)
    if len(origin.shape) == 1:
        origin = origin.unsqueeze(0)
    if len(offset.shape) == 1:
        offset = offset.unsqueeze(0)

    rotx = rotmatx(angles[:, 0]).unsqueeze(0)
    roty = rotmaty(angles[:, 1]).unsqueeze(0)
    rotz = rotmatz(angles[:, 2]).unsqueeze(0)

    new_points = (point.unsqueeze(1) - origin.unsqueeze(0)).unsqueeze(-1)

    if mode == "XYZ":
        result = rotz @ (roty @ (rotx @ new_points))
    elif mode == "XZY":
        result = roty @ (rotz @ (rotx @ new_points))
    elif mode == "YXZ":
        result = rotz @ (rotx @ (roty @ new_points))
    elif mode == "ZXY":
        result = roty @ (rotx @ (rotz @ new_points))
    elif mode == "ZYX":
        result = rotx @ (roty @ (rotz @ new_points))

    result = result.squeeze(-1)
    result = result + origin.unsqueeze(0)
    result = result + offset.unsqueeze(0)
    if result.shape[1] == 1:
        result = result.squeeze(1)
    return result, rotx, roty, rotz
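The origin handling above — translate to the rotation origin, rotate, translate back — can be illustrated with a hand-rolled z-rotation (the point, angle, and origin below are illustrative):

```python
import math
import torch

# Rotate (2, 1, 0) by 90 degrees about z, around the origin (1, 1, 0),
# mirroring rotate_points' translate-rotate-translate pipeline.
c, s = math.cos(math.radians(90.0)), math.sin(math.radians(90.0))
rotz = torch.tensor([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])
point = torch.tensor([2.0, 1.0, 0.0])
origin = torch.tensor([1.0, 1.0, 0.0])
result = rotz @ (point - origin) + origin
# result is approximately (1, 2, 0)
```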

rotmatx(angle)

Generate a rotation matrix along the X axis.

Parameters:

  • angle (Tensor) –

    Rotation angles in degrees.

Returns:

  • Tensor

    Rotation matrix along the X axis.

Source code in odak/learn/tools/transformation.py
def rotmatx(angle):
    """
    Generate a rotation matrix along the X axis.

    Parameters
    ----------
    angle : torch.Tensor
        Rotation angles in degrees.

    Returns
    -------
    torch.Tensor
        Rotation matrix along the X axis.
    """
    angle = torch.deg2rad(angle)
    rotx = torch.zeros(angle.shape[0], 3, 3, device=angle.device)
    rotx[:, 0, 0] = 1.0
    rotx[:, 1, 1] = torch.cos(angle)
    rotx[:, 1, 2] = -torch.sin(angle)
    rotx[:, 2, 1] = torch.sin(angle)
    rotx[:, 2, 2] = torch.cos(angle)
    if rotx.shape[0] == 1:
        rotx = rotx.squeeze(0)
    return rotx
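A quick check of the matrix layout above, using the same degree convention: a 90-degree X rotation should map +y to +z.

```python
import torch

# Build the X-axis rotation with the same entry layout as rotmatx.
angle = torch.deg2rad(torch.tensor([90.0]))
rotx = torch.zeros(angle.shape[0], 3, 3)
rotx[:, 0, 0] = 1.0
rotx[:, 1, 1] = torch.cos(angle)
rotx[:, 1, 2] = -torch.sin(angle)
rotx[:, 2, 1] = torch.sin(angle)
rotx[:, 2, 2] = torch.cos(angle)
rotated = rotx.squeeze(0) @ torch.tensor([0.0, 1.0, 0.0])
# rotated is approximately (0, 0, 1)
```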

rotmaty(angle)

Generate a rotation matrix along the Y axis.

Parameters:

  • angle (Tensor) –

    Rotation angles in degrees.

Returns:

  • Tensor

    Rotation matrix along the Y axis.

Source code in odak/learn/tools/transformation.py
def rotmaty(angle):
    """
    Generate a rotation matrix along the Y axis.

    Parameters
    ----------
    angle : torch.Tensor
        Rotation angles in degrees.

    Returns
    -------
    torch.Tensor
        Rotation matrix along the Y axis.
    """
    angle = torch.deg2rad(angle)
    roty = torch.zeros(angle.shape[0], 3, 3, device=angle.device)
    roty[:, 0, 0] = torch.cos(angle)
    roty[:, 0, 2] = torch.sin(angle)
    roty[:, 1, 1] = 1.0
    roty[:, 2, 0] = -torch.sin(angle)
    roty[:, 2, 2] = torch.cos(angle)
    if roty.shape[0] == 1:
        roty = roty.squeeze(0)
    return roty

rotmatz(angle)

Generate a rotation matrix along the Z axis.

Parameters:

  • angle (Tensor) –

    Rotation angles in degrees.

Returns:

  • Tensor

    Rotation matrix along the Z axis.

Source code in odak/learn/tools/transformation.py
def rotmatz(angle):
    """
    Generate a rotation matrix along the Z axis.

    Parameters
    ----------
    angle : torch.Tensor
        Rotation angles in degrees.

    Returns
    -------
    torch.Tensor
        Rotation matrix along the Z axis.
    """
    angle = torch.deg2rad(angle)
    rotz = torch.zeros(angle.shape[0], 3, 3, device=angle.device)
    rotz[:, 0, 0] = torch.cos(angle)
    rotz[:, 0, 1] = -torch.sin(angle)
    rotz[:, 1, 0] = torch.sin(angle)
    rotz[:, 1, 1] = torch.cos(angle)
    rotz[:, 2, 2] = 1.0
    if rotz.shape[0] == 1:
        rotz = rotz.squeeze(0)
    return rotz

tilt_towards(location, lookat)

Tilt surface normal of a plane towards a point.

Parameters:

  • location (list) –

    Center of the plane to be tilted.

  • lookat (list) –

    Tilt towards this point.

Returns:

  • list

    Rotation angles in degrees.

Source code in odak/learn/tools/transformation.py
def tilt_towards(location, lookat):
    """
    Tilt surface normal of a plane towards a point.

    Parameters
    ----------
    location : list
        Center of the plane to be tilted.
    lookat : list
        Tilt towards this point.

    Returns
    -------
    list
        Rotation angles in degrees.
    """
    dx = location[0] - lookat[0]
    dy = location[1] - lookat[1]
    dz = location[2] - lookat[2]
    dist = torch.sqrt(torch.tensor(dx**2 + dy**2 + dz**2))
    phi = torch.atan2(torch.tensor(dy), torch.tensor(dx))
    theta = torch.arccos(dz / dist)
    angles = [0, float(torch.rad2deg(theta)), float(torch.rad2deg(phi))]
    return angles
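For instance, tilting a plane at the origin towards a point on the +x axis yields a 90-degree polar angle and a 180-degree azimuth. The geometry below mirrors the function body; the input points are illustrative:

```python
import torch

location = [0.0, 0.0, 0.0]
lookat = [1.0, 0.0, 0.0]
dx, dy, dz = (location[i] - lookat[i] for i in range(3))
dist = torch.sqrt(torch.tensor(dx ** 2 + dy ** 2 + dz ** 2))
phi = torch.atan2(torch.tensor(dy), torch.tensor(dx))
theta = torch.arccos(torch.tensor(dz) / dist)
angles = [0, float(torch.rad2deg(theta)), float(torch.rad2deg(phi))]
# angles is approximately [0, 90, 180]
```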

cross_product(vector1, vector2)

Definition to cross product two vectors and return the resultant vector. Uses the method described at: http://en.wikipedia.org/wiki/Cross_product

Parameters:

  • vector1
           A vector/ray.
    
  • vector2
           A vector/ray.
    

Returns:

  • ray ( tensor ) –

    Array that contains starting points and cosines of a created ray.

Source code in odak/learn/tools/vector.py
def cross_product(vector1, vector2):
    """
    Definition to cross product two vectors and return the resultant vector. Uses the method described at: http://en.wikipedia.org/wiki/Cross_product

    Parameters
    ----------
    vector1      : torch.tensor
                   A vector/ray.
    vector2      : torch.tensor
                   A vector/ray.

    Returns
    ----------
    ray          : torch.tensor
                   Array that contains starting points and cosines of a created ray.
    """
    # torch.linalg.cross handles 1D inputs directly; stacking preserves the
    # [start, direction] ray layout without re-wrapping existing tensors.
    angle = torch.linalg.cross(vector1[1], vector2[1])
    ray = torch.stack([vector1[0], angle]).to(torch.float32)
    return ray
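A minimal sketch of the expected input layout, where each argument packs a start point and a direction (the values are illustrative): crossing +x with +y should give +z.

```python
import torch

vector1 = torch.tensor([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])  # [start, direction]
vector2 = torch.tensor([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
direction = torch.linalg.cross(vector1[1], vector2[1])
ray = torch.stack([vector1[0], direction])
# direction is (0, 0, 1): x cross y = z
```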

distance_between_two_points(point1, point2)

Definition to calculate distance between two given points.

Parameters:

  • point1
          First point in X,Y,Z.
    
  • point2
          Second point in X,Y,Z.
    

Returns:

  • distance ( Tensor ) –

    Distance in between given two points.

Source code in odak/learn/tools/vector.py
def distance_between_two_points(point1, point2):
    """
    Definition to calculate distance between two given points.

    Parameters
    ----------
    point1      : torch.Tensor
                  First point in X,Y,Z.
    point2      : torch.Tensor
                  Second point in X,Y,Z.

    Returns
    ----------
    distance    : torch.Tensor
                  Distance in between given two points.
    """
    point1 = torch.tensor(point1) if not isinstance(point1, torch.Tensor) else point1
    point2 = torch.tensor(point2) if not isinstance(point2, torch.Tensor) else point2

    if len(point1.shape) == 1 and len(point2.shape) == 1:
        distance = torch.sqrt(torch.sum((point1 - point2) ** 2))
    elif len(point1.shape) == 2 or len(point2.shape) == 2:
        distance = torch.sqrt(torch.sum((point1 - point2) ** 2, dim=-1))

    return distance
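For example, the Euclidean distance between (0, 0, 0) and (3, 4, 0) is 5:

```python
import torch

point1 = torch.tensor([0.0, 0.0, 0.0])
point2 = torch.tensor([3.0, 4.0, 0.0])
# Same formula as the single-point branch above.
distance = torch.sqrt(torch.sum((point1 - point2) ** 2))
# distance is 5.0
```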

same_side(p1, p2, a, b)

Definition to figure which side a point is on with respect to a line and a point. See http://www.blackpawn.com/texts/pointinpoly/ for more. If p1 and p2 are on the same side, this definition returns True.

Parameters:

  • p1
          Point(s) to check.
    
  • p2
          This is the point to check against.
    
  • a
          First point that forms the line.
    
  • b
          Second point that forms the line.
    

Returns:

  • bool or Tensor

    True if p1 and p2 are on the same side of the line through a and b.

Source code in odak/learn/tools/vector.py
def same_side(p1, p2, a, b):
    """
    Definition to figure which side a point is on with respect to a line and a point. See http://www.blackpawn.com/texts/pointinpoly/ for more. If p1 and p2 are on the same side, this definition returns True.

    Parameters
    ----------
    p1          : list
                  Point(s) to check.
    p2          : list
                  This is the point to check against.
    a           : list
                  First point that forms the line.
    b           : list
                  Second point that forms the line.

    Returns
    ----------
    side        : bool or torch.Tensor
                  True if p1 and p2 are on the same side of the line through a and b.
    """
    ba = torch.subtract(b, a)
    p1a = torch.subtract(p1, a)
    p2a = torch.subtract(p2, a)
    cp1 = torch.cross(ba, p1a)
    cp2 = torch.cross(ba, p2a)
    test = torch.dot(cp1, cp2)
    if len(p1.shape) > 1:
        return test >= 0
    if test >= 0:
        return True
    return False
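The core test can be sketched directly (the points below are illustrative): both points lie above the line through a and b, so the two cross products point the same way and their dot product is non-negative.

```python
import torch

a = torch.tensor([0.0, 0.0, 0.0])
b = torch.tensor([1.0, 0.0, 0.0])
p1 = torch.tensor([0.5, 1.0, 0.0])
p2 = torch.tensor([0.5, 2.0, 0.0])
ba = b - a
# Cross products of the line direction with each point offset.
cp1 = torch.linalg.cross(ba, p1 - a)
cp2 = torch.linalg.cross(ba, p2 - a)
same = bool(torch.dot(cp1, cp2) >= 0)
# same is True: both points sit on the same side of the line
```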