odak.learn.tools
Provides necessary definitions for general tools used across the library.
PerspectiveCamera
A lightweight perspective camera model.
Stores camera intrinsics and extrinsics and provides coordinate-transform utilities.
Parameters:
- R – Rotation matrix, shape ``(3, 3)`` or ``(1, 3, 3)``.
- T – Translation vector, shape ``(3,)`` or ``(1, 3)``.
- focal_length – Focal lengths ``(fx, fy)``, shape ``(2,)`` or ``(1, 2)``.
- principal_point – Principal point ``(px, py)``, shape ``(2,)`` or ``(1, 2)``.
- device – Device for all tensors (default: ``"cpu"``).
Source code in odak/learn/tools/camera.py
get_camera_center()
Compute the camera centre in world coordinates.
Returns:
- center (Tensor) – Camera centre, shape ``(1, 3)``.
Source code in odak/learn/tools/camera.py
transform_world_to_camera_space(points)
Transform world-space points into camera space.
Follows the convention: X_cam = X_world @ R + T.
Parameters:
- points (Tensor) – World-space points, shape ``(N, 3)``.
Returns:
- cam_points (Tensor) – Camera-space points, shape ``(N, 3)``.
Source code in odak/learn/tools/camera.py
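The convention above is enough to recover the camera centre: with an orthonormal rotation matrix, ``0 = C @ R + T`` gives ``C = -T @ R.T``. A minimal NumPy sketch of this relationship (an illustration of the documented convention, not odak's torch-based implementation):

```python
import numpy as np

def world_to_camera(points, R, T):
    """Map (N, 3) world-space points into camera space: X_cam = X_world @ R + T."""
    return points @ R + T

def camera_center(R, T):
    """Camera centre in world coordinates, assuming R is orthonormal (R^-1 = R.T)."""
    return -T @ R.T

theta = np.radians(90.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],   # rotation about Z
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
T = np.array([0.0, 0.0, -5.0])

C = camera_center(R, T)
# The centre itself must map to the camera-space origin.
assert np.allclose(world_to_camera(C[None, :], R, T), 0.0)
```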
blur_gaussian(field, kernel_length=[21, 21], nsigma=[3, 3], padding='same')
Blur a field using a Gaussian kernel.
This function applies Gaussian blur to the input field using convolution with a Gaussian kernel in the frequency domain.
Parameters:
- field – MxN field to be blurred.
- kernel_length (list, default: [21, 21]) – Length of the Gaussian kernel along X and Y axes.
- nsigma – Sigma of the Gaussian kernel along X and Y axes.
- padding – Padding value; see torch.nn.functional.conv2d() for more.
Returns:
- blurred_field (tensor) – Blurred field.
Source code in odak/learn/tools/matrix.py
center_of_triangle(triangle)
Definition to calculate the center of a triangle.
Parameters:
- triangle – An array that contains three points defining a triangle (Mx3). It can also process many triangles in parallel (NxMx3).
Source code in odak/raytracing/primitives.py
circular_binary_mask(px, py, r)
Generate a 2D circular binary mask.
Parameters:
- px (int) – Pixel count in x dimension.
- py (int) – Pixel count in y dimension.
- r (int or float) – Radius of the circle.
Returns:
- Tensor – Binary mask of shape [1, 1, px, py].
Source code in odak/learn/tools/mask.py
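The behaviour of such a mask is easy to sketch: a grid of pixel coordinates compared against a radius. The following hypothetical NumPy re-implementation illustrates the idea (odak's version returns a torch tensor of shape [1, 1, px, py]; the centring convention here is an assumption):

```python
import numpy as np

def circular_binary_mask(px, py, r):
    """Hypothetical sketch: a (px, py) grid with 1 inside a radius-r circle at the centre."""
    x = np.arange(px) - px / 2.0
    y = np.arange(py) - py / 2.0
    X, Y = np.meshgrid(x, y, indexing='ij')
    return (X ** 2 + Y ** 2 < r ** 2).astype(np.float32)

mask = circular_binary_mask(64, 64, 16)
assert mask.shape == (64, 64)
assert mask[32, 32] == 1.0 and mask[0, 0] == 0.0   # centre inside, corner outside
```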
convolve2d(field, kernel)
Convolve a field with a kernel using frequency domain multiplication.
This function performs 2D convolution by transforming both the field and kernel to the frequency domain, multiplying them, and transforming back to the spatial domain.
Parameters:
- field – Input field with MxN shape.
- kernel – Input kernel with MxN shape.
Returns:
- convolved_field (tensor) – Convolved field.
Source code in odak/learn/tools/matrix.py
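Frequency-domain convolution rests on the convolution theorem: multiply the two spectra, then transform back. A minimal NumPy sketch (circular convolution; odak's torch version may pad or shift differently):

```python
import numpy as np

def convolve2d_fft(field, kernel):
    """Circular 2D convolution via frequency-domain multiplication."""
    return np.real(np.fft.ifft2(np.fft.fft2(field) * np.fft.fft2(kernel)))

rng = np.random.default_rng(0)
field = rng.standard_normal((8, 8))
# A Dirac delta at the origin is the identity element of convolution.
delta = np.zeros((8, 8))
delta[0, 0] = 1.0
assert np.allclose(convolve2d_fft(field, delta), field)
```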
correlation_2d(first_tensor, second_tensor)
Calculate the correlation between two tensors using FFT.
This function computes the 2D correlation between two tensors using frequency domain multiplication. It is equivalent to computing cross-correlation using FFT techniques.
Parameters:
- first_tensor – First tensor.
- second_tensor (tensor) – Second tensor.
Returns:
- correlation (tensor) – Correlation between the two tensors.
Source code in odak/learn/tools/matrix.py
crop_center(field, size=None)
Crop the center of a field to a specified size or half of the current size.
This function crops the center of a field to either half of its current size (default) or a specified size. The input can be a 2D, 3D or 4D tensor.
Parameters:
- field – Input field 2M x 2N or K x L x 2M x 2N or K x 2M x 2N x L array.
- size – Dimensions to crop with respect to the center of the image (e.g., M x N or 1 x 1 x M x N). If None, crops to half of the current size.
Returns:
- cropped (tensor) – Cropped version of the input field.
Source code in odak/learn/tools/matrix.py
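Centre-cropping reduces to index arithmetic: offset by half the size difference, then slice. A 2D-only NumPy sketch of the documented behaviour (odak's version also handles 3D/4D tensors):

```python
import numpy as np

def crop_center(field, size=None):
    """Crop the central (m, n) window of a 2D array; defaults to half the input size."""
    M, N = field.shape
    m, n = (M // 2, N // 2) if size is None else size
    top, left = (M - m) // 2, (N - n) // 2
    return field[top:top + m, left:left + n]

field = np.arange(16).reshape(4, 4)
cropped = crop_center(field)
assert cropped.shape == (2, 2)
assert (cropped == np.array([[5, 6], [9, 10]])).all()   # central 2x2 block
```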
cross_product(vector1, vector2)
Definition to cross product two vectors and return the resultant vector. Uses the method described under: http://en.wikipedia.org/wiki/Cross_product
Parameters:
- vector1 – A vector/ray.
- vector2 – A vector/ray.
Returns:
- ray (tensor) – Array that contains starting points and cosines of a created ray.
Source code in odak/learn/tools/vector.py
distance_between_two_points(point1, point2)
Definition to calculate the distance between two given points.
Parameters:
- point1 – First point in X,Y,Z.
- point2 – Second point in X,Y,Z.
Returns:
- distance (Tensor) – Distance between the two given points.
Source code in odak/learn/tools/vector.py
evaluate_3d_gaussians(points, centers=torch.zeros(1, 3), scales=torch.ones(1, 3), angles=torch.zeros(1, 3), opacity=torch.ones(1, 1))
Evaluate 3D Gaussian functions at given points, with optional rotation.
Parameters:
- points – The 3D points at which to evaluate the Gaussians.
- centers – The centers of the Gaussians.
- scales – The standard deviations (spread) of the Gaussians along each axis.
- angles – The rotation angles (in radians) for each Gaussian, applied to the points.
- opacity – Opacity of the Gaussians.
Returns:
- intensities (Tensor, shape [n, 1]) – The evaluated Gaussian intensities at each point.
Source code in odak/learn/tools/function.py
freeze(model)
A utility function to freeze the parameters of a provided model.
This function sets requires_grad to False for all parameters in the model, effectively freezing them during training.
Parameters:
- model (Module) – Model whose parameters are to be frozen. This should be a PyTorch model instance.
Returns:
- None – The function modifies the model in-place.
Source code in odak/learn/tools/models.py
generate_2d_dirac_delta(kernel_length=[21, 21], a=[3, 3], mu=[0, 0], theta=0, normalize=False)
Generate a 2D Dirac delta function using a Gaussian approximation.
This function creates a 2D Dirac delta function by using a Gaussian distribution with very small standard deviations (a values) to approximate its behavior. Inspired by https://en.wikipedia.org/wiki/Dirac_delta_function
Parameters:
- kernel_length (list, default: [21, 21]) – Length of the Dirac delta function along X and Y axes.
- a – The scale factor in the Gaussian distribution used to approximate the Dirac delta function. As a approaches zero, the Gaussian becomes infinitely narrow and tall at the center (x=0), approaching the Dirac delta function.
- mu – Mu of the Gaussian kernel along X and Y axes.
- theta – The rotation angle of the 2D Dirac delta function.
- normalize – If set True, normalize the output to a maximum value of 1.
Returns:
- kernel_2d (tensor) – Generated 2D Dirac delta function.
Source code in odak/learn/tools/matrix.py
generate_2d_gaussian(kernel_length=[21, 21], nsigma=[3, 3], mu=[0, 0], normalize=False)
Generate a 2D Gaussian kernel.
This function creates a 2D Gaussian kernel with specified dimensions and parameters. Inspired by https://stackoverflow.com/questions/29731726/how-to-calculate-a-gaussian-kernel-matrix-efficiently-in-numpy
Parameters:
- kernel_length (list, default: [21, 21]) – Length of the Gaussian kernel along X and Y axes.
- nsigma – Sigma of the Gaussian kernel along X and Y axes.
- mu – Mu of the Gaussian kernel along X and Y axes.
- normalize – If set True, normalize the output to a maximum value of 1.
Returns:
- kernel_2d (tensor) – Generated Gaussian kernel.
Source code in odak/learn/tools/matrix.py
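A 2D Gaussian kernel is separable: sample a 1D Gaussian along each axis and take the outer product. The sketch below illustrates this in NumPy (note it normalizes the kernel to sum to one, whereas the documented ``normalize`` flag scales the maximum to 1; grid conventions are assumptions, not odak's exact implementation):

```python
import numpy as np

def gaussian_kernel_2d(kernel_length=(21, 21), nsigma=(3, 3), mu=(0, 0)):
    """Separable 2D Gaussian sampled on a symmetric grid, normalized to sum to 1."""
    x = np.linspace(-(kernel_length[0] - 1) / 2, (kernel_length[0] - 1) / 2, kernel_length[0])
    y = np.linspace(-(kernel_length[1] - 1) / 2, (kernel_length[1] - 1) / 2, kernel_length[1])
    gx = np.exp(-0.5 * ((x - mu[0]) / nsigma[0]) ** 2)
    gy = np.exp(-0.5 * ((y - mu[1]) / nsigma[1]) ** 2)
    kernel = np.outer(gx, gy)
    return kernel / kernel.sum()

k = gaussian_kernel_2d()
assert k.shape == (21, 21)
assert np.isclose(k.sum(), 1.0)
assert k[10, 10] == k.max()   # peak at the centre for mu = (0, 0)
```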
get_rotation_matrix(tilt_angles=[0.0, 0.0, 0.0], tilt_order='XYZ')
Generate a rotation matrix for given tilt angles and tilt order.
Parameters:
- tilt_angles (list, default: [0.0, 0.0, 0.0]) – Tilt angles in degrees along XYZ axes.
- tilt_order (str, default: 'XYZ') – Rotation order (e.g., XYZ, XZY, ZXY, YXZ, ZYX).
Returns:
- Tensor – Rotation matrix.
Source code in odak/learn/tools/transformation.py
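Composing per-axis rotations in a given order can be sketched as follows. This NumPy illustration assumes degrees (as documented) and a particular multiplication order, which is an assumption rather than a statement about odak's exact convention:

```python
import numpy as np

def rotmat(axis, angle_deg):
    """Single-axis rotation matrix for an angle given in degrees."""
    c, s = np.cos(np.radians(angle_deg)), np.sin(np.radians(angle_deg))
    if axis == 'X':
        return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])
    if axis == 'Y':
        return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def rotation_matrix(tilt_angles=(0.0, 0.0, 0.0), tilt_order='XYZ'):
    """Compose per-axis rotations in the given order."""
    R = np.eye(3)
    angle = dict(zip('XYZ', tilt_angles))
    for axis in tilt_order:
        R = R @ rotmat(axis, angle[axis])
    return R

R = rotation_matrix((0.0, 0.0, 90.0))
# Rotating the X unit vector by 90 degrees about Z yields the Y unit vector.
assert np.allclose(R @ np.array([1.0, 0.0, 0.0]), [0.0, 1.0, 0.0])
```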
grid_sample(no=[10, 10], size=[100.0, 100.0], center=[0.0, 0.0, 0.0], angles=[0.0, 0.0, 0.0])
Generate samples over a surface.
Parameters:
- no (list, default: [10, 10]) – Number of samples along each dimension.
- size (list, default: [100.0, 100.0]) – Physical size of the surface along each dimension.
- center (list, default: [0.0, 0.0, 0.0]) – Center location of the surface.
- angles (list, default: [0.0, 0.0, 0.0]) – Tilt angles of the surface around X, Y, and Z axes.
Returns:
- samples (tensor) – Generated samples.
- rotx (tensor) – Rotation matrix around X axis.
- roty (tensor) – Rotation matrix around Y axis.
- rotz (tensor) – Rotation matrix around Z axis.
Source code in odak/learn/tools/sample.py
histogram_loss(frame, ground_truth, bins=32, limits=[0.0, 1.0])
Calculates histogram loss between input frame and ground truth.
This function computes the MSE loss between histograms of the input frame and ground truth images, divided into the specified number of bins.
Parameters:
- frame (Tensor) – Input frame with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
- ground_truth (Tensor) – Ground truth with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
- bins (int, default: 32) – Number of bins for histogram calculation (default: 32).
- limits (list, default: [0.0, 1.0]) – Histogram limits as [min, max] (default: [0.0, 1.0]).
Returns:
- Tensor – Histogram loss value.
Source code in odak/learn/tools/loss.py
load_image(fn, normalizeby=0.0, torch_style=False)
Definition to load an image from a given location as a torch tensor.
Parameters:
- fn – Filename.
- normalizeby – Value to normalize images with. The default value of zero leads to no normalization.
- torch_style – If set True, loads an mxnx3 image as 3xmxn.
Returns:
- image (tensor) – Image loaded as a torch tensor.
Source code in odak/learn/tools/file.py
load_voxelized_PLY(ply_filename, voxel_size=[0.05, 0.05, 0.05], device=torch.device('cpu'))
Load a point cloud from a PLY file and convert it into a voxel grid representation.
Parameters:
- ply_filename (str or Path) – The path to the input PLY file containing triangle data.
- voxel_size (list or tuple, shape (3), default: [0.05, 0.05, 0.05]) – The size of each voxel in the x, y, and z directions. Default is [0.05, 0.05, 0.05].
- device (device, default: device('cpu')) – The device on which to perform computations. Default is CPU.
Returns:
- points (Tensor, shape (N, 3)) – A tensor containing the coordinates of the voxel centers.
- ground_truth (Tensor, shape (Gx * Gy * Gz)) – A binary tensor where each element indicates whether a corresponding voxel contains at least one point.
Notes
- The function reads triangle data from the PLY file and computes the center points of these triangles.
- These points are then processed to create a normalized point cloud, which is converted into a voxel grid.
- Only voxels containing at least one point are marked as 1 in ground_truth.
- All operations are performed on the specified device for efficiency.
Source code in odak/learn/tools/transformation.py
michelson_contrast(image, roi_high, roi_low)
Calculates Michelson contrast ratio for given regions of an image.
This function computes the Michelson contrast ratio for high and low intensity regions using the formula: (mean_high - mean_low) / (mean_high + mean_low).
Parameters:
- image (Tensor) – Input image with shape [1 x 3 x m x n], [3 x m x n], or [m x n].
- roi_high (Tensor) – Corner locations of the high intensity region [m_start, m_end, n_start, n_end].
- roi_low (Tensor) – Corner locations of the low intensity region [m_start, m_end, n_start, n_end].
Returns:
- Tensor – Michelson contrast for the given regions. Shape is [1] or [3] depending on input.
Source code in odak/learn/tools/loss.py
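The Michelson formula above can be demonstrated directly. A 2D-only NumPy sketch of the documented formula (odak's version additionally handles channel dimensions):

```python
import numpy as np

def michelson_contrast(image, roi_high, roi_low):
    """Michelson contrast between two rectangular regions of a 2D image:
    (mean_high - mean_low) / (mean_high + mean_low)."""
    m0, m1, n0, n1 = roi_high
    high = image[m0:m1, n0:n1].mean()
    m0, m1, n0, n1 = roi_low
    low = image[m0:m1, n0:n1].mean()
    return (high - low) / (high + low)

image = np.zeros((8, 8))
image[:4, :] = 0.9   # bright upper half
image[4:, :] = 0.1   # dark lower half
c = michelson_contrast(image, (0, 4, 0, 8), (4, 8, 0, 8))
assert np.isclose(c, 0.8)   # (0.9 - 0.1) / (0.9 + 0.1)
```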
multi_scale_total_variation_loss(frame, levels=3)
Calculates multi-scale total variation loss for an input frame.
This function computes the total variation loss at multiple scales by creating an image pyramid where each level has half the resolution of the previous level.
Parameters:
- frame (Tensor) – Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].
- levels (int, default: 3) – Number of scales in the image pyramid (default: 3).
Returns:
- Tensor – Total variation loss value.
Source code in odak/learn/tools/loss.py
point_cloud_to_voxel(points, voxel_size=[0.1, 0.1, 0.1])
Convert a point cloud to a voxel grid representation.
Parameters:
- points (Tensor, shape (N, 3)) – The input point cloud, where each row is a 3D point.
- voxel_size (list or Tensor, shape (3), default: [0.1, 0.1, 0.1]) – The size of each voxel in the x, y, and z directions. Default is [0.1, 0.1, 0.1].
Returns:
- locations (Tensor, shape (Gx, Gy, Gz, 3)) – The coordinates of each voxel center in the grid.
- grid (Tensor, shape (Gx, Gy, Gz)) – A binary voxel grid where 1 indicates the presence of at least one point.
Notes
- The voxel grid is constructed by discretizing the space between the minimum and maximum coordinates of the point cloud.
- Only voxels containing at least one point are marked as 1.
- The output grid is of type float32 and resides on the same device as the input points.
Source code in odak/learn/tools/transformation.py
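The discretization described in the notes can be sketched as follows. This hypothetical NumPy version only returns the occupancy grid, and its grid-sizing and boundary-clamping choices are assumptions, not odak's exact behaviour:

```python
import numpy as np

def point_cloud_to_voxel(points, voxel_size=(0.1, 0.1, 0.1)):
    """Binary occupancy grid over the bounding box of the points."""
    mins = points.min(axis=0)
    maxs = points.max(axis=0)
    dims = np.maximum(np.ceil((maxs - mins) / np.asarray(voxel_size)).astype(int), 1)
    grid = np.zeros(dims, dtype=np.float32)
    # Assign each point to a voxel index, clamping onto the last voxel.
    idx = np.minimum(((points - mins) / voxel_size).astype(int), dims - 1)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid

points = np.array([[0.0, 0.0, 0.0], [0.95, 0.95, 0.95]])
grid = point_cloud_to_voxel(points)
assert grid.shape == (10, 10, 10)
assert grid.sum() == 2.0   # two occupied voxels
```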
quantize(image_field, bits=8, limits=[0.0, 1.0])
Quantize an image field to a specified number of bits.
This function maps the input image field from its original range to a quantized representation with the specified number of bits.
Parameters:
- image_field (tensor) – Input image field in any range.
- bits – Number of bits for quantization (1-8).
- limits – The minimum and maximum of the image_field variable.
Returns:
- quantized_field (tensor) – Quantized image field.
Source code in odak/learn/tools/matrix.py
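The mapping itself is simple: normalize the input into [0, 1] using the given limits, then round onto 2^bits - 1 levels. A hypothetical NumPy sketch of this mapping (clipping and rounding choices are assumptions; odak operates on torch tensors):

```python
import numpy as np

def quantize(field, bits=8, limits=(0.0, 1.0)):
    """Map values in [limits[0], limits[1]] onto 2**bits integer levels (bits <= 8)."""
    levels = 2 ** bits - 1
    normalized = (np.asarray(field) - limits[0]) / (limits[1] - limits[0])
    return np.round(np.clip(normalized, 0.0, 1.0) * levels).astype(np.uint8)

field = np.array([0.0, 0.5, 1.0])
assert (quantize(field) == np.array([0, 128, 255])).all()
```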
quaternion_to_rotation_matrix(quaternions)
Convert rotations given as unit quaternions to rotation matrices.
Parameters:
- quaternions (Tensor) – Quaternions with real part first, shape ``(*, 4)`` in ``(w, x, y, z)`` convention.
Returns:
- rotation_matrices (Tensor) – Rotation matrices, shape ``(*, 3, 3)``.
Source code in odak/learn/tools/transformation.py
radial_basis_function(value, epsilon=0.5)
Applies a radial basis function with a Gaussian description to input values.
This function applies the Gaussian radial basis function: y = e^(-ε² * x²)
Parameters:
- value (Tensor) – Value(s) to pass to the radial basis function.
- epsilon (float, default: 0.5) – Epsilon parameter used in the Gaussian radial basis function (default: 0.5).
Returns:
- Tensor – Output values after applying the radial basis function.
Source code in odak/learn/tools/loss.py
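The formula y = e^(-ε² x²) translates directly into code. A one-line NumPy sketch of the documented formula:

```python
import numpy as np

def radial_basis_function(value, epsilon=0.5):
    """Gaussian RBF: y = exp(-(epsilon ** 2) * (value ** 2))."""
    return np.exp(-(epsilon ** 2) * np.asarray(value) ** 2)

assert np.isclose(radial_basis_function(0.0), 1.0)           # peak of 1 at the origin
assert np.isclose(radial_basis_function(2.0), np.exp(-1.0))  # 0.5**2 * 2**2 = 1
```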
read_PLY(fn, offset=[0, 0, 0], angles=[0.0, 0.0, 0.0], mode='XYZ')
Definition to read a PLY file and extract meshes from it. Note that rotation is always with respect to 0,0,0.
Parameters:
- fn – Filename of a PLY file.
- offset – Offset in X,Y,Z.
- angles – Rotation angles in degrees.
- mode – Rotation mode determines the ordering of the rotations at each axis. There are XYZ, YXZ, ZXY and ZYX modes.
Returns:
- triangles (ndarray) – Triangles from the given PLY file. Note that the triangles coming out of this function aren't always structured in the right order, with a size of (MxN)x3. You can use numpy's reshape to restructure them to mxnx3 if you know what you are doing.
Raises:
- ValueError – If path validation fails or the extension is not allowed.
- TypeError – If fn is not a string.
Source code in odak/tools/asset.py
resize(image, multiplier=0.5, mode='nearest')
Definition to resize an image.
Parameters:
- image – Image with MxNx3 resolution.
- multiplier – Multiplier used in the resizing operation (e.g., 0.5 is half size along one axis).
- mode – Mode to be used in scaling: nearest, bilinear, etc.
Returns:
- new_image (tensor) – Resized image.
Source code in odak/learn/tools/file.py
rotate_points(point, angles=torch.zeros(1, 3), mode='XYZ', origin=torch.zeros(1, 3), offset=torch.zeros(1, 3))
Rotate a given point and return the result along with rotation matrices. Note that rotation is always with respect to 0,0,0.
Parameters:
- point (Tensor) – A point with size of [3] or [1, 3] or [m, 3].
- angles (Tensor, default: zeros(1, 3)) – Rotation angles in degrees.
- mode (str, default: 'XYZ') – Rotation mode determines the ordering of the rotations at each axis. There are XYZ, YXZ, ZXY and ZYX modes.
- origin (Tensor, default: zeros(1, 3)) – Reference point for a rotation. Expected size is [3] or [1, 3].
- offset (Tensor, default: zeros(1, 3)) – Shift with the given offset. Expected size is [3] or [1, 3] or [m, 3].
Returns:
- tuple – Result of the rotation [1 x 3] or [m x 3], and rotation matrices along each axis.
Source code in odak/learn/tools/transformation.py
rotmatx(angle)
Generate a rotation matrix about the X axis.
Parameters:
- angle (Tensor) – Rotation angle in degrees.
Returns:
- Tensor – Rotation matrix about the X axis.
Source code in odak/learn/tools/transformation.py
rotmaty(angle)
Generate a rotation matrix about the Y axis.
Parameters:
- angle (Tensor) – Rotation angle in degrees.
Returns:
- Tensor – Rotation matrix about the Y axis.
Source code in odak/learn/tools/transformation.py
rotmatz(angle)
Generate a rotation matrix about the Z axis.
Parameters:
- angle (Tensor) – Rotation angle in degrees.
Returns:
- Tensor – Rotation matrix about the Z axis.
Source code in odak/learn/tools/transformation.py
same_side(p1, p2, a, b)
Definition to figure out which side a point is on with respect to a line and a point. See http://www.blackpawn.com/texts/pointinpoly/ for more. If p1 and p2 are on the same side, this definition returns True.
Parameters:
- p1 – Point(s) to check.
- p2 – The point to check against.
- a – First point that forms the line.
- b – Second point that forms the line.
Source code in odak/learn/tools/vector.py
save_image(fn, img, cmin=0, cmax=255, color_depth=8)
Definition to save a torch tensor as an image.
Parameters:
- fn – Filename.
- img – A torch tensor with NxMx3 or NxMx1 shape.
- cmin – Minimum value that will be interpreted as the 0 level in the final image.
- cmax – Maximum value that will be interpreted as the 255 level in the final image.
- color_depth – Color depth of the image. Default is eight.
Returns:
- bool (bool) – True if successful.
Source code in odak/learn/tools/file.py
save_torch_tensor(fn, tensor)
Definition to save a torch tensor or dictionary.
Parameters:
- fn – Filename.
- tensor – Torch tensor or dictionary to be saved.
Raises:
- ValueError – If path validation fails or the extension is not allowed.
Source code in odak/learn/tools/file.py
spatial_gradient(frame)
Calculates the spatial gradient of a given frame.
This function computes the gradient of the input frame in both x and y directions by differencing adjacent pixels.
Parameters:
- frame (Tensor) – Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].
Returns:
- tuple – Tuple of (diff_x, diff_y) representing spatial gradients along x and y axes.
Source code in odak/learn/tools/loss.py
tilt_towards(location, lookat)
Tilt the surface normal of a plane towards a point.
Parameters:
- location (list) – Center of the plane to be tilted.
- lookat (list) – Tilt towards this point.
Returns:
- list – Rotation angles in degrees.
Source code in odak/learn/tools/transformation.py
tools_load_image(fn, normalizeby=0.0, torch_style=False)
Definition to load an image from a given location as a Numpy array.
Parameters:
- fn – Filename.
- normalizeby – Value to normalize images with. The default value of zero leads to no normalization.
- torch_style – If set True, loads an mxnx3 image as 3xmxn.
Returns:
- image (ndarray) – Image loaded as a Numpy array.
Source code in odak/tools/file.py
tools_save_image(fn, img, cmin=0, cmax=255, color_depth=8)
Definition to save a Numpy array as an image.
Parameters:
- fn – Filename.
- img – A numpy array with NxMx3 or NxMx1 shape.
- cmin – Minimum value that will be interpreted as the 0 level in the final image.
- cmax – Maximum value that will be interpreted as the 255 level in the final image.
- color_depth – Pixel color depth in bits; default is eight bits.
Returns:
- bool (bool) – True if successful.
Source code in odak/tools/file.py
torch_load(fn, weights_only=True, map_location=None)
Definition to load a torch file (*.pt).
Parameters:
- fn – Filename.
- weights_only (bool, default: True) – See torch.load() for details.
- map_location (str, default: None) – The device location to place data on (e.g., `cuda`, `cpu`, etc.). The default is None.
Returns:
- data (any) – See torch.load() for more.
Raises:
- ValueError – If path validation fails or unsafe characters are detected.
Source code in odak/learn/tools/file.py
total_variation_loss(frame)
Calculates total variation loss for an input frame.
This function computes the total variation loss by calculating spatial gradients in both x and y directions and averaging their squared values.
Parameters:
- frame (Tensor) – Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].
Returns:
- Tensor – Total variation loss value.
Source code in odak/learn/tools/loss.py
unfreeze(model)
A utility function to unfreeze the parameters of a provided model.
This function sets requires_grad to True for all parameters in the model, allowing them to be updated during training.
Parameters:
- model (Module) – Model whose parameters are to be unfrozen. This should be a PyTorch model instance.
Returns:
- None – The function modifies the model in-place.
Source code in odak/learn/tools/models.py
validate_path(path, allowed_extensions=None)
Validates a file path for security safety.
Parameters:
- path – Path to validate.
- allowed_extensions (list, default: None) – List of allowed extensions (e.g., ['.png', '.jpg']). If None, all extensions are allowed.
Returns:
- safe_path (str) – The validated and secured path (with tilde expanded).
Raises:
- ValueError – If a path traversal attempt is detected or the extension is not allowed.
- TypeError – If path is not a string.
Source code in odak/tools/file.py
weber_contrast(image, roi_high, roi_low)
Calculates Weber contrast ratio for given regions of an image.
This function computes the Weber contrast ratio for high and low intensity regions using the formula: (mean_high - mean_low) / mean_low.
Parameters:
- image (Tensor) – Input image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
- roi_high (Tensor) – Corner locations of the high intensity region [m_start, m_end, n_start, n_end].
- roi_low (Tensor) – Corner locations of the low intensity region [m_start, m_end, n_start, n_end].
Returns:
- Tensor – Weber contrast for the given regions. Shape is [1] or [3] depending on input.
Source code in odak/learn/tools/loss.py
wrapped_mean_squared_error(image, ground_truth, reduction='mean')
Calculates wrapped mean squared error between predicted and target angles.
This function computes the mean squared error for angular data, accounting for the wrap-around property of angles (e.g., 359° and 1° are close).
Parameters:
- image (Tensor) – Predicted image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
- ground_truth (Tensor) – Ground truth image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
- reduction (str, default: 'mean') – Specifies the reduction to apply to the output: 'mean' (default) or 'sum'.
Returns:
- Tensor – The calculated wrapped mean squared error.
Raises:
- ValueError – If an invalid reduction type is specified.
Source code in odak/learn/tools/loss.py
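Wrapping means squaring the smallest angular difference rather than the raw difference. One standard way to wrap into (-π, π] is via the complex exponential; the NumPy sketch below illustrates the idea (odak's exact wrapping scheme may differ):

```python
import numpy as np

def wrapped_mse(pred, target, reduction='mean'):
    """MSE on angles in radians, wrapping differences into (-pi, pi]."""
    diff = np.angle(np.exp(1j * (np.asarray(pred) - np.asarray(target))))
    sq = diff ** 2
    if reduction == 'mean':
        return sq.mean()
    if reduction == 'sum':
        return sq.sum()
    raise ValueError(f'Unknown reduction: {reduction}')

a = np.array([0.01])
b = np.array([2.0 * np.pi - 0.01])   # the "359 degrees vs 1 degree" case
# Naive MSE would be huge; the wrapped error sees only a 0.02 rad gap.
assert np.isclose(wrapped_mse(a, b), 0.02 ** 2)
```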
zernike_polynomial(n, m, rho, theta)
Compute the 2D Zernike polynomial Z_n^m(rho, theta).
Parameters:
- n – Radial degree of the polynomial (n >= 0).
- m – Azimuthal frequency of the polynomial. Must satisfy |m| <= n and (n - |m|) % 2 == 0.
- rho – Radial distance from the origin (0 to 1). Shape (H, W).
- theta – Azimuthal angle in radians. Shape (H, W).
Returns:
- zernike (Tensor) – The computed 2D Zernike polynomial. Values are zero where rho > 1.
Source code in odak/learn/tools/function.py
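For reference, the standard textbook form of Z_n^m combines a radial polynomial with a cosine or sine angular term. The NumPy sketch below follows that standard definition; it is an independent illustration, and details such as the sign convention for negative m are assumptions about, not statements of, odak's implementation:

```python
import numpy as np
from math import factorial

def zernike(n, m, rho, theta):
    """Zernike polynomial Z_n^m on the unit disc (zero for rho > 1)."""
    assert abs(m) <= n and (n - abs(m)) % 2 == 0
    # Radial polynomial R_n^{|m|}(rho).
    R = np.zeros_like(np.asarray(rho, dtype=float))
    for k in range((n - abs(m)) // 2 + 1):
        c = ((-1) ** k * factorial(n - k)
             / (factorial(k)
                * factorial((n + abs(m)) // 2 - k)
                * factorial((n - abs(m)) // 2 - k)))
        R = R + c * rho ** (n - 2 * k)
    angular = np.cos(m * theta) if m >= 0 else np.sin(-m * theta)
    return np.where(rho <= 1.0, R * angular, 0.0)

rho = np.array([0.0, 0.5, 1.0])
theta = np.zeros(3)
# Defocus term: Z_2^0 = 2 rho^2 - 1.
assert np.allclose(zernike(2, 0, rho, theta), 2 * rho ** 2 - 1)
```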
zero_pad(field, size=None, method='center')
Zero pad a field to double its size or a specified size.
This function pads a field with zeros to either double its size (default) or a specified size. The input can be a 2D, 3D or 4D tensor.
Parameters:
- field – Input field MxN or KxJxMxN or KxMxNxJ array.
- size – Size to be zero padded to (e.g., [m, n], last two dimensions only). If None, doubles the last two dimensions.
- method – Zero pad either by placing the content at the center or to the left.
Returns:
- field_zero_padded (tensor) – Zero padded version of the input field.
Source code in odak/learn/tools/matrix.py
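Zero padding is the inverse companion of crop_center: embed the field into a larger zero array, centred or at the top-left. A 2D-only NumPy sketch of the documented behaviour:

```python
import numpy as np

def zero_pad(field, size=None, method='center'):
    """Zero-pad a 2D field to `size` (defaults to double the input size)."""
    M, N = field.shape
    m, n = (2 * M, 2 * N) if size is None else size
    padded = np.zeros((m, n), dtype=field.dtype)
    top, left = ((m - M) // 2, (n - N) // 2) if method == 'center' else (0, 0)
    padded[top:top + M, left:left + N] = field
    return padded

field = np.ones((2, 2))
padded = zero_pad(field)
assert padded.shape == (4, 4)
assert padded.sum() == 4.0          # content preserved
assert padded[1, 1] == 1.0 and padded[0, 0] == 0.0
```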
load_image(fn, normalizeby=0.0, torch_style=False)
¶
Definition to load an image from a given location as a torch tensor.
Parameters:
-
fn–Filename. -
normalizeby–Value to to normalize images with. Default value of zero will lead to no normalization. -
torch_style–If set True, it will load an image mxnx3 as 3xmxn.
Returns:
-
image(tensor) –Image loaded as a torch tensor.
Source code in odak/learn/tools/file.py
resize(image, multiplier=0.5, mode='nearest')
¶
Definition to resize an image.
Parameters:
-
image–Image with MxNx3 resolution. -
multiplier–Multiplier used in resizing operation (e.g., 0.5 is half size in one axis). -
mode–Mode to be used in scaling, nearest, bilinear, etc.
Returns:
-
new_image(tensor) –Resized image.
Source code in odak/learn/tools/file.py
save_image(fn, img, cmin=0, cmax=255, color_depth=8)
¶
Definition to save a torch tensor as an image.
Parameters:
-
fn–Filename. -
img–A torch tensor with NxMx3 or NxMx1 shapes. -
cmin–Minimum value that will be interpreted as 0 level in the final image. -
cmax–Maximum value that will be interpreted as 255 level in the final image. -
color_depth–Color depth of an image. Default is eight.
Returns:
-
bool(bool) –True if successful.
Source code in odak/learn/tools/file.py
save_torch_tensor(fn, tensor)
¶
Definition to save a torch tensor or dictionary.
Parameters:
-
fn–Filename. -
tensor–Torch tensor or dictionary to be saved.
Raises:
-
ValueError : If path validation fails or extension is not allowed.–
Source code in odak/learn/tools/file.py
torch_load(fn, weights_only=True, map_location=None)
¶
Definition to load a torch files (*.pt).
Parameters:
-
fn–Filename. -
weights_only(bool, default:True) –See torch.load() for details. -
map_location(str, default:None) –The device location to place data (e.g., `cuda`, `cpu`, etc.). The default is None.
Returns:
-
data(any) –See torch.load() for more.
Raises:
-
ValueError : If path validation fails or unsafe characters detected.–
Source code in odak/learn/tools/file.py
histogram_loss(frame, ground_truth, bins=32, limits=[0.0, 1.0])
¶
Calculates histogram loss between input frame and ground truth.
This function computes the MSE loss between histograms of the input frame and ground truth images, divided into specified number of bins.
Parameters:
-
frame(Tensor) –Input frame with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
-
ground_truth(Tensor) –Ground truth with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
-
bins(int, default:32) –Number of bins for histogram calculation (default: 32).
-
limits(list, default:[0.0, 1.0]) –Histogram limits as [min, max] (default: [0.0, 1.0]).
Returns:
-
Tensor–Histogram loss value.
Source code in odak/learn/tools/loss.py
michelson_contrast(image, roi_high, roi_low)
¶
Calculates Michelson contrast ratio for given regions of an image.
This function computes the Michelson contrast ratio for high and low intensity regions using the formula: (mean_high - mean_low) / (mean_high + mean_low).
Parameters:
-
image(Tensor) –Input image with shape [1 x 3 x m x n], [3 x m x n], or [m x n].
-
roi_high(Tensor) –Corner locations of the high intensity region [m_start, m_end, n_start, n_end].
-
roi_low(Tensor) –Corner locations of the low intensity region [m_start, m_end, n_start, n_end].
Returns:
-
Tensor–Michelson contrast for the given regions. Shape is [1] or [3] depending on input.
Source code in odak/learn/tools/loss.py
multi_scale_total_variation_loss(frame, levels=3)
¶
Calculates multi-scale total variation loss for an input frame.
This function computes the total variation loss at multiple scales by creating an image pyramid where each level has half the resolution of the previous level.
Parameters:
-
frame(Tensor) –Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].
-
levels(int, default:3) –Number of scales in the image pyramid (default: 3).
Returns:
-
Tensor–Total variation loss value.
Source code in odak/learn/tools/loss.py
radial_basis_function(value, epsilon=0.5)
¶
Applies radial basis function with Gaussian description to input values.
This function applies the Gaussian radial basis function: y = e^(-ε² * x²)
Parameters:
-
value(Tensor) –Value(s) to pass to the radial basis function.
-
epsilon(float, default:0.5) –Epsilon parameter used in the Gaussian radial basis function (default: 0.5).
Returns:
-
Tensor–Output values after applying the radial basis function.
Source code in odak/learn/tools/loss.py
spatial_gradient(frame)
¶
Calculates the spatial gradient of a given frame.
This function computes the gradient of the input frame in both x and y directions by differencing adjacent pixels.
Parameters:
-
frame(Tensor) –Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].
Returns:
-
tuple–Tuple of (diff_x, diff_y) representing spatial gradients along x and y axes.
Source code in odak/learn/tools/loss.py
total_variation_loss(frame)
¶
Calculates total variation loss for an input frame.
This function computes the total variation loss by calculating spatial gradients in both x and y directions and averaging their squared values.
Parameters:
-
frame(Tensor) –Input frame with shape [1 x 3 x m x n], [3 x m x n], or [m x n].
Returns:
-
Tensor–Total variation loss value.
Source code in odak/learn/tools/loss.py
weber_contrast(image, roi_high, roi_low)
¶
Calculates Weber contrast ratio for given regions of an image.
This function computes the Weber contrast ratio for high and low intensity regions using the formula: (mean_high - mean_low) / mean_low.
Parameters:
-
image(Tensor) –Input image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
-
roi_high(Tensor) –Corner locations of the high intensity region [m_start, m_end, n_start, n_end].
-
roi_low(Tensor) –Corner locations of the low intensity region [m_start, m_end, n_start, n_end].
Returns:
-
Tensor–Weber contrast for the given regions. Shape is [1] or [3] depending on input.
Source code in odak/learn/tools/loss.py
wrapped_mean_squared_error(image, ground_truth, reduction='mean')
¶
Calculates wrapped mean squared error between predicted and target angles.
This function computes the mean squared error for angular data, accounting for the wrap-around property of angles (e.g., 359° and 1° are close).
Parameters:
-
image(Tensor) –Predicted image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
-
ground_truth(Tensor) –Ground truth image with shape [1 x 3 x m x n], [3 x m x n], [1 x m x n], or [m x n].
-
reduction(str, default:'mean') –Specifies the reduction to apply to the output: 'mean' (default) or 'sum'.
Returns:
-
Tensor–The calculated wrapped mean squared error.
Raises:
-
ValueError–If an invalid reduction type is specified.
Source code in odak/learn/tools/loss.py
blur_gaussian(field, kernel_length=[21, 21], nsigma=[3, 3], padding='same')
¶
Blur a field using a Gaussian kernel.
This function applies Gaussian blur to the input field using convolution with a Gaussian kernel in the frequency domain.
Parameters:
-
field–MxN field to be blurred. -
kernel_length(list, default:[21, 21]) –Length of the Gaussian kernel along X and Y axes. -
nsigma–Sigma of the Gaussian kernel along X and Y axes. -
padding–Padding value, see torch.nn.functional.conv2d() for more.
Returns:
-
blurred_field(tensor) –Blurred field.
Source code in odak/learn/tools/matrix.py
convolve2d(field, kernel)
¶
Convolve a field with a kernel using frequency domain multiplication.
This function performs 2D convolution by transforming both the field and kernel to frequency domain, multiplying them, and transforming back to spatial domain.
Parameters:
-
field–Input field with MxN shape. -
kernel–Input kernel with MxN shape.
Returns:
-
convolved_field(tensor) –Convolved field.
Source code in odak/learn/tools/matrix.py
correlation_2d(first_tensor, second_tensor)
¶
Calculate the correlation between two tensors using FFT.
This function computes the 2D correlation between two tensors using frequency domain multiplication. It's equivalent to computing cross-correlation using FFT techniques.
Parameters:
-
first_tensor(tensor) –First tensor.
-
second_tensor(tensor) –Second tensor.
Returns:
-
correlation(tensor) –Correlation between the two tensors.
Source code in odak/learn/tools/matrix.py
crop_center(field, size=None)
¶
Crop the center of a field to specified size or half of current size.
This function crops the center of a field to either half of its current size (default) or to a specified size. The input can be 2D, 3D or 4D tensors.
Parameters:
-
field–Input field 2M x 2N or K x L x 2M x 2N or K x 2M x 2N x L array.
-
size–Dimensions to crop with respect to center of the image (e.g., M x N or 1 x 1 x M x N). If None, crops to half of the current size.
Returns:
-
cropped(tensor) –Cropped version of the input field.
Source code in odak/learn/tools/matrix.py
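The default half-size behaviour can be sketched as follows; `crop_center` here is a minimal re-implementation for illustration, not the library code:

```python
import torch

def crop_center(field, size = None):
    # Default behaviour: crop to half of the current height and width.
    m, n = field.shape[-2:]
    new_m, new_n = (m // 2, n // 2) if size is None else size
    start_m = (m - new_m) // 2
    start_n = (n - new_n) // 2
    return field[..., start_m:start_m + new_m, start_n:start_n + new_n]

field = torch.arange(64.0).reshape(8, 8)
cropped = crop_center(field)  # keeps rows/columns 2..5 of the 8 x 8 input
```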
generate_2d_dirac_delta(kernel_length=[21, 21], a=[3, 3], mu=[0, 0], theta=0, normalize=False)
¶
Generate 2D Dirac delta function using Gaussian approximation.
This function creates a 2D Dirac delta function by using a Gaussian distribution with very small standard deviations (a values) to approximate the behavior. Inspired by https://en.wikipedia.org/wiki/Dirac_delta_function
Parameters:
-
kernel_length(list, default:[21, 21]) –Length of the Dirac delta function along X and Y axes.
-
a(list, default:[3, 3]) –The scale factor in the Gaussian distribution used to approximate the Dirac delta function. As a approaches zero, the Gaussian distribution becomes infinitely narrow and tall at the center (x=0), approaching the Dirac delta function.
-
mu(list, default:[0, 0]) –Mu of the Gaussian kernel along X and Y axes.
-
theta(default:0) –The rotation angle of the 2D Dirac delta function.
-
normalize(default:False) –If set True, normalize the output to a maximum value of 1.
Returns:
-
kernel_2d(tensor) –Generated 2D Dirac delta function.
Source code in odak/learn/tools/matrix.py
generate_2d_gaussian(kernel_length=[21, 21], nsigma=[3, 3], mu=[0, 0], normalize=False)
¶
Generate 2D Gaussian kernel.
This function creates a 2D Gaussian kernel with specified dimensions and parameters. Inspired by https://stackoverflow.com/questions/29731726/how-to-calculate-a-gaussian-kernel-matrix-efficiently-in-numpy
Parameters:
-
kernel_length(list, default:[21, 21]) –Length of the Gaussian kernel along X and Y axes.
-
nsigma(list, default:[3, 3]) –Sigma of the Gaussian kernel along X and Y axes.
-
mu(list, default:[0, 0]) –Mu of the Gaussian kernel along X and Y axes.
-
normalize(default:False) –If set True, normalize the output to a maximum value of 1.
Returns:
-
kernel_2d(tensor) –Generated Gaussian kernel.
Source code in odak/learn/tools/matrix.py
quantize(image_field, bits=8, limits=[0.0, 1.0])
¶
Quantize an image field to a specified number of bits.
This function maps the input image field from its original range to a quantized representation with the specified number of bits.
Parameters:
-
image_field(tensor) –Input image field in any range.
-
bits(default:8) –Number of bits for quantization (1-8).
-
limits(list, default:[0.0, 1.0]) –The minimum and maximum of the image_field variable.
Returns:
-
quantized_field(tensor) –Quantized image field.
Source code in odak/learn/tools/matrix.py
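The mapping onto discrete levels can be sketched as below; `quantize` here is a minimal re-implementation under the stated defaults, not the library code:

```python
import torch

def quantize(image_field, bits = 8, limits = [0.0, 1.0]):
    # Normalize into [0, 1], then round onto 2**bits - 1 integer levels.
    levels = 2 ** bits - 1
    normalized = (image_field - limits[0]) / (limits[1] - limits[0])
    return torch.round(normalized.clamp(0.0, 1.0) * levels)

field = torch.tensor([0.0, 0.5, 1.0])
quantized = quantize(field)
```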
zero_pad(field, size=None, method='center')
¶
Zero pad a field to double its size or specified size.
This function pads a field with zeros to either double its size (default) or to a specified size. The input can be 2D, 3D or 4D tensors.
Parameters:
-
field–Input field MxN or KxJxMxN or KxMxNxJ array.
-
size(default:None) –Size to be zero padded (e.g., [m, n], last two dimensions only). If None, doubles the last two dimensions.
-
method(default:'center') –Zero pad by placing the content either at the center or to the left.
Returns:
-
field_zero_padded(tensor) –Zeropadded version of the input field.
Source code in odak/learn/tools/matrix.py
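Both placement modes can be sketched with torch.nn.functional.pad; `zero_pad` below is a hypothetical re-implementation, not the library function:

```python
import torch

def zero_pad(field, size = None, method = 'center'):
    # Default behaviour: double the last two dimensions.
    m, n = field.shape[-2:]
    new_m, new_n = (2 * m, 2 * n) if size is None else size
    pad_m, pad_n = new_m - m, new_n - n
    if method == 'center':
        pad = (pad_n // 2, pad_n - pad_n // 2, pad_m // 2, pad_m - pad_m // 2)
    else:  # 'left' places the original content at the top-left corner
        pad = (0, pad_n, 0, pad_m)
    return torch.nn.functional.pad(field, pad)

field = torch.ones(4, 4)
padded = zero_pad(field)  # 8 x 8, with the ones block in the middle
```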
grid_sample(no=[10, 10], size=[100.0, 100.0], center=[0.0, 0.0, 0.0], angles=[0.0, 0.0, 0.0])
¶
Generate samples over a surface.
Parameters:
-
no(list, default:[10, 10]) –Number of samples along each dimension.
-
size(list, default:[100.0, 100.0]) –Physical size of the surface along each dimension.
-
center(list, default:[0.0, 0.0, 0.0]) –Center location of the surface.
-
angles(list, default:[0.0, 0.0, 0.0]) –Tilt angles of the surface around X, Y, and Z axes.
Returns:
-
samples(tensor) –Generated samples.
-
rotx(tensor) –Rotation matrix around X axis.
-
roty(tensor) –Rotation matrix around Y axis.
-
rotz(tensor) –Rotation matrix around Z axis.
Source code in odak/learn/tools/sample.py
get_rotation_matrix(tilt_angles=[0.0, 0.0, 0.0], tilt_order='XYZ')
¶
Generate rotation matrix for given tilt angles and tilt order.
Parameters:
-
tilt_angles(list, default:[0.0, 0.0, 0.0]) –Tilt angles in degrees along XYZ axes.
-
tilt_order(str, default:'XYZ') –Rotation order (e.g., XYZ, XZY, ZXY, YXZ, ZYX).
Returns:
-
Tensor–Rotation matrix.
Source code in odak/learn/tools/transformation.py
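For a single-axis tilt, the result reduces to a conventional right-handed rotation matrix. A stand-alone sketch of the Z-axis case, assuming that convention (`rotation_matrix_z` is a hypothetical stand-in, not the library call):

```python
import torch

def rotation_matrix_z(angle_degrees):
    # Right-handed rotation about the Z axis; angle given in degrees.
    angle = torch.deg2rad(torch.tensor(angle_degrees))
    c, s = float(torch.cos(angle)), float(torch.sin(angle))
    return torch.tensor([[c, -s, 0.0],
                         [s, c, 0.0],
                         [0.0, 0.0, 1.0]])

# Rotating the X unit vector by 90 degrees about Z yields the Y unit vector.
rotated = rotation_matrix_z(90.0) @ torch.tensor([1.0, 0.0, 0.0])
```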
load_voxelized_PLY(ply_filename, voxel_size=[0.05, 0.05, 0.05], device=torch.device('cpu'))
¶
Load a point cloud from a PLY file and convert it into a voxel grid representation.
Parameters:
-
ply_filename(str or Path) –The path to the input PLY file containing triangle data.
-
voxel_size((list or tuple, shape(3)), default:[0.05, 0.05, 0.05]) –The size of each voxel in the x, y, and z directions. Default is [0.05, 0.05, 0.05].
-
device(device, default:device('cpu')) –The device on which to perform computations. Default is CPU.
Returns:
-
points((Tensor, shape(N, 3))) –A tensor containing the coordinates of the voxel centers.
-
ground_truth((Tensor, shape(Gx * Gy * Gz))) –A binary tensor where each element indicates whether a corresponding voxel contains at least one point.
Notes
- The function reads triangle data from the PLY file and computes the center points of these triangles.
- These points are then processed to create a normalized point cloud, which is converted into a voxel grid.
- Only voxels containing at least one point are marked as 1 in ground_truth.
- All operations are performed on the specified device for efficiency.
Source code in odak/learn/tools/transformation.py
point_cloud_to_voxel(points, voxel_size=[0.1, 0.1, 0.1])
¶
Convert a point cloud to a voxel grid representation.
Parameters:
-
points((Tensor, shape(N, 3))) –The input point cloud, where each row is a 3D point.
-
voxel_size((list or Tensor, shape(3)), default:[0.1, 0.1, 0.1]) –The size of each voxel in the x, y, and z directions. Default is [0.1, 0.1, 0.1].
Returns:
-
locations((Tensor, shape(Gx, Gy, Gz, 3))) –The coordinates of each voxel center in the grid.
-
grid((Tensor, shape(Gx, Gy, Gz))) –A binary voxel grid where 1 indicates the presence of at least one point.
Notes
- The voxel grid is constructed by discretizing the space between the minimum and maximum coordinates of the point cloud.
- Only voxels containing at least one point are marked as 1.
- The output grid is of type float32 and resides on the same device as the input points.
Source code in odak/learn/tools/transformation.py
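The voxelization step can be sketched as below. This is a hypothetical re-implementation under the assumption that voxel indices come from flooring point offsets relative to the bounding-box minimum; the library's exact indexing may differ:

```python
import torch

def point_cloud_to_voxel(points, voxel_size = [0.1, 0.1, 0.1]):
    # Discretize the bounding box of the points; mark occupied voxels with 1.
    voxel_size = torch.tensor(voxel_size)
    minimum = points.min(dim = 0).values
    maximum = points.max(dim = 0).values
    dims = (torch.ceil((maximum - minimum) / voxel_size).long() + 1).tolist()
    indices = torch.floor((points - minimum) / voxel_size).long()
    grid = torch.zeros(dims)
    grid[indices[:, 0], indices[:, 1], indices[:, 2]] = 1.0
    return grid

points = torch.tensor([[0.00, 0.0, 0.0],
                       [0.05, 0.0, 0.0],
                       [0.35, 0.0, 0.0]])
grid = point_cloud_to_voxel(points)  # first two points share a voxel
```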
quaternion_to_rotation_matrix(quaternions)
¶
Convert rotations given as unit quaternions to rotation matrices.
Parameters:
-
quaternions(Tensor) –Quaternions with real part first, shape ``(*, 4)`` in ``(w, x, y, z)`` convention.
Returns:
-
rotation_matrices(Tensor) –Rotation matrices, shape
(*, 3, 3).
Source code in odak/learn/tools/transformation.py
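The conversion follows the standard unit-quaternion formula for the ``(w, x, y, z)`` convention. A minimal stand-alone sketch (`quaternion_to_matrix` is a hypothetical name, not the library function):

```python
import torch

def quaternion_to_matrix(q):
    # Standard formula for a unit quaternion in (w, x, y, z) convention.
    w, x, y, z = [float(v) for v in q]
    return torch.tensor([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])

# The identity quaternion maps to the identity rotation.
rotation = quaternion_to_matrix(torch.tensor([1.0, 0.0, 0.0, 0.0]))
```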
rotate_points(point, angles=torch.zeros(1, 3), mode='XYZ', origin=torch.zeros(1, 3), offset=torch.zeros(1, 3))
¶
Rotate a given point and return the result along with rotation matrices.
Note that rotation is always with respect to 0,0,0.
Parameters:
-
point(Tensor) –A point with size of [3] or [1, 3] or [m, 3].
-
angles(Tensor, default:zeros(1, 3)) –Rotation angles in degrees.
-
mode(str, default:'XYZ') –Rotation mode determines the ordering of the rotations about each axis. Available modes are XYZ, YXZ, ZXY, and ZYX.
-
origin(Tensor, default:zeros(1, 3)) –Reference point for a rotation. Expected size is [3] or [1, 3].
-
offset(Tensor, default:zeros(1, 3)) –Shift with the given offset. Expected size is [3] or [1, 3] or [m, 3].
Returns:
-
tuple–Result of the rotation [1 x 3] or [m x 3], and rotation matrices along each axis.
Source code in odak/learn/tools/transformation.py
rotmatx(angle)
¶
Generate a rotation matrix about the X axis.
Parameters:
-
angle(Tensor) –Rotation angle in degrees.
Returns:
-
Tensor–Rotation matrix about the X axis.
Source code in odak/learn/tools/transformation.py
rotmaty(angle)
¶
Generate a rotation matrix about the Y axis.
Parameters:
-
angle(Tensor) –Rotation angle in degrees.
Returns:
-
Tensor–Rotation matrix about the Y axis.
Source code in odak/learn/tools/transformation.py
rotmatz(angle)
¶
Generate a rotation matrix about the Z axis.
Parameters:
-
angle(Tensor) –Rotation angle in degrees.
Returns:
-
Tensor–Rotation matrix about the Z axis.
Source code in odak/learn/tools/transformation.py
tilt_towards(location, lookat)
¶
Tilt surface normal of a plane towards a point.
Parameters:
-
location(list) –Center of the plane to be tilted.
-
lookat(list) –Tilt towards this point.
Returns:
-
list–Rotation angles in degrees.
Source code in odak/learn/tools/transformation.py
cross_product(vector1, vector2)
¶
Definition to compute the cross product of two vectors and return the resultant vector. Uses the method described at: http://en.wikipedia.org/wiki/Cross_product
Parameters:
-
vector1–A vector/ray.
-
vector2–A vector/ray.
Returns:
-
ray(tensor) –Array that contains starting points and cosines of a created ray.
Source code in odak/learn/tools/vector.py
distance_between_two_points(point1, point2)
¶
Definition to calculate distance between two given points.
Parameters:
-
point1–First point in X,Y,Z.
-
point2–Second point in X,Y,Z.
Returns:
-
distance(Tensor) –Distance in between given two points.
Source code in odak/learn/tools/vector.py
same_side(p1, p2, a, b)
¶
Definition to determine which side of a line a point falls on, relative to a reference point. See http://www.blackpawn.com/texts/pointinpoly/ for more. If p1 and p2 are on the same side, this definition returns True.
Parameters:
-
p1–Point(s) to check.
-
p2–The point to check against.
-
a–First point that forms the line.
-
b–Second point that forms the line.
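The same-side test can be sketched with cross products, following the idea in the linked reference; `same_side` below is a hypothetical re-implementation, not the library code:

```python
import torch

def same_side(p1, p2, a, b):
    # p1 and p2 lie on the same side of the line a-b when the cross products
    # (b - a) x (p1 - a) and (b - a) x (p2 - a) point the same way.
    cp1 = torch.linalg.cross(b - a, p1 - a)
    cp2 = torch.linalg.cross(b - a, p2 - a)
    return bool(torch.dot(cp1, cp2) >= 0)

a = torch.tensor([0.0, 0.0, 0.0])
b = torch.tensor([1.0, 0.0, 0.0])
p1 = torch.tensor([0.5, 1.0, 0.0])
same = same_side(p1, torch.tensor([0.5, 2.0, 0.0]), a, b)
different = same_side(p1, torch.tensor([0.5, -1.0, 0.0]), a, b)
```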