IVALab Python Libraries
Collection of code for computer vision and robotics with specific API.
Public Member Functions | Public Attributes | List of all members
frameTransformer Class Reference

Public Member Functions

def __init__ (self, M_intrinsic=None, M_WtoC=None, M_WtoR=None, M_BEV=None)
 
def parseDepth (self, depth_map)
 
def parsePBEV (self, coords, deps=None)
 
def parsePCam (self)
 
def parsePImg (self, coords, deps)
 
def parsePRob (self, robot_coords)
 

Public Attributes

 M_BEV
 
 M_int
 
 M_WtoC
 
 M_WtoR
 

Detailed Description

In a world composed of a depth camera, robot, and a defined world frame, 
the frameTransformer handels the transformation between the image frame, camera frame, tabletop frame, and the robot frame.
The image and camera frame follow the OpenCV definition

The transformation matrix required to complish the task:
1. M_intrinsic: The camera intrinsic matrix, which is the camera-to-image transformation (double direction). P_I = M_int * P_C
2. M_WtoC: The world-to-camera transformation. One example is using the Aruco tag. P_C = M_WtoC * P_W
3. M_WtoR: The world-to-robot transformation. P_R = M_WtoR * P_W
4. M_BEV: The image-to-BEV transformation.  P_BEV = M_BEV * P_I

Args:
    M_intrinsic ((3,3)): The 3-by-3 intrinsic matrix
    M_WtoC ((4, 4)): The 4-by-4 homogeneous camera-to-world intrinsic matrix
    M_WtoR ((4, 4)): The 4-by-4 homogeneous world-to-robot intrinsic matrix
    BEV_mat ((3, 3)): The Bird-eye-view transformation matrix

Constructor & Destructor Documentation

◆ __init__()

def __init__ (   self,
  M_intrinsic = None,
  M_WtoC = None,
  M_WtoR = None,
  M_BEV = None 
)

Member Function Documentation

◆ parseDepth()

def parseDepth (   self,
  depth_map 
)
Parse all the pixel coordinates from a depth map

Args:
    depth_map ((H, W)): The depth map
Returns:
    pC_map  ((H, W, 3)): The camera frame coordinate
    pW_map  ((H, W, 3)): The world frame coordinate
    pR_map  ((H, W, 3)): The robot frame coordinate

◆ parsePBEV()

def parsePBEV (   self,
  coords,
  deps = None 
)
Parse the Bird-eye-view(BEV) coordinates

Args:
    coords ((N, 2)):        The BEV coordinates 
    deps ((N,), optional):  The deps at the corresponding locations
Returns
    p_I ((N, 2)):           The image frame coordinates
    p_C [(N, 3)]:           The camera frame coordinates (x, y, z)
    p_W [(N, 3)]:           The world frame coordinates (x, y, z)
    p_R [(N, 3)]:           The robot frame coordinates (x, y, z)

◆ parsePCam()

def parsePCam (   self)

◆ parsePImg()

def parsePImg (   self,
  coords,
  deps 
)
Parse the image frame coordinate and get the coordinate in camera, world, and robot frame if applicable.
The function requires both the image frame coordinate (u, v) and the depth value w.r.t. the camera.
Without the depth, one image frame point will corresponds to a line in the 3D space

Args:
    coords (np.ndarray, (N, 2)): The (u, v) image frame coordinates. N is the query number
    deps (np.ndarray, (N, )): The depth value cooresponds to the coords

Returns:
    p_C [(N, 3)]: The camera frame coordinates (x, y, z)
    p_W [(N, 3)]: The world frame coordinates (x, y, z)
    p_R [(N, 3)]: The robot frame coordinates (x, y, z)

◆ parsePRob()

def parsePRob (   self,
  robot_coords 
)
Parse the robot frame coordinate and get the coordinate in world, image, and camera frame if applicable.
Args:
    robot_coords (np.ndarray, (N, 3)): The robot frame coordinates. N is the query number

Returns:
    p_W [(N, 3)]: The world frame coordinates (x, y, z)
    p_C [(N, 3)]: The camera frame coordinates (x, y, z)
    p_Img [(N, 2)]: The image frame coordinates (u, v)

Member Data Documentation

◆ M_BEV

M_BEV

◆ M_int

M_int

◆ M_WtoC

M_WtoC

◆ M_WtoR

M_WtoR

The documentation for this class was generated from the following file: