Introduction
Goal
Take an RGB image as input and output a 3D object represented as a triangle mesh
Key Ideas
- Iterative Refinement
- Graph Convolution
- Vertex Aligned-Features
- Chamfer Loss Function
Key Ideas Explanation
Iterative Refinement
Motivation
A problem in generating a triangle-mesh 3D object is that we don’t know how to initialize the output of the network
Implementation
This paper initializes the output as an ellipsoid; the network then deforms the ellipsoid step by step until its shape adheres to the surface of the target object
Vertex-aligned features, which will be introduced later, are an alternative that can achieve a similar result to iterative refinement
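The refinement loop above can be sketched as follows. This is a minimal illustration, not the paper's actual architecture: `refine_stage` is a hypothetical stand-in for one graph-convolution network that predicts a 3D offset per vertex.

```python
import numpy as np

def refine(vertices, stages):
    """Coarse-to-fine mesh deformation sketch.

    vertices: (V, 3) positions of the initial ellipsoid mesh.
    stages: list of callables; each stands in for a learned network
            that maps (V, 3) vertex positions to (V, 3) offsets.
    """
    for refine_stage in stages:           # e.g. a few stages in sequence
        offsets = refine_stage(vertices)  # predicted per-vertex displacements
        vertices = vertices + offsets     # deform the mesh toward the target
    return vertices
```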
Graph Convolution
Motivation
In a 2D CNN, filters capture features from the feature map; graph convolution is the analogous operation on a triangle mesh
In Pixel2Mesh, ordinary convolution layers are replaced with graph convolution layers
Implementation
Each vertex carries a feature vector. A layer computes a new feature for each vertex (much as a conv layer produces new channels) from the vertex’s own feature and the features of its neighboring vertices
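A minimal sketch of one such layer, assuming the common form where a vertex's new feature mixes its own feature with the sum of its neighbors' features through two learned weight matrices (names here are illustrative, not the paper's):

```python
import numpy as np

def graph_conv(H, A, W_self, W_neigh):
    """One graph-convolution layer on a mesh.

    H: (V, F_in) per-vertex features.
    A: (V, V) 0/1 adjacency matrix of the mesh's vertex graph.
    W_self, W_neigh: (F_in, F_out) learned weight matrices.

    Each vertex's output combines its own (transformed) feature
    with the sum of its neighbors' (transformed) features.
    """
    return H @ W_self + (A @ H) @ W_neigh
```

A nonlinearity (e.g. ReLU) would normally follow each layer; it is omitted here to keep the core operation visible.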
Vertex-Aligned Features
Implementation
- First, we start with the same ellipsoid as we did in iterative refinement
- Next, each vertex of our 3D object is projected onto the 2D input image, giving a pixel location for every vertex
- We run a CNN on the input image and, for each vertex, read the element of the feature map at the vertex’s projected location to update that vertex’s features; since the projected location is generally not an integer pixel, this lookup is done with bilinear interpolation
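The bilinear lookup in the last step can be sketched like this (a standalone illustration, not the paper's code): given a feature map and a continuous pixel coordinate, the sampled value is a weighted mix of the four surrounding feature vectors.

```python
import numpy as np

def bilinear_sample(feat, x, y):
    """Sample an (H, W, C) feature map at continuous pixel coords (x, y).

    A projected vertex rarely lands exactly on an integer pixel, so we
    blend the 4 neighboring feature vectors, weighted by proximity.
    """
    H, W, _ = feat.shape
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    return ((1 - wx) * (1 - wy) * feat[y0, x0]
            + wx * (1 - wy) * feat[y0, x1]
            + (1 - wx) * wy * feat[y1, x0]
            + wx * wy * feat[y1, x1])
```

Because the weights vary smoothly with (x, y), the sampling is differentiable, which is what lets the whole pipeline train end to end.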
Chamfer Loss Function
Motivation
There are many ways to represent the same shape; e.g., a square can be represented with different sets, and different numbers, of vertices
Hence, we need a loss that is unaffected by these representation differences
Implementation: Convert to Point Clouds
1. For the predicted object: we sample points from the surface of the predicted mesh, interpolating each point’s position from the vertices of its triangle (this is done online, since the mesh changes every iteration)
2. For the ground-truth object: we do the same, but offline (the ground-truth point cloud is precomputed once)
3. Compare the point clouds: the loss is the Chamfer distance between the predicted and ground-truth point clouds
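Step 3 can be sketched as below: Chamfer distance matches each point to its nearest neighbor in the other cloud and averages both directions, so it does not depend on vertex ordering or on the two clouds having matched sizes (the property motivated above).

```python
import numpy as np

def chamfer_distance(P, Q):
    """Chamfer distance between point clouds P (N, 3) and Q (M, 3).

    For each point, take the squared distance to its nearest neighbor
    in the other cloud; sum the averages of both directions.
    """
    # (N, M) matrix of pairwise squared distances
    d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(-1)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()
```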