We have chosen the Facebook Hydra library as our core tool for managing the configuration of our experiments. It provides a clean and scalable interface for defining models and datasets. We encourage our users to take a look at its documentation and get a basic understanding of its core functionalities. As per their website:

“Hydra is a framework for elegantly configuring complex applications”

Configuration architecture

All configurations live in the conf folder, which is organised as follows:

├── config.yaml     # main config file for training
├── data            # contains all configurations related to datasets
├── debugging       # configs that can be used for debugging purposes
├── eval.yaml       # Main config for running a full evaluation on a given dataset
├── hydra           # hydra specific configs
├── lr_scheduler    # learning rate schedulers
├── models          # Architectures of the models
├── sota.yaml       # SOTA scores
├── training        # Training specific parameters
└── visualization   # Parameters for saving visualisation artefact

Understanding config.yaml

config.yaml is the config file that governs the behaviour of your training runs. It gathers multiple configurations into one and is organised as follows:

defaults:
  - task: ??? # Task performed (segmentation, classification etc...)
    optional: True
  - model_type: ??? # Type of model to use, e.g. pointnet2, rsconv etc...
    optional: True
  - dataset: ???
    optional: True

  - visualization: default
  - lr_scheduler: multi_step
  - training: default
  - eval

  - debugging: default.yaml
  - models: ${defaults.0.task}/${defaults.1.model_type}
  - data: ${defaults.0.task}/${defaults.2.dataset}
  - sota # Contains current SOTA results on different datasets (extracted from papers !).
  - hydra/job_logging: custom

model_name: ??? # Name of the specific model to load

selection_stage: ""
pretty_print: False

Hydra expects the following arguments from the command line:

  • task

  • model_type

  • dataset

  • model_name

The provided task and dataset are used to load the dataset configuration at conf/data/{task}/{dataset}.yaml, while the model_type argument is used to load the model config at conf/models/{task}/{model_type}.yaml. Finally, model_name is used to pull the appropriate model from the model configuration file.
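To make the composition concrete, here is a small sketch of how those command-line arguments map to configuration files. The helper is hypothetical and exists only for illustration; it is not part of the code base.

```python
def config_paths(task, model_type, dataset):
    """Mirror how config.yaml resolves the data and model config paths.

    Hypothetical helper, for illustration only.
    """
    return {
        "data": f"conf/data/{task}/{dataset}.yaml",
        "models": f"conf/models/{task}/{model_type}.yaml",
    }

paths = config_paths("segmentation", "pointnet2", "s3dis")
# paths["data"]   -> "conf/data/segmentation/s3dis.yaml"
# paths["models"] -> "conf/models/segmentation/pointnet2.yaml"
```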

Training arguments

# @package training
# These arguments define the training hyper-parameters
epochs: 100
num_workers: 6
batch_size: 16
shuffle: True
cuda: 0 # -1 -> no cuda, otherwise takes the specified index
precompute_multi_scale: False # Compute multiscale features on cpu for faster training / inference
optim:
  base_lr: 0.001
  # accumulated_gradient: -1 # Accumulate gradient accumulated_gradient * batch_size
  grad_clip: -1
  optimizer:
    class: Adam
    params:
      lr: ${training.optim.base_lr} # The path is cut from training
  lr_scheduler: ${lr_scheduler}
  bn_scheduler:
    bn_policy: "step_decay"
    params:
      bn_momentum: 0.1
      bn_decay: 0.9
      decay_step: 10
      bn_clip: 1e-2
weight_name: "latest" # Used during resume, select which model to load from [miou, macc, acc..., latest]
enable_cudnn: True
checkpoint_dir: ""
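To make the bn_scheduler block concrete, here is a sketch of what a "step_decay" policy computes. This is our reading of the parameters above, not the actual scheduler code: the momentum decays every decay_step epochs and is clipped at bn_clip.

```python
def bn_momentum_at(epoch, bn_momentum=0.1, bn_decay=0.9, decay_step=10, bn_clip=1e-2):
    # Decay the batch-norm momentum every `decay_step` epochs and clip it
    # at `bn_clip`, as the config keys above suggest (sketch, not the
    # library implementation).
    return max(bn_clip, bn_momentum * bn_decay ** (epoch // decay_step))

bn_momentum_at(0)     # 0.1  (no decay yet)
bn_momentum_at(1000)  # 0.01 (clipped at bn_clip)
```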

# These arguments within experiment define which model, dataset and task are created for benchmarking
# parameters for Weights and Biases
wandb:
  entity: ""
  project: default
  log: True
  public: True # If True, the model will be displayed within the wandb log, else not.
  config:
    model_name: ${model_name}

# parameters for TensorBoard Visualization
tensorboard:
  log: True
  • precompute_multi_scale: computes spatial queries such as grid sampling and neighbour search on cpu for faster training / inference. Currently this is only supported for KPConv.

Eval arguments

defaults:
  - visualization: eval

num_workers: 0
batch_size: 1
cuda: 0
weight_name: "latest" # Used during resume, select which model to load from [miou, macc, acc..., latest]
enable_cudnn: True
checkpoint_dir: "/local/torch-points3d/outputs/2021-06-01/11-53-23" # "{your_path}/outputs/2020-01-28/11-04-13" for example
model_name: pointnet2_charlesssg
precompute_multi_scale: True # Compute multiscale features on cpu for faster training / inference
enable_dropout: False
voting_runs: 1

tracker_options: # Extra options for the tracker
  full_res: False
  make_submission: True

hydra:
  run:
    dir: ${checkpoint_dir}/eval/${now:%Y-%m-%d_%H-%M-%S}

Data formats for point cloud

While developing this project, we discovered there are several ways to implement a convolution.

  • “DENSE”



  • “SPARSE”


Dense format

The “DENSE” format is very similar to what you would be used to with images: during the assembling of a batch, the B tensors of shape (num_points, feat_dim) are concatenated on a new dimension, [(num_points, feat_dim), …, (num_points, feat_dim)] -> (B, num_points, feat_dim).

This format forces each sample to have exactly the same number of points.


Advantages:

  • The format is dense and therefore aggregation operations are fast

Drawbacks:

  • Handling variability in the number of neighbours happens through padding, which is not very efficient

  • Each sample needs to have the same number of points; as a consequence, points are duplicated or removed from a sample during the data loading phase using a FixedPoints transform
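The FixedPoints behaviour described above can be sketched as follows. This is a toy stand-in, not the actual transform: it drops points when a sample is too large and samples with replacement when it is too small.

```python
import random

def fixed_points(points, num_points):
    # Drop points when the sample is too large, duplicate (sample with
    # replacement) when it is too small, so every sample ends up with
    # exactly `num_points` points, as the dense format requires.
    if len(points) >= num_points:
        return random.sample(points, num_points)
    return points + random.choices(points, k=num_points - len(points))
```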

Sparse formats

The second family of convolution formats is based on a sparse data format, meaning that each sample can have a variable number of points and the collate function handles the complexity behind the scenes. For those interested in learning more about it, see Batch.from_data_list.


Given N tensors, each with its own num_points_{i}, the collate function does:

[(num_points_1, feat_dim), ..., (num_points_n, feat_dim)]
    -> (num_points_1 + ... + num_points_n, feat_dim)

It also creates an associated batch tensor of size (num_points_1 + ... + num_points_n) with indices of the corresponding batch.



For example, given two samples:

  • A with shape (2, 2)

  • B with shape (3, 2)

C = Batch.from_data_list([A, B])

C is a tensor of shape (5, 2) and its associated batch will contain [0, 0, 1, 1, 1]
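The collation above can be sketched in plain Python, with lists of rows standing in for tensors. This is not the torch_geometric implementation, just an illustration of what the collate step produces.

```python
def sparse_collate(samples):
    # Concatenate variable-size samples along the point dimension and build
    # the batch index list that records which sample each point came from.
    collated, batch = [], []
    for i, sample in enumerate(samples):
        collated.extend(sample)          # rows of shape (feat_dim,)
        batch.extend([i] * len(sample))  # batch index for every point
    return collated, batch

A = [[0.0, 0.0], [1.0, 1.0]]              # shape (2, 2)
B = [[2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]  # shape (3, 2)
C, batch = sparse_collate([A, B])
# len(C) == 5, batch == [0, 0, 1, 1, 1]
```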

PARTIAL_DENSE ConvType format

This format is used by the original KPConv implementation.

Like the dense format, it forces each point to have the same number of neighbours, which is why we call it partially dense.


MESSAGE_PASSING ConvType format

This ConvType is the Pytorch Geometric base format. Using the MessagePassing API class, it deploys the graph created by the neighbour finder using the torch.index_select operator internally.

Therefore, the [PointNet++] internal convolution looks like this:

import torch
from torch_geometric.nn.conv import MessagePassing
from torch_geometric.utils import remove_self_loops, add_self_loops

from ..inits import reset


class PointConv(MessagePassing):
    r"""The PointNet set layer from the `"PointNet: Deep Learning on Point Sets
    for 3D Classification and Segmentation"
    <>`_ and `"PointNet++: Deep Hierarchical
    Feature Learning on Point Sets in a Metric Space"
    <>`_ papers.
    """

    def __init__(self, local_nn=None, global_nn=None, **kwargs):
        super(PointConv, self).__init__(aggr='max', **kwargs)

        self.local_nn = local_nn
        self.global_nn = global_nn

        self.reset_parameters()

    def reset_parameters(self):
        reset(self.local_nn)
        reset(self.global_nn)

    def forward(self, x, pos, edge_index):
        r"""
        Args:
            x (Tensor): The node feature matrix. Allowed to be :obj:`None`.
            pos (Tensor or tuple): The node position matrix. Either given as
                tensor for use in general message passing or as tuple for use
                in message passing in bipartite graphs.
            edge_index (LongTensor): The edge indices.
        """
        if torch.is_tensor(pos):  # Add self-loops for symmetric adjacencies.
            edge_index, _ = remove_self_loops(edge_index)
            edge_index, _ = add_self_loops(edge_index, num_nodes=pos.size(0))

        return self.propagate(edge_index, x=x, pos=pos)

    def message(self, x_j, pos_i, pos_j):
        msg = pos_j - pos_i
        if x_j is not None:
            msg = torch.cat([x_j, msg], dim=1)
        if self.local_nn is not None:
            msg = self.local_nn(msg)
        return msg

    def update(self, aggr_out):
        if self.global_nn is not None:
            aggr_out = self.global_nn(aggr_out)
        return aggr_out

    def __repr__(self):
        return '{}(local_nn={}, global_nn={})'.format(
            self.__class__.__name__, self.local_nn, self.global_nn)

SPARSE ConvType Format

The sparse conv type is used by projects like SparseConv or the Minkowski Engine; the points therefore have to be converted into indices living within a grid.
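A minimal sketch of that quantization step, assuming a simple floor-based grid. This is an illustration only, not the Minkowski Engine API: each point is mapped to an integer voxel index, and only the first point falling into each voxel is kept.

```python
def voxelize(points, voxel_size):
    # Convert continuous coordinates into unique integer grid indices,
    # keeping the first point that falls into each voxel (sketch of the
    # quantization used by sparse convolution libraries).
    seen, indices = set(), []
    for p in points:
        idx = tuple(int(c // voxel_size) for c in p)
        if idx not in seen:
            seen.add(idx)
            indices.append(idx)
    return indices

voxelize([(0.1, 0.2, 0.3), (0.15, 0.22, 0.31), (1.2, 0.0, 0.0)], 0.5)
# -> [(0, 0, 0), (2, 0, 0)]  (the first two points share a voxel)
```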

Backbone Architectures

Several UNets could be built using different convolutions or blocks; however, the final model will still be a UNet.

In the base_architectures folder, we intend to provide base architecture builders which could be used across tasks and datasets.

We provide two UNet implementations:

  • UnetBasedModel

  • UnwrappedUnetBasedModel

The main difference between them is that UnetBasedModel implements the forward function and UnwrappedUnetBasedModel doesn't.


def forward(self, data):
    if self.innermost:
        data_out = self.inner(data)
        data = (data_out, data)
        return self.up(data)
    else:
        data_out = self.down(data)
        data_out2 = self.submodule(data_out)
        data = (data_out2, data)
        return self.up(data)

The UNet will be built recursively from the middle using the UnetSkipConnectionBlock class.

UnetSkipConnectionBlock:

    Defines the Unet submodule with skip connection.
    X -------------------identity----------------------
    |-- downsampling -- |submodule| -- upsampling --|


The UnwrappedUnetBasedModel will create the model based on the configuration and add the created layers within the following ModuleLists:

self.down_modules = nn.ModuleList()
self.inner_modules = nn.ModuleList()
self.up_modules = nn.ModuleList()
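When subclassing UnwrappedUnetBasedModel you write the forward pass yourself; here is a sketch of how those ModuleLists are typically consumed. It is hypothetical and simplified (skip connections are just stacked and popped, and the bottleneck handling is reduced to a plain loop).

```python
def unet_forward(data, down_modules, inner_modules, up_modules):
    # Walk down the encoder while stacking skip connections, run the
    # innermost modules, then walk up the decoder, pairing each up module
    # with the matching skip connection (simplified sketch).
    stack = []
    for down in down_modules:
        data = down(data)
        stack.append(data)
    for inner in inner_modules:
        data = inner(data)
    for up in up_modules:
        data = up((data, stack.pop()))
    return data
```

With toy callables in place of real modules, the control flow can be checked end to end before writing any actual layers.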



Preprocessed S3DIS

We support a couple of flavours of S3DIS. The dataset used for S3DIS1x1 is a preprocessed version of the original data where each sample is a 1m x 1m extraction of the original data. It was initially used in PointNet.


The dataset used for S3DIS is the original dataset without any pre-processing applied. Here is area_1 if you want to visualize it. We provide some data transforms for combining the areas back together and splitting the dataset into digestible chunks. Please refer to the code base and the associated configuration file for more details:

# @package data
task: segmentation
class: s3dis.S3DISFusedDataset
dataroot: data
fold: 5
first_subsampling: 0.04
use_category: False
pre_collate_transform:
  - transform: PointCloudFusion   # One point cloud per area
  - transform: SaveOriginalPosId  # Required so that one can recover the original point in the fused point cloud
  - transform: GridSampling3D     # Samples on a grid
    params:
      size: ${data.first_subsampling}
train_transforms:
  - transform: RandomNoise
    params:
      sigma: 0.001
  - transform: RandomRotate
    params:
      degrees: 180
      axis: 2
  - transform: RandomScaleAnisotropic
    params:
      scales: [0.8, 1.2]
  - transform: RandomSymmetry
    params:
      axis: [True, False, False]
  - transform: DropFeature
    params:
      drop_proba: 0.2
      feature_name: rgb
  - transform: XYZFeature
    params:
      add_x: False
      add_y: False
      add_z: True
  - transform: AddFeatsByKeys
    params:
      list_add_to_x: [True, True]
      feat_names: [rgb, pos_z]
      delete_feats: [True, True]
  - transform: Center
test_transform:
  - transform: XYZFeature
    params:
      add_x: False
      add_y: False
      add_z: True
  - transform: AddFeatsByKeys
    params:
      list_add_to_x: [True, True]
      feat_names: [rgb, pos_z]
      delete_feats: [True, True]
  - transform: Center
val_transform: ${data.test_transform}


Shapenet

Shapenet is a simple dataset that allows quick prototyping for segmentation models. When used in single class mode, for part segmentation on airplanes for example, it is a good way to figure out whether your implementation is correct.




ModelNet

The dataset used for ModelNet comes in two formats: ModelNet10 and ModelNet40.


3D Match

IRALab Benchmark composed of data from:

Model checkpoint

Model Saving

Our custom Checkpoint class keeps track of the models for every metric, the stats for "train", "test" and "val", the optimizer and its learning params.

self._objects = {}
self._objects["models"] = {}
self._objects["stats"] = {"train": [], "test": [], "val": []}
self._objects["optimizer"] = None
self._objects["lr_params"] = None

Model Loading

In training.yaml and eval.yaml, you can find the following parameters:

  • weight_name

  • checkpoint_dir

  • resume

Since the model is saved for every metric as well as for the latest epoch, it is possible to load any of them using weight_name.

Example: weight_name: "miou"

If the checkpoint contains weights with the key “miou”, it will set the model state to them. If not, it will try “latest” if it exists. If none are found, the model will be randomly initialized.
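That fallback logic can be sketched as follows, using a plain dict as a hypothetical stand-in for the checkpoint's saved weights (this is not the actual Checkpoint class):

```python
def pick_weights(checkpoint_models, weight_name):
    # Try the requested metric first, then fall back to "latest";
    # returning None means the model stays randomly initialized
    # (sketch of the behaviour described above).
    if weight_name in checkpoint_models:
        return checkpoint_models[weight_name]
    if "latest" in checkpoint_models:
        return checkpoint_models["latest"]
    return None
```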

Adding a new metric

Within torch_points3d/metrics/, you will find a mapping dictionary between a sub metric name and an optimization function.

Currently, we support the following metrics.

    "iou": max,
    "acc": max,
    "loss": min,
    "mer": min,
}  # Those map subsentences to their optimization functions
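For example, the mapping can be used to decide whether a new value improves on the current best. The substring matching below is our reading of "map subsentences" (e.g. "test_miou" matches "iou"); the function is a sketch, not library code.

```python
METRICS_FUNC = {"iou": max, "acc": max, "loss": min, "mer": min}

def is_best(metric_name, new_value, best_value):
    # Find the optimization function whose key appears in the metric name,
    # then check whether the new value wins under that function.
    for key, func in METRICS_FUNC.items():
        if key in metric_name:
            return func(new_value, best_value) == new_value
    raise ValueError(f"No optimization function for {metric_name}")

is_best("test_miou", 0.7, 0.6)   # True  (iou is maximized)
is_best("train_loss", 0.5, 0.6)  # True  (loss is minimized)
```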


The associated visualization

The framework currently supports both wandb and tensorboard:

# parameters for Weights and Biases
wandb:
  project: benchmarking
  log: False

# parameters for TensorBoard Visualization
tensorboard:
  log: True

Custom logging

We use a custom hydra logging configuration which you can find within conf/hydra/job_logging/custom.yaml:

# @package _group_
version: 1
formatters:
  simple:
    format: "%(message)s"
handlers:
  debug_console_handler:
    level: DEBUG
    formatter: simple
    class: logging.StreamHandler
    stream: ext://sys.stdout
  file_handler:
    level: DEBUG
    formatter: simple
    class: logging.FileHandler
    filename: train.log
root:
  level: DEBUG
  handlers: [debug_console_handler, file_handler]
disable_existing_loggers: False