Research | PIRLab

Overview

Three main directions form one physical intelligence loop.

We do not treat perception, prediction, and control as isolated modules. Our research builds robot systems in which scene understanding improves forecasting, forecasting improves planning, and action closes the loop by feeding back into better representations.

The shared theme is physical realism: models should respect geometry, semantics, dynamics, and contact, and should matter on real robots rather than only on offline benchmarks.

01

3D/4D Robot Vision

Robots first need rich perception: geometry, semantics, localization, mapping, and motion from cameras, LiDAR, and point clouds.

02

World-Model-Based Prediction

Perception becomes prediction: scene flow, map evolution, future observations, and action-conditioned physical change.

03

Physics-Based Robot Action

Prediction becomes action: manipulation, VLA policies, robust estimation, and physically grounded robot control.

Embodied Intelligence

Featured VLA and world-model projects.

Continual VLA

Stellar VLA

Continual skill knowledge for robot manipulation, connecting task memory with vision-language-action policies.

Local page | Project page

Flow world model

RoboFlow4D

A lightweight 4D flow world model for real-time, flow-guided robotic manipulation.

Local page | Project page

VSLA

HEAR

A Vision-Sound-Language-Action framework for robots that use acoustic cues during manipulation.

Local page | Project page

Selected Work

Representative projects with visual demonstrations.

Image and LiDAR point cloud registration for vehicle localization

2D-3D localization

End-to-end 2D-3D Registration between Image and LiDAR Point Cloud

A vehicle localization pipeline that directly aligns image observations with LiDAR point clouds for robust real-world positioning.

Paper | Publication list

Semantic neural implicit SLAM reconstruction

Semantic SLAM

SNI-SLAM and SNI-SLAM++

Semantic neural implicit mapping for dense scene reconstruction, tracking, and physically meaningful robot perception.

Paper | Code

3D scene flow

3DSFLabelling and DifFlow3D

Scene-flow learning for dynamic 3D worlds, combining pseudo auto-labelling and uncertainty-aware diffusion refinement.

3DSFLabelling | DifFlow3D

Real2Sim2Real robotic manipulation learning with Gaussian splatting

Robot manipulation

RL-GSBridge

A 3D Gaussian Splatting-based Real2Sim2Real method for robotic manipulation learning and sim-to-real transfer.

Paper | Code

LiDAR odometry

PWCLO-Net and EfficientLO-Net

Learning-based LiDAR odometry that estimates robust 3D motion from large-scale point clouds.

PWCLO-Net | EfficientLO-Net

Robust estimation

RLSAC

Reinforcement-learning-enhanced sample consensus for end-to-end robust estimation in computer vision and robotics.

Paper | Code

01

3D/4D Robot Vision

We develop robot perception models that read geometry, semantics, correspondence, and motion from RGB, RGB-D, LiDAR, and multi-modal point clouds. This includes localization, odometry, registration, semantic segmentation, dense mapping, and dynamic scene understanding.

Typical questions include how to align 2D and 3D observations, how to represent large-scale scenes efficiently, and how to reason about time so that vision becomes 4D rather than a collection of still frames.

02

World-Model-Based Prediction

We investigate predictive models that forecast scene flow, semantic evolution, latent map states, and action-conditioned changes in the environment. This includes diffusion models, neural implicit representations, Gaussian scene models, and future-conditioned perception.

The goal is not prediction for its own sake. We want world models that help robots plan better, simulate better, recover from partial observability, and transfer what they learn more effectively between real and virtual environments.

03

Physics-Based Robot Action

We study how structured perception and predictive models can improve robotic manipulation, robust estimation, and embodied decision making. Our interest is in methods that preserve physical meaning instead of treating action as a purely black-box policy problem.

Current themes include real-to-sim-to-real learning, contact-rich manipulation, reinforcement learning with priors, and action pipelines built on geometry-aware scene representations.

Physical intelligence from perception to action.
