World Model

RoboFlow4D.

A lightweight flow world model that predicts future multi-frame 3D flows and guides real-time robotic manipulation.

Planning in 3D Space

Predicted flow becomes a spatial plan.

RoboFlow4D treats world modelling as a closed loop between observation, prediction, and execution. Given a visual sequence and a language instruction, the model predicts future multi-frame 3D flow that describes how task-relevant geometry should move.
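To make the idea of flow as a spatial plan concrete, here is a minimal sketch. It assumes the predicted flow is stored as per-point displacements measured from the current observation, one set per future frame. The function name `apply_flow`, the data layout, and the example numbers are illustrative assumptions, not taken from the RoboFlow4D code.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def apply_flow(points: List[Point], flow: List[List[Point]]) -> List[List[Point]]:
    """Return the planned position of every point at every future frame.

    Assumption: flow[t][i] is the displacement of point i at future frame t,
    measured from its position in the current observation (not from frame t-1).
    """
    trajectory = []
    for frame in flow:
        if len(frame) != len(points):
            raise ValueError("each flow frame needs one displacement per point")
        trajectory.append([
            (px + dx, py + dy, pz + dz)
            for (px, py, pz), (dx, dy, dz) in zip(points, frame)
        ])
    return trajectory


# Two task-relevant points on an object, and a 2-frame flow that lifts them.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
flow = [
    [(0.0, 0.0, 0.5), (0.0, 0.0, 0.5)],
    [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)],
]
plan = apply_flow(points, flow)
```

Here `plan[t]` holds the planned 3D position of each point at future frame `t`, which is the kind of explicit spatial target a downstream policy could be conditioned on.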

The original project page includes interactive 3D demos for household manipulation tasks such as Moka Pot, Drawer, Book to Caddy, and Push Cube. The local page keeps the task structure and links back to the live project resources.

Pipeline

A lightweight flow world model for real-time manipulation.

Figure: RoboFlow4D full pipeline.

FlowDiT

RGB, point, and text tokens to multi-frame 3D flow

The model encodes observations and task instructions, predicts future 3D flow, and feeds that flow into a policy for action generation.

Closed loop

Slow planner, fast executor

RoboFlow4D acts as a predictive planner, while the action policy executes conditioned on both robot state and explicit flow.
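The slow-planner, fast-executor split can be sketched as a loop in which the flow is refreshed only every few control steps while the policy acts at every step. This is a hedged illustration: the fixed `replan_every` schedule, the callback names (`observe`, `predict_flow`, `policy`, `act`), and the list-based flow type are assumptions made for the example. The source does not say how planning and execution are actually scheduled.

```python
from typing import Callable, List, Sequence

Flow = List[List[float]]  # placeholder type for multi-frame 3D flow


def run_closed_loop(
    observe: Callable[[], Sequence[float]],
    predict_flow: Callable[[Sequence[float], str], Flow],
    policy: Callable[[Sequence[float], Flow], Sequence[float]],
    act: Callable[[Sequence[float]], None],
    instruction: str,
    steps: int,
    replan_every: int,
) -> int:
    """Run the control loop and return how many times the planner was called."""
    planner_calls = 0
    flow: Flow = []
    for step in range(steps):
        state = observe()
        # Slow path: refresh the predicted flow only every `replan_every` steps.
        if step % replan_every == 0:
            flow = predict_flow(state, instruction)
            planner_calls += 1
        # Fast path: the policy acts at every step, conditioned on the
        # current state and the most recent flow.
        act(policy(state, flow))
    return planner_calls


# Stub components that only exercise the control flow.
actions_taken: list = []
calls = run_closed_loop(
    observe=lambda: [0.0, 0.0, 0.0],
    predict_flow=lambda state, instruction: [[0.0, 0.0, 0.1]],
    policy=lambda state, flow: flow[0],
    act=actions_taken.append,
    instruction="open the top drawer",
    steps=10,
    replan_every=5,
)
```

With these stubs the planner runs at steps 0 and 5 while the policy acts at all ten steps. A real deployment might instead run the planner asynchronously in a separate thread or process.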

+6.2 / +11.0

Average success-rate gains reported over base policies on LIBERO and ManiSkill3, respectively.

120x

Reported planning speedup compared with modular flow-planning pipelines.

< 1s

Goal-oriented planning latency aimed at real-time robot deployment.

Simulation Videos

Flow-conditioned policy rollouts in benchmark tasks.

LIBERO Object: Cream cheese to basket
LIBERO Object: Milk to basket
LIBERO Spatial: Bowl on cabinet to plate
LIBERO Spatial: Bowl on ramekin to plate
LIBERO Goal: Open drawer and place bowl
LIBERO Goal: Cream cheese to bowl
LIBERO Long: Both moka pots on stove
LIBERO Long: Two mugs to two plates

Real-World Videos

Robot manipulation with predicted flow.

Cup insertion (real robot): Pick up the brown cup and insert it into the black cup.
Pick-and-place assembly (real robot): Place an object into the target workspace with flow-guided control.
Drawer manipulation (real robot): Open the top drawer, place the red cube inside, and close it.
Stacking (real robot): Pick up the red cube and place it on the blue cube.

Quantitative Results

Benchmark gains with RoboFlow4D guidance.

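Method           | Spatial | Object | Goal | Long | Average
Octo             | 78.9    | 85.7   | 84.6 | 51.1 | 75.1
SpatialVLA       | 88.2    | 89.9   | 78.6 | 55.5 | 78.1
4D-VLA           | 88.9    | 95.2   | 90.9 | 79.1 | 88.6
DP               | 81.6    | 91.5   | 78.4 | 64.0 | 78.9
DP + RoboFlow4D  | 89.8    | 93.2   | 85.2 | 72.0 | 85.1
DiT              | 84.2    | 96.3   | 85.4 | 68.8 | 83.7
DiT + RoboFlow4D | 90.2    | 97.0   | 88.4 | 75.2 | 87.7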
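
The per-policy gains can be recomputed from the Average column of the table above. The short script below is only a worked arithmetic check. The dictionary layout and the helper name `gain` are choices made for this example.

```python
# Average-column values copied from the table above.
average = {
    "DP": 78.9,
    "DP + RoboFlow4D": 85.1,
    "DiT": 83.7,
    "DiT + RoboFlow4D": 87.7,
}


def gain(base: str) -> float:
    """Improvement in average success rate from adding RoboFlow4D to a base policy."""
    # Round to one decimal to match the table's precision and hide float noise.
    return round(average[f"{base} + RoboFlow4D"] - average[base], 1)


gains = {base: gain(base) for base in ("DP", "DiT")}
print(gains)  # {'DP': 6.2, 'DiT': 4.0}
```

The DP gain of 6.2 matches the first of the two headline figures quoted earlier on the page.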

Real-world DP gain

+12.5 average success

RoboFlow4D raises DP's average real-robot success rate while reducing average completion time on the reported tasks.

Real-world DiT gain

+11.3 average success

The same flow guidance improves DiT across pick-and-place, stacking, assembly, and drawer scenarios.

Deployment

Lightweight world modelling

The system is designed to make predictive 3D motion practical inside a robot control loop.