Vision-Language-Action

Stellar VLA.

Continually evolving skill knowledge for robot manipulation, with task-centric and task-skill variants for stable lifelong VLA learning.

01

Motivation

Why continual VLA learning needs reusable task-skill knowledge rather than an ever-growing set of per-task modules.

02

Method

T-Stellar models task-level knowledge and TS-Stellar models task-skill structure; in both, that knowledge guides adaptive action routing.

03

Experiments

Dual-arm real-robot tasks and LIBERO continual imitation learning evaluations.

Motivation

Lifelong robot policies need stable skill memory.

Vision-language-action models can inherit broad manipulation knowledge from pretraining, but efficient continual learning remains difficult. When new tasks arrive sequentially, the robot must adapt without overwriting old behaviours or adding a separate module for every task.

Stellar VLA reframes this as knowledge modelling. Instead of treating each task as isolated, the policy learns a compact knowledge space that captures relationships between tasks and skills. That knowledge then guides expert routing for action prediction.

[Figure: Stellar VLA motivation overview]

Overview

Continual robotic manipulation learning

The project studies how VLA agents acquire a sequence of tasks while preserving useful prior knowledge.

[Figure: Stellar VLA real-world success curves]

Real robot evidence

Knowledge retention across tasks

Real-world curves compare continual learning behaviour across seven dual-arm manipulation tasks.

Method

Task knowledge, skill structure, and knowledge-guided routing.

[Figure: Overall architecture of Stellar VLA]

Architecture

Stellar VLA pipeline

Language and visual observations are encoded into task-centric representations, then aligned with a learned knowledge space for action prediction.

[Figure: Stellar VLA learned knowledge space visualization]

Knowledge space

Task and subskill organization

The learned representation exposes task-level clusters and shared subskill relationships in long-horizon manipulation sequences.

T-Stellar

Task-centric modelling

The flat variant learns task-relevant knowledge that helps specialize action prediction without expanding the policy for each new task.

TS-Stellar

Task-skill hierarchy

The hierarchical variant models how tasks share reusable subskills, which is important for long-horizon manipulation.
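
To make the idea of shared subskills concrete, here is a minimal sketch of a task-skill mapping and an overlap measure. The subskill labels ("reach", "grasp", and so on) and the Jaccard overlap are illustrative assumptions, not the decomposition or the measure used by TS-Stellar, which learns its structure rather than reading it from a hand-written table.

```python
# Hypothetical task -> subskill table. The task names come from the
# real-world task list; the subskill labels are invented for illustration.
TASK_SKILLS = {
    "Pick up Stick": {"reach", "grasp", "lift"},
    "Handover Stick": {"reach", "grasp", "lift", "handover"},
    "Pull Stick from Bag": {"reach", "grasp", "pull"},
}


def shared_skills(task_skills, task_a, task_b):
    """Return the subskills that two tasks have in common."""
    return task_skills[task_a] & task_skills[task_b]


def skill_overlap(task_skills, task_a, task_b):
    """Jaccard overlap between the subskill sets of two tasks (0.0 to 1.0)."""
    a, b = task_skills[task_a], task_skills[task_b]
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def tasks_by_skill(task_skills):
    """Invert the table: subskill -> set of tasks that reuse it."""
    index = {}
    for task, skills in task_skills.items():
        for skill in skills:
            index.setdefault(skill, set()).add(task)
    return index


common = shared_skills(TASK_SKILLS, "Pick up Stick", "Handover Stick")
overlap = skill_overlap(TASK_SKILLS, "Pick up Stick", "Pull Stick from Bag")
skill_index = tasks_by_skill(TASK_SKILLS)
```

In this toy table, a newly arriving task that reuses "reach" and "grasp" would add one new subskill instead of a whole new module, which is the intuition behind modelling tasks and skills jointly.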

Expert routing

Knowledge-guided action head

Semantic embeddings and knowledge relationships guide which motion experts should be emphasized for a given task.
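
Knowledge-guided routing can be pictured as a soft mixture over experts. The sketch below is an assumption-laden illustration rather than the Stellar VLA action head: the dot-product scoring, the softmax temperature, and the names `route`, `mix_actions`, and `expert_keys` are all hypothetical, and the numbers are toy values.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def route(knowledge_embedding, expert_keys, temperature=1.0):
    """Score each expert against the knowledge embedding (assumed
    dot-product similarity) and return normalized routing weights."""
    scores = [dot(knowledge_embedding, key) for key in expert_keys]
    return softmax(scores, temperature)


def mix_actions(weights, expert_actions):
    """Combine per-expert action proposals using the routing weights."""
    dim = len(expert_actions[0])
    return [
        sum(w * action[i] for w, action in zip(weights, expert_actions))
        for i in range(dim)
    ]


# Toy values: three experts, a 2-D knowledge embedding, 3-D actions.
expert_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
expert_actions = [[0.1, 0.0, 0.0], [0.0, 0.2, 0.0], [0.0, 0.0, 0.3]]
embedding = [2.0, 0.0]

weights = route(embedding, expert_keys)
action = mix_actions(weights, expert_actions)
```

With this embedding, the first and third experts receive equal weight and the second is de-emphasized, because its key is orthogonal to the embedding. Lowering `temperature` sharpens the routing toward the best-matching experts.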

Real-World Experiments

Seven dual-arm manipulation tasks after continual learning.

Task 01

Pick up Stick

Grasping a stick with the dual-arm robot setup.

Task 02

Handover Stick

Coordinated handover behaviour between robot arms.

Task 03

Pick up Bag

Generalization to deformable or flexible objects.

Task 04

Handover Toy

Handover with object and scene variation.

Task 05

Pick Toy Place Plate

Pick-and-place with coordinated placement constraints.

Task 06

Pick Both Object

Dual-arm coordination with multiple objects.

Task 07

Pull Stick from Bag

Contact-rich extraction after learning previous tasks.

Results

Real-world and simulation evaluations.

[Figure: Stellar VLA real-world success-rate comparison table]

Real world

Success-rate comparison

The real-robot evaluation compares continual manipulation performance across baseline policies and Stellar variants.

[Figure: Stellar VLA simulation success-rate results]

Simulation

LIBERO continual learning

Simulation studies cover goal, long-horizon, and multi-task LIBERO settings with limited replay.
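
For readers unfamiliar with limited replay, the sketch below shows one common way to keep a small, fixed-size memory of past demonstrations while tasks arrive sequentially. Reservoir sampling is an assumption chosen for illustration; the page does not state which replay strategy or buffer size the experiments use, and `ReservoirReplayBuffer` is a hypothetical name.

```python
import random


class ReservoirReplayBuffer:
    """Fixed-capacity memory in which every demonstration seen so far
    has the same probability of being retained (reservoir sampling)."""

    def __init__(self, capacity, seed=0):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.items = []   # stored (task_id, demo) pairs
        self.seen = 0     # total demonstrations offered so far
        self.rng = random.Random(seed)

    def add(self, task_id, demo):
        """Offer one demonstration; keep it with probability capacity/seen."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append((task_id, demo))
            return
        slot = self.rng.randrange(self.seen)
        if slot < self.capacity:
            self.items[slot] = (task_id, demo)

    def sample(self, batch_size):
        """Draw a replay batch without replacement, capped at buffer size."""
        batch_size = min(batch_size, len(self.items))
        return self.rng.sample(self.items, batch_size)


# Toy stream: three tasks arrive one after another, 50 demos each.
buffer = ReservoirReplayBuffer(capacity=20)
for task_id in range(3):
    for demo_index in range(50):
        buffer.add(task_id, f"demo-{task_id}-{demo_index}")

replay_batch = buffer.sample(8)
```

During training on a new task, a batch like `replay_batch` would be mixed with the new task's demonstrations, so earlier behaviours keep receiving gradient signal while memory use stays bounded.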