MRPilot: A Mixed-Reality System for Responsive Navigation of General Procedural Tasks

1Shenzhen University, 2HKUST
IEEE International Symposium on Mixed and Augmented Reality 2025

*Corresponding Author
MRPilot Overview

Overview of MRPilot: In automatic anchoring mode, labels displaying object names are overlaid on the recognized objects (b). Once the user confirms an anchor, it is overlaid on the corresponding object (c). The user can also use hand gestures to manually anchor objects that cannot be automatically recognized, which remain listed in the hand panel (c). MRPilot automatically detects and tracks physical objects throughout the task (d). It provides responsive guidance by monitoring the user's actions, detecting progress (e), and automatically advancing instructions (f).
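To make the detect-progress-then-advance behavior concrete, here is a minimal sketch of such a responsive navigation loop. All names (Step, observe_touched_objects) are hypothetical stand-ins rather than MRPilot's actual API; the real system drives this loop with live object tracking and action monitoring instead of a canned observation stream.

```python
# Minimal sketch of a responsive navigation loop: advance to the next
# instruction once the user's interactions suggest the current step is done.
from dataclasses import dataclass, field


@dataclass
class Step:
    index: int
    instruction: str
    required_objects: list[str] = field(default_factory=list)


def navigation_loop(steps: list[Step], observe_touched_objects) -> None:
    """Advance instructions automatically based on object interactions."""
    current = 0
    while current < len(steps):
        step = steps[current]
        touched = observe_touched_objects()  # e.g. from hand/object tracking
        # Progress detection: treat the step as complete once the user has
        # interacted with every object the step requires.
        if set(step.required_objects) <= touched:
            current += 1  # automatic advance
            if current < len(steps):
                print(f"Next step: {steps[current].instruction}")


# Usage with a canned observation stream standing in for real tracking:
observations = iter([{"bowl"}, {"bowl", "egg"}, {"whisk", "bowl", "egg"}])
steps = [
    Step(0, "Crack the egg into the bowl", ["bowl", "egg"]),
    Step(1, "Whisk the egg", ["whisk"]),
]
navigation_loop(steps, lambda: next(observations))
```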

Abstract

People often need guidance to complete tasks with specific requirements or sophisticated steps, such as preparing a meal or assembling furniture. Traditional guidance often relies on unstructured paper instructions that force people to switch between reading instructions and performing actions, resulting in a disjointed user experience. Recent Mixed Reality (MR) systems alleviate this problem by providing spatialized navigation, but they demand an authoring step and therefore cannot be easily adapted to general tasks. We propose MRPilot, an MR system empowered by Large Language Models (LLMs) and Computer Vision techniques that offers responsive navigation for general tasks without pre-authoring. MRPilot consists of three modules: a Navigation Builder Module that uses LLMs to generate structured instructions, an Object Anchor Module that exploits Computer Vision techniques to anchor physical objects with virtual proxies, and an Action Recommendation Module that gives responsive navigation according to users' interactions with physical objects. MRPilot bridges the gap between virtual instructions and physical interactions for general tasks, providing contextual and responsive navigation. We conducted a user study comparing MRPilot with a baseline MR system that also exploited LLMs; the results confirmed the effectiveness of MRPilot.
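As a rough illustration of the Navigation Builder idea, the sketch below prompts an LLM to convert a free-form task description into structured steps. The prompt wording, JSON schema, and llm_complete callable are illustrative assumptions, not the paper's actual prompt or API.

```python
# Hedged sketch: turn an unstructured task description into structured
# steps via an LLM. The schema and prompt here are assumptions.
import json

PROMPT_TEMPLATE = """Convert the following task description into a JSON list
of steps. Each step must have "instruction" (one sentence) and "objects"
(the physical objects it involves).

Task: {task}
JSON:"""


def build_navigation(task: str, llm_complete) -> list[dict]:
    """llm_complete is any callable mapping a prompt string to the model's
    text completion (e.g. a thin wrapper around a chat API)."""
    raw = llm_complete(PROMPT_TEMPLATE.format(task=task))
    steps = json.loads(raw)
    # Basic validation so malformed completions fail loudly.
    for step in steps:
        assert {"instruction", "objects"} <= step.keys()
    return steps


# Usage with a stubbed LLM response:
stub = lambda prompt: '[{"instruction": "Boil water", "objects": ["kettle"]}]'
print(build_navigation("Make tea", stub))
```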

User Interface

Preparation user interface

Preparation user interface. (a) The user provides task instructions via voice or text to initiate task preparation mode; note that this panel is also used in the Baseline system. (b) The user captures the scene for object recognition using the "Capture Your Scene" button. (c) This panel displays the generated instruction draft, which the user can either accept or discard. (d) The user enters Object Anchoring mode by clicking "Start Tracking", allowing MRPilot to anchor virtual visual cues above physical objects in the environment. After all objects are anchored, the user clicks the "Start MRPilot" button to begin task navigation.
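The anchoring flow in (d) implies some bookkeeping: automatically recognized objects can be anchored and confirmed directly, while unrecognized ones wait for manual placement before navigation can start. The sketch below captures that logic with hypothetical names (AnchorRegistry, auto_anchor, manual_anchor); it is not MRPilot's actual data structure.

```python
# Sketch of anchoring bookkeeping: recognized objects are anchored on user
# confirmation; the rest stay in a manual queue (the hand panel) until the
# user places them by gesture. Names and fields are illustrative.
from dataclasses import dataclass


@dataclass
class Anchor:
    label: str
    position: tuple[float, float, float]  # world-space coordinates
    confirmed: bool = False


class AnchorRegistry:
    def __init__(self, expected_objects: list[str]):
        self.anchors: dict[str, Anchor] = {}
        self.pending_manual = set(expected_objects)  # shown in the hand panel

    def auto_anchor(self, label: str, position, confirm: bool) -> None:
        """Called when vision recognizes an object and the user confirms."""
        if confirm:
            self.anchors[label] = Anchor(label, position, confirmed=True)
            self.pending_manual.discard(label)

    def manual_anchor(self, label: str, position) -> None:
        """Called when the user places an anchor by hand gesture."""
        self.anchors[label] = Anchor(label, position, confirmed=True)
        self.pending_manual.discard(label)

    def ready(self) -> bool:
        """'Start MRPilot' becomes available once nothing is pending."""
        return not self.pending_manual


# Usage:
reg = AnchorRegistry(["bowl", "whisk"])
reg.auto_anchor("bowl", (0.1, 0.0, 0.4), confirm=True)
reg.manual_anchor("whisk", (0.3, 0.0, 0.5))
print(reg.ready())  # True
```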

Task consumption mode user interface

Task consumption mode user interface. (a) Guidance overview panel. (b) The task navigation panel shows the current step and recommended steps. (c) Upon selecting a recommended step, the other recommended steps are discarded. (d) Feedback from the LLM as displayed in the Baseline system, where users can also regenerate the result using the interface in Figure 4 (a).
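One way to read panels (b)-(c) is that the navigator tracks a current step plus a set of recommended next steps, and selecting one recomputes the set, implicitly discarding the rest. The sketch below illustrates that behavior; the class name and the placeholder recommendation policy are assumptions, since the real system derives candidates from the user's interactions with physical objects.

```python
# Sketch of the select-one, discard-the-rest recommendation behavior.
class TaskNavigator:
    def __init__(self, steps: list[str]):
        self.steps = steps
        self.current = 0

    def recommendations(self) -> list[int]:
        # Placeholder policy: recommend the next two steps in order. The
        # actual system would rank candidates from observed interactions.
        return [i for i in (self.current + 1, self.current + 2)
                if i < len(self.steps)]

    def select(self, step_index: int) -> None:
        """The user picks one recommendation; the others are implicitly
        discarded because recommendations derive from the new current step."""
        assert step_index in self.recommendations()
        self.current = step_index


# Usage:
nav = TaskNavigator(["Prep", "Mix", "Bake", "Serve"])
print(nav.recommendations())  # [1, 2]
nav.select(2)
print(nav.steps[nav.current])  # Bake
```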

Experiments

NASA-TLX and SUS distribution results

The NASA-TLX (left) and SUS (right) distributions for MRPilot (without hatching) and Baseline (with hatching). The numbers within each bar segment represent the number of participants who selected the corresponding response option. Statistical significance is indicated above each bar segment (+ : .050 < p < .100, * : p < .050, ** : p < .010, *** : p < .001). A more comprehensive statistical analysis and boxplot figures are provided in the supplemental material.

Automatic vs Manual Switch Distribution

The distribution of automatic and manual switch proportions across individual participants. Percentages within each bar segment represent the relative frequency of each switch type reported by participants. Tasks completed by each participant using MRPilot are shown in parentheses. Detailed task assignments are provided in the supplemental material.

BibTeX

BibTeX Code Here