Augmented reality (AR) refers to a novel human-computer interface
in which computer-generated (virtual) output is superimposed on a
real scene. A properly designed augmentation can thus greatly enhance
a person's perception of the surrounding world. Note that this differs
from virtual reality (VR), which aims to create an entirely
artificial world that replaces the user's perception of the real
one. To be effective, an AR display needs to be sensitive to the
current state of the surrounding real world as the user interacts with
it. Thus, a rich sensing modality such as computer vision is arguably
essential for AR. Another key issue is how to develop systematic
augmentation schemes. We address these issues for the "assembly
domain," where a multimedia augmentation guides a human in assembling
an industrial object. We use concepts from robotic assembly planning
to partition the possible states of the world and to develop a
systematic augmentation scheme for guiding assembly. To address the
sensing problem, as a first step, we developed a system of markers for
visually tracking multiple objects in real time. We have built a
prototype AR system for the assembly domain. Other domains where AR
is useful include computer-assisted surgery, medical diagnosis,
manufacturing, training, and education. Two fundamental problems
remain to be addressed: how to develop computer vision techniques
suited to the interactive context of AR, and how to formalize the
flow of augmentation information based on a partial interpretation
of the scene.