AWE USA 2025
Modern mixed and augmented reality (MR/AR) experiences can perform below expectations, or even pose safety risks, when deployed in unknown real-world environments: for instance, virtual content may appear misaligned with the real world, or may block the user's view of important real-world objects. Fortunately, modern vision-language models (VLMs), such as GPT-4o and Claude 3, have sufficiently powerful real-world understanding capabilities to enable AR/MR experience quality monitoring and evaluation. They can be coupled with fast-acting edge-computing-based mechanisms that respond rapidly when improperly positioned AR/MR content may significantly degrade user performance or endanger the user. This talk describes our recently developed VLM-based approach to AR/MR experience quality evaluation and showcases an implementation of this approach that ensures AR/MR virtual content does not block the user's view of important real-world objects. The talk is based on research funded by an NSF AI Institute and by a DARPA Young Faculty Award.
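As a rough illustration of the kind of check such a system performs (this sketch is hypothetical, not the talk's actual implementation; the prompt, function names, and JSON schema are illustrative assumptions), a captured MR frame can be packaged into a vision-language-model query asking whether virtual overlays occlude safety-critical real-world objects:

```python
import base64
import json

# Illustrative prompt; the actual evaluation criteria used in the
# research are not specified in this abstract.
OCCLUSION_PROMPT = (
    "This image shows a user's mixed-reality view. Does any virtual "
    "overlay block the user's view of an important real-world object "
    "(e.g., a person, vehicle, stairway, or obstacle)? Reply with JSON: "
    '{"blocked": true or false, "object": "<name or null>"}'
)

def build_occlusion_query(frame_jpeg: bytes, model: str = "gpt-4o") -> dict:
    """Package an MR frame into a chat-completion request body asking a
    VLM whether virtual content occludes important real-world objects."""
    b64 = base64.b64encode(frame_jpeg).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": OCCLUSION_PROMPT},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    }

def parse_verdict(reply_text: str) -> bool:
    """Return True if the VLM judged that virtual content blocks an
    important object. Assumes the model followed the JSON instruction."""
    return bool(json.loads(reply_text).get("blocked", False))
```

The request body could then be sent to an OpenAI-compatible chat-completions endpoint; on a positive verdict, a fast edge-side mechanism (as described in the talk) could reposition or hide the offending content without waiting on a full cloud round trip.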