PRAVAR
AI Accessibility · 2025 — Present

AI Accessibility Extension

Grant-funded ASU ETX accessibility research project. I am building a Chrome MV3 extension and AWS-backed service that lets blind and low-vision students control STEM simulations with voice.

Role: Student developer, with technical/product/faculty guidance
Timeline: Oct 2025 — Present
Status: Active research project

  • 7 simulation profiles in scope
  • Grant-funded research project
  • Voice + narrated feedback prototype
The Hook

A lot of STEM simulations teach through moving graphs, sliders, and visual state changes. If that state never reaches a screen reader, the lab becomes much harder to do independently.

Business Use Case

The goal is to avoid rebuilding accessibility support from scratch for every simulation. If the adapter/profile approach works, ETX can support another sim by writing a detector and profile for that sim instead of redesigning the whole voice-and-narration workflow.

Problem

ETX runs interactive STEM simulations across SimCAPI-based HTML/JS sims and Unity WebGL sims. For many of them, a screen reader does not expose the changing graph, animation, or simulation state in a way that lets a blind or low-vision student follow the lab independently. I needed a browser-based approach that could observe sim state, explain what changed, and eventually turn a voice command into a safe sim action.

Approach

I am building the extension and backend with close technical guidance from Kevin Segovia, who helps me think through architecture, feasibility, and the weird parts of simulation integration. Mia Brunkhorst helps keep the work grounded in Argos/Torus, QA, and real course-delivery needs. The mini-grant research runs under Chris Mead's faculty guidance, so I am also trying to build this in a way that can be evaluated instead of just demoed.

Architecture
  1. Chrome MV3 extension — TypeScript content scripts watch for SimCAPI events, DOM changes, and sim-specific state that can be turned into narration.
  2. AWS Lambda backend — TypeScript endpoint that receives sim state and a user command, calls the LLM, and returns either narration or a structured action plan.
  3. Simulation profiles — each supported sim gets a profile that describes what state matters, what actions are allowed, and what context the assistant needs.
  4. Terraform setup — used so the AWS pieces can be recreated without clicking through the console each time.
  5. Accessibility output — Web Speech API and ARIA live regions for narration, with the current work focused on not interrupting the user's screen reader flow.
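To make the profile idea concrete, here is a minimal sketch of what a simulation profile could look like. The field names and the example sim are illustrative assumptions, not the project's actual schema.

```typescript
// Hypothetical shape of a simulation profile. Field names are
// illustrative, not the project's real schema.
interface SimProfile {
  id: string;                       // stable identifier for the sim
  engine: "simcapi" | "unity";      // which adapter family applies
  watchedState: string[];           // state keys worth narrating
  allowedActions: string[];         // actions a voice command may trigger
  contextHint: string;              // background the assistant needs
}

// Example profile for a hypothetical pendulum sim.
const pendulumProfile: SimProfile = {
  id: "pendulum-lab",
  engine: "simcapi",
  watchedState: ["length", "angle", "period"],
  allowedActions: ["setLength", "release", "reset"],
  contextHint: "A pendulum sim where period depends on string length.",
};
```

The point of keeping the profile declarative is that supporting a new sim means writing data like this plus a detector, rather than new narration logic.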
Challenges & Decisions

Two simulation engines, one schema

The SimCAPI sims expose useful state through JavaScript events, while Unity/WebGL sims are much less transparent. I am handling that by writing sim-specific adapters that emit a normalized event shape, but the Unity side is still the harder part because not everything is exposed cleanly.
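A sketch of what "a normalized event shape" could mean in practice: two adapter functions that map engine-specific payloads into one shared type. The payload fields and the Unity bridge (a JSON string posted out of the WebGL player) are assumptions for illustration.

```typescript
// One possible normalized event shape shared by both adapter families.
interface SimEvent {
  simId: string;
  key: string;        // which piece of state changed
  value: unknown;     // the new value
  source: "simcapi" | "unity";
  timestamp: number;
}

// SimCAPI side: map a raw value-change payload (fields assumed) into
// the shared shape.
function fromSimCapi(simId: string, raw: { name: string; value: unknown }): SimEvent {
  return { simId, key: raw.name, value: raw.value, source: "simcapi", timestamp: Date.now() };
}

// Unity side: assume some bridge emits a JSON string; parse it into the
// same shape so everything downstream is engine-agnostic.
function fromUnityMessage(simId: string, json: string): SimEvent {
  const parsed = JSON.parse(json) as { key: string; value: unknown };
  return { simId, key: parsed.key, value: parsed.value, source: "unity", timestamp: Date.now() };
}
```

Everything downstream (narration, action planning) can then consume `SimEvent` without caring which engine produced it.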

Iframe sandboxes

The first idea was to read sim state directly from the parent page, but that breaks down when the sim lives inside an iframe. I moved toward content scripts inside the sim frame and `postMessage` back to the extension. The tradeoff is that every sim needs more careful permission and failure handling.
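The frame-to-extension hop can be sketched as a typed message plus a `postMessage` call. The message type string and payload fields here are invented for illustration, not the project's real protocol.

```typescript
// Hypothetical message the in-frame content script sends upward.
interface FrameMessage {
  type: "SIM_STATE";
  simId: string;
  changes: Record<string, unknown>;
}

function buildStateMessage(simId: string, changes: Record<string, unknown>): FrameMessage {
  return { type: "SIM_STATE", simId, changes };
}

// Inside the sim iframe's content script, something like:
//   window.parent.postMessage(buildStateMessage("pendulum-lab", { angle: 30 }), "*");
//
// The receiver must check event.origin and the message type before trusting
// anything, since a "*" target origin accepts any listening parent. That
// verification is part of the per-sim permission and failure handling
// mentioned above.
```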

Latency vs. depth

Vision-based calls are useful when the structured state is missing something, but they are slower and more expensive. The current direction is to prefer structured sim state first and only fall back to heavier interpretation when the adapter cannot explain what changed.
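The "structured first, vision as fallback" decision can be reduced to a small pure function. The delta shape and field names are assumptions made for this sketch.

```typescript
// What the adapter reports after a state change: which changed keys it
// could describe from structured state, and which it could not.
interface StateDelta {
  explainedKeys: string[];
  unexplainedKeys: string[];
}

type Interpretation = "structured" | "vision-fallback";

function chooseInterpretation(delta: StateDelta): Interpretation {
  // Prefer the cheap, fast path whenever the adapter accounted for every
  // change; only pay for a slower vision call when something is left
  // unexplained.
  return delta.unexplainedKeys.length === 0 ? "structured" : "vision-fallback";
}
```

Keeping the decision this explicit also makes it easy to log later, which feeds the telemetry need noted under "What I'd Change".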

Screen-reader cooperation

I cannot just dump narration into the page whenever state changes. If the extension talks over the screen reader, it becomes annoying fast. I am using ARIA live regions and queueing rules, but this still needs user testing before I would call it solved.
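One way the queueing rules could look: batch pending narration, drop exact repeats, and flush everything as a single utterance into an `aria-live="polite"` region so the screen reader can pick its moment. The sink is injectable here so the queue logic stays testable; all names are illustrative.

```typescript
// Sketch of a narration queue that batches updates instead of firing an
// announcement per state change.
class AnnouncementQueue {
  private pending: string[] = [];

  enqueue(text: string): void {
    // Drop consecutive exact repeats so rapid identical updates don't stack.
    if (this.pending[this.pending.length - 1] !== text) {
      this.pending.push(text);
    }
  }

  // Flush everything as one utterance. In the extension the sink would
  // write to liveRegion.textContent on an element with aria-live="polite".
  flush(sink: (text: string) => void): void {
    if (this.pending.length === 0) return;
    sink(this.pending.join(". "));
    this.pending = [];
  }
}
```

With `aria-live="polite"`, the screen reader announces the flushed text after finishing what it is currently reading, rather than being interrupted mid-sentence.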

Results
  • Current prototype is being built around 7 simulation profiles across SimCAPI and Unity/WebGL-style sims.
  • The core loop is working in prototype form: observe sim state, generate narration, and map selected voice commands into structured actions.
  • This is the current technical prototype for the ETX accessibility mini-grant work.
What I'd Change
  • This still needs telemetry before it can be evaluated seriously: latency, failed commands, confusing intents, and cases where narration is too noisy.
  • I need automated tests across the simulation profiles because manually testing every sim after each extension change does not scale.
  • The biggest open question is how well the adapter pattern holds up when a simulation hides important state inside canvas/WebGL.
Stack

Backend

AWS Lambda · TypeScript · API Gateway

AI

GPT-4o · Structured action planning · Selective vision fallback

Extension

Chrome MV3 · TypeScript · Web Speech API · ARIA live regions

Infra

Terraform · AWS