Looking for feedback/beta users – applying RL ideas to real-time task execution via voice
We’re working on a system called Gennie that sits at an interesting intersection of reinforcement learning, human-in-the-loop systems, and noisy real-world environments.

The core problem we’re exploring is this: in real-world settings, users issue short, ambiguous, and sometimes incorrect commands (often via voice) under time pressure. The system must decide when to act, when to request confirmation, and when to do nothing, balancing speed against accuracy. The reward signal is rarely immediate; it is often delayed or implicit (task […]
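
To make the act / confirm / do-nothing trade-off concrete, here is a minimal sketch of a one-step expected-utility rule. It is not Gennie's actual policy and it is not RL: the confidence `p` is assumed to come from some upstream intent model, and every number in `Costs` is a hypothetical placeholder that a real system would have to learn from the delayed or implicit feedback described above.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ACT = "act"          # execute the interpreted command immediately
    CONFIRM = "confirm"  # ask the user before executing
    NOOP = "noop"        # ignore the utterance


@dataclass(frozen=True)
class Costs:
    # All values are illustrative assumptions, not measured quantities.
    reward_correct: float = 1.0  # value of completing the intended task
    cost_wrong: float = 4.0      # penalty for executing a misheard command
    cost_confirm: float = 0.2    # time/friction cost of asking the user
    cost_missed: float = 0.5     # penalty for ignoring a real command


def expected_utilities(p: float, costs: Costs) -> dict:
    """Expected utility of each action, given p = P(interpretation is correct).

    Assumes a confirmation is always answered truthfully, so a wrong
    interpretation is caught at the price of cost_confirm.
    """
    return {
        Action.ACT: p * costs.reward_correct - (1 - p) * costs.cost_wrong,
        Action.CONFIRM: p * costs.reward_correct - costs.cost_confirm,
        Action.NOOP: -p * costs.cost_missed,
    }


def choose_action(p: float, costs: Costs = Costs()) -> Action:
    """Pick the action with the highest expected utility."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    utilities = expected_utilities(p, costs)
    return max(utilities, key=utilities.get)


if __name__ == "__main__":
    for p in (0.98, 0.7, 0.05):
        print(p, choose_action(p).value)
```

With these placeholder costs the rule acts only above roughly 95% confidence, asks for confirmation in the middle range, and stays silent below roughly 13%. Raising `cost_confirm` (more time pressure) shrinks the confirmation band, which is the speed-versus-accuracy tension in miniature.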