Bandit and Delayed Feedback in Online Structured Prediction
arXiv:2502.18709v3 Announce Type: replace-cross Abstract: Online structured prediction is a task of sequentially predicting outputs with complex structures based on inputs and past observations, encompassing online classification. Recent studies showed that in the full-information setting, we can achieve finite bounds on the textit{surrogate regret}, textit{i.e.,}~the extra target loss relative to the best possible surrogate loss. In practice, however, full-information feedback is often unrealistic as it requires immediate access to the whole structure of complex outputs. Motivated by this, […]