Meta-Research on Backdoors: Dataset and Threat Model Shifts in Multimodal Backdoor Attacks
Backdoor attacks enable adversaries to embed malicious behavior into machine learning models by poisoning training data with triggers. Research has largely focused on backdoors in unimodal models. However, the rise of multimodal systems, e.g., vision–language models (VLMs) and multimodal large language models (MLLMs), has significantly expanded the attack surface. Multimodal backdoors can exploit cross-modal triggers, representation-level manipulation, instruction-conditioned behaviors, and test-time activation pathways that are unavailable in unimodal models. Nevertheless, quantifying progress in this field remains challenging due […]