A Chinese researcher has compiled an extensive survey of datasets used in image-to-video (I2V) adversarial attack and defense research. The survey groups the datasets into seven categories: face images, art/style, general images, videos, adversarial robustness, multimodal/edit, and security evaluation. It also includes a cross-matrix that links datasets to specific research papers. The resource is most useful to AI security researchers and ML engineers working on adversarial robustness in video and multimodal models. It reflects a growing need for standardized benchmarks in AI security, a field that is evolving quickly as generative models become more prevalent. For developers outside China, it points to an opportunity to explore I2V security challenges and contribute to open benchmarks. The post is a curated reference rather than a tutorial, which makes it suitable for a topic page that can be updated as new datasets emerge.
In short, the post surveys datasets used in I2V adversarial attack and defense research, covering face, art, general image, video, robustness, multimodal, and security evaluation datasets. It cross-references them against related papers in a matrix, giving researchers a single point of reference. Its appearance suggests that AI security is maturing into a field that requires structured benchmarks.