2023-07-20
Artificial Intelligence, Information Processing | Computing, Research
Yiyuan Zhang, Kaixiong Gong, Kaipeng Zhang, Hongsheng Li, Yu Qiao, Wanli Ouyang, and Xiangyu Yue propose Meta-Transformer, a framework that uses a frozen encoder to perform multimodal perception without any paired multimodal training data.