Abstract: The multimodal feature fusion in the NestedFormer network architecture considers only feature fusion among the individual modalities, and it cannot solve the problem of insufficient mutual feature ...