Abstract: Multi-modal and cross-modal retrieval has garnered increasing attention from researchers recently, owing to its potential to transcend the limitations imposed by traditional retrieval ...
Abstract: As a pioneering vision-language model, CLIP (Contrastive Language-Image Pre-training) has achieved significant success across various domains and a wide range of downstream vision-language ...