Modality Gap
The modality gap in CLIP-style models: separated cones for image and text, why it appears, and whether to separate or represent.
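
One common way to put a number on the gap is the distance between the centroids of the L2-normalized image and text embeddings. The sketch below illustrates that measurement under stated assumptions: the embeddings are synthetic stand-ins (two narrow cones around random directions) rather than outputs of a real CLIP model, and the helper names `l2_normalize`, `modality_gap`, and `mean_cosine` are hypothetical, not part of any library.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length, as CLIP-style models do before comparing."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def modality_gap(image_emb, text_emb):
    """Euclidean distance between the centroids of the normalized embeddings."""
    img_center = l2_normalize(image_emb).mean(axis=0)
    txt_center = l2_normalize(text_emb).mean(axis=0)
    return float(np.linalg.norm(img_center - txt_center))


def mean_cosine(a, b):
    """Average cosine similarity over all pairs (row of a, row of b)."""
    return float((l2_normalize(a) @ l2_normalize(b).T).mean())


# Synthetic stand-ins for real embeddings (an assumption for illustration):
# each modality is a tight cluster around its own random direction.
rng = np.random.default_rng(0)
dim = 64
img_axis = rng.normal(size=dim)
txt_axis = rng.normal(size=dim)
image_emb = img_axis + 0.3 * rng.normal(size=(500, dim))
text_emb = txt_axis + 0.3 * rng.normal(size=(500, dim))

gap = modality_gap(image_emb, text_emb)
within = mean_cosine(image_emb, image_emb)  # high: images share one cone
across = mean_cosine(image_emb, text_emb)   # lower: the cones point apart

print(f"centroid gap: {gap:.3f}")
print(f"mean cosine within images: {within:.3f}, across modalities: {across:.3f}")
```

With real model outputs, `image_emb` and `text_emb` would be the encoder embeddings for a set of paired images and captions. A high within-modality cosine alongside a lower cross-modality cosine is the signature of two separated cones.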