Balanced CLIP (1M)
Model description
Training CLIP-G took more than 15 kWh of energy; CLIP-L took far less, under 1 kWh.
The full negative reinforcement (Cosine Dissimilarity) is available on my Hugging Face. It was paired with a positive reinforcement (Contrastive Loss) that uses the full frozen vision model in latent space.
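As a rough illustration of how the two terms might fit together, here is a minimal sketch in plain Python. It is not the actual training code: the function names, the InfoNCE-style form of the contrastive term, the `temperature` value, and the `neg_weight` factor are all assumptions made for illustration. The image embeddings stand in for the output of the frozen vision model, so they are treated as constants.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def contrastive_loss(text_embs, image_embs, temperature=0.07):
    """Positive term (assumed InfoNCE-style): text i should match image i.

    image_embs are assumed to come from the frozen vision model,
    so only the text side would receive gradients in real training.
    """
    total = 0.0
    for i, t in enumerate(text_embs):
        logits = [cosine(t, v) / temperature for v in image_embs]
        m = max(logits)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_sum - logits[i]  # cross-entropy against index i
    return total / len(text_embs)

def cosine_dissimilarity_loss(text_embs, negative_embs):
    """Negative term (assumed form): mean cosine similarity to the negatives.

    Minimizing this pushes each text embedding away from its negative.
    """
    sims = [cosine(t, n) for t, n in zip(text_embs, negative_embs)]
    return sum(sims) / len(sims)

def balanced_loss(text_embs, image_embs, negative_embs, neg_weight=1.0):
    """Hypothetical combination of the positive and negative terms."""
    positive = contrastive_loss(text_embs, image_embs)
    negative = cosine_dissimilarity_loss(text_embs, negative_embs)
    return positive + neg_weight * negative
```

In a real setup the same structure would be written with a tensor library so that gradients flow into the text encoder while the vision model stays frozen; the plain-Python version only shows the arithmetic.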

