This may be the best.
The model was trained on photorealistic images of fit, attractive women. It should work with all types of captions, but the training dataset did not contain any nudity.