The image captions in this version of the training dataset are a mix of booru tags and natural language.
X, Note, Pixiv