simplified version (the dataset has not changed)
dataset of 12 game screenshots
use the trigger words in your prompt