This is version 1.0 with 100 more training steps on the same dataset. The results are much better overall: expect more consistency across generations, way better eye contact, and better nose action. It's a more stable version in general that doesn't waste generations. Still, the variety of motion is quite limited, and you get pretty much what you see in the gallery.
I'm planning two additional experiments to wrap up the cock sniffing journey:
A potential version trained on this same dataset, but describing the motions with different, equivalent wording in each image description. This should reduce the model's coupling to the sentences I repeat over and over in my example videos (see the sketch after this list).
A version with video data as part of the training set. My initial experiments here are promising, so again, follow to get notified when a new version drops.
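To make the first experiment concrete, here's a minimal sketch of the caption-variation idea: the same motion gets a randomly chosen, equivalent phrasing in each image description, so the LoRA doesn't bind to one fixed sentence. The phrase list and template below are hypothetical, not the actual dataset captions.

import random

# Hypothetical equivalent phrasings for the same motion; the real dataset
# captions are not these.
MOTION_PHRASES = [
    "sniffing it up close",
    "taking a close sniff of it",
    "smelling it closely",
]

def vary_caption(template: str, placeholder: str = "{motion}") -> str:
    """Fill the motion placeholder with a randomly picked equivalent phrase,
    so each image description words the same action differently."""
    return template.replace(placeholder, random.choice(MOTION_PHRASES))

# The same template yields differently worded captions per image:
template = "a woman {motion}, keeping eye contact with the camera"
for _ in range(3):
    print(vary_caption(template))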
Note: I'm looking for a way to convert the OneTrainer Hunyuan LoRA to ComfyUI's format. If someone has guidelines on how to do that, please drop a message. In the meantime, for those having trouble using it in Comfy, do this to fix it (it basically renames some LoRA keys):
In ComfyUI's lora.py file (comfy/lora.py), search for the line
if isinstance(model, comfy.model_base.HunyuanVideo):
and paste the following inside that conditional block:
diffusers_keys = comfy.utils.flux_to_diffusers(model.model_config.unet_config, output_prefix="diffusion_model.")
for j in diffusers_keys:
    if j.endswith(".weight"):
        to = diffusers_keys[j]
        key_map["transformer.{}".format(j[:-len(".weight")])] = to  # simpletrainer and probably regular diffusers flux lora format
        key_map["lycoris_{}".format(j[:-len(".weight")].replace(".", "_"))] = to  # simpletrainer lycoris
        key_map["lora_transformer_{}".format(j[:-len(".weight")].replace(".", "_"))] = to  # onetrainer