promptlyVectorDB

This is a vector database containing roughly 900k prompts from successful txt2img generations taken from the Civitai database. The data has been processed to remove the majority of prompts involving children or potentially toxic content.

The database is a Chroma database with vectors for the cleaned prompt, the positive prompt, and the negative prompt. The cleaned prompt is a processed version of the positive prompt, meant to remove extraneous punctuation and prompting artifacts.

The metadata for each vector includes the base model, the NSFW level of the prompt (nsfw vs None), and the imageId associated with the prompt.
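To see which metadata values actually appear in a set of retrieved results, a small tally helper can be useful. The following is a minimal sketch, not part of the resource: the Hit class is a hypothetical stand-in for a retrieved document, and the key names and values in the sample data (baseModel, the model names, the image IDs) are invented for illustration, so check them against your own results.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Hit:
    """Hypothetical stand-in for a retrieved document (text plus metadata dict)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def tally_metadata(hits):
    """Count how often each value occurs for every metadata key."""
    counts = {}
    for hit in hits:
        for key, value in hit.metadata.items():
            counts.setdefault(key, Counter())[str(value)] += 1
    return counts


# Sample data is invented for illustration; real key names and values may differ.
sample_hits = [
    Hit("a cute kitten, soft light", {"baseModel": "SD 1.5", "nsfw": "safe", "imageId": 1}),
    Hit("kitten on a sofa", {"baseModel": "SDXL 1.0", "nsfw": "safe", "imageId": 2}),
]

metadata_counts = tally_metadata(sample_hits)
for key, counter in metadata_counts.items():
    print(key, dict(counter))
```

Any object exposing a metadata dictionary can be passed in the same way, which makes this a quick check on which nsfw values are stored before choosing a filter.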

How to use:

Download the resource and unzip it locally.

Run the following code in a notebook:

!pip install langchain_community chromadb sentence-transformers  ## assuming you're working from a notebook; from a terminal, remove the leading !
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings

## Load the persisted vector store with all-MiniLM-L12-v2 embeddings (the same model used to build the vector DB)
embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L12-v2")
vectorstore = Chroma(embedding_function=embedding_function,
                     persist_directory=LOCALVECTORDB)  ## LOCALVECTORDB: path to the unzipped database folder

## Input some base prompt
basePrompt = "A cute kitten"

## Retrieve the top k=5 similar prompts
context = vectorstore.similarity_search("cleanedPrompt [TOPICKEY] " + basePrompt,
                                        filter={"nsfw": "safe"}, k=5)

Note: The database contains both SFW and NSFW text. To get NSFW text, toggle the nsfw filter to None.

Users can then use the retrieved prompts to build context for a large language model or for other uses.
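One way to turn retrieved prompts into LLM context is sketched below. This is an illustrative assumption rather than a prescribed workflow: build_context, its parameters, and the instruction wording are hypothetical names chosen for this example. It drops empty and duplicate prompts, numbers the rest, and appends the request.

```python
def build_context(base_prompt, retrieved_prompts, max_examples=5):
    """Assemble a plain-text LLM prompt from retrieved example prompts."""
    seen = set()
    examples = []
    for text in retrieved_prompts:
        cleaned = text.strip()
        # Skip empty strings and case-insensitive duplicates.
        if cleaned and cleaned.lower() not in seen:
            seen.add(cleaned.lower())
            examples.append(cleaned)
        if len(examples) == max_examples:
            break

    lines = ["Example prompts from successful generations:"]
    lines += [f"{i}. {prompt}" for i, prompt in enumerate(examples, start=1)]
    lines.append("")
    lines.append(f"Write an improved txt2img prompt for: {base_prompt}")
    return "\n".join(lines)


# Invented sample prompts, standing in for real search results.
print(build_context("A cute kitten", ["a kitten", "A kitten ", "", "fluffy cat"]))
```

With the search results from the snippet above, the call would look like build_context(basePrompt, [doc.page_content for doc in context]). It is worth inspecting the returned text first, since how each stored entry is formatted is not documented here.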

Model Type	Other
Base Model	Other
Published	2024-05-01
