promptlyVectorDB
Model description
This is a vector database containing roughly 900k prompts from successful txt2img generations, taken from the Civitai database. The data has been processed to remove the majority of prompts involving children or potentially toxic content.
The database is a Chroma database with vectors for the cleaned prompt, the positive prompt, and the negative prompt. The cleaned prompt is a processed version of the positive prompt, with extraneous punctuation and prompting artifacts removed.
The metadata for each vector includes the base model, the NSFW level of the prompt (nsfw vs None), and the imageId associated with the prompt.
How to use:
Download the resource and unzip it locally.
Run the following code in a notebook:
## Assuming you're working from a notebook. If working from a terminal, remove the leading "!".
## chromadb and sentence-transformers are added here because the two imports below depend on them.
!pip install langchain_community chromadb sentence-transformers
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings
## Load the persisted vectorstore with all-MiniLM-L12-v2 embeddings (the same ones used to MAKE the vectorDB)
embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L12-v2")
## LOCALVECTORDB is the path to the unzipped database folder
vectorstore = Chroma(embedding_function=embedding_function, persist_directory=LOCALVECTORDB)
## Input some base prompt
basePrompt = 'A cute kitten'
## Retrieve the top k=5 similar prompts
context = vectorstore.similarity_search("cleanedPrompt [TOPICKEY] " + basePrompt, filter={"nsfw": "safe"}, k=5)
Note: The database contains both SFW and NSFW text. To get NSFW text, change the nsfw filter to None.
Users can then use the retrieved prompts to build context for a large language model or for other uses.
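As a rough illustration of the "build context" step, here is a minimal sketch that turns retrieved prompts into a numbered block of examples for a large language model. It assumes each retrieved item exposes page_content and metadata, as LangChain documents do. The Doc class, the build_context helper, and the instruction wording are hypothetical and not part of this resource. Doc is only a stand-in so the snippet runs without the database.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document (text plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def build_context(docs, base_prompt, max_examples=5):
    """Format retrieved prompts as numbered examples for an LLM request.

    Hypothetical helper: collapses whitespace and drops case-insensitive
    duplicates, keeping at most max_examples prompts.
    """
    seen = set()
    examples = []
    for doc in docs:
        text = " ".join(doc.page_content.split())  # collapse runs of whitespace
        key = text.lower()
        if not text or key in seen:
            continue  # skip empty or duplicate prompts
        seen.add(key)
        examples.append(text)
        if len(examples) == max_examples:
            break
    lines = ["Example prompts from successful generations:"]
    lines += [f"{i}. {t}" for i, t in enumerate(examples, start=1)]
    lines.append(f"Write a new txt2img prompt in the same style for: {base_prompt}")
    return "\n".join(lines)


# Made-up sample data for illustration; in practice, pass the list
# returned by the similarity search instead.
retrieved = [
    Doc("a cute kitten, fluffy fur,  soft lighting", {"nsfw": "safe"}),
    Doc("A cute kitten, fluffy fur, soft lighting", {"nsfw": "safe"}),
    Doc("kitten sleeping in a basket, detailed", {"nsfw": "safe"}),
]
llm_input = build_context(retrieved, "A cute kitten")
print(llm_input)
```

The resulting string can be sent as the user message to whichever language model you prefer. Deduplication is included because near-identical prompts may come back from a similarity search, though how often that happens with this database has not been checked here.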




