自動提示LLM-Text與LLM-Vision的應用指南

閱讀時間約 26 分鐘

Quick Links

  • Auto prompt by LLM and LLM-Vision (Trigger more details out inside model)
    • SD-WEB-UI: https://github.com/xlinx/sd-webui-decadetw-auto-prompt-llm
    • ComfyUI: https://github.com/xlinx/ComfyUI-decadetw-auto-prompt-llm
  • Auto msg to ur mobile (LINE | Telegram | Discord)
    • SD-WEB-UI :https://github.com/xlinx/sd-webui-decadetw-auto-messaging-realtime
    • ComfyUI: https://github.com/xlinx/ComfyUI-decadetw-auto-messaging-realtime
  • I'm SD-VJ. (share SD-generating-process in realtime by gpu)
    • SD-WEB-UI: https://github.com/xlinx/sd-webui-decadetw-spout-syphon-im-vj
    • ComfyUI: https://github.com/xlinx/ComfyUI-decadetw-spout-syphon-im-vj
  • CivitAI Info|discuss:
    • https://civitai.com/articles/6988/extornode-using-llm-trigger-more-detail-that-u-never-thought
    • https://civitai.com/articles/6989/extornode-sd-image-auto-msg-to-u-mobile-realtime
    • https://civitai.com/articles/7090/share-sd-img-to-3rd-software-gpu-share-memory-realtime-spout-or-syphon

SD-WEB-UI | ComfyUI | decadetw-Auto-Prompt-LLM-Vision


 

 

    

Update Log

  • [add|20240730] | 🟢 LLM Recursive Prompt
  • [add|20240730] | 🟢 Keep ur prompt ahead each request
  • [add|20240731] | 🟢 LLM Vision
  • [add|20240803] | 🟢 translateFunction
    • When LLM answered, use LLM translate result to your favorite language.ex: Chinese. It's just for your reference, which won't affect SD.
  • [add|20240808] | 🟠 Before and After script | exe-command
  • [add|20240808] | 🟠 release LLM VRAM everytimes

Motivation💡

  • Call LLM : auto prompt for batch generate images
  • Call LLM-Vision: auto prompt for batch generate images
  • Image will get more details that u never though before.
  • prompt detail is important

Usage

LLM-Text

  • batch image generate with LLM
    • a story
  • Using Recursive prompt say a story with image generate
  • Using LLM
    • when generate forever modeexample as follows figure Red-box.just tell LLM who, when or whatLLM will take care details.
    • when a story-board mode (You can generate serial image follow a story by LLM context.)its like comic booka superstar on stageshe is singingpeople give her flowera fashion men is walking.

LLM-Vision 👀

  • batch image generate with LLM-Vision
    • let LLM-Vision see a magazine
    • see series of image
    • see last-one-img for next-image
    • make a serious of image like comic

Before and After script

  • support load script or exe-command Before-LLM and After-LLM
  • javascript fetch POST method (install Yourself )
    • security issue, but u can consider as follows
    • https://github.com/pmcculler/sd-dynamic-javascript
    • https://github.com/ThereforeGames/unprompted
    • https://github.com/adieyal/sd-dynamic-prompts
    • https://en.wikipedia.org/wiki/Server-side_request_forgery
    • and Command Line Arg --allow-code

[🟢] stable-diffusion-webui-AUTOMATIC1111[🟢] stable-diffusion-webui-forge[🟢] ComfyUI1. SD-Prompt ✦1girl2.1 LLM-Text ✦2.2 LLM-Vision ✦a super star on stage.Who is she in image?2.3 LLM-Text-sys-prompt ✦2.4 LLM-Vision-sys-prompt ✦You are an AI prompt word engineer. Use the provided keywords to create a beautiful composition. Only the prompt words are needed, not your feelings. Customize the style, scene, decoration, etc., and be as detailed as possible without endings.You are an AI prompt word engineer. Use the provided image to create a beautiful composition. Only the prompt words are needed, not your feelings. Customize the style, scene, decoration, etc., and be as detailed as possible without endings.3. LLM will answer other detail ✦The superstar, with their hair flowing in the wind, stands on the stage. The lights dance around them, creating a magical moment that fills everyone present with awe. Their eyes shine bright, as if they are ready to take on the world.The superstar stands tall in their sparkling costume, surrounded by fans who chant and cheer their name. The lights shine down on them, making their hair shine like silver. The crowd is electric, every muscle tense, waiting for the superstar to perform4. Main Interface | sd-web-ui | ComfyUIComfyUI Manager | search keyword: auto

Usage

InputOutputLLM-Text: a superstar on stage.LLM-Vision: What's pose in this image?.(okay, its cool.)LLM-Text: a superstar on stage.LLM-Vision: with a zebra image(okie, cooool show dress. At least we don't have half zebra half human.)LLM-Text: a superstar on stage.(okay, its cool.)LLM: a superstar on stage.(Wow... the describe of light is great.)LLM: a superstar on stage.(hnn... funny, it does make sense.)CHALLENGELLM-vision:A Snow White girl walk in forest.(detect ur LLM-Vision Model IQ; if u didnt get white dress and lot of snow.... plz let me know model name)SD model: Flux.1 DLLM model: llava-llama-3.1-8bLLM model: Eris_PrimeV4-Vision-32k-7B-IQ3_XXS FLUX modelhnn...NSFW show. I'm not mean that, but not a wrong answer.(Trigger more details; that u never thought about it.)SD model: Flux.1 DLLM model: llava-llama-3.1-8bLLM model: Eris_PrimeV4-Vision-32k-7B-IQ3_XXSadvanced use | before-after-actionin fact, u can run any u want script | (storyboard) | random read line from txt send into LLMSpecial LLM LoopConnect 1st LLM-Text output to 2nd LLM-Text Input Special LLM Loop - keep each feature assign to different obj not mix it on one.LLM-Text output ask looply : here     [new tool 20240915] Civitai Prompt Grabberquick prompt from civitai. u can pick some prompt from another area model(ex indoor design or building model) with ur 1girl, ex: 1girl(up figure)) + in-door design model-prompt. then u will get full detail in background(bottom figure) : https://civitai.com/models/85691this is good present in FLUX model. trigger more detail in background. Make the photo getting more realistic feelingoption1. just quick append prompt from other model from civitai oroption2. of course u can send it into LLM too. [update] LLM-ask-LLM🌀[support] Cloud Service: Gemini Procloud service: https://generativelanguage.googleapis.com/v1model: gemini-1.5-flash (vision)support text and visionit will get more🌀 and more🌀 and more🌀 like....(bottom to top)

Usage Tips

  • tips1:
    • leave only 1 or fewer keyword(deep inside CLIP encode) for SD-Prompt, others just fitting into LLM
    • SD-Prompt: 1girl, [xxx,]<--(the keyword u use usually, u got usually image)
    • LLM-Prompt: xxx, yyy, zzz, <--(move it to here; trigger more detail that u never though.)
  • tips2:
    • leave only 1 or fewer keyword(deep inside CLIP encode) for SD-Prompt, others just fit into LLM
    • SD-Prompt: 1girl,
    • LLM-Prompt: a superstar on stage. <--(say a story)
  • tips3:
    • action script - Beforerandom/series pick prompt txt file random line fit into LLM-Text [read_random_line.bat]random/series pick image path file fit into LLM-Vision
    • action script - Afteru can call what u want commandex: release LLM VRAM each call: "curl http://localhost:11434/api/generate -d '{"model": "llama2", "keep_alive": 0}'" @Pdonorex: bra bra. Interactive anything.
  • tipsX: Enjoy it, inspire ur idea, and tell everybody how u use this.

Installtion

  • You need install LM Studio or ollama first.
    • LM Studio: Start the LLM service on port 1234. (suggest use this one)
    • ollama: Start service on port 11434 .
  • Pick one language model from under list
    • text base(small ~2G)
    • text&vision base(a little big ~8G)
  • Start web-ui or ComfyUI install extensions or node
    • stable-diffusion-webui | stable-diffusion-webui-forge:go Extensions->Available [official] or Install from URLhttps://github.com/xlinx/sd-webui-decadetw-auto-prompt-llm
    • ComfyUI: using Manager install nodeManager -> Customer Node Manager -> Search keyword: autohttps://github.com/ltdrdata/ComfyUI-Managerhttps://registry.comfy.org/https://ltdrdata.github.io/
  • Open ur favorite UI
    • Lets inactive with LLM. go~
    • trigger more detail by LLM

Suggestion software info list


Suggestion LLM Model

  • LLM-text (normal, chat, assistant)
    • 4B VRAM<2GCHE-72/Qwen1.5-4B-Chat-Q2_K-GGUF/qwen1.5-4b-chat-q2_k.ggufhttps://huggingface.co/CHE-72/Qwen1.5-4B-Chat-Q2_K-GGUF
    • 7B VRAM<8Gccpl17/Llama-3-Taiwan-8B-Instruct-GGUF/Llama-3-Taiwan-8B-Instruct.Q2_K.ggufLewdiculous/L3-8B-Stheno-v3.2-GGUF-IQ-Imatrix/L3-8B-Stheno-v3.2-IQ3_XXS-imat.gguf
    • Google-Gemmahttps://huggingface.co/bartowski/gemma-2-9b-it-GGUFbartowski/gemma-2-9b-it-GGUF/gemma-2-9b-it-IQ2_M.ggufsmall and good for SD-Prompt
  • LLM-vision 👀 (work with SDXL, VRAM >=8G is better )
    • https://huggingface.co/xtuner/llava-phi-3-mini-ggufllava-phi-3-mini-mmproj-f16.gguf (600MB,vision adapter)⭐⭐⭐llava-phi-3-mini-f16.gguf (7G, main model)
    • https://huggingface.co/FiditeNemini/Llama-3.1-Unhinged-Vision-8B-GGUFllava-llama-3.1-8b-mmproj-f16.gguf⭐⭐⭐Llama-3.1-Unhinged-Vision-8B-Q8.0.gguf
    • https://huggingface.co/Lewdiculous/Eris_PrimeV4-Vision-32k-7B-GGUF-IQ-Imatrix#quantization-informationquantization_options = ["Q4_K_M", "Q4_K_S", "IQ4_XS", "Q5_K_M", "Q5_K_S","Q6_K", "Q8_0", "IQ3_M", "IQ3_S", "IQ3_XXS"]⭐⭐⭐⭐⭐for low VRAM super small: IQ3_XXS (2.83G)in fact, it's enough uses.

Using Online LLM Service Setup example

OpenAI ChatGPT

  • In Auto-LLM Setup tab
    • LLM-URL=https://api.openai.com/v1
  • get ur api key from openAI : https://platform.openai.com/api-keys
    • LLM-API-KEY = xxxxxxxxxxxxxxxxxxxxxxx
    • LLM-Model-Name = gpt-3.5-turbo

Google Gemini

X Grok

claude.ai

Hugging face space

Javascript!

security issue, but u can consider as follows.

Buy me a Coca cola ☕

https://buymeacoffee.com/xxoooxx

Colophon

Made for fun. I hope if brings you great joy, and perfect hair forever. Contact me with questions and comments, but not threats, please. And feel free to contribute! Pull requests and ideas in Discussions or Issues will be taken quite seriously! --- https://decade.tw

avatar-img
0會員
1內容數
留言0
查看全部
avatar-img
發表第一個留言支持創作者!
你可能也想看
Google News 追蹤
Thumbnail
徵的就是你 🫵 超ㄅㄧㄤˋ 獎品搭配超瞎趴的四大主題,等你踹共啦!還有機會獲得經典的「偉士牌樂高」喔!馬上來參加本次的活動吧!
Thumbnail
隨著理財資訊的普及,越來越多台灣人不再將資產侷限於台股,而是將視野拓展到國際市場。特別是美國市場,其豐富的理財選擇,讓不少人開始思考將資金配置於海外市場的可能性。 然而,要參與美國市場並不只是盲目跟隨標的這麼簡單,而是需要策略和方式,尤其對新手而言,除了選股以外還會遇到語言、開戶流程、Ap
我們人類和ChatGPT的對話技巧也是需要學習的,有鑑於此,我想要一天分享一點「和ChatGPT對話的技術」,並且每篇文章長度控制在三分鐘以內,讓大家不會壓力太大,但是又能夠每天成長一點。 如果您對自動模擬中的細節不滿意,您可以使用一系列引導 Prompt 將對話引導至您喜歡的方式,以下範例示
Thumbnail
台灣也開放使用了! 你知道除了 ChatGPT、Gemini、claude 3.5等等AI工具之外,還有一個超好用的AI工具叫做NotebookLM嗎?
https://www.youtube.com/watch?v=wjZofJX0v4M 這是我看過最好的AI科普影片了;現在流行的GPT使用的大語言模型 (large language model, LLM), 是把每一個單字都當作一個高維度向量 影片中GPT3共儲存50257個英文單字, 每
Thumbnail
本文介紹了大型語言模型(LLM)中Prompt的原理及實踐,並提供了撰寫Prompt的基本框架邏輯PREP,以及加強Prompt撰寫的幾個方向:加強說明背景、角色描述和呈現風格,加強背景說明,角色描述,呈現風格以及目標受眾(TA)。同時推薦了幾個Prompt相關的參考網站。最後解答了一些快問快答。
我們知道AI的作法可以分為Supervised Learning、Unsupervised Learning、Reinforcement Learning,整題區分如下圖: 圖片出處:https://www.superannotate.com/blog/supervised-learning-an
Thumbnail
本篇文章分享了對創意和靈感來源的深入思考,以及如何將其轉化為實際的成果或解決方案的過程。透過學習、資料收集、練習、創新等方法,提出了將創意落實的思路和技巧。同時介紹了AI在外顯知識的自動化應用,以及對其潛在發展方向的討論。最後探討了傳統機器學習技術在模擬中的應用案例和對AI世界的影響。
這個頻道將提供以下服務: 深入介紹各種Machine Learning技術 深入介紹各種Deep Learning技術 深入介紹各種Reinforcement Learning技術 深入介紹Probabilistic Graphical Model技術 不定時提供讀書筆記 讓我們一起在未
Thumbnail
未來,針對圖片生成的 prompt engineering 可能會越來越不重要。
Thumbnail
這篇內容與你分享我看到哪些不錯的設計、AI 相關內容,像是我最近有看到 OpenAI 官方分享的 Prompt 教學,由官方分享絕對實用,另外也看到一篇創作者分享自己的一手印刷廠推薦心得,這真的非常難得,除了很多人會私藏外,要花心力整理也很不容易。
Thumbnail
徵的就是你 🫵 超ㄅㄧㄤˋ 獎品搭配超瞎趴的四大主題,等你踹共啦!還有機會獲得經典的「偉士牌樂高」喔!馬上來參加本次的活動吧!
Thumbnail
隨著理財資訊的普及,越來越多台灣人不再將資產侷限於台股,而是將視野拓展到國際市場。特別是美國市場,其豐富的理財選擇,讓不少人開始思考將資金配置於海外市場的可能性。 然而,要參與美國市場並不只是盲目跟隨標的這麼簡單,而是需要策略和方式,尤其對新手而言,除了選股以外還會遇到語言、開戶流程、Ap
我們人類和ChatGPT的對話技巧也是需要學習的,有鑑於此,我想要一天分享一點「和ChatGPT對話的技術」,並且每篇文章長度控制在三分鐘以內,讓大家不會壓力太大,但是又能夠每天成長一點。 如果您對自動模擬中的細節不滿意,您可以使用一系列引導 Prompt 將對話引導至您喜歡的方式,以下範例示
Thumbnail
台灣也開放使用了! 你知道除了 ChatGPT、Gemini、claude 3.5等等AI工具之外,還有一個超好用的AI工具叫做NotebookLM嗎?
https://www.youtube.com/watch?v=wjZofJX0v4M 這是我看過最好的AI科普影片了;現在流行的GPT使用的大語言模型 (large language model, LLM), 是把每一個單字都當作一個高維度向量 影片中GPT3共儲存50257個英文單字, 每
Thumbnail
本文介紹了大型語言模型(LLM)中Prompt的原理及實踐,並提供了撰寫Prompt的基本框架邏輯PREP,以及加強Prompt撰寫的幾個方向:加強說明背景、角色描述和呈現風格,加強背景說明,角色描述,呈現風格以及目標受眾(TA)。同時推薦了幾個Prompt相關的參考網站。最後解答了一些快問快答。
我們知道AI的作法可以分為Supervised Learning、Unsupervised Learning、Reinforcement Learning,整題區分如下圖: 圖片出處:https://www.superannotate.com/blog/supervised-learning-an
Thumbnail
本篇文章分享了對創意和靈感來源的深入思考,以及如何將其轉化為實際的成果或解決方案的過程。透過學習、資料收集、練習、創新等方法,提出了將創意落實的思路和技巧。同時介紹了AI在外顯知識的自動化應用,以及對其潛在發展方向的討論。最後探討了傳統機器學習技術在模擬中的應用案例和對AI世界的影響。
這個頻道將提供以下服務: 深入介紹各種Machine Learning技術 深入介紹各種Deep Learning技術 深入介紹各種Reinforcement Learning技術 深入介紹Probabilistic Graphical Model技術 不定時提供讀書筆記 讓我們一起在未
Thumbnail
未來,針對圖片生成的 prompt engineering 可能會越來越不重要。
Thumbnail
這篇內容與你分享我看到哪些不錯的設計、AI 相關內容,像是我最近有看到 OpenAI 官方分享的 Prompt 教學,由官方分享絕對實用,另外也看到一篇創作者分享自己的一手印刷廠推薦心得,這真的非常難得,除了很多人會私藏外,要花心力整理也很不容易。