I want to share a little of the "technology stack underlying LLMs" each day, keeping every article under three minutes so nobody feels pressured, yet everyone still grows a bit daily.
A quick recap of where we are:
After using the Embedding model to find the most similar text, we need to feed it back in as part of the Prompt. But first, let's write a function that counts tokens; this helps both on the engineering side (every model has a maximum supported token count) and on the cost side:
import tiktoken

GPT_MODEL = "gpt-4-turbo"

def num_tokens(text: str, model: str = GPT_MODEL) -> int:
    """Return the number of tokens in `text` for the given model."""
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))
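If you want to see the idea of checking text against a token limit without installing tiktoken, here is a minimal sketch. The whitespace tokenizer `approx_num_tokens` and the limit `MAX_CONTEXT` are stand-ins I made up for this demo, not real tiktoken counts:

```python
def approx_num_tokens(text: str) -> int:
    # Stand-in tokenizer: splits on whitespace. Real BPE token counts
    # from tiktoken will differ (roughly ~4 characters per token in English).
    return len(text.split())

MAX_CONTEXT = 128  # hypothetical context limit, for illustration only

draft = "the quick brown fox jumps over the lazy dog " * 10
print(approx_num_tokens(draft))                  # 90 stand-in tokens
print(approx_num_tokens(draft) <= MAX_CONTEXT)   # True
```

The same pattern applies with the real `num_tokens`: measure before you send, and you can both avoid exceeding the model's context window and estimate the request's cost.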
Next, let's write a Query function. This function must account for the constraint mentioned above: the model's maximum supported token count:
def query_message(query: str, df: pd.DataFrame, model: str, token_budget: int) -> str:
    """Build a prompt from the most relevant sections, staying within token_budget."""
    # strings_ranked_by_relatedness() comes from the earlier embedding step:
    # it returns article sections sorted by similarity to the query.
    strings, relatednesses = strings_ranked_by_relatedness(query, df)
    introduction = 'Use the below articles on the 2022 Winter Olympics to answer the subsequent question. If the answer cannot be found in the articles, write "I could not find an answer."'
    question = f"\n\nQuestion: {query}"
    message = introduction
    for string in strings:
        next_article = f'\n\nWikipedia article section:\n"""\n{string}\n"""'
        # Stop appending once the next section would exceed the token budget.
        if num_tokens(message + next_article + question, model=model) > token_budget:
            break
        else:
            message += next_article
    return message + question
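To make the budget loop concrete without the tiktoken and embedding dependencies, here is a self-contained sketch. `count_tokens` (a whitespace split) and `build_message` are simplified stand-ins for `num_tokens` and `query_message` above, and the section texts are invented:

```python
def count_tokens(text: str) -> int:
    # Stand-in for the tiktoken-based num_tokens above.
    return len(text.split())

def build_message(query: str, sections: list[str], token_budget: int) -> str:
    # Greedily append the highest-ranked sections until the budget is hit.
    introduction = "Answer using the sections below."
    question = f"\n\nQuestion: {query}"
    message = introduction
    for section in sections:
        next_part = f"\n\nSection:\n{section}"
        if count_tokens(message + next_part + question) > token_budget:
            break
        message += next_part
    return message + question

sections = ["alpha " * 20, "beta " * 20, "gamma " * 20]  # already ranked
msg = build_message("What is alpha?", sections, token_budget=60)
print("alpha" in msg, "beta" in msg, "gamma" in msg)  # True True False
```

Note that the check includes `question`: the budget has to cover the introduction, every appended section, and the question itself, because all of them land in the same prompt.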