Prompt Templates¶
In order to assess the effectiveness of prompts later on, it helps to have a good way of storing and managing them.
We could just use Python f-strings.
F-strings¶
The code below creates a function that is essentially a prompt template using f-strings. You can then pass in a dictionary of values to fill in the blanks. The chat_response
function's job is simply to take the prompt as input and return the model's response.
```python
import os

import dotenv
from openai import OpenAI

# Load the API key from .env before creating the client
dotenv.load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

client = OpenAI()
```
```python
from typing import Any

def generate_prompt(args: dict[str, Any]) -> str:
    prompt = (
        f"You are a helpful and whimsical poetry assistant.\n"
        f"Please generate a {args['length']} poem in a {args['style']} style "
        f"about a {args['theme']}.\n"
    )
    return prompt
```
```python
def chat_response(prompt: str) -> str:
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content
    return response
```
```python
prompt = generate_prompt(
    {
        "length": "short",
        "style": "Haiku",
        "theme": "Samurai cat"
    }
)
print(prompt)
```
```
You are a helpful and whimsical poetry assistant.
Please generate a short poem in a Haiku style about a Samurai cat.
```
```python
print(chat_response(prompt))
```
```
Silent whiskers twitch,
Moonlit blade in paws of grace,
Fierce heart, purring peace.
```
This is fine, but it helps to separate our prompts from the main code. This way we can take advantage of version control, and limit risks such as accidentally changing prompts or leaking them to the public (prompts can be highly sought-after IP).
So instead, we will use a popular library called jinja2. This is a templating engine that allows us to separate our prompts from our code. We can then use the jinja2 library to render our prompts at runtime.
Jinja2¶
The rabbit hole for Jinja2 goes deep, but here we will primarily be using it for input templating. First, we create a separate folder for our prompts and create a new file called poetry_prompt.jinja:
```jinja
You are a helpful and whimsical poetry assistant.
Please generate a {{ length }} poem in a {{ style }} style about a {{ theme }}.
```
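To see the substitution mechanics in isolation, you can also render a template string directly with Jinja2's Template class, without any file loading (the toy string below is just for illustration):

```python
from jinja2 import Template

# Render a template string directly; render() returns the filled-in string
toy = Template("Please generate a {{ length }} poem about {{ theme }}.")
print(toy.render(length="short", theme="a Samurai cat"))
# Please generate a short poem about a Samurai cat.
```

For real projects, the file-based approach below is preferable, since it keeps prompts out of the code entirely.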
We write a function to render this prompt:
```python
from jinja2 import Environment, FileSystemLoader, select_autoescape

def load_template(template_filepath: str, arguments: dict[str, Any]) -> str:
    env = Environment(
        loader=FileSystemLoader(searchpath="./"),
        autoescape=select_autoescape()
    )
    template = env.get_template(template_filepath)
    return template.render(**arguments)
```
The details of creating the Environment object and autoescaping are not important here; if you want to find out more about them, check out the Jinja2 documentation.
If you've seen any LangChain prompt templates before, you'll recognize the way that we can pass in variables to the template:
```python
prompt = load_template(
    "prompts/poetry_prompt.jinja",
    {
        "length": "short",
        "style": "haiku",
        "theme": "a Samurai cat"
    }
)
print(prompt)
```
```
You are a helpful and whimsical poetry assistant.
Please generate a short poem in a haiku style about a a Samurai cat.
```
We can then feed this into our model as before.
```python
prompt = load_template(
    "prompts/poetry_prompt.jinja",
    {
        "length": "short",
        "style": "haiku",
        "theme": "Samurai cat"
    }
)
response = chat_response(prompt)
print(response)
```
```
Silent paws in dusk,
Moonlit blade in furrowed grass—
Honor's whiskered grace.
```
Haiku Checker¶
The importance of metrics cannot be overstated here, so we will quickly demonstrate how we can use a simple metric to assess performance. Fortunately, we know that the structure of Haiku poems is quite rigid:
- Every Haiku has 3 lines
- The lines have 5, 7, and 5 syllables respectively (17 phonetic on).
Any Haiku expert will quickly be annoyed by this oversimplification! Haiku also contain other features, such as a kigo (seasonal reference) and a kireji (cutting word), and traditional Haiku do not strictly adhere to this syllable structure (in fact, "syllables" is really a misnomer for the Japanese on), but we'll keep things simple for now!
We can first make sure that we have three lines, which is easy, and we can use the pysyllables library to count the number of syllables in each line of the poem. Counting syllables is actually quite a challenging problem, but pysyllables is a good start. Just be aware that it may not be perfect.
```python
from pysyllables import get_syllable_count
import numpy as np

def is_haiku(response: str) -> bool | str | list[int]:
    # break into lines
    lines = response.split("\n")
    # make sure it has 3 lines
    if len(lines) != 3:
        return False
    # strip punctuation from each word
    lines = [[word.strip(".,!?-:;—") for word in line.split()] for line in lines]
    # count syllables in each line; get_syllable_count returns None for words
    # missing from its dictionary, which makes sum() raise a TypeError
    try:
        syllables = [sum(get_syllable_count(word) for word in line) for line in lines]
    except TypeError:
        return "Error: could not count syllables due to missing word in dictionary"
    # check for the 5, 7, 5 pattern
    syllable_check = np.array([5, 7, 5]) == np.array(syllables)
    if syllable_check.all():
        return True
    return syllables
```
```python
is_haiku(response)
```
```
True
```
For the example given above, if we count the syllables in each line, it is indeed a Haiku, and our function confirms this. Here is a function that will take a list of themes for the Haiku, and generate a Haiku for each theme:
```python
def haiku_check(themes: list[str]) -> list[tuple[str, bool | str | list[int]]]:
    responses = []
    for theme in themes:
        prompt = load_template(
            "prompts/poetry_prompt.jinja",
            {
                "length": "short",
                "style": "haiku",
                "theme": theme
            }
        )
        response = chat_response(prompt)
        responses.append((response, is_haiku(response)))
    return responses
```
```python
for response, check in haiku_check(themes=["a Samurai dog", "a Kung Fu panda", "a Ninja squirrel", "a Pirate monkey"]):
    print(response)
    print(check)
    print("-" * 10)
```
```
Fur like fallen leaves,
In armor of dreams he stands,
Honor in each paw.
True
----------
Panda leaps with grace,
Dreams of warriors awake,
Strength in every face.
[5, 7, 6]
----------
Silent in the trees,
Ninja squirrel leaps with grace,
Shadow in the breeze.
True
----------
Pirate monkey swings,
A treasure chest of bananas,
On the high seas sings.
[5, 8, 5]
----------
```
Again, some of these judgments may not be correct, because counting syllables is not easy.
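As an illustration of why syllable counting is hard, a common rough heuristic is to count runs of consecutive vowels. The sketch below is our own toy implementation, not part of pysyllables, and it is easily fooled by silent-e endings and words like "poem":

```python
import re

def rough_syllable_count(word: str) -> int:
    # Count runs of consecutive vowels as a crude syllable estimate;
    # every word gets at least one syllable
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

print(rough_syllable_count("samurai"))   # 3 ("a", "u", "ai")
print(rough_syllable_count("whiskers"))  # 2 ("i", "e")
```

A heuristic like this could serve as a fallback when a word is missing from the pysyllables dictionary, at the cost of more noise in the metric.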
🔴 Caution❗¶
At this point, it is worth pointing out the challenge of metrics. Evaluation is obviously important - how do you know if your model is performing as intended? But working with LLMs is not like working with traditional ML models, which have well-established metrics. You will often have to find your own metrics, or create them bespoke to your use case. This is an active area of research in LLM evaluation, and there is no one-size-fits-all solution.
We advocate for evaluation-driven LLM development - think about your metrics early and often, and build your systems with this in mind.