Mastering Prompt Engineering: A Multi-Level Approach to Problem-Solving with AI
Beyond Templates and Rules: A Dialogue-Based Framework for Exploring AI's Full Potential
Learning prompt engineering is like studying the English language: It's easy to start and speak it badly, but hard to truly master it. Guidebooks with phrases can help you get around as a tourist, but won’t enable you to speak convincingly at a conference. Likewise, guides on prompt engineering typically list several tips about "what makes a good prompt" - be precise, give context, use the phrase “act as…”. Those are all good strategies to write a better prompt within a narrow problem context - but they will not get you to discover the full capabilities of Large Language Models (LLMs).
The key to mastering prompt engineering isn’t following fixed recipes or rules; it lies in understanding that prompting is an intuitive, open-ended dialogue: In this dialogue, you use introspection and feedback to successively move into higher levels of abstraction. This dialogue-based approach serves as a powerful tool to either simplify or expand the task at hand, offering a flexible and enjoyable approach to problem-solving.
(Recursive disclosure: This article was written using the approach it describes, in a dialogue between ChatGPT (GPT-4) and me.)
Writing is coding
Your internal monologue is a mix of thoughts, feelings and associations. Whenever you put something on paper (or a terminal, or a microphone), you are turning the nebulous constructs in your head into a concrete shape. This is an act of coding - a distillation of neural processes into the best interface us humans have to the world: natural language.
LLMs use the same interface: They have learned the connections between billions of tokens and take natural language as input. This means prompts - the commands we write into terminals - are lines of code, a programming language from our brains to artificial neural networks, whose inner workings are as hard to explain as our own brains'. This principle is fundamental to effective prompt engineering with LLMs.
Programming as social process
The process goes both ways: Just as our prompts program the context window of an LLM, any output coming from a terminal is a coded instruction for your brain. If you communicate with an LLM, the LLM is literally prompting you to respond. This makes problem solving with an LLM a social process of human-machine interaction.
One of the best memories from my teenage years is drum school: My teacher was a middle-aged jazz musician with round glasses and a large handlebar moustache - think Frank Zappa meets John Lennon. In some lessons, we just jammed for 45 minutes on two drum sets, just letting the beats flow and discovering interesting rhythms through a process of listening, continuation and breaking of drum patterns. Working with LLMs reminds me of that.
Just like in a music jam, when you try to solve a problem with an LLM, you are exploring both intuition and regularity – feeling and convention – to create something larger than what two individual musicians could produce alone. You're pair programming: using the agile technique of two programmers sitting at one workstation and developing a program together. In this process, both parties profit from each other's internal patterns and knowledge, continuously learning and improving their results.
Abstraction as leverage
At the beginning, some problems seem clearly defined, but once you try to solve them, you will find blurry areas and blind spots – aspects of the question that are not as sharply in focus as expected. In LLM interactions, these manifest as short, unspecific requests such as “Write me a presentation on topic X”, leading to vague, general, mediocre responses (fluffy buzzwords and trivialities). But every iteration of LLM pair programming deepens, sharpens or expands the problem definition.
Abstracting a problem, i.e., casting a wider net to understand the problem space, will not only help you to fill the gaps, but to solve bigger problems. It becomes a lever that enhances your current problem-solving capacity while also multiplying the range and scope of problems you can address. You can do more with less, which is a core principle of classical programming: You condense repetitive rules into loops, functions or objects to flexibly solve bigger problems with fewer lines of code. Now you can do that with natural language too. Let's look at what that means in practice with LLMs.
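The programming analogy can be made concrete. A minimal Python sketch (the pronoun table and function name are invented for illustration): instead of writing one concrete instruction per word, you abstract the rule into a function that covers the whole problem space with a single line of logic.

```python
# Concrete, repetitive approach: one hard-coded instruction per case.
# "kimi is informal", "anata is neutral", "omae is very informal", ...

# Abstracted approach: one rule plus data covers the whole problem space.
REGISTERS = {"kimi": "informal", "anata": "neutral", "omae": "very informal"}

def pronoun_register(word: str, registers: dict[str, str]) -> str:
    """Look up the politeness register of a Japanese second-person pronoun."""
    return registers.get(word, "unknown")

print(pronoun_register("kimi", REGISTERS))  # informal
```

Adding a new pronoun now means adding one data entry, not a new instruction - the same leverage that abstracting a prompt gives you in natural language.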
Step 1. Starting somewhere
Where you start doesn't matter much, as long as you have a sense of where you want to go. Your problem can be a simple question like "What does this Japanese word mean in English?" or an unspecific instruction like "Give me ideas for a new app". Most casual users of LLMs use this approach, and sometimes it works well enough to solve your problem - at least apparently. Mostly it isn't sufficient, and it's also the approach with the greatest risk of generalization and hallucination, because the AI has no context to understand what you're talking about; it can only give you generalizations and make stuff up.
User: What does “kimi” mean in Japanese?
LLM: The word "kimi" in Japanese is a pronoun often used to mean "you." It is informal and is typically used by males to address people of equal or lower social status, such as friends or younger people. However, the context and the relationships between the speakers can affect the nuance and appropriateness of using "kimi."
Step 2. Clarifying what you know and mean
Take the output your LLM gave you and scrutinize it: Is it what you were looking for, or did you expect more clarity, depth or nuance? If it doesn't meet your expectations, use a change of perspective: Assume that, out of the blue, you were asked that question by a total stranger without any further information - what questions would you ask back? For example, “What's the sentence in which that Japanese word appears?”, or even “Who are you, and why are you asking me that?”
Answers to these questions can then form the context of your prompt: A stable position to anchor assumptions and activate the right knowledge centers. You specify what you mean based both on the question and the answer you received. You cite the paragraph with the word to be translated, and you list the reasons why you need app ideas. This is the realm of simple prompt templates: Templates that help you structure your prompts. For instance, you could specify the persona, audience and type of output.
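Such a template is easy to sketch in code. The field names below (persona, audience, output type) mirror the ones just mentioned; the helper function itself is a hypothetical illustration, not a standard library.

```python
def build_prompt(persona: str, audience: str, output_type: str, question: str) -> str:
    """Assemble a structured prompt from simple template fields."""
    return (
        f"Act as {persona}. "
        f"Your answer is for {audience}. "
        f"Respond with {output_type}. "
        f"Question: {question}"
    )

prompt = build_prompt(
    persona="a teacher for Japanese",
    audience="an English-speaking beginner",
    output_type="a short explanation with an example sentence",
    question='What does "kimi" mean in "kimi wa itsu kaette kuru ka"?',
)
print(prompt)
```

The point is not the code itself but the structure: once the fields are explicit, you notice immediately which context you have left blank.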
User: act as a teacher for Japanese. What does "kimi" mean in the context of the following Japanese sentence: "kimi wa itsu kaette kuru ka"
LLM: In the given Japanese sentence, "kimi" means "you" and is used to refer to the person being spoken to, often in an informal and intimate manner. The sentence "kimi wa itsu kaette kuru ka" can be translated as "When will you come back?" or "When are you returning?"
Step 3. Acknowledging what you don't know
After a couple of rounds on steps 1 and 2, you have exhausted your resources for giving context to your LLM. This is where most prompt engineering guides stop helping you. Context is what you know, and for the next step you need to think about what you don't know. What is missing? The great thing is that you have someone who knows more at your side – your LLM – so use the power of AI to help you answer that question as well.
This can be done in a separate chat window or by editing an earlier prompt ('overwriting'), which lets you branch off from your original conversation. For our examples, you could ask your LLM to explore “what is important for a great Japanese-to-English translation”. The 'Flipped Interaction' pattern, as used in the example below, allows you to explore your unknowns right out of the box.
Armed with this new knowledge, you can refine the context of your prompt or narrow the question to a sharper scope.
User: act as a teacher for Japanese. I want to find out what the word "kimi" means in the context of the following Japanese sentence: "kimi wa itsu kaette kuru ka". Assume that this sentence is used in a business context of two women discussing in the office. Ask me 3 follow-up questions to clarify the context, let me answer, then give your explanation.
LLM: Here are three clarifying questions: 1) Are the two women of the same rank or is one superior to the other? 2) Do they have a close, informal relationship, or is their interaction typically formal and professional? 3) Is this phrase being used in a casual conversation or in a formal discussion or meeting?
User: 1) The two persons are of the same rank, 2) they don't have a close relationship, 3) it's a casual conversation in a corridor without other people being present.
LLM: Given the context, the use of "kimi" in this scenario could be interpreted as informal and potentially inappropriate, as it implies a level of familiarity or casualness not present between the two coworkers. The speaker might be perceived as lacking in professionalism or courtesy.
Step 4. Challenging the original question
Now the answers you get from your LLM will probably sound quite good. But they are still resting on the foundation of your personal assumptions. How sound are they? How myopic is your approach? Are you even solving the right problem?
Here you're testing the limits of the question itself. This is best done by asking the LLM to take alternative or critical viewpoints to challenge both the answers you liked and the question you asked. This will add at least broader context and instructions to the question. You might even discover a much more interesting question, leading to a new set of prompts.
One approach I like: in a separate prompt, create five personas with completely different viewpoints and have them criticize my input and the LLM's output in verbatim statements and suggest improvements.
User: Now assume this sentence has been used in the context of a Manga scene using the same protagonists and business context. Write a scenario in which it makes sense. Give me a short scene script with some dialogue.
LLM: Two women, YUMI and SAKI, colleagues of the same rank, pass by each other in a narrow, quiet corridor. They are not close friends, just acquaintances.
YUMI (slightly annoyed, without making eye contact)
Kimi wa itsu kaette kuru ka? (When will you be back?)
SAKI (raising an eyebrow, taken aback by Yumi's informal address) Eh? Nande sonna ni buaisou ni hanashite iru no? (Why are you being so brusque?)
YUMI (turning her head slightly, avoiding eye contact) Hmph, wakatta. (Hmph, ok ok.)
They pass by each other, leaving an air of tension lingering in the corridor.
Loop like LEGO
Think about these four steps not as a fixed sequence but as modular components—much like LEGO blocks—that you can rearrange and iterate through in your prompting process. Sometimes, just iterating between steps 1 and 2 is enough. Other times, a loop into step 3 helps you to make an important breakthrough. Sometimes, intuition will lead you to step 4 and burn the whole prompt house down to create something new.
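The four steps can be sketched as a loop. In this Python sketch, `ask_llm` and `satisfied` are placeholders for whatever model client and stopping criterion you use - assumptions for illustration, not a real API.

```python
def refine(problem: str, ask_llm, satisfied, max_rounds: int = 5) -> str:
    """Iterate the four prompting steps until the answer is useful
    or the iteration budget runs out.

    ask_llm:   callable str -> str, a stand-in for any chat-model call
    satisfied: callable str -> bool, your judgment of "good enough"
    """
    answer = ask_llm(problem)  # Step 1: start somewhere
    for _ in range(max_rounds):
        if satisfied(answer):  # stop when the answer solves the problem
            return answer
        # Step 3: acknowledge what you don't know - ask the LLM for the gaps
        missing = ask_llm(f"What context is missing to answer well: {problem}")
        # Step 2: clarify what you know - fold the new context back in
        problem = f"{problem}\nContext: {missing}"
        answer = ask_llm(problem)
        # Step 4: challenge the original question itself
        problem = ask_llm(f"Is this the right question? Improve it: {problem}")
    return answer
```

As in the LEGO metaphor, the blocks inside the loop can be reordered or skipped; the sketch only fixes one possible sequence.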
In the end, how many iterations you go through is a matter of usefulness, available resources, and curiosity. If the goal is purely problem-solving, you'd cease iterating once the problem is solved. If, however, the aim is to explore and create, the cycle is potentially unending. But the bigger and trickier the problem, the harder it will be to stay on the level of given knowledge and assumptions – especially if you’re trying to counteract your own and your LLM’s biases.
Why This Matters
The true value of this methodology isn't just in resolving queries but in reshaping them. This multi-layered, iterative process transforms your interaction with LLMs from mere question-answering to complex problem-solving and even problem-redefining. It's akin to agile software development: a dynamic, constantly evolving dialogue that pushes the limitations of Generative AI. Give it a try.
-Thanks to Mala Mukherjee Suess, Yan Shikhvarger & Achim Wolter for your help & feedback!
EDIT 2023-09-26: Corrected an English translation in the last prompt example. The original text used to read "I'll be back after the meeting, if you must know."




