Learn To Master Prompt Engineering With This Singular (Triple) Framework
An essay on the art and science of using generative AI tools
My latest essay was about the importance of learning, and mastering, prompt engineering. Although the term was coined a while ago, it's become trendy recently. How could it be otherwise when prompt engineering goes hand in hand with generative AI? If the latter is a set of tools, prompt engineering is the means we have to communicate with those tools.
Given the value users are finding in systems like ChatGPT, it’s trivial to conclude prompt engineering is just as important. The tool is worth nothing without the means to extract the value within. You don’t want a calculator without knowing arithmetic. You don’t want a piano without musical theory lessons. And you don’t want a computer without the skill to navigate the operating system, the browser, and the web.
Interestingly, this isn't what happens in practice with prompt engineering. The relationship between generative AI and prompt engineering is rather special. If you lack the right skills, it's not straightforward to notice, because you'll always get something back, even if it isn't the best result. It's hard to tell whether the issue lies in the system's shortcomings or in the user's lack of knowledge.
And it gets more complicated because prompt engineering is not just one homogeneous thing. Not one “language.” It’s substantially varied; it encompasses everything from the obscure modifiers you can add to Stable Diffusion prompts for tiny changes to the stylish prose that stimulates ChatGPT’s latent space in just the right amount and direction to get a beautiful poem or a short essay.
As I explained in my earlier post, prompt engineering has similarities with code (e.g. it allows communication with machines), but it also has features specific to natural languages (e.g. there’s ambiguity and room for interpretation). It’s also akin to no-code tools in that it’s more intuitive and user-friendly than programming languages, but it’s superior due to its versatility and flexibility.
Although some people overuse those analogies, prompt engineering is (for now at least) something different. Something new. Something in need of conceptualization. And nothing is more helpful for that than the right framework. Code and natural language are close-enough metaphors for what prompt engineering is but they can’t teach you how to think about prompt engineering. (And neither can a quick advice-and-tips guide.)
Assuming that, for many of you, prompt engineering is just this abstract idea of talking to machines, I wrote this essay as an attempt to help you make the concept concrete. (It was longer initially, but I’ve decided to split it into parts that I’ll publish later.) It's a descriptive analysis of the triple framework I use to think about and understand prompt engineering, and ultimately to approach it the right way. I have no choice but to use (imperfect) analogies that I hope will be sufficiently useful in helping you make sense of all this.
A clarification before we begin: This isn't your typical prompt engineering practical guide. It’s something you’ll rarely find elsewhere. You may value it more or less depending on how you learn new things. I personally think a good framework and mindset are critical, all the more so when we’re talking about an inscrutable tech that promises to be revolutionary.
This isn’t a prompt engineering quick “how-to” practical guide—but there’s huge value in those, too
This article isn’t about tips and tricks for prompts. It isn’t about what descriptors work with Midjourney or Stable Diffusion. And it isn’t about what words make ChatGPT break its filters and allow you to make it say bad things. There’s a lot of value in learning practical prompting techniques (and even in learning to jailbreak the systems, if only to be aware of their failure modes), but there are plenty of those kinds of resources out there already.
If you’re exclusively interested in the hands-on aspects of prompt engineering and feel no need to understand it more deeply, I encourage you to stop reading this blog post, find a good course (like this one), and follow experts who share their knowledge online (like Riley Goodside, among many others).
This is something much less common and, in my theory-is-important view, arguably more valuable. As I see it, value is a direct function of the scope of the lessons learned, not necessarily of the immediacy of their applicability. Adapting a popular Chinese proverb to our topic: A specific hands-on guide is like giving you a fish to eat today. A great framework is like teaching you to fish so you can eat forever.
If you only learn the practical aspects of prompt engineering you may never understand the big picture—the whys and hows. You’ll have just short-sighted heuristics you'll have to update every month. The ideas I’ll share below (and in future posts) will help you be better prepared to communicate with any generative AI tool—not just the one that’s popular this week.
Which do you prefer: learning a couple of quick pieces on the piano, or spending a few years learning musical theory to master any piece that may come your way? Practice matters either way, but the approach is radically different. Depending on your answer to that question, this essay will be more or less valuable to you.
In case it’s not clear: Practice is extremely important. There’s no way you’ll get better at prompt engineering without interacting with AI tools often. What I argue here is that practice isn’t sufficient by itself. I’m giving you a framework to conduct your experiments and tests in a better light.
You may excel at the habit, but without a method you’ll be lost.
A triple framework: Three useful, familiar disciplines to think about prompt engineering
The basic idea is to use disciplines you’re familiar with (even if at a superficial level) and combine them to develop a multidimensional lens to think about prompt engineering. As I wrote above, prompt engineering isn’t like anything else, so my approach consists of merging three areas and finding the sweet spot that makes my mind go “click.” That opens my perspective and, as a by-product, makes those other practical “tips-and-tricks” guides much more valuable. The triple framework is the child of psychology, physics, and art.