46 Comments
imthinkingthethoughts:

Alberto, if this is some subtle marketing for your new prompt masterclass, sign me up hahaha. In all seriousness, there must be common factors that make a good prompt, particularly within classes of model, e.g., LLMs, LRMs, etc. Very keen to hear more about improving prompts, considering this will be such an impactful skill over the next year or two. I've noticed that some people truly can pull off what seems like magic with these tools, whereas others are left summarising news articles or YouTube videos...

Alberto Romero:

Yes, those prompt wizards are real. It wasn't a metaphor. The way they navigate the latent space with what looks like gibberish is otherworldly. Just like world-class programmers do.

Thankfully I don't do courses. They twist the incentives. I don't want to sell anyone anything. Just provide high-quality analyses!

But be sure I'll return to this topic again. Not sure how far into "how-to" territory I'll go, though. It will depend on whether I see people are completely lost on how to use these things or not.

imthinkingthethoughts:

Brilliant. "Magic" prompting for LRMs is currently incredibly hard to learn or access (considering it's basically an unknown unknown for so many people), so I'm sure we readers will be delighted. From my knowledge, one dumps as much relevant info as possible with instructions at the start and a clear outline of the results wanted. It seems OpenAI is clocking onto this too, with the way Deep Research is seemingly hard-programmed to attempt to query the user for further prompting information...
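
As a rough sketch, that structure might be assembled like this (the labels and helper function are illustrative, not from any official guide):

```python
# Illustrative template: relevant context first, then instructions,
# then a clear outline of the desired result.
def build_prompt(context: str, instructions: str, output_spec: str) -> str:
    return (
        f"CONTEXT:\n{context}\n\n"
        f"INSTRUCTIONS:\n{instructions}\n\n"
        f"DESIRED OUTPUT:\n{output_spec}"
    )

prompt = build_prompt(
    context="Full notes, transcripts, or source material pasted here...",
    instructions="Identify the three biggest risks and justify each one.",
    output_spec="A numbered list, one sentence per risk.",
)
```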

Alberto Romero:

Right. That's the correct approach. That behavior of asking you back to fill in the contextual gaps is probably part of the system prompt (not really hard-coded into it, but kinda the same idea). It's a good addition. With that and what you already know, you're good to go. What's left for you is probably just experimenting.

Tyler Corderman:

Wonderful. Thank you!

Would anybody be open to sharing impactful resources connected to prompt engineering? I appreciate it very much.

Geoffe:

This article is pretty much all I’ve seen personally, but it’s pretty easy to apply the advice contained within: https://www.latent.space/p/o1-skill-issue?utm_campaign=post&utm_medium=web

Yeah though, I’d love to be tagged in any resources that come recommended by the community or Alberto as well!

This article certainly made me hungry to avoid being one of the "AI poor" haha. I'm determined not to let this tech trend fall through my fingers like that $20 of sub-$1 Bitcoin I got tipped by someone on Reddit all those years ago.

Alberto Romero:

Oh god - is that true about those bitcoins... Many such cases, as they say. I read that Latent Space post. Very good one. They're worth following for this kind of stuff, although normally they're too technical for my audience. But anyway, I will write more on this because it's growing in importance. Knowing how to interact with AI will be a differentiating factor in the long run.

Tyler Corderman:

Geoffe and Alberto,

Thank you, both, for the comments and advice. Little by little, we will learn and find our way in this exciting space. And just to mention it, to ensure it is not overlooked, Alberto, you already share an abundance of wisdom and knowledge in your pieces and I am grateful for that! I look forward to all that is ahead. Thank you, again.

Alberto Romero:

Thank you for being part of this community, Tyler!!

Tyler Corderman:

Geoffe, thank you for sharing the excellent resource!

Looking back at perceivably "missed opportunities," hopefully they encourage and inspire us to move ahead with curiosity and wonder for the moments and experiences to come, the next "bitcoin," so to speak. The world continues to change, so it seems, while also repeating patterns in a beautiful and cyclical nature.

Carol George-Rucker:

I'm fortunate to have been a product tester for a certain AI before they released it to the public. I learned prompt engineering through trial and error.

I love Gemini's Deep Research AI. (I didn't realize ChatGPT had one.) I also have fun with NotebookLM. Two AI personalities use multimodal input to create fun podcasts about anything.

Alberto Romero:

Right! Trial and error is the best way to internalize what works and what doesn't.

Res Nullius:

Well done, that graph is conceptually evocative.

I tend to trust a teacher more when they exhibit the ability to change their mind as they encounter new information.

Massimiliano Turazzini:

"Carry that graph with you everywhere" → Done!

I tested it twice last week with two groups of managers, and it was incredibly effective in illustrating the importance of well-structured prompting and how Gen AI, when used responsibly, can elevate anyone’s capabilities.

Beyond that, I believe it's also a powerful tool for generalizing and explaining the emergence of agentic AI (like Deep Research), which amplifies productivity even further, and the distance from 'where we are' at the baseline, which creates a lot of opportunities and pains. It would also be great to find a way to represent the jerk in these curves caused by the constant acceleration in performance.

I'll be back after thinking it over and sharing this concept: I'll start using it in my AI workshops for managers (especially those new to AI), and I'll definitely write a blog post soon citing you!

Thank you, Alberto—this is truly a milestone!

Alberto Romero:

That's awesome, Massimiliano!! I'm glad people find it useful, even those not versed in AI. That's the end goal - to help those who don't know this technology yet understand the intuitions behind it. Let me know how it goes!

Massimiliano Turazzini:

Here's the output of my foggy Sunday afternoon: https://maxturazzini.substack.com/p/thinking-slow-fast-exponential-is

Will double-check this week live with 50 people.😊

Thank you, and please let me know if I mentioned your work correctly.

naveen:

I think we will see a larger divergence between general models and specialised models, like those for reasoning. If we get a new model called gpt-5o, I think one of its properties will be a window of relevance similar to that of 4o. Maybe gpt-5o will have a higher default capability than Deep Research's default capability. I think o3 deep research has a window of relevance that extends much further because it is more specialised, not as a direct result of a higher g factor itself.

Alberto Romero:

I think it's both: specialization + higher g-factor. If OpenAI trains GPT-5, it will be deployed across types of models (GPT-5o, o4 with GPT-5, and a new Deep Research with o4), so the point stands. I'm not comparing just types of models but *the best models* of a given type.

Rogerio Moreira:

Hey Alberto, how about prompt portability? As our knowledge of prompt engineering for a certain model increases, would such expertise be directly applicable to a competitor's model?

I'm concerned about the shallowness of jumping back and forth between the latest and greatest model, whether DeepSeek, Sonnet, Gemini...

Alberto Romero:

Mostly yes. It transfers across models of the same generation. Not necessarily across generations/types of model (e.g., chatbot vs. reasoner vs. agent).

Pascal Montjovent:

I noticed something interesting: whenever I try to clearly explain what I want to an AI system, I realize I wasn't so clear about it myself. Like looking in a digital mirror that reflects back all my fuzzy thoughts. As I keep at it, my ideas get clearer and my goals more precise. My thinking gets better - but sometimes I wonder if I'm just becoming more machine-like in how I think.

Alberto Romero:

Exactly. Just like writing. It's a great exercise. People too readily dismiss the value of trying to make an AI model do what they want.

Mykola Rabchevskiy:

In a pair of a person + LLM, the person has the intellect, and the LLM has the data. A prompt generated by the person compensates for the lack of intellect in the LLM system. In a pair of a person + a hairdresser, both sides have intellect, so there is no need to learn how to compose a prompt for a hairdresser (doctor, auto mechanic, ...).

J. M. Van Tassel:

Thank you for this informative article—made me want to learn to write prompts. I'm a fiction writer. I'd like help with plotting a novel and writing chapter & scene outlines. Any ideas about where I can go to Prompt School? (Not a computer nerd but am a damn good writer!)

Alberto Romero:

It takes practice, so I'd start with that. As for who can teach you - the problem is that many people in the space are interested in making you think you need more training than you actually do. They want to sell you their courses. You gotta avoid them. Perhaps the best Substack blog on this is Ethan Mollick's. He gives very actionable advice on this stuff. I mostly don't, so I'd encourage you to subscribe to him. Good luck! (PS: if you're a good writer, you're already halfway there!)

J. M. Van Tassel:

Thanks...again! (Will check out Mollick)

Paul:

Just asking it to ask clarifying questions if necessary does most of the prompting by itself. It's not that complicated.

Alberto Romero:

Perhaps you're using it suboptimally without knowing. There's no reason to underestimate the value of learning how to interact with AI systems.

Paul:

If you mean suboptimally as in "not the best you can get out of the LLM", I totally agree with you. But if optimality means the best trade-off between the time spent crafting a prompt and getting the desired results, then this technique, together with putting three apostrophes before and after a quotation, did most of the magic back when we were all studying the prompting guides in late 2023 and early 2024 and trying to lure the LLM into 'pretending' it had some kind of specialist role.
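
A minimal sketch of that combo, triple-apostrophe delimiters plus an up-front invitation to ask clarifying questions (all wording here is illustrative):

```python
# Delimit the quoted material with triple apostrophes so the model
# can't confuse it with the instructions, and invite clarifying
# questions before it answers.
document = "...the case text, pasted verbatim..."

prompt = (
    "If anything below is ambiguous, ask me clarifying questions "
    "before answering.\n\n"
    f"'''\n{document}\n'''\n\n"
    "Task: summarise the obligations this text imposes."
)
```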

For context: I work in a very specialized and niche field in a Swiss canton with its own legislation and structures. Most of the knowledge in the field is bound to people who have worked in it long-term, while formalized, publicly available knowledge is scarce due to the low number of cases. The question of formalizing knowledge for future reference (a must for any potential use of AI in its current form) is always posed against the backdrop of the resources available in this context. Writing a handbook entry that is formally discussed and reviewed for one case every year or two doesn't make sense. Therefore context is king here.

So yeah, maybe I'm biased by my narrow Swiss perspective, but I have a feeling that exactly this, the uniqueness of specialized circumstances and the difficulty of applying existing knowledge to them, is where LLMs' usefulness and adoption hit an actual wall. Outside of academia, with its strongly formalized methods and broadly shared culture (same for programming), or outside of very large legislative bodies and big corporations, contexts are in general just way too small and unique for an LLM to understand.

And this prompt is exactly what helps me assess whether it's time well spent tinkering with the LLM until it understands my specific contextual needs and produces something great, or whether I should just move on and do the work myself, rather than wasting time explaining to the LLM the trivial specifics it needs for a result I'm already vaguely aware of how to achieve. Seeing which questions it needs clarified in order to understand my request makes the big difference here.

Alberto Romero:

If it worked for you, that's perfectly fine! Then you probably found what's optimal for your use case. Many people should learn from your curiosity-driven approach. But there are many other use cases that require a bit more care. Anyway, I think prompt engineering is still a temporary thing. Eventually the AI models will be smart enough to infer everything they lack in explicit context.

Paul:

So, since I don't have access to o3 deep research, maybe you could try the following approach:

1. Drop the general task and some context into 4o (or o3-mini if that's preferred and you have tokens left), and ask it to come up with a summary and clarifying questions.

2. Comment on the summary, answer the clarifying questions, and tell it to proceed, but change the model to o3 deep research before hitting send.

In this case you would basically kick off o3 with a human-LLM co-produced and refined prompt.
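
For anyone scripting it rather than using the chat UI, here is a rough sketch of the same two-step hand-off, assuming the OpenAI Python SDK (the model names are illustrative stand-ins, and Deep Research itself isn't invoked this way):

```python
from openai import OpenAI

client = OpenAI()

task = "General task plus some context, pasted here..."

# Step 1: a cheaper model summarises the task and asks clarifying questions.
step1 = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": f"{task}\n\nSummarise this task and ask me any "
                   "clarifying questions you need before starting.",
    }],
)
questions = step1.choices[0].message.content

answers = "My comments on the summary and answers to its questions..."

# Step 2: hand the co-produced, refined prompt to the stronger model.
step2 = client.chat.completions.create(
    model="o3-mini",  # stand-in for whichever reasoning model is available
    messages=[
        {"role": "user", "content": task},
        {"role": "assistant", "content": questions},
        {"role": "user", "content": f"{answers}\n\nNow proceed with the task."},
    ],
)
print(step2.choices[0].message.content)
```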

This approach worked quite well back when o1 was heavily limited.

Maybe it's because I don't have access, but I fail to see how carefully crafting a prompt and providing the exact context in one go could possibly be better than co-producing it with the LLM itself...?

Alberto Romero:

The answer is that it's not better. You're clearly leveraging what I refer to here as the window of relevance. You're using your prompting skills to improve the results. I'm not asking for more. Perhaps for you this is self-evident. For many people it isn't!

Raghu Bala:

While models are becoming better and reasoning capabilities are improving, I tend to equate AGI with inputs based not only on logic and content but also on emotions and perceptions. Until those dimensions factor into reasoning, we are still a little ways off from AGI.

Alberto Romero:

Hmm, that's a very specific definition of AGI that not many people share. But as I said, we all have our own!

Llewellin Jegels:

I have found that it works better to treat prompting (not a word I think fully conveys our engagement with the dragon) not as something intrinsic to our skillset, as you suggest, but as something intrinsic to the dragon itself, if we ask in a self-reflective manner. To wit: "Dragon, if you were me and wanted to [idea/poorly phrased question], how would you ask yourself this question to extract the information I need? You have carte blanche in designing a prompt that will cover the full spectrum of possibilities and use the full range of your capability. Ask me any contextual question that will help you formulate your response, dragon." It seems counterintuitive, I know, but doing this has resulted in me being able to create, as Alberto says, "pure magic". Don't tickle the dragon; ask it to lay bare its treasures.
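
Roughly, the technique amounts to a meta-prompt like this sketch (the wording and the example question are illustrative):

```python
# Meta-prompting: ask the model to design the prompt it would give
# itself, and to gather the context it needs first.
rough_question = "how do I fix the saggy middle chapters of my novel?"

meta_prompt = (
    f"If you were me and wanted to know: '{rough_question}', how would you "
    "ask yourself this question to extract the information I need? You have "
    "carte blanche to design a prompt covering the full spectrum of "
    "possibilities and the full range of your capability. Ask me any "
    "contextual questions that will help you formulate your response."
)
print(meta_prompt)
```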

Amplifier Worshiper:

If it requires a human to devise instructions, it ain’t AGI.

Alberto Romero:

That's weird. A human PhD assistant needs instructions.

Amplifier Worshiper:

Yep, I don't equate general intelligence with merely handling tasks well, and that is how I understood what you were talking about.

Jing Hu:

Appreciate the work, but as you mentioned, the title is misleading.

I tested two prompts with reasoning models (o3-mini-high and DeepSeek) on a future options question.

For the first prompt, I asked whether the model could determine an option's future value, as you would if you trusted the machine to be truly an AGI. In a separate conversation, I provided a detailed step-by-step inference prompt (almost feeding it the thinking process... which I think misses the point already).

It was not a surprise; both failed.

With the second prompt, the model's response was almost correct—but 'almost' isn't exactly a benchmark for AGI, no matter how loosely you define AGI.

Calling a system that still gets the answer wrong after extensive prompting ‘AGI’ is a stretch.

Alberto Romero:

Perhaps you're simply not a member of the Prompt Wizard Academy. Just kidding. An AGI is not a system that can do everything perfectly. That's a weird definition. To me, it's a system that surpasses the average human in everything. o3 may still not be AGI (to me it isn't), but for some people, those who know how to get the maximum value out of it, it may well be. Also, some tasks are simply beyond any AGI's capabilities. Playing the market (assuming that's what you went for) is one of those.

Jing Hu:

To me, it's a system that outperforms the average human in almost everything. o3 may still not be AGI—I roughly agree.

But if that’s the case, how does a carefully crafted prompt suddenly turn something that isn’t AGI into AGI? If anything, you’d assume that someone who needs handholding to complete a math exam is less capable (or less intelligent) than someone who can do it independently, right?

'Playing the market': I'm not sure what you mean by this. My evaluation is the same as for any other math. There's a logic and a sequence to get to the right answer.

If you are interested, I am happy to share my evaluation of the reasoning model with you once it is published.

Alberto Romero:

Yeah, I understand your point. That's why I think it's hard for people to see why AGI will be unevenly distributed. The right prompt is a means to get it to express itself the right way, not a means to make it more intelligent. The intelligence is *already there*. That's the thing. Something is AGI not as an absolute truth but as a relative truth, a function of who's interfacing with it.

Robert Wu:

But how?
