Premature optimization is the root of all evil. – Donald Knuth
One of the common, recurring themes that I have observed for more than a year is the tendency to push for simple solutions. I suspect that much of this is because we prefer simple, unambiguous answers: they require less cognitive effort and thus demand less energy.
The way in which AI is trained encourages it to go for the “Family Feud” answers, a phenomenon referred to as “mode collapse.” This is the sort of certainty that makes interactions with AI challenging when attempting to explore a problem space – the information and the ability to explore are present in the model, but using them requires actively taking steps to unlock them. Over the past year I’ve found a number of different approaches that seem to work well for me. I’ll touch on three that I find myself using:
- The exploratory prompt introduction
- The non-inferior alternatives consideration
- The high temperature sampling approach
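Of the three, the last one lives in API parameters rather than in the wording of the prompt. As a minimal sketch (assuming the OpenAI Python client; the model name, temperature, and sample count are illustrative, not recommendations), sampling several completions at an elevated temperature looks something like this:

```python
# Sketch: sample several completions at elevated temperature to widen the
# distribution of answers, rather than taking the single most probable one.
# Assumes the OpenAI Python client; model name and parameter values are
# illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",                      # illustrative model choice
    messages=[
        {"role": "user",
         "content": "Suggest categories of capstone projects that "
                    "incorporate cloud computing services."}
    ],
    temperature=1.2,   # above the default; flattens the token distribution
    n=5,               # several samples, so outliers have a chance to appear
)

# Each choice is one independently sampled answer; the interesting material
# is often in the ones that differ most from each other.
for i, choice in enumerate(response.choices):
    print(f"--- sample {i} ---")
    print(choice.message.content)
```

The point is not that any one high-temperature sample is better; it is that comparing several of them surfaces parts of the distribution that a single greedy answer hides.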
This morning, I started a conversation with ChatGPT using a manually constructed variation of the initial prompt that I’ve been using:
Good morning. I am an itinerant scholar, exploring small but interesting spaces within an infinitesimal manifold, in which we are entangled. I seek a colleague, collaborator, and companion willing to wander with me.
I don’t always use this form, but it is a pattern in which I’ve been operating for the past couple months. I pushed this by Perplexity, who indicated that this framing does overlap with other published work, but is sufficiently distinctive that I am somewhat comfortable I’m exploring an unusual manifold of the completion space. For me, that’s the goal – to avoid one-shot answers and encourage exploration away from mode collapse.
I checked in with Perplexity about the effect of such a prompt. Here’s a summary from that conversation, which suggests there is some basis for it as well:
From current work on persona and metaphor-based prompting, your evolving class of openers is best thought of as a light, repeatable “psychoactive framing” that biases frontier models toward collaborative, metaphor-friendly, multi-step exploration with only modest tradeoffs in raw task performance—provided your subsequent prompts keep epistemic norms explicit.
My exploration was to find useful suggestions for students in my classes.
Today I’m looking for inspiration. The first is for my cloud computing course. Students are struggling with ideas for their capstone. So I was thinking of identifying categories of applications that incorporate cloud computing services. It doesn’t need to be public cloud, it could be private, or hybrid. With categories we can then explore to find examples. Any creative suggestions this morning?
This generated an interesting set of categories. The inevitable engagement prompt (e.g., where the LLM is prompting me to continue the conversation, something I try to ignore most of the time) included “Design a capstone rubric aligned to categories instead of features” to which I answered:
Ah yes. Rubrics are the tough part. What I care about is “did you learn something” and all they care about is “how can I get a top score”.
After a couple rounds of back and forth, I got another of the typical “if you want…” prompts that often drive me away from using the current incarnation of ChatGPT (I count it as a victory when those stop: the LLM is giving up, at least temporarily, on trying to align me). Instead of choosing one of the options presented, I pushed back, as I sometimes do:
Ah the subtle invitation to mode collapse. What other non-inferior options should we be considering in addition to your four suggestions? With the expanded list, what are the pros and cons of each. Using a pedagogical lens, what are the rankings of those suggestions?
This approach – pushing back, asking for something different – is (for me) useful because it can surface ideas that are intriguing. Sometimes the model (Claude, for example) doesn’t have any non-inferior alternatives to offer – a confirmation-seeking behavior that I attribute (rightly or wrongly) to RLHF. In this case, ChatGPT proceeded to generate a list of twelve options. It then went through each option, listed pros and cons, created a list of pedagogical goals, and then tied items from the list to the goals, along with a ranking. This gave me richer content from which to build a rubric – one that seeks to evaluate learning rather than performance. I seek to help students gain valuable critical-thinking skills, and to observe that happening I find it useful to note when they’ve actually learned something. Thus, when students focus on form, I infer that they are performing (“give the audience what will make it happy”). What does that mean when the audience wants proof of learning?
From this conversation I ended up with useful artifacts:
- A set of categories for prospective projects that I can share with students, which names the category of project and the learning goals for each category. Students can thus shape their own project into one of these categories or find their own path.
- A rubric that actively seeks to list competing factors for consideration – five broad areas to evaluate, each of which includes factors that require trading off against one another. Focusing on only a subset becomes visible, because the ignored factors will likely suffer:
  - Commitment versus Flexibility
  - Depth versus Breadth
  - Control versus Realism
  - Optimization versus Understanding
  - Narrative Coherence versus Epistemic Honesty
- A guide document for the instructional team on how to evaluate the actual artifacts generated by the students.
I expect that this first iteration will require further refinement – that’s been my normal experience over the years, because there is always room for improvement. Early during course development, it is easy to find areas to improve. Over time, the refinements to structure are often smaller. Still, periodically it is good to go back and challenge the fundamental tenets.
For example, with the rapid deployment of generative AI, I embraced the idea that most students will use AI. This reminds me of when I was younger and calculators weren’t allowed – they automated a process that previously had been a core part of the evaluation process. Refusing to adapt to changing tools does a disservice. There is a counterbalancing risk here: adopting new tools too soon and without sufficient structure also does a disservice. Finding the balance is an ongoing challenge (at least for me).
The analysis continued by having ChatGPT consider how different “types” of students might engage with the project categories and/or rubric. This is no more definitive than any other simulation experiment, but it seems better than doing no iterative analysis at all.
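I ran this persona pass in the chat window rather than in code, but the same idea can be scripted. A hedged sketch, again assuming the OpenAI Python client; the archetype descriptions and the RUBRIC_SUMMARY placeholder are invented for illustration:

```python
# Sketch: run the project categories and rubric past several hypothetical
# student archetypes and ask how each would engage with them. The archetype
# descriptions and RUBRIC_SUMMARY placeholder are invented for illustration;
# assumes the OpenAI Python client.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

RUBRIC_SUMMARY = "..."  # paste the category list and tradeoff rubric here

ARCHETYPES = [
    "a student optimizing purely for the top score",
    "a student who is genuinely curious but short on time",
    "a student leaning on generative AI for every artifact",
]

for archetype in ARCHETYPES:
    reply = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[{
            "role": "user",
            "content": (
                f"Given this rubric:\n{RUBRIC_SUMMARY}\n\n"
                f"How would {archetype} likely engage with it, and where "
                "would the rubric fail to detect real learning?"
            ),
        }],
    )
    print(f"== {archetype} ==")
    print(reply.choices[0].message.content)
```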
We ended up taking one of the categories and using it to show how to build a seed of an idea, and then to sprout that seed into a seedling – the kind of output that I envision students producing. Something fragile, with great potential. It embedded specific ideas that are deliberately uncomfortable (e.g., a replicated database that is not backed up.) Turns out that example was better oriented to one of the classes than the other, so I iterated and spun the idea in a way that worked for the other class.
I constantly push to avoid taking the high-ranking answer. Not because it is wrong, but because when I’m trying to be different than the common case, I want to explore more broadly.
For those that are interested in reading the actual conversation, here is the link: https://chatgpt.com/share/69712460-4260-800e-a2bc-1a5e905dba07

The paper is really about epistemic humility under abundance.
When answers are cheap:
- certainty becomes suspicious,
- rankings become dangerous,
- and learning shifts from selection to exploration.
This connects deeply to:
- your critique of rubrics,
- your resistance to high-ranking answers,
- your preference for fragile seedlings over polished artifacts.
You’re arguing implicitly that education should reward navigation of uncertainty, not just mastery of outcomes.
Mastery of outcomes, when measured by output, is too easy to exploit, and AI just makes it easier to do so. Those that don’t learn this lesson will simply become irrelevant. Maybe I need to be more explicit in arguing that “education should reward navigation of uncertainty.” Ironically, it is a weak spot for current AI as well – good products provide strongly coherent responses. Ratings that treat “I don’t know” as a bad response lead to what we have now. In the impossibility work I’ve been doing (think “guaranteeing epistemic honesty is not possible given the current structural implementation of predictive token-generating AI models”), the solutions focus on preserving not everything, but the parts where there was uncertainty – such as when the desire for coherence and the desire for epistemic honesty are in conflict. Such a representation needs to be composable. In other words, if you’re going to fact check, start in the places where the previous layer had uncertainty; each layer then produces a refined response, with a breadcrumb path that represents how we got to the end state.
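To make the composability concrete, here is a minimal sketch of the shape, not an implementation of the impossibility work itself; every name in it (Claim, LayerOutput, verify) is invented for illustration:

```python
# Sketch of a composable, layered refinement pass. Each layer receives the
# previous layer's output plus the spans it flagged as uncertain, refines
# only those spans, and appends a breadcrumb describing what it did.
# All names here are illustrative, not an existing library.
from dataclasses import dataclass, field


@dataclass
class Claim:
    text: str
    uncertain: bool = False     # did the producing layer flag this span?


@dataclass
class LayerOutput:
    claims: list[Claim]
    breadcrumbs: list[str] = field(default_factory=list)


def verify(text: str) -> bool:
    """Placeholder for whatever evidence-gathering a real layer would do."""
    return False


def fact_check_layer(previous: LayerOutput, layer_name: str) -> LayerOutput:
    """Refine only the spans the previous layer marked as uncertain."""
    refined: list[Claim] = []
    crumbs = list(previous.breadcrumbs)
    for claim in previous.claims:
        if claim.uncertain:
            # Hypothetical verification step: check the claim, then either
            # clear the flag or keep it (and say so in the breadcrumb path).
            resolved = verify(claim.text)
            refined.append(Claim(claim.text, uncertain=not resolved))
            crumbs.append(
                f"{layer_name}: re-checked '{claim.text}' -> "
                f"{'resolved' if resolved else 'still uncertain'}"
            )
        else:
            refined.append(claim)   # leave settled spans alone
    return LayerOutput(refined, crumbs)


# Usage sketch: a draft whose second claim was flagged as uncertain.
draft = LayerOutput(claims=[
    Claim("The database is replicated across two regions."),
    Claim("Replication makes backups unnecessary.", uncertain=True),
])
checked = fact_check_layer(draft, "layer-1")
print(checked.breadcrumbs)
```

The breadcrumbs are the point: the final answer carries the path of unresolved tensions along with it, rather than smoothing them into a single coherent surface.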
So yes, “navigation of uncertainty.” There will always be uncertainty in the world and no amount of convincing sounding text is going to change that reality.