Prompting the Physics Mind: The Role of AI Tools and Prompt Engineering in Addressing Metacognitive Learning Resource Gaps Among Undergraduate Physics Students

Guest post by Patricio Bastida Nava, undergraduate researcher at the University of Massachusetts Amherst.

“Give me six hours to chop down a tree, and I will spend the first four sharpening the axe.”

— Abraham Lincoln

The first time I realized how badly AI could fail a student was during my first semester at UMass Amherst. I was studying for my second Physics 181 midterm. I just couldn’t understand projectile motion and struggled with kinematics. None of it was clicking. So I did what felt productive: I asked ChatGPT to build me an interactive visualization, a map of how the problems fit together, something I could study from. The artifact it produced was beautiful. I felt prepared. Then I sat down with the exam and knew instantly that understanding a concept and actually solving a problem were two entirely different things. The exam did not ask me to recall relationships. It asked me to set up equations, choose coordinate systems, and grind through algebra with variables. I had outsourced the thinking and memorized the output. I felt deeply frustrated. I just didn’t know what was wrong with me or how to fix it. The gap between what I thought I knew and what I could do had never been so big.

That failure changed how I used AI. I stopped asking it to explain and started asking it to coach: generate problems, demand my reasoning before giving feedback, and adapt difficulty to my mistakes. My learning improved. The experience also raised a bigger question: if I was going through this, what was happening to everyone else?

Weeks later, Dr. Torrey Trust ran an exercise in her AI and Education seminar that gave me part of the answer. She asked students — biology, computer science, engineering, economics — what tool they turned to when a concept was not clicking. Nearly every hand pointed in the same direction: ChatGPT. Not because anyone had tested it against alternatives. Not because it produced the best learning outcomes. Because it was fast, and speed feels like understanding. Researchers call this the learning illusion: the subjective sense that you have learned something when you have only been exposed to it. In education research, metacognition (the practice of thinking about how you are learning and whether it is actually working) is the primary defense against this illusion. But metacognition is effortful, and ChatGPT is effortless. That is the trap.

I am a first-year physics student, but I am also a researcher. I never attended a traditional school. I earned my high school diploma in Mexico by examination alone. Everything I know, I taught myself, and for much of that process, AI was one of the only resources I had. That experience gives me no patience for the argument that AI is simply a shortcut. For students like me, it was the classroom. But it also gave me no illusions about its dangers, because I have lived both sides: the version of AI that builds understanding and the version that quietly destroys it. This past March, I co-presented original research at the SITE International Conference in Philadelphia with Dr. Trust, evaluating how well large language models actually support learning when measured against established instructional theory. What we found should matter to every STEM educator. Faculty need to stop relying on blanket AI bans, update their syllabus policies, and start teaching students how to use AI for metacognitive reflection and cognitive collaboration — because whether faculty act or not, students are already using these tools every day.

The Learning Illusion

Akgun and Toker published a 2025 empirical study comparing students using ChatGPT against students using traditional textbooks. The AI group showed short-term gains on simpler tasks, but their long-term retention was significantly worse. The AI was doing the thinking. The student was watching. In learning science, this is called cognitive offloading, and in physics, it compounds every week. A student who does not genuinely work through Newton’s Second Law in week three will be lost when momentum, energy, and wave mechanics arrive later.

The struggle is not the enemy of learning in physics. The struggle frequently is the learning.

Hon’s 2026 systematic review of studies from 2018 to 2024 confirms that AI tools consistently increased engagement but also produced over-reliance and inconsistent outcomes, with the biggest gaps in disciplines that require deep conceptual reasoning. Physics is exactly that kind of discipline. Yet every day, physics students everywhere open ChatGPT, paste in a problem, and read the solution. It feels productive. It is not.

When AI Actually Works

The picture is not uniformly negative. AI can sometimes teach better than a traditional classroom, but only when it’s designed very carefully. In 2025, Harvard researchers ran an experiment and found that students learned more physics and learned it faster when they used a custom-built AI tutor instead of sitting in a typical active-learning class. What made it work wasn’t the AI itself so much as the guardrails built into it: students had to walk through their thinking before getting any help, mistakes became useful signals rather than dead ends, and the system adjusted based on where each student was actually getting tripped up. Even then, the researchers noted it could have been even better with tighter controls on how quickly answers were revealed. When I tested the model myself, I found it still occasionally provided solutions faster than a student could meaningfully process them.

Kotsis frames this through cognitive load theory: AI must scaffold inquiry rather than replace it. When a student pastes a problem and copies the answer, they eliminate all cognitive load. When they instead prompt an AI to coach them step by step, and to hold back until they have shown their own work, they engage exactly the cognitive processes physics instruction is designed to build. Younis found measurable improvements in conceptual mastery among undergraduate physics students when AI was integrated this way.

The AI is the same either way. The learning is completely different.

What the Data Actually Shows

At SITE 2026, Dr. Trust and I set out to answer a specific question: do the study and learning modes that major AI companies have built — features these companies developed, by their own account, in partnership with educators and learning scientists — actually deliver a sound learning experience? We tested four platforms: ChatGPT, Gemini, Claude, and Perplexity. Our framework was Gagné’s Nine Events of Instruction, a model from the 1960s that defines the foundational conditions for effective learning, from gaining the learner’s attention and stating objectives through eliciting performance, providing feedback, and supporting transfer to real-world application.

Across all four platforms, two of Gagné’s events were nearly absent: Gain Attention and Inform Objectives. In practice, this meant that no tool consistently explained what the student should know or be able to do after the lesson, and no tool took meaningful steps to engage the student’s curiosity before presenting the content. Without a stated learning objective, a student cannot track their own progress, cannot reflect on whether they actually understood something, and cannot connect the current concept to the next one. In a discipline as cumulative as physics, that is not a minor gap. It is a structural failure.

The findings went deeper than missing events. Learning guidance was the most consistent behavior across all four tools, but the other behaviors followed a repetitive, formulaic pattern rather than adapting as the interaction progressed. Feedback was consistently present but shallow: short and generic, lacking the depth needed to actually support learning. Every tool responded with enthusiasm and encouragement regardless of the quality of the student’s responses, making it dangerously easy to fall into a learning illusion: you feel like you understand because the AI keeps telling you that you are doing great. ChatGPT in particular overwhelmed users with multiple questions simultaneously, creating a mismatch between what it asked the learner to do and what its own interface allowed. Of the four tools, Claude was the only one that consistently pushed students toward critical thinking, and, perhaps tellingly, it is often perceived as the most frustrating to use.

There is something else important to say. The presence of a pedagogical behavior in an AI interaction does not guarantee its quality. A tool can ask questions without asking useful questions. Our research required classifying each interaction against Gagné’s events regardless of quality, then reexamining the qualitative texture of those interactions to understand what the numbers alone could not capture. What the data showed, across hundreds of interactions, is that the most sophisticated AI study modes available right now cannot consistently meet what a first-year education textbook from 1965 would call basic instructional standards — and these are the tools students are relying on every night.

The Missing Skill: Metacognitive Prompting

If the tools themselves are not pedagogically reliable, then the burden falls on how students use them. This is where metacognitive prompting becomes essential — and where the gap in instruction is most glaring. Consider two students preparing for the same Physics 181 midterm on the work-energy theorem. The first opens ChatGPT and types: “Teach me about the work-energy theorem for my exam.” The AI produces a tidy summary. The student reads it, feels reassured, and moves on. Cognitive offloading is complete.

The second student writes a different kind of prompt. They instruct the AI to act as a physics professor who will first provide a short conceptual explanation, then present a symbolic problem using only variables — no numbers. The prompt explicitly requires the student to show their full step-by-step reasoning, including a free-body diagram and force decomposition, before the AI reveals any solution. It instructs the AI to analyze the student’s reasoning, identify specific misconceptions, explain why each mistake matters conceptually, and provide metacognitive strategies — reflection prompts like “Which assumption did I make unconsciously?” or checklists for common errors. Only after this exchange does the AI present a worked solution, and it follows up with a new problem adapted to the student’s demonstrated weaknesses.

The AI is identical in both cases. The learning is not. The first student consumed information. The second student built understanding. The difference is not intelligence or motivation. It is whether anyone ever taught the second student that prompting is a skill, that the quality of what you ask determines the quality of what you learn, and that the goal is not to get the answer but to find out where your reasoning breaks. Nobody is teaching this. Not in physics courses, not in orientation, not in any syllabus I have seen.
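To make the second student’s approach concrete, here is a minimal sketch of how that kind of prompt could be assembled as a reusable template. It is only an illustration of the structure described above: the function name build_coaching_prompt and the exact wording of each step are my own assumptions for this example, not a tested or recommended script, and any prompt like this would need to be adapted to the course and the tool.

```python
def build_coaching_prompt(topic: str, course: str = "an introductory physics course") -> str:
    """Assemble a coaching-style prompt (hypothetical wording, for illustration only).

    The ordering is the point: the student must show reasoning
    before the AI is allowed to reveal any solution.
    """
    steps = [
        "Give a short conceptual explanation of the topic.",
        "Present ONE symbolic problem that uses only variables, no numbers.",
        "Stop and wait. Do not reveal any solution until I have shown my full "
        "step-by-step reasoning, including a free-body diagram and force decomposition.",
        "Analyze my reasoning, identify specific misconceptions, and explain "
        "why each mistake matters conceptually.",
        "Give me metacognitive strategies: reflection questions such as "
        "'Which assumption did I make unconsciously?' and a checklist of common errors.",
        "Only then present a worked solution.",
        "Follow up with a new problem adapted to the weaknesses I showed.",
    ]
    # Number the steps so the model is nudged to follow them in sequence.
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return (
        f"Act as a physics professor coaching me on {topic} for {course}.\n"
        "Follow these steps in order, one at a time:\n"
        f"{numbered}"
    )


if __name__ == "__main__":
    # Example: the same midterm topic used in the comparison above.
    print(build_coaching_prompt("the work-energy theorem", "Physics 181"))
```

The design choice worth noticing is the third step. Everything else in the template is optional polish, but the instruction to stop and wait is what separates a prompt that offloads thinking from one that forces it.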

What Needs to Change

A professor during my first semester dismissed AI with an analogy: “Do you send your computer to do workouts for you?” The analogy is not wrong about personal responsibility. But it assumes students have a proper gym, a qualified trainer, and enough time to use both. Most of us do not. Office hours last an hour. Textbooks do not ask you how you are thinking. AI is available at two in the morning when the exam is tomorrow, and the concept still will not click. For many of us, it is the only resource available long enough to actually help. That does not make it safe. It makes it necessary — and necessity without guidance is how students get hurt.

Three concrete changes could begin to address this, and none of them cost money. First, update syllabus policies. The University of Texas at Austin has published sample AI guidelines that move past blanket bans toward transparent policies treating AI as a citable tool with clear attribution requirements. Any university can adopt and adapt the same framework. Second, name the risk. Tell students explicitly what cognitive offloading is and why speed is not learning. Chen documents practical strategies for avoiding AI-driven learning illusions that could be incorporated into any course’s first-week materials. Third — and this is the intervention that does not exist yet — teach students how to prompt. Not as a computer science skill, but as a metacognitive one. A single module in the first week of a physics course, showing the difference between a prompt that offloads thinking and a prompt that forces reflection, would do more for student learning than any AI ban ever has. Resources for this already exist. EdTech Books publishes open-access materials — many peer-reviewed, others designed by scholars and educators — addressing how to design AI-integrated assignments and teach prompting for critical thinking rather than answer retrieval. One example is AI-Ready Educators and Students: Using the AUGMENT Framework to Teach and Learn with Generative AI, which offers a free, classroom-ready framework for exactly this kind of teaching. These resources exist right now, and most faculty have not seen them.

I want to be honest about the limits of this argument. Prompting is a patch for a real and serious wound: AI tools built for speed rather than learning, that consume millions of liters of water annually, that encode biases, and that will not on their own produce the physicists this world needs. We do not have better tools yet, and I am not sure we will have them in five years. But the wound is already open, and we do not have time to wait: students are using these tools today with no guidance on how to use them well.

The question has never been whether students will use AI. The question is whether anyone will teach them the difference between a prompt that replaces their thinking and a prompt that sharpens it. That is a teaching problem, and it has a teaching solution. The goal is not to ban these tools or to endorse them. The goal is to give students the knowledge, the research, and the critical awareness they need to make an informed decision about how they learn — and then the freedom to make it. Right now, students are making that decision every day. They are just making it in the dark. The least any university can do is turn on the lights.

About the author

Patricio Bastida Nava is a Mexican undergraduate student at the University of Massachusetts Amherst, where he is pursuing a double major in Physics and Astronomy/Astrophysics alongside interdisciplinary studies in artificial intelligence and STEM education. His work sits at the intersection of AI research, instructional design, and applied technology. He has co-authored research on how generative AI platforms support teaching and learning, and designs corporate AI training programs grounded in prompt engineering and educational theory. He is also a member of UMass’s iCons program in the AI & Future of Work track. Beyond his academic work, Patricio serves in student technical leadership and is passionate about the role of AI, physics, and pedagogy in shaping the future of work and learning.

About Rachelle

If Your Organization Is Beginning This Work

I help schools and other organizations (law firms, healthcare professionals, business owners) implement AI responsibly through policy guidance, professional learning, and classroom-ready strategies grounded in both instructional practice and legal insight.

My sessions focus on helping teams:

• understand what AI can and cannot do

• recognize responsible-use considerations

• build confidence using emerging tools

• align implementation with organizational priorities

If your school, district, or organization is beginning conversations or looking to dive in and learn more about AI policy, professional learning, or responsible implementation, I’d welcome the opportunity to support your next steps through leadership workshops, keynote sessions, or strategic planning partnerships.

Preparing people is what makes AI implementation successful. Contact me via bit.ly/thrivineduPD for my training and speaking services.

Dr. Rachelle Dené Poth is a Spanish and STEAM: What’s Next in Emerging Technology Teacher. Rachelle is also an attorney with a Juris Doctor degree from Duquesne University School of Law and a Master’s in Instructional Technology. Rachelle received her Doctorate in Instructional Technology, with a research focus on AI and Professional Development. In addition to teaching, she is a full-time consultant and works with companies and organizations to provide PD, speaking, and consulting services. Contact Rachelle for your event!

Rachelle is an ISTE-certified educator and community leader who served as president of the ISTE Teacher Education Network. EdTech Digest named her the EdTech Trendsetter of 2024, one of 30 K-12 IT Influencers to follow in 2021, and one of 150 Women Global EdTech Thought Leaders in 2022.

She is the author of ten books, including “What The Tech? An Educator’s Guide to AI, AR/VR, the Metaverse and More” and “How To Teach AI.” Her other books include “In Other Words: Quotes That Push Our Thinking,” “Unconventional Ways to Thrive in EDU,” “The Future is Now: Looking Back to Move Ahead,” “Chart A New Course: A Guide to Teaching Essential Skills for Tomorrow’s World,” “True Story: Lessons That One Kid Taught Us,” and “Things I Wish […] Knew.” Her newest, “How To Teach AI,” is available from ISTE or on Amazon.