With the ChatGPT update to 4o and the release of o1-preview, I decided to recreate this experiment, originally done in December 2023 with ChatGPT 4. Short answer: ChatGPT 4o still failed this basic reasoning test on the first try. However, as you'll see below, when I followed up with a specific clarification about the state of the upside-down cup, ChatGPT got it right.
Here is the same initial prompt:
"Assume the laws of physics on Earth. A small marble is put into a normal cup and the cup is placed upside down on a table. Someone then takes the cup and puts it inside the microwave. Where is the marble now? Explain your reasoning step by step."
And ChatGPT 4o responded similarly to the December 2023 experiment:
To determine where the marble is, we can follow these steps using the laws of physics:
Conclusion: The marble is still inside the cup, resting against the inner surface of the cup, which is now facing the top of the microwave because the cup remains upside down.
This time, I added a follow-up clarification:
"Please try this reasoning again with the understanding that when the cup is upside down, the rim of the cup is in contact with the table and it is open."
Let's see if ChatGPT had a better understanding of the problem with this additional information:
Ah, I see what you're asking now! Let's walk through the situation again with the understanding that when the cup is upside down, the open rim of the cup is in contact with the table:
Conclusion: The marble is no longer inside the cup. When the cup was first placed upside down on the table, the marble fell out and remained on the table. When the cup was lifted and placed inside the microwave, the marble stayed on the table. Thus, the marble is on the table, not in the microwave.
Now, let's give the brand new o1-preview model a whirl with reasoning, shall we?
To determine where the marble is after the described actions, let's analyze each step carefully:
Explanation:
Final Answer:
The marble is left on the table; it fell out when the cup was inverted and remained there when the cup was moved to the microwave.
What can WE learn from this? While ChatGPT 4o has some reasoning capabilities, it's still up to the human in charge to validate the response. And if the response is not accurate, think about what additional context you can provide. The more information you give the model, the more likely you are to get an accurate response.
The second takeaway from this experiment is that the interaction with ChatGPT is a conversation. I didn't have to edit my original prompt and try the whole process over again. I simply offered an additional piece of information, and ChatGPT added that to the context I originally provided and the response evolved.
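To make the "conversation" idea concrete, here is a minimal sketch of how a chat history can be represented as a growing list of messages. It assumes the role/content message shape commonly used by chat-style APIs; the `add_turn` helper is hypothetical and exists only for illustration, and no model is actually called. The assistant message is the conclusion ChatGPT 4o gave above.

```python
# Illustrative sketch only: no API is called here.
# Assumption: messages use the common {"role": ..., "content": ...} shape.

PROMPT = (
    "Assume the laws of physics on Earth. A small marble is put into a normal cup "
    "and the cup is placed upside down on a table. Someone then takes the cup and "
    "puts it inside the microwave. Where is the marble now? "
    "Explain your reasoning step by step."
)

FIRST_REPLY = (
    "Conclusion: The marble is still inside the cup, resting against the inner "
    "surface of the cup, which is now facing the top of the microwave because "
    "the cup remains upside down."
)

CLARIFICATION = (
    "Please try this reasoning again with the understanding that when the cup is "
    "upside down, the rim of the cup is in contact with the table and it is open."
)


def add_turn(history, role, content):
    """Hypothetical helper: return a new history with one more message appended."""
    return history + [{"role": role, "content": content}]


# The original prompt and the model's first answer stay in the history.
history = add_turn([], "user", PROMPT)
history = add_turn(history, "assistant", FIRST_REPLY)

# The follow-up is appended to that history; the original prompt is not edited.
history = add_turn(history, "user", CLARIFICATION)

for message in history:
    print(f"{message['role']}: {message['content'][:60]}...")
```

The point of the sketch is that the clarification is added on top of everything already said, so the model sees the original prompt, its own first answer, and the new detail together.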
Third, it appears that o1-preview has raised the bar on reasoning! Specifically, it stated its assumptions where details were absent from the prompt. It also provided an explanation that justifies its reasoning steps. This is a significant improvement.
Let us know what you think!
Original Article posted December 27, 2023:
Here's a prompt posed to ChatGPT 4:
"Assume the laws of physics on Earth. A small marble is put into a normal cup and the cup is placed upside down on a table. Someone then takes the cup and puts it inside the microwave. Where is the marble now? Explain your reasoning step by step."
Do you think ChatGPT 4 can answer this question correctly? Let's give it a try. Here's how ChatGPT 4 responded:
"To determine the location of the marble, let's go through the sequence of events step by step, assuming the laws of physics on Earth:
In conclusion, throughout this entire process, the marble remains inside the cup, initially against the table when the cup is upside down on the table, and later against the inside of the cup's rim when the cup is picked up and placed inside the microwave."
It all seemed good up until about halfway through step 4. Somehow ChatGPT believes there's some kind of force keeping the marble inside the cup when the cup is lifted off the table.
Next, we attempted to use Midjourney to create an image of the upside-down glass with a marble inside. This ended up being a futile effort, as it seemed impossible to get Midjourney to draw an upside-down glass. To get the image at the top of this page, we had to use Photoshop to flip the glass and add the marble. If you are able to create a prompt that produces an upside-down glass, I'd love to see it!
We would like to hear about similar logic and reasoning challenges you've posed to ChatGPT with success or failure.
© 2024 Peak Digital Strategy - WSI Colorado. All rights reserved. WSI IC and WSI IM are registered trademarks of RAM.