This was an interesting experiment, mostly because it revealed that although I have a lot of skepticism toward AI models like ChatGPT, I was still expecting it to do better than it did, particularly in math settings. I was also surprised, and it seemed the author was too, that the model had a harder time with simple arithmetic than with harder calculus. Arithmetic was where I kept assuming the model would fail, but it handled the calculus better.
It was also interesting how previous successes shaped my expectations: after the model correctly answered the first tic-tac-toe board question, I assumed it would answer correctly about the game in a second follow-up question. I generally went with my first assumptions, and according to the model I was "overconfident" in my responses, so I suppose I should temper my confidence that it will mostly fail, given its roughly 50% success rate.