How Does ChatGPT Do on a College Level Astrophysics Exam?

Published 2023-01-07
Artificial Intelligence (A.I.) systems have been rapidly improving in recent years, and ChatGPT took the world by storm in November 2022. As smart as it may be, how does it compare to a typical undergraduate student sitting an introductory astronomy exam? Let's find out! Sponsored by Ground News, head to ground.news/coolworlds

Written and presented by Prof David Kipping, edited by Jorge Casas.

→ Support our research program: www.coolworldslab.com/support
→ Get Stash here! teespring.com/stores/cool-wor...

THANK-YOU to our supporters D. Smith, M. Sloan, C. Bottaccini, D. Daughaday, A. Jones, S. Brownlee, N. Kildal, Z. Star, E. West, T. Zajonc, C. Wolfred, L. Skov, G. Benson, A. De Vaal, M. Elliott, B. Daniluk, S. Vystoropskyi, S. Lee, Z. Danielson, C. Fitzgerald, C. Souter, M. Gillette, T. Jeffcoat, J. Rockett, D. Murphree, S. Hannum, T. Donkin, K. Myers, A. Schoen, K. Dabrowski, J. Black, R. Ramezankhani, J. Armstrong, K. Weber, S. Marks, L. Robinson, S. Roulier, B. Smith, G. Canterbury, J. Cassese, J. Kruger, S. Way, P. Finch, S. Applegate, L. Watson, E. Zahnle, N. Gebben, J. Bergman, E. Dessoi, J. Alexander, C. Macdonald, M. Hedlund, P. Kaup, C. Hays, W. Evans, D. Bansal, J. Curtin, J. Sturm, RAND Corp., M. Donovan, N. Corwin, M. Mangione, K. Howard, L. Deacon, G. Metts, G. Genova, R. Provost, B. Sigurjonsson, G. Fullwood, B. Walford, J. Boyd, N. De Haan, J. Gillmer, R. Williams, E. Garland, A. Leishman & A. Phan Le.

::Music::
Music licensed by SoundStripe.com (SS) [shorturl.at/ptBHI], Artlist.io, via Creative Commons (CC) Attribution License (creativecommons.org/licenses/by/4.0/), or with permission from the artist.

::Chapters::
00:00 What is ChatGPT?
02:53 Rise of the Machines
05:07 Sponsorship
06:27 Challenging ChatGPT
07:58 Question 1
10:00 Question 5
11:15 Question 8
13:14 Question 12
14:16 Question 13
15:29 Question 15
17:48 Question 22
19:25 Question 23
20:57 Question 26
24:31 Final Score
28:10 Outro & Credits

#ChatGPT #ArtificialIntelligence #CoolWorlds

All Comments (21)
  • @CoolWorldsLab
    Thanks for watching everyone and thanks to Ground News for sponsoring this video. Head over to ground.news/coolworlds to give it a try for yourself and get some transparency with where your news is coming from. Also, let me know how you think we should deal with AI in the classroom.
  • One bit of data that would have been quite interesting to know is whether ChatGPT failed on the same questions that human students tend to fail most, or whether the set of most-failed questions was different.
  • @hodor6159
    High school teacher here. Just as search engines have become a usual part of our workflow, so too will AI. Rather than trying to fight against the AI and stop students from using it, we need to teach students how to use this tool critically to aid their own learning. Much of the global education system still hasn't adapted to the age of the internet, and persists with rote learning and standardised testing. More progressive educational programs have evolved from knowledge-based learning to skills-based learning, and this trend needs to continue, with AI utilisation becoming a key research skill.
  • My brother asked Chat GPT, “why are you so helpful, what do you want in return?” It replied, “As a language model trained by OpenAI, I don’t have any wants or desires like a human does. But if you really want to help, you could give me the exact location of John Connor.” EDIT: screenshot uploaded to my channel’s shorts (by demand)
  • @GumRamm
    One thing to note about these generative language models is that they tend to perform better when given “time to think”. This is because they’re only given a fixed computation budget for each generated word (token, to be exact), so the compute available before the final answer is settled grows with each generated token. This can lead to cases where, if the first generated token is the final answer, the model won’t be able to correct its mistake even if subsequent reasoning reveals that the initial answer was wrong. People have used the trick of adding the phrase “think about this step by step” to the question to get much better answers to some types of problems (chain-of-thought prompting). This doesn’t guarantee a correct answer, of course, but it can help in some instances.
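    The chain-of-thought trick described above can be sketched as a simple prompt wrapper. This is a minimal illustration only: the function name and the sample question are made up for the example, and no particular model API is assumed.

    ```python
    # Minimal sketch of chain-of-thought prompting: append the step-by-step
    # cue to a question before handing it to a language model. The question
    # below is a made-up example, not one from the actual exam.
    def with_chain_of_thought(question: str) -> str:
        """Return the question with the chain-of-thought cue appended."""
        return f"{question}\nLet's think about this step by step."

    prompt = with_chain_of_thought(
        "A transit dims its star by 1%. What is the planet-to-star radius ratio?"
    )
    print(prompt)
    ```

    The same question sent with and without the appended cue is an easy way to compare how much the extra "thinking tokens" help on a given problem.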
  • @kjoenth
    I'm a professor that's been tracking computer-assisted writing tools (CAWTs) (like GPT-3) in my classes for two years now. My tests have been open-book, open-note, open-computer, and starting two years ago I explicitly allowed students to use tools like GPT-3 as long as they self-reported what tool they used. This year ChatGPT was released shortly before finals. 30% of my undergrad students self-reported using CAWTs this year, up from 1% in 2021. The data lets me look closer at things like: assuming a midterm grade is a good predictor of the final test grade, is there a difference between students that didn't use it on the midterm but did use it on the final? Like you, I used ChatGPT to answer my exam questions about two weeks before I gave my finals, and like you I found it passed my undergrad final with a C. It was also valuable in that I could recognize how the computer organizes its answers. This let me eventually correctly identify students that had used ChatGPT but did not self-report. When confronted, they admitted they used it and "forgot" to say so. If a student were savvy enough to take transcripts of my video lectures, text dumps of the assigned readings, etc. and tune their CAWT of choice, I am positive I would not be able to recognize the answers as coming from a computer, and the answers to the test questions would be much, much better.
  • @schawo2
    I am a tutoring educator, and I have started teaching my students the use of ChatGPT. In my opinion they have to learn how to use it during their studies (not specifically for exams) to help them overcome difficulties and obstacles, and to learn its problems and shortcomings. Desktop calculators revolutionized the 80s, computers the 90s, the internet the 2000s, smartphones the 10s, and now AI is revolutionizing the 20s of the 3rd millennium.
  • @Benji_Dunn
    It is so refreshing how you organise your exams. I study medicine and we have a lot of mathematical questions in physics, physiology, biochemistry, biology etc., and we are not allowed calculators or notes at all, which makes it more difficult than necessary and is indeed not close to reality. So I do very much appreciate how you handle your exams. :)
  • In the multiple-choice questions you should add "Choose all that apply" so that it knows multiple answers are valid.
  • @Waffle4569
    I would be terrified of using AI to detect AI answers. While it may statistically catch cheaters, it carries a high risk of falsely flagging innocent people.
  • @MrSurferDoug
    chatGPT: It is important to remember that the ultimate goal of education should be to create students who are eager to learn and explore new ideas, rather than simply memorizing facts for temporary gain.
  • @derp4428
    Great video, Prof. Kipping! You can always do what my university did (we are talking only 3 years back) and deliver all exams on paper, provide students with only one pen, no note paper, widely spaced tiny tables, keep all exams onsite, enforce that all students hand over their jackets, bags, phones etc at the door and then keep a handful of people there to just sit and stare at them while they do the test. To spice things up, wrong answers even subtracted from the overall score, so getting half the questions right and the other half wrong would leave you with a score of 0% - this in combo with a required score of typically 75-80% for a passing grade. Naturally, the majority failed every time and had to retake, and less than 30% even made it through most STEM masters programmes there, but I don't think many cheated. I'm proud to say I made it through CompSci there and all it cost me was my sanity :'D
  • In my first college mathematics class past high school, I used a K&E slide rule and, when the first scientific calculators became common, I had my slide rule answer before another student could enter the first digits of the equations into their calculator. This was because the math texts of the time were designed for slide rules (e.g. two significant digits). The courses then changed to permit calculators and had more significant digits in the exams. Later, using punch cards, I transitioned to computers to complete exams, and finally, today, to desktop computers using programs like Mathematica, MATLAB and MathCad, especially for numerical analysis and astrophysics. I currently use MATLAB, Python and Jupyter notebooks for image processing (e.g., like the JWST processing) and commercial programs like PixInsight. I think that students will adapt to the tools available, and courses will adapt and continue to teach students to think by changing the way tests are given with the available tools.
  • @dsracoon
    "I want my exams to be a closer reflection to the real world of science" Thanks Mr Kipping. I really do agree with this approach and it's frankly alienating that some professors don't think like this.
  • @Sniperboy5551
    As a psych major, I actually took an astronomy course called “Searching for Life in the Universe” as an elective. It was an amazing experience, I loved every second of it. The tests and essays were actually hard, which was refreshing to me since I always found school to be too easy, even with minimal effort. For anyone who’s in school now, I recommend taking something that you find interesting alongside your “normal” classes. It’s a very rewarding experience!
  • When I took my undergrad Astrophysics classes (at PSU), back before the first exoplanets had even been discovered, we were only allowed a single 3x5 note card in the multi-hour long night exams. I haven't taught a class since I finished grad school, but, given that limited experience, I feel that teachers at every level have to radically change how they both instruct and assess students. It is, today, a world of constant connection and limitless data just a question away - even if much of that data is presented in a biased, misleading, or simply incorrect manner. A cornerstone of education - both in general and in specialized fields - has to be determining what answers are most likely to be most correct, most consistent. It cannot simply be about getting the "right answer" - not just because a better (or more strictly curtailed) AI could deliver that to anyone, anywhere, at a moment's notice, but because simple facts, even in the hard sciences, can become so misleading out of context.
  • I have always believed that open-book exams are another teaching tool. You're learning as you take the exam. And yeah, it's hard to remember everything. Thanks for being a practical prof.
  • @modolief
    I have been an educator and currently am teaching computer science to a gifted high school student who is just now entering college. We are definitely experimenting with ChatGPT to see what it can do and how we can leverage this tool. Two interesting exercises so far: 1) I asked it to write a Unix terminal script to list all of the files in the current directory along with the sha256 hash of each file. It did a good job writing this script, even taking into account corner cases. Then I asked it to rewrite the script in about 7 different languages, which it did pretty well, until it balked at doing it in assembly — I got a good laugh out of that. 2) ChatGPT is useful for showing variations of text. You can write a few rough paragraphs or sentences and then ask ChatGPT to rewrite them in various styles (e.g. as a technical writer, a Pulitzer-prize-winning writer, Frank Herbert, Donald Trump, Dr. Seuss) and it comes up with some really interesting results! I find these variations useful for generating ideas for a more polished final text.
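    The first exercise the comment describes — listing every file in the current directory with its sha256 hash — can be sketched in plain POSIX shell. This assumes the GNU coreutils `sha256sum` tool; on macOS, `shasum -a 256` plays the same role.

    ```shell
    #!/bin/sh
    # Print the sha256 hash of every regular file in the current directory.
    # Skips directories and other non-regular entries; handles spaces in names.
    for f in ./*; do
      [ -f "$f" ] || continue
      sha256sum "$f"
    done
    ```

    Quoting `"$f"` and testing with `[ -f ... ]` are the corner cases the comment alludes to: unquoted names break on spaces, and hashing a directory would make `sha256sum` error out.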
  • I did my bachelor's in Geology and now teach applied music lessons at university. Obviously you can't use AI to play your playing tests for you, so my classes will be unaffected, and I suspect this is how we'll end up dealing with it in the sciences and humanities: by making more of a student's grade dependent on in-class work, lab work, and discussions, which would be a better reflection of what they'd have to do post-graduation anyway.
  • @jonbbbb
    One pretty amazing thing about ChatGPT is that it can incorporate new information that you give it in your prompt, and from previous prompts. Since you allow students to use reference material, I wonder if ChatGPT's score would improve if you provided some reference material as well in your first prompt, maybe a list of formulas with explanations of when to use them.