
Welcome to the AI in Education podcast

With Dan Bowen from Microsoft Australia and Ray Fleming from InnovateGPT

It's a fortnightly chat about Artificial Intelligence in Education - what it is, how it works, and the different ways it is being used. It's not too serious, or too technical, and is intended to be a relaxed conversation full of background information.

Of course, as well as getting it here on the website, you can also just subscribe to your normal podcast service:


“This podcast is produced by a Microsoft Australia & New Zealand employee, alongside an employee from InnovateGPT. The views and opinions expressed on this podcast are our own.”

Dec 1, 2023

Academic Research


Researchers Use GPT-4 To Generate Feedback on Scientific Manuscripts

Two episodes ago I shared the news that for some major scientific publications, it's okay to write papers with ChatGPT, but not to review them. But…

Combining a large language model and open-source peer-reviewed scientific papers, researchers at Stanford built a tool they hope can help other researchers polish and strengthen their drafts.

Scientific research has a peer problem. There simply aren’t enough qualified peer reviewers to review all the studies. This is a particular challenge for young researchers and those at less well-known institutions who often lack access to experienced mentors who can provide timely feedback. Moreover, many scientific studies get “desk rejected” — summarily denied without peer review.

James Zou and his research colleagues tested GPT-4-generated reviews against human reviews of 4,800 real Nature-family and ICLR papers. They found that AI reviewers overlap with human ones about as much as humans overlap with each other. Plus, 57% of authors found the AI feedback helpful, and 83% said it was more beneficial than feedback from at least one of their real human reviewers.



Academic Writing with GPT-3.5 (ChatGPT): Reflections on Practices, Efficacy and Transparency

Oz Buruk, from Tampere University in Finland, published a paper giving some really solid advice (and sharing his prompts) for getting ChatGPT to help with academic writing. He uncovered 6 roles:

  • Chunk Stylist
  • Bullet-to-Paragraph
  • Talk Textualizer
  • Research Buddy
  • Polisher
  • Rephraser

He includes examples of the results, and the prompts he used for each role. Handy for anyone who wants to use ChatGPT to help with their writing, without having to resort to trickery.



Considerations for Adapting Higher Education Technology Course for AI Large Language Models: A Critical Review of the Impact of ChatGPT

This is a journal pre-proof from the Elsevier journal "Machine Learning with Applications", and it looks at how ChatGPT might impact assessment in higher education. Unfortunately, it's an example of how academic publishing can't keep up with the rate of technology change. The four academics from the University of Prince Mugrin who wrote it submitted it on 31 May, and it was accepted into the journal in November - and guess what? Almost everything in the paper has changed. They spent 13 of the 24 pages detailing exactly which assessment questions ChatGPT 3 got right or wrong - but when I re-tested it on some sample questions, it got nearly all of them correct. They then tested AI detectors - and hey, we both know that's since changed again, with the advice now being that none of them work. And finally they checked whether 15 top universities had AI policies.

It's interesting research, but to be honest it would have been much, much more useful in May than it is now.

And that's a warning about some of the research we're seeing. You need to check carefully whether the conclusions are still valid - e.g. if the authors don't tell you which version of OpenAI's models they tested, then the conclusions may not be worth much.

It's a bit like the logic we apply to students "They’ve not mastered it…yet"



A SWOT (Strengths, Weaknesses, Opportunities, and Threats) Analysis of ChatGPT in the Medical Literature: Concise Review

The authors looked at 160 papers published on PubMed in the first 3 months of ChatGPT, up to the end of March 2023. The paper was written in May 2023 and has only just been published in the Journal of Medical Internet Research. I'm pretty sure that many of the results are out of date - for example, it specifically lists unsuitable uses for ChatGPT including "writing scientific papers with references, composing resumes, or writing speeches", and that's definitely no longer the case.



Emerging Research and Policy Themes on Academic Integrity in the Age of Chat GPT and Generative AI

This paper, from a group of researchers in the Philippines, was written in August. It referenced 37 papers, and then looked at the AI policies of the top 20 universities in the QS Rankings, especially around academic integrity and AI. All of this helped the researchers create a 3E Model: Enforcing academic integrity, Educating faculty and students about the responsible use of AI, and Encouraging the exploration of AI's potential in academia.


Can ChatGPT solve a Linguistics Exam?

If you're keeping track of the exams that ChatGPT can pass, then add linguistics exams to the list. These researchers from the universities of Zurich and Dortmund came to the conclusion that, yes, ChatGPT can pass them, and said "Overall, ChatGPT reaches human-level competence and performance without any specific training for the task and has performed similarly to the student cohort of that year on a first-year linguistics exam". (Bonus points for testing its understanding of a text about Luke Skywalker and unmapped galaxies.)


And, I've left the most important research paper to last:

Math Education with Large Language Models: Peril or Promise?

Researchers at the University of Toronto and Microsoft Research have published the first large-scale, pre-registered controlled experiment using GPT-4, looking at maths education. It basically studied the use of Large Language Models as personal tutors.

In the experiment's learning phase, they gave participants practice problems and manipulated two key factors in a between-participants design: first, whether they were required to attempt a problem before or after seeing the correct answer, and second, whether participants were shown only the answer or were also exposed to an LLM-generated explanation of the answer.

They then tested participants on new questions to assess how well they had learned the underlying concepts.

Overall, they found that LLM-based explanations improved learning relative to seeing only correct answers. The benefits were largest for those who attempted problems on their own before consulting LLM explanations, but surprisingly this trend held even for participants who saw LLM explanations before attempting the practice problems themselves. Participants also said they learned more when they were given explanations, and thought the subsequent test was easier.

They tried it using standard GPT-4 and got a 1-3 standard deviation improvement; using a customised GPT, the improvement was 1.5-4 standard deviations. In these tests, that was basically the difference between getting a 50% score and a 75% score.

And the really nice bonus in the paper is that they shared the prompts they used to customise the LLM.

This is the one paper out of everything I've read in the last two months that I'd recommend everybody listening to read.




News on Gen AI in Education


About 1 in 5 U.S. teens who’ve heard of ChatGPT have used it for schoolwork

Some research from the Pew Research Center in America says 13% of all US teens (about one in five of those who've heard of ChatGPT) have used it for their schoolwork - a quarter of all 11th and 12th graders, dropping to 12% of 7th and 8th graders.

This is American data, but I'm pretty sure the picture is similar everywhere.



The UK government published 2 research reports this week.

Their Generative AI call for evidence had over 560 responses from across the education system, and is informing future UK policy design.


One data point right at the end of the report was that 78% of people said they, or their institution, used generative AI in an educational setting.


  • Two-thirds of respondents reported a positive result or impact from using GenAI. The rest were divided between 'too early to tell', a mix of positive and negative, and some negative - mainly around cheating by students and low-quality outputs.


  • GenAI is being used by educators for creating personalized teaching resources and assisting in lesson planning and administrative tasks.
    • One Director of teaching and learning said "[It] makes lesson planning quick with lots of great ideas for teaching and learning"
  • Teachers report GenAI as a time-saver and an enhancer of teaching effectiveness, with benefits also extending to student engagement and inclusivity.
    • One high school principal said "Massive positive impacts already. It marked coursework that would typically take 8-13 hours in 30 minutes (and gave feedback to students)."
  • Predominant uses include automating marking, providing feedback, and supporting students with special needs and English as an additional language.


The goal for many teachers is to free up more time for high-impact instruction.


Respondents reported six broad challenges that they had experienced in adopting GenAI:

• User knowledge and skills - this was the major one: people feeling the need for more help to use GenAI effectively

• Performance of tools - including making stuff up

• Workplace awareness and attitudes

• Data protection adherence

• Managing student use

• Access


However, the report also highlights common worries - mainly around AI's tendency to generate false or unreliable information. For History, English and language teachers especially, this could be problematic when AI is used for assessment and grading.


There are three case studies at the end of the report - a college using it for online formative assessment with real-time feedback; a high school using it for creating differentiated lesson resources; and a group of 57 schools using it in their learning management system.


The Technology in Schools survey

The UK government also ran The Technology in Schools survey, which captures how schools in England specifically are set up for using technology. It will help them make policy to level the playing field on the use of tech in education - which also raises equity questions when using new tech like GenAI.

This is actually a lot of very technical detail about computer infrastructure, but the interesting table I saw was Figure 2.7, which asked teachers which sources they most valued when choosing which technology to use. The list, in order of preference, was:

  1. Other teachers
  2. Other schools
  3. Research bodies
  4. Leading practitioners (the edu-influencers?)
  5. Leadership
  6. In-house evaluations
  7. Social media
  8. Education sector publications/websites
  9. Network, IT or Business Managers
  10. Their Academy Trust


My take is that the thing that really matters is what other teachers think - but they don't find that out from social media, magazines or websites.


And only 1 in 5 schools have an evaluation plan for monitoring the effectiveness of technology.




Australian uni students are warming to ChatGPT. But they want more clarity on how to use it

And in Australia, two researchers - Jemma Skeat from Deakin University and Natasha Ziebell from the University of Melbourne - published feedback from surveys of university students and academics. They found that in the period June-November this year, 82% of students were using generative AI, with 25% using it in the context of university learning and 28% using it for assessments.

One third of students in first semester agreed generative AI would help them learn, but by second semester that had jumped to two thirds.

There's a real divide that shows up between students and academics.

In the first semester of 2023, 63% of students said they understood its limitations - like hallucinations - rising to 88% by semester two. Among academics, it was just 14% in semester one, and barely more - 16% - in semester two.


22% of students now consider using GenAI in assessment as cheating, compared to 72% in the first semester of this year! But both academics and students wanted clarity on the rules - this is a theme I've seen across lots of research, and heard from students.

The semester one report has already been published.



Published 20 minutes before we recorded the podcast, so more to come in a future episode:


The AI framework for Australian schools was released this morning.

The Framework supports all people connected with school education including school leaders, teachers, support staff, service providers, parents, guardians, students and policy makers.

The Framework is based on 6 guiding principles:

  1. Teaching and Learning 
  2. Human and Social Wellbeing
  3. Transparency
  4. Fairness
  5. Accountability
  6. Privacy, Security and Safety

The Framework will be implemented from Term 1 2024. Trials consistent with these 6 guiding principles are already underway across jurisdictions.

A key concern for Education Ministers is ensuring the protection of student privacy. As part of implementing the Framework, Ministers have committed $1 million for Education Services Australia to update existing privacy and security principles to ensure students and others using generative AI technology in schools have their privacy and data protected.

The Framework was developed by the National AI in Schools Taskforce, with representatives from the Commonwealth, all jurisdictions, school sectors, and all national education agencies - Education Services Australia (ESA), the Australian Curriculum, Assessment and Reporting Authority (ACARA), the Australian Institute for Teaching and School Leadership (AITSL), and the Australian Education Research Organisation (AERO).