Welcome to the AI in Education podcast With Dan Bowen and Ray Fleming. It's a weekly chat about Artificial Intelligence in Education for educators and education leaders. Also available through Apple Podcasts and Spotify. "This podcast is co-hosted by an employee of Microsoft Australia & New Zealand, but all the views and opinions expressed on this podcast are their own."

Oct 30, 2019

Ray and Dan are joined by Troy Waller, who's the Accessibility Lead for the Microsoft Education team in Australia, to talk about how artificial intelligence is being used to support students.

Troy discusses how to help students write more effectively, using dictation in Word and PowerPoint in Office 365; PowerPoint's live captions and translation; Immersive Reader to help younger students, students with dyslexia and other reading difficulties, and even international students; Presentation Coach; and Microsoft's Translator app

Troy mentions the Microsoft Enable team on Twitter, and the Microsoft Accessibility website

You might need some of these links to learn more!

TRANSCRIPT FOR The AI in Education Podcast
Series: 1
Episode: 6

This transcript and summary are auto-generated. If you spot any important errors, do feel free to email the podcast hosts for corrections.

This podcast excerpt features a discussion with Troy, Microsoft's accessibility specialist for the Aussie education team, focusing on how Artificial Intelligence (AI) enhances accessibility in Microsoft Office and Windows products for all learners. A key theme is the shift from accessibility as a niche concern to universal design, where tools like dictation and translation are built-in features that benefit everyone, not just those with diagnosed needs. Specific examples highlighted include dictation, which allows students like one young boy to independently complete creative writing, and the PowerPoint Translator, which provides real-time, multi-language captions for improved understanding during presentations, especially for international students and the deaf. The conversation also explores Immersive Reader, an advanced tool that uses AI for natural-sounding read-aloud functions and can break down text into parts of speech, further demonstrating how complex AI services are being natively integrated to provide simple, powerful support.

Hi, I'm Ray.
And I'm Dan.
And we're here to talk about
AI in education. Ray.
Excellent. Dan, I think I need a bit more AI to keep going with us. Now, today, Dan, we've got a bit of a special guest.
Fantastic. Who's that?
He's sitting next to you, Dan.
Oh, yes. Troy, that's right. Uh, so Troy is our accessibility specialist.
And if he doesn't know something about accessibility, it's not worth knowing.
Good.
But you remember last time we said, let's have a chat with Troy because all of the things we were talking about for the use of AI had implications for how it could be used to help students with accessibility issues. So
Troy, tell us a bit about yourself.
Hi guys. Yeah, so I'm Troy. I'm the, uh, accessibility lead for the Aussie education team here at Microsoft. I work very much with schools around switching on their Office 365 capabilities and making sure that they're getting the most out of what they've purchased. And I also am lucky enough to be helping them with their accessibility stuff. I think the thing to sort of point out here is it's not just the kids we think need it; it's all children, right? And all learners. So being able to turn this stuff on actually ends up helping the kids that we think need it and also everybody else.
That's a really good point.
I remember you saying, you know, about the proportion of people that had accessibility needs, and that it was always somebody else.
And then, you know, I'm a little bit older than both of you, so my eyesight is starting to go, and now I need reading glasses. So now I see some of those scenarios where somebody points their phone at a menu and gets the menu read out loud to them, because, you know, if I'm in a really romantic restaurant with mood lighting, I can't read the menu. So I am now in that group of people that has accessibility needs.
Well, for what it's worth, I actually point my wife at the menu and then she reads it back to me. But I understand what you're saying about using technology. Very good.
My early trick was always to say to the waiter, "This is a great menu, but what's particularly good that you'd recommend?"
I want to go to a restaurant with you, too. Sounds very exciting. So, Troy, when we're looking at our products in the Microsoft suite, you know, there's a lot of them, from Windows all the way through to Office. Are there ones that jump out to you that really support the learners?
I think what's important to note is a lot of these accessible technologies, a lot of these supportive technologies are ubiquitous to Office right across Office and right across Windows for that matter. So, are there ones that really stand out to me? Yes. But remembering too that the fact that it's ubiquitous to Office means that everybody's got access to them all the time. But there's definitely some great things like we've got dictation right across um Office now. We've got the ability to uh translate. We've got the translator in there as well. We've got this thing called immersive reader which uses a combination of dictate and read aloud. And then of course we've got the translator app which is really good for the deaf and also for EAL students. So students learning another language.
Okay. So you started off by talking about dictation. So we know computers are listening to us all of the time. We talked about that earlier, didn't we, Dan? Yeah, we did.
So what does that mean? Where is that used in accessibility?
Well, with dictation, the most powerful story I heard was from a speech pathologist, who said to me that she was working with a young boy who had never in his life, grade five, never in his life written a story on his own, never seen it through to completion. And with the dictation in Office and also in OneNote, he was able to actually dictate his story and get his ideas down on paper, or at least get his ideas down onto the screen. So for an educator, we know that it's not just the actual act of writing; it's the thinking, the planning, the structuring. And so by helping him with the actual act of writing, he was able to get all these ideas, this story, down onto his OneNote page. And the exciting thing for him was it was the first time that he'd ever done it independently. And, uh, when the speech pathologist told me this story, I was blown away: there it is, the technology is actually doing what we were hoping it was going to do.
And when you look at the technology behind that, to be a bit geeky for a minute, you spoke to the engineering lead in charge of dictation and some of these features,
in charge of translating
and then when we were talking earlier, you were talking about the way that it wasn't picking up just one person on your machine and kind of learning your voice. It was bringing everybody in. Is that right?
Yeah. Because I remember the old days of dictation: you had a special piece of software and you had to read out a whole lot of stuff. You had to do training. And it was the other end of the world from pressing the home button on my phone and going, "Hey Siri, do this." You know, there's no training involved. I am just talking to it.
That's right. So it's learning from everyone, and that was something that I had to be corrected on. I thought it was doing exactly what you were saying, Ray, that it was learning from the individual, and just the individual on that device. No, it's the whole community, the whole world community, using this. So it's learning to pick up what we're saying, determining accents, different languages, etc. So the more people that use it, the better it gets at hearing us.
Okay. So, AI can be super complicated and super powerful, but is it difficult to use? Like, do I have to go and install something? If I want to use dictation in Word, for myself or for students, do I have to go and do something?
No. No. The brilliant thing about this is that it is native to Office, which means you just need to be running the most recent builds of, say, Office 365 and making sure that all your updates are there. You don't even need to be switching on intelligent services anymore. We used to have to do that in, um, the settings. We don't need to do that anymore. So it's just there waiting. And the cool thing, too, is it's waiting in the browser version of things like Word and OneNote and Outlook as well.
So, tell me, where do I find it?
So, in your browser. So, if you were to go to, you know, outlook.com or
the Office menu bar at the top.
Exactly. Right. Yeah. So, it sits over towards the top right is where it sits.
For the benefit of the audience, Troy is pointing to the top right. You're imagining this in your head. Okay. So, you're going to the top right. You're going to Dictate. Is that what it's called?
There's an icon. It's just a little microphone, right? So you can hover over that and it'll say Dictate. But for most of us, and also for those that are challenged by text, it's just nice to see that little microphone. We know what that means. That's the speech.
And I suppose we think about it from an accessibility point of view, for people who want to talk into the machine and whatever. But it's also good for people working in the field: admin officers, doctors, people supporting people out on the front line.
Another really cool thing about dictation is the fact that it's built into Outlook. So now, when I want to dictate my emails, I can do that. So I can be multitasking if I'm not challenged, for example, by text or by typing. I can actually dictate my emails. But even better than that is for some students who are locked out of email communications, especially in secondary schools. If you've got your own kids in secondary, you know that they're going to be getting emails from their teachers all the time. Think about the amount of kids that are actually locked out of that mode of communication. So the fact that they can dictate their emails is phenomenal. It's a game changer for them.
And the same engine, I suppose, you know, can read that back to you as well, for some of those students that can't access it.
That's right. So the read aloud function, again, that's ubiquitous to much of Office.
So that you can have your emails read back to you or your word documents read back to you or your OneNote pages read back to you.
Yeah.
So I think that's really cool, because, you know, we've been talking for a while now around AI in education. But what we've been talking about now is how you make things more accessible to students. So the AI stuff that's going on in the background is great whizzy stuff, but it's the simplicity of: look, just press that button and start talking to your computer and doing, you know, dictation. But let's be honest, dictation's been around for a long time. You know, we made it easier and we made it recognize people more easily without having to train it, but it's been around for a long time. So, I've seen you show something in PowerPoint that blew my mind the first time I saw it, because that's not been around for a while. So, tell me about what we've done with it in PowerPoint.
PowerPoint Translator now sits as a native feature, so you don't necessarily need to download any add-ins, etc. You can, but you don't need to. And, um, what it does is it listens to your voice in real time and it starts to build a transcript, which is coming up, you know, with a slight lag depending on the speed of your network, but it's bringing up the transcript. It's bringing up the captions in real time. It also has the ability to translate that into multiple languages. So I can set that in my presenter view inside PowerPoint. I can set it to be translating directly into Korean. I can set it to translate directly into Chinese. Or I can also give the audience the option to log into this presentation, and they can translate it on their device into whatever language they're comfortable with.
That's pretty good for, like, parents, and access for multiple languages from just one presentation. Yeah.
And I've seen that work in the apps. It also works on the web version. So on the web version, you just go into View and say Show Subtitles, and it's just using the microphone to show it. And it is just like watching foreign language programs on TV. You literally see the subtitles along the bottom of the screen.
Correct. It's awesome. And so that's there natively. The bit you were talking about, the translation, being able to see that, so that's built into it as well?
The ability to actually translate into one language is built in. If you want to open your presentation up for other people to, for want of a better word, dial in or log in through their app, then you need to have the extension, which is downloadable. The other thing that's really powerful for, too, is for the deaf.
So being able to, um, open up your PowerPoint presentation, start to speak, and have people that are hard of hearing getting the captions in real time is phenomenal.
So I think about higher education. The scenarios I think about with that are, first of all, just having transcripts on the slides, so that you can see what is being said at the front of the room. But the second is, typically a lecture in Australia, let's say it's got 100 students; the chances are 35 of those students don't have English as their first language. In business courses, it's like 80% of them. What is really interesting is that ability to have the student see a translated set of subtitles on their own phone. It means that the Chinese students can see it in Chinese, and the Indian students can see it in their own language on their phone, at the same time as seeing the lecturer speaking in English and seeing the subtitles in English from the lecturer. And the reason I think that that's an amazing scenario is that when I was younger, I lived in Holland. Dutch is a pretty difficult language, but I learned Dutch from watching American TV programs broadcast with Dutch subtitles, because I could then relate "what's that word mean" to being able to see it myself.
But the other clever use of AI in that tool is that our, uh, services in the back end are sitting there and listening. They're not only listening to what you're saying to translate it; they actually use some of the context of the PowerPoint slides and the notes. So I think if you're doing a presentation on dinosaurs, and you put lots of dinosaurs in there and the names of dinosaurs, then it picks that context up and can try to bring those words through as well when you're talking.
And for what it's worth, that's where it's mimicking language learning. I was a language teacher for a very long time, and we would often tell the students to look at the context, look at the pictures, look at the scenarios that you're in, etc., and be listening for cues or watching for cues. The AI mimics that, because it's looking at the context of the slides and thinking, is that word this, or is that word that? And then, you know, very fast of course, it's going into its database of words which you've uploaded and it's going, it must be that word. So it's very, very cool.
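The context biasing Troy describes can be pictured with a toy sketch. This is purely illustrative and not Microsoft's actual implementation: every name, score, and the boost value below are hypothetical. The idea is simply that a word harvested from the slide deck gets a bonus when the recognizer is choosing between acoustically similar candidates.

```python
# Toy sketch of vocabulary biasing in speech recognition.
# Illustrative only; the real PowerPoint Translator service is far more
# sophisticated. All names and numbers here are hypothetical.

def bias_candidates(candidates, slide_vocabulary):
    """Pick the candidate the slide context supports.

    candidates: list of (phrase, acoustic_score) pairs from the recognizer,
                where a higher score means a better acoustic match.
    slide_vocabulary: set of lowercase words harvested from slide text and notes.
    """
    BOOST = 0.2  # hypothetical bonus for words seen on the slides
    best_phrase, best_score = None, float("-inf")
    for phrase, score in candidates:
        if phrase.lower() in slide_vocabulary:
            score += BOOST
        if score > best_score:
            best_phrase, best_score = phrase, score
    return best_phrase

# The recognizer hears something during the dinosaur talk but is unsure:
slide_vocab = {"velociraptor", "triceratops", "cretaceous"}
heard = [("velocity raptor", 0.55), ("velociraptor", 0.45)]
print(bias_candidates(heard, slide_vocab))  # prints velociraptor: the slide word wins
```

The point of the sketch is only the mechanism: context shifts the scores, so a niche word like "velociraptor" can beat a more common-sounding phrase.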
Yeah.
So the kind of scenario is, we've talked about the fact that computers can recognize human speech as effectively as a human can. We've then taken that really rocket-sciency kind of stuff and turned it into something that's a click in PowerPoint that improves accessibility. But it's really interesting, because in some scenarios it might be 80% of the people in the room that actually need some accessibility support.
But what about students, though, who are dyslexic or dyspraxic or something like that? Have we got tools to help them?
Oh, 100%. So we have this tool called Immersive Reader. Well, let's take a step back. We also have Read Aloud. So Read Aloud sits inside Word. Read Aloud sits inside Outlook. Read Aloud also sits inside the Edge browser. And it has the ability to actually just read the text on the screen, but it reads it in a very natural way. It's not that sort of stunted, uh, robotic-sounding voice, which I think you were saying you knew something about, how this all worked.
Oh, so the text to speech stuff. If you go back a few years, you'd know when you were listening to a computer reading something out, because it was the War Games voice. Uh, you're looking at me blankly. I'm too old. I'm the only one that remembers War Games.
I remember war games.
I just say, "Greetings, Professor Falken."
Yeah.
Greetings, Professor Falken.
Awesome. So, it was that kind of voice reading text. And so, I used to find I couldn't listen to long blocks of speech, because you had to really concentrate to hear it. We've now developed neural network text to speech, and that is about using much more of the context of the information in order to make the voice sound more natural. So it's no longer stunted; it's actually a natural flow of speech. We'll get a couple of recordings and stick them in here.
Oh, that would be good.
Greetings, Professor Falken. I'm today's text-to-speech voice. And as you might be able to tell, I even sound a bit Australian. Text to speech is continuing to get better, so you can expect me to sound even more natural in the future.
It's really interesting how it's becoming more and more natural and, frankly, mentally less draining to listen to. So now I'm prepared to listen to a page of text being read out to me, because, you know, I'm lazy, or I'm in the car or something and I can't read.
But a step up from that, Ray, is it will also read the language pack on your machine and determine what accent it's going to read in. So if you set it for Irish, Great Britain, Indian English, Australian English, it'll come out with those accents, which is phenomenal.
And so that's useful from an accessibility point of view. How?
Well, very much so, because if you're trying to learn a language, there are going to be pages that you're going to move through. So, for example, if you're, you know, recently arrived in a country like New Zealand and you're from India, it's going to be much easier for you to be listening to texts in an accent that's similar to what you know. So, for example, you may be listening at first to the text read to you in English, but with an Indian accent, and then as time progresses, you'll move over to maybe a New Zealand accent, an Australian accent, or something like that. So you can stage it. But also, it's very popular with teachers to move away from, and not that I'm saying there's anything wrong with this, but to move away from the American accent, because the American accent is so ubiquitous to our kids. Sometimes the teachers like to just give them a break from that. That being said, I don't quite subscribe to that, because I think, hey, let's give them a multitude of accents. So, when I was a language teacher, for example, in Korea, we had people that were from Canada, people from the UK, people from the US, and we intentionally tried to expose our students to as many of these different accents as possible, because otherwise they tune into one and tune out of others. Reminds me of the time, again, when I lived in Holland, where a lot of Dutch people spoke English with an American accent, because there was so much American programming on TV.
Yeah.
So that's interesting. And, you know, you kind of got that bit around, we're focusing a lot on the speech in, speech out. What about the language understanding stuff? I know it goes beyond translation, because when we were talking about chatbots a few weeks ago, it was the ability to understand what somebody was saying. Break apart a sentence and go, well, that's a question, it's about this kind of object. I know I've seen you show the Immersive Reader where it's breaking apart parts of text.
Yeah. Well, even before we step away from the Read Aloud, it'll actually read the punctuation as well. So if there's a question mark at the end of a sentence, or an exclamation point, or a full stop, it'll read the sentence back to reflect the punctuation. Yeah. What you're talking about there is the parts of speech within Immersive Reader. And what it does is it will identify the nouns, the adverbs, the adjectives, and the verbs, and it will contextualize those. So, for example, if I had a word like tag: in one context, it's going to be a noun. I'm going to play a game of tag. In another context, it's going to be a verb. I'm going to tag you. In that sense, it will differentiate between tag as a noun and tag as a verb, which I don't know how it does, but I expect that's really quite complex.
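The noun-versus-verb distinction Troy mentions is a standard part-of-speech tagging problem. Immersive Reader's actual models aren't public, so the following is only a toy, rule-based sketch of the idea: the neighbouring words give the context that disambiguates the word. All the word lists here are made up for illustration; real taggers use trained statistical models, not hand-written rules.

```python
# Toy context-based disambiguation of "tag" as noun vs verb.
# Purely illustrative; real POS taggers learn this from data.

DETERMINERS = {"a", "an", "the", "of"}                 # words that usually precede a noun
VERB_CUES = {"to", "i", "you", "we", "they", "will"}   # words that often precede a verb

def pos_of(word, previous_word):
    """Guess whether `word` is used as a noun or a verb from one word of context."""
    prev = previous_word.lower()
    if prev in DETERMINERS:
        return "noun"
    if prev in VERB_CUES:
        return "verb"
    return "unknown"

print(pos_of("tag", "of"))   # "a game of tag"  -> noun
print(pos_of("tag", "to"))   # "going to tag you" -> verb
```

One word of left context is obviously far too little in general; the sketch just makes concrete what "contextualize" means in Troy's example.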
Yeah. So, that'll be part of the language understanding service. So, we've talked in the past around the fact that AI can now comprehend things more effectively than humans. So, we've now got these situations where AI can pass an exam by reading a block of text and answering questions about that piece of text. And so, that's that language understanding service, where it's tearing the whole thing apart and being able to get to a deep understanding of the intent. That actually comes from work that was started for search engines. Because when you go to the web and you're searching for something, what the search engines are trying to do is understand the intent of your question. If you go and put in a flight number, they know that the intent of your question is not to find every reference to that flight on the internet; it's to find out how that flight is going. So if you go and put in a flight number, what you see is whether that flight is scheduled to arrive on time, or whether it's going to be late. Because it's trying to understand the intent. And that's the same service that was developed there, now being used in accessibility tools, to be able to then say, okay, so now we can use it to tear apart sentences and find references. So, you know, it's really cool that you've got things that are built for one purpose then being repurposed into another.
Exactly right. And that's what we call universal design, and in education we call it universal design for learning. So the application for that parts of speech is, you may have a student that is challenged in some way, and the ability to identify parts of speech is really helpful. But for someone like myself, who was teaching the early years, so younger kids, to read, being able to highlight all the nouns in a paragraph, all the adverbs, all the verbs, etc., and reflect back to a text type is really cool. And that's going to help all kids.
And the other thing about Immersive Reader: we can color them. And the reason that was set up is so that people could actually set the colors to be able to distinguish these words from one another. So you're not going to put blue and brown; you might put blue and yellow, so these are quite different colors, if you're having trouble distinguishing colors. But what that does for a teacher is, a teacher can actually put that up with her younger kids and say, "This is a narrative genre. Kids, we're supposed to have a lot of adjectives. What color are adjectives? They're blue. I don't see a lot of blue in my text." So these kids that aren't diagnosed with something are actually looking at a paragraph and seeing the parts of speech.
Awesome. And the user interface with that as well. So, not only is it doing that, I remember seeing a demo I think you ran recently where you can then also change the contrast of the text, the color of the background, and then also focus line by line. So students who are struggling to read can really get focused in on particular words and sentences.
I love the fact that, with the subjects we've been talking about the past few weeks around AI for education, we're not talking about AI; we're talking about the outcomes, you know, what it can enable. Because you kind of get into a scenario where you say, well, we can do speech to text. Oh, well, that means we can put subtitles on a slide. Well, that means we can put it on the student's phone. Oh, we can translate it. That means we can translate it into their home language. And suddenly you're enabling a great learning scenario, but it's a bunch of technical AI on the other end. And then it's people that understand what it is that a teacher does that we can help to improve.
But we've also opened up that Immersive Reader API, the kind of application interface, to other developers as well. So if Encyclopaedia Britannica, or whoever it may be, a third party, want to use those features, then they can utilize that,
and and it's free to them.
Yeah.
You know, we've just shown our age a bit, Dan, because people are going, encyclopedia, what?
yeah encyclopedia Australia now.
So I saw you talking about something I think was called presentation coach or powerpoint coach.
Uh, yeah, the presenter, uh,
Presenter Coach. It's Presenter Coach.
Okay tell me about that because that was like another step on again it was just using those same services but it was what is a problem we can try and solve.
So it will listen to a student, or listen to anyone, but listen to a student do their presentation, and then it will give them feedback. In terms of, for example, how many ums, which I am horrible with, it'll count their ums. Um, it'll look at their pauses and things like that, and it gives them feedback on how they've presented, so that they can actually go back and, you know, have another shot, and hopefully see a reduction in these errors, or a reduction in the issues.
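At its simplest, the filler-word feedback Troy describes is just counting occurrences in the rehearsal transcript. A minimal sketch of that one piece (nothing like Presenter Coach's real implementation, which also analyses pace, pitch, pauses, and wording, and whose internals aren't public):

```python
# Minimal filler-word counter over a rehearsal transcript.
# Illustrative only; the filler list below is a made-up example.
import re
from collections import Counter

FILLERS = {"um", "uh", "like", "basically", "actually"}

def count_fillers(transcript):
    """Return a Counter of filler words found in the transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    return Counter(w for w in words if w in FILLERS)

feedback = count_fillers("Um, so today, um, we will, uh, basically cover dinosaurs.")
print(feedback)  # um appears twice, uh and basically once each
```

From a count like this it's a small step to the kind of feedback the tool gives, such as fillers per minute across repeated rehearsals.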
Look, I know why that would be good for me, because I spend a lot of time presenting, and for me to be able to get feedback, count my ums, that would be pretty awful to see the first time. But how would that be useful for a student?
Well, the world that we live in is, you know, students are doing presentations from as, you know, low as prep, uh, or as reception. So kids being able to get that kind of support, without necessarily a teacher in the room or their peers, is really good.
Troy, the last thing I want to ask you about is you showed me something on your phone around translation, an app that was being used for translation and again that was really interesting for education scenarios. Just tell me what you were showing me.
Yeah. Well, the Translator, Microsoft Translator, sits as an app on, you know, your favorite mobile device, but it is also a web-based portal. So you can come in through your, you know, your Surface, your Mac, whatever it is you're coming through. And what it does is it uses that same dictation engine that listens to your voice, but then it does a real-time translation of that into another language. Right? So there's 63 different text languages and 11 spoken languages. Right? So, do you remember Star Trek? They would boldly go where no one's gone before, and then they would arrive there and everyone speaks English. And of course, the workaround for that in Star Trek, and this is me showing my geekiness, was called a universal translator. People talked to me about the Babel fish, and I'm sure the universal translator was there first.
Well, we're living in that age now, where for at least 11 of these languages, I can be having a conversation with my in-laws in China. And this is a real-world scenario; this really happened. I was able to have a depth of conversation with them where I'm speaking to them in English, and then the translation is coming to them, in a very short amount of time, in Chinese, and vice versa. And so we had this level of conversation that we'd never had before, which was probably sometimes a good thing. When I reported this to the guys at the translator app, the developers, which was quite exciting just to be chatting to them, they were really blown away. They were really excited that there was this sort of real-world application for it. And I think, too, being able to use that in an education context, for example, teachers talking to parents on parent-teacher night: they could actually have the app open and be having that sort of, you know, level of conversation that they haven't had before. Because whilst we say, yeah, get a translator in, that's actually outside the pricing of most schools.
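For developers curious what sits behind tools like this, the text-translation side is exposed as a REST service. A hedged sketch of assembling such a request is below; it follows the shape of the Azure Translator v3 REST API as I understand it (endpoint, `api-version` and `to` query parameters, `Ocp-Apim-Subscription-Key` header), but no network call is made, the key and region are placeholders, and the current API surface should be checked against Microsoft's own documentation before use.

```python
# Sketch of building a call to a Translator-style REST API (v3.0 shape).
# No request is sent; key and region below are placeholders, and the exact
# API details should be verified against the official documentation.
import json

ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"

def build_translate_request(text, to_languages, key="YOUR_KEY", region="australiaeast"):
    """Assemble the URL, headers, and JSON body for one translate call."""
    params = "&".join(["api-version=3.0"] + [f"to={lang}" for lang in to_languages])
    url = f"{ENDPOINT}?{params}"
    headers = {
        "Ocp-Apim-Subscription-Key": key,        # placeholder credential
        "Ocp-Apim-Subscription-Region": region,  # placeholder region
        "Content-Type": "application/json",
    }
    body = json.dumps([{"text": text}])          # the API takes a list of texts
    return url, headers, body

url, headers, body = build_translate_request("Welcome, parents!", ["zh-Hans", "ko"])
print(url)
# https://api.cognitive.microsofttranslator.com/translate?api-version=3.0&to=zh-Hans&to=ko
```

Note that one request can carry several `to` languages, which matches the parent-teacher scenario of producing multiple translations from a single utterance.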
I, uh, sadly end up watching too much reality TV, and I watch Border Force, and they sit there interviewing somebody, and there's somebody on the end of the phone. And every time I watch that I think, why haven't they just put the phone on the table and done that? Because I see people doing it on holiday. I see people doing it in education scenarios: they speak, it translates, it speaks in a foreign language. That is an amazing scenario. Again, in our education system, certainly in higher education, huge numbers of international students, and language is a bit of a barrier sometimes. So, you know, I can see those scenarios are really cool. What is really interesting, from what you've told us around what we're doing in accessibility terms, is we're starting to make those features naturally usable within the products.
What I'd like to see in higher ed too, knowing, you know, my own experiences and work alongside international students, would be to see the Translator as a cognitive service built into a number of these apps that the universities are building. So whether that's, um, an orientation-style app, etc., where the students can actually get full access to everything in their language, and the legwork is all being done by the cognitive service, which has translated it.
Yeah, 100%.
Awesome.
Good story. Yeah.
Okay. Well, that's been really useful, because I've got a bit more depth, even beyond the stories you've told me before, around what we're doing to use the AI to provide services to the students. And, you know, I guess what I'm expecting to see is that more and more of those facilities are just going to be built into the apps we use. You know, accessibility is not going to be this thing over on the side, which I remember when I first entered education technology, you know, it was a specialist area, and out of 2,000 people that worked for the company I worked for, there was one expert that you went to every time. I think the stories you're telling me are about the way that we're building that in right across everything. And suddenly you can now actually provide accessibility for every student in every classroom, regardless of what their accessibility needs are.
And for me, my takeaway is when you talked about inclusion there, and I think it goes back to that tagline around "when we all play, we all win". And, you know, that connection with: it's not just about the individuals who are struggling. We've got tools for that, but it's about bringing everybody to the table and increasing everybody's learning and supporting everybody in the classroom, like the Xbox Adaptive Controller and things like that. Bringing everybody together.
That's exactly right. So, as we cast the net wide through universal design or universal design for learning, we're not only going to catch the fish that we're hoping to catch, we're going to catch the fish that we didn't even realize needed to be caught.
Yeah, great analogy.
So, Troy, if people are interested to learn more about what we've talked about, what do they type into their favorite search engine to get to the place that will give them answers?
Well, if people are really keen to connect with us, there's the Twitter handle, um, MSFTEnable, which is, uh, Microsoft Accessibility. There's a lot of really good stuff that comes through there. They can type "Microsoft accessibility" into their favorite search engine, or even "Microsoft accessibility" and their country. And then there's a whole heap of resources. There's a rabbit hole to fall down, and I'm sure people will have a great time.
Okay, so we'll put some links into the show notes for that.
The other thing I know is we have the Disability Answer Desk.
Yep. That's that's part of Microsoft accessibility in your country. Brilliant.
That's right.
And that's about having a phone number where you can just phone up and ask somebody.
Correct. Yes. That's a real person on the end. 9:00 a.m. to 9:00 p.m. Monday to Friday, 10 till 6 Saturday and Sunday. And then after that, there's a 24/7, uh, real-time chat.
Awesome. And we don't know whether we're chatting to a bot or a human.
Okay. Thanks for coming in, Troy. It's been really good to hear those stories, and, uh, we'll chat to you again at some point in the future.
Thanks, guys.
Thank you.