Oct 30, 2019
Ray and Dan are joined by Troy Waller, who's the Accessibility Lead for the Microsoft Education team in Australia, to talk about how artificial intelligence is being used to support students.
Troy discusses how to help students write more effectively, using dictation in Word and PowerPoint in Office 365; PowerPoint's live captions and translation; Immersive Reader to help younger students, students with dyslexia and other reading difficulties, and even international students; Presentation Coach; and Microsoft's Translator app
Troy mentions the Microsoft Enable team on Twitter, and the Microsoft Accessibility website
You might need some of these links to learn more!
TRANSCRIPT FOR The AI in Education Podcast
Series: 1
Episode: 6
This transcript and summary are auto-generated. If you spot any important errors, do feel free to email the podcast hosts for corrections.
This podcast excerpt features a discussion with Troy, Microsoft's accessibility specialist for the Aussie education team, focusing on how Artificial Intelligence (AI) enhances accessibility in Microsoft Office and Windows products for all learners. A key theme is the shift from accessibility as a niche concern to universal design, where tools like dictation and translation are built-in features that benefit everyone, not just those with diagnosed needs. Specific examples highlighted include dictation, which allows students like one young boy to independently complete creative writing, and the PowerPoint Translator, which provides real-time, multi-language captions for improved understanding during presentations, especially for international students and the deaf. The conversation also explores Immersive Reader, an advanced tool that uses AI for natural-sounding read-aloud functions and can break down text into parts of speech, further demonstrating how complex AI services are being natively integrated to provide simple, powerful support.
Hi, I'm Ray.
And I'm Dan.
And we're here to talk about
AI in education. Ray.
Excellent. Dan, I think I need a bit more AI to keep going with us.
Now, today, Dan, we've got a bit of a special guest.
Fantastic. Who's that?
He's sitting next to you, Dan.
Oh, yes. Troy, that's right, from our team. Uh, so Troy is
our accessibility specialist.
And if he doesn't know something about accessibility, it's not
worth knowing.
Good.
But you remember last time we said, let's have a chat with Troy
because all of the things we were talking about for the use of AI
had implications for how it could be used to help students with
accessibility issues. So
Troy, tell us a bit about yourself.
Hi guys. Yeah, so I'm Troy. I'm the uh accessibility lead for the
Aussie education team here at Microsoft. I work very much with
schools around switching on their Office 365 capabilities and
making sure that they're getting the most out of what they what
they purchased. And I also am lucky enough to be helping them with
their accessibility stuff. I think the thing to sort of point out
here is it's not just the kids that need it, it's all children,
right? And all learners. So being able to turn this stuff on
actually ends up helping the kids that we think need it and also
everybody else.
That's a really good point.
I remember you saying, you know, the proportion of people that had
accessibility needs, and that it was always somebody else. And then,
you know, I'm a little bit older than both of you, so my eyesight is
starting to go and now I need reading glasses. Now I see some of
those scenarios where somebody points their phone at a menu and gets
the menu read out loud to them, because, you know, if I'm in a
really romantic restaurant with mood lighting, I can't read the
menu. So I am now in that group of people that has accessibility
needs.
Well, for what it's worth, I actually point my wife at the menu and
then she reads it back to me. But I understand what you're saying
about using technology. Very good.
My early trick was always to say to the waiter, "This is a great
menu, but what's particularly good that you'd recommend?"
I want to go to a restaurant with you, too.
Sounds very exciting. So, Troy, when we're looking at our
products in the Microsoft suite, you know, there's a lot of them
from Windows all the way through to Office. Are there ones that
jump out to you that kind of really support the learners?
I think what's important to note is a lot of these accessible
technologies, a lot of these supportive technologies, are ubiquitous
right across Office, and right across Windows for that matter. So,
are there ones that really stand out to me? Yes. But remember that
the fact that they're ubiquitous to Office means that everybody's
got access to them all the time. There are definitely some great
things. We've got dictation right across Office now. We've got the
ability to translate; we've got the translator in there as well.
We've got this thing called Immersive Reader, which uses a
combination of Dictate and Read Aloud. And then of course we've got
the Translator app, which is really good for the deaf and also for
EAL students, that is, students learning another language.
Okay. So you started off by talking about dictation. So we know
computers are listening to us all of the time. We talked about that
earlier, didn't we, Dan? Yeah, we did.
So what does that mean? Where is that used in accessibility?
Well, with dictation, the the most powerful story I heard was a
speech pathologist said to me that she was working with a young boy
who had never in his life, grade five, never in his life written a
story on his own, never seen it through to completion. And with the
dictation in Office, and also in OneNote, he was able to actually
dictate his story and get his ideas down on paper, or at least get
his ideas down onto the screen. So for an educator,
we know that it's not just the actual act of writing, it's the the
thinking, the planning, the structuring. And so by helping him with
the actual act of writing, he was able to get all these ideas, this
story down onto his OneNote page. And the exciting thing for him
was it was the first time that he'd ever done it independently. And
uh when the speech pathologist told me this story, I was blown away
that there it is, the technology is actually doing what we were
hoping it was going to do.
And when you look at the technology behind that, to be a bit geeky
for a minute, you spoke to the engineering lead in charge of some of
these features, dictation and translation. And then, when we were
talking earlier, you were talking about the way that it wasn't
picking up just one person on your machine and kind of learning your
voice. It was bringing everybody in. Is that right?
Yeah. Because I remember the old days of dictation: you had a
special piece of software and you had to read out a whole lot of
stuff. You had to do training. And it was the other end of the world
from pressing the home button on my phone and going, "Hey Siri, do
this." You know, there's no training involved; I am just talking to
it.
That's right. So it's learning from everyone, and that was something
that I had to be corrected on. I thought it was doing exactly what
you were saying, Ray, that it was learning from the individual, and
just the individual on that device. No, it's the whole community,
the whole world community, using this. So it's learning to pick up
what we're saying, determining accents, different languages, etc. So
the more people that use it, the better it gets at hearing us.
Okay. So AI can be super complicated and super powerful, but is it
difficult to use? Like, do I have to go and install something? If I
want to use dictation in Word, for myself or for students, do I have
to go and do something?
No. No. The brilliant thing about this is that it is native to
Office, which means you just need to be running the most recent
builds of, say, Office 365 and making sure that all your updates are
there. You don't even need to be switching on intelligent services
anymore. We used to have to do that in the settings; we don't need
to do that anymore. So it's just there waiting. And the cool thing,
too, is it's waiting in the browser version of things like Word and
OneNote and Outlook as well.
So, tell me, where do I find it?
So, in your browser. So, if you were to go to, you know,
outlook.com or office.com...
...the menu bar at the top.
Exactly. Right. Yeah. So, it sits over towards the top right.
For the benefit of the audience, Troy is pointing to the top right.
You're reimagining this in your head. Okay. So, you're going to the
top right. You're going to Dictate. Is that what it's called?
There's an icon. It's just a little microphone, right? So, you can
hover over that and it'll say Dictate. But for most of us, and also
for those that are challenged by text, it's just nice to see that
little microphone. We know what that means. That's the speech.
And I suppose we think about it from an accessibility point of view
for people who want to talk into the machine and whatever. But it's
also good for people working in the field: admin officers, doctors,
people supporting people out in the front line.
Another really cool thing about dictation is the fact that it's
built into Outlook. So now, when I want to dictate my emails, I can
do that. So I can be multitasking, if I'm not challenged, for
example, by text or by typing; I can actually dictate my emails. But
even better than that is for some students who are locked out of
email communications, especially in secondary schools. If you've got
your own kids in secondary, you know that they're going to be
getting emails from their teachers all the time. You think about the
number of kids that are actually locked out of that mode of
communication. So the fact that they can dictate their emails is
phenomenal. It's a game changer for them.
And the same engine, I suppose, you know, can read that back to you
as well, for some of those students that can't access it.
That's right. So the Read Aloud function, again, that's ubiquitous
to much of Office.
So that you can have your emails read back to you or your word
documents read back to you or your OneNote pages read back to
you.
Yeah.
So I think that's really cool, because, you know, we've been talking
for a while now around AI in education. But what we've been talking
about now is how you make things more accessible to students. So the
AI stuff that's going on in the background is great whizzy stuff,
but it's the simplicity of, look, just press that button and start
talking to your computer and doing, you know, dictation. But let's
be honest, dictation's been around for a long time. You know, we
made it easier and we made it recognize people more easily without
having to train it, but it's been around for a long time. So, I've
seen you show something in PowerPoint that blew my mind the first
time I saw it, because that's not been around for a while. So, tell
me about what we've done with it in PowerPoint.
PowerPoint Translator now sits as a native feature, so you don't
necessarily need to download any add-ins, etc. You can, but you
don't need to. And what that does is it listens to your voice in
real time and it starts to build a transcript, which is coming up,
you know, with a slight lag depending on the speed of your network,
but it's bringing up the transcript; it's bringing up the captions
in real time. It also has the ability to translate that into
multiple languages. So I can set that in my presenter view inside
PowerPoint. I can set it to be translating directly into Korean. I
can set it to translate directly into Chinese. Or I can also give
the audience the option to log into this presentation, and they can
translate it on their device into whatever language they're
comfortable with.
That's pretty good for, like, parents: access in multiple languages,
not just one translation. Yeah.
And I've seen that work in the apps. It also works on the web
version. So on the web version, you just go into View and say Show
Subtitles, and it's just using the microphone to show it. And it is
just like watching foreign language programs on TV. You literally
see the subtitles along the bottom of the screen.
Correct. It's awesome.
And so that's there natively. The bit you were talking about, the
translation, being able to see that, so that's built into it as
well?
The ability to actually translate into one language is built in. If
you want to open your presentation up for other people to, for want
of a better word, dial in or log in through their app, then you need
to have the extension, which is downloadable. The other thing that's
really powerful, too, is for the deaf.
So being able to open up your PowerPoint presentation, start to
speak, and people that are hard of hearing having the text, the
captions, in real time is phenomenal.
So I think about higher education. The scenarios I think about with
that are, first of all, just having transcripts on the slides so
that you can see what is being said at the front of the room. But
the second is this: typically a lecture in Australia, let's say it's
got 100 students; the chances are 35 of those students don't have
English as their first language, and in business courses it's like
80% of them. What is really interesting is that the ability to have
the student see a translated set of subtitles on their own phone
means that the Chinese students can see it in Chinese, the Indian
students can see it in their own language on their phone, at the
same time as seeing the lecturer speaking in English and seeing the
subtitles in English from the lecturer. And the reason I think that
that's an amazing scenario is that when I was younger, I lived in
Holland. Dutch is a pretty difficult language, but I learned Dutch
from watching American TV programs broadcast with Dutch subtitles,
because I could then relate "what's that word mean?" to being able
to see it myself.
But the other clever use of AI in that tool is that our services in
the back end, that are sitting there and listening, they're not only
listening to what you're saying to translate it. They actually use
some of the context of the PowerPoint slides and the notes. So I
think if you're doing a presentation on dinosaurs, and you put lots
of dinosaurs in there and the names of dinosaurs, then it picks that
context up and can try to bring those words through as well when
you're talking.
And for what it's worth, that's where it's mimicking language
learning. I was a language teacher for a very long time, and we
would often tell the students to look at the context, look at the
pictures, look at the scenarios that you're in, etc., and be
listening for cues or watching for cues. And the AI mimics that,
because it's looking at the context of the slides and thinking, is
that word this or is that word that? And then, you know, very fast
of course, it's going into its database of words, which you've
uploaded, and it's going, it must be that word. So it's very, very
cool.
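The context biasing Troy describes can be sketched as a toy re-ranking step: given several candidate transcriptions from a speech recognizer, prefer the ones that contain vocabulary drawn from the slide deck. This is only an illustration of the idea; the real service uses trained language models, and all names and example strings below are invented.

```python
# Toy illustration of context biasing: re-rank candidate transcriptions
# by how many slide-deck vocabulary words each one contains.

def rerank(candidates, slide_vocab):
    """Sort candidates so those sharing more words with the slide deck
    come first; ties keep the recognizer's original order (stable sort)."""
    vocab = {w.lower() for w in slide_vocab}

    def score(text):
        return sum(1 for word in text.lower().split() if word in vocab)

    return sorted(candidates, key=score, reverse=True)

slides = ["Triceratops", "Velociraptor", "Jurassic", "fossil"]
hypotheses = [
    "the velo surraptor hunted in packs",  # acoustically plausible, off-vocabulary
    "the velociraptor hunted in packs",    # matches the slide vocabulary
]
best = rerank(hypotheses, slides)[0]
```

With the dinosaur vocabulary loaded, the second hypothesis wins even though both may sound alike, which is exactly the effect Troy describes.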
Yeah.
So the kind of scenario is: we've talked about the fact that
computers can recognize human speech as effectively as a human can.
We've then taken that really rocket-sciency kind of stuff and turned
it into something that's a click in PowerPoint that improves
accessibility. But it's really interesting, because in some
scenarios it might be 80% of the people in the room who actually
need some accessibility support.
But what about students, though, who are dyslexic or dyspraxic or
something like that? Have we got tools to help them?
Oh, 100%. So we have this tool called Immersive Reader. Well, let's
take a step back. We also have Read Aloud. So Read Aloud sits inside
Word. Read Aloud sits inside Outlook. Read Aloud also sits inside
the Edge browser. And it has the ability to actually just read the
text on the screen, but it reads it in a very natural way. It's not
that sort of stunted, robotic-sounding voice, which I think you were
saying you knew something about, how this all worked.
Oh, so the text-to-speech stuff. If you go back a few years, you'd
know when you were listening to a computer reading something out,
because it was the WarGames voice. You're looking at me blankly. I'm
too old. I'm the only one that remembers WarGames.
I remember WarGames.
I just say, "Greetings, Professor Falken."
Yeah.
Greetings, Professor Falken.
Awesome. So, it was that kind of voice reading text. And so I used
to find I couldn't listen to long blocks of speech, because you had
to really concentrate to hear it. We've now developed neural network
text-to-speech, and that is about using much more of the context of
the information in order to make the voice sound more natural. So
it's no longer stunted; it actually flows naturally. We'll get a
couple of recordings and stick them in here.
Oh, that would be good.
Greetings, Professor Falken. I'm today's text-to-speech voice. And
as you might be able to tell, I even sound a bit Australian.
Text-to-speech is continuing to get better, so you can expect me to
sound even more natural in the future.
It's really interesting how it's becoming more and more natural and
far less mentally draining to listen to. So now I'm prepared to
listen to a page of text being read out to me, because, you know,
I'm lazy, or I'm on a train or in the car or something and I can't
read.
But a step up from that, Ray, is it will also read the language pack
on your machine and determine what accent it's going to read in. So
if you set it for Irish, Great Britain, Indian English, Australian
English, it'll come out with those accents, which is phenomenal.
And so that's useful from an accessibility point of view. How?
Well, very much so, because if you're trying to learn a language,
there are going to be stages that you're going to move through. So,
for example, if you've, you know, recently arrived in a country like
New Zealand and you're from India, it's going to be much easier for
you to be listening to texts in an accent that's similar to what you
know. So you may be listening at first to the text read to you in
English but with an Indian accent, and then, as time progresses,
you'll move over to maybe a New Zealand accent, an Australian
accent, or something like that. So you can stage it. But also, it's
very popular with teachers to move away from the American accent,
and not that I'm saying there's anything wrong with it, but because
the American accent is so ubiquitous for our kids, sometimes the
teachers like to just give them a break from that. That being said,
I don't quite subscribe to that, because I think, hey, let's give
them a multitude of accents.
So, when I was a language teacher, for example, in Korea, we had
people that were from Canada, people from the UK, people from the
US, and we intentionally tried to expose our students to as many of
these different accents as possible, because otherwise they tune
into one and tune out of the others.
Reminds me of the time, again, when I lived in Holland, where a lot
of Dutch people spoke English with an American accent, because there
was so much American programming on TV.
Yeah.
So that's interesting. And, you know, you kind of got that bit
around... we're focusing a lot on the speech: speech in, speech out.
What about the language understanding stuff? I know it goes beyond
translation, because when we were talking about chatbots a few weeks
ago, it was the ability to understand what somebody was saying:
break apart a sentence and go, well, that's a question, it's about
this kind of object. I know I've seen you show the Immersive Reader
where it's breaking apart parts of text.
Yeah. Well, even before we step away from Read Aloud, it'll actually
read the punctuation as well. So if there's a question mark at the
end of a sentence, or an exclamation point, or a full stop, it'll
read the sentence back to reflect the punctuation. Yeah. What you're
talking about there is the parts of speech within Immersive Reader.
And what it does is it will identify the nouns, the adverbs, the
adjectives, and the verbs, and it will contextualize those. So, for
example, if I had a word like tag: in one context, it's going to be
a noun, "I'm going to play a game of tag." In another context, it's
going to be a verb, "I'm going to tag you." In that sense, it will
differentiate between tag as a noun and tag as a verb, which, I
don't know how it does that, but I expect that's really quite
complex.
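The tag-as-noun versus tag-as-verb disambiguation Troy mentions can be illustrated with a toy rule: a word following a determiner or a preposition is read as a noun, while one following an infinitive "to" is read as a verb. The real service uses a trained model over far richer context; this sketch, with its invented function name, only shows the shape of the idea.

```python
# Toy part-of-speech disambiguation for an ambiguous word like "tag",
# using only the immediately preceding token as context.

NOUN_CUES = {"a", "an", "the", "of"}  # determiners/prepositions that signal a noun

def guess_pos(sentence, word):
    """Return a guessed part of speech for `word`, assumed to occur in
    `sentence`. Purely illustrative; real taggers use trained models."""
    tokens = sentence.lower().rstrip(".!?").split()
    i = tokens.index(word)
    if i > 0 and tokens[i - 1] in NOUN_CUES:
        return "noun"
    if i > 0 and tokens[i - 1] == "to":
        return "verb"
    return "unknown"
```

On Troy's two examples, "a game of tag" resolves to a noun and "to tag you" to a verb, from context alone.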
Yeah. So, that'll be part of the language understanding service. So,
we've talked in the past around the fact that AI can now comprehend
things more effectively than humans. So, we've now got these
situations where AI can pass an exam by reading a block of text and
answering questions about that piece of text. And so, that's that
language understanding service, where it's tearing the whole thing
apart and being able to get to a deep understanding of the intent.
That actually comes from work that was started for search engines.
Because when you go to the web and you're searching for something,
what the search engines are trying to do is understand the intent of
your question. If you go and put in a flight number, they know that
the intent of your question is not to find every reference to that
flight on the internet; it's to find out how that flight is going.
So if you put in a flight number, what you see is how that flight is
doing: is it scheduled to arrive on time, is it going to be late?
Because it's trying to understand the intent. And that's the same
service that was developed there, now being used in accessibility
tools, to be able to then say, okay, so now we can use it to tear
apart sentences and find references. So, you know, it's really cool
that you've got things that were built for one purpose then being
repurposed into another.
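The flight-number example can be sketched as a toy intent detector: a query shaped like an airline code plus digits signals a "flight status" intent rather than a generic web search. Real language-understanding services classify across many intents with trained models; the pattern and intent names here are invented for illustration.

```python
import re

# Toy intent detection: a two-letter airline code followed by digits
# looks like a flight number, so route it to a flight-status intent.
FLIGHT_CODE = re.compile(r"^[A-Z]{2}\d{1,4}$")

def detect_intent(query):
    """Return a guessed intent label for a search query (toy sketch)."""
    normalized = query.strip().upper().replace(" ", "")
    if FLIGHT_CODE.match(normalized):
        return "flight_status"
    return "web_search"
```

So "QF 1" would be routed to a live flight-status answer, while an ordinary phrase falls through to normal search, which is the repurposing Ray describes.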
Exactly right. And that's what we call universal design, and in
education we call it universal design for learning. So the
application for that parts-of-speech feature is: you may have a
student that is challenged in some way, and the ability to identify
parts of speech is really helpful. But for someone like myself, who
was teaching the early years, so younger kids, to read, being able
to highlight all the nouns in a paragraph, all the adverbs, all the
verbs, etc., and reflect back to a text type is really cool. And
that's going to help all kids.
And the other thing about Immersive Reader is we can color them. And
the reason that was set up is so that people could actually set the
colors to be able to distinguish these words from one another. So
you're not going to put blue and brown; you might put blue and
yellow, so they're quite different colors, if you're having trouble
distinguishing colors. But what that does for a teacher is, a
teacher can actually put that up with her younger kids and say,
"This is a narrative genre. Kids, we're supposed to have a lot of
adjectives. What color are adjectives? They're blue. I don't see a
lot of blue in my text." So these kids aren't diagnosed with
something, but they're actually visually looking at a paragraph and
seeing the parts of speech.
Awesome. And the user interface with that as well. So, not only is
it doing that. You know, I remember seeing a demo I think you ran
recently where you can then also change the contrast of the text,
the color of the background, and then also focus line by line. So
students who are kind of struggling to read can really get focused
in on particular words and sentences.
I love the fact that, in the subjects we've been talking about the
past few weeks around AI for education, we're not talking about AI;
we're talking about the outcomes, you know, what it can enable.
Because you kind of get into a scenario where you say, well, we can
do speech to text. Oh, well, that means we can put subtitles on a
slide. Well, that means we can put it on the student's phone. Oh, we
can translate it. That means we can translate it into their home
language. And suddenly you're enabling a great learning scenario,
but it's a bunch of technical AI on the other end. And then it's
people that understand what it is that a teacher does that we can
help to improve.
But we've also opened up that Immersive Reader API, the kind of
application interface, to other developers as well. So if
Encyclopaedia Britannica, or whoever it may be, a third party, want
to use those features, then they can utilize that.
And it's free to them.
Yeah.
You know, we've just shown our age a bit, Dan, because people are
going, "Encyclopedia? What?"
yeah encyclopedia Australia now.
So I saw you talking about something I think was called Presentation
Coach or PowerPoint Coach.
Uh, yeah, the presenter, uh...
Presenter Coach. It's Presenter Coach.
Okay, tell me about that, because that was like another step on
again. It was just using those same services, but it was, what is a
problem we can try and solve?
So it will listen to a student, or listen to anyone, but listen to a
student do their presentation, and then it will give them feedback
in terms of, for example, how many ums, which I am horrible with;
it'll count their ums. It'll look at their pauses and things like
that, and it gives them feedback on how they've presented, so that
they can actually go back and, you know, have another shot, and
hopefully see a reduction in these errors or a reduction in the
issues.
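The simplest slice of that feedback, counting the "ums", can be sketched over a plain rehearsal transcript. The real Presenter Coach analyses audio for pace, pitch, and filler words; this toy function, with its invented name, only illustrates the filler count.

```python
# Toy sketch of Presenter Coach-style feedback: count filler words
# ("um", "uh", and the two-word filler "you know") in a transcript.

def count_fillers(transcript):
    words = [w.strip(",.?!").lower() for w in transcript.split()]
    count = sum(1 for w in words if w in {"um", "uh"})
    # Count adjacent word pairs that form "you know".
    count += sum(1 for a, b in zip(words, words[1:]) if (a, b) == ("you", "know"))
    return count
```

A student could run a rehearsal, get a number back, and try to drive it down on the next attempt, which is the feedback loop Troy describes.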
Look, I know why that would be good for me, because I spend a lot of
time presenting, and for me to be able to get feedback, count my
ums, that would be pretty awful to see the first time. But how would
that be useful for a student?
Well, the world that we live in is, you know, students are doing
presentations from as, you know, low as prep or as reception. So
kids being able to get that kind of support, without necessarily a
teacher in the room or their peers, is really good.
Troy, the last thing I want to ask you about is you showed me
something on your phone around translation, an app that was being
used for translation and again that was really interesting for
education scenarios. Just tell me what you were showing me.
Yeah. Well, the translator, Microsoft Translator, sits as an app on,
you know, your favorite mobile device, but it also is a web-based
portal. So you can come in through your, you know, your Surface,
your Mac, whatever it is you're coming through. And what it does is
it uses that same dictation engine that listens to your voice, but
then it does a real-time translation of that into another language.
Right? So there's 63 different text languages and 11 spoken
languages.
Right. So do you remember Star Trek? They would boldly go where no
one's gone before, and then they would arrive there and everyone
speaks English. And of course, the workaround for that in Star Trek,
and this is me showing my geekiness, was called a universal
translator. People talk to me about the Babel fish, and I'm sure the
universal translator was there first.
Well, we're living in that age now, where for at least 11 of these
languages, I can be having a conversation with my in-laws in China.
And this is a real-world scenario; this really happened. I was able
to have a depth of conversation with them where I'm speaking to them
in English, and then the translator is coming back to them, you
know, in a very, very short amount of time, in Chinese, and vice
versa. And so we had this level of conversation that we'd never had
before, which was probably, sometimes, a good thing.
When I reported this to the guys at the translator app, the
developers, which was quite exciting, just to be chatting to them,
they were really blown away. They were really excited that there was
this sort of real-world application for it. And I think, too, being
able to use that in an education context, for example, teachers
talking to parents on parent-teacher night: they could actually have
the app open and be having that sort of, you know, level of
conversation that they haven't had before. Because whilst we say,
yeah, get a translator in, that's actually outside the pricing of
most schools.
I, uh, sadly end up watching too much reality TV, and I watch Border
Force, and they sit there interviewing somebody, and there's
somebody on the end of the phone, and every time I watch that I
think, why haven't they just put the phone on the table and done
that? Because I see people doing it on holiday. I see people doing
it in education scenarios: they speak, it translates, it speaks in a
foreign language. That is an amazing scenario. Again, in our
education system, certainly in higher education, there are huge
numbers of international students, and language is a bit of a
barrier sometimes. So, you know, I can see those scenarios are
really cool. What is really interesting, from what you've told us
around what we're doing in accessibility terms, is that we're
starting to make those features naturally usable within the
products.
What I'd like to see in higher ed, too, knowing, you know, my own
experiences at work and being alongside international students,
would be to see the translator, as a cognitive service, built into a
number of these apps that the universities are building. So whether
that's an orientation-style app, etc., the students can actually get
full access to everything in their language, and the legwork is all
being done by the cognitive service, which is the translator.
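The cognitive service Troy mentions is exposed as a REST API, so an app developer wires it in with one HTTP call per piece of text. Here's a minimal sketch of building such a call in Python, following the public Translator v3 endpoint; the subscription key and region are placeholders you would get from your own Azure resource, and nothing is actually sent unless a key is set.

```python
import json
import urllib.request

# Sketch of calling the Translator cognitive service (REST API v3.0).
# The endpoint and header names follow the public Translator API; the
# key and region below are placeholders, not real credentials.
ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"
SUBSCRIPTION_KEY = ""        # placeholder: your Azure Translator key
REGION = "australiaeast"     # placeholder: your resource's region

def build_request(text, to_langs, api_version="3.0"):
    """Build the URL, headers, and JSON body for one translate call."""
    query = "api-version={}&{}".format(
        api_version, "&".join("to=" + lang for lang in to_langs))
    url = "{}?{}".format(ENDPOINT, query)
    headers = {
        "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
        "Ocp-Apim-Subscription-Region": REGION,
        "Content-Type": "application/json",
    }
    body = json.dumps([{"Text": text}]).encode("utf-8")
    return url, headers, body

url, headers, body = build_request("Welcome to orientation week", ["zh-Hans", "ko"])

# Only send the request if a real key has been configured.
if SUBSCRIPTION_KEY:
    req = urllib.request.Request(url, data=body, headers=headers)
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))
```

One call can request several target languages at once (Chinese and Korean above), which is what lets a single orientation app serve every student in their own language.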
Yeah, 100%.
Awesome.
Good story. Yeah.
Okay. Well, that's been really useful, because I've got a bit more
depth, even from the stories you've told me before, around what
we're doing to use AI to provide services to the students. And, you
know, I guess what I'm expecting to see is that more and more of
those facilities are just going to be built into the apps we use.
You know, accessibility is not going to be this thing over on the
side, which I remember from when I first entered education
technology; you know, it was a specialist area, and out of 2,000
people that worked for the company I worked for, there was one
expert that you went to every time. I think the stories you're
telling me are about the way that we're building that in right
across everything. And suddenly you can now actually provide
accessibility for every student in every classroom, regardless of
what their accessibility needs are.
And for me, my takeaway is when you talked about inclusion there. I
think it goes back to that tagline around "when we all play, we all
win." And, you know, that connection with: it's not just about the
individuals who are struggling, we've got tools for that, but it's
about bringing everybody to the table and increasing everybody's
learning and supporting everybody in the classroom, like the
Adaptive Controller on Xbox and things like that. Bringing everybody
together.
That's exactly right. So, as we cast the net wide through universal
design or universal design for learning, we're not only going to
catch the fish that we're hoping to catch, we're going to catch the
fish that we didn't even realize needed to be caught.
Yeah, great analogy.
So, Troy, if people are interested to learn more about what we've
talked about, what do they type into their favorite search engine to
get to the place that will give them answers?
Well, if people are really keen to connect with us, there's the
Twitter handle MSFTEnable, which is Microsoft accessibility. There's
a lot of really good stuff that comes through there. They can type
"Microsoft accessibility" into their favorite search engine, or even
"Microsoft accessibility" and their country. And then there's a
whole heap of resources. There's a rabbit hole to fall down, and I'm
sure people will have a great time.
Okay, so we'll put some links in the show notes for that.
The other thing I know is we have the Disability Answer Desk.
Yep. That's part of Microsoft accessibility in your country.
Brilliant.
That's right.
And that's about having a phone number where you can just phone up
and ask somebody.
Correct. Yes. That's a real person on the end, 9:00 a.m. to 9:00
p.m. Monday to Friday, 10 till 6 Saturday and Sunday. And then after
that, there's a 24/7, uh, real-time chat.
Awesome. And we don't know whether we're chatting to a bot or a
human.
Okay. Thanks for coming in, Troy. It's been really good to hear
those stories, and, uh, we'll chat to you again at some point in the
future.
Thanks, guys.
Thank you.