Podcast #38: Simon Lau, Otter.ai – Automated Transcription Service

Simon Lau is VP of Product at Otter.ai, an award-winning collaboration app that records and transcribes your meetings and conversations, turning speech into text and making them into notes.

Check out Otter: https://otter.ai/referrals/872BCIOU

Transcription by Otter.ai

Santiago Leon
Hello, and you are here at the Leon Productions podcast, where we interview entrepreneurs, tech influencers and companies that are striving to succeed. Innovation is something that we’ve seen over the years, and it’s still being done today. Today we have a special guest, Simon Lau. He’s the VP of Product at Otter, the award-winning collaboration app that records and transcribes your meetings and conversations and makes them into notes. Welcome to the Leon Productions podcast, Simon.

Simon Lau
Thank you, Santiago. Pleasure to be here.

Santiago Leon
So Simon, I’ve been using Otter for the past couple of months, six months, and it has blown my mind. I actually got into transcription about two years ago, when I was at a conference and noticed that they were using a human. Then I did some more research. I was like, is there a computer AI that’s doing it? And I bumped into Otter. And what I learned was that Otter was more accurate at transcription than Google. Can you tell us a little bit about Otter?

Simon Lau
Yeah, so Otter.ai is a startup in Silicon Valley, in California. We are a team of engineers, product, design, and business folks who are very passionate about building a great product that can transcribe all your conversations. We’re particularly focusing on meetings, and especially now, in these times, virtual meetings. We have done a bunch of integrations with Zoom so that we can live transcribe Zoom meetings, like the one you and I are having right now, and record it as part of this podcast. That is to enable businesses, schools, nonprofits, basically anybody who has the need to provide live transcription, capture meeting notes, or capture interview notes. So we provide a very, very accurate transcription service. It labels all the speakers. Everything becomes searchable, shareable and editable later. Yeah, that’s Otter in a nutshell.

Santiago Leon
Yeah. The percentage I’ve heard is that you guys are about 90% accurate. Is that right?

Simon Lau
Yes, absolutely. It depends on the type of content and on the audio quality, but for the most part, we’ve gotten a ton of users telling us that it’s 90 to 95%. It all depends on a variety of factors, including audio quality, speech clarity and whatnot. But yeah, we’re right up there: best in class, very competitive against all of the leading vendors out there. And we’re especially focusing on long-form, multi-person conversations. So this is not like speech technologies that are optimized for short voice commands, like the speech assistants that most people are familiar with, like Siri or Google Assistant, right? It’s not about having a voice assistant to answer short questions. It is about what we call ambient voice intelligence. You know, it’s a voice assistant that actually listens to human-to-human conversation and turns it into a very, very accurate transcript that you can use, that you can repurpose, and that you can search for information later.

Santiago Leon
Can you tell us about your involvement? How did you get started with Otter?

Simon Lau
So, wow, okay. About three and a half years ago, I joined Otter almost at the beginning. The company was founded in 2016, and I joined in 2017. I joined as the head of product, so I manage the product strategy and product requirements, talking to customers to learn their requirements and turning that into the prioritization and roadmap of the product. Ultimately, we launched the Otter product in 2018 in the App Store and Google Play, gained a lot of consumers and got a lot of feedback. Now we’ve been focusing on enterprises and education institutions to make sure that we can enable companies and schools to become very effective in their online meetings and in their virtual classes. So, in this work-from-home scenario and in the remote learning scenarios, we are empowering everybody to provide accessibility and collaboration, so that everybody can focus on the conversation without worrying about taking notes.

Santiago Leon
Can you tell us about the whole technology behind Otter? I mean, of course, you can’t reveal your secret recipe, but what makes Otter’s technology different from Google and the others?

Simon Lau
The key part is obviously speech-to-text technology, or automatic speech recognition (ASR). At the very core is turning spoken English into written English. That technology exists in many places, but our key focus is long-form, multi-speaker conversation: making that very accurate for a variety of speakers with different accents and in a variety of environments, whether it is online through a Zoom call or other video calls, through phone calls, or in person at a conference or in a noisy coffee shop, right? A variety of acoustic environments. So that’s one key focus. The other portion of our technology is speaker identification. With many other leading vendors, all they provide is just speech recognition or transcription. Or they might provide speaker separation, what is called diarization, which basically means labeling the paragraphs as Speaker 1, Speaker 2, Speaker 3. But we take it a step further. Not only do we provide speaker diarization, we also do speaker identification, naming each paragraph: Simon said this, Santiago said that. And not only that, but it will remember it over time. So let’s say we are colleagues and we have regular meetings, or professors who have regularly scheduled classes: the voiceprints of the speakers are going to be remembered and applied to future transcriptions as well. So then Otter creates a legible transcript that you can go back to and search and say, hey, what did Santiago say about podcasts six months ago? What did Simon say about speech recognition two months ago? Everything becomes very easily searchable.
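
Note: the step from diarization ("Speaker 1, Speaker 2") to identification ("Simon said this") can be illustrated with a minimal sketch. This is not Otter's implementation, which is not described in the interview. It assumes each diarized segment already comes with a numeric voice embedding, and that known speakers have stored voiceprints; the vectors, the `threshold` value, and all function names here are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def identify(embedding, voiceprints, threshold=0.8):
    """Return the enrolled name whose voiceprint best matches, or None."""
    best_name, best_score = None, threshold
    for name, voiceprint in voiceprints.items():
        score = cosine(embedding, voiceprint)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

def label_transcript(segments, voiceprints):
    """Replace generic diarization labels with names where a match exists."""
    labeled = []
    for seg in segments:
        name = identify(seg["embedding"], voiceprints)
        # Fall back to the diarization label for unknown voices.
        labeled.append((name or seg["speaker"], seg["text"]))
    return labeled

# Toy data: made-up 3-dimensional "voiceprints" remembered from past meetings.
VOICEPRINTS = {"Simon": [0.9, 0.1, 0.0], "Santiago": [0.1, 0.9, 0.1]}
SEGMENTS = [
    {"speaker": "Speaker 1", "embedding": [0.88, 0.12, 0.02], "text": "Thank you."},
    {"speaker": "Speaker 2", "embedding": [0.12, 0.85, 0.15], "text": "Welcome."},
    {"speaker": "Speaker 3", "embedding": [0.0, 0.1, 0.95], "text": "Hi all."},
]

for speaker, text in label_transcript(SEGMENTS, VOICEPRINTS):
    print(f"{speaker}: {text}")
```

In this toy data the first two segments match stored voiceprints and get names, while the third voice is unknown and keeps its generic label. Real systems use learned embeddings with hundreds of dimensions, but the matching idea is the same.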

Santiago Leon
Yeah. Audio and video quality is awesome, but text is so important. You probably know this, of course, but whenever I produce a podcast, I transcribe it with Otter, every podcast I do. And it is good for search engines like Google, so they can pick up, you know, whatever you said, and maybe have your website URL go up in the Google search results. Which industries have surprised you guys with how much they’ve picked it up? Is it education? Is it nonprofits? What’s the part where you noticed, like, wow, we didn’t think about this, but this is really working for them?

Simon Lau
Yeah, you touched upon one. So in your industry, whether it’s, you know, the media industry, podcasters, video producers, anybody who deals with media, audio or video, right, you have a need for help with the production process. You have a lot of raw audio or raw footage that you want to edit, so being able to transcribe it and make all that content searchable and indexable is super important. So video and audio folks have found Otter to be a very, very powerful tool to satisfy their transcription needs. And specific to video, there’s also captioning, to help provide accessibility. Otter can also export to SRT format so that it can be uploaded and combined with the video to provide closed captioning for video content. You also mentioned the SEO standpoint, right? So for podcasters who want to make their podcast show notes more SEO friendly to drive more traffic, yes, that’s another use case for being able to transcribe your podcast and turn it into something that you can embed. You can embed the Otter transcript: either you export it into text and paste it into your show notes, or you just embed the audio player, the Otter audio transcript. That is also SEO friendly, because Google actually crawls inside the iframe as well. So you can provide a great listener experience, where people can click on your podcast, listen, and see the transcript highlighting the words as they’re listening to it, but also still drive SEO and drive traffic to your site. So yeah, media is definitely one industry. The other segment that surprised us initially was the accessibility community. So deaf and hard of hearing people, who have accessibility needs, want to be able to participate in conversations.
And with the pandemic situation right now, deaf and hard of hearing people have an even tougher time, because in virtual meetings all of our faces are in these tiny squares, so it’s very hard to read lips. If you are out and about and go shopping, and people are wearing masks, well, guess what, you also cannot read lips. So the fact that Otter exists on all the different platforms, whether it is as an add-on to Zoom, on the web, or as a mobile app on iOS and Android devices, means anybody, whether they are deaf or hard of hearing, or even hearing people, can install the Otter app and be able to facilitate that communication, so that even if people are wearing masks, you can still communicate with your friends and family members who are deaf and hard of hearing. So those are just a couple that come to mind.
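
Note: for readers unfamiliar with the SRT caption format mentioned above, here is a minimal sketch of what an SRT export contains. It only illustrates the general SubRip layout (a cue number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, the caption text, then a blank line). It is not Otter's export code, and the function names and sample cues are made up for illustration.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def to_srt(cues):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)

# Hypothetical cues taken from a transcript with timing information.
cues = [
    (0.0, 2.5, "Welcome to the podcast, Simon."),
    (2.5, 4.0, "Thank you, Santiago."),
]
print(to_srt(cues))
```

A file in this shape can be loaded alongside a video in most editors and players that support captions, which is the workflow described in the answer above.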

Santiago Leon
Yeah, and I think there’s one group that you guys will capture, or have captured already: people who speak English as a second language.

Simon Lau
That falls into the education sector. In the education sector, first, there’s accessibility: anybody who is deaf or hard of hearing, or who has dyslexia or any type of learning difference. ESL, English as a second language, is another category. But there are also journalism students or PhD students who are conducting research interviews. And now there’s the more general student body: anybody who wants to take class notes, or any professors or TAs who want to distribute course notes. Otter becomes an essential tool, especially now with online learning.

Santiago Leon
Yeah, this tool, I have to admit, I mean, I have to be very honest, is probably revolutionary, in my opinion. And what you were saying about Speaker ID, guys: it identifies Speaker 1, Speaker 2, and then you’re able to put a name to it, and then it remembers the next time that speaker comes up while you’re recording. It remembers, yes, it’s Michael, or it’s Sam, things like that. So it’s so accurate, pinpoint on. You’ve also been doing some live events, and this is stuff that I really think will be used as well. How has it been used with events? And what has the feedback been like?

Simon Lau
Yeah, so it started out about two years ago. We were partnering with TechCrunch Disrupt and Web Summit. Those are the two big ones that most people recognize, these big tech events. We’ve been doing live events, and back then, before the pandemic, there were these in-person events where they have two tracks, 15 tracks, 16 tracks, all these live sessions, so it’s impossible for the attendees to attend every single session. So both in terms of providing accessibility for these live event organizers, and also enabling the event attendees to catch up on sessions that they can’t possibly attend while they are there, it becomes very, very valuable, because they can go back. They can even access it on their mobile app. We even integrated it into the event application as well. So the attendees have choices: they can download the Otter app and subscribe to the TechCrunch or Web Summit public group, and then be able to tune into the live transcription as well as go back and say, hey, at the end of the day, which sessions did I miss? And the feedback has been great, both from the conference event organizers and from the attendees. There was another conference, a smaller conference; I would say it’s a Silicon Valley Asian technology startup conference. So most of the attendees speak English as a second language, and they came to our booth and told us, wow, Otter team, amazing job. You guys are helping me understand the sessions way better. So this is providing tremendous value right off the bat. You don’t even need to sell me. I am using your app, even at the conference. You don’t need to convince me. I just want to use Otter, and I want to bring Otter technology to my company when I go back. I want the company to be able to use this technology. So yeah, that’s really powerful.

Santiago Leon
And you can use it with Zoom events. It’s offered with Otter for Teams, right?

Simon Lau
Yes. So now, with virtual events, Otter Live Notes is offered as part of the Otter for Teams plan. That’s the level that’s targeted at businesses, nonprofits and educational institutions. It’s directly integrated with Zoom, so it supports both Zoom meetings and Zoom webinars. And we actually also have the capability to receive live audio streams through RTMP. So if you are a virtual event organizer that can live stream your audio through an RTMP stream, just contact us, and Otter will be able to provide live transcription for you.

Santiago Leon
So you’ll have to contact Otter for that service?

Simon Lau
For that third one that I mentioned, yes. For Zoom meetings and Zoom webinars, it’s bundled as part of the Otter for Teams plan. As for RTMP, I just want to give your listeners some additional options to consider. We can also support that.

Santiago Leon
Tell us about your partnership with Zoom. I know that you guys signed, I think, over a year ago. How has it been so far?

Simon Lau
So it was a few years ago. Otter is the official transcription provider powering the transcription capability within Zoom. That was the initial partnership. Furthermore, Otter also built additional integrations that enable our users and Zoom Pro or higher subscribers to get live transcription as well. As we mentioned, for the Otter for Teams, Otter Premium and Otter Basic plans, we have different levels of integration. So the partnership has been great. You know, Zoom customers who subscribe to Zoom Business or higher can get that built in as part of Zoom. But those on the lower plans, like Zoom Pro, who cannot afford the Zoom Business plan, can come to Otter and have a very cost-effective way to do both post-meeting transcription and live transcription.

Santiago Leon
Have there been talks of maybe adding the captions in the video itself?

Simon Lau
Coming soon. Stay tuned.

Santiago Leon
Oh, that’s a good question. Okay, good. Have other big companies approached you guys to integrate Otter with their technology? Has that been discussed, or is there any news you can talk about?

Simon Lau
I have nothing to talk about, but absolutely, we get contacted all the time. It’s a testament to Otter’s transcription accuracy and utility. So yes, we have been approached by many, many, many companies, large and small, wanting to integrate our technology, and our team is staying focused. We have formed a few strategic partnerships, and as we grow, we’ll open up to more business partnerships.

Santiago Leon
Yeah, I think strategy is key, because you don’t want to, you know, open up too fast. I mentioned API. I’m a huge developer and also a huge API guy. Have you guys thought about having an API, like on the roadmap in the future?

Simon Lau
Yeah, it’s definitely something that’s on the roadmap. It’s not, you know, on the near-term roadmap for now. But absolutely, that would be the direction that we’ll go eventually.

Santiago Leon
Yeah, that’s something that I could definitely see, because there are not too many of those. Any future plans for Otter that you know of and can reveal? Or a little tidbit?

Simon Lau
Yeah. So we sort of mentioned that we’re continuing to expand our support for Zoom to include things like Zoom webinars. You asked about captions, so that’s something that we’re working on as well. Live captions, that’s also coming. Advanced search capability, that’s one that’s very useful, especially for some of our long-term users who have collected thousands of conversations within Otter and now want the ability to go back and find a needle in the haystack, to find that insight: who said what, when, in what context. So advanced search is also coming to one of our paid plans. Those are just a few key things. And really, we’re just continuously improving, and there’s still going to be improvement in terms of speech-to-text accuracy. There’s no stopping on that. We want to continue to extend our leadership in providing top-notch accuracy, so there’s still a ton more work on that. Also speaker identification: that’s another area that’s still very low quality in the market, and we are the leader in providing the most accurate speaker identification. So we want to continue to improve. You know, we’re competing against ourselves. We want to make sure that we continue leading the industry in terms of both speech recognition accuracy and speaker identification accuracy.

Santiago Leon
That’s very smart, because I always hear Gary Vee say you want to compete with your shadow. You know, you want to always be ahead of the game, and you’re right. I don’t see many. I mean, I think you’re the only one with Speaker ID. That’s something that is just way different. Tell us about your partnership with Dropbox.

Simon Lau
The partnership with Dropbox is great. It’s focused on media customers. We heard from Dropbox that a lot of their customers in the media segment are using Dropbox for collaboration on content. So they just want a very easy way: if they’re already collaborating in Dropbox with all the raw audio and video footage, they want a very turnkey solution, so that anything that is shared in a Dropbox folder automatically gets pulled into Otter for transcription, and the resulting transcript is automatically saved, whether as a text file, a PDF file or in SRT format, in whichever format they want, directly inside the Dropbox folder as well. So that integration has worked very well. It’s gotten a lot of traction with Dropbox customers, especially in the media industry.

Santiago Leon
Has there been a solution where a person could get those captions from Otter onto a video screen? I mean, would you say that it’s possible to do that, or

Simon Lau
You mean to actually export a video containing the captions?

Santiago Leon
Yeah. Well, for example, if you’re showing a video on a screen or TV, are you able to use Otter captions on the video itself? You know, like,

Simon Lau
Yeah, so our first focus is virtual meetings, especially Zoom. So the live caption capability that we are building toward is going to be for Zoom as a first step, and then we’ll explore more options as well. And as a workaround right now, it depends on whether you’re coming from a content producer’s point of view or from the audience’s point of view, right? From a content consumer’s point of view, whether it’s for accessibility or just for better comprehension, you can already use the Otter web app with any content. You can just sit at home watching TV; maybe you’re watching a video podcast, maybe you’re watching a TED talk, or maybe you’re listening to a podcast, right? Recently, I’ve shared some tutorial videos on social media to teach our users how to do that, if you want to learn from podcasts, take notes and collect snippets of insights from them. Because as much as there’s a lot of useful information in podcasts, it’s also very time consuming to follow and listen to the entirety of a lot of podcasts. So what most podcast listeners are asking for is the ability to sort of take notes and collect these interesting sound bites. So they have started using Otter for that.

Santiago Leon
Yeah, for sure. I mean, I even use it sometimes when I listen to a podcast. Like, I don’t have time, so I go into the show notes. I would say not every podcast has show notes with everything transcribed, maybe like 20%. And I just, you know, hit find, boom, boom, boom. Oh, there it is, this is what he said, this is the context of it. And I think a whole lot of podcasters need to use Otter to, you know, resolve that issue. It’s good for SEO and also for people who just want to find information. During the pandemic, I think you were mentioning a whole lot via Twitter that it’s going to work well with education. Can you tell us a little bit about that?

Simon Lau
Yeah, so for education, we spoke about the student accommodation use case. For example, UCLA is using Otter to provide accommodations. The alternative is to have peer note takers paired up with students, and that’s not a scalable solution, so they use Otter to do that transcription. Chico State University has also used Otter more broadly, to provide online note taking for class notes for the student body. So that’s another area. We’ve also found other pockets of the student body, for example journalism students, or higher education research for PhD dissertations. Those are where they have a lot of requirements to conduct a lot of interviews and to analyze and listen back to the interviews to pull out the key findings, to write their thesis or their dissertation, or, for journalists, to write their articles, right? So it’s a very, very painful process that nobody wants to do. Nobody wants to manually transcribe themselves. So Otter really, really helps save a ton of time, so that they can spend more time doing the actual research or writing their articles.

Santiago Leon
Yeah, definitely. For students it would be huge. Like, what did that professor say yesterday around two o’clock? You know, you always wonder, and it’s going to be in Otter. Great integration, great solution for that. Any plans for speech to text in other languages in the future?

Simon Lau
Yeah, most likely the key languages that we would consider next would be Spanish, Japanese and Mandarin Chinese, just to name a few. We’re looking at some time next year; that’s probably a likely timeframe. Right now we’re still very laser focused on English.

Santiago Leon
Yeah, get that one language set and then move on to the next. Wow, what an interview. I’m really glad that you came on. I mean, I’ve been using this every day. Otter is my go-to for transcription. Every podcast that I record, I upload to Otter. I also use it for video captions, and I export them to SRT, which saves me a lot of time. It works perfectly with Final Cut Pro.

Simon Lau
That’s fantastic. I’m so glad to hear that, and I look forward to any additional feedback and suggestions. We’d love to serve the media folks as well, making sure that we benefit your workflow and save you time.

Santiago Leon
Time is key in everything. And how can people find out more about Otter and yourself?

Simon Lau
Yeah, so just come to our website, otter.ai, O-T-T-E-R dot ai. And if you have any questions, whether it’s sales or product support, email us at support@otter.ai.

Santiago Leon
Simon, I’m really glad that you came on to the Leon Productions podcast.

Simon Lau
Thank you so much, Santiago. Take care.

Transcribed by https://otter.ai