Thanks to technology, making an audio recording of a meeting is a trivially easy task these days. While speech to text services can help tackle the tedium of transcription, this just invites a slew of new problems: The need to clean up and confirm transcripts. To verify who said what. Not to mention having to ensure everyone consented to being recorded — and gets a timely copy of the meeting minutes afterwards.
On the value-add side, a manager might also wish for an easy way to generate a summary of key points after a meeting. And a tool that can automatically list any tasks so they don’t have to go back and pick them out of their notes. Analytics around meeting productivity could also be useful — helping answer questions like: Are my meetings running to time? Which speakers tend to dominate? How many decisions are being made and actioned?
These are the sorts of assistance capabilities that US-incorporated startup Reason8 has in the works for its AI-powered transcription service, which is launching out of beta today from the TechCrunch Disrupt Berlin stage.
Down the line the team reckons the tech will also be able to offer users linguistic and voice analysis — with the potential to act like a coach and offer individual guidance on improving in-meeting performance.
“In-person meetings are the least digitized field in personal communications,” says co-founder Vlad Belyaev, discussing the idea behind the product. “It is a huge market. All of us make meetings, and make meeting notes, write meeting summaries, meeting minutes.
“Our first niche is to focus on managers who spend a large amount of time on making meeting minutes and on tracking the tasks that they gave to their employees.”
The target user for Reason8 is a middle manager who has three to four meetings per day and thus sinks a lot of time into distilling hot air into summaries and trackable tasks. “We are trying to make their life easier and more productive,” is Belyaev’s concise summary of the product goal.
At least two smartphones are required to record any one meeting via the app. This both provides enough audio input data for the speaker-separating AI to work with, and also means multiple meeting participants can participate in grabbing the record of the meeting if they wish. So it aims to help with the consent and ownership issue, too.
Belyaev says the idea for the app came about after he’d worked as a secretary and executive assistant — and spent a lot of time compiling and sharing meeting minutes. The challenge of keeping meeting participants who were spread across multiple timezones in the loop was another pain-point that fed the app idea.
The team incorporated in Delaware in May 2016 and in November last year raised $1.2 million in early-stage funding from Russian business angels to help fund development of their MVP.
The app is iOS only for now, it being easier to calibrate microphones across the more homogenous iPhone hardware than Android’s diverse range of devices. But an Android version is planned for next year, says Belyaev.
Reason8 is using Google’s cloud speech API for the speech to text conversion of meeting audio captured via its app — so the first thing to note is it’s not trying to replicate competitive and robust speech recognition technology that’s already available in the market.
Rather, its focus is on making that existing technology more useful in the context of meetings and managers. Its special sauce is a deep learning model trained to be able to identify different voices and thus to separate out speakers within a transcript — meaning the user doesn’t just get handed one big block of text.
This works without users needing to train it, according to Belyaev, who says the AI is able to separate speakers from the very first meeting. (And once individual users have identified themselves in the app it can link their name with their voice footprint to include their real name in meeting transcripts, too.)
“We use deep learning for better understanding your digital footprint of your own voice,” he notes. “We make it in a fully unsupervised way. So we do not need data for learning to distinguish speakers between each other.”
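To make the “fully unsupervised” idea concrete, here is a minimal sketch of how speech segments could be grouped by speaker without any labelled training data. It is not Reason8’s model: the three-number “voice embeddings”, the `assign_speakers` function and the 0.8 similarity threshold are all assumptions invented for illustration, and a real system would derive its embeddings from a neural network run over the audio.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_speakers(embeddings, threshold=0.8):
    """Greedy clustering: each segment joins the most similar known
    speaker, or starts a new speaker if none is similar enough."""
    centroids = []  # running mean embedding per speaker (hypothetical)
    counts = []     # number of segments assigned to each speaker
    labels = []
    for emb in embeddings:
        best, best_sim = None, threshold
        for i, centroid in enumerate(centroids):
            sim = cosine(emb, centroid)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            # No existing speaker is close enough: open a new cluster.
            centroids.append(list(emb))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            # Fold the segment into the matched speaker's running mean.
            n = counts[best]
            centroids[best] = [(c * n + e) / (n + 1)
                               for c, e in zip(centroids[best], emb)]
            counts[best] = n + 1
            labels.append(best)
    return labels


# Toy embeddings standing in for five consecutive speech segments.
segments = [
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.1],
    [0.85, 0.15, 0.05],
    [0.0, 0.2, 0.9],
    [0.12, 0.88, 0.05],
]
labels = assign_speakers(segments)
```

Segments that land in the same cluster get the same anonymous speaker number, which is all a transcript needs in order to be split by voice. Attaching a real name comes later, once a user identifies themselves in the app.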
The second bit of Reason8’s proprietary tech is a natural language processing engine that it’s using to automatically identify specific tasks agreed on in the meeting.
This engine, which the team trained using open data sets, was developed out of an earlier product they deployed in the Russian market — aimed at helping companies improve the performance of their customer support services.
“We developed our own natural language understanding engine. One of the main features of this engine was and is that we are able to work with a very little amount of data,” he says, adding: “Very few amount of businesses have data to train the model. That was why we created the solution which is able to identify the sense of phrases.
“We are able to classify this meaning and identify — for instance — tasks from decisions and from ideas.”
How does the system avoid getting confused by the codified word salad that can get thrown around in meetings? By focusing on analyzing phrases rather than individual words, he says.
“We don’t analyze exact words — we analyze the whole phrases,” he tells TechCrunch. “That is why we analyze the meaning of the phrases. We’ve been developing our natural language engine for the last year and this gave us opportunity to better understand the meaning of exact phrases which people mention.”
The initial focus on extracting tasks is also a way for the team to shrink and tackle the linguistic context challenge that’s at the core of what they’re trying to do. “Tasks mainly have a very clear intent so you can mention that someone has to do something with a current deadline and this is easier to identify,” he says. “We decided to focus on a very narrow problem — so identifying tasks and then later decisions and ideas.”
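Belyaev’s point that tasks carry a clear intent, with someone having to do something by a deadline, can be illustrated with a deliberately crude stand-in. The sketch below is not Reason8’s trained engine: the cue patterns and the `classify_phrase` function are hypothetical, and hand-written rules like these are exactly what a learned model is meant to improve on. It only shows why an obligation plus a deadline within one phrase is a comparatively easy signal to spot.

```python
import re

# Hypothetical cue patterns, chosen only for this illustration.
OBLIGATION = re.compile(
    r"\b(needs? to|has to|have to|must|should|will)\b", re.I)
DEADLINE = re.compile(
    r"\b(by|before|until)\s+"
    r"(monday|tuesday|wednesday|thursday|friday|tomorrow|next week|"
    r"end of (the )?(day|week|month))\b",
    re.I,
)
DECISION = re.compile(r"\b(we decided|we agreed|we have decided)\b", re.I)
IDEA = re.compile(r"\b(what if|how about|maybe we could)\b", re.I)


def classify_phrase(phrase):
    """Label a whole phrase as 'task', 'decision', 'idea' or 'other'."""
    # A task needs both an obligation and a deadline in the same phrase.
    if OBLIGATION.search(phrase) and DEADLINE.search(phrase):
        return "task"
    if DECISION.search(phrase):
        return "decision"
    if IDEA.search(phrase):
        return "idea"
    return "other"


examples = [
    "Anna needs to send the budget by Friday",
    "We agreed to drop the old pricing page",
    "What if we tried a weekly demo?",
    "Good morning everyone",
]
results = [classify_phrase(p) for p in examples]
```

Requiring both cues in the same phrase keeps a stray “should” or “by Friday” from being flagged on its own, which mirrors the narrow, high-precision starting point the team describes.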
The ambition is for the NLP engine to get smart enough to be able to automatically create meeting summaries in future too. For that another current feature of the app is key: A highlight button that users can manually tap on within the app to flag important sections of a meeting while it’s going on. This tells the tech to take note. Any highlighted portions will then be incorporated into the meeting report it delivers.
But these manual highlights are also a learning signal for the tech. Belyaev says it’s feeding them into the model to help it get better at understanding vocal intonation during important moments in meetings.
“With this information we are able to capture data-sets that defines not only by the meaning but also by intonation, by context that you say something important,” he explains. “When I say something important I’m trying to be precise, I’m trying to make pauses and to emphasize the intonation on it.
“So with all this context information — so not only from the text which we already have huge experience on, but also from the sounds — so we are able to identify better and to give better quality for our summarization engine.”
“Looking further, our vision is to provide for people a tool with which they could make meeting minutes and meeting summaries easily and a digitally enabled path to gather these self-enforcing data sets,” he adds. “So if people are using our product they improve their productivity with drafts of meeting minutes, then they make their own meeting minutes and send it to their colleagues.”
For now, the engine can identify tasks. Soon, “in the next few weeks”, it’ll also be able to classify “ideas and decisions”. But it isn’t yet capable of serving up fully fledged meeting summaries. Instead, users get a brief report with any manual highlights they trigger.
Belyaev says the plan is to add integrations with other communications tools — such as Google Hangouts, Zoom and Slack — in order to “capture more information to improve our summarization engine”. And they’re penciling in Spring 2018 for the full AI-powered meeting summarization feature to launch, he adds.
They also plan to integrate with existing project management systems to further expand the utility of the product. So, for example, a task automatically extracted from a meeting transcript could also be automatically inputted into a PMS.
“A transcript is great but for business in person meetings, meeting minutes and meeting summarizes add more value than just a transcript,” he argues. “Because, yes a transcript gives more value than just a record but what is even better is a summary and automatic task extraction and further integrations with project management systems like SNA or Jira… So it automatically creates tasks in project management systems. So it would be really great for improving productivity for managers and I think for many other potential users.”
Though he also emphasizes that the system is not going to be 100 percent perfect at parsing context — so also isn’t going to be able to entirely replace assistants. Rather it’s intended as a productivity-boosting aid.
The speaker separated transcription feature works for any language already supported by Google’s speech to text tech, according to Belyaev.
But the pre-trained models that power task and (soon) decision identification have been trained on English language speech so aren’t currently able to support additional languages. Though he says they are planning to train models on data-sets in other languages in future to expand support.
The team’s first commercial push for the product is focused on the US market. Pricing is freemium for now — with a limited usage basic version of the app and a pro subscription for those who want unlimited use and all the features.
The initial target is B2C early adopters who may be most comfortable prioritizing productivity gains over and above the privacy concerns associated with a technology that currently works by streaming audio data to the cloud for processing.
On the privacy front, Reason8’s website states that it’s encrypting the data in transit and claims not to be storing any meeting data. But even so there are major privacy considerations and risks given it’s uploading recordings of private and potentially sensitive business meetings to the cloud. Risks that make its current offering unsuitable for many businesses.
Belyaev says the team does intend to address the wider enterprise market in future — such as by offering a bespoke private cloud version of the system — but is starting with B2C and consumers first to push for traction to verify market demand. They’ll also be able to use the data of early adopters to continue honing their models to improve the product.
“In our plans, in the next half a year, we are aiming at enterprises,” he says. “If enterprises will want our solution on a paid basis — so they might like to make an in house private cloud solution with analytics, recommendations and all features of our product, so we are ready to provide it.
“We know for management consultant firms they would prefer to keep all the information on their servers or some private cloud like Amazon but they like to use their own solution for privacy concerns. So yes we will be able to provide it, we’re interested in providing it but we’d like to start with the B2C segment and with end customers by providing them an app for meeting summarization and transcription.”