Mozilla’s new open source model aims to revolutionize voice recognition

Voice recognition has made steady progress in recent years, as all the big tech firms push to improve their digital assistants, from Cortana to Siri. Mozilla wants to push harder, and more broadly, on this front with the release of an open source speech recognition model.

Mozilla has just published the initial release of this automatic speech recognition engine, based on work carried out by its Machine Learning team. The engine is modelled on the ‘Deep Speech’ papers published by Baidu, which describe a trainable multi-layered deep neural network.

Mozilla says the project initially set a goal of a ‘word error rate’ below 10%. The firm now reports a word error rate of 6.5% on LibriSpeech’s test-clean set, comfortably beating that goal and coming close to the Holy Grail of human-level performance (around 5.8%, according to the Deep Speech 2 paper).
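For readers unfamiliar with the metric, word error rate is conventionally the number of word-level substitutions, deletions and insertions needed to turn the transcript into the reference text, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only; the function name `word_error_rate` is a hypothetical helper, not part of Mozilla’s engine, and it assumes plain whitespace tokenisation, whereas real evaluations typically normalise case and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative helper (hypothetical name); assumes whitespace tokenisation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

On this definition, a rate of 6.5% corresponds to roughly 6.5 word-level errors for every 100 words in the reference transcript.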

Mozilla has worked hard to train the speech recognition model using ‘supervised learning’ and a huge dataset of thousands of hours of labeled audio, drawn from all manner of sources including free (TED-LIUM and LibriSpeech) and paid (Fisher and Switchboard) speech corpora.

Further labeled speech data came from language study departments at universities and from public TV and radio stations, all of which helped to hone the speech recognition engine.

The project’s great strength is its open source nature, which means this honed technology is now available for anyone to use in their own speech recognition projects.

Streamlined speech

Mozilla further notes that the plan for the future is to release a model that’s light and fast enough to run on a smartphone or single-board computer like the Raspberry Pi.

The company has also released its Common Voice initiative, an open and publicly available voice dataset containing some 400,000 recordings from 20,000 different speakers, representing around 500 hours of speech.

As Mozilla puts it, the idea here is to “build a speech corpus that's free, open source, and big enough to create meaningful products with”, running in parallel with the new speech recognition model.

Microsoft is also making big strides on the voice recognition front, having achieved a word error rate of 5.1% on the Switchboard speech recognition benchmark, as announced in August 2017.

The post Mozilla’s new open source model aims to revolutionize voice recognition appeared first on Computerescue.info.

Darren Allan
