An Indian data scientist has harnessed the computing power of an NVIDIA GPU to build a model that translates Braille into text and audio, lending a helping hand to people with visual disabilities.
Bhavesh Bhatt, a YouTuber, is well versed in machine learning, data science, and the Python programming language. He regularly breaks down key concepts in his videos, making the knowledge accessible to Indians. "I think one of the major pillars in terms of me learning about data science and machine learning has always been the community first approach," Bhatt told Indiatimes.
Bhatt created the Braille prototype as part of his final-year B.Tech project back in 2013. "Think of Braille as embossing. You have a 3x2 grid, so three rows and two columns wherein by touching that particular button, a person who's not able to see would be able to understand that this is number 3... and so forth," he said.
Based on this 3x2 grid, Bhatt created a language of sorts to allow "calculations that are done by the microcontroller... a stepper motor would rotate which will again have the embossing and the person can touch the results and understand that this calculation has led to this particular output, so he's well aware of what the number is," he explained.
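To make the 3x2 grid concrete, here is a minimal Python sketch that maps one Braille cell to a digit. It uses the standard Braille dot numbering and digit patterns; the names `DIGIT_DOTS`, `cell_to_dots` and `decode_digit` are hypothetical and are not taken from Bhatt's microcontroller code, which has not been published.

```python
# Standard Braille dot numbering for a 3x2 cell:
#   1 4
#   2 5
#   3 6
# Digits reuse the dot patterns of the letters a-j
# (in running text they follow a number sign, which is ignored here).
DIGIT_DOTS = {
    frozenset({1}): "1",
    frozenset({1, 2}): "2",
    frozenset({1, 4}): "3",
    frozenset({1, 4, 5}): "4",
    frozenset({1, 5}): "5",
    frozenset({1, 2, 4}): "6",
    frozenset({1, 2, 4, 5}): "7",
    frozenset({1, 2, 5}): "8",
    frozenset({2, 4}): "9",
    frozenset({2, 4, 5}): "0",
}


def cell_to_dots(cell):
    """Convert a 3x2 grid of 0/1 values into the set of raised dot numbers."""
    if len(cell) != 3 or any(len(row) != 2 for row in cell):
        raise ValueError("a Braille cell must have 3 rows and 2 columns")
    dots = set()
    for r, row in enumerate(cell):
        for c, raised in enumerate(row):
            if raised:
                # Left column holds dots 1-3, right column holds dots 4-6.
                dots.add(r + 1 + 3 * c)
    return frozenset(dots)


def decode_digit(cell):
    """Return the digit for a cell, or None if the pattern is not a digit."""
    return DIGIT_DOTS.get(cell_to_dots(cell))


# Both dots in the top row are raised: dots 1 and 4, the pattern for 3.
three = [[1, 1],
         [0, 0],
         [0, 0]]
print(decode_digit(three))
```

A lookup table keyed on the set of raised dots keeps the decoding step independent of how the cell was read, whether by a hardware button matrix or by an image model.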
In 2019, Bhatt saw merit in improving this model with machine learning "that can help you take the image of a particular scene number... if someone is not able to see can you take a smartphone camera and point it at that particular page which has Braille embossings... Correspondingly, can those Braille embossings be given out in form of an audio output by Google Assistant or any voice assistant?" Bhatt said, explaining the concept to us.
He claims that the model is 95% accurate, based on embossings from a data set he "created on his own." In the absence of available data, Bhatt used data augmentation to create sets from "existing data sets." After generating a good amount of data, Bhatt used it to train the model on an NVIDIA GPU on Google Cloud. "I thought can I create or can I replicate whatever I've done using hardware through machine learning and data science," he said.
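Data augmentation means generating extra training samples by applying small, label-preserving changes to existing ones. The sketch below illustrates the general idea with small shifts and pixel noise on a grayscale image stored as a list of rows. It is an assumption-laden illustration: the article does not say which transformations Bhatt used, and the function names `shift`, `add_noise` and `augment` are hypothetical.

```python
import random


def shift(image, dx, dy, fill=0):
    """Translate a 2D image by dx columns and dy rows, padding with `fill`."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = image[y][x]
    return out


def add_noise(image, rng, amount=20):
    """Add uniform noise in [-amount, amount] to each pixel, clamped to 0..255."""
    return [
        [min(255, max(0, p + rng.randint(-amount, amount))) for p in row]
        for row in image
    ]


def augment(image, label, copies=5, seed=0):
    """Return `copies` randomly shifted, noisy variants that keep the same label."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    samples = []
    for _ in range(copies):
        dx, dy = rng.randint(-1, 1), rng.randint(-1, 1)
        samples.append((add_noise(shift(image, dx, dy), rng), label))
    return samples


# A tiny stand-in for a photographed Braille dot (bright pixel on dark paper).
dot_image = [[0, 0, 0, 0],
             [0, 255, 0, 0],
             [0, 0, 0, 0],
             [0, 0, 0, 0]]
extra = augment(dot_image, label="1", copies=3)
print(len(extra))
```

In practice, a real pipeline would likely also vary rotation, lighting and scale, since smartphone photos of an embossed page differ in all of these.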
"I had to create a structure which could essentially understand where the digits are... then I also gave the audio output in form of a number or in form of text based on what the prediction was," he told us while adding how a Google Play application could be easily created using his method.
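The last step Bhatt describes, turning predictions into a number, text or audio, can be sketched as a simple formatting function. This example assumes the model has already produced a list of digit predictions; `predictions_to_text` and `predictions_to_speech` are hypothetical names, and no real voice-assistant API is called. The speech string is only what one might hand to a text-to-speech engine.

```python
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def predictions_to_text(digits):
    """Join predicted digits (ints 0-9) into a printable number string."""
    return "".join(str(d) for d in digits)


def predictions_to_speech(digits):
    """Spell out each digit so a text-to-speech engine reads them one by one."""
    return " ".join(DIGIT_WORDS[d] for d in digits)


predicted = [3, 0, 5]  # stand-in for the model's output on one line of Braille
print(predictions_to_text(predicted))
print(predictions_to_speech(predicted))
```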
According to Bhatt, "the model is ready" but lacks an interface that would let it work with tools like Google Assistant. As of now, only Bhatt has access to the Braille-to-audio tool, but he plans to "open source it" once the Android application is ready. To validate results, the original hardware prototype uses an LCD screen complemented by an audio output unit. To ensure the machine catered to those who needed it, Bhatt verified the results of his tool with people who cannot see.
Part of a four-member team that worked on the project, Bhatt wanted to create something with "social impact." If the model is indeed 95% accurate, we believe that goal has been met.
Bhatt also created a scraper of his own to track price drops on shopping websites for things he's interested in. When Bhatt is not working on machine learning, he's making videos. Creating content isn't an easy job, and Bhatt overcame a few roadblocks to reach the level of proficiency on display on his YouTube channel. "Initially I used to spend around 6-7 hours on one video which I've kind of brought down to around 1.5 to 2 hours now," Bhatt said.