By Yakin Ouederni
Communicating in Arabic Sign Language with the tap of a phone
Baher Moursy ’18, Heba Sakr ’18, Youssef Khairalla ’18 and Tamara Nagui ’17 have always been driven by a desire to give back to their communities whenever the opportunity arises. So when it came time for them to decide on a graduation project in 2017, it was difficult to settle on one of the many ideas they thought of, but they were sure of one thing: Whatever it was, it needed to go beyond being just a thesis project.
After months of brainstorming, they came up with Eshara, a mobile application that translates Arabic Sign Language (ASL) into written Arabic in real time. Eshara works on any device with a camera, using video recognition technology to detect the hand movements of someone speaking ASL and simultaneously provide a written translation.
“Imagine a world in which a student can raise his or her hand in a classroom and sign what he or she is trying to communicate, and everyone else in the classroom reading or hearing the translation instantly,” Khairalla said.
Thanks to rapid advances in artificial intelligence (AI), the world that Khairalla speaks of could soon become a reality. Eshara relies on computer vision and machine learning, a subset of AI in which systems learn to identify patterns and make decisions with little to no human intervention. Both fields are increasingly being used in areas such as health care monitoring, financial services and transportation. In Eshara’s case, the group films people speaking ASL and then trains the software to recognize specific movements as words so that it can provide a written translation.
“Basically, we design and train several AI computing models to associate each ASL word gesture with its corresponding text,” said Mohamed Moustafa, associate professor in the Department of Computer Science and Engineering and the team’s supervisor. “This work of action recognition is actually part of a bigger research topic known as human-computer interaction. Its applications are all futuristic.”
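For readers curious about what "associating a gesture with its text" can look like in code, here is a minimal sketch in Python. It is not the Eshara team's method, which the article does not describe in detail. It assumes each filmed gesture has already been reduced to a short list of numbers (hypothetical features, invented here for illustration), averages the examples for each word, and translates a new gesture by picking the word whose average is closest.

```python
from math import dist


def train_centroids(examples):
    """Average the feature vectors recorded for each word.

    `examples` is a list of (features, word) pairs. The features are
    hypothetical stand-ins for whatever a real system would extract
    from video frames.
    """
    sums, counts = {}, {}
    for features, word in examples:
        if word not in sums:
            sums[word] = [0.0] * len(features)
            counts[word] = 0
        sums[word] = [s + f for s, f in zip(sums[word], features)]
        counts[word] += 1
    return {w: [s / counts[w] for s in sums[w]] for w in sums}


def translate(features, centroids):
    """Return the word whose average gesture is nearest to `features`."""
    return min(centroids, key=lambda w: dist(features, centroids[w]))


# Toy, made-up training data: two recordings each of two signed words.
examples = [
    ([0.9, 0.1], "مرحبا"),  # "hello"
    ([1.0, 0.2], "مرحبا"),
    ([0.1, 0.8], "شكرا"),   # "thank you"
    ([0.2, 1.0], "شكرا"),
]

centroids = train_centroids(examples)
print(translate([0.8, 0.3], centroids))
```

A sketch like this also hints at why varied footage matters: the more the recordings for a word differ in hands, lighting and background, the better its average reflects the gesture itself rather than the conditions of one shoot.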
Noting the great impact of machine learning on people’s lives, Moustafa emphasized the importance of investing time into developing this field. “Self-driving vehicles are expected to be fully autonomous in a handful of years, bidirectional language translation is helping real-time conversations, and mature models are being developed to recommend your next online purchase,” he said.
And it’s exactly this type of impact that the team hopes to make with Eshara: creating programs that make people’s lives easier by using AI to help underserved populations and inspire others to do the same. “We have the knowledge; we have the technology,” Moursy said. “Why don’t we use it for the good of society?”
Two years after presenting their idea to the thesis panel, the team members are still developing the app. Amr AbdelGhani ’19 has since joined the team.
What the team members have now is the demo they created for the graduation project, which can translate individual words but not full sentences. By the end of the year, they hope to accomplish two goals for the app: translating full sentences and being quick enough to produce words in real time. Development takes place in two stages: data collection, then programming the machine.
When the team members started conducting research, they were surprised to see that there was no project like theirs targeting ASL speakers in the Middle East. This made the data collection process much more difficult and time-consuming. “It took us around a year to collect data for the demo,” Moursy said.
One year for 16 words.
Data collection involves choosing words to include in the Eshara dictionary and then filming those words being spoken in ASL. To ensure Eshara’s accuracy at all times of the day and in different locations, the team needed to film movements across a wide array of environments and with different people. With no ASL experience and no contacts with anyone who speaks it, the team members had to learn the movements and train their friends to shoot the videos.
“When filming the words, we needed a wide range of skin colors, hand sizes, backgrounds and lighting, and we had to film at different times of the day,” Sakr said.
Their project stalled for one year when they lost funding, but AUC secured funding earlier in 2019, and the team got right back to work in July. Since then, they have expanded the dictionary to between 800 and 1,000 words.
“Being part of the Eshara team resembles a great opportunity for me to help create something that can directly impact people’s lives and has the potential to revolutionize communication with those who cannot hear,” said AbdelGhani. “The current progress is unprecedented for Arabic Sign Language recognition, and I believe that we could potentially push this technology to achieve a breakthrough in scalable automatic ASL recognition. The passion and excitement of the team along with AUC’s support make this project a fun and fulfilling journey.”
Turning AUC Tahrir Square into their workspace, the team members meet there with ASL professionals and with members of Cairo communities who cannot hear or speak in order to film vocabulary. Khairalla, who is currently overseas, is helping by researching quicker and more efficient technologies for the app.
“Speed is our main concern,” Moursy said. “The app needs to be able to translate full sentences in real time.”
Sign language translation programs do exist for languages other than ASL, but what differentiates Eshara from the others is its accessibility. Most programs use sensor gloves or 3D cameras to detect hand movements, while Eshara can be downloaded on mobile phones and works across different software.
“If someone who speaks Arabic Sign Language is sitting right next to me, there would be no way of communicating with him or her,” Moursy said. “Eshara would allow me to just pull out my phone and carry out a conversation. We’re not realizing that there’s a whole demographic of people we aren’t talking to.”
What started out as a graduation project turned into an initiative to include an often-neglected demographic in society. Even AUC Tahrir Square, which started off as a meeting place for data collection, became a space for a community of people working toward a common cause.
“The ASL professionals were really excited about the project and even want to continue helping us for free,” Sakr said, emphasizing how this project will allow ASL speakers to enter the workforce more easily, making way for diverse skills and talents that were previously not tapped into. “We’re helping to create a better future for them because they can finally be able to take part in regular, everyday life like everyone else.”