This year’s World Arabic Language Day comes amid growing global interest in the Arabic language, and at a time when artificial intelligence is helping expand its presence on the Internet.
The United Nations marks the day on December 18, the date in 1973 on which Arabic was adopted as one of the organization’s official languages. The theme of this year’s celebration is “Arabic: The Language of Poetry and the Arts.”
Arabic’s long tradition of philosophy and its rich poetic arts have long been celebrated. Ahmed Shawqi (1870-1932), known as the Prince of Poets, praised the language in verse: “He who filled the languages with beauty… placed beauty and its secret in the letter dād,” a letter regarded as distinctive to Arabic.
With more than 350 million native speakers, Arabic is the fourth most widely spoken language in the world, according to Babel magazine.
In a message on the occasion of World Arabic Language Day last year, Audrey Azoulay, Director-General of UNESCO, pointed out that “the Arabic language constitutes a link between three continents, at the intersection of Europe, Asia and Africa.”
“Its geographical centrality meant that it was not only the language of merchants, but also of scholars, artists and philosophers,” Azoulay said, adding that the language gained strength and diversity from this location, “leading to reflection on its exceptional historical significance.”
Arabic is also the language of the Qur’an, and has prevailed for centuries as the language of politics, science and literature, directly or indirectly influencing many other languages. It helped transmit scientific and philosophical knowledge to Europe during the Middle Ages. It also enabled dialogue between cultures along the land and sea routes of the Silk Road from the coast of India to the Horn of Africa.
Artificial intelligence supports the Arabic language
In a famous 1903 poem entitled “The Arabic Language Mourns Its Fate with Its People,” Hafez Ibrahim (1872-1932), the “Poet of the Nile,” personified classical Arabic and imagined how it felt about contemporary efforts to replace it with colloquial forms.
In the poem, the language says: “I encompassed the Book of God in wording and purpose, and was not too narrow for its verses and sermons. How, then, could I be too narrow today to describe a machine or to coin names for inventions?”
Today, it is machines and inventions that define the Arabic language, as advances in artificial intelligence and other technologies help computers master the language and expand its presence online and in the world.
Among the Arab researchers leading these efforts is Ahmed Ali, a principal engineer in the Arabic Language Technologies Group at the Qatar Computing Research Institute at Hamad Bin Khalifa University.
Ali discussed his group’s work in an interview with Al-Fanar Media. A transcript of that conversation follows, edited for length and clarity.
Al-Fanar Media: Do you think that the recent events in Gaza have increased interest in the Arabic language?
Ahmed Ali: Of course. The recent events in Gaza have affected international public opinion and increased interest in the Arabic language. For example, the world wants to understand a large volume of data and audio recordings in classical and Levantine Arabic, especially the Palestinian dialect, and understanding the content of these files requires modern technology. These events have therefore greatly increased interest in learning about Arabic and its various dialects. We believe that investing in this technology is an important way to let the world see part of the truth of what is happening in our Arab region.
The Arabic language is the least used language in scientific research outputs at the global level. How can we enhance its spread?
Ahmed Ali: Worldwide, most research output is in English. The Arabic language has its own advantages; it shows a unique diversity between its spoken dialects and its written style.
Over the past two decades, there have been two waves of growth in research on Arabic, spoken and written. The first came after the attacks of September 11, 2001, when interest in understanding the language increased. The second accompanied Web 2.0 and the boom in Arabic content on social media in 2011. The Qatar Computing Research Institute, for example, has published hundreds of research papers on understanding and analyzing the Arabic language at top scientific conferences.
Does the development of artificial intelligence endanger the Arabic language?
Ahmed Ali: AI is a machine that can learn from the data it sees, just like a small child. For example, we developed a speech system that converts written text into spoken Modern Standard Arabic for newscasts and educational curricula. We have also developed speech synthesizers for different dialects, for social settings where colloquial speech is dominant.
Based on your experience, how can speech be processed by computers?
Ahmed Ali: Given the richness of the Arabic language, dealing with it means facing many challenges, and processing it requires a huge amount of data. One challenge is the richness and complexity of Arabic morphology: a single written word can express the whole phrase “and they will treat it” (and+they+will+treat+it).
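To make the segmentation idea concrete, here is a minimal Python sketch that peels clitics off a word. The example word وسيعالجونها (“and they will treat it”) is an assumed illustration, not a quotation from the interview; the clitic lists are tiny hypothetical samples, and the function name `segment` is invented here. It is not a description of how any QCRI tool works.

```python
# Toy clitic segmenter: an illustration only, under the assumptions above.
PROCLITICS = ["و", "ف", "س"]    # "and", "so", future marker (hypothetical sample)
ENCLITICS = ["ها", "هم", "ه"]   # attached object pronouns (hypothetical sample)


def segment(word: str, min_stem: int = 3) -> list[str]:
    """Greedily peel known clitics off both ends of a word."""
    prefixes: list[str] = []
    suffixes: list[str] = []
    stem = word

    # Strip leading clitics one at a time, always leaving min_stem letters.
    stripped = True
    while stripped:
        stripped = False
        for clitic in PROCLITICS:
            if stem.startswith(clitic) and len(stem) - len(clitic) >= min_stem:
                prefixes.append(clitic)
                stem = stem[len(clitic):]
                stripped = True
                break

    # Strip at most one trailing pronoun clitic.
    for clitic in ENCLITICS:
        if stem.endswith(clitic) and len(stem) - len(clitic) >= min_stem:
            suffixes.append(clitic)
            stem = stem[:-len(clitic)]
            break

    return prefixes + [stem] + suffixes


if __name__ == "__main__":
    print("+".join(segment("وسيعالجونها")))
```

In this sketch the subject marking (“they”) stays inside the verb stem, so the word splits into four pieces rather than five. Greedy stripping also mis-segments words whose root happens to begin with one of these letters, which is one reason practical systems rely on large amounts of data rather than fixed lists.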
Mastering diacritics and their effect on meaning is another challenge: one undiacritized spelling can stand for several different words, such as the variants related to “knowledge,” and a single written form could mean car, care, treatment, or essence.
Proper nouns are also hard to identify (in English they begin with capital letters, while Arabic script has no capital letters). On top of that, Arabic writing contains many spelling and grammatical errors and often ignores punctuation, and few electronic linguistic resources are available to researchers.
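The diacritics problem can be seen in a few lines of Python. The sketch below is an assumed illustration (the three example words and the function name `strip_diacritics` are chosen here, not taken from the interview): it removes the combining marks that Unicode uses for Arabic short vowels and shows that differently vocalized words collapse into one spelling, which is how Arabic is usually written.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks (Unicode category Mn), which include Arabic vowel signs."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# Escapes are used so the exact code points are unambiguous.
VOCALIZED = {
    "\u0639\u0650\u0644\u0652\u0645": "knowledge",              # 'ilm
    "\u0639\u064E\u0644\u064E\u0645": "flag",                   # 'alam
    "\u0639\u064E\u0644\u0651\u064E\u0645\u064E": "he taught",  # 'allama
}

if __name__ == "__main__":
    for word, gloss in VOCALIZED.items():
        print(f"{word} ({gloss}) is written as {strip_diacritics(word)}")
```

A system reading the bare spelling has to recover the intended word from context, which is the ambiguity described above.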
Are there technologies or tools developed to support the use of the Arabic language?
Ahmed Ali: At the Qatar Computing Research Institute and its Arabic Language Technologies Group, we aim to support the presence of the Arabic language on the Internet by building technologies that help computers master Arabic and make it a first-class citizen in cyberspace.
The Institute works to contribute to building and supporting Arabic language technologies, and making them available to developers and programmers, as well as users, by publishing research related to these technologies, and building and developing programs to process the Arabic language.
There are major projects concerned with Arabic language technologies, such as “Farasa” for automatic processing of Arabic text, and “ASAD” for analyzing and understanding Arabic social media, which detects offensive language, hate speech, sentiment, dialect, and more. There are also “Kanari” and “Natiq” for converting speech into written text and vice versa, “Shaheen” for machine translation between Arabic (and its dialects) and English, and “Tanbih” for analyzing news, identifying popular and propaganda campaigns, and determining intellectual and political positions. Furthermore, NeuroX works to understand neural networks and the mechanisms behind their predictions.
How can we benefit from these technologies in Arab higher education?
Ahmed Ali: Hamad Bin Khalifa University is the first university in the Middle East to offer innovative, open online courses in collaboration with edX. These courses are a great opportunity to use modern technology in education. Speech recognition can be used to transcribe online lectures, and text analytics can be used to track students’ learning progress.
Do these techniques contribute to teaching the Arabic language to non-native speakers?
Ahmed Ali: The QVoice project aims to build speech technology for computer-aided learning of Arabic pronunciation, empowering learners of different age groups and native-language (L1) backgrounds through accurate detection of pronunciation errors and appropriate feedback. It is meant to enable learners, especially non-native speakers, to learn and practice Modern Standard Arabic, and to help native Arabic speakers reduce the influence of their accents. Feedback is targeted to the specific needs of each learner; such an experience can boost confidence and encourage learners to keep going.
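As a rough illustration of what “detecting pronunciation errors and giving feedback” can mean, the sketch below compares an expected phoneme sequence with the sequence a recognizer heard. Everything in it is an assumption made for illustration: the phoneme symbols, the function name `pronunciation_feedback`, and the idea that a recognizer has already produced a phoneme list. It is not a description of QVoice, which works on speech audio.

```python
from difflib import SequenceMatcher


def pronunciation_feedback(expected: list[str], heard: list[str]) -> list[str]:
    """Align two phoneme sequences and describe where they differ."""
    notes: list[str] = []
    matcher = SequenceMatcher(a=expected, b=heard, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            notes.append(f"said {heard[j1:j2]} instead of {expected[i1:i2]}")
        elif tag == "delete":
            notes.append(f"left out {expected[i1:i2]}")
        elif tag == "insert":
            notes.append(f"added {heard[j1:j2]}")
    return notes


if __name__ == "__main__":
    # Hypothetical case: an emphatic consonant is pronounced as a plain one.
    target = ["dˤ", "a", "r", "b"]
    attempt = ["d", "a", "r", "b"]
    for note in pronunciation_feedback(target, attempt):
        print(note)
```

A real system would also need to judge how close a sound was and to adapt to the learner’s first language, which a symbol-level comparison cannot capture.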
The project also aims to enhance Arabic speech research, by better modeling the Arabic phonetic space, to deal with different dialects and speaking styles. It also aims to improve L2 speech models, explore different acoustic modeling and augmentation techniques, enrich multilingualism, and introduce multimodality into Arabic speech research.
With the spread of misinformation in digital content and the need to improve verification efforts, how can we benefit from these technologies?
Ahmed Ali: QCRI’s “Tanbih” (Alert) project develops techniques for information analysis, especially for combating rumors and strengthening media literacy. It addresses the challenges posed by the spread of false information and rumors in Arabic-language content, and includes research and development of tools and technologies to detect, analyze and combat misinformation online.
The project emphasizes the importance of strengthening critical thinking and media literacy among users, so that they can distinguish between reliable and unreliable information. QCRI is actively involved in research on natural language processing, machine learning, and information retrieval, which are key to developing rumor detection tools; these tools have benefited media organizations and the United Nations.
The Arabic language holds a rich and significant place in the world, with its history, literature, and influence in various fields. As we celebrate the beauty and power of the Arabic language, we also look towards the future and the role that artificial intelligence plays in expanding its reach and accessibility. With the help of AI, the Arabic language is able to reach new audiences, break down communication barriers, and open up opportunities for its growth and development. This convergence of tradition and innovation is shaping a new era for the Arabic language, and we are excited to explore the possibilities it brings.