
Indo-Iranian languages are a sub-branch of the larger Indo-European language family. The group includes Persian, the official language of Iran, as well as Hindi, Punjabi, Dari, and Bengali, and has over 1.5 billion speakers worldwide, mostly in Southwest and South Asia. It is sometimes called the Aryan language family; Indo-Aryan, strictly speaking, is one of its sub-branches, alongside Iranian. The earliest elements of this branch preserved in historical texts suggest it dates back to the early second millennium BCE. The origins of the Indo-Iranian language family are therefore closely intertwined with the history of the Iranian languages.
Satem Languages vs. Centum Languages
The Indo-Iranian language family is classified as a Satem group, a classification of the Indo-European languages based on the treatment of palatovelar consonants. It includes over 300 languages, grouped into three major Indo-Iranian branches. Before we dive into these branches, let’s see what Satem and Centum languages are.
What are Satem and Centum?
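Satem and Centum are the two traditional groupings of the Indo-European language family. They are differentiated primarily by their treatment of the Proto-Indo-European dorsal consonants. The labels come from the word for "hundred" in a representative language of each group: Latin centum and Avestan satəm.
In Centum languages, which include many Western Indo-European languages such as Latin, Greek, and the Germanic languages, the Proto-Indo-European palatovelars (*ḱ, *ǵ, *ǵʰ) merged with the plain velars, while the labiovelars (*kʷ, *gʷ, *gʷʰ) were kept distinct. In Satem languages, which include the Indo-Iranian and Balto-Slavic languages, the palatovelars instead became sibilants or affricates, and the labiovelars merged with the plain velars. These different outcomes give the two groups distinct phonetic characteristics.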
A Comprehensive Indo-Iranian Languages List

The Indo-Aryan group primarily includes the languages and dialects of the Indian subcontinent. There are over 200 Indo-Aryan languages in the Indo-Iranian language tree. The table below lists the predominant languages by country, and the sections that follow describe the major sub-branches.
Indo-Iranian Countries and Languages
Country | Predominant Languages |
India | Hindi, Bengali, Punjabi, Urdu, Marathi, and others (Tamil and Telugu are also widely spoken but are Dravidian, not Indo-Iranian) |
Pakistan | Urdu, Punjabi, Sindhi, Pashto, Balochi |
Bangladesh | Bengali (Bangla) |
Nepal | Nepali |
Sri Lanka | Sinhala (Tamil, also widely spoken, is Dravidian) |
Bhutan | Nepali among some communities (the national language, Dzongkha, is Sino-Tibetan) |
Iran | Persian (Farsi), Kurdish, Gilaki, Mazandarani |
Afghanistan | Dari (a variety of Persian), Pashto |
Tajikistan | Tajik (a variety of Persian) |
Uzbekistan | Tajik (in some regions) |
Old Indo-Aryan Language Tree
This ancient language tree descends from Proto-Indo-Iranian, the common ancestor that predates the following Indo-Iranian branch languages.
Sanskrit: One of the most ancient languages in Asia, Sanskrit is first attested in its Vedic form. The earliest example of Vedic Sanskrit is the Rigveda, a collection of hymns and one of the four Vedas, the sacred texts of the Hindu faith. Later forms of the language include Epic Sanskrit and Classical Sanskrit. It is no longer widely spoken but is still widely used in religious and literary contexts. Its influence can be traced in other Indo-Aryan languages and in scientific fields. The earliest traces of the language date from around 1700 to 1200 BCE.
Middle Indo-Aryan Language Tree
Prakrit: The Prakrits are the Middle Indo-Aryan vernaculars that developed alongside Sanskrit as the languages of the masses; Apabhraṃśa is the name given to their latest stage. Sanskrit means “perfected” or “refined”, while Prakrit means “natural” or “ordinary”, a contrast in status between the cultivated literary language and common speech. Historical evidence shows that Prakrit was in use by the 3rd century BCE, and scholars speculate that its development dates back to the 6th century BCE.
It includes three major literary dialects: Shauraseni Prakrit, Magadhi Prakrit, and Maharashtri Prakrit. The Brahmi script, closely associated with Prakrit, was a turning point in the evolution of Indo-Aryan writing systems. It is the ancestor of the Devanagari script, which is used to write Hindi.
Later Indo-Aryan Languages
Hindi: Hindi is an official language of India and can be considered the offspring of Sanskrit and Prakrit. The earliest form of Hindi, Old Hindi, grew out of Apabhramsha and developed from around the 7th century in North India. In the medieval period, it developed further through contact with Persian and Arabic, giving rise to the Khari Boli form. In the 19th and 20th centuries, it was standardized in the Devanagari script, a process that played a vital role in shaping India’s national identity.
While the Devanagari script descends from Brahmi, much of Hindi’s vocabulary, verb conjugation, noun declension, and grammatical structure is influenced by Sanskrit. Hindi has two main categories:
- Eastern Hindi languages: Awadhi, Bagheli, and Chhattisgarhi
- Western Hindi languages: Brajbhasha, Bundeli, Haryanvi, Kannauji, and Khariboli
Urdu: Urdu is an Indo-Aryan descendant of the Apabhraṃśa language, which existed from the 6th to the 13th century CE. It is a Persianized register of the Hindustani language, which Mahatma Gandhi promoted as an element for unifying the subcontinent beyond religion and ethnicity.
Bengali Language: Bengali (Bangla) is the official language of Bangladesh and of India’s West Bengal state. It descends from Magadhi Prakrit, with a strong Sanskrit influence. In the medieval era, it was also influenced by Persian, Arabic, and Turkic languages. The Bengali script is a derivative of Kutilalipi, an eastern variation of the Brahmic script.
Punjabi Language: Spoken in the Punjab region, which is divided between India and Pakistan, Punjabi also evolved from Prakrit and Apabhramsha, with influence from Persian, Arabic, and Turkic linguistic elements. The earliest traces of Punjabi date back to the 12th century, and it gained vast popularity under the Sikh Empire. In India it is written in the Gurmukhi script, which was standardized by the second Sikh Guru, Guru Angad, in the 16th century and is still used in the Indian Punjab region.
The Shahmukhi script used in modern Pakistan, by contrast, is a derivative of the Perso-Arabic script. Punjabi speakers have migrated worldwide, and you can find diaspora communities in the UK, Canada, the U.S., and Australia.
Gujarati: This language, spoken in the Indian state of Gujarat and by diaspora communities, descends from the medieval Prakrit and Apabhramsha languages and emerged around the 12th century. It shares characteristics with Hindi, Punjabi, and Bengali but has a distinct grammar, vocabulary, and script. It also contains many Persian and Arabic loanwords.
Marathi: The official language of the Indian state of Maharashtra, Marathi is another descendant of Sanskrit and emerged as an independent language around the 11th century. Today it is mainly written in the Balbodh variety of the Devanagari script; historically, it was also written in the cursive Modi script.
Nepali: Nepali (originally Khaskura) is the official language of Nepal. It is also spoken in the Indian states of Sikkim, West Bengal (particularly the Darjeeling district), and Assam, as well as in the small nation of Bhutan. Nepali is a derivative of the Khas language, the language of the Khas people, an ethnolinguistic group in the Himalayan region.
Dardic Languages: The Dardic languages are a subgroup of the Indo-Aryan languages, part of the larger Indo-European family, primarily spoken in northern Pakistan, India, and eastern Afghanistan. These languages, including Kashmiri, Shina, Khowar, and Pashai, are known for retaining archaic Indo-Aryan features and exhibiting influences from neighboring Iranian and Turkic languages. Efforts to document and preserve these languages are ongoing, highlighting their importance to the heritage of the Himalayan and Hindu Kush regions.
Kholosi Language: Kholosi is a lesser-known Indo-Aryan language spoken in a small number of villages in Hormozgan province in southern Iran. Although it is surrounded by Iranian languages, it belongs to the Indo-Aryan branch of the larger Indo-Iranian language family. Its phonetic and grammatical features distinguish it from the widely spoken Iranian languages around it, such as Persian (Farsi).
Other noteworthy Indo-Aryan languages in the Indo-Iranian language tree include Assamese, Oriya (Odia), Sindhi, Bhojpuri, and Maithili, which are mostly regional languages of the Indian subcontinent.
Iranian Languages
The Iranian language family is a major branch of the Indo-Iranian language family, primarily spoken in West Asia. It reflects a cultural identity shared by the people of Iran, Afghanistan, Tajikistan, and parts of Central Asia. All Iranian (or Iranic) languages descend from Proto-Iranian, which in turn descends from Proto-Indo-Iranian. Researchers speculate that this Satem ethnolinguistic group within the larger Indo-European family originated in the Andronovo culture of the Bronze Age, around 2000 BCE.
Old Iranian Languages
The attested Old Iranian languages date to the second and first millennia BCE: the two forms of Avestan and Old Persian.
Old Avestan (Gathic): The main source of information about this ancient Iranian language is the Avesta, a Zoroastrian religious text dating back to the mid-second millennium BCE. The Avestan language does not specifically fall into the eastern-western categorization of Iranian languages, as it predates the criteria for that classification. It shares morphological ties with Vedic Sanskrit. Some researchers suggest that the retroflex phonemes in the Pashto language originate from the Gathic Avestan language.
Younger Avestan: Old Avestan gradually developed during the 1st millennium BCE and was greatly simplified through phonetic, morphological, and lexical innovations. Younger Avestan has more in common with Old Persian.
Old Persian (Ariya): Old Persian, the attested language of the Achaemenid Empire, was known by its speakers as Ariya. In terms of inflection, it is similar to Avestan and to the Vedic Sanskrit of the Rigveda. The main example of Old Persian is the Bisotun inscription of Darius I, dating back to around 520 BCE.

Middle Iranian Languages
The Middle Iranian linguistic era started around the 4th century BCE and ended around 650 CE. During this period, the languages were divided into Eastern and Western Iranian branches, characterized by geographical distribution and by linguistic features such as retroflex consonants.
The major Middle Iranian languages in the Indo-Iranian language tree were:
Parthian: The language of the Parthian Empire (Arsacid Pahlavi) was a Western Iranian language that originally used the Greek script. However, there are also records written in a variation of the Pahlavi script known as Inscriptional Parthian. It was the official language of the Arsacid Parthian Empire between 248 BCE and 224 CE. The language heavily influenced the development of Armenian, and Parthian words are still used in modern Armenian.
Middle Persian (Pahlavi): This Iranian language developed in Persia proper, or Persis, the modern Fars province. It is a Western Iranian language whose official script during Sassanid rule was adapted from Imperial Aramaic. It heavily influenced the development of New Persian after the Arab conquest.
Linguistically, it is the ancestor of modern Persian, including Dari Persian and Tajiki Persian. After the Arab invasion, Arabic became the official language of the ruling dynasties, and the use of Middle Persian in writing books was prohibited. The gradual development of New Persian that followed marked a cultural resurgence of the Iranian identity.
Bactrian (Ariao): Bactria was an eastern Iranian region in modern Afghanistan. Bactrian is an Eastern Iranian language written in the Greek script, and it was the official language of the Kushan Empire in the 1st century CE. The latest use of the Bactrian language dates back to the 9th century, and evidence of this was found in the Tochi Valley in Pakistan.
Sogdian: Sogdian is another Eastern Iranian language. It was predominant in Sogdia, a civilization in Central Asia, northeast of the Iranian plateau (in present-day Uzbekistan and Tajikistan). Achaemenid texts mention this civilization, suggesting the existence of an older form of the language in the Old Iranian era.
Khwarezmian (Chorasmian): This Middle Iranian language is similar to Sogdian and shares features of Eastern Iranian languages, such as spirantization in word-initial position. Little is known about the Khwarezmian language and its ancient form. It used a variation of Imperial Aramaic similar to the Sogdian and Pahlavi scripts and was replaced in the 13th century by Persian.
Other Middle Iranian languages from the Eastern group include Saka and Old Ossetic.
New Iranian Languages

Following the Arab Invasion of Iran, the Iranian cultural identity underwent significant developments, and the Iranian language was no exception. This new era of development started in 900 CE and eventually resulted in the formation of the following languages:
Modern Persian (Farsi): Middle Persian evolved into the modern Persian language (Farsi), which is widely used in Iran, Afghanistan, Tajikistan, and Uzbekistan. Three regional varieties have developed: Iranian Persian, Eastern Persian (Dari), and Tajiki Persian.
The word Dari is derived from Darbari, which means the language of the royal court. After the prohibition of the Middle Persian language in the post-Islamic era, Farsi-e Dari gradually developed in the Greater Khorasan Area two centuries after the Arab invasion. This resurrection is owed to the works of great literary figures such as Ferdowsi, Rudaki, and Jami.
Iranian Farsi and Dari are highly similar except in pronunciation and regional expressions. Both are considered Western Iranian languages. They use the same Perso-Arabic alphabet, which is derived from the Arabic alphabet and adds four letters: پ (p), چ (ch), ژ (zh), and گ (g). Tajiki Persian, however, has used the Cyrillic script since Soviet rule and has loanwords from Russian.
Pashto: An official language of Afghanistan that is also widely spoken in northwestern Pakistan and in Iran's eastern borderlands, Pashto is an Eastern Iranian language. It shares characteristics with the Bactrian, Khwarezmian, and Saka languages, which are in the same category. It is written in a Perso-Arabic alphabet similar to that of modern Persian, with additional letters for sounds specific to Pashto.
Kurdish: A Western Iranian language, Kurdish has three main branches: Kurmanji (Northern Kurdish), Sorani (Central Kurdish), and Southern Kurdish. This language has four writing systems: the Hawar alphabet (Latin script, common among Kurds in Turkey and Syria), the Sorani Alphabet (Perso-Arabic script popular in Iran and Iraq), the Cyrillic alphabet, and the Armenian alphabet.
Balochi: The Balochi language is one of the oldest Western Iranian languages and belongs to the Northwestern Iranian group. Baloch people in the Baluchestan regions of Iran, Pakistan, and Afghanistan speak Balochi. In addition, there are Balochi speakers in the Persian Gulf states, East Africa, and even Turkmenistan. The earliest Balochi poetry dates back to the 15th century, but the language did not have a specific writing system until the 19th century. The surviving variations of this language include Koroshi, Southern-Western Balochi, and Eastern Balochi. The two main dialects are Mandwani and Domki (north and south, respectively).
Luri: The Luri language originated from Middle Persian and is the closest living language to archaic Persian. There are three dialects: Central Luri, Southern Luri, and Bakhtiari.
Other modern Iranian languages include Talyshi, Gilaki, and Mazandarani (Tabari).
Learn More about the Indo-Iranian Language Family
Travel to Iran on an Iran tour package or independently to discover Iranian languages in person. You can find ethnic groups, indigenous tribes, and individuals who still maintain their native Iranian language alongside Farsi. You can hear these ancient living languages in villages across northern Iran, Lorestan, Sistan and Baluchestan, and many other locations in Iran.
Frequently Asked Questions about Indo-Iranian Languages
If you have any unaddressed questions about the Indo-Iranian language family, ask us in the comment section. We’ll respond as soon as possible.
What is the Indo-Iranian language family?
These languages are a branch of the Indo-European languages spoken across Central, South, and Southwest Asia. The main groups of the Indo-Iranian branch are the Indo-Aryan and Iranian languages.
How many people speak Indo-Iranian languages?
Nearly 1.5 billion people in Asia and worldwide speak Indo-Iranian languages. This includes speakers in India, Bangladesh, Iran, Afghanistan, Pakistan, and many other South and West Asian countries.
What are the main branches of Iranic Languages?
Iranian languages are categorized into Eastern and Western languages. This categorization applies to the Middle Iranian languages and to all modern Iranian languages.
Where did Indo-Iranian languages originate?
These groups of languages originate from the Proto-Indo-Iranian language, whose speakers migrated to the Iranian plateau in the Bronze Age.
How many languages are spoken in Iran today?
In total, there are over 70 languages spoken in Iran, reflecting the country’s ethnic and cultural diversity.
What language family is Farsi?
Farsi, also known as the Persian language, belongs to the Indo-Iranian language family. It is primarily spoken in Iran, Afghanistan (where it is referred to as Dari), and Tajikistan (where it is called Tajik). Farsi has a rich literary history and has influenced many other languages in the region.
How old is the Persian language?
The Persian language has a long and continuous history, making it one of the oldest languages still in use. As part of the Iranic languages group, its history can be broadly divided into three main periods: Old Persian (circa 600 to 300 BCE), Middle Persian (circa 300 BCE to 900 CE), and New Persian (from the 9th century CE to the present).
Are Indo-Iranian languages Centum or Satem languages?
Indo-Iranian languages are the main branch of the Satem languages in Asia. The term refers to satemization, the shift of the Proto-Indo-European palatovelars to sounds articulated further forward in the mouth, which did not occur in Centum languages. It is often regarded as one of the earliest signs of the separation of Proto-Indo-European speakers into different ethnolinguistic groups.
Is Nepali language derived from Sanskrit?
Yes, the Nepali language is derived from Sanskrit. It belongs to the Indo-Aryan branch of the Indo-Iranian language family and has evolved significantly from its Sanskrit roots. Over time, Nepali has also incorporated influences from other languages, including Tibeto-Burman languages, English, and various regional dialects.