Albanian (/ælˈbeɪniən/; shqip [ʃc͡çip] or gjuha shqipe [ɟ͡ʝuha ˈʃc͡çipɛ]) is an Indo-European language spoken by the Albanians in the Balkans and the Albanian diaspora in the Americas, Europe and Oceania. With about 7.5 million speakers, it comprises an independent branch within the Indo-European languages and is not closely related to any other language.
First attested in the 15th century, it is the last Indo-European branch to appear in written records. This is one of the reasons why its still-unknown origin has long been a matter of dispute among linguists and historians. Albanian is considered to be the descendant of one of the Paleo-Balkan languages of antiquity. Largely for historical and geographical rather than specifically linguistic reasons, various modern historians and linguists believe that the Albanian language may have descended from a southern Illyrian dialect spoken in much the same region in classical times. Alternative hypotheses hold that Albanian may have descended from Thracian or Daco-Moesian, other ancient languages spoken farther east than Illyrian. Not enough is known of these languages to completely prove or disprove the various hypotheses.
The two main Albanian dialects, Gheg and Tosk, are primarily distinguished by phonological differences and are mutually intelligible, with Gheg spoken to the north and Tosk spoken to the south of the Shkumbin river. Their treatment of native words and of loanwords from other languages has led to the conclusion that the dialectal split occurred after the Christianisation of the region (4th century AD) and at the time of the Slavic migration to the Balkans, with the historic boundary between Gheg and Tosk being the Shkumbin, which straddled the Jireček line. Standard Albanian is a standardised form of spoken Albanian based on the Tosk dialect. It is the official language of Albania and Kosovo and a co-official language in North Macedonia, as well as a minority language of Italy, Montenegro, Romania and Serbia.
Centuries-old communities speaking Albanian dialects can be found scattered in Croatia (the Arbanasi), Greece (the Arvanites and some communities in Epirus, Western Macedonia and Western Thrace), Italy (the Arbëreshë) as well as in Romania, Turkey, and Ukraine. Two varieties of the Tosk dialect, Arvanitika in Greece and Arbëresh in southern Italy, have preserved archaic elements of the language.
Main article: Albanians
The dialects of Albania
The language is spoken by approximately 6 million people in the Balkans, primarily in Albania, Kosovo, North Macedonia, Serbia, Montenegro and Greece. However, owing to old communities in Italy and the large Albanian diaspora, the worldwide total of speakers is considerably higher, at approximately 7.5 million.
The Albanian language is the official language of Albania and Kosovo, and co-official in North Macedonia. Albanian is a recognised minority language in Croatia, Italy, Montenegro, Romania and Serbia. Albanian is also spoken by a minority in Greece, specifically in the Thesprotia and Preveza regional units and in a few villages in the Ioannina and Florina regional units. It is also spoken by 450,000 Albanian immigrants in Greece.
Albanian is the third most spoken language in Italy. This is due to a substantial Albanian immigration to Italy. Italy has a historical Albanian minority of about 500,000, scattered across southern Italy, known as Arbëreshë. Approximately 1 million Albanians from Kosovo are dispersed throughout Germany, Switzerland and Austria. These are mainly refugees from Kosovo who migrated during the Kosovo War. In Switzerland, the Albanian language is the sixth most spoken language with 176,293 native speakers.
Albanian became an official language in North Macedonia on 15 January 2019.
There are large numbers of Albanian speakers in the United States, Argentina, Chile, Uruguay and Canada. Some of the first ethnic Albanians to arrive in the United States were Arbëreshë. The Arbëreshë have a strong sense of identity, and are unique in that they speak an archaic dialect of Tosk Albanian called Arbëresh.
In North America (United States and Canada) there are approximately 250,000 Albanian speakers. Albanian is spoken in the eastern United States in cities such as New York City, Boston, Chicago, Philadelphia and Detroit, as well as in parts of the states of New Jersey, Ohio and Connecticut. Greater New Orleans has a large Arbëresh community. Wherever there are Italians, there are often a few Arbëreshë among them. Arbëreshë Americans are therefore often indistinguishable from Italian Americans, having been assimilated into the Italian American community.
In Argentina there are nearly 40,000 Albanian speakers, mostly in Buenos Aires.
Asia and Oceania
Approximately 1.3 million people of Albanian ancestry live in Turkey, more than 500,000 of whom recognize their ancestry, language and culture. Other estimates, however, place the number of people in Turkey with Albanian ancestry or background as high as 5 million. The vast majority of this population is assimilated and no longer fluent in the Albanian language, though a vibrant Albanian community maintains its distinct identity in Istanbul to this day.
In Egypt there are around 18,000 Albanians, mostly Tosk speakers. Many are descendants of the Janissaries of Muhammad Ali Pasha, an Albanian who became Wāli and self-declared Khedive of Egypt and Sudan. In addition to the dynasty that he established, a large part of the former Egyptian and Sudanese aristocracy was of Albanian origin. In addition to the recent emigrants, there are older diasporic communities around the world.
Albanian is also spoken by Albanian diaspora communities residing in Australia and New Zealand.
Main article: Albanian dialects
The dialects of the Albanian language.
The Albanian language has two distinct dialects: Tosk, which is spoken in the south, and Gheg, which is spoken in the north. Standard Albanian is based on the Tosk dialect. The Shkumbin river is the rough dividing line between the two dialects.
Gheg is divided into four sub-dialects: Northwest Gheg, Northeast Gheg, Central Gheg, and Southern Gheg. It is primarily spoken in northern Albania and throughout Montenegro, Kosovo and northwestern North Macedonia. One fairly divergent dialect is the Upper Reka dialect, which is nevertheless classified as Central Gheg. There is also a diaspora dialect in Croatia, the Arbanasi dialect.
Tosk is divided into five sub-dialects: Northern Tosk (the one with the most speakers), Labërisht, Çam, Arvanitika, and Arbëresh. Tosk is spoken in southern Albania, southwestern North Macedonia and northern and southern Greece. Cham Albanian is spoken in north-western Greece, while Arvanitika is spoken by the Arvanites in southern Greece. In addition, Arbëresh is spoken by the Arbëreshë people, descendants of 15th- and 16th-century migrants who settled in southern Italy, in small communities in the regions of Sicily and Calabria.
Main articles: Albanian alphabet and Albanian braille
Albanian keyboard layout.
The Albanian language has been written using many different alphabets since the earliest records from the 15th century. The history of Albanian orthography is closely related to the cultural orientation of Albanian writers and their knowledge of certain foreign languages. The earliest written Albanian records come from the Gheg area in makeshift spellings based on Italian or Greek. Originally, the Tosk dialect was written in the Greek alphabet and the Gheg dialect was written in the Latin script. Both dialects had also been written in the Ottoman Turkish version of the Arabic script, in Cyrillic, and in some local alphabets (Elbasan, Vithkuqi, Todhri, Veso Bey, Jan Vellara and others; see original Albanian alphabets). More specifically, writers from northern Albania under the influence of the Catholic Church used Latin letters, those in southern Albania under the influence of the Greek Orthodox Church used Greek letters, while others throughout Albania under the influence of Islam used Arabic letters.
There were initial attempts to create an original Albanian alphabet during the 1750–1850 period. These attempts intensified after the League of Prizren and culminated in the Congress of Manastir, held by Albanian intellectuals from 14 to 22 November 1908 in Manastir (present-day Bitola), which decided which alphabet to use and what the standardized spelling of standard Albanian would be. The literary language is still written this way. The alphabet is the Latin alphabet with the addition of the letters ë and ç and nine digraphs: dh, gj, ll, nj, rr, sh, th, xh and zh.
According to Robert Elsie:
The hundred years between 1750 and 1850 were an age of astounding orthographic diversity in Albania. In this period, the Albanian language was put to writing in at least ten different alphabets – most certainly a record for European languages. ... the diverse forms in which this old Balkan language was recorded, from the earliest documents to the beginning of the twentieth century ... consist of adaptations of the Latin, Greek, Arabic, and Cyrillic alphabets and (what is even more interesting) a number of locally invented writing systems. Most of the latter alphabets have now been forgotten and are unknown, even to the Albanians themselves.
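Because the standard alphabet counts each digraph as a single letter, splitting an Albanian word into letters is not the same as splitting it into characters. The following is a minimal Python sketch of that idea, not an authoritative implementation. The function name `split_letters` and the constant `DIGRAPHS` are illustrative names introduced here, and the default digraph list assumes the nine digraphs of the modern 36-letter alphabet (dh, gj, ll, nj, rr, sh, th, xh, zh); a different inventory can be passed in. The greedy left-to-right matching is also an assumption: it will mis-split any word in which two adjacent characters that happen to form a digraph actually belong to separate letters.

```python
# Assumed inventory: the nine digraphs of the modern standard Albanian alphabet.
DIGRAPHS = ("dh", "gj", "ll", "nj", "rr", "sh", "th", "xh", "zh")


def split_letters(word, digraphs=DIGRAPHS):
    """Split a word into alphabet letters, treating each digraph as one letter.

    Uses greedy left-to-right matching: at each position, a two-character
    digraph is preferred over a single character.
    """
    lowered = word.lower()
    letters = []
    i = 0
    while i < len(lowered):
        pair = lowered[i:i + 2]
        if pair in digraphs:
            letters.append(pair)  # digraph counts as a single letter
            i += 2
        else:
            letters.append(lowered[i])  # ordinary single-character letter
            i += 1
    return letters


if __name__ == "__main__":
    print(split_letters("shqip"))  # ['sh', 'q', 'i', 'p']
    print(split_letters("gjuha"))  # ['gj', 'u', 'h', 'a']
```

Under these assumptions, shqip has four letters rather than five characters, which matters for tasks such as dictionary-style sorting or letter counting.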
Tree of Indo-European languages.
The Albanian language occupies an independent branch of the Indo-European language tree. In 1854, Albanian was demonstrated to be an Indo-European language by the philologist Franz Bopp. Albanian was formerly compared by a few Indo-European linguists with Germanic and Balto-Slavic, all of which share a number of isoglosses with Albanian. Other linguists linked the Albanian language with Latin, Greek and Armenian, while placing Germanic and Balto-Slavic in another branch of Indo-European.
The first written mention of the Albanian language was on 14 July 1284 in Dubrovnik in modern Croatia when a crime witness named Matthew testified: "I heard a voice shouting on the mountainside in the Albanian language" (Latin: Audivi unam vocem, clamantem in monte in lingua albanesca). The oldest document written in Albanian dates back to 1462, while the first audio recording in the language was made by Norbert Jokl on 4 April 1914 in Vienna.
During the five-century period of the Ottoman presence in Albania, the language was not officially recognized until 1909, when the Congress of Dibra decided that Albanian schools would finally be allowed.
See also: Illyrian languages
Albanian is considered an isolate within the Indo-European language family; no other language has been conclusively linked to its branch. The only other language that is a sole surviving member of a branch of Indo-European is Armenian.
The Albanian language is part of the Indo-European language group and is considered to have evolved from one of the Paleo-Balkan languages of antiquity, although it is still uncertain which particular Paleo-Balkan language represents the ancestor of Albanian, or where in Southern Europe that population lived. In general there is insufficient evidence to connect Albanian with any one of those languages, whether one of the Illyrian languages (the connection most historians favour), or Thracian and Dacian. Among these possibilities, Illyrian is typically held to be the most probable, though the lack of evidence leaves the question open.
Although Albanian shares lexical isoglosses with Greek, Germanic, and to a lesser extent Balto-Slavic, the vocabulary of Albanian is quite distinct. In 1995, Taylor, Ringe and Warnow, using quantitative linguistic techniques, found that Albanian appears to comprise a "subgroup with Germanic". However, they argued that this fact is hardly significant, as Albanian has lost much of its original vocabulary and morphology, and so this "apparently close connection to Germanic rests on only a couple of lexical cognates – hardly any evidence at all".
Historical presence and location
Main article: Origin of the Albanians
The location of the Albanoi tribe 150 AD
Illyrians, Dacians, Getae and Thracians at 200 BC
The place and time in which the Albanian language was formed are uncertain. American linguist Eric Hamp stated that during an unknown chronological period a pre-Albanian population (termed "Albanoid" by Hamp) inhabited areas stretching from Poland to the southwestern Balkans. Further analysis has suggested that it was in a mountainous region rather than on a plain or seacoast: while the words for plants and animals characteristic of mountainous regions are entirely original, the names for fish and for agricultural activities (such as ploughing) are borrowed from other languages.
A deeper analysis of the vocabulary, however, shows that this could be a consequence of a prolonged Latin domination of the coastal and plain areas of the country, rather than evidence of the original environment where the Albanian language was formed. For example, the word for 'fish' is borrowed from Latin, but not the word for 'gills', which is native. Indigenous are also the words for 'ship', 'raft', 'navigation', 'sea shelves' and a few names of fish kinds, but not the words for 'sail', 'row' and 'harbor' – objects pertaining to navigation itself and a large part of sea fauna. This rather shows that Proto-Albanians were pushed away from coastal areas in early times (probably after the Latin conquest of the region) thus losing large parts (or the majority) of sea environment lexicon. A similar phenomenon could be observed with agricultural terms. While the words for 'arable land', 'corn', 'wheat', 'cereals', 'vineyard', 'yoke', 'harvesting', 'cattle breeding', etc. are native, the words for 'ploughing', 'farm' and 'farmer', agricultural practices, and some harvesting tools are foreign. This, again, points to intense contact with other languages and people, rather than providing evidence of a possible Urheimat.
1905 issue of the magazine Albania, the most important Albanian periodical of the early 20th century
The centre of Albanian settlement remained the Mat river. In 1079, Albanians were recorded farther south in the valley of the Shkumbin river. The Shkumbin, a seasonal stream that lies near the old Via Egnatia, is approximately the boundary of the primary dialect division of Albanian into Tosk and Gheg. The way Tosk and Gheg treat native words and loanwords from other languages is evidence that the dialectal split preceded the Slavic migration to the Balkans, which means that in that period (the 5th to 6th centuries AD) Albanians were occupying nearly the same area around the Shkumbin river, which straddled the Jireček Line.
References to the existence of Albanian as a distinct language survive from the 14th century, but they failed to cite specific words. The oldest surviving documents written in Albanian are the "formula e pagëzimit" (Baptismal formula), Un'te paghesont' pr'emenit t'Atit e t'Birit e t'Spertit Senit. ("I baptize thee in the name of the Father, and the Son, and the Holy Spirit") recorded by Pal Engjelli, Bishop of Durrës in 1462 in the Gheg dialect, and some New Testament verses from that period.
Linguists Stefan Schumacher and Joachim Matzinger (University of Vienna) assert that the first literary records of Albanian date from the 16th century. The oldest known Albanian printed book, Meshari, or "missal", was written in 1555 by Gjon Buzuku, a Roman Catholic cleric. In 1635 Frang Bardhi wrote the first Latin–Albanian dictionary. The first Albanian school is believed to have been opened by Franciscans in 1638 in Pdhanë.
One of the earliest dictionaries of the Albanian language was written in 1693: the Italian-language manuscript Pratichae Schrivaneschae, authored by the Montenegrin sea captain Julije Balović, which includes a multilingual dictionary of hundreds of the most frequently used words in everyday life in the Italian, Slavo-Illirico, Greek, Albanian and Turkish languages.