Corsican (Corsu or Lingua Corsa) is a Romance language spoken and written on the islands of Corsica (France) and northern Sardinia (Italy), alongside French and Italian, which are the official languages. Historically Corsu is the native language of the Corsican people, once spoken predominantly as a first and only language on Corsica. By 1990, after over 200 years as a part of France, nearly all Corsicans were fluent in French as their first language. An estimated 50% of them also had some degree of proficiency in Corsu, and a small minority, perhaps 10%, used Corsu as a first language.
The use of Corsican over French had been declining. In 1980 about 70% of the population "had some command of the Corsican language." In 1990, out of a total population of about 254,000, the percentage had declined to 50%, with only 10% using it as a first language. The language was clearly on the way out when the French government reversed its non-supportive stand and began some strong measures to save it. Whether these measures will succeed remains to be seen. No recent statistics on Corsu are available.
UNESCO classifies the Corsican language as a potentially endangered language, which has "a large number of children speakers" but is "without an official or prestigious status." The classification does not state that the language is currently endangered, only that it is potentially so. In fact it is being vigorously affirmed. Acting on a long-standing sentiment, unidentified Corsicans often cross out the French names on road signs and paint in the Corsu names. The Corsican language is a key vehicle for Corsican culture, which is notably rich in proverbs and in polyphonic song.
At the primary school level Corsu can be taught up to a fixed number of hours per week (three in the year 2000) and is a voluntary subject at the secondary school level, but is required at the University of Corsica. It is available through adult education. It can be spoken in court or in the conduct of other government business if the officials concerned speak it. The Cultural Council of the Corsican Assembly advocates for its use; for example, on public signs.
A myth about the Corsican language has some currency among foreigners: that it was only a spoken language, or was written only recently. Omniglot goes so far as to assert "Corsican first appeared in writing towards the end of the 19th century ...." Whatever Omniglot may have meant, throughout the 18th and 19th centuries there was a steady stream of writers in Corsican, many of whom wrote also in other languages.
Ferdinand Gregorovius, a 19th century traveller and enthusiast of Corsican culture, reports that the preferred form of the literary tradition of his time was the vocero, a type of polyphonic ballad originating from funeral obsequies. These laments were similar in form to the chorales of Greek drama except that the leader could improvise. Some performers were noted for this skill, such as the 18th century Mariola della Piazzole and Clorinda Franseschi.
The trail of written popular literature of known date in Corsican currently goes no further back than the 17th century. An undated corpus of proverbs from communes may well precede it (see under External links below). Corsican has also left a trail of legal documents reaching back to the late 12th century. At that time the monasteries held considerable land on Corsica and many of the churchmen were notaries.
Between 1200 and 1425 the monastery of Gorgona, Benedictine for much of that time and in the territory of Pisa, acquired about 40 legal papers of various sorts written on Corsica. As the church was replacing Pisan prelates with Corsican ones there, the legal language shows a transition from entirely Latin, through partially Latin and partially Corsican, to entirely Corsican. The first known surviving document containing some Corsican is a bill of sale from Patrimonio dated to 1220. These documents were moved to Pisa before the monastery closed its doors and were published there.
The search for earlier evidence of Corsican goes on. It is entirely possible that archaeology or research in monastic archives will turn up more.
The general classification of Corsican as a Romance language allows two possibilities as to the identity of the speakers of the first distinct Corsican, or Proto-Corsican. They created the language either from Proto-Romance or from a subsequent Romance language.
In 40 AD neither a Romance nor an Italic language was spoken by the natives of Corsica. The Roman exile Seneca the Younger reports that both coast and interior were occupied by natives whose language he did not understand (see under Prehistory of Corsica). Latin at that time was generally spoken only in the Roman colonies. The occupation of the island by Vandals about 469 AD marks the end of authoritative influence by Latin-speaking Romans (see under Medieval Corsica). If the natives of that time were speaking Latin, they must have acquired it during the late empire. The documents of the early Christian church concerning Corsica are in Latin, but they are only communications between church officials (see under Ajaccio).
The next window of opportunity for the predecessor of a Proto-Corsican was the administration of Corsica by Tuscany, then speaking the Tuscan dialect, an immediate predecessor of Italian. The first Italian documents date from the 10th century but Italian must have developed earlier and Tuscan even earlier. Tuscan would have come from the latest phases of Vulgar Latin; Proto-Corsican from the Tuscan spoken on Corsica.
The last historical possibility is that Proto-Corsican came from the Italian dialect of Pisa; its period of Corsican administration, however, was relatively short. Genoese is not a likely possibility as Corsican is attested before the presence of Genoa on Corsica. Historical circumstances alone reduce the window of opportunity only to within several hundred years.
Turning to the professional comparatists, it is possible to say definitely that Corsican is not Tuscan and is not Italian. For example, one of the characteristics of Tuscan and Italian is that Latin -u- in -us becomes -o: Latin annus, "year", but Italian anno. Corsican has annu, retaining the -u. Likewise, the -re infinitive ending, as in Latin mittere, "send", is retained in Tuscan but lost in Corsican, which has mette/metta, "to put." The Latin relative pronoun qui, "who, what", is inflected in Latin and Italian, but in Corsican it is the uninflectable chì. The differences are numerous and profound, and they preclude the idea that the Corsican forms came from Tuscan rather than from Latin: "the Corsican language is not the same as Tuscan" and "Corsican has preserved certain Latin forms which have disappeared elsewhere."
The SLS is an abstract summary of all the lexical items and morphological features that distinguish the language and therefore determine the overall order of its digraphs. The statistical distance of one SLS from another measures the similarity of the two languages in a manner that does not depend on a subjective analysis of features or value decisions as to which should be considered. There is some variability of the signature depending on the selection of samples and the mathematical methods of conceiving and computing distance.
The ability to characterize languages by numbers creates a sample space for them in which the clustering of points reveals groups of similar languages, or if samples are taken from the history of the language, graphs that trace the divergence of languages from each other. These methods are limited only by the comprehensiveness of the sample texts.
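As a rough illustration of the idea of comparing languages by the statistics of their digraphs, the following minimal Python sketch builds a table of relative digraph frequencies from a text and measures the distance between two such tables. It is not the procedure used in the studies described here: the function names, the restriction to letter pairs inside words, and the two distance formulas (a Frobenius-style root of summed squared differences and an entrywise sum of absolute differences) are assumptions chosen for simplicity, and the two one-word samples are toys taken from the annu/anno contrast rather than real corpora.

```python
from collections import Counter
from math import sqrt


def digraph_signature(text):
    """Return relative frequencies of adjacent letter pairs inside words.

    Hypothetical helper: a simplified stand-in for a statistical signature.
    """
    counts = Counter()
    for word in text.lower().split():
        letters = [c for c in word if c.isalpha()]  # drop punctuation, digits
        for first, second in zip(letters, letters[1:]):
            counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


def frobenius_distance(sig_a, sig_b):
    """Square root of the summed squared differences over all digraphs."""
    pairs = set(sig_a) | set(sig_b)
    return sqrt(sum((sig_a.get(p, 0.0) - sig_b.get(p, 0.0)) ** 2 for p in pairs))


def one_norm_distance(sig_a, sig_b):
    """Sum of absolute differences over all digraphs (entrywise 1-norm)."""
    pairs = set(sig_a) | set(sig_b)
    return sum(abs(sig_a.get(p, 0.0) - sig_b.get(p, 0.0)) for p in pairs)


if __name__ == "__main__":
    # Toy samples only; meaningful results need long, comparable texts.
    corsican_sample = digraph_signature("annu")
    italian_sample = digraph_signature("anno")
    print(frobenius_distance(corsican_sample, italian_sample))
    print(one_norm_distance(corsican_sample, italian_sample))
```

With signatures of this kind computed for many languages, the pairwise distances form a table on which any standard clustering method can be run; how well the resulting groups reflect real relationships depends on the size and comparability of the sample texts.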
An initial effort in 2002 to develop a language classification tree turned out unsatisfactorily because of insufficient data. A second effort in 2003 used the text of The Universal Declaration of Human Rights in 52 languages as sample texts to develop two trees by two different statistical methods. Discrepancies between them were attributed to insufficient sample data; for example, by one method Sardinian and Corsican are very close but by the other rather distant, with neither being close to Italian.
A recent attempt to bring the tree into sharper focus on the Romance languages reduced the number of languages to 34 and restricted the statistical measures to the Frobenius Distance and the Kalin (1-norm) Distance. It expanded the data set to include also other documents reflecting spoken language, such as newspapers, and made it diachronic, going back 22 centuries. Sardinian was not included but the results for Corsican are precise.
Corsican diverged from Italian, Corsican-Italian from Friulian, and that group from a larger one that includes Latin on the one hand and almost all the others on the other. In other words, there was a common ancestor on Italian soil and Corsica. The ancestor was not Latin and was to be distinguished from ancestors on other soils, in Iberia and Gaul.
The "Italian" from which Corsican diverged in mutual dissimilation was not modern Italian, still far in the future, but its ancestor, Tuscan. Nor did the divergence take place during the Tuscan period on Corsica, by which time Corsican already existed. The common ancestor was a language about which little is known: spoken or Vulgar Latin, often considered to be Proto-Romance. Written Latin was a literary language; hence it does not appear as an ancestor in the tree. The ancestors in Iberia and Gaul came from soldiers' Latin, the speech of mainly foreign troops learning the spoken language.
The ancestor of Corsican, Tuscan and Friulian (the last spoken on the soil of the earlier Rhaetia) draws attention because it was spoken on formerly Etruscan soil. Evidently, when the Etruscans assimilated, they did so with a unique signature.
The date of the first projected Corsican signature is about 1400 years ago, 600 AD more or less, well before Tuscan rule, in the early Christian period. This date is consistent with a Latinization of the Corsican people during the late empire and subsequent local development of Vulgar Latin into Proto-Corsican before close communication with Italy was again established.
In the Maddalena archipelago the local dialect (called Isulanu, Maddaleninu, Maddalenino) was brought by fishermen and shepherds from Bonifacio during immigration in the 17th-18th centuries. Though influenced by Gallurese, it has maintained the original characteristics of Corsican. There are also numerous words of Genoese and Ponzese origin.
For example, Article 2 Item 4 of Law Number 26, October 15, 1997, of the Autonomous Region of Sardinia grants "al dialetto sassarese e a quello gallurese" ("to the Sassarese dialect and to the Gallurese one") equal legal status with the other languages on Sardinia (which Corsica does not do). They are thereby legally defined as languages different from Sardinian.
Vowels may be nasalized before n (which is assimilated to m before p or b) and before the palatal nasal consonant gn. The nasal vowels are represented by the vowel plus n, m or gn; the combination is a digraph or trigraph indicating the nasalized vowel, and the consonant is pronounced in weakened form. The same combination of letters may instead represent simply the non-nasal vowel followed by the consonant at full weight, and the speaker must know the difference. Example of a nasal: pane is pronounced ['pãnɛ] and not ['panɛ].
The vowel inventory, or collection of phonemic vowels (and the major allophones), transcribed in IPA symbols, is:
|Description||Letter||Phoneme||Phone or allophone||Notes and examples|
|Open front unrounded|| || || || |
|Open back unrounded||a||/â/||[ɑ]|| |
|Close-mid front unrounded|| || || ||Inherited as open or close: U celu [uʤ'elu], Ci hè [ʧ'ɛ]|
|Close front unrounded|| || || ||1st sound, diphthong|
|Close-mid back rounded||o||/o/||[o]||giòvani [ʤ'owãni]|