User:AdiJapan/Correlation


 * This article was removed from Wikipedia, the reason being that it contains original research. I keep it on my user page only because I hope I can find sources to support it. As such, while your editing is welcome, there is no guarantee that it will ever be part of Wikipedia. Please note that the fruit and vegetable names are listed here for linguistic reasons, so botanical nomenclature is irrelevant. Also note that English equivalents are given only to indicate the meaning of those nouns. On the other hand, I would very much welcome a list of Latin equivalents (where possible), with their respective gender. Thank you. --AdiJapan

Correlation between the shape of an object and the gender of the respective noun
In some of the languages that classify nouns into genders, there exists a distinct correlation between the shape of an object and the gender of the respective noun, even when no biological gender is involved. Specifically, it appears that objects that have one dimension significantly larger than the other two tend to be assigned the masculine gender, while round or flat objects tend to be named using feminine nouns. Such a correlation was clearly identified in the Alamblak language, but is also present, for example, in Indo-European languages.

To test this hypothesis, a collection of nouns can be taken and their gender compared with the general shape of the respective objects. The table below shows such an experiment, performed on French, Italian, Romanian, and Russian nouns. The collection was restricted to about fifty nouns, all of which denote fruit and vegetables. A noun is considered to obey the correlation if its gender is masculine in the case of rather oblong objects, or feminine in the case of rather round objects. Column S gives the reference shape of the object, where I means rather oblong and O rather round. The columns marked 1 show whether the correlation rule is obeyed: a symbol X is placed where the correlation is violated. The columns marked 2 show the gender of the nouns in their respective language, with M meaning masculine, F feminine, and N neuter. French and Italian only have masculine and feminine nouns; neuter nouns in Romanian behave in the singular as masculine (see Romanian nouns) and for the purpose of this calculation they were counted as masculine. Similarly, neuter nouns in Russian were counted as masculine because they have the same behavior, at least in four out of six cases (see Russian grammar).
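The rule used to fill in the columns marked 1 can be sketched in a few lines of Python. This is only an illustration of the rule as described above; the function names `expected_gender` and `violates` are hypothetical and do not come from the original experiment.

```python
def expected_gender(shape: str) -> str:
    """Gender predicted by the rule: 'I' (rather oblong) -> 'M', 'O' (rather round) -> 'F'."""
    return {"I": "M", "O": "F"}[shape]


def violates(shape: str, gender: str) -> bool:
    """Return True where the table would show an X (the rule is violated)."""
    # Neuter nouns are counted as masculine, as done here for Romanian and Russian.
    effective = "M" if gender == "N" else gender
    return effective != expected_gender(shape)


# Abstract shape/gender pairs, not entries taken from the table.
print(violates("I", "M"))  # oblong object, masculine noun: rule obeyed
print(violates("O", "N"))  # round object, neuter counted as masculine: rule violated
```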

The correlation is estimated by counting the number of violations separately for each language. If this number is zero the correlation is 100%; if it equals about half of the noun samples then there is hardly any correlation; intermediate values correspond to a partial correlation. (If more than half of the nouns violate the rule then there is a reverse correlation, i.e. oblong objects tend to be named using feminine nouns.) Mathematically this can be expressed by defining a correlation degree as


 * $$c=1-2{N_v \over N_{tot}}$$,

where $$N_v$$ is the number of violations and $$N_{tot}$$ is the total number of samples per language.
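As a minimal sketch, the correlation degree can be computed as follows. The function name `correlation_degree` is hypothetical, and the counts in the usage lines are round illustrative numbers rather than the values from the table.

```python
def correlation_degree(n_violations: int, n_total: int) -> float:
    """Compute c = 1 - 2 * N_v / N_tot for one language."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_violations <= n_total:
        raise ValueError("n_violations must lie between 0 and n_total")
    return 1 - 2 * n_violations / n_total


# Illustrative counts for a sample of 50 nouns.
print(correlation_degree(0, 50))   # no violations: full correlation
print(correlation_degree(25, 50))  # half the nouns violate the rule: no correlation
print(correlation_degree(50, 50))  # every noun violates the rule: full reverse correlation
```

The three calls cover the limiting cases described in the text: c is 1 with no violations, 0 when half of the nouns violate the rule, and -1 when all of them do.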

Results
The last row in the table contains the correlation degree for each language. While French, Italian, and Russian have a relatively high rate of violations, leading to a correlation degree between 10% and 20%, Romanian nouns show a correlation higher than 50%.

The small number of noun samples and the simplistic definition of the correlation degree do not allow for rigorous estimations; nevertheless, the results are suggestive. Romanian speakers tend to consider the visual aspect of objects when they assign genders to new nouns; it is possible that, in the case of multiple noun choices for the same object, the nouns that obey this correlation have a better chance of surviving in the language. French, Italian, and Russian nouns turned out to have a non-zero correlation degree, but the error margin of this calculation is on the order of 10%, so that no clear conclusion can be drawn.
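One way to see why an error margin of this size is plausible is to treat each noun as an independent trial that either violates the rule or not. This is an assumption made here for illustration; the text does not say how its error margin was obtained. Under that assumption the violation rate p = N_v / N_tot has a binomial standard error, and since c = 1 - 2p, the standard error of c is twice that of p. The function name and the counts below are hypothetical.

```python
import math


def correlation_std_error(n_violations: int, n_total: int) -> float:
    """Binomial standard error of c = 1 - 2p, assuming independent nouns (an assumption)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    p = n_violations / n_total
    # SE(p) = sqrt(p * (1 - p) / N); c is linear in p with slope -2, so SE(c) = 2 * SE(p).
    return 2 * math.sqrt(p * (1 - p) / n_total)


# Hypothetical counts: 22 violations out of 50 nouns gives c = 0.12.
print(round(correlation_std_error(22, 50), 2))
```

For a sample of about fifty nouns with a violation rate near one half, this estimate comes out at roughly 0.14, which is comparable to correlation degrees of 10% to 20%.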