The term ethnic group is generally understood to designate a population which:
1) is largely biologically self-perpetuating;
2) shares fundamental cultural values, realized in overt unity in cultural forms;
3) makes up a field of communication and interaction;
4) has a membership that identifies itself, and is identified by others, as constituting a category distinguishable from other categories of the same order (BARTH FREDRIK, 1998, 10-11).
Under this definition, the ethnic group is viewed as a complex of linguistic, cultural, and social phenomena, and it corresponds to "many empirical ethnographic situations" (ibid., 11). In the studies described here, we will, for simplicity, understand an ethnos as the speakers of a particular language, treated as a historically variable phenomenon at a certain stage of its existence, without regard for their ethnic identity or self-designation. In principle, this amounts to recognizing language as the determining factor in the formation of ethnic communities. This approach seems natural for prehistoric times, since anthropologists also attach great importance to language in ethnogenetic processes.
For 30 years I have investigated prehistoric ethnogenetic processes in Eastern Europe, which were already partially described in my book (STETSYUK V.M., 1998). I tried to set aside the views and received doctrines prevalent in linguistics, starting from no prior hypothesis and relying on my own graphic-analytical method, as I have named it, which was briefly described, using the example of the Slavic languages, in a separate publication (STETSYUK V.M., 1987). The essence of this method is a quantitative estimation and geometrical interpretation of the mutual characteristics of monophyletic languages in the form of frameworks for successive chronological levels. The chronological levels are determined by connecting linguistic and archaeological data. The key part of the study was constructing graphic models of the Indo-European, Turkic, and Finno-Ugric languages and correlating them with the geographical map of Eastern Europe. The unexpected result of this correlation stimulated the subsequent investigations.
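The idea of a geometrical interpretation can be illustrated with a minimal sketch. It is not the published graphic-analytical procedure: it assumes that the mutual characteristic of two languages is a count of shared words, that distance is inversely proportional to that count (the hypothetical `distance` function and its `scale` parameter), and that only three languages are placed (the hypothetical `place_three` function). The counts used are invented for illustration.

```python
import math

def distance(shared, scale=100.0):
    """Assumed conversion: the more words two languages share, the closer they are."""
    return scale / shared

def place_three(d_ab, d_ac, d_bc):
    """Place points A, B, C in the plane so that the pairwise distances match.

    A is fixed at the origin and B on the x-axis; the position of C follows
    from the law of cosines. Raises ValueError if no such triangle exists.
    """
    x = (d_ab**2 + d_ac**2 - d_bc**2) / (2 * d_ab)
    y_sq = d_ac**2 - x**2
    if y_sq < 0:
        raise ValueError("distances violate the triangle inequality")
    return (0.0, 0.0), (d_ab, 0.0), (x, math.sqrt(y_sq))

# Invented counts of shared words: A-B 4, A-C 2, B-C 2.
a, b, c = place_three(distance(4), distance(2), distance(2))
print(a, b, c)
```

With more than three languages the distances generally cannot all be satisfied exactly in a plane, so any real model has to settle for the best approximate arrangement.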
A close genetic relationship between languages is usually associated with a larger number of lexical items of common origin. This tenet was established by linguists as early as the end of the 19th century (FORTUNATOV F.F., 1956, 68-69). That is why lexical material was mainly used in my work. Lexical-statistical methods were chosen for this study because vocabulary has the advantage of being discrete and abundant, and therefore lends itself easily to mathematical computation.
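Because vocabulary is discrete, the basic quantity of a lexical-statistical comparison can be obtained by simple counting. The following is a minimal sketch under stated assumptions: the language names, the numeric cognate-set identifiers, and the `shared_counts` function are all hypothetical placeholders, not data or procedures from this study.

```python
from itertools import combinations

# Hypothetical toy data: each language maps to the identifiers of the
# cognate sets (groups of words of common origin) attested in it.
cognate_sets = {
    "lang_A": {1, 2, 3, 4, 5, 6},
    "lang_B": {1, 2, 3, 4, 7},
    "lang_C": {1, 2, 8, 9},
}

def shared_counts(data):
    """Return {(lang1, lang2): number of cognate sets common to both}."""
    return {
        (a, b): len(data[a] & data[b])
        for a, b in combinations(sorted(data), 2)
    }

counts = shared_counts(cognate_sets)
for pair, n in counts.items():
    print(pair, n)
```

In this toy example, lang_A and lang_B share more items with each other than either shares with lang_C, which under the tenet above would point to a closer relationship between them.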
Many linguists consider the vocabulary of a language to be very unstable. It is true that sounds and grammatical categories do not easily disappear or undergo large-scale changes, whereas an individual word can disappear or shift in meaning. Moreover, many languages contain a great many loanwords. One can observe, however, that loanwords are typically of a kind that less advanced peoples could have borrowed from the languages of more advanced peoples with whom they were in contact over a long period. By contrast, the basic vocabulary must have been in everyday use since the earliest primitive period. It is the oldest words that mostly remain in a language, and they are the most frequently used. According to the Russian grammarian A.V. Desnitskaya, “indigenous vocabulary includes a considerable share of the most frequent words, which reflect elementary concepts and form the largest word-formation nests” (DESNITSKAYA A.V., 1966, 9). Other Russian linguists, Arapov and Herts, said the following about the correlation between a word’s frequency and its age:
“There is a correlation between a word’s frequency and the time of its appearance in the language… The majority of frequently used words are old words and, vice versa, the lower a word’s frequency, the more likely it is that the word is a neologism” (ARAPOV M.V., HERTS M.M., 1974, 3).
The oldest words of a language determine its genetic origin. For example, English has borrowed around 50% of its total vocabulary from French and Latin, but its oldest words characterize it as a Germanic language rather than a mixed Germanic-Romance one.
Arapov and Herts indicate that the correlation between word frequency and age was first put forward by G. Zipf in 1947, and that it was he who underlined its significance for the quantitative analysis of facts bearing on language history. However, it must be kept in mind that some low-frequency words can be old and that many neologisms can have high frequency; such newly coined words were excluded from this lexical-statistical study on the basis of their meaning.
The rate of change of the basic vocabulary cannot be extremely high. M. Swadesh, who specialized in the study of changes in the lexical core of languages, wrote that one can observe differences in the vocabulary and word usage of older and younger generations, but that these differences do not reach a level at which they cause mutual misunderstanding, and that it is this circumstance that most limits the speed of language change (SWADESH M., 1960). No doubt some words (though relatively few) disappear for various reasons, but they often leave derivatives with remote but related meanings.
Given these considerations, the words to be analyzed with the graphic-analytical method were selected on the basis of their high frequency or their expression of elementary concepts.
In the presentation, the so-called "semantic problem" will always be present; it arises whenever new methods or principles are used, as well as in the description of the results obtained.
The problem lies in the fact that the meaning of the words and concepts we use depends on our previous conceptions, which may themselves be wrong (CHEW JEOFFREY, 1968, 46).