MSU-LGCLL:: A.A.Polikarpov - Cognitive Model of Lexical System Evolution

 

Cognitive Model of Lexical System Evolution

and its Verification

 

Anatoliy A. Polikarpov 


 

Contents


1. Introduction.

2. A Genus of Natural Classification Systems.

3. Language as a Communicative Kind of Natural Classification Systems.

4. Polysemy and Polysensuality.

5. Sources of Polysemic Structure.

6. Basic Abstractivization Assumptions.

7. Word Life Cycle and Processes of Word Semantic Potential Dissipation.

8. Polysemy Trajectory in Time.

9. Initial Formalization of the Model.

10. Evolution of Semasiological Parameters during Word Life Cycle.

10.1. Homonymic Disposition of Words.

10.2. Synonymic Activity of Meanings.

10.3. Phraseological Activity of Meanings.

11. Dynamics of Word's Derivational Potential.

12. Verification of the Model.

12.1. Age of Words and Abstractness of their Meanings.

12.2. Qualitative Drift of New Meanings.

12.3. Age and Polysemy of Words.

12.4. Polysemy of Words and Average Generating Activity of their Free and Bound Meanings.

12.5. Polysemy and Stylistic Markedness of Meanings.

12.6. Polysemy and Average Frequency of Meanings' Use.

12.7. Age of Words and their Synonymic and Phraseological Activity.

12.8. Age of Words and their Homonymic Activity.

12.9. Age of Words and their Derivational Activity.

12.10. Survival Rate of Words of Different Age and other System Features of Words.

13. Final Conclusions.

Footnotes.

References.

© A.A.Polikarpov, 2001

 


1. Introduction.

The aim of the present paper is to develop further some points of a model of the evolution of the lexical subsystem of natural language (presented most explicitly in [Polikarpov, 1993; 1994; 1995; 1998]) and to verify some of its main predictions against experimental data.

One of the general methodological principles of the approach adopted here is that only a prior investigation of the evolutionary driving forces and formative regularities of a typical word (which are predominantly cognitive) can provide a clue to modelling the evolutionary regularities of the entire lexical system of a language. Only on the basis of the dynamic features of micro-units, and of knowledge of some boundary conditions under which the lexical population functions, is it possible to build a theory of the development of the lexicon as a whole.

Moreover, knowledge of the developmental regularities of the lexical system is a key to a systemic understanding of the mechanism of language evolution as a whole. This is because, first, it is lexical units and their relationships that are the source of grammatical (morphological, syntactical and phraseological) units and relationships. Second, we assume that the general regularities of the semantic development of morphemic, phraseological and syntactic units should, in principle, be isomorphic to those of lexical units. The lexical level of the language system provides the most evident information on the regularities of the evolutionary process; it should therefore be examined first and may be regarded as a clear model for the corresponding investigation of other language levels.

 

2. A Genus of Natural Classification Systems.

The approach adopted here is most fundamentally formulated in terms of the concept of Natural Classification Systems. The concept is grounded in ideas of General Systems Theory (G.P.Melnikov) and General Biology (N.A.Bernstein, P.K.Anokhin, G.Quastler). It provides linguistics with the genuinely cognitive basis that the science of language strives for. From this point of view, any living organism, in order to survive, i.e. to keep its relations with its surroundings in balance, must, besides exchanging substance with them, be engaged in tireless classificational activity: (1) recurrently reflecting vitally important peculiarities of its environment in the form of images, (2) combining these images and (3) comparing them. The first of these processes is characterised above all by its selectivity, by the directedness of the organism's reflection of the outer world, in both qualitative and quantitative respects. The flow of information poured in by Nature is thus reduced by extracting the coinciding components of images and omitting the specific ones. Most importantly, this leads to a permanent process of producing new, more abstract images (applicable to a wider variety of vital situations than the initial, more concrete images were), and to the formation and regular change of hierarchically organised subject and aspectual natural classification systems, consisting of various but co-ordinated "pyramids" of abstractions (G.P.Melnikov) in the reflecting sphere of any member of a society. At the base of any individual "pyramid" lies the most numerous set of concrete, immediate images, regularly supplemented and replaced by the individual's new impressions. Each higher level of the pyramid, compared with each lower one, consists of more abstract, "emptier" images, which are less numerous and less dependent on current changes in the individual's life situations.
Nevertheless, changes in, and replacement of, previously elaborated categories inescapably occur on the higher levels of any natural classification as well, only less often than on the lower ones, and only as a result of really significant changes in the life situations of individuals. Only revolutionary events may lead to noticeable changes on the highest, "ideological" levels of the personal classification of any member of a society. The co-ordinated use of images of different degrees of abstractness, together covering the whole sense universe of a living organism, enables it to survive in various life situations and to adapt its behaviour to the features of any current surroundings, even those it encounters for the first time. In that case at least some of the most abstract images of its classification may be useful for general survival orientation and behaviour.

3. Language as a Communicative Kind of Natural Classification Systems Genus.

Natural Language is a specific, communicative kind of the genus "Natural Classification Systems". Language is elaborated in specific, communicative situations. Situations of this kind differ from practical ones in the purpose of the activity performed in them. The aim of behaviour here is not to gain some immediate practical result, as is characteristic of the use of practical classifications in practical situations, but "merely" to help members of a community to exchange their life experience, i.e. to exchange "pieces" of their individual practical classifications, which individuals have elaborated independently and which are very often complementary and mutually useful to partners in their current or future practical activity. This means that communicative situations are in principle secondary and auxiliary to practical ones. Therefore the main parameters of communicative situations depend principally on the main informational parameters of practical situations.

From this point of view, Language is a secondary, auxiliary classification built upon a basic, practical classification system and specifically reflecting it. Language consists of images called meanings. These were initially borrowed directly, in the course of previous communicative practice, from the set of relatively abstract units of a practical classification, called senses, and may have been further elaborated within the language, being specifically transformed during communicative activity.

The "help" of language in the exchange of experience between communicants consists in providing them with a necessary and non-redundant set of meanings. These serve as standard, relatively abstract image means by which a sender can hint to a receiver, in non-standard and even creative situational ways, at useful practical images (senses), which are usually relatively more concrete. Hinting at a sense by means of a meaning is possible on the basis of their similarity, i.e. the coincidence of some of their components. The sender's selection of the meaning most similar to a target sense is the initial step in the hinting-guessing chain formed by the collaborative communicative activity of sender and receiver. A sender can gain access to a receiver's mind because the same group of meanings is usually strongly associated, through previous communicative learning, in both the sender's and the receiver's minds with an image of the same external object, called a sign. When produced by a sender, the sign serves as an intermediate external means in the hinting-guessing chain "sender-sign-receiver".

The correlated set of meanings and of image reflections of signs in the communicants' memory completely covers, i.e. communicatively classifies, the whole sense field of a society and constitutes Language.

Effective sense guessing by a receiver, i.e. a successful search for the proper sense among all those activated by a "received" meaning, is possible only on the basis of an active process of sense prediction carried out in the receiver's mind. The prediction significantly restricts the search space and facilitates the receiver's task of sense selection. Effective sense hinting by a sender, i.e. a successful search for the meaning most similar to the current sense to be "transmitted", is possible on the basis of an even more sophisticated process of prediction in the sender's mind. This process consists in constructing a dynamic model of the receiver's states of mind, comparing the sense pieces already "transmitted" with the total original "sense picture" intended for communication, locating points of uncertainty in the model of the receiver's sense apprehension, selecting a new meaning and sign for hinting at the next point in the receiver's sense picture, and so on.
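The sender's first step, choosing the meaning most similar to a target sense, can be illustrated with a small sketch. It assumes, purely for illustration, that meanings and senses can be written as sets of discrete components and that similarity is measured by the Jaccard index; the component names, the toy lexicon and both function names are hypothetical and are not part of the model.

```python
def similarity(meaning, sense):
    """Jaccard overlap of two component sets (an assumed similarity measure)."""
    union = meaning | sense
    return len(meaning & sense) / len(union) if union else 0.0


def select_meaning(lexicon, target_sense):
    """Return the (sign, meaning) pair whose meaning is most similar to the sense."""
    candidates = (
        (sign, meaning)
        for sign, meanings in lexicon.items()
        for meaning in meanings
    )
    return max(candidates, key=lambda pair: similarity(pair[1], target_sense))


# Hypothetical toy lexicon: each sign carries one or more meanings.
lexicon = {
    "bird": [frozenset({"animate", "winged", "feathered"})],
    "plane": [
        frozenset({"artefact", "winged", "flies"}),
        frozenset({"flat", "surface"}),
    ],
}

# A concrete sense the sender wants to hint at; it is richer than any meaning.
sense = frozenset({"artefact", "winged", "flies", "military", "jet"})

sign, meaning = select_meaning(lexicon, sense)
print(sign, sorted(meaning))
```

The chosen meaning shares only part of its components with the sense, so the receiver still has to guess the remaining, more concrete components from the situation.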

4. Polysemy and Polysensuality.

There are more meanings than signs. So every sign is, at least potentially, polysemous: it possesses some degree of semantic uncertainty and consequently presents some difficulty for a receiver who, having recognised the sign, must select among all its meanings the one relevant to the current situation of the communicative act. The simplest natural measure of this difficulty is the logarithm of the number of different meanings possessed by a sign, or Shannon's entropy measure. To use the latter, one should take into account not only the number of meanings but also their different activity, i.e. their probability of being used. As an empirical correlate of this probability, the frequency of sign use in some text material can be taken.

There are fewer meanings than senses. This means that each meaning is, in principle, polysensual: for a receiver it possesses some amount of denotative (referential) uncertainty, measured, as in the previous case, on a logarithmic or Shannon entropy scale for the variety of senses covered by the meaning (and the activity of their use).
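The two measures mentioned in this section can be computed as follows. This is a minimal sketch: the usage counts are invented for illustration, the function names are hypothetical, and base-2 logarithms (bits) are an assumed choice of unit.

```python
import math


def log_uncertainty(n_alternatives):
    """Logarithmic measure: log2 of the number of alternatives (meanings or senses)."""
    return math.log2(n_alternatives)


def shannon_entropy(counts):
    """Shannon entropy in bits, with usage counts as estimates of probabilities."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)


# Hypothetical usage counts of the four meanings of one sign in a text sample.
counts = [70, 20, 5, 5]

print(log_uncertainty(len(counts)))  # depends only on the number of meanings
print(shannon_entropy(counts))       # also reflects how unevenly they are used
```

When all meanings are used equally often the two measures coincide; the more uneven the use, the lower the entropy falls below the logarithmic measure.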

5. Sources of Polysemic Structure.

The most important distinctive feature of the classification structures formed by the meanings of language signs, as opposed to extralinguistic (practical) ones, is that the meanings of the same sign are combined within its polysemic structure not on the basis of essential (hierarchical) relations between them, as observed in the "pyramids" of practical natural classification systems, but on the basis of casual relations of any kind of similarity. The casual nature of the relations between a sign's meanings reflects the hinting-guessing principle of meaning-sense relationships, which underlies the very possibility of communication. "Reflects", because the source of new meanings for a sign ultimately lies in the sense sphere. Every new meaning is a former sense attracted at some time by a sign to fulfil a new function, that of an intermediate means of hinting at some target senses. But the general direction of qualitative differentiation within the succession of meanings acquired one by one by a sign in its history should, as a tendency, be the same as in the case of hierarchical relations between notions (senses) in a practical classification: from some relatively specific, concrete initial meaning of a sign to successively more abstract ones. This is a result of a basic semiotic preference principle in the acquisition of new sign meanings, which is important for sign survival (see below).

The same direction of gradual qualitative change is characteristic of the individual history of any meaning (see also below).

6. Basic Abstractivization Assumptions.

Assumptions about the most probable direction of the semantic development of any word (as well as of any sign of other language levels) in its history -

(1) from relative concreteness, objectness of its initial meaning towards gradually increasing abstractness of each successive meaning and

(2) from relatively concrete quality of some meaning at the beginning of its appearance in language towards its gradually increasing abstractness -

are key points of a system of ideas that has come to be called "a model of word life cycle" [Polikarpov, 1988; 1990; 1991; 1993; 1994; 1994; 1995; 1997]. The model is now developed, first, towards explaining these assumptions as natural consequences of certain selective mechanisms acting in language communication. Second, some new aspects are added to the model, mainly concerning semasiological and word-formational regularities of the evolution of the whole lexical system.

The basic abstractivization assumptions can be explained within the framework of even more fundamental ideas about the nature of a language sign and its realisation during its life. The starting point of the model of word life cycle is the assumption that any word (and any sign of any other level) in a language possesses an associative-semantic (or simply semantic) potential. The potential is manifested in the ability of each of the word's meanings (an image, by its basic nature) to interact associatively with other meanings and with extralinguistic senses (which are also images by nature) present in the human mind. Any kind of association is based on a certain type of similarity between images, i.e. on the coincidence of their features or components. This versatile ability of the human brain, as mentioned above, first of all provides the basis for linguistic communication: it allows communicants to hint at a broad area of senses (extralinguistic images) by means of a limited number of meanings (linguistic images), which are in principle polysensual, and by a still more limited number of linguistic signs of different levels (morphemic, lexical, phraseological, syntactic), which are in principle polysemantic.

The associative ability of the human brain is also important for understanding the principal mechanism of each sign's semantic development and, accordingly, the main lines of a possible mechanism for the historical development of the whole set of signs, the entire language system. This development involves assigning to some senses the status of meanings. In the course of communicative acts, some of the associative links that have developed between a sign's meaning and a sense are strengthened and socialised (through the mutual teaching of communicants), so that the former sense can subsequently be used not as a communicative goal but as a means of hinting at other senses, i.e. as a new meaning.

Moreover, two tendencies should act in the same direction in a word's semantic history: (1) to acquire, at each next step of the word's semantic development, meanings that are relatively more abstract than their parent ones, and (2) to have each meaning slightly transformed during its own life, also in the direction of greater abstractness.

These general tendencies are explained as follows. First, the acquisition of additional meanings is useful to a word sign in its struggle for existence: as an additional functional load, it increases the sign's necessity for the language system in communicative acts. It thus makes the sign safer in the redistribution of functions between existing signs (which occurs from time to time in a language) and less vulnerable to changes in the outer, sense world (such as the disappearance of a previously denoted object and, as a result, the going out of use of the corresponding sense or group of senses covered by some meaning of the word).

Second, exactly the same reason, the growing chances of a sign to survive, underlies the predicted qualitative tendency for each successive meaning of a word to be more abstract than its parent meaning. A more abstract meaning covers a broader referential sphere than a less abstract one. Therefore it, and the sign possessing it, can be more stable and more independent of possible changes in the realm of senses that follow when some kinds of previously denoted objects go out of use, and of possible redistribution of functions within the language system.

Third, every member of a speech community may use any meaning slightly differently from other communicants in terms of the sense sphere covered by the meaning and, consequently, in terms of the features reflected in it. As a result of constant natural mutual teaching, every member of the community tends to widen over time his view of the sense sphere covered by the same meaning. The widening of the sense sphere naturally leads to the disappearance of some components of the meaning as irrelevant for the wider sphere. This produces a tendency towards abstractivization of a meaning in the majority of the communicants' individual language systems during the meaning's own time of functioning, and eventually in the language as a whole.

If a meaning's own qualitative development starts from some relatively abstract level (reached by the former sense before it became a new meaning), then it has a greater chance than other meanings of being transformed with time into a meaning of a syntactic type.

7. Word Life Cycle and Processes of Word Semantic Potential Dissipation.

As a result of these tendencies, each sign, originally introduced into a language (or produced in it) to designate, as a rule, a single, mainly specific, object-oriented meaning, may gradually realise, redistribute and eventually dissipate its semantic potential (initially present in its first meaning). This happens in the course of further use of the sign, through the successive and parallel generation of new meanings from the first and subsequent meanings, and through the successive loss of previously acquired meanings (approximately in the order in which they appeared in the history of the word; see below).

This means that, as a word "generates" new meanings, some semantic components of a particular parent meaning turn out to be already engaged, to have spent their associative potential; i.e., every meaning becomes less able to "generate" new meanings after giving birth to the next one. But since each new meaning begins its own, relatively weaker, generative history, we must conclude that what we observe here is, first of all, a redistribution of the generating ability between a parent meaning and a new one. The parent meaning spends part of this ability by transmitting it to the next meaning. The next meaning acquires this part and begins its own process of spending and transmitting part of its generating ability to new, later-born meanings, and so on.

So, to understand the whole "generating" process, we need to take into account that each emerging new meaning of a word has, on average, fewer features than its predecessor, and therefore lower associative "energy" and a lower "propensity" to generate further new meanings. Thus the formation of a word's new meanings should gradually weaken and stop completely at a certain point in the word's life, when the semantic potential of the last meaning to appear is too small to sustain even a minimal generating process. The loss of components by a meaning during its own functioning in the language contributes additionally to the gradual decline of the generative activity of each next meaning.

The general dynamics of the realisation of a word's semantic potential is also determined by the fact that each newly emerging meaning has a certain expectancy of subsequent extinction. On average, this expectancy should be highest for the first, most concrete meaning, and with each subsequent, more abstract meaning it should gradually decline, reaching a minimum (while still remaining non-zero) for the last meaning. The extinction expectancy of any meaning is necessarily realised in due course.

This prediction is also based on the concept of the component structure of a meaning and on the steady development of subsequent meanings towards increasing "emptiness", or abstractness, and hence the possibility of using each subsequent meaning to refer to an increasingly broad area of senses. This is what predetermines the ever-decreasing dependence of each such meaning on related extralinguistic senses and its ever-decreasing vulnerability to changes in the extralinguistic world (such as the extinction of a sense due to the disappearance of objects formerly denoted by it).

The vulnerability of meanings in their succession cannot decrease indefinitely, because the successive increase in their level of abstractness must have a certain limit.

Thus the redistribution of the semantic potential within the semantic (polysemic) structure of a word is inescapably followed by its gradual loss (realised as the loss of meanings), leading to the complete exhaustion of the potential by the end of the word's life cycle (when the semantic potential of the last, on average most abstract, meaning eventually falls below what is necessary to give birth to even a single further meaning).

The whole process of redistribution and loss of semantic potential can be called dissipation of semantic potential.

Dissipation of a word's semantic potential also results from each meaning's predisposition to its own abstractivization during its lifetime. But the contribution of this factor to the overall development of the word's semantic potential is much smaller and less noticeable than that of the more basic factor, the gradually growing abstractness of each successive meaning compared with the previous one in the history of the word. That is why this factor has not yet been used in the further attempt to formalise the word's semantic development.

 

8. Polysemy Trajectory in Time.

The combination of the two processes, (1) increasingly retarded acquisition of meanings and (2) accumulation, with time, of lost meanings, should result in a characteristic asymmetric bell-shaped pattern of polysemy size development for a typical word during its lifetime (see Fig. 1).

Figure 1

  

T - physical time

Pt - polysemy at a certain period of time of word's existence

This pattern follows from the fact that, up to a certain point in a word's development, the number of newly acquired meanings should exceed the number of lost ones. During this period the word's polysemy grows, though at a constantly slowing rate. It reaches its maximum at the point where the two processes attain equilibrium. At later stages, as the acquisition of new meanings continues to die down (until it halts completely) and the formerly accumulated meanings show ever-growing "life fatigue", we expect a gradual, prolonged reduction of the word's polysemy. With the last meaning going out of use, the word dies as a whole.
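The interplay of slowing acquisition and delayed loss can be illustrated with a deliberately crude deterministic sketch. Everything specific in it is an assumption made for illustration only: meanings form a single chain, each new meaning keeps a fixed share (`ratio`) of its parent's components, and both the wait before the next birth and the lifetime of a meaning are taken to be inversely proportional to the number of components. The parameter values are arbitrary and the function names are hypothetical.

```python
def meaning_schedule(m0=20.0, ratio=0.7, birth_scale=10.0,
                     life_scale=60.0, min_components=1.0):
    """Birth and death times of the successive meanings of one word.

    Assumed rules of this sketch: meaning k has m0 * ratio**k components;
    generation stops once a meaning would have fewer than `min_components`.
    """
    births, deaths = [], []
    t, m = 0.0, m0
    while m >= min_components:
        births.append(t)
        deaths.append(t + life_scale / m)  # fewer components: longer life
        t += birth_scale / m               # fewer components: later next birth
        m *= ratio                         # the next meaning is more abstract
    return births, deaths


def polysemy_at(t, births, deaths):
    """Number of meanings that are alive at time t."""
    return sum(1 for b, d in zip(births, deaths) if b <= t < d)


births, deaths = meaning_schedule()
for t in range(0, 80, 8):
    print(t, polysemy_at(t, births, deaths))
```

With these arbitrary parameters the count of living meanings rises within the first few time units and then declines over a much longer period, which is the asymmetry described above; a smooth bell shape would require averaging over many words or a probabilistic variant of the same rules.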

It may either disappear from the language altogether, or become incorporated as a bound stem or an affix in some derivative or compound word, or be used as a lexical constituent within some phraseological expression. The possibility of a word being transformed into an affix is connected with the high degree of abstractness of the word's last meanings, which is compatible with the classifying function of an affix.

Processes of grammaticalisation, i.e. the transformation of syntactic constructions into word-forms and of phraseological units into compound words, occur regularly, following the regular need for new words and affixes in a language. The need results from the increasing popularity, i.e. the rising frequency of use, of some meaning previously denoted by a combination of words or by a syntactic construction. In both cases the form of expression should contract, according to the general law of abbreviation [Zipf, 1935], yielding a new word or a new form and transforming a phraseological meaning into a lexical one and a syntactic meaning into a morphological one.

Thus, in parallel with the dying out of some words, one can observe the renewal of the whole stock of lexical signs (and signs of other levels) in a language.

 

9. Initial Formalisation of the Model.

The general outline proposed above can be further refined and formalised in the following way.

The semantic potential of any meaning is determined basically by the degree of its complexity, i.e. its richness in components. The semantic potential of the whole word is determined by the corresponding characteristics of the first meaning, which gives birth to the second and to the other meanings directly or indirectly derived from it. Each component of any meaning has its own level of activity (energy) and its own level of instability (vulnerability). The activity (resp. instability) of each meaning can be estimated as the sum of the activities (resp. instabilities) of its constituent components.

At the present stage of investigation we make a simplifying assumption on the equality of all components of a meaning in terms of their basic characteristics (such as activity and instability). As a result, we can estimate activity (resp. instability) of each meaning by measuring the number of its constituent features (semantic components) when considering its component structure at some appropriate level.

The last remark on "appropriate level" was made in connection with the possibility of multiple ways and levels of such consideration. This means that we postulate a potentially unlimited number of levels in the meaning's component structure, and the possibility of unlimited deepening into the microstructure of its components, and eventually - the possibility of dual representation of the component structure - both discrete and continual.

With the development of all later meanings towards greater abstractness, i.e. with reduction of the number of components present in them, the generating activity and the level of instability of each consecutive meaning should steadily decrease.

On the basis of these general assumptions one can put forward several suppositions about the more concrete form of the dependence between the "internal" development (abstractivisation) of meanings and their "external" development, i.e. the development of their ability to generate new meanings and of their propensity to extinction. Several such more specific suppositions have to be considered, because few of them can be reliably preferred at present without examining the consequences of each one in turn and verifying them against available empirical data. Another way of arriving at significant conclusions is to build a wider (and deeper) system of basic assumptions (or of theoretical notions and principles already partly supported by evidence from adjacent fields of knowledge), taking into account a larger number of factors relevant to the functioning of the object under consideration. Conclusions drawn from more general theoretical assumptions yield an exact model of a particular kind of object only if those assumptions, and the boundary conditions specifying how the general principles are realised, are correct and sufficient for obtaining the result. After being tested against specific experimental data, a particular model (as a hypothetical construction) acquires the status of a theory, of a law explaining the observed phenomena [Bunge, 1976; Altman, 1993; 1997]. When there are several competing models (for lack of verified fundamental theoretical knowledge), they are tested in some specific order until a satisfactory one is found.

To show a possible way of building a more formalized model, let us clarify some of our original assumptions. They are, first, the following.

(a) We assume that all components within a meaning are equal in their qualities (of different kinds), e.g. in the duration of their lifetime. This simplifying assumption may in future reveal some inconsistencies in its finer consequences, but at this stage of modelling it is acceptable;

(b) We assume that in equal intervals of time (Δt) the relative quantity of components that each meaning loses in the process of abstractivisation (as compared to its previous state) remains the same.

A more formalized presentation of the assumption is the following:

(1)  $\Delta t = K'$

(2)  $\dfrac{\Delta M}{M} = K''$

where t is the lifetime of a given (consecutive) meaning of a word;

M is the total number of components in a given meaning;

ΔM is the change in the number of components of a meaning as compared to its previous state (negative, since components are lost);

K' and -K'' are constants.

By introducing a factor of proportionality n such that n = - K''/ K' we can further write that K''= -nK'. Substituting values for K'' and K' we obtain

(3)  $\dfrac{\Delta M}{M} = -n\,\Delta t$

Proceeding from the possibility of a continual representation of the structure of image objects, it is quite natural to assume that ΔM and Δt may be regarded as infinitely small quantities. In this case formula (3) can be written as a differential equation

(4)  $\dfrac{dM}{M} = -n\,dt$

It follows that

(5)  $M_t = M_0\,e^{-nt}$

The number of components in a meaning thus decreases exponentially in time. The greater the coefficient n, the higher the rate of the meaning's abstractivisation. ln M is a measure of the meaning's complexity.

Duration of lifetime of a meaning can be expressed in the following way:

(6)  $t = \dfrac{1}{n}\,(\ln M_0 - \ln M_t)$

Here $\ln M_0 - \ln M_t$ can be interpreted as a quantitative estimate of the loss of a meaning's complexity on the way from its emergence to the current point in time.

As can be seen, the whole lifetime of a given meaning is proportional to the loss of complexity, measured logarithmically, on the way from the initial state to the current one. In other words, the less complex a meaning becomes, as compared to its initial state, the longer its life expectancy.

It can also be interpreted in terms of the relative ability of a meaning to survive in time as compared with some normative meaning: the greater the difference in complexity between a meaning and the normative one, the longer its life expectancy should be.

As can be seen, we have here the logarithmic law of relationship between the two quantities - lifetime of a meaning and the number of features present in it.
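The logarithmic relationship can be checked numerically. The sketch below assumes that formulas (5) and (6) have the forms M_t = M_0·exp(-nt) and t = (ln M_0 - ln M_t)/n, which is how they follow from the differential equation dM/M = -n dt; the function names and the parameter values are hypothetical.

```python
import math


def components_at(m0, n, t):
    """Components left after time t under the assumed exponential decay (5)."""
    return m0 * math.exp(-n * t)


def time_to_reach(m0, mt, n):
    """Time for a meaning to shrink from m0 to mt components, assumed form of (6)."""
    return (math.log(m0) - math.log(mt)) / n


m0, n = 20.0, 0.1  # hypothetical initial complexity and abstractivisation rate

t_half = time_to_reach(m0, m0 / 2, n)     # time to lose half of the components
print(t_half)
print(components_at(m0, n, t_half))       # recovers m0 / 2 up to rounding
```

Under these assumptions the time needed to halve the number of components is ln 2 / n regardless of the starting complexity, so each further halving adds the same amount of time.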

Another, power form of this dependence is obtained if one of the original postulates is not Δt = Const, but

(7)  $\dfrac{\Delta t}{t} = K'$

Moreover, hypothesis (1) and (7) can be combined in a more generalized form:

(8)  $\dfrac{\Delta t}{t^{z}} = K'$

where z is a certain factor (0 ≤ z ≤ 1). At some values of z (z ≤ 1) we have a power dependence, and at z = 0 a logarithmic one. Only testing under a variety of conditions can provide evidence for the relevance of either variant, or even of both of them under different conditions within their spectrum in some system (as happened in the history of the "psychophysical law" in psychology [Zabrodin, 1981]).
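The two limiting cases can be worked out explicitly. The derivation below is a sketch resting on two assumptions that are not stated in this form in the text: that the generalized postulate reads Δt / t^z = K' while the relative loss of components per step stays constant (ΔM/M = K'' = -nK'), and that for z = 1 the integration starts from some reference time t_0 > 0, a symbol introduced here only for illustration.

```latex
% Assumed generalized postulate, combined with \Delta M / M = K'' = -n K':
\[ \frac{\Delta t}{t^{z}} = K'
   \quad\Longrightarrow\quad
   \frac{dM}{M} = -n\,\frac{dt}{t^{z}} \]

% z = 0: integrating from 0 to t gives the logarithmic law
\[ \ln M_0 - \ln M_t = n\,t \]

% z = 1: integrating from t_0 to t gives a power law
\[ \ln M_0 - \ln M_t = n\,(\ln t - \ln t_0)
   \quad\Longrightarrow\quad
   t = t_0 \left( \frac{M_0}{M_t} \right)^{1/n} \]
```

In the first case lifetime grows with the logarithm of the ratio M_0 / M_t, and in the second with a power of that ratio, which parallels the logarithmic and power forms of the psychophysical law.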

On similar grounds, postulates can then be put forward concerning the gradually changing activity of each successive meaning in the semantic generative process.

As a result of some calculations we can show that instability and activity of any meaning at any step of word's existence are mutually proportional in their quantitative value, as dependent on the same factor - number of components in a meaning.

On the same basis, it becomes possible to design various types of curves for the historical growth and reduction of polysemy by combining various probabilities of loss and acquisition of meanings at each stage of a word's development, as well as to calculate types of curves for the historical dynamics of a word's overall semantic volume, its frequency, length, etc. (Compare with another approach [Krylov, 1995]).
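One way such a curve could be produced is sketched below. It is not the formalization used in the model itself: it is a deterministic expected-value toy, assuming that each successive generation of meanings keeps a fixed share of the components of the previous one, that both activity and instability are proportional to that share, and that the last generation generates nothing. All parameter names and values (`birth`, `death`, `ratio`, `generations`) are hypothetical.

```python
def polysemy_curve(steps=3000, generations=6, birth=0.12, death=0.10, ratio=0.5):
    """Expected number of living meanings of one word at each time step.

    Generation k keeps a share ratio**k of the initial components.
    Its chance of producing a meaning of generation k + 1 (activity) and
    its chance of dying (instability) are both scaled by that share.
    The last generation produces nothing: the semantic potential is spent.
    """
    alive = [1.0] + [0.0] * (generations - 1)   # one initial meaning
    curve = [sum(alive)]
    for _ in range(steps):
        nxt = alive[:]
        for k, n_k in enumerate(alive):
            share = ratio ** k
            nxt[k] -= death * share * n_k            # losses of meanings
            if k + 1 < generations:
                nxt[k + 1] += birth * share * n_k    # acquisitions of meanings
        alive = nxt
        curve.append(sum(alive))
    return curve

curve = polysemy_curve()
peak = max(range(len(curve)), key=curve.__getitem__)
print(f"peak polysemy {curve[peak]:.2f} at step {peak}, final {curve[-1]:.3f}")
```

With these illustrative values the curve rises, reaches a maximum and then declines slowly, giving an asymmetric shape; other combinations of loss and acquisition probabilities give other curve types.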

Moreover, given some assumptions about the law by which words entering a language in any period of time are distributed according to the amount of their semantic potential, it is possible to predict the proportion of words from different polysemy, age, length, etc. zones in a momentary distribution for that language.

However, it should be taken into account that calculations of the shares of words having a certain number of meanings at each particular moment, as well as of the proportion, among words of various polysemy, of those belonging to different age, grammatical, frequency, formational, length, etc. groups, depend on the general mode, stationary or non-stationary, of the language evolution process under consideration.

If in a particular language a communicative-evolutionary process proceeds in a stationary mode (i.e. with a relatively invariable size of sense field requirements imposed on language speakers from the outside, and under invariable conditions of the language's typological structure and therefore a practically invariable number of words in the overall lexicon, which leads to a balance between the number of words and meanings entering and leaving the language in each successive period of time), then it is possible to model the process with a simpler variant of relations than in the case of the non-stationary mode.

Whatever the mode of the process, we can assume first of all that each group of words entering a language during a particular period of time is distributed in terms of semantic potential volume according to either (1) a hyperbolic or (2) an exponential law, and that the same therefore holds for the activity and instability of their original meanings (which determine the whole semantic history of a word). Both variants reflect the empirical fact that at any time words that are extremely inactive and not disposed to longevity emerge preferentially. Words disposed to a greater degree of active or stable existence are less widespread. The greater the degree of either kind of disposition, the less probable is the appearance of a word with features of that level.

Under either assumption one can arrive at a set of conclusions on the form of the momentary distribution of words in a language in terms of their age, polysemy, synonymic and phraseological activity, morphemic structure, length, etc.
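The two candidate laws can be compared with a small sketch. It assumes, for illustration only, that semantic potential takes integer values 1..K, that the hyperbolic law means probabilities proportional to 1/k, and that the exponential law means probabilities proportional to exp(-rate·k). The function name, K and `rate` are hypothetical.

```python
import math

def potential_weights(max_potential: int, law: str, rate: float = 0.5) -> list[float]:
    """Normalised probabilities for semantic potential values 1..max_potential."""
    if law == "hyperbolic":
        raw = [1.0 / k for k in range(1, max_potential + 1)]
    elif law == "exponential":
        raw = [math.exp(-rate * k) for k in range(1, max_potential + 1)]
    else:
        raise ValueError(f"unknown law: {law}")
    total = sum(raw)
    return [w / total for w in raw]

for law in ("hyperbolic", "exponential"):
    weights = potential_weights(10, law)
    print(law, [round(w, 3) for w in weights[:4]])
```

Both variants make words of low potential the most probable newcomers; they differ in how quickly the probability of high-potential words falls off.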

One attempt to construct empirically an equation describing the momentary distribution of word polysemy in various languages was undertaken in [Polikarpov, 1987]. It was shown that the equation can be used to describe typological differences between the lexicons of relatively analytic and synthetic languages [Polikarpov, 1987] and to describe the type (relative size) of the whole vocabulary under consideration (i.e., the total number of meanings covered by a certain vocabulary). For this parameter see [Polikarpov, 1987; Polikarpov, Kurlov, 1994].

There is an open field for professional mathematicians for this kind of modeling of quantitative relations between parameters of language within its history.

A more complicated problem is modelling the non-stationary situation of language existence, e.g. the situation of steadily extending (resp., reducing) sense field of a society (i.e., the field of the society's sense requirements) or the situation of changing typological structure of language (namely, its development either towards analyticity, with inevitable reduction of the language's basic vocabulary and its grammar, or towards synthetism, with build-up of a broader lexemic vocabulary and more sophisticated grammar than before).

As for the situation of progressive extension of the society's sense field, it should be pointed out, first of all, that, on the whole, this should result in general inhibition of the evolutionary process of lexical units. This is due to the fact that the need for designation of a broadening range of senses in a language should in the most conspicuous way lead to the constant growth of the number of meanings and sign units of a language's vocabulary. Given the relatively invariable natural norm of communication (limited by physiological abilities and needs of communicants), this fact results eventually in constant reduction of average relative frequency of use of the language's meanings and lexical units, in slower, on the average, turnover of any unit, in slower rate of word life cycle, and, finally, in slower general renewal of the whole language vocabulary.

Besides, and this is of no less importance, if the language's sense field is steadily extending then we have a steadily changing proportion of words being at various stages of realisation of their semantic potential (compared to the analogous proportion for the stationary mode of language functioning).

On the other hand, in the case of steadily narrowing scope of sense functions of language we have a progressive reduction of vocabulary size, increase in the average word occurrence rate, in the turnover rate of each unit of the narrowed lexical set, etc.

It is even more evident that the transition of a language to a more analytical (or, on the contrary, more synthetic) structure will result in reduction (or, respectively, extension) of the basic vocabulary, which leads in its turn to acceleration (retardation) of the life cycle of an average lexical unit, to growth (decline) of average polysemy, to a higher (lower) maximum of polysemy, and to faster (slower) going out of use of lexical units that previously entered the language. Precisely this last feature, the inescapably different tempo at which units are lost from a lexicon, was not taken into account in the glottochronological model, which led to some noticeable inaccuracies in the predictions by M. Swadesh (e.g., for Norwegian and Icelandic).

Analysis of the non-stationary mode of vocabulary functioning at the stage of initiated (and yet unfinished) transition to a more analytic (or, on the contrary, synthetic) structure, as well as the non-stationary mode combining steady change in the sense field volume of a language (narrowing or broadening) and in its typological status (e.g., "immediate" emergence of the so-called pidgins and their possible consecutive development into creoles) raises further questions and should be undertaken in a further special work. (1)

 

10. Evolution of Semasiological Parameters during Word Life Cycle.

In addition to word's polysemy development (in qualitative and quantitative aspects) the entire system of its semantic (semasiological) and grammatical characteristics should undergo transformation with time.

First of all, specific development of its meanings' ability to enter into set expressions (generation of phraseologically bound meanings), to enter into synonymic and homonymic relations should be mentioned. Parameters describing engagement of word's meanings in these relations are of fundamental semiotic nature. Homonymy, along with, and in addition to polysemy, characterises word's multi-meaning character, variability of it in semantic aspect. Synonymy, along with doubletism, characterises variability of means for expression of the same senses. Finally, phraseologically bound meanings of a word serve to correlate it with the units of the next, phraseological level.

 

10.1. Homonymic Disposition of Words.

Ability of a word to be engaged in homonymic relationships may be related to its advanced position on the way of realisation of its semantic potential. However, the picture is rather intricate here, because there exist two immediate sources of homonymy: (1) occasional coincidence of sounding (or spelling) of two words as a result of their shortening and elimination of phonetic (or graphical) elements by which they formerly differed (even in the case of coincidence of some borrowed and genuine words probability for this depends on the length of words and, eventually, on the intensity of processes of contraction in the language borrowing units) and (2) splitting of a formerly whole word into two homonyms as a result of loss of some link meaning in the chain of consecutive polysemy development and resulting break-up of relations between remaining meanings.

The combination of these two sources in language predetermines a complicated dependence of the resulting homonymic activity of a word on the length of time it has existed in the language.

The first source should predetermine concentration of this kind of homonyms at the maximum of frequency of a word's use. This maximum should be located later in the word life cycle than the maximum of polysemy, because the frequency of a word should be proportional to the width of the sense area covered by all of the word's meanings taken together. At the same time, it can be calculated that while incoming and outgoing meanings are in equilibrium, earlier, relatively concrete meanings are being replaced by more abstract ones, which inescapably widens the word's overall referential area and raises its frequency of use for some time after the maximum of polysemy.

The second source should predetermine the concentration of this kind of homonyms in the area of a word's polysemy located before its maximum, because initial meanings there form chains of successive meanings more often than the clusters typical of later stages of a word's semantic development. Other things being equal, chains are more prone to disintegration than clusters.

Moreover, initial meanings are usually more concrete than later meanings and are therefore more likely to go out of use (owing to their greater instability). Both factors lead to more frequent break-up of the polysemy structure for words at earlier, oligosemantic stages of their development than at later ones.

 

10.2. Synonymic Activity of Meanings.

The age correspondence of word meanings' ability to enter into synonymic relations can be predicted as follows.

Synonymy, by its most fundamental definition, consists in the coincidence of two (or more) meanings of different words in almost all their components, excluding only a small number that can be neutralised under some conditions (D. Shmeliov). Every meaning has a certain objective probability of coinciding with any other meaning of the language in (almost) all components. This probability depends mainly on the meanings' degree of abstractness, that is, on their richness in components. The more abstract a meaning is, the more likely it is to meet another, "almost the same" meaning.

This situation depends principally on the greater repetition of paths of abstractivisation and on the higher rate at which components recurring in different situations are preserved by the meanings of different words on their way to a more abstract shape. This leads to a gradual growth in the degree of likeness between meanings at each successive level of abstractness, and thus to an increase in the objective probability that they coincide or become synonymous.

So, we predict the gradual growth of average synonymic activity of meanings during all life time of words possessing them.

 

10.3. Phraseological Activity of Meanings.

The connection of the degree of phraseological activity of words with their age is predicted in the following way. The ability of word's meaning for such relations is dependent basically on the wealth of features inherent in it. The most rich in features are the initial meanings displaying the strongest material character. Any physical object in nature is practically inexhaustible in its characteristics and, consequently, in features reflected in the meaning. However, due to its highly specific character every such meaning has rather narrow compatibility with meanings of other words, i.e. more limited objective external basis for entering into phraseological relations with meanings of other words.

Historically subsequent, more abstract meanings already have a poorer set of features (a fact that in itself restricts the potential ability of such meanings for phraseologisation), but on this basis they simultaneously acquire a greater ability to enter into syntagmatic relations with a range of meanings of other words (among which there often occur those available for forming new phraseological units). If the positive factor of extended compatibility acts in this process more strongly than the negative factor of impoverishment of meanings' features, then the phraseological activity of meanings has to grow in the course of a word's ageing and the initial growth of its polysemy.

Will the activity grow infinitely? Obviously not, because as meanings move towards extreme abstractness (syntacticity), their extreme potential referential broadness and universal versatility cannot, at some late stages of this process, be matched by a proportional rise in their contextual diversity. This is explained by the fact that the size of the whole language vocabulary providing candidates for compatibility is finite.

As a result, starting from a certain period of semantic history the contribution of the context factor will begin to decrease considerably, leading to growing importance of the opposite factor and resulting in a decline in the average number of phraseological meanings "generated" by each free meaning that appears at the very end of a word's semantic history.

 

11. Dynamics of Word's Derivational Potential.

 In a similar way to the trends in word's polysemy development the character of realisation of word's derivational potential can also be predicted. This ability of a word also fundamentally depends on its semantic potential. More exactly - on words' categorial-semantic potential.

This means that words belonging to different grammatical categories (different parts of speech) should differ in how active they are in generating new words and in how stable they are against various life circumstances. The general gradation here is the same as in the case of different kinds of meanings: according to their degree of categorial abstractness. The most concrete in their categorial semantics are nouns; less concrete are adjectives, adverbs and verbs. Further come semisyntactic and purely syntactic words: numerals, pronouns, conjunctions, prepositions. According to the degree of this characteristic they should show different generative ability and longevity.

As a specific unit of this process one can consider a group of words derivationally related.

Here, the prediction is that, like in the case of polysemy development, a special balance of increments and losses should exist in the word formation process. Within derivational group of words those words which were produced on earlier steps of the process should be, on the whole, more active and more vulnerable than words on further steps of it. Build-up of derivatives increasingly inhibited in time and accumulation of extinct derivatives should produce an asymmetric bell-shaped time curve of derivational volume for a group of words derivationally related during group ageing similar to that of polysemy of a word during its ageing.

 

12. Verification of the Model.

At the moment, direct checking of predictions about the historical semantic development of individual words looks unlikely in many cases, because we do not have sufficiently complete data on the historical acquisition and loss of meanings (including the loss of meanings' components) and derivatives for any relatively representative group of words. The more distant the point in history, the smaller the amount of reliable semantic facts we are able to obtain. However, numerous facts support the plausibility of the hypothesis: the stable existence of many syntactic words for centuries and even millennia, the higher vulnerability of other words, especially nouns, and the higher probability of replacement for the youngest words, which are in fact less polysemic, and for words with mostly object-oriented meanings. But this is not sufficient for converting a hypothesis into a theory. Only systematic qualitative investigation of representative portions of language data can be considered appropriate. One of the most important principles here is to replace the study of different stages in the evolution of particular words with the study of the synchronic results of evolution for a greater number of words from different age categories. In other words, we are going to replace a historical picture by fragments from different actually existing pictures reflecting different stages of the same typical process. Of course, this is possible only if we assume that each word (a unit of the lexical population) is not absolutely unique, but follows some general tendencies. That is why the consideration of different parameters of words from different age categories is the main aspect of our experimental testing of the presented model of lexical system evolution.

 

12.1. Age of Words and Abstractness of their Meanings.

This is a crucial question for the whole model because many other predictions made here are based on it. The predominantly "material" character of the original meaning of really new words (not derivatives of already existing words, i.e. not their derivational variants) and its regular development in each consecutive meaning towards ever growing abstractness has been repeatedly mentioned in linguistic works as a hypothetical tendency (cf, e.g., in the modern period the proposition by S. Ullman [1963] and N.D. Arutyunova [1976, p.163, 165]). But, to the best of our knowledge, it was not tested using at least minimally representative data. To elucidate this issue, two series of experiments were undertaken.

In the first one, carried out together with A.V. Andreevskaja, from the above-mentioned corpus of 4185 root words of the modern Russian language a sample of 521 autosemantic words was taken for testing.

The qualitative type of the words' meanings under consideration was determined in the following way. Initially each meaning was assigned the code of its semantic field according to the thesaurus classification of R. Hallig and W. Wartburg [Hallig and Wartburg, 1952]. For the purposes of our analysis the classification categories were further arranged in three main groups: a) the group of material-concrete meanings (names of utensils, implements of labour, plants, animals, etc.); b) the group of abstract meanings (names of main types of logical relations, mental and psychological phenomena, temporal and spatial relations, etc.); c) the intermediate group not belonging definitely to either of the two opposite groups.

The following age groups were singled out:

- "1" - the oldest words of Indo-European or earlier origin;

- "2" - words of Common Slavonic origin (that emerged in the parent language after the beginning of the 1st millennium B.C. and before the 6th century A.D.);

- "3" - Old Russian words (that emerged in the language in the 7th through 14th centuries);

- "4" - words that emerged in the 15th through 17th centuries;

- "5" - words that emerged in the 18th century;

- "6" - words that emerged in the 19th century;

- "7" - words that emerged in the 20th century.

 

Table 1 gives an idea of proportion of meanings of variously aged words in each of the qualitative categories.

 

Table 1

Semantic quality of Russian autosemantic root words' meanings as a function of words' age

| Age period of words | Overall number of word meanings | Abstract, % | Intermediate, % | Concrete, % |
|---|---|---|---|---|
| 1 | 61 | 96.80 | 0 | 3.20 |
| 2 | 289 | 16.54 | 23.62 | 59.84 |
| 3 | 112 | 13.59 | 15.53 | 70.87 |
| 4 | 102 | 4.90 | 13.72 | 81.37 |
| 5 | 188 | 9.04 | 20.21 | 70.45 |
| 6 | 149 | 6.04 | 20.13 | 73.83 |
| 7 | 59 | 5.08 | 10.17 | 84.75 |
| Total | 960 | 16.25 | 18.13 | 65.62 |

 

Age periods:

1 - Common Indo-European and earlier

2 - Common Slavonic

3 - Old Russian

4 - 15th-17th centuries

5 - 18th century

6 - 19th century

7 - 20th century

Figure 2.

Percentage of Russian root words' meanings of different semantic quality (relatively concrete, abstract, intermediate) as a function of words' age

 

The data show that in spite of some local deviations the trend for development of words with increase of their age to increasingly abstract meanings is predominant.
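As a consistency check on Table 1, the per-period percentages can be pooled, weighting each age period by its number of meanings, and compared with the Total row. The sketch below only re-uses the figures printed in Table 1; the variable and function names are hypothetical.

```python
# Age period -> (number of meanings, abstract %, intermediate %, concrete %), from Table 1
table1 = {
    1: (61, 96.80, 0.00, 3.20),
    2: (289, 16.54, 23.62, 59.84),
    3: (112, 13.59, 15.53, 70.87),
    4: (102, 4.90, 13.72, 81.37),
    5: (188, 9.04, 20.21, 70.45),
    6: (149, 6.04, 20.13, 73.83),
    7: (59, 5.08, 10.17, 84.75),
}

total_meanings = sum(row[0] for row in table1.values())

def pooled_share(column: int) -> float:
    """Percentage of all meanings in one quality type, weighting periods by size."""
    return sum(row[0] * row[column] for row in table1.values()) / total_meanings

names = ("abstract", "intermediate", "concrete")
pooled = {name: round(pooled_share(i), 2) for i, name in enumerate(names, start=1)}
print(total_meanings, pooled)
```

The pooled values agree with the Total row of the table to within rounding, which suggests the per-period figures are percentages of each period's own meanings.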

Experimental testing of the same kind is in progress for a larger set of Russian and English words of all kinds.

 

12.2. Qualitative Drift of New Meanings.

The second series of experiments was carried out (together with G.I. Kustova) using a representative sample of Russian words that acquired new free meanings in the 1970s according to the "Novye Slova i Znacheniya" dictionary [Kotelova, 1984]. For words of various parts of speech and various lexico-grammatical classes, component analysis methods yielded the following statistics on whether new meanings, compared to their maternal ones, developed towards higher or lower abstractness or showed no noticeable change in this respect (see Table 2).

Table 2

Correlation of new free meanings of Russian words in terms of lower/higher/no change of degree of abstractness (compared to those meanings from which they originated)

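| Part of speech | Lexico-grammatical category | More abstract | No noticeable change | More concrete | All |
|---|---|---|---|---|---|
| Noun | Concrete | 47 | 29 | 22 | 98 |
| Noun | Personal | 18 | 3 | 3 | 24 |
| Noun | Abstract | 30 | 4 | 7 | 41 |
| Adjective | Qualitative | 3 | - | - | 3 |
| Adjective | Relative | 29 | - | 1 | 30 |
| Verb | | 30 | 4 | 4 | 38 |
| Adverb | | 5 | - | - | 5 |
| Total | | 162 | 40 | 37 | 239 |

The last four columns give the direction of development of the words' new meanings.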

Figure 3.

As can be seen from the data, the predominant direction of a word's semantic development in this period is the movement towards increasing abstractness. The most noticeable (but not prevailing) deviations from this pattern occur in nouns, which are, on average, younger words and more specific in their lexical and categorial semantics than words of other parts of speech. More exactly, these are words that form new meanings by metonymical transfer and are oligosemantic, i.e. according to our model they should display a less regular character of development than that of the latest meanings. Work is in progress on testing the model for new English meanings from Barnhart's Dictionary. (2)
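The predominance of the movement towards abstractness can be expressed as shares of the counts in Table 2. In the sketch below the dashes of the table are treated as zeros, which is an assumption; the names `table2` and `shares` are hypothetical.

```python
# Word class -> (more abstract, no noticeable change, more concrete), from Table 2;
# a dash in the table is read here as 0 (assumption)
table2 = {
    "noun, concrete": (47, 29, 22),
    "noun, personal": (18, 3, 3),
    "noun, abstract": (30, 4, 7),
    "adjective, qualitative": (3, 0, 0),
    "adjective, relative": (29, 0, 1),
    "verb": (30, 4, 4),
    "adverb": (5, 0, 0),
}

def shares(counts):
    """Percentages of the three directions within one row of counts."""
    total = sum(counts)
    return tuple(round(100 * c / total, 1) for c in counts)

totals = tuple(sum(col) for col in zip(*table2.values()))
print("all words:", totals, shares(totals))
for word_class, counts in table2.items():
    print(word_class, shares(counts))
```

Printed per class, the shares make the deviation of concrete nouns from the general pattern easy to see next to the other word classes.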

 

12.3. Age and Polysemy of Words.

Although we are yet unable to follow polysemy development regularities for individual words in their history we can do this for the totality of words of various ages currently coexisting in a particular language. These data can be used to check the hypothesis that words of different age should be on different stages of their polysemy development. Having checked this hypothesis we will be able to extend the observation quite safely to each typical word in language being at each consecutive developmental stage. (3)

For initial testing of the model we used first of all the following data:

- a random sample of 521 autosemantic root words from the general corpus of 4185 modern Russian autosemantic root words (4) ;

- a complete set of 40 modern Russian syntactic root words.

Root words (i.e. those split by morphemic analysis into a free root and, optionally, an ending) were chosen for the analysis for several reasons. Most important were considerations of semantic and derivational "purity" of root words. The semantics of derivatives is often a "reflected light" of their derivation base, i.e., it is borrowed (in the case of the so-called "syntactic derivation") from it completely or in part. For this reason investigation of semantics of syntactic derivatives cannot provide "pure" information on semantic regularities of emergence of new meanings on the basis of existing ones within this word (5).

Age groups were singled out in the way described above.

The resultant distribution of the average polysemy for different age groups is in good conformity with our prediction (see Table 3). The general growth of polysemy from the seventh to the second period is not only halted, but slightly reversed during the transition from the second to the first period, i.e. the loss of meanings at this point seemingly begins to prevail over their acquisition.

 

Table 3

Average polysemy as a function of age for Russian root words

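| Age period | Total | Nouns | Adjectives | Verbs | Numerals | Pronouns | Syntactic words |
|---|---|---|---|---|---|---|---|
| 1 | 2.5 | - | - | - | - | - | 2.5 |
| 2 | 2.8 | 2.9 | 3.0 | 3.5 | 3.8 | 1.0 | - |
| 3 | 2.2 | 2.0 | 7.0 | - | - | - | - |
| 4 | 1.6 | 1.7 | 1.25 | - | - | - | - |
| 5 | 1.9 | 1.9 | - | - | - | - | - |
| 6 | 1.4 | 1.4 | - | - | - | - | - |
| 7 | 1.2 | 1.2 | - | - | - | - | - |
| Total | 1.9 | 1.9 | 2.8 | 3.5 | 3.8 | 1.0 | 2.5 |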

Here age periods are as follows:

1 : Common Indo-European

2 : Common Slavonic

3 : Ancient Russian

4 : 15th-17th centuries

5 : 18th century

6 : 19th century

7 : 20th century

 

Figure 4. 

Average polysemy of Russian root words as a function of their age

As can be seen, adjectives, verbs, numerals and syntactic words (conjunctions and prepositions) are distributed in the age sample in the predicted order: the more abstract the categorial meaning of words, the older they are. At the same time, the increase in the average number of meanings on the way from nouns to adjectives and verbs gives way to a decline of polysemy on the further way to pronouns and syntactic words. (6)

Similar results were obtained in the analysis of Estonian data. For the analysis, 6596 root words were taken from the currently available portion (A-K) of the "Explanatory Dictionary of the Estonian Language".

The following age groups were singled out:

1 - words of general Uralic origin (no later than 7th millennium B.C.);

2 - words of Finno-Ugric origin (7th to 2nd millennium B.C.);

3 - words of Finnish origin (2nd millennium B.C. to 850 A.D.);

4 - words of Estonian origin (850 to 1850 A.D.);

5 - new vocabulary (1851 to 1990 A.D.).

As can be seen from Table 4, the Estonian data show a mere inhibition of polysemy growth, rather than a reduction of polysemy, for the oldest words. This may depend on two main factors:

(1) on very weak differentiation of the truly oldest words from less old ones, so that the oldest words, which are on the declining branch of polysemy development, are mixed with a group of younger words having a greater degree of polysemy;

(2) on the way the number of meanings is counted for old words, which are very often syntactic and therefore have very wide (abstract) meanings. This should regularly lead to an extension of a word's referential scope and to more active (on average) splitting of each consecutive meaning into a number of contextual "submeanings". Naturally, lexicographers very often confuse contextual "submeanings" with real meanings. This may be the case here and in many other situations as well (e.g., in Russian [Polikarpov, 1987]).

Table 4

Average polysemy as a function of word's age for Estonian root words

| Word age, years | 9000 | 6500 | 2500 | 640 | 70 |
|---|---|---|---|---|---|
| Number of words | 24 | 28 | 395 | 1423 | 4706 |
| Average number of meanings | 4.6 | 3.75 | 2.75 | 1.31 | 1.24 |

Figure 5.

12.4. Polysemy of Words and Average Generating Activity of their Free and Bound Meanings.

If there is a dependence of words' polysemy on the quality of meanings (more abstract meanings are characteristic for more polysemous words), we can conclude, that they should mediate dependence between polysemy of words and average ability of their meanings to give birth to new meanings.

It can be seen from the data (Tables 5, 6), obtained by analyzing words having new free and bound meanings presented in Barnhart's dictionary [Barnhart, 1973] for English and in the "Novye Slova i Znacheniya" dictionary (ed. by N.Z. Kotelova) for Russian [Kotelova, 1984], that the real situation in general corroborates the predictions made in the model.

Table 5

Correspondence between new meanings of words according to the dictionary "Novye Slova i Znacheniya" (NSZ) and overall lexico-semantic system of Russian language (a random sample from "Slovar' Sovremennogo Russkogo Literaturnogo Yazyka" - SSRLY [1948-1965])

| Polysemy in SSRLY | Total number of meanings in SSRLY | Total number of words in SSRLY | Number of words with new free meanings in NSZ | Total number of new free meanings in NSZ | Share of new free meanings | Number of words with new bound meanings in NSZ | Total number of new bound meanings in NSZ | Share of new bound meanings | Ratio between new bound and new free meanings |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 76382 | 76382 | 155 | 179 | 0.23% | 116 | 130 | 0.16% | 0.52 |
| 2 | 52204 | 26102 | 77 | 85 | 0.16% | 76 | 92 | 0.17% | 0.96 |
| 3-4 | 43904 | 13305 | 63 | 78 | 0.17% | 73 | 98 | 0.22% | 1.14 |
| 5-8 | 22827 | 3914 | 29 | 32 | 0.14% | 46 | 76 | 0.33% | 2.29 |
| 9-39 | 9212 | 779 | 3 | 3 | 0.03% | 15 | 18 | 0.19% | 5.00 |
| Total | 204529 | 120481 | 327 | 377 | 0.18% | 326 | 414 | 0.20% | 1.10 |

 

Figure 6.

Dependence of activity of Russian word meanings in generating new meanings on polysemy zones which words belong to

 

Table 6.

Correspondence between new meanings of words (Barnhart 1973 - B-73) and overall lexico-semantic system of English language (a random sample from OED)

 

| (1) Polysemy of words according to OED | (2) Total number of words in background sample from OED | (3) Total number of meanings in sample from OED | (4) Number of words with new free meanings in B-73 | (5) Total number of new free meanings in B-73 | (6) Share of new free meanings within all meanings of some polysemy zone | (7) Number of words with new bound meanings in B-73 | (8) Total number of new bound meanings in B-73 | (9) Share of new bound meanings |
|---|---|---|---|---|---|---|---|---|
| 1 | 2654 | 2634 | 154 | 174 | 6.6 | 8 | 15 | 0.56 |
| 2 | 638 | 1276 | 116 | 139 | 10.9 | 14 | 19 | 1.48 |
| 3-4 | 471 | 1579 | 109 | 134 | 8.5 | 14 | 25 | 1.58 |
| 5-8 | 188 | 1125 | 74 | 102 | 9.0 | 11 | 16 | 1.42 |
| 9-16 | 54 | 602 | 27 | 43 | 7.1 | 9 | 16 | 2.65 |
| 17-32 | 9 | 176 | 5 | 8 | 4.5 | 2 | 2 | 1.13 |
| 33-64 | 3 | 131 | 0 | 0 | 0 | 0 | 0 | 0 |
| 65-128 | 3 | 240 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 4000 | 7763 | 485 | 600 | 7.7 | 58 | 93 | 1.20 |

 

Figure 7.

 

The way of presenting results is based on calculation of ratio between new meanings and all meanings of words belonging to some polysemy zone in language on the whole (using either whole dictionary, as it is the case for Russian, or some representative sample, as for English). Rise or decline of this ratio manifests the general tendency in generating activity of each meaning of this language words from some polysemy category.
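The ratio described above can be sketched for the English data as follows. The counts are copied from Table 6; the share is taken as new meanings divided by all meanings of the zone in the background sample, expressed in per cent, which is an assumption about how the table's share columns were computed. The names `table6` and `share` are hypothetical.

```python
# Polysemy zone -> (meanings in OED sample, new free meanings, new bound meanings), from Table 6
table6 = {
    "1": (2634, 174, 15),
    "2": (1276, 139, 19),
    "3-4": (1579, 134, 25),
    "5-8": (1125, 102, 16),
    "9-16": (602, 43, 16),
    "17-32": (176, 8, 2),
    "33-64": (131, 0, 0),
    "65-128": (240, 0, 0),
}

def share(new_meanings: int, all_meanings: int) -> float:
    """New meanings as a percentage of all meanings of the polysemy zone."""
    return 100.0 * new_meanings / all_meanings

for zone, (meanings, free, bound) in table6.items():
    print(f"{zone:>6}: free {share(free, meanings):5.2f}%  bound {share(bound, meanings):5.2f}%")
```

Computed this way, the shares come out close to the values printed in the table, with small differences in the last digit.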

As it turns out, the more polysemous the words, the more abstract and the less disposed are their meanings to generation of new free meanings. Some deviations from the general tendency for free meanings dynamics in the area of monosemantic words for English can be explained by the abundance of dead monosemantic words in OED accumulated there according to its historical type.

At the same time, the picture of the empirical dynamics of bound (phraseological) meanings differs from the dynamics of free meanings, in full accordance with the prediction of the model: the average growth in the activity of each free meaning in generating new bound meanings on the way towards greater polysemy gives way to a decline of this activity in the area of maximum polysemy. This is the case for both series of experiments, Russian and English.

 

12.5. Polysemy and Stylistic Markedness of Meanings.

The model is also supported by data on the relation between polysemy and the stylistic markedness of words' meanings in Russian and other languages. For example, data obtained by examining three of the four volumes of the "Dictionary of the Russian Language" (DRL), ed. by A.P. Yevgenyeva (1st edition, M., 1957-1961), make it possible to elucidate the relation between the degree of stylistic markedness of words' meanings and the polysemy zone to which the words belong (see Table 7).

 

Table 7

Degree of stylistic markedness of Russian words' meanings from various polysemy zones (according to DRL)

Polysemy zones of words  All marked & unmarked meanings  Stylistically marked meanings of words
Bookish-Special  Colloquial  Obsolete & Archaic  Dialectal  Total for marked meanings
N  N %  N %  N %  N %  N %
1      45628  5978   13.10  8441   19.38  3294  7.22  476  1.04  18589  40.74
2      21560  2763   12.82  4534   21.03  1675  7.77  302  1.40  9274   43.01
3-4    17351  1552   8.94   3159   18.21  978   5.64  152  0.88  5841   33.66
5-8    8680   622    7.17   1242   14.31  340   3.92  45   0.52  2249   25.91
9-16   2238   105    4.69   268    11.97  71    3.17  9    0.40  453    20.24
17-34  227    3      1.32   15     6.61   3     1.32  1    0.44  22     9.69
2-34   50056  5045   10.08  9218   18.42  3067  6.13  509  1.02  17839  35.64
Total  95684  11023  11.52  18095  18.91  6361  6.65  985  1.03  36428  38.07

Figure 8.

These data indicate that, as a rule, the neutrality of meanings, that is, their universality of use across various spheres, grows on average with the polysemy of the words they belong to. This is most evident in the declining share of meanings with bookish-special markers. The effect conforms well with the data, discussed above, on the increasing average abstractness of words' meanings as their polysemy grows. (7)
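The decline in bookish-special markedness can be checked directly from the counts in Table 7. The following Python lines are a minimal sketch under our own naming (`TABLE_7`, `bookish_share`, `strictly_declining` are illustrative and not part of the original study); they recompute the percentage for each polysemy zone and test whether it falls from every zone to the next.

```python
# Illustrative sketch: share of bookish-special meanings by polysemy zone (Table 7).
# Tuple fields: (zone, all meanings, bookish-special meanings).
TABLE_7 = [
    ("1", 45628, 5978),
    ("2", 21560, 2763),
    ("3-4", 17351, 1552),
    ("5-8", 8680, 622),
    ("9-16", 2238, 105),
    ("17-34", 227, 3),
]

bookish_share = [100.0 * marked / total for _, total, marked in TABLE_7]

# True when every zone has a smaller share than the zone before it.
strictly_declining = all(a > b for a, b in zip(bookish_share, bookish_share[1:]))

for (zone, _, _), share in zip(TABLE_7, bookish_share):
    print(f"zone {zone}: {share:.2f}% bookish-special")
print("strictly declining:", strictly_declining)
```

The same check can be repeated for the other markers by substituting the corresponding column of counts.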

 

12.6. Polysemy and Average Frequency of Meanings' Use.

The key component of the word life cycle model is the proposition that the exhaustion of a word's semantic potential is rooted in the incessant development of the quality of its meanings towards loss of material reference, increased abstractness and growing referential scope. The extension of a word's referential scope, in its turn, should lead to an increased relative frequency of each consecutive meaning of the word. Works investigating the relationship between word frequency and polysemy have presented the accelerated growth of meanings' frequency for increasingly polysemous words [Zipf, 1945; Guiraud, 1954; Andrukovich, Korolyov, 1977; Tuldava, 1979; Koehler, 1986; Arapov, 1987], but only implicitly. Only on the basis of the model presented here is it possible not only to notice the tendency but to appreciate it as it deserves: as extraordinary evidence for one of the most fundamental tendencies in the existence of language.

Table 8 below presents data on the relation between the number of word meanings and the average frequency of use of each meaning, taken from the "Dictionary of Shevchenko's Language".

Table 8.

Dependence of each meaning's average frequency of use on the polysemy of words (from the "Vocabulary of T.G. Shevchenko")

POLYSEMY  N. of Words  Av. freq. of each of meanings
1         3779         8.76
2         815          20.46
3-4       394          64.12
5-8       100          231.41
9-16      12           640.17
17-32     3            1629.00
33-64     1            3225.00

Figure 9.
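The growth of meaning frequency with polysemy in Table 8 can be made explicit by computing, for each polysemy zone, how many times its average frequency exceeds that of the preceding zone. The sketch below is illustrative only; the names `TABLE_8`, `freqs` and `growth` are ours. Because each zone roughly doubles the polysemy of the previous one, a factor above 2 would mean that the frequency of each meaning grows faster than polysemy itself.

```python
# Illustrative sketch: growth of average meaning frequency across polysemy zones (Table 8).
# Tuple fields: (polysemy zone, number of words, average frequency of each meaning).
TABLE_8 = [
    ("1", 3779, 8.76),
    ("2", 815, 20.46),
    ("3-4", 394, 64.12),
    ("5-8", 100, 231.41),
    ("9-16", 12, 640.17),
    ("17-32", 3, 1629.00),
    ("33-64", 1, 3225.00),
]

freqs = [freq for _, _, freq in TABLE_8]

# Growth factor of the average frequency from each zone to the next.
growth = [later / earlier for earlier, later in zip(freqs, freqs[1:])]

for (zone, _, _), factor in zip(TABLE_8[1:], growth):
    print(f"zone {zone}: x{factor:.2f} relative to the previous zone")
```

The highest zones contain very few words (3 and 1), so their factors should be read with caution.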

12.7. Age of Words and their Synonymic and Phraseological Activity.

As for the synonymic and phraseological activity of the meanings of variously aged words, the propositions of the word life cycle model are supported by data obtained from the above sample of Russian root words (see Table 9 and Table 10).

Table 9

Share of meanings engaged in synonymic relations among meanings of variously aged Russian words

Age periods                                    1     2     3     4     5     6     7     Total
Total number of meanings for all sample words  61    289   112   110   196   149   62    980
Number of meanings having synonyms             13    57    22    16    13    6     3     125
Share of meanings having synonyms              0.21  0.20  0.20  0.14  0.07  0.04  0.05  0.13

Age periods:

1 - Common Indo-European,

2 - Common Slavonic,

3 - Old Russian,

4 - 15th-17th centuries,

5 - 18th century,

6 - 19th century,

7 - 20th century.

Figure 10. 

Share of meanings engaged in synonymic relations among all meanings of variously aged words

Table 10.

Ratio of the number of phraseologically bound meanings to the number of free meanings of variously aged words

Period                                 1     2     3     4     5     6     7     Total
Total number of free meanings          61    289   112   110   196   149   62    980
Total number of phras. bound meanings  40    289   29    13    22    4     1     398
Share of phras. bound meanings         0.65  1.00  0.26  0.12  0.11  0.03  0.02  0.40

Figure 11.

Age periods: as above

The data show that the older the words, the more synonymically and phraseologically active their meanings are on average. However, as the model predicts, the oldest words (period 1) show a decline in phraseological activity because of a limiting factor in the organisation of language (namely, exhaustion of the language's vocabulary as a source of possible phraseological combinations).
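The shares in Tables 9 and 10 can be recomputed from the raw counts, which also makes the decline of phraseological activity in period 1 easy to see. The following is a minimal Python sketch with names of our own choosing (`TOTAL_MEANINGS`, `WITH_SYNONYMS`, `PHRAS_BOUND`, `peak_period`); the recomputed values may differ from the printed ones in the last digit because of rounding.

```python
# Illustrative sketch: synonymic and phraseological activity by age period (Tables 9 and 10).
# Period 1 is the oldest (Common Indo-European), period 7 the youngest (20th century).
TOTAL_MEANINGS = [61, 289, 112, 110, 196, 149, 62]
WITH_SYNONYMS = [13, 57, 22, 16, 13, 6, 3]
PHRAS_BOUND = [40, 289, 29, 13, 22, 4, 1]

synonymic_share = [s / t for s, t in zip(WITH_SYNONYMS, TOTAL_MEANINGS)]
phraseological_ratio = [b / t for b, t in zip(PHRAS_BOUND, TOTAL_MEANINGS)]

# Period (1-based) in which phraseological activity is highest.
peak_period = 1 + phraseological_ratio.index(max(phraseological_ratio))

for period, (syn, phr) in enumerate(zip(synonymic_share, phraseological_ratio), start=1):
    print(f"period {period}: synonymic {syn:.2f}, phraseological {phr:.2f}")
print("phraseological peak in period", peak_period)
```

With these counts the phraseological ratio peaks in period 2 and is lower in period 1, which is the decline referred to in the text.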

 

12.8. Age of Words and their Homonymic Activity.

The next point is to test the validity of a theoretical conclusion about the correlation between two kinds of processes that build homonymic groups: (a) the contraction of words in the course of their semantic, frequency and length development, and (b) the semantic splitting of some words in the course of their semantic development. According to our model, the first process should reach its maximum intensity later in a word's life than the second one.

As can be seen from the data (Table 11), obtained together with E.B. Lenskaia from the large dictionary of the Russian language ("Slovar Sovremennogo Russkogo Literaturnogo Yazyka", SSRLY [1948-1965]), our hypothesis can be regarded as proven. The two distributions differ significantly over the whole range of values. They form two waves of rising and falling involvement of words of different ages in the two types of homonymy. In full accordance with theoretical expectations, the wave of coincidence-type homonymy belongs to older words: it peaks in an earlier age period (period 4) than the wave of splitting-type homonymy (period 5).

 

Table 11.

Distribution of Russian homonyms of different types according to their age

Age period of words  Type of lexical homonymy
Occasional phonetic coincidence of words  Split polysemy  Total
abs. % abs. % abs. %
1 4 0.6 0 0.0 4 0.6
2 52 7.0 13 1.9 65 8.9
3 90 13.0 32 4.6 122 17.6
4 129 19.0 70 10.0 199 29.0
5 86 12.0 78 11.0 164 23.0
6 39 6.0 48 6.8 87 12.8
7 29 4.2 27 3.9 56 8.1
Total 429 61.8 268 38.2 697 100.0

 

Figure 12.

Age periods: as above
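The two "waves" in Table 11 can be located by finding, for each type of homonymy, the age period with the largest absolute count. The lines below are an illustrative sketch only (the names `COINCIDENCE`, `SPLITTING` and `peak_period` are ours, not the authors'):

```python
# Illustrative sketch: peak age period for each homonymy type (Table 11, absolute counts).
# Index 0 corresponds to age period 1 (oldest), index 6 to period 7 (youngest).
COINCIDENCE = [4, 52, 90, 129, 86, 39, 29]
SPLITTING = [0, 13, 32, 70, 78, 48, 27]


def peak_period(counts):
    """Return the 1-based age period with the largest count."""
    return 1 + counts.index(max(counts))


coincidence_peak = peak_period(COINCIDENCE)
splitting_peak = peak_period(SPLITTING)

print("coincidence peak: period", coincidence_peak)
print("splitting peak: period", splitting_peak)
```

A lower period number means older words, so a coincidence peak in a lower-numbered period than the splitting peak corresponds to the ordering expected by the model.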

 

12.9. Age of Words and their Derivational Activity.

Data on the average derivational activity of variously aged Russian root words are presented below, measured both by the number of words derived from them (Table 12) and by the number of derivational grades (steps) generated after them (Table 13). The vocabulary data and the criteria for determining the age of Russian base root words are the same as described above.

 

Table 12.

Average number of derivatives for Russian root words as a function of their age

Age period                   1    2     3     4    5    6    7    Total
Number of words              40   105   51    69   104  102  52   523
Aver. number of derivatives  2.1  45.1  13.6  8.3  6.2  4.7  1.7  14.0

Age periods:

1 - Common Indo-European,

2 - Common Slavonic,

3 - Old Russian,

4 - 15th-17th centuries,

5 - 18th century,

6 - 19th century,

7 - 20th century.

 

Figure 13.

Average number of derivation steps in a word formational nest as a function of roots words’ age

Table 13.

Average number of derivational steps of Russian root words as a function of their age

Age period  Average no. of derivational steps
            Total  Noun  Adj.  Verb  Num.  Pron.  Synt. words
1           0.7    -     -     -     -     -      0.7
2           2.4    2.2   3     3.75  3.2   1      -
3           1.4    1.3   3     -     -     -      -
4           1.5    1.5   -     -     -     -      -
5           1.5    1.5   -     -     -     -      -
6           1.3    1.3   -     -     -     -      -
7           0.8    0.8   -     -     -     -      -
Total       1.5    2     3     3.75  3.2   1      0.7

Age periods: as above

The data show that the derivation process is initially marked by a clear predominance of the acquisition of derived words over their loss. On arriving at the "final" (1) point of word-formation evolution, the loss of previously acquired derivatives becomes predominant.
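The rise and subsequent fall of derivational activity can be read off Table 12 with a short calculation. The sketch below is illustrative and uses names of our own (`WORDS`, `AVG_DERIVATIVES`, `overall_mean`, `peak_period`); it finds the age period with the highest average number of derivatives and recomputes the overall average as a weighted mean, which should agree with the "Total" column of the table.

```python
# Illustrative sketch: derivational activity by age period (Table 12).
# Index 0 corresponds to age period 1 (oldest), index 6 to period 7 (youngest).
WORDS = [40, 105, 51, 69, 104, 102, 52]
AVG_DERIVATIVES = [2.1, 45.1, 13.6, 8.3, 6.2, 4.7, 1.7]

# Weighted mean over all periods, weighting each period by its number of root words.
overall_mean = sum(n * a for n, a in zip(WORDS, AVG_DERIVATIVES)) / sum(WORDS)

# Period (1-based) with the highest average number of derivatives.
peak_period = 1 + AVG_DERIVATIVES.index(max(AVG_DERIVATIVES))

print(f"overall mean: {overall_mean:.1f}")
print("peak period:", peak_period)
```

From the youngest words (period 7) the average grows up to period 2 and then drops sharply in period 1.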

It should also be pointed out that the distribution of root words of various parts of speech across age groups is quite regular as well. Root words of the 18th-20th centuries comprise only nouns. No adjectives younger than the 15th century and no verbs younger than the 8th century were found.

For some new results on the dependence of word-formational features of Russian words on their age and other factors, see [Polikarpov a.o., 1998].

 

12.10. Survival Rate of Words of Different Age and other System Features of Words.

Below are some data on the differences in survival rate in modern Romance languages for Latin words of different ages (Table 14). The data were gathered by M.E. Kapitan during research carried out within a complex program of testing the word life cycle model. The Latin words were taken from the list of the first thousand most frequent Classical Latin words [Gardner, 1970].

Table 14.

The correlation between age and rate of survival in 5 modern Romance languages for the first 993 most frequent Latin words of all parts of speech together

 

Ordinal number of age periods  Latin  Romanian  Italian  French  Spanish  Portuguese
N %  N %  N %  N %  N %  N %
1      250  100  117  46.8  162  64.8  144  57.6  156  62.4  163  65.2
2      78   100  31   39.7  46   59.0  40   51.3  43   55.1  44   56.4
3      15   100  4    26.7  8    53.3  6    40.0  7    46.7  7    46.7
4      650  100  130  20.0  254  39.1  205  31.5  241  37.1  240  36.9
Total  993  100  282  28.4  470  47.3  395  39.8  447  45.0  454  45.7

Figure 15.

Here 1, 2, 3, 4 are the different periods in which Latin words appeared, according to etymological dictionaries of Latin:

1. Period of Indo-European unity;

2. Period of Western-Indo-European unity;

3. Period of Italic unity;

4. Proper Latin period.

 

As one can see, there is a remarkable regularity: older words survive better in the modern Romance languages. The older the currently existing words are, the greater their further stability, because their meanings are more abstract.
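The regularity can be verified from the counts in Table 14. The following Python lines are a minimal sketch (the names `LATIN`, `SURVIVORS`, `rates` and `older_survive_better` are ours); they recompute the survival percentages and check, for each language, that every older period has a higher survival rate than the next younger one.

```python
# Illustrative sketch: survival rate of Latin words by age period (Table 14).
# Periods: 1 = Indo-European unity (oldest) ... 4 = Proper Latin (youngest).
LATIN = [250, 78, 15, 650]
SURVIVORS = {
    "Romanian": [117, 31, 4, 130],
    "Italian": [162, 46, 8, 254],
    "French": [144, 40, 6, 205],
    "Spanish": [156, 43, 7, 241],
    "Portuguese": [163, 44, 7, 240],
}

rates = {
    lang: [100.0 * kept / total for kept, total in zip(counts, LATIN)]
    for lang, counts in SURVIVORS.items()
}

# True for a language when every older period survives better than the next younger one.
older_survive_better = {
    lang: all(a > b for a, b in zip(r, r[1:])) for lang, r in rates.items()
}

for lang, r in rates.items():
    print(lang, [f"{x:.1f}" for x in r], older_survive_better[lang])
```

Period 3 contains only 15 words, so its percentages are the least reliable of the four.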

A significant positive correlation was also revealed between the survival rate of Latin words and their polysemy, frequency and degree of categorial abstractness. The latter means that syntactic words on the whole are preserved in time better than autosemantic ones. Among autosemantic words, nouns tend to survive less well than adjectives, verbs and adverbs. For more details see [Kapitan, 1994; Kapitan, 1995; Kapitan, Polikarpov, 1998].

 

13. Final Conclusions.

In a mathematical model of the word's life cycle, to be developed in the future, a place should be found for a typology-dependent parameter accounting for the size of the vocabulary that serves as the scene for the events of semantic evolution. A preliminary qualitative consideration is that a typologically smaller vocabulary (e.g., that of English as compared with that of Russian) brings an accelerated realisation of the word's semantic potential, a higher maximum level of polysemy and, on average, a shorter word life cycle; in the final analysis, this means an accelerated general renewal of the whole vocabulary (and, in all likelihood, of the system of grammatical markers) in comparison with a language with a typologically larger vocabulary.

Preliminary data on the typological development of some Germanic, Romance, Slavonic and other languages, which developed under different conditions in this respect, confirm the accelerated renewal of vocabulary and grammatical markers in more analytic languages, which are lexically and morphologically (but not phraseologically) poorer [Polikarpov, 1979, ch. 2, 3].

This model explains what was not only unexplained but even unexplainable within the glottochronological model of M. Swadesh: systemic differences between languages in the tempo of their vocabulary renovation, including the related question of the different survival rates of lexical units in languages of different typology, different cultures (with different sizes of their sense field), communities of different size, etc. All these questions could be posed only on the basis of a model like the one presented here.

Among other things, the model allows a qualitative interpretation of the extinction probability of syntactic words. Such an interpretation must account both for the inevitable structural, grammatical rearrangements of any language (i.e., the periodic systemic elimination and replacement of existing grammatical meanings during the typological rearrangement of a language) and for the inevitable competition of various synonymic means of expressing the same conceptual category (even syntactic categories).

If in the course of semantic development a syntactic or autosemantic word acquires a meaning in which it becomes involved in a high-frequency syntactic construction (combination of the syntactic word with other syntactic or autosemantic ones) or in a high-frequency phraseological expression, then the construction is very likely to undergo step-by-step contraction, the result being the formation of a new lexeme (word form or compound) into which the original word is included as a morpheme (a root or an affix).

At a later stage of such word development, one of the roots of a compound word may also become an affix if the root (according to its inner disposition) is productive in forming compound words in the language and compound words with this root are sufficiently frequent in speech. The intermediate stage in the transformation of a root into an affix is known as a semiaffix.

In the course of a possible further growth in the occurrence rate of the word form, language users may lose or forget the motivation of the word's morphemic composition (e.g., following the loss of independent use of syntactic or autonomous words due to competition between words and the omission of some of them), and a group of morphemes may be semantically reinterpreted as a new single morpheme within the word.

This process also leads to further reinterpretation of the phonetic envelope of the word form, de-etymologisation of its composition, and the eventual disappearance of any trace of the morpheme that historically served as a building component of the word (especially if by that time the prototypical word of the affix has gone out of independent use). At that point the complete end of the word's life cycle may be stated. For some time the "remnants" of the word (morphemes) may serve as soil for the growth of subsequent generations of words, but finally they should be completely dispersed in cycles of "linguistic metabolism". When all affixal remnants of a former word have been completely dispersed in all languages derived from the same parent language, and no written record of the stages of their development exists, etymological analysis of many old "primary" words turns out to be impossible. However, valuable information on that score is available for the languages of some ancient cultures (e.g., Romance, Indian, Semitic languages), whose written monuments reflect remote states of the language, while its intermediate and various modern varieties represent various stages of "digestion" of the original language material.

This is an open field for further investigations.

 

Footnotes

1) For our previous attempts to analyse the possible mechanism of analytisation or, on the contrary, synthetisation of language structure (including vocabulary) see [Polikarpov 1976; 1979].

2) For a more detailed analysis of these data see [Polikarpov, 1999; Polikarpov a.o., 1999].

3) Initial testing is understood as checking the most general fact of the dependence of the polysemy of words on their age: a consecutive growth of polysemy with time, followed by a slow-down in the tempo of growth and even by a possible reduction of polysemy at the very end of the word's lifetime.

4) These data were selected and analysed in co-operation with A.V. Andreevskaya. For more information see [Polikarpov a.o., 1999].

5) For more details on a "pure" unit of the lexical system (called a hyperlexeme) see [Boroda, Polikarpov, 1988; Karimova 1989; Karimova, Polikarpov, 1989].

6) The research was conducted in co-operation with Ya.V. Perezhogina at the Department for Theoretical and Computational Linguistics of Moscow University.

7) For a more detailed consideration of how the stylistic markedness of vocabulary is conditioned by its other systemic characteristics (including an explanation of local deviations from the general trend in the relation between the degree of markedness of meanings and the polysemy status of the words they belong to) see [Polikarpov, Kurlov, 1994; Kolodyazhnaya, Polikarpov, 1994].

 

References

ALTMANN G. [1991]. Science and Linguistics // QUALICO 91. First Quantitative Linguistics Conference. September 23-27, 1991. University of Trier, Germany. - Trier, 1991.

ANDRUKOVICH P.F., KOROLYOV E.I. [1977]. O Statisticheskikh i Leksiko-Semanticheskikh Svojstvakh Slov (On Statistical and Lexico-Semantic Properties of Words) // NTI, Ser.2, No 4, 1977.

ARAPOV M.V. [1987]. Upotrebitelnost i Mnogoznachnost Slova (Occurrence Rate and Polysemanticity of a Word) // Kvantitativnaja lingvistika i avtomaticheskij analiz tekstov - 1987 / Acta et Commentationes Universitatis Tartuensis, issue 774. - Tartu: Tartu University Press, 1987.

ARUTYUNOVA [1976]. Predlozhenije i Jego Smysl (Sentence and its Sense).- M.: Nauka Publishers, 1976.

BARNHART C. [1973]. A Dictionary of New English. 1963-1972. - London, 1973.

BORODA M.G., POLIKARPOV A.A. [1988]. The Zipf-Mandelbrot Law and Units of Different Text Levels // Boroda M.G.(ed.): Musikometrika 1. - Bochum: Brockmeyer, 1988.

GARDNER D.D. [1970]. A Frequency Dictionary of Classical Latin Words. A dissertation submitted to the committee on linguistics and the committee on graduate studies of Stanford University in partial fulfilment of the requirements for the degree of doctor of philosophy. Stanford University. - Stanford, 1970.

GUIRAUD P. [1954]. Les Caracteres Statistiques du Vocabulaire. - Paris: Press universitaires, 1954.

HALLIG R., WARTBURG W. von [1952]. Begriffssystem als Grundlage fuer die Lexikographie. - Berlin, 1952.

KAPITAN M.E. [1994]. Regularities of Latin Words Survival in Modern Romance Languages // Journal of Quantitative Linguistics. 1994. v.1. N 3.

KAPITAN M.E. [1995a]. O zakonomernostiakh sokhrannosti latinskoj leksiki v sovremennykh romanskikh jazykakh (About the Regularities of Latin Words Survival in Modern Romance Languages). Ph.D. Dissertation. - Moscow, 1995.

KAPITAN M.E., POLIKARPOV A.A. [1999]. Vlijanije Razlichnyh Sistemnykh Kharacteristic Latinskikh Slov na ikh Sokhrannost (Influence of Different System Features Of Latin Words on their Survival) // Sistemnye Issledovanija v Lingvistike (System Researches in Linguistics). vol. 1. - Moscow, 1999 (in press).

KARIMOVA G.O. [1989]. Giperleksemnaja Gruppirovka Slov kak Sposob Predstavlenija Sistemnosti Leksiki (Hyperlexemic Grouping of Words as a Way of Representation of the Systemic Organization of Lexical Units) // Kvantitativnaja lingvistika i avtomaticheskij analiz tekstov / Acta et Commentationes Universitatis Tartuensis, issue 872. - Tartu: Tartu University Press, 1989.

KARIMOVA G.O., POLIKARPOV A.A. [1989]. Printsipy Vydelenija Giperleksemy kak Jedinitsy Leksicheskoj Sistemy Jazyka (Principles for Singling out Hyperlexeme as a Unit of the Lexical System of Language) // Derivatsionnyje Tipy i Gnyozda v Sinkhronii i Diakhronii. - Vladivostok: DVO AN SSSR, 1989.

KOEHLER R. [1986]. Zur Linguistischen Synergetik: Struktur und Dynamik der Lexik. - Bochum: Brockmeyer, 1986.

KOEHLER R. [1991]. Synergetic Linguistics // QUALICO 91. First Quantitative Linguistics Conference. September 23-27, 1991. University of Trier, Germany. - Trier, 1991.

KOLODYAZHNAYA L.I., POLIKARPOV A.A. [1994]. Study of Quantitative Correlations between Stylistics, Grammar and Polysemy of Words (on the Basis of Ozhegov Dictionary) // QUALICO - 94. 2nd International Conference on Quantitative Linguistics. September 20-24, 1994, Moscow, Lomonosov Moscow State University, Philological Faculty. - Moscow, 1994.

KOTELOVA N.Z. (ed.) [1984]. Novye Slova i Znachenija. Slovar-Spravochnik po Materialam Pressy 70-kh gg. (New Words and Meanings. Reference Dictionary on the Press of the 70s). - M., Russkij Jazyk, 1984.

KRYLOV Yu.K., YAKUBOVSKAYA M.D. [1977]. Statisticheskij Analiz Polisemii kak Jazykovoj Universalii i Problema Semanticheskogo Tozhdestva Slova (Statistical Analysis of Polysemy as a Linguistic Universal and the Problem of Semantical Sameness of Word) // NTI. Ser. 2. Vyp. 3. - Moscow, 1977.

KUSTOVA G.P., POLIKARPOV A.A. [1999]. Ot Konkretnogo - k Abstraktnomu: Logika Razvitija Leksicheskikh Znachenij (From Concrete towards Abstract: Logic of the Development of Lexical Meanings) // Sistemnyje Issledovanija v Lingvistike (System Researches in Linguistics). - Moscow, 1999. (in press)

POLIKARPOV A.A. [1976]. Faktory i Zakonomernosti Analitizatsii Jazykovogo Stroja (Factors and Regularities of Language Structure Analytization). Ph.D. thesis. - Moscow, 1976.

POLIKARPOV A.A. [1979]. Elementy Teoreticheskoj Sotsiolingvistiki (Elements of Theoretical Sociolinguistics). - Moscow: Moscow University Press, 1979.

POLIKARPOV A.A. [1987]. Polisemija: Sistemno-Kvantitativnyje Aspekty. (Polysemy: Systemic-Quantitative Aspects) // Kvantitativnaja lingvistika i avtomaticheskij analiz tekstov / Acta et Commentationes Universitatis Tartuensis, issue 774. - Tartu: Tartu University Press, 1987.

POLIKARPOV A.A. [1988]. K Teorii Zhiznennogo Tsikla Leksicheskikh Jedinits (Towards the Theory of Lexical Units Life Cycle ) // Papers from the Scientific Conference "Applied Linguistics and Automatic Text Analysis". - Tartu, 1988.

POLIKARPOV A.A. [1990]. Leksicheskaja Polisemija v Evoljutsionnom Aspekte (Lexical Polysemy in Evolutionary Aspect) // Linguistica, 1990, - Tartu: Tartu University Press, 1990.

POLIKARPOV A.A. [1991]. On the Hypothesis of Word Life Cycle // QUALICO 91. First Quantitative Linguistics Conference. September 23-27, 1991. University of Trier, Germany. - Trier, 1991.

POLIKARPOV A.A. [1993]. A Model of the Word Life Cycle // Contributions to Quantitative Linguistics / Ed. by R. Koehler, B.B. Rieger. - Dordrecht, 1993.

POLIKARPOV A.A. [1994]. Zakonomernosti zhiznennogo tsikla slova i evolutsija jazyka. Statja 1. Modelirovanije osnovnykh sistemnykh sootnoshenij (The Regularities of Word Life Cycle and Language Evolution. Article 1. The Modelling of the Main System Correlations) // Russkij Filologicheskij Vestnik (Russian Philological Bulletin), N 1, 1994. - Moscow, 1994.

POLIKARPOV A.A. [1994a]. Evolutionary Aspects of a Language as a Natural Classification System // QUALICO - 94. 2nd International Conference on Quantitative Linguistics. September 20-24, 1994, Moscow, Lomonosov Moscow State University, Philological Faculty. - Moscow, 1994.

POLIKARPOV A.A. [1995]. Zakonomernosti zhiznennogo tsikla slova i evolutsija jazyka. Statja 2. Teorija i eksperiment (The Regularities of Word Life Cycle and Language Evolution. Article 2. Theory and Experiment) // Russkij Filologicheskij Vestnik (Russian Philological Bulletin), N 1, 1995. - Moscow, 1995.

POLIKARPOV A.A. [1998]. Cyclic Processes in Becoming of Lexical System: Modelling and Experiment (in Russian). Doctor of Philological Sciences Dissertation. - Moscow, 1998.

POLIKARPOV A.A. a.o. [1998]. Chronological Morphemic and Word-Formational Dictionary of Russian: Data Base Compiling and its Quantitative-Systemic Analysis // Problems of General, Comparative and Contrastive Linguistics (in Russian). - Issue 2. - M., 1998.

POLIKARPOV A.A. a.o. [1999]. Modelling of Lexical System Evolution / System Studies in Linguistics. Vol. 1-2 (in Russian). - M., 1999 (in press).

POLIKARPOV A.A., KURLOV V.Ya. [1994]. Stilistika, Semantika, Grammatika: Opyt Analiza Sistemnykh Vzaimosvjazej (po Dannym Tolkovogo Slovarya) (Stylistics, Semantics, Grammar: An Essay of Investigation of System Correlations (Using Data of an Explanatory Dictionary)) // Voprosy Jazykoznanija (Problems in Linguistics), N 1, 1994, Moscow. - 1994.

TULDAVA Yu.A. [1979]. O Nekotorykh Kvantitativno-Sistemnykh Kharakteristikakh Polisemii (On Some Quantitative-Systemic Characteristics of Polysemy) // Kvantitativnaja Lingvistika i Avtomaticheskij Analiz Tekstov / Acta et Commentationes Universitatis Tartuensis, issue 689. - Tartu: Tartu University Press, 1979.

YAKUBOVSKAYA M.D. [1977]. O Vnutrennikh Prichinakh Rasshcheplenija Semanticheskogo Tozhdestva Slova (On Internal Causes of Splitting of Word's Semantic Sameness) // Filologicheskiye Nauki, No 3, 1977.

 ZIPF G.K. [1935]. Psycho-Biology of Language. - Boston (Mass.), 1935.

 ZIPF G.K. [1945]. The Meaning-Frequency Relationship of Words // The Journal of General Psychology, Vol.33, 1945.

 

Contact:

Home address: Russia 117463 Moscow, Karamzina 9-1-204

 Off. phone:+7(095) 9392622 or +7 (095) 939-3178

Home phone: +7 (095) 422-4195

Fax: +7 (095) 939-2622 E-mail: polikarp@philol.msu.ru