Abstract


AN INSIGHT INTO THE CODING PROCESS IN THE NATURAL LANGUAGES IN TERMS OF THE INFORMATION THEORY (IN THE EXAMPLE OF TURKISH)

This study focuses on the transformation that morphemes undergo in the coding process. This transformation is related to usage: morphemes that are marked according to their articulation evolve according to their frequency in messages. The study presents a method for cases in which Turkish morphemes are coded according to their information value. The method is based on the information theory developed by C. E. Shannon, which describes “meaning” as a measurable concept, and it relates the information value of symbols to their frequency in messages. Accordingly, the relation between the frequency of morphemes and their code numbers is examined through several examples. The information value of several Turkish words is measured from their frequency, and the code number necessary for coding each morpheme is determined from these values. This, however, requires that a corpus be formed, that each text be separated into morphemes, and that the frequency of each morpheme within the corpus be determined. For this stage, the study uses data obtained from the analysis of 100 text pieces, published specifically in the last decade, which the author prepared for a previous study. In addition, the entropy of the Turkish morphemes is calculated from their frequency in the corpus, and the relationship between the number of letters in the alphabet and the number of signs required to mark the words is presented. Finally, the method preferred in the study is presented for discussion. The choice between the required measurement method and the possible measurement method reaches a point that calls for questioning information theory itself.
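The frequency-based measurement described above can be illustrated with a minimal sketch. It assumes the standard Shannon definitions (information value as the negative base-2 logarithm of relative frequency, entropy as the frequency-weighted average of those values) and is not the study's own procedure. The function names, the toy token list, and the rounding rule for the number of signs are hypothetical choices made only for illustration.

```python
import math
from collections import Counter

def information_values(morphemes):
    """Information value in bits of each morpheme: -log2(relative frequency)."""
    counts = Counter(morphemes)
    total = sum(counts.values())
    return {m: -math.log2(c / total) for m, c in counts.items()}

def entropy(morphemes):
    """Shannon entropy in bits per morpheme of the observed distribution."""
    counts = Counter(morphemes)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def signs_needed(info_bits, alphabet_size):
    """Whole number of signs from an alphabet of the given size needed to
    carry info_bits (assumption: each sign carries log2(alphabet_size) bits)."""
    return math.ceil(info_bits / math.log2(alphabet_size))

# Toy data, not taken from the study's corpus: a hand-segmented token stream.
tokens = ["ev", "ler", "de", "ev", "de", "ev", "ler", "ev"]

values = information_values(tokens)
print(values)                            # frequent morphemes carry fewer bits
print(entropy(tokens))                   # average bits per morpheme
print(signs_needed(values["ler"], 2))    # signs needed with a binary alphabet
print(signs_needed(values["ler"], 29))   # signs needed with a 29-letter alphabet
```

In this sketch a more frequent morpheme receives a lower information value and therefore needs fewer signs, and a larger alphabet reduces the number of signs required for the same value.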



Keywords

Information theory, entropy, coding, alphabet.





References