Frequency Distribution of Letters, Bigrams and Trigrams in the Macedonian language
Journal
Yearbook-Faculty of Computer Science
Date Issued
2012
Author(s)
Panov, Stojanche
Abstract
Frequency analysis in cryptanalysis is based on the fact that, in
any given piece of written text, certain letters and combinations of two or three
letters occur with varying frequencies. In this paper we present the average
frequency distribution of letters, bigrams and trigrams in the Macedonian
language. The frequencies of the most common first and last letters of
words are also given. Our results are based on approximately 15,000 pages of
written text from the following subjects: poetry, prose, drama, natural
sciences, social sciences, law, various laws, economy, and computer
science. The resulting letter frequency sequence is “А О И Е Т Н Р С В Д К Л П
М У З Ј Г Б Ч Ш Ц Ж Њ Ф Ќ Х Ѓ Џ Љ Ѕ”, the most common bigrams are
“НА АТ ТА НИ ТЕ РА ОТ СТ ТО КО” and the most common trigrams are
“ИТЕ АТА УВА ИЈА АЊЕ СТА ОСТ ВАЊ ПРО ПРЕ”.
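As an illustration of the kind of counting the abstract describes, the following is a minimal Python sketch, not the author's actual procedure. It assumes that n-grams are counted inside words only (never across spaces or punctuation), that case is ignored, and that any character outside the 31-letter Macedonian alphabet acts as a word separator. The names `ALPHABET`, `words`, `ngram_counts`, and `edge_letter_counts` are hypothetical and do not come from the paper.

```python
import unicodedata
from collections import Counter

# The 31 letters of the Macedonian Cyrillic alphabet, uppercase.
ALPHABET = set("АБВГДЃЕЖЗЅИЈКЛЉМНЊОПРСТЌУФХЦЧЏШ")


def words(text):
    """Yield uppercase words consisting only of Macedonian letters.

    Assumption: every other character (spaces, digits, punctuation,
    Latin letters) is treated as a word separator.
    """
    # NFC composes sequences such as Г + combining acute into the single letter Ѓ.
    text = unicodedata.normalize("NFC", text).upper()
    current = []
    for ch in text:
        if ch in ALPHABET:
            current.append(ch)
        elif current:
            yield "".join(current)
            current = []
    if current:
        yield "".join(current)


def ngram_counts(text, n):
    """Count n-grams of length n inside words (n=1 letters, 2 bigrams, 3 trigrams)."""
    counts = Counter()
    for word in words(text):
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts


def edge_letter_counts(text):
    """Return two Counters: first letters of words and last letters of words."""
    first, last = Counter(), Counter()
    for word in words(text):
        first[word[0]] += 1
        last[word[-1]] += 1
    return first, last


if __name__ == "__main__":
    sample = "На масата има книга."  # tiny illustrative sample, not corpus data
    for n in (1, 2, 3):
        print(n, ngram_counts(sample, n).most_common(5))
    print(edge_letter_counts(sample))
```

Dividing each count by the total number of n-grams of that length gives relative frequencies. Whether the paper counts n-grams across word boundaries is not stated in the abstract, so that choice here is only an assumption.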
File(s)
Name
27-Article Text-1836-1-10-20150831.pdf
Size
1.72 MB
Format
Adobe PDF
Checksum
(MD5):4ae1a089214b2d8b279a1b6e4a722295
