Zenkov A.V.
Numbers point to the author: a stylometric comparison of German-language modernist texts
// Philology: scientific researches.
2024. No. 11.
pp. 50-62.
DOI: 10.7256/2454-0749.2024.11.72167 EDN: PDWIOX URL: https://cn.nbpublish.com/library_read_article.php?id=72167
This study belongs to stylometry and, more broadly, to quantitative linguistics. A novel quantitative method for studying authorial style, based on the statistics of the numerals that occur in literary texts, is applied to literary texts in German. A computer program was developed that searches a text for cardinal and ordinal numerals written either in digits or in words (in their various word forms). The program automatically removes idioms and set phrases that contain numerals incidentally, without authorial intent. Beforehand, the text is manually cleared of auxiliary numerals such as page numbers, chapter numbers, etc. It is shown that the numerals an author uses in a literary text are individual to that author; taken together they form a characteristic feature (an authorial invariant, or "fingerprint") that distinguishes texts written by different authors. A comparative stylometric analysis is performed on a number of literary works by Thomas Mann, Hermann Broch, Robert Musil, and Elias Canetti, representatives of 20th-century German-language literary modernism. Substantial differences between the authors in their use of numerals were found. The results were subjected to hierarchical clustering (Manhattan metric; complete-linkage and between-groups methods). The cluster analysis correctly grouped the texts by author. Using several clustering methods strengthens the significance of the results and supports their non-random nature. This demonstrates that the new stylometric method can correctly attribute literary texts to their authors.
E. Canetti, H. Broch, R. Musil, T. Mann, numerals in texts, authorship of texts, attribution of texts, quantitative linguistics, stylometric, stylometry
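The pipeline described in the abstract (numeral statistics, Manhattan metric, complete linkage) can be illustrated with a minimal sketch. It is not the author's program. The tiny `NUMBER_WORDS` lexicon, the function names, and the toy texts are assumptions made for illustration. Idiom filtering, ordinals, inflected forms, and the between-groups method are left out.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny lexicon of German number words (assumption).
NUMBER_WORDS = {"zwei": 2, "drei": 3, "vier": 4, "fünf": 5, "sechs": 6,
                "sieben": 7, "acht": 8, "neun": 9, "zehn": 10}

def numeral_profile(text):
    """Relative frequency of each numeral value, counting digits and listed words."""
    counts = Counter()
    for token in re.findall(r"\d+|[a-zäöüß]+", text.lower()):
        if token.isdigit():
            counts[int(token)] += 1
        elif token in NUMBER_WORDS:
            counts[NUMBER_WORDS[token]] += 1
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()} if total else {}

def manhattan(p, q):
    """Manhattan (L1) distance between two sparse frequency profiles."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))

def complete_linkage(profiles, n_clusters):
    """Agglomerative clustering; cluster distance is the largest pairwise distance."""
    clusters = [[name] for name in profiles]

    def dist(a, b):
        return max(manhattan(profiles[x], profiles[y]) for x in a for y in b)

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters.pop(j)      # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters

# Toy inputs invented for this sketch; they are not data from the study.
toy_texts = {"a1": "zwei drei zwei 2", "a2": "zwei zwei drei 3",
             "b1": "zehn 10 neun", "b2": "zehn zehn 9"}
toy_profiles = {name: numeral_profile(t) for name, t in toy_texts.items()}
toy_clusters = complete_linkage(toy_profiles, n_clusters=2)
```

Profiles are normalized to relative frequencies so that texts of different lengths remain comparable. Whether the study normalizes in the same way is not stated in the abstract.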
Severina E.M., Fedorov N.A.
The Chekhov Digital project: semantic markup of a parallel corpus of German translations of A. P. Chekhov's fiction
// Philology: scientific researches.
2024. No. 4.
pp. 73-82.
DOI: 10.7256/2454-0749.2024.4.70560 EDN: PXMQSB URL: https://cn.nbpublish.com/library_read_article.php?id=70560
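This article discusses the principles for developing a semantically annotated parallel corpus of translations of Chekhov's fiction within the Chekhov Digital project. Chekhov Digital is a digital scholarly edition of the writer's collected works, published in TEI (Text Encoding Initiative) format. The parallel corpus project focuses on creating a digital infrastructure for studying the author's work that lets researchers analyze and compare the original texts with their translations. The article identifies difficulties in interpreting significant elements of the writer's works, in the specifics of translating them into German, and in the semantic markup of fiction translations, for example in defining the boundaries of, and relations between, semantically marked-up elements. Ways of overcoming these difficulties are proposed, including digital methods and natural language processing techniques. The project uses digital methods and natural language processing technologies together with the Text Encoding Initiative (TEI) standard for digital editions. TEI-based markup makes the documents machine-readable, which makes it possible to develop tools for complex semantic information retrieval. Including parallel corpora of translations of Chekhov's works into different languages in the Chekhov Digital project expands the research toolkit of translation studies: translated and original texts can be compared; similarities and differences in vocabulary, grammar, style, and cultural references can be identified; and routine research procedures can be automated, which makes search and analysis across large volumes of text more efficient. The project's results will contribute to the development of the digital humanities environment and to preserving and popularizing Chekhov's literary heritage. A semantically annotated parallel corpus of translations is important for literary scholars, linguists, and translators: it lets them study the specifics of translating Chekhov's works and develop new text analysis and […]. The experience gained during the project will be valuable for future research and practical applications, demonstrating the effectiveness of digital technologies in humanities research and education.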
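Chekhov Digital project, digital edition, Chekhov, parallel corpus, Text Encoding Initiative, machine-readable markup, semantic search, digital technologies, automatic text processing, parsing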
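To make the idea of machine-readable TEI markup and semantic retrieval concrete, here is a small sketch using Python's standard `xml.etree.ElementTree`. The fragment in `SAMPLE` is hypothetical: the sentences are invented placeholders, not text from the corpus, and the `#ochumelov` identifier, the `xml:id` values, and the function name are assumptions. The project's actual encoding scheme is not described in the abstract. The elements and attributes themselves (`persName`, `ref`, `corresp`, `xml:lang`) are standard TEI.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical aligned fragment; sentences are invented placeholders.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p xml:id="ru.1" xml:lang="ru"><persName ref="#ochumelov">Очумелов</persName> идёт через площадь.</p>
    <p xml:id="de.1" xml:lang="de" corresp="#ru.1"><persName ref="#ochumelov">Otschumelow</persName> geht über den Platz.</p>
  </body></text>
</TEI>"""

def mentions_by_language(root, ref):
    """Collect the surface forms of one tagged person, grouped by paragraph language."""
    hits = {}
    for p in root.iterfind(".//tei:p", TEI_NS):
        lang = p.get(f"{{{XML_NS}}}lang")
        for name in p.iterfind(f".//tei:persName[@ref='{ref}']", TEI_NS):
            hits.setdefault(lang, []).append(name.text)
    return hits

root = ET.fromstring(SAMPLE)
mentions = mentions_by_language(root, "#ochumelov")
```

Because the query targets the `ref` identifier rather than the spelling, it finds the same character in the original and in the translation even though the name is written differently in each language.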