ИСТИНА ИНХС РАН |
The project involves collecting, analyzing, and mapping material on the dialects of the Turkic languages of Russia (neighboring regions may be added later), with the aim of creating the linguogeographic component of an integral description of these languages: the Electronic Atlas of the Turkic Languages of Russia.
The project involves the collection, analysis, and cartographic representation of material on the dialects of the Turkic languages of Russia (adjacent regions may be included in the future), drawing on material collected during fieldwork and found in archives. Its aim is to create the linguogeographic component of an integral description of these languages: the Electronic Atlas of the Turkic Languages of Russia. Documenting and studying endangered languages is a major task of linguistics. Until now there has been no unified solution that brings teams of linguists together in assembling materials on endangered languages (dictionaries and corpora) and analyzing them in terms of phonetics, morphology, syntax, semantics, and etymology. The team of the Ural-Altaic department of the Institute of Linguistics was the first to pose the problem of building a dedicated system that is easy to use, requires no additional resources on researchers' computers, and lets researchers publish their dictionaries and texts, carry out the necessary operations on the collected data, and obtain results regardless of the particular system environment. Such a system must be stable, reliable, easy to use, and capable of supporting distributed computing mechanisms for the efficient processing of phonetic, morphological, syntactic, semantic, and etymological data.
The task of building a software system (a virtual laboratory) that provides researchers with the necessary functionality is now largely solved; work on it began under RFBR grant 14-06-00271, "Construction of a virtual laboratory for reliable storage and distributed processing of data of endangered languages". The other side of the problem is an adequate arrangement for collecting the material that should fill the virtual laboratory. This problem has to be solved with regard to the features of the language or languages under consideration.
Turkic languages and dialects now hold second place in the Russian Federation by number of native speakers, after the Slavic languages and dialects. About three quarters of all the Turkic languages of the world have areas of compact settlement of their speakers in Russia. According to recent data, 13 Turkic languages of Russia are on the list of endangered languages (i.e., about a third of the full list of all Turkic languages). In addition, the dialects of the larger Turkic languages, which are not yet threatened themselves, are steadily eroding; the loss of these dialects threatens a considerable loss of information on how these languages function synchronically and diachronically, as well as the loss of natural sources of enrichment and development for the literary languages. The description and study of Turkic languages is a well-developed discipline with a long scientific tradition. Nevertheless, the problem of a computer-oriented integral description sets new requirements for the material under review. The integral description is planned to consist of the following areas:
a) Dictionaries of language idioms (languages and dialects): an audio dictionary of modern dialects, archival dictionaries, and concordances created on the basis of the first Cyrillic books in Turkic languages.
b) Computer-oriented grammatical descriptions and tools for automatic morphological analysis. All data should be accessible for online correction.
c) Corpora of texts, both modern and archival.
d) A dynamic representation of linguogeographic information: the ability to build maps for individual linguistic features contained in the dictionaries, grammars, and corpora of the languages and dialects.
Since all the components described are, generally speaking, databases, the format of data representation should include forming database queries and then visualizing the query results.
Accordingly, a linguogeographic database containing information on the distribution of linguistic phenomena across settlements (an attributive database) should be built, as well as a database containing the coordinates of settlements. Ultimately, the system must be able to obtain additional analytical information and to build new objects on the basis of existing ones by querying other components of the description. Another important task is the extraction of features that distinguish certain groups of selected linguistic phenomena (discriminant analysis). In the project RGNF No. 15-04-00370, "Developing questionnaires for collecting materials for the integrated description of minority Turkic languages and dialects of Russia", the task of developing a system of dialectological questionnaires for collecting material for the Electronic Atlas of Turkic languages within the framework of their integral description was largely completed, and trial responses to these questionnaires were obtained for a number of idioms. In the course of research under RHSF grant 15-04-00361а, “First written monuments in Uralic and Altaic languages”, we have already found more than a thousand texts written in Uralic or Altaic languages in the 19th century. The overwhelming majority of the books were translations made by the Translation Committee of the Orthodox Missionary Society. This project plans to draw on the first Cyrillic texts, published in the 19th and early 20th centuries, in 9 Turkic languages (Chuvash, Tatar, Crimean Tatar, Bashkir, Altai, Shor, Kazakh, Uzbek, Yakut). These texts are to be glossed; after that they can be converted into dictionaries in the virtual laboratory LingvoDoc (http://lingvodoc.ispras.ru), and we then plan to establish etymological links with modern dictionaries of Turkic languages.
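The attributive database and the settlement-coordinate database described above can be pictured as two tables joined on a settlement identifier, with a map layer produced by a query for one feature. The following is only a minimal sketch under that assumption: the table names, column names, feature labels, settlement names, and coordinates are hypothetical placeholders, not the project's actual schema or data, and SQLite is used purely for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Settlement coordinates (placeholder schema).
        CREATE TABLE settlements (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            lat  REAL NOT NULL,
            lon  REAL NOT NULL
        );
        -- Attributive data: value of a linguistic feature per settlement.
        CREATE TABLE features (
            settlement_id INTEGER NOT NULL REFERENCES settlements(id),
            feature       TEXT NOT NULL,
            value         TEXT NOT NULL
        );
        """
    )
    # Placeholder rows, invented for the example only.
    conn.executemany(
        "INSERT INTO settlements VALUES (?, ?, ?, ?)",
        [
            (1, "Settlement A", 55.0, 47.0),
            (2, "Settlement B", 55.5, 47.5),
            (3, "Settlement C", 56.0, 48.0),
        ],
    )
    conn.executemany(
        "INSERT INTO features VALUES (?, ?, ?)",
        [
            (1, "feature_x", "variant_1"),
            (2, "feature_x", "variant_2"),
            (3, "feature_x", "variant_1"),
            (1, "feature_y", "variant_1"),
        ],
    )
    conn.commit()
    return conn


def map_points(conn, feature):
    """Return (name, lat, lon, value) tuples for one feature, ordered by settlement id."""
    return conn.execute(
        """
        SELECT s.name, s.lat, s.lon, f.value
        FROM features AS f
        JOIN settlements AS s ON s.id = f.settlement_id
        WHERE f.feature = ?
        ORDER BY s.id
        """,
        (feature,),
    ).fetchall()


def group_by_value(points):
    """Group map points by feature value, e.g. one marker style per value."""
    groups = {}
    for name, lat, lon, value in points:
        groups.setdefault(value, []).append((name, lat, lon))
    return groups
```

Calling `group_by_value(map_points(build_demo_db(), "feature_x"))` yields one list of located settlements per feature value, which a mapping front end could then draw as separate marker classes.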
Further study of the material on the basis of the developed feature system, refinement of that system, and its presentation on electronic maps, in order to establish the importance of individual features and their combinations for genetic and areal classification, will constitute the main content of the work in the planned project.
The project should result in a set of electronic maps reflecting the values of those linguistic features that most clearly bring out the main classificatory parameters of the Turkic languages in phonetics, phonology, morphonology, morphology, and lexicon.
The authors have previously worked on the comparative-historical and typological description of Turkic languages and dialects and have publications on the subject. They have carried out fieldwork on Turkic languages and dialects and computer processing of field data and, as noted above, have processed a certain amount of data using the proposed methodology. They are also compiling corpora of Turkic languages with accompanying computerized descriptions, as well as etymological databases for the Turkic languages. A number of methodological techniques were worked out by the team in earlier projects, mainly on dialects of Chuvash and on the languages of Southern Siberia.
RSF (Russian Science Foundation) grant |
# | Period | Title |
1 | 11 January 2018 - 31 December 2020 | Creation of an electronic dialectological atlas of the Turkic languages of Russia |
Stage results: |