Electronic data representation has now spread to many different domains, and it has become important for music. Large numbers of audio, video and text files are stored on web servers worldwide. This work concentrates on files containing music notation (sheet music). A variety of music software has been developed to help composers and amateurs write down their ideas as sheet music. Just as MP3 files have become synonymous with sharing recorded music, MusicXML files have become the standard for sharing interactive sheet music. With MusicXML it is possible to write music notation in one program and share the result with people using other programs, because the format is supported by more than 150 applications, including Finale, MuseScore and Guitar Pro. There are systems that help to search for audio files by a fragment of a melody, by the name of the composer and/or by the title of the composition. However, retrieval of music notation itself is still in demand among Web users. We are interested in developing methods and algorithms that make it possible to search for editable music notation files for various instruments, given a fragment of musical notation. A MusicXML file is an XML file, so it is a structured document. Several methods have been developed for structured retrieval. However, the specifics of this domain do not allow such methods to be used directly. The retrieval method needs to take into account that, for notations of the same piece, the keys can differ, the sheet music can be written for different instruments, and the note durations can differ. That is why a special internal representation is needed for the query and for the files in the library. The vector space model can be used for such a search, but it requires storing a lot of supplementary information (an appropriate internal representation) in a fairly large database. The vectors have many dimensions and small coordinate values.
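To illustrate why an internal representation is needed, the following is a minimal sketch, not the representation used in this work. It assumes that a melody is reduced to pitch intervals between successive notes (unchanged by a change of key) and to ratios of successive durations (unchanged when all note values are scaled). The function names `read_notes`, `intervals` and `duration_ratios` are hypothetical; the sketch ignores chords, ties and multiple voices, and assumes a MusicXML file without XML namespaces.

```python
import xml.etree.ElementTree as ET

# Semitone offset of each natural step within an octave.
STEP_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def read_notes(musicxml_text):
    """Return (midi_pitch, duration) pairs for the pitched notes of a MusicXML string."""
    root = ET.fromstring(musicxml_text)
    notes = []
    for note in root.iter("note"):
        pitch = note.find("pitch")
        duration = note.findtext("duration")
        if pitch is None or duration is None:
            continue  # skip rests, unpitched notes and grace notes
        step = pitch.findtext("step")
        alter = int(float(pitch.findtext("alter", default="0")))
        octave = int(pitch.findtext("octave"))
        midi = 12 * (octave + 1) + STEP_TO_SEMITONE[step] + alter
        notes.append((midi, float(duration)))
    return notes


def intervals(notes):
    """Semitone steps between successive notes; identical for any transposition."""
    return [b[0] - a[0] for a, b in zip(notes, notes[1:])]


def duration_ratios(notes):
    """Ratios of successive durations; identical when all durations are scaled."""
    return [b[1] / a[1] for a, b in zip(notes, notes[1:])]
```

Under these assumptions, the fragment C-D-E and its transposition D-E-F sharp with doubled note values yield the same interval and ratio sequences, so the two can be compared directly.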
Defining a new similarity measure or adapting the cosine measure (or another one) helps to find a composition, but requires a full search over the internal database. Such a process takes quite a long time even on a local computer, and becomes extremely slow once delays in Internet connections are added. That is why we propose another approach. In information retrieval we face a situation in which we have two similar (but not identical) fragments: a search query and a candidate answer. We need to determine whether they represent the same composition. The task is quite similar to a mathematical one: in a proof by mathematical induction we have a hypothesis and a goal that are similar, and we need to rewrite the goal to prove that it is implied by the hypothesis. To solve the problem we can use rippling. The precondition for using the method is the following: we are rewriting one sentence into another, and the sentences have to be similar. The differences are annotated as a wave front, and the common parts form a skeleton. We rewrite the goal to reduce the differences, and the skeleton has to be preserved. A measure is proposed to estimate the reduction of differences. It has to become smaller with every rewriting step, and it equals 0 when all the differences have been removed. To make this method work for information retrieval in MusicXML, we first have to find a composition (or a set of compositions) that is similar to the search query by estimating the longest common subsequence of their notes. Then we annotate the common parts of the two music fragments as a skeleton, and the remaining part is called the wave front. The wave front has to be removed by rewriting rules, which are formed in advance from general rules known from music theory. The set of rules is also extended with rules derived from practical use of the search system. The rewriting stops in two cases: either all the differences have been removed or there are no applicable rules.
The first case means that the original fragments represent the same (or a similar) composition. In the second case we have to rate the composition as less relevant (or even not relevant) to the search query. The proposed method was implemented in a prototype system that searches for sheet music. As input, the system takes a MusicXML file containing a query, and the matching results are displayed as a list together with their similarity measure. The system has a web interface, but so far it works in test mode with a library of compositions stored on a local computer.
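The rewriting procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the prototype's implementation: fragments are plain lists of MIDI pitches, the measure is taken to be the number of notes outside the longest common subsequence, and the two rewriting rules (`merge_repeated`, `drop_neighbor_tone`) are hypothetical examples rather than the rules formed from music theory in this work.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence (the skeleton size)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def measure(query, candidate):
    """Assumed measure: notes of either fragment that lie in the wave front."""
    return len(query) + len(candidate) - 2 * lcs_length(query, candidate)


def merge_repeated(seq):
    """Hypothetical rule: two equal adjacent pitches are merged into one."""
    return [seq[:i] + seq[i + 1:] for i in range(len(seq) - 1)
            if seq[i] == seq[i + 1]]


def drop_neighbor_tone(seq):
    """Hypothetical rule: p, x, p becomes p when x is within two semitones of p."""
    return [seq[:i + 1] + seq[i + 3:] for i in range(len(seq) - 2)
            if seq[i] == seq[i + 2] and 0 < abs(seq[i + 1] - seq[i]) <= 2]


def ripple(query, candidate, rules):
    """Rewrite the candidate until the measure is 0 or no rule applies.

    A rewrite is accepted only if it keeps the skeleton at least as long
    and strictly decreases the measure, so the loop always terminates.
    Returns the remaining measure and the rewritten candidate.
    """
    current = list(candidate)
    m = measure(query, current)
    progress = True
    while m > 0 and progress:
        progress = False
        skeleton = lcs_length(query, current)
        for rule in rules:
            for rewritten in rule(current):
                new_m = measure(query, rewritten)
                if lcs_length(query, rewritten) >= skeleton and new_m < m:
                    current, m, progress = rewritten, new_m, True
                    break
            if progress:
                break
    return m, current
```

A remaining measure of 0 corresponds to the first stopping case (the fragments represent the same composition); a positive value corresponds to the second, and under this assumption it could be used to rank a candidate as less relevant.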