ON ONE APPROACH OF FORECASTING NATURAL DISASTERS WITH THE SYSTEM OF PATTERN RECOGNITION WITH LEARNING

A number of recognition problems in different fields can be solved with the system of pattern recognition with learning (SPRL) elaborated by us. Forecasting natural disasters (floods, mudslides) in a given year, a fixed region, and a given period is one of them. To solve it, the problem is stated in terms of pattern recognition with learning: learning descriptions must be determined in advance for the same region in previous years, using the data of the 12 months preceding the period. From the learning descriptions, the control descriptions are separated first, then the variants of learning and learning recognizable descriptions. In addition, descriptions must be determined for the given year in the same region, using the data of the same preceding period (the first model). After transformation and increasing the informativity of the learning descriptions, the knowledge and data bases are determined for the learning recognizable and control descriptions in relation to the variants and classes (the second model). Using them, one decision on class membership is made for each learning recognizable description, whereas for each control description primary decisions are made first, one per variant, and then a single decision is made on their basis. The decision on the occurrence or non-occurrence of a natural disaster in the same region and period is made precisely from the results of recognizing the control descriptions (the third model). The article discusses the arguments supporting this. The third model also provides for the correction of the data bases with respect to variants and classes and determines the effectiveness of the SPRL and its detector of trust. Given the specifics of forecasting, initial data covering at least 5 years are required to select the best knowledge and data bases with which a disaster is to be forecast.

Introduction. Pattern recognition is one of the priority areas of cybernetics, in which problems of pattern recognition with learning occupy an important place. Forecasting natural disasters (floods, mudslides) is also a significant problem. To solve it with the system of pattern recognition with learning (SPRL) [1,2] elaborated by us, we have to state it in terms of pattern recognition with learning.
Suppose we have descriptions of a phenomenon that belong to two classes (occurrence and non-occurrence of a natural disaster). Let us call the first class the set of descriptions that correspond to the occurrence of a natural disaster, and the second class the set of descriptions that correspond to its non-occurrence. To predict whether a natural disaster will occur in the given year $t_0$, region, and time period $T_0$, it is necessary to have learning descriptions determined in advance from the data of the 12 months preceding the period $T_0$ in this region in other years. Based on these learning descriptions, knowledge and data bases should be determined for each class such that they can be used to recognize, for example to within a 30-day period, whether the same natural disaster will occur in the year $t_0$ (as well as in other years) in the same fixed region and the given time period $T_0$. Thus, to solve this problem, i.e. to predict a natural disaster, the initial learning descriptions must be determined in advance. Moreover, the recognizable descriptions determined from the data of the 12 months preceding the period $T_0$ in the year $t_0$ in the same region should be recorded in the same codes, i.e. in the same "language", as the learning descriptions.
Learning descriptions and their transformation. In the case of objects, a learning description [3,4] is a sequence of parameter values (features) characterizing an object whose class is known in advance. Such a sequence of parameter values (a description) is also called a realization [1,5]. It is a vector with m components, where m is the number of parameters. In the case of a natural disaster, the fact of occurrence or non-occurrence of the phenomenon in the fixed region and the given period $T_0$ must be known. In such a case, let us call this period a learning zero block. The sequence of values of the characteristic parameters of the phenomenon over the 12 months preceding this learning zero block $T_0$, taken for each month and for the region, in the case of occurrence or of non-occurrence of a disaster, will be called a learning description.
Thus, to recognize objects as well as to forecast a natural disaster (phenomenon) with the system of pattern recognition with learning (SPRL) [1,2], it is necessary to have appropriate learning descriptions in advance.
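A minimal sketch of how such descriptions might be held in memory is given here. The class name `Description`, its fields, the class codes, and the sample numbers are all hypothetical and are not taken from the SPRL; flattening the monthly rows into one vector is likewise an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

OCCURRENCE = 1      # first class: the disaster occurred in the zero block
NON_OCCURRENCE = 2  # second class: the disaster did not occur


@dataclass
class Description:
    """Parameter values for the 12 months preceding a zero block (hypothetical layout)."""
    year: int
    monthly_values: List[List[float]]  # 12 rows, one per month, same parameters in each row
    label: Optional[int] = None        # known class for a learning description, else None

    def as_vector(self) -> List[float]:
        """Flatten the monthly rows into a single vector with m components."""
        return [value for month in self.monthly_values for value in month]

    def is_learning(self) -> bool:
        """A learning description is one whose class is known in advance."""
        return self.label is not None


# Invented data: two parameters per month for a past year and for the year to forecast.
past = Description(2015, [[float(m), 10.0 + m] for m in range(12)], OCCURRENCE)
new = Description(2021, [[0.5 * m, 9.0 + m] for m in range(12)])
```

Both kinds of description share one layout, which mirrors the requirement that recognizable descriptions be recorded in the same "language" as the learning descriptions.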
The SPRL is implemented on a personal computer and has been verified on real data for recognizing objects of different classes (satellites, airplanes, helicopters, and their types), for the classification of irises, etc. The system contains: 1) the model of preliminary elaboration of the initial information, 2) the learning model, and 3) the recognition model. The first two models each include the methods and algorithms for solving 7 objectives, and the third model those for solving 8 objectives.
This system can also be used to forecast a natural disaster, taking its specificity into account. That is, it will be possible to forecast whether a natural disaster is to be expected in the given year, in the fixed region, and in the given period of time $T_0$ (the zero block). For this, the data of previous years in the same region for the 12 months preceding the period $T_0$ (the learning zero block) must be known. The data should be represented as sequences of values of the relevant characteristic parameters of the phenomenon, i.e. as descriptions. By these data we mean the features (parameter values) characterizing the phenomenon in the appropriate period, which may include the characteristic features that determine the occurrence and non-occurrence of natural disasters. In addition, the system must preliminarily elaborate the data corresponding to the 12 months preceding the learning zero block, i.e. a zero block in previous years, for the same region and the same period $T_0$, for which the occurrence or non-occurrence of a natural disaster is known in advance. These data must be given separately for the case of occurrence and for the case of non-occurrence. Hence, for each class separately we must have the learning zero block and the learning descriptions corresponding to the 12 months preceding this learning zero block in previous years and in the given region.
It is from these data that the learning descriptions should be determined by the method proposed in [6], which will be included in the first model of the SPRL.
The method provides for choosing the initial parameters, dividing each of the 12 months preceding the learning zero block into small periods (learning blocks), dividing the 24-hour interval into two intervals, and determining primary, additional, and formal additional parameters. Based on the values of these parameters, primary learning descriptions are determined for each class in relation to each of the 12 months preceding the learning zero block, the learning blocks included in them, and the time intervals mentioned above. From them, the vector-optimization method [7] selects the primary learning descriptions that belong to the Pareto set. The learning descriptions determined in this way are passed to the sixth objective of the first model of the SPRL (the model of preliminary elaboration of the initial information). It uses random numbers to separate the control descriptions from the learning descriptions and then, from the remaining descriptions, uses random numbers and the slipping control procedure to determine the variants of learning recognizable and learning descriptions for both classes. From here on, by variants we mean variants of learning recognizable or learning descriptions. The learning descriptions are then passed to the learning model, which transforms the descriptions included in the variants with respect to informativity. Obviously, if the learning descriptions are determined from the values of weakly informative parameters, it can be difficult or even impossible to determine knowledge and data bases suitable for forecasting the natural disaster in question.
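The separation of control descriptions and the formation of variants can be sketched as follows. This is only an illustration under stated assumptions: the slipping control procedure is approximated here by a fold-wise hold-out over the shuffled remainder, and the function name, its arguments, and the dictionary keys are hypothetical. In the SPRL the procedure is applied to each class separately.

```python
import random


def split_control_and_variants(descriptions, n_control, n_variants, seed=0):
    """Separate control descriptions, then form variants from the remainder.

    Assumption: each variant holds out a different fold of the remainder as
    "learning recognizable" descriptions and keeps the rest as learning ones.
    """
    rng = random.Random(seed)          # random numbers drive both separations
    shuffled = list(descriptions)
    rng.shuffle(shuffled)

    control = shuffled[:n_control]     # never used to build knowledge/data bases
    rest = shuffled[n_control:]

    variants = []
    for i in range(n_variants):
        held_out = [d for j, d in enumerate(rest) if j % n_variants == i]
        learning = [d for j, d in enumerate(rest) if j % n_variants != i]
        variants.append({"learning": learning, "learning_recognizable": held_out})
    return control, variants


# Hypothetical usage: 20 descriptions of one class, identified by number.
control, variants = split_control_and_variants(list(range(20)), n_control=4, n_variants=4)
```

With this layout no control description appears in any variant, and every remaining description is held out in exactly one variant.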
On the other hand, since these are learning descriptions, it is known whether the corresponding natural disaster occurred in the indicated learning zero block. This means that there must be reasons that condition the occurrence (or non-occurrence) of the disaster. We must search for these reasons in the internal hidden connections that actually exist between the parameters characterizing the disaster over the 12 months preceding the learning zero block, both in the case of occurrence and in the case of non-occurrence of the phenomenon. These connections are not given explicitly in the learning descriptions. To reveal them, the descriptions must be transformed. All transformation procedures that we apply to the learning descriptions in relation to the zero block are applied to the learning descriptions of both classes, i.e. in the case of occurrence and of non-occurrence of a disaster.
Based on the variants of the above-mentioned learning descriptions, the learning model determines these connections, i.e. new formal (artificial) parameters (functions), with respect to informativity. For this, the learning model uses balanced incomplete and partially balanced incomplete block designs (BIB- and PBIB(2)-designs) [8,9], geometrical configurations [10], and the vector-optimization method of choice [7]. It is the parameters determined in this way (and their values) that show the inner hidden connections between the initial parameters (and, accordingly, between their values) [1]. They are determined by the method of solving the second objective of the learning model. According to this method, for example, in the case when we have [...]. Let us call it the primary BIB-design and its blocks the primary blocks. If we enumerate the primary blocks and determine from them a configuration with the same parameters (7, 4, 1), we get the following secondary BIB-design: [...]. Each block of the secondary BIB-design, i.e. each extended block, contains 3 primary blocks. The extended blocks are separated from each other by two brackets and a dot. Each learning description is transformed into the secondary BIB-design determined in this way: in each primary block, the values of the parameters it contains are recorded from each learning description in relation to the variants and classes. Each extended block can therefore be represented as a geometrical configuration. In our case, i.e. when K = 3, it is represented as a triangle whose vertex coordinates are the values of the parameters included in each primary block. Artificial parameters are determined by means of the geometrical configurations (triangles). They can be trigonometric functions, their combinations, median ratios, etc., and even heuristically determined functions, i.e. any functions or formulas that characterize the geometrical configurations and at the same time show the connections between their vertex coordinates.
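The construction of primary and extended blocks can be illustrated with a short sketch. As an assumption, it uses the well-known design on 7 points with blocks of size 3 in which every pair of points occurs in exactly one block (the Fano plane), chosen because its blocks contain K = 3 elements; the designs actually used in the SPRL are not reproduced here. Reusing the same design on the block numbers to form the extended blocks is also an assumption, and all function names are hypothetical.

```python
from itertools import combinations

# Assumed primary design: 7 parameters, blocks of size 3, every pair exactly once.
PRIMARY_BLOCKS = [
    (1, 2, 3), (1, 4, 5), (1, 6, 7),
    (2, 4, 6), (2, 5, 7), (3, 4, 7), (3, 5, 6),
]


def is_bib_design(blocks, v, k, lam):
    """Check that every block has k distinct points and every pair occurs lam times."""
    if any(len(set(block)) != k for block in blocks):
        return False
    counts = {pair: 0 for pair in combinations(range(1, v + 1), 2)}
    for block in blocks:
        for pair in combinations(sorted(block), 2):
            if pair not in counts:
                return False
            counts[pair] += 1
    return all(count == lam for count in counts.values())


def secondary_design(primary_blocks, index_design):
    """Form extended blocks: each one lists the primary blocks with the given numbers."""
    return [tuple(primary_blocks[i - 1] for i in numbers) for numbers in index_design]


def fill_block(block, description):
    """Record in a primary block the values of the parameters it names (1-based)."""
    return tuple(description[p - 1] for p in block)


extended = secondary_design(PRIMARY_BLOCKS, PRIMARY_BLOCKS)
# Hypothetical learning description with 7 parameter values.
vertices = [fill_block(block, [10, 20, 30, 40, 50, 60, 70]) for block in extended[0]]
```

Each extended block then yields three filled primary blocks, which can be treated as the three vertices of a triangle.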

RS Global
When the geometrical configuration is a triangle whose vertices are the elements (A, B, C) of an extended block, i.e. the primary blocks [...], some functions (artificial parameters) determined by means of this geometrical configuration look as follows: [...]. Thus, 7 triangles correspond to the above-mentioned secondary BIB-design. Next, from the artificial parameters determined for each side and angle of the triangles, those parameters are selected whose identical values occur, if not in all, then in most or at least in a certain number of the learning descriptions of one and the same class only. If necessary, parameters with large informativity measures can be added to them. Determining the informativity measures of the parameters is the function of the fifth objective of the learning model. From all the parameters determined in this way, the best ones are chosen separately for each class included in the corresponding variant. The criterion of choice takes into account single parameters as well as pairs of parameters whose values characterize only one and the same class. After the values of the selected parameters are divided into intervals, they are recorded in new codes, i.e. in a new "language" (the third objective of the learning model). From them, using single and pair codes and the elaborated criteria, the best parameters are chosen. This is how the learning model transforms the learning descriptions in relation to the classes included in each variant.
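As an illustration of artificial parameters derived from a triangle, the sketch below computes side lengths, angles, their sines, and medians from three vertices given as coordinate tuples. The particular functions used in the SPRL are not reproduced here; this selection and the function name `triangle_parameters` are assumptions.

```python
import math


def triangle_parameters(a, b, c):
    """Return sides, angles (radians), sines of angles, and medians of triangle ABC."""
    def dist(p, q):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

    ab, bc, ca = dist(a, b), dist(b, c), dist(c, a)
    if min(ab, bc, ca) == 0.0:
        raise ValueError("degenerate triangle: two vertices coincide")

    def angle(opposite, s1, s2):
        # Law of cosines; the clamp guards against rounding just outside [-1, 1].
        cos_value = (s1 ** 2 + s2 ** 2 - opposite ** 2) / (2.0 * s1 * s2)
        return math.acos(max(-1.0, min(1.0, cos_value)))

    def median(opposite, s1, s2):
        # Length of the median drawn to the side `opposite`.
        return 0.5 * math.sqrt(max(0.0, 2.0 * s1 ** 2 + 2.0 * s2 ** 2 - opposite ** 2))

    angles = (angle(bc, ab, ca), angle(ca, ab, bc), angle(ab, bc, ca))     # at A, B, C
    medians = (median(bc, ab, ca), median(ca, ab, bc), median(ab, bc, ca))  # from A, B, C
    return {
        "sides": (ab, bc, ca),
        "angles": angles,
        "sines": tuple(math.sin(x) for x in angles),
        "medians": medians,
    }


# Hypothetical vertices: each one holds the three parameter values of a primary block.
params = triangle_parameters((0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0))
median_ratio = params["medians"][0] / params["medians"][1]
```

Ratios such as `median_ratio` depend on all three vertices at once, which is the sense in which such functions can express connections between the original parameter values.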
Knowledge and Data Bases. After the learning descriptions are transformed, they are recorded in the new language in relation to the variants. This means that the learning descriptions included in different variants are recorded in the corresponding (different) "languages". The relevant knowledge and data bases are determined from them. For each learning recognizable description, the knowledge base is determined from the learning descriptions included in the one corresponding variant only, and from this knowledge base one data base is determined. For each control description, knowledge bases are determined from the learning descriptions of every variant, each variant taken separately. Thus, in this case as many knowledge bases are obtained as there are variants, and from them the same number of data bases. Consequently, for each control description the number of knowledge and data bases equals the number of variants.
The knowledge base contains all the formulas (functions) and values used for the transformation of the learning descriptions; it also contains the "language" in which the learning descriptions included in each variant are recorded.
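Recording parameter values in a new "language" after dividing them into intervals can be sketched as below. Equal-width intervals are an assumption, since the way the SPRL chooses interval boundaries is not described here; the function names are hypothetical. The stored boundaries play the role of the part of a knowledge base that lets later descriptions be encoded in the same way.

```python
from bisect import bisect_right


def make_language(values, n_intervals):
    """Return the inner boundaries of n_intervals equal-width intervals over values."""
    low, high = min(values), max(values)
    if high == low:
        return []                      # a constant parameter gets a single code
    step = (high - low) / n_intervals
    return [low + step * i for i in range(1, n_intervals)]


def encode(value, boundaries):
    """Code of a value = index of the interval it falls into (0-based)."""
    return bisect_right(boundaries, value)


# Hypothetical values of one artificial parameter over the learning descriptions.
boundaries = make_language([0.0, 10.0], n_intervals=5)
codes = [encode(v, boundaries) for v in (0.0, 2.0, 9.9, 10.0)]
```

Values outside the range seen during learning fall into the first or last interval, so a new recognizable description can always be encoded with the stored boundaries.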
The data base, which is determined from the above-mentioned knowledge base, contains characteristics, i.e. parameter (function) values or features, of both classes, but separately for the first and the second class: single features, their pairs, triplets, and specific combinations of features (groups). These groups contain combinations of characteristic and non-characteristic features of the classes. The triplets and these combinations are determined by means of the above-mentioned combinatorial schemes, without exhaustive search. The number of triplets and groups is therefore significantly reduced, which makes it feasible to use them. At the same time, the learning model determines thresholds for each class separately, based on the sum of the informativity measures of the features in relation to the variants and classes.
Besides the above-mentioned characteristics (single features, their pairs, triplets, and specific combinations), the data base also contains values that evaluate these characteristics. They are defined as follows. If a feature characterizes only one class, it is called a maximally informative feature (mif), and the number of descriptions containing it is called the characteristic value of this feature. If a feature characterizes only one class and at the same time occurs in all descriptions of this class, it is called a maxmif (maximum maximally informative feature). When a feature occurs in the descriptions of more than one class (but not in all descriptions of all classes), it is called an informative feature, which is characterized by an informativity measure. The learning model calculates these measures from the characteristics of the features that are not mifs. The data base is determined separately for each class included in each variant of learning descriptions.
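The definitions of mif, maxmif, and informative feature can be expressed as a short sketch. Descriptions are represented as sets of coded features; this representation, the function name, and the label "uninformative" for a feature present in every description of every class are assumptions. The informativity measure itself is not computed, because its formula is not given here.

```python
def classify_features(descriptions_by_class):
    """Classify each feature as maxmif, mif, informative, or uninformative.

    descriptions_by_class: {class_label: [set_of_coded_features, ...]}
    """
    all_features = set()
    for descriptions in descriptions_by_class.values():
        for description in descriptions:
            all_features |= description

    result = {}
    for feature in all_features:
        counts = {
            label: sum(feature in d for d in descriptions)
            for label, descriptions in descriptions_by_class.items()
        }
        present = [label for label, n in counts.items() if n > 0]
        if len(present) == 1:
            label = present[0]
            in_every = counts[label] == len(descriptions_by_class[label])
            result[feature] = {
                "kind": "maxmif" if in_every else "mif",
                "class": label,
                "characteristic_value": counts[label],
            }
        else:
            in_all = all(
                counts[label] == len(descriptions_by_class[label]) for label in counts
            )
            result[feature] = {
                "kind": "uninformative" if in_all else "informative",
                "class": None,
                "counts": counts,
            }
    return result


# Hypothetical coded descriptions for class 1 (occurrence) and class 2 (non-occurrence).
features = classify_features({
    1: [{"a", "b", "c", "e"}, {"a", "c"}],
    2: [{"c", "d", "e"}, {"c"}],
})
```

In this invented example the feature "a" occurs in every description of class 1 and nowhere else, so it is a maxmif with characteristic value 2.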
The process of transforming the learning descriptions and determining the knowledge and data bases is called the machine learning process. The knowledge and data bases determined in this way are passed to the recognition model.
Choice of knowledge and data bases and forecasting of natural disasters. As soon as the recognition model receives the knowledge and data bases from the learning model, it uses the knowledge bases to transform the learning recognizable descriptions and the control descriptions. In the first case, it uses the knowledge base determined from the learning descriptions of the one corresponding variant. In the second case, i.e. for each control description, the recognition model uses the knowledge base of each variant in turn. Therefore, the learning recognizable descriptions included in each variant are recorded in only one "language" in relation to the classes, whereas each control description is represented by as many descriptions as there are variants, each recorded in the "language" of the knowledge base of the corresponding variant, also in relation to the classes.
The same holds for the data bases. In the first case, i.e. to recognize each learning recognizable description, the data base determined from the one corresponding variant is used. In the second case, i.e. to recognize the control descriptions, the data bases of all variants are used for each of them, also in relation to the classes.
Thus, in the first case, one decision is made to recognize each learning recognizable description. In the second case, for each control description, primary decisions are made, one per variant of learning descriptions, and then a single decision is made on their basis [1]. The same happens when, instead of control descriptions, we have new recognizable descriptions determined from the 12 months preceding the zero block. In this case, one and the same description is recognized by decisions made with the knowledge and data bases of all the variants, which contain different learning descriptions. Obviously, recognition made on this basis is more reliable than a single decision made from one variant with only one knowledge base and one data base.
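How several primary decisions might be reduced to one decision can be sketched with a simple majority vote. This is a stand-in and not the criterion of [1], which is not reproduced here; the function name and the returned `agreement` value (the share of variants that agree) are hypothetical and should not be confused with the degree of belonging used by the SPRL.

```python
from collections import Counter


def final_decision(primary_decisions):
    """Combine per-variant class decisions by majority vote (assumed rule).

    Returns (class_label or None on a tie, share of variants voting for the leader).
    """
    if not primary_decisions:
        raise ValueError("at least one primary decision is required")
    ranked = Counter(primary_decisions).most_common()
    top_label, top_votes = ranked[0]
    agreement = top_votes / len(primary_decisions)
    if len(ranked) > 1 and ranked[1][1] == top_votes:
        return None, agreement         # no single class wins
    return top_label, agreement


# Hypothetical primary decisions for one control description over five variants.
decision, agreement = final_decision([1, 1, 2, 1, 1])
```

A single dissenting variant does not overturn the result, which illustrates why several variants can give a more reliable decision than one.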
In accordance with the characteristics included in the data bases, 5 types of criteria are used to make a primary decision: the first type of criterion uses the maxmifs of the appropriate class; the second type, maxmifs and mifs; the third type, mifs; the fourth, mifs and the sums of the informativity measures of features; the fifth, informativity measures and a threshold determined for each class on the basis of informativity measures [1]. These criteria also involve other values that the recognition model needs to make the primary decisions, for example the numbers of maxmifs, mifs, learning descriptions, etc.
Using the appropriate criteria [1] and the primary decisions (which take into account how many of the recognizable descriptions belong to the class and with what degree of belonging), the recognition model makes a decision on belonging to the corresponding class, with a degree of belonging. It is precisely from the results of recognizing the control descriptions that the recognition model makes a decision on the occurrence or non-occurrence of a disaster in the zero block. The reason is that, after the learning descriptions are transformed, the knowledge and data bases determined from them contain new functions, i.e. artificial parameters. These functions, as already mentioned, indicate the internal hidden connections that exist between the values of the parameters given in the descriptions of each class in relation to the variants. This means that the learning descriptions contain the maxmifs, mifs, and other characteristic features that correspond to the occurrence or non-occurrence of a disaster in the learning zero block. Moreover, these are determined for each of the 12 months preceding the learning zero block, separately for occurrence and non-occurrence, i.e. for each class separately. If recognizable descriptions are correctly recognized with the same knowledge and data bases, they too should contain maxmifs, mifs, and other characteristic features that correspond to the occurrence or non-occurrence of a disaster in the zero block (i.e. in a period for which the occurrence or non-occurrence of a disaster is not known in advance and which did not participate in determining the knowledge and data bases).
Therefore, when control descriptions are recognized correctly, their recognition indicates that the same connections (artificial parameters and their values) correspond to the conditions of the 12 months preceding the learning zero block in the learning process. This correspondence in turn indicates that the occurrence or non-occurrence of a disaster in the zero block corresponds to that in the learning zero block, where it is known in advance.
Thus, the condition of the 12 months preceding the zero block is studied first, and the hidden connections between the parameters characterizing the disaster, taken by month, are determined for the cases of occurrence and non-occurrence. Then, in the following year, in the same region and zero block, this situation will affect the occurrence or non-occurrence of a disaster. This is also confirmed by the fact that, since the data bases are determined in relation to variants and classes, after the transformation it is clear from the data bases of which class the control descriptions are recognized; consequently, the occurrence or non-occurrence of a disaster in the zero block is recognized. The corresponding degree of belonging is calculated from the results obtained with the primary decisions [1].
In fact, the results of recognizing the learning recognizable descriptions (which are determined using the slipping control procedure) and the control descriptions indicate what the results of recognizing new recognizable descriptions in the same region and period $T_0$ will be. The reason is that the first model separates the control and learning recognizable descriptions from the primary learning descriptions at the very beginning. The fact of occurrence or non-occurrence of a disaster in the learning zero block is therefore treated as unknown for them.
Consequently, the control and learning recognizable descriptions do not participate in forming the corresponding knowledge and data bases in relation to the variants. This means that the learning zero block plays the role of the zero block, and the control and learning recognizable descriptions play the role of new unknown descriptions determined for the 12 months preceding the zero block. Moreover, they are recognized with the knowledge and data bases determined in the learning process (in which the control and learning recognizable descriptions do not participate).
Learning recognizable descriptions should be recognized in the learning process, using the knowledge and data bases determined from previous years. Control descriptions, as well as new recognizable descriptions corresponding to the zero block, are recognized when the system completes its work. It is from the results of recognizing the control descriptions that the occurrence or non-occurrence of a disaster in the zero block is recognized. This is justified because these results show how correctly the knowledge and data bases were determined from the learning descriptions corresponding to the occurrence or non-occurrence of a disaster, how well the situation in the period preceding the zero block is prepared for the occurrence or non-occurrence of a disaster in the same region and period, and to what extent the same disaster can be recognized in the same region and period in the following year with the knowledge and data bases determined in the learning process.
As for the learning recognizable descriptions, their separation from the initial learning descriptions and their recognition help the learning model to form appropriate knowledge and data bases, as many as there are variants, and to correct the data bases. They also help the recognition model to make the primary decisions, one per variant, and to determine the detector of trust of the system. The results of recognizing the learning recognizable and control descriptions are reflected in the effectiveness of the SPRL and in its detector of trust, both elaborated by the recognition model.
The approach proposed above implies that, for each month, the learning process is conducted on the basis of the preceding 12 months, and the learning descriptions as well as the knowledge and data bases are determined separately for it. The results of recognizing the learning recognizable, control, and new recognizable descriptions participate in the specification (correction) of these data bases. The recognition model corrects the data bases according to these results as follows: for the first two kinds of descriptions, the data bases are corrected if the descriptions belong to the class with any degree of belonging; for the third kind, they are corrected only when the new recognizable descriptions belong to the corresponding class with a degree of belonging equal to one. Hence, the more often the system works, the more reliable the subsequent recognition results become, both for the occurrence and for the non-occurrence of a natural disaster, and the more convincing the evaluation (recognition) of the possibility of a disaster.

Unlike object recognition, forecasting (recognizing) a disaster in the given region and period requires, because of its complexity and specificity, the data of at least 5 years for each class separately, i.e. for the cases of occurrence and non-occurrence of a disaster. The knowledge and data bases are determined from these data. Of them, only those on the basis of which a correct decision was made about the occurrence or non-occurrence of natural disasters are used, each separately, for forecasting natural disasters in the following years. If the decisions made with different knowledge and data bases coincide, the degrees of belonging of these descriptions increase according to formula (1): [...], where [...] is the degree of belonging of a description to a class. The effectiveness of the system and its detector of trust increase accordingly (their determination from the degree of belonging to a class and from the results of recognizing the learning recognizable and control descriptions is provided for in the recognition model). To choose the knowledge and data bases for forecasting natural disasters in the following years, we treat the effectiveness of the system and its detector of trust as characteristics of the knowledge and data bases and denote them by E and T, respectively. We then use the vector-optimization method, according to which the bases that belong to the Pareto set are considered the best knowledge and data bases.
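Selecting the Pareto set over the pair (E, T) can be sketched as follows. The sketch assumes that larger values of both E and T are better; the function name, the base names, and the numbers are invented for illustration, and the vector-optimization method of [7] may differ in detail.

```python
def pareto_set(bases):
    """Return the names of the bases whose (E, T) pair is not dominated by another.

    bases: {name: (E, T)}, where larger E and larger T are both assumed better.
    """
    def dominates(p, q):
        # p dominates q if it is at least as good in both and strictly better in one.
        return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))

    return sorted(
        name
        for name, score in bases.items()
        if not any(dominates(other, score) for other in bases.values())
    )


# Hypothetical effectiveness E and detector of trust T for four knowledge/data bases.
best = pareto_set({
    "kb1": (0.90, 0.70),
    "kb2": (0.80, 0.85),
    "kb3": (0.70, 0.60),
    "kb4": (0.90, 0.70),
})
```

Here "kb3" is worse than "kb1" in both characteristics and is discarded, while the other three are kept because none of them is better than another in both E and T.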
The vector-optimization method of choice is also used when there are several knowledge and data bases and the decisions based on one part of them indicate the occurrence of a natural disaster, while those based on another part indicate its non-occurrence in the same region and zero block (which will not happen if the initial parameters are chosen correctly from the beginning). In that case, the knowledge and data bases are divided into two groups, and the vector-optimization method of choice is applied to each group separately. The detector of trust and the effectiveness of the system increase for each group separately according to formula (1). Treating them as the components of a vector for the group in which they were determined, we compute the lengths of these vectors. The knowledge and data bases of the group with the greater vector length (the effectiveness of the system closer to 1 and the detector of trust T > 0.8) are used the next year to forecast the occurrence or non-occurrence of the same natural disaster in the same region and period, i.e. the zero block. (These numbers are tentative; they should be determined from statistics once real data are used.)
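The comparison of the two groups by vector length can be sketched as below, taking the Euclidean length of the vector (E, T) for each group. The group names and the numbers are invented, and the function name is hypothetical.

```python
import math


def choose_group(groups):
    """Pick the group whose (E, T) vector is longest.

    groups: {group_name: (E, T)}; returns (best_group_name, {name: length}).
    """
    lengths = {name: math.hypot(e, t) for name, (e, t) in groups.items()}
    best = max(lengths, key=lengths.get)
    return best, lengths


# Hypothetical values for the groups deciding "occurrence" and "non-occurrence".
best_group, lengths = choose_group({
    "occurrence": (0.90, 0.85),
    "non_occurrence": (0.60, 0.80),
})
```

The knowledge and data bases of the selected group would then be the ones carried forward to the next year's forecast.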
After the knowledge and data bases have been determined from real data, the recognition model should also recognize the descriptions determined from the real data of the period preceding the same zero block in the same region in other years. Then, from the recognition results, the recognition model forecasts the natural disaster, as in the case of the control descriptions (for which the occurrence or non-occurrence of a disaster was likewise not known). Once real data are used, it may become necessary to change or extend the initial parameters provided for in the method of determining the initial learning descriptions [6].
Conclusions. To forecast natural disasters (floods, mudslides) in the given year, region, and period $T_0$, the first model determines learning descriptions from the data of other years in the same region and period. From them, the control descriptions are separated first, then the variants of learning and learning recognizable descriptions. After increasing the informativity of the learning descriptions in the variants by means of combinatorial schemes, the second model determines artificial parameters that show the hidden connections between the parameter values given in the learning descriptions. From the learning descriptions, in relation to variants and classes, it determines one knowledge base for each learning recognizable description and as many knowledge bases as there are variants for each control description. From the knowledge bases, the same number of data bases is determined in the second case, in relation to variants and classes, and one data base in the first case. Based on the knowledge and data bases, the third model, with the help of the corresponding criteria, makes one decision on class membership for each learning recognizable description and, for each control description, primary decisions, one per variant, followed by a single decision on their basis. Based on the corresponding arguments and on the results of recognizing the control descriptions, the third model forecasts a natural disaster in the given region and period. After correcting the data bases from the results of recognizing the control and learning recognizable descriptions, this model determines the effectiveness of the system and its detector of trust, in which these recognition results are reflected. The learning process should be based on data covering at least 5 years. These data are presented separately, and knowledge and data bases are determined from them. From these, the bases are chosen with which the control descriptions are assigned to the corresponding class with the maximum degree of belonging in the learning process. When real data are used, it may become necessary to change the initial parameters, which is provided for in the method of determining the initial learning descriptions [6].