JUMPING FROG METHOD FOR OPTIMAL CLASSIFICATIONS

This article investigates the problem of finding optimal classifications on a finite set. It is shown that the problem of finding an optimal classification generated by a tolerance relation on a finite set can be reduced to an optimization problem on a set of permutations. A modification of the mixed jumping frogs method is proposed for finding suboptimal solutions of the classification problem.


Introduction.
Classification is a powerful scientific method. The classification problem arises in almost all areas of knowledge: in analyzing research results, in design and forecasting, and in assessment and decision making. Although it often has a simple formulation, the classification problem turns out to be quite complex and ambiguous. Moreover, attempts at classification sometimes give rise to interesting paradoxes associated with placing fundamentally different objects into one class.
The solution to the classification problem, as a rule, involves a significant degree of subjectivity, individual assessments, and fuzzy, informal conclusions. The priorities of the decision maker (DM) often influence the solution of this problem. This leads to the construction of fundamentally different classifications based on the same primary information. This situation arises especially often in those areas of knowledge in which numerical estimates cannot be used to classify objects and phenomena, so that fuzzy assessments and concepts such as "similar" have to be used.
Formulation of the problem. The aim of this work is to construct metaheuristics for finding a suboptimal classification defined by a tolerance relation on a finite set. This approach allows one to construct close-to-optimal partitions of sets in accordance with a relation of "proximity" between elements. Moreover, this proximity relation is not transitive. The proposed algorithms can be widely used in applied problems of classifying objects by a number of attributes. Such problems often arise in the economic, social and technical sciences.
Solution method and analysis of the results. From the point of view of mathematics, the classification problem can be considered from different positions. The main one is the set-theoretic approach when constructing a classification. However, in practice, it turns out that this approach is good only post factum, that is, for clarifying and formally describing an already constructed classification.

RS Global
The most widespread models to date are statistical classification models, which group objects according to the results of statistical data analysis [1,2,3]. Metric algorithms use a formalization of the concept of similarity between objects and the compactness hypothesis [2,3,4]. There is another principle: the so-called logical classification algorithms. This approach is based on the inductive inference of logical laws, or rule induction [5,6,7]. Classification models based on the tools of fuzzy set theory are becoming more widespread. A comparatively new direction is classification models based on integral mathematics. An interesting direction is the use of artificial intelligence methods for solving classification problems. An overview of existing recognition methods is given in the monograph [6].
Statement of the classification problem on a finite set.
By a partition of a finite set X we mean a set of its nonempty subsets $X_1, X_2, \ldots, X_n$ such that $X = \bigcup_{i=1}^{n} X_i$ and $X_i \cap X_j = \emptyset$ for $i \ne j$, $i, j = 1, 2, \ldots, n$. The classification problem on a finite set consists in finding a partition that has some given properties. A partition of a set defines the canonical equivalence relation associated with this partition. Namely, two elements are considered equivalent if they belong to the same element of the partition. On the other hand, it is easy to show that any equivalence relation on a finite set determines its partition into classes of elements equivalent to each other.
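Recall that an equivalence relation on a set X is a binary relation "$\sim$" with the following properties: 1) reflexivity: $x \sim x$ for all $x \in X$; 2) symmetry: $x \sim y \Rightarrow y \sim x$ for all $x, y \in X$; 3) transitivity: $x \sim y,\ y \sim z \Rightarrow x \sim z$ for all $x, y, z \in X$. Let us present a simple algorithm [8] which, for a given equivalence relation, constructs the corresponding partition of the set X into classes of equivalent elements.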
Step 0. An arbitrary ordering (numbering) of the elements of the set X is chosen: $x_1, x_2, \ldots, x_{|X|}$. A set of representatives of equivalence classes is determined, which is empty at the initial stage of the algorithm. The set of equivalence classes is also empty. ….
Step i. The next element x of the ordered sequence of elements of the set X is selected and compared sequentially with the representatives of the equivalence classes already defined. If this element is equivalent to the representative of the class $X_k$, then it is placed in the class $X_k$. If it is not equivalent to any element of the set of representatives, then the element is added to the set of representatives and defines a new equivalence class.
The algorithm ends when all elements have been viewed and categorized. The result of the algorithm is a set of representatives of different classes and a set of classes of equivalent elements.
It follows from the transitivity of the equivalence relation that the set of classes obtained as a result of the operation of the algorithm does not depend on the initial ordering of the elements of the set X (Step 0). Another ordering can only change the sequence of equivalence classes and the set of representatives. The above algorithm for finding equivalence classes and a set of representatives will be called linear.
Let us note one feature of the linear algorithm. It can be applied not only to an equivalence relation, but also to any binary relation. However, if the relation is not transitive, then the result of the algorithm will already significantly depend on the choice of the initial ordering of the elements.
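The linear algorithm described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `linear_classes` and the predicate argument `related` are names chosen here, not taken from the article. The two example relations show the order independence for an equivalence relation and the order dependence for a non-transitive one.

```python
def linear_classes(order, related):
    """Linear algorithm: scan the elements in the given order and place each
    one into the class of the first representative it is related to;
    otherwise the element becomes the representative of a new class."""
    representatives = []  # Step 0: no representatives yet
    classes = []          # Step 0: no classes yet
    for x in order:       # Step i
        for k, rep in enumerate(representatives):
            if related(x, rep):
                classes[k].append(x)
                break
        else:
            representatives.append(x)
            classes.append([x])
    return representatives, classes


# Equivalence relation (same remainder modulo 3).
same_residue = lambda a, b: a % 3 == b % 3
# Reflexive and symmetric but not transitive relation (approximate equality).
near = lambda a, b: abs(a - b) <= 1

print(linear_classes([0, 1, 2, 3, 4, 5], same_residue)[1])  # [[0, 3], [1, 4], [2, 5]]
print(linear_classes([1, 2, 3], near)[1])                   # [[1, 2], [3]]
print(linear_classes([2, 1, 3], near)[1])                   # [[2, 1, 3]]
```

For the non-transitive relation, the orderings 1, 2, 3 and 2, 1, 3 produce different sets of classes, which is exactly the dependence on the initial ordering noted above.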
The relation of tolerance and classification based on the concept of proximity of elements.
Most of the existing classifications in applied sciences are built not on the basis of an equivalence relation but on the basis of another binary relation, the tolerance relation. A tolerance relation is a reflexive and symmetric relation "$\approx$" on the set X, that is, a relation determined by the following properties: 1) $x \approx x$ for all $x \in X$; 2) $x \approx y \Rightarrow y \approx x$ for all $x, y \in X$.

A typical example of such a relation is approximate equality on a set of numbers. In practice, the tolerance relation appears as a relationship between objects that is described by the words "similar" or "close".
If the tolerance relation "$\approx$" is defined on a finite set X, then we can apply the linear algorithm for class allocation and obtain a classification on this set. However, in contrast to classifications based on equivalence relations, a classification constructed on the basis of a tolerance relation depends essentially on the choice of the initial ordering of the elements of X. Different orderings of the elements can lead to fundamentally different classifications.
Optimality criterion for classifications.
There are many approaches to determining the optimal classification. Informally, a classification is optimal if the elements within the classes are "close enough" to each other, and the classes themselves are "far enough" from each other. Let's consider one of these approaches.
A closeness measure on a finite set X is a function $p: X \times X \to R_+$ with the following properties: 1) $p(x, x) = 0$ for all $x \in X$; 2) $p(x, y) = p(y, x)$ for all $x, y \in X$. In particular, the distance between points in a metric space can serve as a measure of proximity. Let a positive number $\beta > 0$ be given. We will say that the elements $x, y \in X$ are close if $p(x, y) \le \beta$. This relation is a tolerance relation and, as mentioned above, gives rise to many different classifications, which are determined by the chosen ordering of the set X. We will call such classifications $\beta$-classifications. The linear partitioning algorithm changes slightly. Namely, at step i we find the class (among those already constructed) whose representative is closest to the element under analysis. If the measure of proximity between this representative and the element is less than or equal to $\beta$, then the element is added to the class. Otherwise, the element becomes the representative of a new class.
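The modified linear algorithm can be illustrated with the following minimal Python sketch. The function name `beta_classification` and its arguments are illustrative assumptions; the closeness measure in the example (absolute difference of numbers) is chosen only for demonstration.

```python
def beta_classification(order, p, beta):
    """Modified linear algorithm: each element joins the class whose
    representative is nearest, provided the closeness measure to that
    representative does not exceed beta; otherwise it opens a new class."""
    representatives, classes = [], []
    for x in order:
        if representatives:
            # Index of the nearest representative (first one wins on ties).
            k = min(range(len(representatives)),
                    key=lambda i: p(x, representatives[i]))
            if p(x, representatives[k]) <= beta:
                classes[k].append(x)
                continue
        representatives.append(x)
        classes.append([x])
    return representatives, classes


# Demonstration measure: absolute difference of two numbers.
p = lambda a, b: abs(a - b)

print(beta_classification([1, 2, 3, 4], p, beta=1)[1])  # [[1, 2], [3, 4]]
print(beta_classification([2, 1, 3, 4], p, beta=1)[1])  # [[2, 1, 3], [4]]
```

The two orderings of the same four numbers yield different classifications, so the ordering (permutation) is the natural search variable for the optimization problem.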
The distance between two non-empty disjoint subsets is ,

⎯⎯⎯ →
Such a task is computationally complex [9]. However, the algorithm itself for finding a partition by a given order relation has a polynomial complexity.
Thus, the problem of searching for an optimal classification is reduced to the problem of finding an optimal permutation of the elements of the set X. This allows us to propose a number of metaheuristics for finding suboptimal solutions to the classification problem [10]. Consider two such metaheuristics, the effectiveness of which has been confirmed by a large number of applications.
Permutation evolutionary algorithm. The standard scheme of the evolutionary permutation algorithm is used. Let us briefly describe the principle of operation of such an algorithm [10]. The set $S_n$ of all permutations of n elements is chosen as the base set of solutions. At the initial step, a set of solutions $Y_0 \subseteq S_n$ is constructed using the initial population operator. At each subsequent step, it is assumed that a certain set of permutations Y, the current population, is given. At the first step, this is the set $Y_0$. The mutation operator M performs a random transposition (a swap of two elements) in a permutation with a given probability. In this way a set of descendants $Y'$ of the current population Y is found. The evolution operator is applied to the intermediate population $Y \cup Y'$, which is the union of the current population and the set of descendants, and selects a new current population from this set. The evolution process is repeated until the stopping condition of the evolutionary algorithm is satisfied. The solution of the original problem is restored from the permutation found.
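A minimal Python sketch of this scheme is given below. Several points are assumptions made for illustration rather than details from the article: descendants are produced by mutation only, the evolution operator simply keeps the best solutions of the intermediate population, the objective is minimized, and the stopping condition is a fixed number of steps. The names `mutate`, `evolve` and `displacement` are hypothetical.

```python
import random


def mutate(perm, prob, rng):
    """Mutation operator: with probability prob, swap two random positions."""
    child = list(perm)
    if rng.random() < prob:
        i, j = rng.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]
    return child


def evolve(n, objective, pop_size=20, steps=200, prob=0.9, seed=0):
    """Evolutionary search over permutations of range(n) (n >= 2),
    minimizing the objective (assumption)."""
    rng = random.Random(seed)
    # Initial population: random permutations.
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(steps):
        offspring = [mutate(s, prob, rng) for s in population]
        union = population + offspring   # intermediate population
        union.sort(key=objective)        # evolution operator (assumed):
        population = union[:pop_size]    # keep the best pop_size solutions
    return population[0]


# Toy objective for demonstration: total displacement from the identity.
displacement = lambda s: sum(abs(v - i) for i, v in enumerate(s))
best = evolve(8, displacement)
print(best, displacement(best))
```

In the classification setting, the objective would first build the classification from the permutation with the linear algorithm and then evaluate the optimality criterion on the result.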
Mixed jumping frogs method.
The algorithm of the method of mixed jumping frogs is simple to understand and implement, has a small number of parameters, and has been successfully used to solve combinatorial and continuous optimization problems [5,6].
The jumping frog algorithm for finding the optimal permutation consists of the following sequence of steps.
Step 1. Initialize the initial frog population as a set of points in the permutation space $S_n$ with Kendall's metric.
Step 2. Calculate the value of the optimality criterion for each permutation from the initial population.
Step 3. Arrange the solutions in descending order of the optimality criterion value.
Step 4. Divide virtual frogs (solutions) into memplex blocks in such a way that the first virtual frog in the sorted list falls into the first memplex, the second is entered into the second memplex, etc.
Step 5. Find the best
Step 7. If the previous operation does not improve the solution, then try to improve the position of the worst virtual frog by moving it towards the globally best frog using the operator Cross.
Step 8. If the last operation does not improve the position of the virtual frog, then replace it with a new frog created randomly in the search area, that is, with a random permutation.
Step 9. Combine virtual frogs of all memplexes into one group.
Step 10. If the conditions for the completion of the algorithm are not met, then go to Step 3.
Step 11. The last globally best virtual frog corresponds to a suboptimal problem solution.
Let us now describe this algorithm formally, taking into account the parameters. The method parameters are as follows: 1) the number of classes of frogs; 2) the number of elements r in each class (it is assumed that the sizes of the classes are the same and $r \ge 2$); 3) the maximum number of steps K of the algorithm; 4) the number D of the best frogs in the class, where D lies between 0 and r.
In accordance with the specified parameters, the size N of the frog population (the set of feasible solutions) is the number of classes multiplied by r.
Step k ($1 \le k \le K$). The set $P^{(k-1)}$ is ordered by the value of the objective function.
The algorithm ends when the specified number of steps has been completed. The current permutation $s^*$ determined at the last step is taken as the optimal solution to the problem.
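The scheme can be sketched in Python as follows. This is a hedged illustration, not the authors' implementation. The objective is assumed to be minimized, and each memplex makes one improvement attempt per step. Because the description of Steps 5 and 6 is incomplete, the sketch follows the usual shuffled frog leaping scheme there: the worst frog first moves towards the best frog of its own memplex. The operator `cross` is a hypothetical stand-in for Cross (a slice of the better permutation is kept and the remaining positions are filled in the order of the worse one), and the parameter D is not used.

```python
import random


def cross(worse, better, rng):
    """Hypothetical Cross operator: keep a random slice of `better` in place
    and fill the other positions with the remaining elements in the order
    in which they appear in `worse`."""
    n = len(worse)
    i, j = sorted(rng.sample(range(n + 1), 2))
    kept = set(better[i:j])
    rest = iter(x for x in worse if x not in kept)
    return [better[k] if i <= k < j else next(rest) for k in range(n)]


def shuffled_frogs(n, objective, memplexes=4, r=5, K=50, seed=0):
    """Jumping frog search over permutations of range(n), minimizing objective."""
    rng = random.Random(seed)
    # Step 1: population of N = memplexes * r random permutations.
    frogs = [rng.sample(range(n), n) for _ in range(memplexes * r)]
    for _ in range(K):
        frogs.sort(key=objective)              # Steps 2-3: best frog first
        best_global = frogs[0]
        # Step 4: deal the sorted frogs into memplexes in turn.
        blocks = [frogs[i::memplexes] for i in range(memplexes)]
        for block in blocks:
            best_local, worst = block[0], block[-1]   # Step 5 (assumed)
            for target in (best_local, best_global):  # Step 6 (assumed), Step 7
                candidate = cross(worst, target, rng)
                if objective(candidate) < objective(worst):
                    break
            else:
                candidate = rng.sample(range(n), n)   # Step 8: random frog
            block[-1] = candidate
        frogs = [s for block in blocks for s in block]  # Step 9
    return min(frogs, key=objective)                    # Step 11


# Toy objective for demonstration: total displacement from the identity.
displacement = lambda s: sum(abs(v - i) for i, v in enumerate(s))
best = shuffled_frogs(8, displacement)
print(best, displacement(best))
```

Since only the worst frog of each memplex is ever replaced and $r \ge 2$, the globally best permutation found so far is never lost between steps.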
Note that, as described, the above algorithm solves the problem of finding an optimal permutation of n elements in the set of all permutations, with an objective function $F(s)$ defined on the set of permutations. In this case, the specific type of the objective function does not matter. Therefore, the above algorithm can be used to find suboptimal solutions to optimization problems on a set of permutations with arbitrary objective functions.
Numerical experiment. The weights of the graph edges were chosen randomly in the range [1,100]. These weights were considered as a measure of proximity for the respective vertices. The set of vertices of the graph was considered as a set of elements to be classified. The positive number $\beta$ was chosen randomly in the interval [0,1]. Linear orders (permutations) on the set of vertices were generated using the Fisher-Yates shuffle algorithm [17].
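For reference, the Fisher-Yates shuffle mentioned in the experiment can be sketched as follows. The function name `fisher_yates` is illustrative; the standard library function `random.shuffle` could be used instead of a hand-written version.

```python
import random


def fisher_yates(items, rng):
    """Return a uniformly random permutation of items (Fisher-Yates shuffle)."""
    perm = list(items)
    # Walk from the last position down, swapping each position with a
    # uniformly chosen position at or before it.
    for i in range(len(perm) - 1, 0, -1):
        j = rng.randint(0, i)  # randint includes both endpoints
        perm[i], perm[j] = perm[j], perm[i]
    return perm


order = fisher_yates(range(10), random.Random(1))
print(order)
```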
The problems were solved using a local search algorithm, a random search method, evolutionary algorithms, and the method of mixed jumping frogs.
The comparison of algorithms was carried out according to the following indicators. Record is the number of problems in a series on which the algorithm turned out to be the best among those tested. Borda rating is the sum of the points scored on each problem in the series: 5 points were assigned for first place in the comparison, 4 for second, and 3 for third.
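The two indicators can be computed as in the following Python sketch. Two points are assumptions not stated in the article: tied algorithms share the same place, and the point scale continues downwards (2 points for fourth place, and so on). The function name and the sample values are purely illustrative and are not experimental data.

```python
def record_and_borda(results, minimise=True):
    """results: one dict per problem, mapping algorithm name -> objective value.
    Returns (record, borda) dictionaries keyed by algorithm name."""
    names = list(results[0])
    record = {a: 0 for a in names}
    borda = {a: 0 for a in names}
    for values in results:
        for a in names:
            # Place = 1 + number of strictly better algorithms (ties share a place).
            if minimise:
                better = sum(1 for b in names if values[b] < values[a])
            else:
                better = sum(1 for b in names if values[b] > values[a])
            place = 1 + better
            borda[a] += 6 - place  # 5 points for 1st, 4 for 2nd, 3 for 3rd, ...
            if place == 1:
                record[a] += 1
    return record, borda


# Illustrative values only (not taken from the experiment).
sample = [
    {"local": 3, "random": 5, "frogs": 1},
    {"local": 2, "random": 2, "frogs": 2},
]
record, borda = record_and_borda(sample)
print(record)  # {'local': 1, 'random': 1, 'frogs': 2}
print(borda)   # {'local': 9, 'random': 8, 'frogs': 10}
```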
The results of the algorithm comparison are presented in Table 1.

Table 1 (one Record/Borda column pair per compared algorithm)
Series  Size  Problems  Record  Borda  Record  Borda  Record  Borda  Record  Borda
A         50       100      44    286      88    468     100    500     100    500
B        100       100      10    221       0    348     100    500     100    500
H        500       100       0    212       0    280      94    394     100    500
D       1000       100       0    118       0    201      92    392     100    500

Conclusions. In this article, a method for finding optimal β-classifications based on two well-known metaheuristics was considered. A numerical experiment showed good results of the proposed algorithms in comparison with local and random search. This approach can be transferred practically without changes to other types of classifications that are based on the concept of proximity of elements.

In addition to the metaheuristics proposed in the work, any other metaheuristics applicable to optimization problems on fragmentary structures can be considered in a similar way [10].