Intelligent Aircraft Maneuvering Decision Based on CNN

To address aircraft maneuvering decisions in air combat, this paper proposes an intelligent maneuvering decision model based on a convolutional neural network (CNN). First, the situation data, maneuvering decision variables and evaluation indexes are defined, and a CNN model that can realize intelligent maneuvering decisions is established. Then, guided by the evaluation indexes, the structure and parameters of the CNN model are adjusted through simulation experiments to improve the accuracy and robustness of the maneuvering decision. Next, comparative experiments verify the validity of the proposed model, showing that the CNN model makes stable maneuvering decisions with high accuracy. Finally, the flight path in an air combat process is presented.


Introduction
Future air warfare will inevitably develop towards unmanned and autonomous operation [1]. Autonomous maneuver decision-making is critical to reaching a higher level of autonomy in air combat [2,3]. To meet the needs of modern air combat, it is important to establish a reliable maneuvering decision method that can both assist pilots in manned aircraft and realize autonomous maneuver decisions in UAVs. Because the air combat environment is dynamic, the decision-making method must be accurate and fast [4].
Many intelligent maneuver decision methods have been developed, such as the genetic algorithm [5], the expert system method [6] and the MCTS method [7]. Literature [1] realized the maneuvering decision through Monte Carlo reinforcement learning and showed the flight path and the variation of the control variables, which demonstrated the validity of the method. However, that work used only the position and attitude of both sides to describe the air combat state. The process of air combat is complex, and more information, such as radar information and missile information, should be included in the model.
Deep learning is effective at feature extraction and knowledge expression [8]. To realize intelligent aircraft maneuvering decisions, this paper constructs a CNN model to fit the maneuvering decision variables. The model takes a more comprehensive set of factors as inputs and directly outputs the change rate of attack angle and the change rate of throttle coefficient, which makes the decision results more accurate and intuitive.

Convolutional Neural Network
After being trained on known patterns, a convolutional neural network can learn the mapping between inputs and outputs without requiring an explicit mathematical expression.
A CNN usually consists of convolutional layers, pooling layers, and fully connected layers [9,10]. A convolutional layer is essentially a feature extractor that automatically extracts deep information from the input data. A pooling layer subsamples the feature map, keeping useful information while reducing the amount of data. The fully connected layer is usually located at the end of the network and outputs the fitting or classification results. A common CNN structure is shown in Figure 1. The core operations of a CNN are the computation in the convolutional layers and the update of the network parameters. In a convolutional layer, local convolution is performed on the feature maps of the previous layer using different convolution kernels. The weights of each convolution kernel are shared across positions, which reduces the number of parameters. The convolution is computed as follows:
$$x_j^{l} = \sum_{i \in M_j} y_i^{l-1} * k_{ij}^{l} + b_j^{l} \qquad (1)$$
where $x_j^{l}$ is the input of the j-th feature map of the l-th layer, $y_i^{l-1}$ is the output of the i-th feature map of the (l-1)-th layer, $k_{ij}^{l}$ is the convolution kernel between the i-th feature map of the previous layer and the j-th feature map of the current layer, $b_j^{l}$ is the bias of the j-th feature map of the l-th layer, and $M_j$ is the set of all feature maps of the previous layer that are connected with the j-th feature map of the current layer.
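The convolution described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the implementation used in the paper: the function names `conv2d_valid` and `conv_layer_input` are hypothetical, the stride is 1 with no padding, and the kernel is applied without flipping (cross-correlation), as most deep learning frameworks do.

```python
import numpy as np

def conv2d_valid(feature_map, kernel):
    """Slide `kernel` over `feature_map` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = feature_map.shape[0] - kh + 1
    out_w = feature_map.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            # The same kernel weights are reused at every position (weight sharing).
            out[r, c] = np.sum(feature_map[r:r + kh, c:c + kw] * kernel)
    return out

def conv_layer_input(prev_outputs, kernels, bias):
    """Input of one feature map: sum over connected previous maps, plus a bias."""
    total = sum(conv2d_valid(y, k) for y, k in zip(prev_outputs, kernels))
    return total + bias

# Two toy feature maps from the previous layer, one 2x2 kernel for each.
prev_outputs = [np.ones((4, 4)), 2.0 * np.ones((4, 4))]
kernels = [np.ones((2, 2)), np.ones((2, 2))]
x_j = conv_layer_input(prev_outputs, kernels, bias=0.5)
```

With these toy inputs every entry of the 3x3 result equals 4 + 8 + 0.5 = 12.5, which makes the summation over the connected feature maps easy to check by hand.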
Parameter updating strengthens the mapping ability of the network. It consists of computing the loss function, back-propagating the error, and updating the network parameters. This paper obtains the loss function according to Eq. (2), which is computed from the output value of the output layer and the true value. The error is back-propagated by calculating, for each layer, the derivatives of the loss function with respect to the convolution kernels and the biases. Finally, once the gradients of the network parameters have been calculated, the parameters are updated by an optimization algorithm (Adam in this paper, as described under Model Training).
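The three steps of parameter updating (loss, gradient, update) can be illustrated on the smallest possible model. The sketch below makes several assumptions that are not taken from the paper: a single linear layer stands in for the network, the loss is half the mean squared error, and the update is plain gradient descent with a hypothetical learning rate `lr`.

```python
import numpy as np

def squared_error_loss(pred, truth):
    """Assumed loss: half the mean squared difference between output and truth."""
    return 0.5 * np.mean((pred - truth) ** 2)

def gradient_step(w, b, x, truth, lr=0.1):
    """One update of a linear model: forward pass, gradients, parameter update."""
    pred = x @ w + b
    err = pred - truth
    grad_w = x.T @ err / len(x)   # derivative of the loss with respect to w
    grad_b = np.mean(err)         # derivative of the loss with respect to b
    return w - lr * grad_w, b - lr * grad_b

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 3))
truth = x @ np.array([1.0, -2.0, 0.5]) + 0.3

w, b = np.zeros(3), 0.0
loss_before = squared_error_loss(x @ w + b, truth)
for _ in range(200):
    w, b = gradient_step(w, b, x, truth)
loss_after = squared_error_loss(x @ w + b, truth)
```

In a CNN the same pattern applies layer by layer, with the gradients of the kernels and biases obtained through back propagation instead of the closed-form expressions used here.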

Framework of the Maneuvering Decision System
Following the CNN principle described above, this paper constructs a CNN-based maneuvering decision model and trains it by fitting a historical air combat dataset. In other words, the nonlinear expression ability of the CNN is used to realize the nonlinear mapping from air combat situation to maneuvering decision. The framework of the maneuvering decision system is shown in Figure 2. Feeding our situation data and the enemy's situation data into the trained CNN model yields the maneuvering decision variables, namely the change rate of attack angle and the change rate of throttle coefficient. From these, Eqs. (3) and (4) give the maneuvering control variables, namely the attack angle and the throttle coefficient, at the next time from their values at the current time, which realizes the maneuvering control of our aircraft.
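One plausible reading of this update is a simple forward step in time: each control variable at the next time equals its current value plus its change rate multiplied by the time step. The sketch below assumes exactly that; the names `Controls`, `step_controls` and the time step `dt` are hypothetical, and the paper's own Eqs. (3) and (4) may differ in detail.

```python
from dataclasses import dataclass

@dataclass
class Controls:
    attack_angle: float   # maneuvering control variable
    throttle: float       # throttle coefficient

def step_controls(current, attack_angle_rate, throttle_rate, dt):
    """Advance both control variables by one time step (assumed Euler update)."""
    return Controls(
        attack_angle=current.attack_angle + attack_angle_rate * dt,
        throttle=current.throttle + throttle_rate * dt,
    )

# The two rates would come from the trained CNN; here they are made-up numbers.
now = Controls(attack_angle=2.0, throttle=0.5)
nxt = step_controls(now, attack_angle_rate=1.5, throttle_rate=-0.2, dt=0.5)
```

A real controller would probably also clamp both variables to their physical limits, which this sketch leaves out.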

Data Preprocessing
During air combat, data on speed, position, acceleration and attitude are generated in time sequence. These data describe the state and trend of the aircraft at a given time and were collected to form the historical dataset.
The magnitudes of different attributes vary greatly, so the dataset must be normalized. In addition, data at adjacent moments differ only slightly. If the training samples are fed in their time order, the gradient may vanish during back propagation of the error. Therefore, the data must be shuffled.
The data are normalized to the range [0, 1] as follows:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x$ is the original data, $x'$ is the normalized data, and $x_{\max}$ and $x_{\min}$ are respectively the maximum and minimum of the data.
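Both preprocessing steps, min-max normalization per attribute and shuffling of the time-ordered samples, can be sketched with NumPy. The function names and the toy data are hypothetical; the guard for constant columns is an addition that the text does not discuss.

```python
import numpy as np

def min_max_normalize(data):
    """Scale every column (attribute) of `data` to the range [0, 1]."""
    lo = data.min(axis=0)
    hi = data.max(axis=0)
    # Assumption: a constant column is mapped to 0 instead of dividing by zero.
    span = np.where(hi > lo, hi - lo, 1.0)
    return (data - lo) / span

def shuffle_together(inputs, targets, seed=0):
    """Apply one random permutation to inputs and targets so pairs stay aligned."""
    order = np.random.default_rng(seed).permutation(len(inputs))
    return inputs[order], targets[order]

# Toy samples: two attributes with very different magnitudes.
samples = np.array([[100.0, 0.1], [300.0, 0.3], [200.0, 0.2]])
targets = np.array([1.0, 3.0, 2.0])

scaled = min_max_normalize(samples)
shuffled_x, shuffled_y = shuffle_together(scaled, targets)
```

In practice the minimum and maximum would be computed on the training dataset and reused for the test dataset, so that both are scaled in the same way.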

Model Training
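Adam is an adaptive learning rate algorithm [11]. By estimating the first and second moments of the gradient of each parameter, it assigns each parameter its own adaptive learning rate [12]. Compared with other adaptive learning rate algorithms, it converges faster, learns more effectively, and produces smaller fluctuations of the loss function. In its standard form [11], the update is:
$$m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t, \qquad v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2$$
$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}$$
$$\theta_t = \theta_{t-1} - \eta \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}$$
where $g_t$ is the gradient of the parameter $\theta$, $\theta_{t-1}$ denotes the parameter before the update, and $\theta_t$ denotes the parameter after the update. $m_t$ and $v_t$ are the first and second moment estimates, $\eta$ is the learning rate, $\beta_1$ and $\beta_2$ are the decay rates of the moment estimates, and $\epsilon$ is a small constant that prevents division by zero.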
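A single Adam update can be written out directly from the standard definition of the algorithm. The sketch below is illustrative only: `adam_step` is a hypothetical helper, and the default hyperparameters are the values commonly used for Adam, not the settings of this paper.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; `t` is the 1-based step counter."""
    m = beta1 * m + (1 - beta1) * grad        # first moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy problem: minimize f(theta) = sum(theta ** 2), whose gradient is 2 * theta.
theta = np.array([1.0, -2.0])
m, v = np.zeros_like(theta), np.zeros_like(theta)
history = []
for t in range(1, 2001):
    grad = 2.0 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.01)
    history.append(theta.copy())
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so each parameter moves by roughly the learning rate regardless of the gradient magnitude. This is what "an independent adaptive learning rate for each parameter" means in practice.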

Evaluation Indexes
This paper uses the mean square error (Mse) and the goodness of fit ($R^2$) as the evaluation indexes for the validity of the model.

Mean square error
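Mse is the mean squared error between the true and predicted values and is often used to assess the quality of a network. In this paper, it is expressed as:
$$Mse_{\alpha} = \frac{1}{n} \sum_{i=1}^{n} \left( \Delta\alpha_i - \Delta\hat{\alpha}_i \right)^2, \qquad Mse_{\delta} = \frac{1}{n} \sum_{i=1}^{n} \left( \Delta\delta_i - \Delta\hat{\delta}_i \right)^2$$
where $Mse_{\alpha}$ and $Mse_{\delta}$ denote the Mse on the test dataset for the change rate of attack angle and the change rate of throttle coefficient, and $n$ is the number of groups in the test dataset. $\Delta\alpha_i$ and $\Delta\hat{\alpha}_i$ are the true and predicted values of the change rate of attack angle. $\Delta\delta_i$ and $\Delta\hat{\delta}_i$ are the true and predicted values of the change rate of throttle coefficient.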
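As a quick check of the definition, the mean squared error can be computed with NumPy as follows. The function name and the toy values are illustrative and not taken from the paper's data.

```python
import numpy as np

def mean_squared_error(true_values, predicted_values):
    """Average of the squared differences between true and predicted values."""
    true_values = np.asarray(true_values, dtype=float)
    predicted_values = np.asarray(predicted_values, dtype=float)
    return float(np.mean((true_values - predicted_values) ** 2))

# Made-up change rates of attack angle for three test groups.
true_rates = [0.10, 0.20, 0.30]
predicted_rates = [0.10, 0.25, 0.20]
mse_value = mean_squared_error(true_rates, predicted_rates)
```

Here the squared errors are 0, 0.0025 and 0.01, so the result is 0.0125 / 3, roughly 0.0042.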

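Goodness of fit
Goodness of fit can be used as the evaluation index of model fitting. The closer it is to 1, the better the fit. In this paper, it is expressed as:
$$R^2_{\alpha} = 1 - \frac{\sum_{i=1}^{n} \left( \Delta\alpha_i - \Delta\hat{\alpha}_i \right)^2}{\sum_{i=1}^{n} \left( \Delta\alpha_i - \overline{\Delta\alpha} \right)^2}, \qquad R^2_{\delta} = 1 - \frac{\sum_{i=1}^{n} \left( \Delta\delta_i - \Delta\hat{\delta}_i \right)^2}{\sum_{i=1}^{n} \left( \Delta\delta_i - \overline{\Delta\delta} \right)^2}$$
where $R^2_{\alpha}$ and $R^2_{\delta}$ denote the goodness of fit on the test dataset for the change rate of attack angle and the change rate of throttle coefficient, $n$ is the number of groups in the test dataset, and a hat marks a predicted value. $\overline{\Delta\alpha}$ and $\overline{\Delta\delta}$ are the means of the true values of the change rate of attack angle and the change rate of throttle coefficient in the test dataset.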
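Assuming the goodness of fit is the usual coefficient of determination, which matches the description above (values near 1 are better, and the mean of the true values appears in the formula), it can be sketched as follows. The function name and sample values are hypothetical.

```python
import numpy as np

def goodness_of_fit(true_values, predicted_values):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    true_values = np.asarray(true_values, dtype=float)
    predicted_values = np.asarray(predicted_values, dtype=float)
    residual = np.sum((true_values - predicted_values) ** 2)
    total = np.sum((true_values - true_values.mean()) ** 2)
    return float(1.0 - residual / total)

true_rates = [1.0, 2.0, 3.0, 4.0]
predicted_rates = [1.1, 1.9, 3.2, 3.8]
r2_value = goodness_of_fit(true_rates, predicted_rates)
```

For these values the residual sum of squares is 0.10 and the total sum of squares is 5.0, giving 0.98. A model that always predicts the mean of the true values would score 0, and a perfect model would score 1.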

Datasets
Training the CNN effectively requires sufficient data, including a training dataset and a test dataset. A one-to-one air combat process was set up as a simulation, and the situation information of both sides was recorded during the simulation. After several simulations under different initial situations, a total of 16,700 sets of data were collected as the dataset of this paper. Of these, 11,700 sets were used as the training dataset and 5,000 sets as the test dataset. The situation information collected from both sides in the air combat is as follows.

The Situation Information of Our Side.
The situation information of our side includes simulation information, survival information, position information, speed information, acceleration information, attitude information, control information, target information, sensor information, radar information and missile information.

The Situation Information of the Other Side.
It is difficult to obtain all the situation information of the enemy in actual air combat. This paper therefore selects the obtainable position information and speed information of the enemy aircraft as the situation information of the other side.

Training and Testing.
The optimization experiments comprised two sets of experiments. The CNN models in Table 1 were used to fit the change rate of attack angle and the change rate of throttle coefficient, respectively, in the two sets of experiments. After training, the test dataset was input into the model to obtain the predicted values. Mse and $R^2$ were calculated by comparing the predicted values with the true values. After that, the best model was selected.
In this experiment, training was carried out for a total of 5,000 rounds. The batch size was 512. The activation function was the tanh function. The optimization method was Adam, and the parameters in this method were set as , , , . Table 2 shows the evaluation indexes for the different CNN models.
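To make the training configuration concrete, the sketch below splits the 11,700 training samples into mini-batches of 512. It assumes that one "round" means one full pass over the training dataset, which the text does not state explicitly; `batch_slices` is a hypothetical helper.

```python
def batch_slices(num_samples, batch_size):
    """Yield index slices that cover the dataset in consecutive mini-batches."""
    for start in range(0, num_samples, batch_size):
        yield slice(start, min(start + batch_size, num_samples))

NUM_TRAIN = 11700   # training samples reported in the paper
BATCH_SIZE = 512    # batch size reported in the paper

slices = list(batch_slices(NUM_TRAIN, BATCH_SIZE))
batch_sizes = [s.stop - s.start for s in slices]
batches_per_round = len(slices)
```

Under this assumption each round consists of 23 parameter updates: 22 full batches and a final batch of 436 samples.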

Results and Analysis.
By analyzing Mse and $R^2$, it can be concluded that M3 performed best when fitting both the change rate of attack angle and the change rate of throttle coefficient. The comparative experiments in this paper therefore compare a BP neural network with the CNN (M3). The structure of M3 is shown in Table 3, and Table 4 reports on the comparative experiments. The losses during model training were recorded and used as the basis for evaluating the convergence rate of the model. The evaluation indexes of the comparative experiments were the convergence rate of training and the Mse and $R^2$ on the test dataset.
In this experiment, training was carried out for a total of 5,000 rounds. The batch size was 512. The activation function was the tanh function. The optimization method was Adam, and the parameters in this method were set as , , , .
Figure 3 compares the convergence speed of the BP neural network and the CNN when fitting the change rate of attack angle. Figure 4 compares their convergence speed when fitting the change rate of throttle coefficient. The experimental results show that the CNN achieved small fitting errors. Moreover, compared with the BP neural network, the CNN converged faster and its training losses were smaller and more stable.

The Flight Path of an Air Combat
After the maneuvering decision models were established, they were applied to an air combat simulation. In the simulation, our side used the CNN model to make maneuvering decisions, while the other side used the original method. One result of the air combat is shown in Figure 5, where the red fighter represents our side and the blue fighter represents the other side. In the initial state, the red fighter and the blue fighter have the same weapons and aircraft type. Figure 5 shows that the red fighter dived to approach and attack the blue fighter.

Conclusions
To address the problem of aircraft maneuvering decisions in air combat, this paper constructs a maneuvering decision model that exploits the CNN's strong feature extraction ability and its reduced tendency to over-fit. By training on a large amount of air combat data, the model can directly extract the internal relationship between combat situation data and maneuvering decision variables. The experimental results show that the model makes accurate maneuvering decisions. Moreover, compared with the BP neural network, the CNN shows better performance, stronger generalization ability and higher fitting accuracy in maneuvering decisions.
In this paper, the maneuvering decision model directly outputs the maneuvering decision variables during decision-making, from which the maneuvering control can be realized. This makes the decision results more intuitive and effective, and also provides ideas for further research.