The internal corrosion rate in gas pipelines changes nonlinearly, and artificial neural networks easily fall into local optima. To address these problems, this paper proposes a model that combines principal component analysis (PCA) with a dynamic fuzzy neural network (D-FNN). PCA is used for dimensionality reduction and feature extraction, and the D-FNN performs the prediction. The PCA-D-FNN model is applied to corrosion data from a real pipeline, and its results are compared with those of artificial neural network (ANN), fuzzy neural network (FNN), and D-FNN models. The results verify the effectiveness of the model and algorithm for internal corrosion rate prediction.

Due to the influence of the medium composition, temperature, terrain, and other factors, corrosive substances are easily produced in steel gas pipelines, which can lead to internal corrosion. Internal corrosion is one of the causes of aging in natural gas pipeline systems. Corrosion will cause a thinning of the inner wall of the pipeline and reduce its structural strength, which will lead to natural gas leakage and seriously threaten the safety, integrity and economy of the whole gas transmission system [

A number of modeling approaches have been used for the corrosion rate prediction. Chou et al. compared the prediction accuracy of the carbon steel corrosion rate in marine environments based on an artificial neural network (ANN), support vector machine (SVM), classification and regression tree (CART), linear regression (LR) and hybrid metaheuristic regression models, and the results showed that the hybrid metaheuristic regression model had superior prediction accuracy in this case [_{2} corrosion on pipelines at high partial pressures and assessed the degree of suitability for CO_{2} corrosion rate prediction [

Fuzzy logic, introduced by Zadeh, contains three features: modeling of nonlinear processes by using IF-THEN rules, employing linguistic variables instead of or in addition to numerical variables, and using approximate reasoning algorithms to formulate complex relationships [

In existing studies, the FNN only learns and optimizes the parameters of the fuzzy system, making adaptive adjustments within a preset network structure; this is time consuming and leads to low-accuracy structure identification [

The prediction performance of the model is largely determined by the correlation between its input and output data. In addition to temperature, pH, and flow rate, the internal corrosion rate is affected by oxygen content, pressure, and other factors. Consequently, we introduce these additional factors to improve the prediction. However, if all of these factors are used directly as D-FNN inputs, the redundant information among them will make the prediction inaccurate. Principal component analysis (PCA), a statistical method for dimensionality reduction and feature extraction, is commonly used to deal with this problem. It is particularly suitable when the factors are highly interrelated [

This paper is structured as follows. In Section

PCA is a common multivariate statistical method for feature extraction and dimensionality reduction. It uses a linear projection to map high-dimensional data into a low-dimensional space so that the variance of the projected data is maximized; fewer dimensions are used while as much of the original information as possible is retained, thus realizing the dimensionality reduction [

Assume there is a P-dimensional random vector

Calculate principal component loads:
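The PCA steps of this section can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the study: the features are standardized, the correlation matrix is eigendecomposed, and components are kept until a cumulative contribution threshold is reached. The function name `pca_summary`, the 0.85 threshold, and the synthetic data are assumptions introduced for illustration only.

```python
import numpy as np

def pca_summary(X, threshold=0.85):
    """Eigenvalues, cumulative contribution, loadings, and number of kept components.

    X is an (n_samples, n_features) array; threshold is an assumed cutoff.
    """
    X = np.asarray(X, dtype=float)
    # Standardize each feature to zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Correlation matrix of the standardized data.
    R = (Z.T @ Z) / (Z.shape[0] - 1)
    # eigh handles symmetric matrices; sort eigenvalues in descending order.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    # Cumulative variance contribution of the leading components.
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # Loadings: each eigenvector scaled by the square root of its eigenvalue.
    loadings = eigvecs * np.sqrt(eigvals)
    # Smallest number of components whose cumulative contribution reaches the threshold.
    k = int(np.searchsorted(cumulative, threshold) + 1)
    return eigvals, cumulative, loadings, k

# Synthetic, correlated features (not the pipeline data).
rng = np.random.default_rng(0)
base = rng.normal(size=(34, 3))
X_demo = np.hstack([base, base @ rng.normal(size=(3, 6)) + 0.05 * rng.normal(size=(34, 6))])
eigvals, cumulative, loadings, k = pca_summary(X_demo)
```

Because the data are standardized, the eigenvalues sum to the number of features, which gives a quick sanity check on any eigenvalue table produced this way.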

The D-FNN combines the advantages of fuzzy systems and neural networks. It is based on extended radial basis function (RBF) neural networks; in essence, it is a fuzzy system based on the Takagi–Sugeno–Kang (TSK) model [

The architecture of a dynamic fuzzy neural network [

Layer 1 (input layer):

Layer 2 (membership function layer): each node represents a membership function. The membership function can be denoted as the Gaussian function:

Layer 3 (

Layer 4 (defuzzification layer, also known as the normalized layer): this layer achieves a normalized calculation, and the number of nodes in this layer is equal to the number of fuzzy rules. The output of the

Layer 5 (output layer): each node in this layer represents an output variable, and the output is the accumulation of all the input signals:

The weights are a linear structure and can be expressed as follows:

Substituting equations (
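The five layers described above can be sketched as a single forward pass. The sketch assumes the commonly used D-FNN form, in which every membership function of a rule shares one Gaussian width and the rule weights are linear TSK functions of the inputs; the function name `dfnn_forward` and the array names `centers`, `widths`, and `alpha` are hypothetical.

```python
import numpy as np

def dfnn_forward(x, centers, widths, alpha):
    """Forward pass for one sample.

    x: (r,) inputs; centers: (u, r); widths: (u,); alpha: (u, r + 1) TSK coefficients.
    """
    x = np.asarray(x, dtype=float)
    centers = np.asarray(centers, dtype=float)
    widths = np.asarray(widths, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    # Layer 2: Gaussian membership of each input in each rule.
    mu = np.exp(-((x - centers) ** 2) / widths[:, None] ** 2)
    # Layer 3: firing strength of each rule (product of its memberships).
    phi = mu.prod(axis=1)
    # Layer 4: normalized firing strengths.
    psi = phi / phi.sum()
    # TSK weights: w_j = a_j0 + a_j1 * x_1 + ... + a_jr * x_r.
    w = alpha[:, 0] + alpha[:, 1:] @ x
    # Layer 5: weighted sum of the normalized firing strengths.
    return float(w @ psi), psi

# Two symmetric rules with constant weights 1 and 3; the midpoint fires both equally.
y_demo, psi_demo = dfnn_forward(
    x=[0.5, 0.5],
    centers=[[0.0, 0.0], [1.0, 1.0]],
    widths=[1.0, 1.0],
    alpha=[[1.0, 0.0, 0.0], [3.0, 0.0, 0.0]],
)
```

In this toy case both rules fire with equal strength, so the output is the average of the two constant weights.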

The structure of the D-FNN is not preset but grows gradually during the learning process. Therefore, the learning algorithm mainly includes the generation of fuzzy rules, the determination of premise parameters, the determination of weights, and the pruning of rules, to achieve the specific performance required by the system [

Determining the structure of the network is one of the main purposes of the training algorithm. Whether to add a new rule depends mainly on two criteria: the accommodating boundary and the system error. The accommodating boundary characterizes the coverage of a membership function; together, the existing membership functions partition the entire input space. Therefore, if a new sample falls within the coverage of an existing Gaussian membership function, it can be represented by that function, and there is no need to add a new rule or RBF unit to accommodate it. The criterion based on the accommodating boundary is described as follows.

For the

In addition to the accommodating boundary, the system error needs to be considered. With too few rules the system cannot adequately cover the input-output space, while too many rules add unnecessary complexity; either case worsens the system performance and reduces its generalization ability. Thus, the system error is a vital factor in deciding whether to add a new rule.

For the
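The two criteria for rule generation can be sketched as one check per training sample. This is a minimal sketch assuming that a new rule is added only when the sample lies outside the accommodating boundary of every existing RBF unit and the output error is also too large; the names `needs_new_rule`, `k_d` (distance threshold), and `k_e` (error threshold) are hypothetical.

```python
import numpy as np

def needs_new_rule(x, target, prediction, centers, k_d, k_e):
    """Return (add_rule, d_min, nearest_rule_index) for one training sample."""
    x = np.asarray(x, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Distance from the sample to the center of every existing RBF unit.
    distances = np.linalg.norm(centers - x, axis=1)
    nearest = int(np.argmin(distances))
    d_min = float(distances[nearest])
    # System error of the current network on this sample.
    error = abs(target - prediction)
    # Both criteria must be violated before the structure grows.
    add_rule = (d_min > k_d) and (error > k_e)
    return add_rule, d_min, nearest

centers_demo = [[0.0, 0.0], [1.0, 1.0]]
# Far from every center and badly predicted: a new rule is warranted.
far_and_wrong = needs_new_rule([3.0, 3.0], 2.0, 0.5, centers_demo, k_d=1.0, k_e=0.1)
# Badly predicted but already covered by the first unit: no new rule.
near_and_wrong = needs_new_rule([0.1, 0.0], 2.0, 0.5, centers_demo, k_d=1.0, k_e=0.1)
```

When only one of the two criteria is violated, the existing parameters are adjusted rather than the structure; that adjustment is outside the scope of this sketch.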

The width of the RBF unit can affect the generalization ability of the system. Therefore, the newly generated rules, that is, the width and center of the RBF unit, need to be adjusted. The adjustment method is as follows:

Assuming that

For any input

Rewrite equation (

The relationship between the expected output

Find an optimal parameter coefficient vector
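The weight determination step can be sketched as a linear least-squares problem: once the premise parameters are fixed, the network output is linear in the TSK coefficients. The sketch below assumes Gaussian rules with one width per rule and solves for the coefficients with `numpy.linalg.lstsq`; the function names, the rule-major column ordering of the regressor matrix, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def regressor_matrix(X, centers, widths):
    """Build H so that the network output is H @ alpha.ravel()."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - centers[None, :, :]            # (n, u, r)
    phi = np.exp(-(diff ** 2).sum(axis=2) / widths ** 2)  # rule firing strengths, (n, u)
    psi = phi / phi.sum(axis=1, keepdims=True)            # normalized strengths
    ones_X = np.hstack([np.ones((X.shape[0], 1)), X])     # [1, x_1, ..., x_r]
    # Column j * (r + 1) + k holds psi_j * (k-th entry of [1, x]).
    return (psi[:, :, None] * ones_X[:, None, :]).reshape(X.shape[0], -1)

def fit_weights(X, t, centers, widths):
    """Least-squares estimate of the TSK coefficients, shape (u, r + 1)."""
    H = regressor_matrix(X, centers, widths)
    coef, *_ = np.linalg.lstsq(H, t, rcond=None)
    return coef.reshape(centers.shape[0], -1), H

# Synthetic data from a linear target, which a TSK system can reproduce exactly.
rng = np.random.default_rng(1)
X_demo = rng.uniform(0.0, 1.0, size=(24, 2))
t_demo = 1.0 + 2.0 * X_demo[:, 0] - X_demo[:, 1]
centers_demo = np.array([[0.2, 0.2], [0.8, 0.8]])
widths_demo = np.array([0.5, 0.5])
alpha_demo, H_demo = fit_weights(X_demo, t_demo, centers_demo, widths_demo)
fitted_demo = H_demo @ alpha_demo.ravel()
```

Using `lstsq` rather than an explicit matrix inverse keeps the solution stable when the regressor matrix is poorly conditioned.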

In this paper, we prune the fuzzy rules in the third layer using the error reduction ratio (ERR). The algorithm decomposes the output of the fourth layer into an orthogonal basis matrix and an upper triangular matrix by QR decomposition, and the ERR is then calculated from the orthogonal basis matrix. Using the pruning algorithm, significant neurons are selected so that a parsimonious structure with high performance can be achieved [
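The pruning step can be sketched as follows. The sketch assumes that the regressor matrix stores the columns of each rule contiguously, that the ERR of a column is the share of output energy explained by the matching orthonormal column of Q, and that the significance of a rule is the root mean square of its ERR values; `rule_significance`, `prune_rules`, and the threshold `k_err` are hypothetical names.

```python
import numpy as np

def rule_significance(H, t, n_rules):
    """ERR of every regressor column and one significance value per rule."""
    H = np.asarray(H, dtype=float)
    t = np.asarray(t, dtype=float)
    # Reduced QR: Q has orthonormal columns, the second factor is upper triangular.
    Q, _ = np.linalg.qr(H)
    # ERR of column i: (q_i . t)^2 / (t . t), since q_i has unit norm.
    err = (Q.T @ t) ** 2 / (t @ t)
    # Group the ERR values by rule and take their root mean square.
    per_rule = err.reshape(n_rules, -1)
    eta = np.sqrt((per_rule ** 2).sum(axis=1) / per_rule.shape[1])
    return err, eta

def prune_rules(eta, k_err):
    """Indices of the rules whose significance reaches the threshold."""
    return np.flatnonzero(eta >= k_err)

# Synthetic example: the target depends only on the columns of the first rule.
rng = np.random.default_rng(2)
H_demo = rng.normal(size=(20, 4))           # 2 rules x 2 columns each
t_demo = 2.0 * H_demo[:, 0] + H_demo[:, 1]
err_demo, eta_demo = rule_significance(H_demo, t_demo, n_rules=2)
kept_demo = prune_rules(eta_demo, k_err=0.01)
```

The ERR values depend on the order of the columns, because each column is only credited with the output energy not already explained by the columns before it.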

The proposed hybrid model inherits the merits of the individual models and improves the internal corrosion rate prediction compared with previous models. The computational complexity of the algorithm comes mainly from two parts: PCA and the D-FNN. The flow chart of the PCA-D-FNN is shown in Figure

The flow chart of PCA-D-FNN.

Natural gas should be purified to remove impurities, such as H_{2}O and H_{2}S, before entering the pipeline. However, it is difficult to remove these impurities completely. Therefore, the inner wall of the pipeline will be corroded during long-term operation or under special working conditions. Corrosion in pipelines is affected by many factors, and its impact process is complex. Qiao et al. used computational fluid dynamics (CFD) simulation analysis to conclude that the solid particles in the natural gas flow were the main cause of corrosion in the elbow of the gas pipeline [_{2} had a greater corrosive effect on steel pipes at high temperatures (40°C–60°C) [^{+} concentration and pH value were high [_{2}, corrosion products, and H_{2}S had a great influence on the corrosion of gas pipelines [_{2} content, H_{2}S content, Cl^{−} content, moisture content, pH, flow rate, temperature, pressure, and oxygen content) are chosen according to the workers’ experience. The corrosion rate is derived from an online monitoring system which is shown in Figure

Rates of corrosion.

The PCA method is used to analyze the above features and extract a few principal components that represent most of the information, which reduces the input dimension of the model and improves the prediction accuracy. Using the PCA algorithm proposed in Section _{2}S content has a higher value on the first principal component, CO_{2} content has a higher value on the second principal component, moisture content has a higher value on the third principal component, and the flow rate has a higher value on the fourth principal component. Therefore, we choose H_{2}S content, CO_{2} content, moisture content, and the flow rate as the inputs of the D-FNN prediction model.

Eigenvalue and cumulative variance contribution.

Principal component | Eigenvalue | Cumulative variance contribution rate (%) |
1 | 7.973 | 36.52 |
2 | 5.222 | 59.86 |
3 | 2.987 | 80.35 |
4 | 1.502 | 86.62 |
5 | 0.412 | 89.44 |
6 | 0.398 | 92.38 |
7 | 0.214 | 95.95 |
8 | 0.177 | 97.72 |
9 | 0.098 | 99.76 |
10 | 0.028 | 99.89 |

In this paper, the D-FNN model is established to predict the internal corrosion rate of the pipeline. The model has four input nodes, selected by the PCA algorithm. A total of 34 input-output pairs are used in this research: 24 pairs form the training dataset and the remaining 10 form the test dataset. The target precision of the model is set to 0.05, and training terminates when the training error falls below 0.05 or the number of iterations reaches the maximum of 80. The initial parameters of D-FNN are

(a) The change of fuzzy rules during model training. (b) Root mean square error during training.

To assess the prediction accuracy of the proposed model, the root mean square error (RMSE), the mean absolute percentage error (MAPE), and Theil’s inequality coefficient (TIC) are employed in this paper. The RMSE measures the difference between the predicted values and the observed values, the MAPE expresses the average absolute error as a percentage of the observed values, and the TIC indicates the level of agreement between the studied process and the proposed model, with values closer to zero indicating better agreement [

Three metric rules.

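Metric | Equation | No. |
RMSE | $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$ | (30) |
MAPE | $\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left\lvert\frac{y_i - \hat{y}_i}{y_i}\right\rvert$ | (31) |
TIC | $\mathrm{TIC} = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}}{\sqrt{\frac{1}{n}\sum_{i=1}^{n}y_i^2} + \sqrt{\frac{1}{n}\sum_{i=1}^{n}\hat{y}_i^2}}$ | (32) |

Here $y_i$ is the observed corrosion rate, $\hat{y}_i$ is the predicted value, and $n$ is the number of samples.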
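The three metrics in the table above can be computed as follows. This sketch assumes their standard definitions, with the TIC taken as the RMSE divided by the sum of the root mean squares of the observed and predicted values; the function names and the sample values are illustrative and are not results from the study.

```python
import numpy as np

def rmse(y, y_hat):
    """Root mean square error between observed and predicted values."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mape(y, y_hat):
    """Mean absolute percentage error in percent; observed values must be nonzero."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(100.0 * np.mean(np.abs((y - y_hat) / y)))

def tic(y, y_hat):
    """Theil's inequality coefficient, between 0 (perfect agreement) and 1."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    denominator = np.sqrt(np.mean(y ** 2)) + np.sqrt(np.mean(y_hat ** 2))
    return rmse(y, y_hat) / float(denominator)

# Illustrative values only.
y_obs = [1.0, 2.0, 4.0]
y_pred = [1.0, 2.0, 2.0]
rmse_demo = rmse(y_obs, y_pred)
mape_demo = mape(y_obs, y_pred)
tic_demo = tic(y_obs, y_pred)
```

Unlike the RMSE, the MAPE and the TIC are dimensionless, which makes them easier to compare across datasets with different corrosion rate ranges.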

The ANN, FNN, and D-FNN models are also evaluated for comparison with the PCA-D-FNN model. In the comparative experiment, all models were trained on the 24-pair training dataset, with the remaining 10 pairs used as the test dataset. The ANN consists of four input nodes, one hidden layer with seven nodes, and one output layer, while the transfer function is

RMSE, MAPE, and TIC of the internal corrosion rate predictions of the models on the testing dataset.

Evaluation metrics | ANN | FNN | D-FNN | PCA-D-FNN |
RMSE | 0.6863 | 0.6273 | 0.5464 | 0.4232 |
MAPE (%) | 12.44 | 9.26 | 7.11 | 5.91 |
TIC | 0.3471 | 0.3248 | 0.2784 | 0.2352 |

Leave-one-out cross-validation (LOOCV) results for the different algorithms.

Evaluation metrics | ANN | FNN | D-FNN | PCA-D-FNN |
RMSE | 0.7324 | 0.6121 | 0.4931 | 0.4133 |
MAPE (%) | 8.56 | 7.82 | 6.01 | 5.32 |
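The LOOCV procedure behind such a table can be sketched generically: each sample is held out once, the model is refitted on the remaining samples, and the held-out predictions are pooled before the metrics are computed. The sketch below assumes this pooling and uses ordinary linear regression purely as a stand-in model; `loocv_predictions`, `fit_linear`, and `predict_linear` are hypothetical helpers, not the models compared in the study.

```python
import numpy as np

def loocv_predictions(X, y, fit, predict):
    """Predict every sample with a model fitted on all the other samples."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    n = len(y)
    predictions = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                 # hold out sample i
        model = fit(X[keep], y[keep])
        predictions[i] = predict(model, X[i:i + 1])[0]
    return predictions

def fit_linear(X, y):
    """Stand-in model: least-squares linear regression with an intercept."""
    A = np.hstack([np.ones((len(X), 1)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.hstack([np.ones((len(X), 1)), X]) @ coef

# Synthetic data with an exactly linear target (not the pipeline data).
rng = np.random.default_rng(3)
X_demo = rng.uniform(0.0, 1.0, size=(34, 4))
y_demo = 0.5 + X_demo @ np.array([1.0, -2.0, 0.5, 3.0])
pred_demo = loocv_predictions(X_demo, y_demo, fit_linear, predict_linear)
loocv_rmse = float(np.sqrt(np.mean((y_demo - pred_demo) ** 2)))
```

With only 34 samples, LOOCV uses nearly all of the data for each fit, at the cost of refitting the model once per sample.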

From Table

The computation times of the ANN, FNN, D-FNN, and PCA-D-FNN models are 1.923 s, 2.341 s, 2.571 s, and 1.621 s, respectively. The proposed method can be used for internal corrosion rate prediction of gas pipelines.

The internal corrosion rate of a gas pipeline is affected by many factors, and internal corrosion greatly affects the reliability of the pipeline, so accurate forecasting of the internal corrosion rate is especially important. A hybrid model called the PCA-D-FNN is therefore proposed in this paper. PCA is used to extract features and reduce the dimensions of the original sample, and four principal components, which retain 86.62% of the original information, are extracted. The D-FNN is then used to conduct the prediction; it combines the advantages of fuzzy rules and ANNs to overcome the drawbacks of the individual methods. The method generates fuzzy rules dynamically during learning, so the number of rules does not grow exponentially as the number of input variables increases, which improves the generalization ability of the network. Testing on the collected corrosion data demonstrates the effectiveness of the hybrid model. In a comparison with the ANN, FNN, and D-FNN models, the PCA-D-FNN model predicts the internal corrosion rate with an RMSE of 0.4232, a MAPE of 5.91%, and a TIC of 0.2352 on the testing dataset, which is more accurate than the other models. The LOOCV results also show that the PCA-D-FNN outperforms the other algorithms. The PCA-D-FNN also obtains the best forecasting performance with a fast convergence rate and a strong ability to search for global optima. Therefore, the proposed model demonstrates great potential in applications concerned with the internal corrosion rate of pipelines.

The data used to support the findings of this study have not been made available because they are currently under embargo while the research findings are commercialized. Requests for data, 10 months after publication of this article, will be considered by the corresponding author.

The authors declare that there are no conflicts of interest regarding the publication of this paper.

This work was supported by National Science Foundation of China (no. 51874255).
