Deep-Learning Estimation of Perfusion Kinetic Parameters in Contrast-Enhanced Ultrasound Imaging

Contrast-enhanced ultrasound (CEUS) is a sensitive imaging technique for evaluating blood perfusion and tissue vascularity, whose quantification can assist in characterizing different perfusion patterns, e.g. in cancer or in arthritis. The perfusion parameters are estimated by fitting non-linear parametric models to experimental data, usually through the optimization of non-linear least squares, maximum likelihood, free energy or other criteria that evaluate the adherence of a model to the data. However, the low signal-to-noise ratio and the non-linearity of the model make the parameter estimation difficult. We investigate the possibility of providing estimates for the model parameters by directly analyzing the available data, without any fitting procedure, by using a deep convolutional neural network (CNN) that is trained on simulated ultrasound datasets of the model to be used. We demonstrate the feasibility of the proposed method both on simulated data and on experimental CEUS data. In the simulations, the trained deep CNN performs better than constrained non-linear least squares in terms of accuracy of the parameter estimates, and is equivalent in terms of sum of squared residuals (goodness of fit to the data). In the experimental CEUS data, the deep CNN trained on simulated data performs better than non-linear least squares in terms of sum of squared residuals.


INTRODUCTION
At medical ultrasound imaging frequencies (1-10 MHz), the blood signal is weak. Contrast-enhanced ultrasound (CEUS) is an imaging modality that exploits microbubbles as a contrast agent to visualize blood flow. Since microbubbles are confined to the vascular bed and have sizes comparable to red blood cells, they are an ideal agent to non-invasively assess vascularization and perfusion, e.g. in arthritis [1-5]. CEUS data are generally quantified within a user-defined region of interest (ROI) by analyzing the time-intensity curve (TIC), which is the average of all ROI pixels in each frame. However, the heterogeneity of the tissues of interest can only be assessed through a pixel-wise quantification, which involves fitting a parametric model of tissue perfusion to the TICs extracted from individual pixels [1][2][3][4][5]. The procedure to obtain the parameter values that best adapt a model to the available observations usually involves the minimization of a cost function measuring the distance of the measurements from the model estimates, and it is performed through classical approaches such as non-linear least squares (NLLS), maximum likelihood (ML) or maximum a posteriori (MAP). Model fitting may yield parameter estimates with large variance as well as considerable bias, since the measured CEUS data are generally very noisy and involve only a small number of sampling points. Moreover, all these methods may converge to erroneous solutions, since the objective function can have multiple local minima, which makes them sensitive to the initial conditions. Finally, model fitting is computationally demanding, and may become a relevant bottleneck when a model has to be fitted to large datasets, with a large number of pixels (and thus TICs) for every patient.
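To make the distinction between ROI-level and pixel-wise quantification concrete, the following minimal sketch computes both kinds of TIC from an image stack. The array layout (frames, rows, columns), the boolean ROI mask and the function names are assumptions made for illustration only; they do not describe the processing pipeline used in this work.

```python
import numpy as np

def roi_tic(frames, roi_mask):
    """Average the ROI pixels in each frame: one TIC for the whole ROI."""
    # frames: (n_frames, n_rows, n_cols); roi_mask: boolean (n_rows, n_cols)
    return frames[:, roi_mask].mean(axis=1)

def pixel_tics(frames, roi_mask):
    """Return one TIC per ROI pixel, shape (n_pixels, n_frames)."""
    return frames[:, roi_mask].T

# Toy example: 4 frames of a 2x2 image, ROI covering the top row.
frames = np.arange(16, dtype=float).reshape(4, 2, 2)
roi_mask = np.array([[True, True], [False, False]])
mean_curve = roi_tic(frames, roi_mask)   # one curve for the ROI
curves = pixel_tics(frames, roi_mask)    # one curve per ROI pixel
```

The pixel-wise variant yields as many curves as there are ROI pixels, which is why per-curve fitting cost becomes a bottleneck on large datasets.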

A. Related works
Few attempts have been made at parametric model fitting using machine learning in general and deep neural networks in particular, as the bulk of research on the application of deep neural networks to dynamical systems concentrates on the prediction of future system states, using AI as a black box or gray box (for a review see [6]), and only retrospectively linking the network weights to some physical meaning. Additionally, many state-of-the-art classification records have been achieved by deep convolutional neural networks (CNNs) [7][8][9]. However, parametric mapping in imaging applications using deep learning has been attempted only recently, especially on dynamic magnetic resonance imaging data [10][11][12][13], where CNNs were used to directly estimate perfusion parameters from temporal sequences. The major drawback of these methods is that no ground-truth data (pairs of time curves and true model parameters) are available for training the proposed neural networks: all proposed methods rely on the results of other methods as target values for the supervised learning. The trained CNNs hence incorporate all errors possibly present in the reference method.

B. Contributions
Similarly to other methods, the present work aims at employing CNNs for direct estimation of model parameter values when modelling a temporal sequence.
The key novelty is that, at variance with all currently available methods, we investigate the possibility of training a CNN on simulated data, which allows a controlled setting and an accurate monitoring of the network performance, and of applying it to real data while maintaining state-of-the-art performance. The work compares the performance of the proposed approach with classical non-linear least squares (NLLS) [1,5,14] both on data simulated from a non-linear model and on real data from CEUS images.

A. Training data definition
In the following sections each training example $y$ is assumed to be a one-dimensional vector in $\mathbb{R}^T$, where $T$ is the number of time points $t_1, \dots, t_T$, obtained by sampling a curve described by a parametric model $f(\mathbf{p}, t)$ with additive noise $n(t)$.
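As an illustration of how such a training example could be produced, the sketch below samples a parametric curve on a fixed time grid and adds noise. It assumes the standard gamma-variate form $A (t - t_0)^{\alpha} e^{-(t - t_0)/\beta}$ as the model; the parameter names and values, the noise levels and the 1 s grid are hypothetical, and the recirculation component and the reparametrization used in this work are not reproduced.

```python
import numpy as np

def gamma_variate(t, amplitude, t0, alpha, beta):
    """Standard gamma-variate curve; zero before the arrival time t0."""
    dt = np.clip(t - t0, 0.0, None)
    return amplitude * dt**alpha * np.exp(-dt / beta)

def simulate_example(params, t, sigma_white, sigma_prop, rng):
    """Sample the model on the grid t and add white plus proportional noise."""
    clean = gamma_variate(t, *params)
    white = sigma_white * rng.standard_normal(t.size)
    proportional = sigma_prop * clean * rng.standard_normal(t.size)
    return clean + white + proportional

rng = np.random.default_rng(0)
t = np.linspace(0.0, 100.0, 101)   # hypothetical 1 s sampling grid
params = (1.0, 5.0, 2.0, 8.0)      # hypothetical (amplitude, t0, alpha, beta)
y = simulate_example(params, t, sigma_white=0.05, sigma_prop=0.1, rng=rng)
```

Because the true parameters of every simulated curve are known, pairs of noisy curves and parameters generated this way can serve as supervised training targets.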

B. Convolutional neural network architecture for each model parameter
The proposed architecture is represented in Fig. 1 and is composed of three blocks, each consisting of a sequence of two convolutional layers with non-linear activation (leaky ReLU) followed by a max-pooling layer. The convolutional filters are 1x3 to reflect the 1-dimensional nature of the problem, and the max pooling has dimension 2 (halving the size of the input data). The convolutional blocks are then followed by a flattening and three fully connected layers performing the parameter regression. The loss function minimizes the mean relative error, so as to obtain reliable estimates across the whole range of possible parameter values. With $N$ the number of available training samples, $p$ the true value of a scalar parameter and $\hat{p}$ its current estimate, the relative error loss is computed by averaging over the $N$ samples the error of $\hat{p}$ relative to $p$, and its derivative with respect to $\hat{p}$ is used in the backpropagation.
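One plausible form of this loss, assuming the relative error is taken in absolute value, is $L = \frac{1}{N} \sum_{i=1}^{N} |\hat{p}_i - p_i| / |p_i|$, whose derivative with respect to each estimate is $\partial L / \partial \hat{p}_i = \mathrm{sign}(\hat{p}_i - p_i) / (N |p_i|)$. The sketch below implements this assumed form; the function names are hypothetical and the exact expression used in this work may differ.

```python
import numpy as np

def relative_error_loss(p_hat, p_true):
    """Mean absolute relative error over the N training samples (assumed form)."""
    return np.mean(np.abs(p_hat - p_true) / np.abs(p_true))

def relative_error_grad(p_hat, p_true):
    """Derivative of the loss with respect to each estimate p_hat[i]."""
    n = p_true.size
    return np.sign(p_hat - p_true) / (n * np.abs(p_true))

p_true = np.array([0.5, 2.0, 10.0, 4.0])
p_hat = np.array([0.6, 1.5, 12.0, 4.4])
loss = relative_error_loss(p_hat, p_true)
grad = relative_error_grad(p_hat, p_true)
```

Dividing by the true value gives small parameters the same weight as large ones, which is the motivation stated above for using a relative rather than an absolute error.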

DATA
A. Simulated data
Following the perfusion model proposed in [15], we assume that a time-activity curve can be formalized as a nested model composed of a principal component described by a gamma-variate curve $y_{gamma}(t)$ and a slow recirculation component $y_{rec}(t)$, with $\mathbf{p}$ being the parameter vector defining the model. In order to better control the generation of physiologically meaningful curves and to avoid numerical difficulties when using fitting methods, a different parametrization was chosen. With this parametrization we can generate $N$ sets of parameters and the corresponding curves over a fixed time range; white noise and proportional noise were added to each generated curve to obtain the simulated data curve.

B. CEUS data
From a cohort [15,16] of subjects affected by inflammatory arthritis, a randomly selected patient was used in the current study. In the original cohort, all patients showed signs of inflammatory finger joint involvement: the joint with the highest disease activity was chosen for CEUS examination. Each subject underwent a 2-minute CEUS scan as described in [15]. CEUS images were motion corrected and co-registered to the corresponding B-mode anatomical image, and the time-intensity curves were normalized by their maximum value. Only pixels within the synovial region (manually or semi-automatically outlined [17]) that showed a significant enhancement were evaluated, resulting in 7238 TICs extracted from the single patient considered. All curves were resampled and trimmed to 100 seconds to meet the input size of the CNN. Some representative time-intensity curves are shown in Fig. 2.
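The normalization, resampling and trimming of the measured TICs can be sketched as follows. This is a minimal illustration under stated assumptions: linear interpolation, a 1 s output grid with 101 points and the toy acquisition timing are hypothetical choices, since only the 100-second duration is given in the text.

```python
import numpy as np

def prepare_tic(times, intensities, duration=100.0, n_points=101):
    """Normalize a TIC by its maximum and resample it on a fixed grid."""
    normalized = intensities / intensities.max()
    grid = np.linspace(0.0, duration, n_points)
    # Linear interpolation; samples acquired after `duration` are discarded.
    return grid, np.interp(grid, times, normalized)

# Toy curve acquired every 2.5 s for 120 s (hypothetical acquisition timing).
times = np.arange(0.0, 120.0 + 1e-9, 2.5)
intensities = 40.0 * np.exp(-((times - 30.0) / 15.0) ** 2)
grid, tic = prepare_tic(times, intensities)
```

Every curve prepared this way has the same length and time axis, which a network with a fixed input size requires.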

A. Parameter estimation on simulated data
A set of $N = 10000$ data curves was generated and split using 80% for training and 20% for testing. The CNN was implemented and trained in Matlab (The Mathworks Inc.) using MatConvNet [18]. The training was performed for 200 epochs with a decreasing learning rate, using the Adam [19] update and a minibatch size of 100. The results were compared with those obtained using a constrained non-linear least squares method; to provide a fair comparison, the constrained NLLS estimation was given as parameter limits those used in the generation of the curves, since these might be additional knowledge that leaks into the CNN during the learning phase. The absolute relative error (ARE) of the j-th parameter was defined as in Eq. 12, and the results are reported in Tab. II. However, since the mean absolute relative error is heavily affected by relatively small errors on parameters with values close to zero, we also report the mean weighted absolute error (WAE) (Tab. III), in which the absolute error on the j-th parameter is weighted by $\bar{p}(j)$, the average of the real parameter values. Surprisingly, despite the difference in the values of the parameter estimates, the mean sum of squared residuals (MSSR) between the estimated curves and the data was similar for CNN (6.2) and NLLS (6.0), with no statistically significant difference.
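The two error measures can be illustrated with a short sketch. It assumes that the ARE of a parameter is $|\hat{p} - p| / |p|$ averaged over the test curves, and that the WAE divides the mean absolute error by the average true value $\bar{p}(j)$; both are plausible readings of the definitions above rather than the exact equations, and the function names and toy values are hypothetical.

```python
import numpy as np

def mean_are(p_hat, p_true):
    """Mean absolute relative error of each parameter (one value per column)."""
    return np.mean(np.abs(p_hat - p_true) / np.abs(p_true), axis=0)

def mean_wae(p_hat, p_true):
    """Mean absolute error of each parameter, scaled by the mean true value."""
    return np.mean(np.abs(p_hat - p_true), axis=0) / np.abs(p_true.mean(axis=0))

# Rows are test curves, columns are parameters (toy values).
p_true = np.array([[0.01, 10.0], [1.00, 20.0], [2.00, 30.0]])
p_hat = np.array([[0.02, 11.0], [1.01, 19.0], [2.01, 33.0]])
are = mean_are(p_hat, p_true)
wae = mean_wae(p_hat, p_true)
```

In the first column every estimate is off by the same absolute amount, yet the single true value close to zero dominates the ARE, while the WAE stays small.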

B. CEUS data
The deep CNN trained on simulated data was applied to real data for estimating the parameters of the chosen model. In order to assess the performance of the proposed method, the parameters were also estimated on the same data using a classical constrained NLLS curve fitting. At variance with the results on simulated data, the MSSR values obtained using the parameters estimated by the CNN and by the constrained NLLS method differ significantly (Tab. II). Moreover, the mean estimated values for the different parameters are reported in Tab. III: for all parameters, the difference between CNN and NLLS estimates is significant, and it can be qualitatively appreciated by visually inspecting the parametric maps shown in Fig. 3.
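On real data no true parameters are available, so the comparison rests on the goodness of fit. A minimal sketch of the MSSR computation, assuming the measured and model curves are stored as arrays of shape (curves, time points), is given below; the function name and the toy values are hypothetical.

```python
import numpy as np

def mssr(data, fitted):
    """Mean over curves of the sum of squared residuals of each curve."""
    residuals = data - fitted          # shape (n_curves, n_time_points)
    return np.mean(np.sum(residuals**2, axis=1))

# Toy data: two measured curves with three time points each.
data = np.array([[0.0, 1.0, 0.5], [0.0, 0.8, 0.4]])
fit_a = np.array([[0.0, 0.9, 0.5], [0.1, 0.8, 0.4]])  # curves from one method
fit_b = np.array([[0.2, 1.0, 0.3], [0.0, 0.6, 0.4]])  # curves from another
mssr_a = mssr(data, fit_a)
mssr_b = mssr(data, fit_b)
```

A lower MSSR means that the curves generated from the estimated parameters adhere more closely to the measured TICs.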

DISCUSSION & CONCLUSIONS
Fitting non-linear parametric models to CEUS data is usually carried out through iterative procedures that are sensitive to the initial conditions and to local minima of the objective function used as fit criterion. This makes estimation and interpretation a delicate issue, as the estimated parameters are used to gain insight into the inner behavior of physiological, pathological or cognitive mechanisms, so that modellers often need to check and adjust the estimation procedure until convergence is ensured. Some attempts at using deep-learning regression have been made, using as ground truth the values of the model parameters obtained by alternative methods. At variance with these approaches, the proposed method is trained on simulated data. The results show that on simulated data it outperforms classical NLLS in terms of estimation accuracy with an equivalent goodness of fit, and that it maintains its performance when applied to real CEUS data, providing parameters that fit the data better than NLLS in a fraction of the time.
The main current limitations of the proposed method are the need to be able to generate a reasonable simulation of the data and, more importantly, the need for a constant sampling grid, as the input of the CNN is a curve sampled at a fixed number $T$ of time points.

ACKNOWLEDGMENTS
The work has been partially supported by University of Padova intramural research project BIRD160889/16. The authors have no relevant financial or non-financial interests to disclose.

COMPLIANCE WITH ETHICAL STANDARDS
The study was approved by the local Ethic Committee of the University Hospital of Padova (Italy) (number 52723; October 11, 2010), and written informed consent was obtained from each participant in accordance with the principles outlined in the Declaration of Helsinki, after being informed about the intent and the methodology of the study.