Cortical Thickness variability in Multiple Sclerosis: The role of lesion segmentation and filling

Cortical Thickness (CTh) estimation from Magnetic Resonance Imaging (MRI) data of Multiple Sclerosis (MS) patients is biased at variable extent by the presence of white matter lesions. To overcome this limitation, several methods have been developed. In this study, we evaluate the impact on CTh measurements of different lesion corrections obtained combining three lesion segmentations (manual or automatic) with three intensity filling methods at whole brain and regional scale.


INTRODUCTION
Multiple Sclerosis (MS) is a neurological disease which involves an inflammatory state responsible for axonal myelin destruction and cerebral lesions. Among the MS potential hallmarks, cortical thinning has recently become a significant biomarker [1] of the disease progression. The cortical thickness (CTh) can be assessed by analysing structural T1-weighted (T1w) MRI images. Many methods have been proposed to perform this analysis, mainly following volumetric or surface based [2] approaches.
While in healthy subjects, all proposed CTh estimation methods show comparable accuracy [2], in MS patients, the presence of brain lesions poses a challenge for its correct estimation. In particular, White Matter (WM) lesions typically appear as hypo-intensities on T1w MR images as depicted in Fig. 1 and, consequently, can affect the performance of CTh estimation methods in two ways. First, all correction methods require the T1w MRI data to be registered to a common image space (typically the same over which the atlas is defined). The non-linear registration process can be easily misguided by the presence of such hypo-intense lesions. Secondly, WM lesions next to the cerebral cortex can be easily mistaken with Gray Matter (GM) tissue or cerebrospinal fluid (CSF) by the CTh estimation algorithms. One common approach to reduce the bias introduced by the MS lesion presence, consists in the accurate spatial segmentation of WM lesions [3] followed by an intensity filling [4] procedure that replace the intensities within the WM lesion areas with the values of neighbouring normalappearing WM tissue as represented in Fig. 2. The Gold Standard (GS) method for MS lesion detection consist, however, in a manual segmentation. Given the subjectivity and the time required by the manual segmentation, a number of automatic alternatives [3] have been proposed.  In this study, we evaluate how CTh estimation is affected by different combinations of lesion segmentation and filling methods. CTh effects due to lesion correction will be assessed both at global or regional brain scale.

MATERIALS &METHODS
Data 15 MS patients (12 RR, 3 SP, age 43±5 y, range 36-54 y, M/F: 6/9) were retrospectively selected as a subset of an ongoing study by randomly choosing subjects with a variable amount of lesions. The patients underwent an MRI protocol which included the acquisition of a 3D T1-MPRAGE sequence (TR/TE=8.3/3.7ms, Field of view of 240x240x180 mm, 1 mm 3 isotropic resolution) and a 3D Fluid Attenuated Inversion Recovery (FLAIR, TR/TE=8000/263ms, TI=1650ms, 1 mm 3 isotropic resolution). Data were acquired at the Neuroradiology Unit, Department of Radiology, University Hospital Verona, Italy, using a Philips Achieva 3TX MRI scanner equipped with 8-channel head coil. The study was approved by the local ethical committee and all patients signed the informed consent. T1w and FLAIR images were pre-processed with: intensity normalization (N4BiasFieldCorrection, ANTs, [5]), skullstripping (bet, FSL, [6]), then FLAIR image was rigidly registered (ANTs [7]) on the T1w image. Finally, as in [8] a brain parcellation of 98 Regions Of Interest (ROI) was obtained with a multi-atlas segmentation approach (Multi-Atlas Label Fusion, MALF, [9]).

Lesion segmentation
The reference lesion segmentation was provided by an expert neuroradiologist through manual segmentation of the T1w and FLAIR images of each MS patient. Automatic segmentations were obtained using two automatic methods: Lesion Segmentation Tool -LST [10], and Salem Lesion Segmentation -SLS [11].

Lesion filling
Three different methods for replacing the lesion intensities with normal-appearing WM tissue were considered: lesion_filling provided by FSL library [12], Lesion Segmentation Tool -LST [10] and SLF of the Salem Lesion Filling Toolbox [11].

Cortical thickness estimation
Voxel-wise CTh was estimated using a Diffeomorphic Registration based method (DiReCT, ANTs [13]), applied on T1w images corrected with all the combinations of lesion segmentations (manual, LST, SLS) and filling (FSL, LST, SLF). For each patient, a Whole Brain (WB) representative value was obtained averaging the voxel-wise CTh on all the voxels of the cortical ribbon reported as ℎ , , , where wb denote the spatial scale, ∈ { , , } denotes the manual (man) or automatic (LST, SLS) segmentation, ∈ { , , } the filling method used. Similarly a regional estimation was obtained by averaging only the voxels inside single ROIs, represented as ℎ , , (where r additionally indexes the ROI as = 1, … ,98 pointed).

Statistical Analysis
As first step, we want to assess if different lesion filling on the same T1w image with a fixed lesion segmentation provide consistently different CTh estimates. To this aim, we compute the Coefficient of Variation ( % = / , with the standard deviation between CTh with different corrections applied and their mean value) of the CTh estimates both for the WB and ROI level. Then, we want to test if there is any difference among different combinations of three segmentation methods and three filling methods. CTh differences are assessed by CTh Mean Relative Error as: where the CTh estimated after correction with a segmentation i and filling j is compared with one estimated after a manual segmentation and filling k (GS-corrected). Statistical pairwise comparisons of CTh are conducted with permutation test (5000 permutations, typical significance of 0.05 when not specified) and False discovery rate (FDR) criterion to account for multiple comparisons (0.05 rate when not specified) while group-wise comparisons adopted a repeated measures Analysis Of Variance (rANOVA). The full 3x3 factorial design allows us to study the main sources of CTh variability related to the lesion correction applied before CTh estimation. Taking advantage of this, a two-way rANOVA with segmentation and filling as withinsubject design factors. CTh differences between differently corrected images are thus evaluated to find any effect of segmentation or filling factors or their interaction. All the analysis was repeated at whole brain and regional level in Matlab (R2015b, The Mathworks, Natick, MA).

CTh estimation variability
The CV % of whole-brain CTh estimates considering all three filling methods based on a manual segmentation together, was 0.8%. This percentage increased at 1.31% when averaging the CV % of ROI-wise estimates with the three GS corrections over all the ROIs. Such small CV % suggests that no substantial CTh variations should be expected with the filling methods evaluated. Similarly, including also all the

TABLE 1.
Whole brain CTh differences between corrections methods assessed with MRE (%) with standard deviation over the subjects in brackets. Significant differences between different combinations (p<0.05) were marked with "*", while "**" denote highly statistically significant differences (p<10 -3 ). SLS+SLF and LST+LST automatic correction pipelines are coherently highlighted in blue and green. The same applies for the number of regions which exhibit statistically significant differences (FDR-corrected) among compared methods. Strikingly, FSL filling with any automatic segmentation provide a high number of regions with significant differences compared to the manual segmentation counterpart and considerably more regional differences than any other combination of SLS-LST segmentations with LST-SLF fillings compared. Moreover, LST and SLF filling, in combination with any segmentation method (SLS or LST), appear to perform corrections that provides a good agreement of CTh estimates with the relative GS correction, in fact most of the ROIs do not exhibit significant statistical differences. Despite some regional variability, all correction methods are in overall agreement since the maximum observed MRE of 2-3% is quite comparable with the inter-subject CTh regional variability. The MRE distribution over the ROIs, dedicated to automatic corrections (SLS+SLF and LST+LST) against GS ones (manual+SLF and manual+LST) is depicted in Fig. 3. Filling factor represent a statistically significant main effect in most (92/98, FDR-corrected) of the ROIs as given by 2way rANOVA test analysis (Greenhouse-Geisser correction accounted for non-sphericity, tested with Mauchly test).

DISCUSSION
Generally, MS lesion presence is expected to increase the CTh estimate in the immediate neighbouring of a lesion. In fact, as in T1w images MS lesions typically appear iso to ipo-intense to GM areas, they may be incorrectly considered part of the cortical ribbon, thus increasing the local CTh.
To tackle this effect, the lesion correction aims at recovering a normal appearing tissue intensity, removing (at least locally) this source of bias in the CTh estimation procedure. As highlighted by the low CV % observed, the CTh estimation variability due to different segmentation and filling methods, is overall limited and hardly increased when considering averaged regional estimations instead of whole brain CTh. The role of lesion filling is dominant in explaining the differences on the CTh estimation since it accounted for significantly more MRE variability than the segmentation used. This was confirmed by the 2-way rANOVA test.
To note is that, this observation holds similarly for the whole brain CTh and the regional CTh. In particular, all MRE values concerning FSL filling corrections were significantly higher, regardless of segmentation and filling compared to it. Moreover, SLF and LST filling with manual segmentation provided lower CTh estimates than FSL filling coupled with any automatic segmentation. This suggest that generally FSL filling provide a correction which leads to a CTh underestimation both globally and regionally. This is also consistent to FSL filling algorithm behaviour which corrects all lesions with white matter T1w intensity. The limited MRE variance over the MS subjects as well as the ROIs, similar among all pairwise correction comparisons, suggest a limited dependency over the specific lesion correction. Proposed automatic correction pipelines LST+LST [10], SLS+SLF [11] provided very low MRE compared to their the respective GS counterparts (manual+LST, manual+SLF). The MRE distribution over the ROIs (Fig. 3) of LST+LST and SLS+SLF methods exhibit consistently similar variability and shape when compared to GS counterparts. A positive consistent MRE bias around 1% appear in most of the ROIs when FSL filling is compared against, even thought with MRE overall limited under 2 to 3 % in most regions.

CONCLUSIONS
In this study, we evaluated how different combinations of lesion segmentation and filling methods affects the cortical thickness estimation in presence of MS-related lesions. Before estimating CTh, T1w images were processed combining a segmentation and a lesion filling method out of three possible lesion segmentation with three lesion-filling methods from which CTh differences related to the applied correction were assessed. The major findings of this study are that the CTh estimated after lesion filling on manually segmented lesion areas are consistently similar to fully automatic correction methods. The observed mean relative errors between CTh after automatic corrections compared to gold standard ones ranged from -1.6 % to 1.8 % in most of the regions, as well as at whole brain. Thus, the minimal expected CTh variability as solely given by the lesion correction, is in the order of magnitude of 2% in terms of MRE. Overall, the intensity filling step of the lesion correction was very significant as it accounted for most of the CTh differences observed regardless of the segmentation used.