Autism is a multifaceted neurodevelopmental condition whose accurate diagnosis may be challenging because the associated symptoms and severity vary considerably. A wrong diagnosis can affect families and the educational system, raising the risk of depression, eating disorders, and self-harm. Recently, many works have proposed new methods for the diagnosis of autism based on machine learning and brain data. However, these works focus on only one pairwise statistical metric, ignoring the brain network organization. In this paper, we propose a method for the automatic diagnosis of autism based on functional brain imaging data recorded from 500 subjects, 242 of whom present autism spectrum disorder, considering the regions of interest of the Bootstrap Analysis of Stable Clusters map. Our method can distinguish the control group from autism spectrum disorder patients with high accuracy. Indeed, the best performance provides an AUC near 1.0, which is higher than that found in the literature. We verify that the left ventral posterior cingulate cortex region is less connected to an area in the cerebellum of patients with this neurodevelopmental disorder, which agrees with previous studies. The functional brain networks of autism spectrum disorder patients show more segregation, less distribution of information across the network, and less connectivity compared to the control cases. Our workflow provides medical interpretability and can be used on other fMRI and EEG data, including small data sets.
Furthermore, an autism misdiagnosis might occur because many other disorders have similar symptoms. It is therefore essential to develop a quantitative and accurate method for autism diagnosis based on physical exams. This paper considers data from functional brain networks and machine learning algorithms to propose a computer-aided diagnostic methodology for autism.
Our approach is based on previous studies suggesting that autism is a manifestation of changes in brain organization5. Abnormal neuronal connectivity has recently become the essential hypothesis for explaining the symptoms associated with autism6. Using fMRI, Belmonte and Yurgelun-Todd7 demonstrated that the inputs of the autistic brain regions are cut off, with reduced activation and functional correlations with sensory areas. fMRI data from children with autism spectrum disorder (ASD)8 suggest a strong activation of the parietal cortex, which is responsible for visuospatial and sensory processing. In the resting state, regions of the medial prefrontal cortex related to executive function (the set of skills that enable an individual to make decisions, pay attention, and differentiate conflicting thoughts) are suppressed9. Apart from the medial prefrontal region, the rostral anterior cingulate cortex and the posterior cingulate cortex have also been investigated10. The function of the former includes memory recall and learning. In contrast, the posterior cingulate cortex is responsible for cognitive, emotional, and learning processes. Its metabolic activity, present during rest, is deactivated during demanding cognitive tasks. According to Kennedy et al.10, the midline resting network of patients with ASD is less active than that of the control group, and task deactivation is insignificant. In structural terms, Keller et al.11 suggested that the development of the brains of autistic children is atypical, showing an early overgrowth of white matter, followed by its reduction in adolescence and adulthood. Furthermore, Diffusion Tensor Imaging (DTI) results revealed the disorganization of white matter paths12.
These studies demonstrate that the brain structure of autistic people differs from that of healthy individuals. Therefore, we speculate that autism can be identified by reviewing information on brain anatomical organization. These data can be collected from electroencephalogram (EEG) or functional magnetic resonance imaging (fMRI) experiments. EEG is a relatively inexpensive method that is readily available in most contexts and has an excellent temporal resolution. Data from EEG have been used to enhance our understanding of human brain structural and functional networks13,14,15. On the other hand, fMRI has a low temporal resolution but a high spatial one, and is thus well suited for analyses of spatial brain dynamics16,17. fMRI scans produce a set of three-dimensional images recorded over time and measure the Blood Oxygenation Level-Dependent (BOLD) signal: a decrease in deoxyhemoglobin can be detected as an increase in the NMR signal. The temporal evolution of the BOLD series is called the hemodynamic response function and is determined by the pixel intensity in fMRI images18,19. Each cube of an fMRI image, called a voxel, anatomically maps a position in the brain and has its own BOLD time series. Here, we consider the BOLD series to develop the classification method for autistic patients.
After mapping the brain, it is possible to classify people with ASD and typical development (TD) using machine learning (ML) methods, which allow knowledge to be extracted automatically from a database. Previous studies have evaluated the effectiveness of ML in diagnosing ASD with supervised algorithms that distinguish between two classes, namely ASD and TD. To date, at least 45 articles have focused on supervised ML algorithms that aid in ASD diagnosis, the most used of which are based on support vector machines (SVM)20 (see Table 1 for publications on the use of fMRI for distinguishing between ASD and TD).
Although ML has provided important advances in diagnosing autism, considerable challenges must be addressed. Many classification methods lack interpretability, which is a disadvantage, especially for understanding medical data29,30. Also, according to Table 1 (refs. 25, 28), small data sets are quite common31,32,33,34, which might cause unreliable results. To overcome the lack of interpretability, we can consider new techniques that have emerged in recent years to facilitate the interpretation of machine learning results (e.g., SHapley Additive exPlanations (SHAP) values35, which identify the most important features for a model36,37,38). Moreover, to circumvent the limitations of small medical data sets, data augmentation techniques (e.g., sliding windows), which split data such as time series from EEG and fMRI39,40,41, might be adopted. However, one of their limitations is the loss of information during the splitting process, which the overlapping windows technique can solve. In this technique, part of the information in each window is repeated in the subsequent window; it has been used for EEG42,43 and fMRI44,45 data. In this paper, we consider these methods to develop a new method for diagnosing autism that is interpretable and can be used with small data sets. In summary, our contributions are the following:
Complex network measures characterize brain organization, quantifying the differences between ASD and TD patients. In addition, we use SHAP values for a biological interpretation of the connections between brain regions and their relation with ASD.
We adopt a sliding window data augmentation approach to increase the sample size by splitting each time series into shorter series. The windows are either mutually exclusive or overlapping, in which case portions of the sequence are repeated in multiple observations. This approach enables handling small medical data sets.
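The sliding window augmentation described above can be illustrated with a minimal sketch. The function name `sliding_windows`, the parameters `window` and `step`, and the toy array are hypothetical choices made for illustration; they are not the window lengths or the implementation used in the paper. Setting `step` equal to `window` gives mutually exclusive windows, while a smaller `step` gives overlapping ones.

```python
import numpy as np


def sliding_windows(series, window, step):
    """Split a (timepoints, regions) array into windows of `window` timepoints.

    step == window -> mutually exclusive windows
    step <  window -> overlapping windows (part of each window is repeated)
    Trailing timepoints that do not fill a whole window are dropped.
    """
    series = np.asarray(series)
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    n_timepoints = series.shape[0]
    if n_timepoints < window:
        raise ValueError("series is shorter than one window")
    starts = range(0, n_timepoints - window + 1, step)
    return np.stack([series[s:s + window] for s in starts])


# Toy example (assumed data): 10 timepoints, 2 regions.
series = np.arange(20).reshape(10, 2)

nonoverlap = sliding_windows(series, window=4, step=4)  # 2 windows
overlap = sliding_windows(series, window=4, step=2)     # 4 windows

print(nonoverlap.shape, overlap.shape)
```

With overlapping windows, the same recording yields more observations, at the cost of those observations no longer being independent of one another, which matters when splitting data into training and test sets.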
It is essential to point out that, despite the extensive studies involving ML algorithms for the diagnosis of ASD (as mentioned in Table 1), previous works considered just one pairwise metric, i.e., Pearson correlation21,22,27. However, as verified in previous studies (e.g., ref. 46), correlation metrics are vital for diagnosing mental disorders. Therefore, we consider nine different pairwise metrics to find which best captures the ASD brain changes. Furthermore, unlike the studies in Table 1, we employ SHAP values to identify the connections that differ between ASD and control patients. Moreover, we consider measures of complex networks to analyze how functional brain networks are modified in ASD. Thus, we propose a more robust methodology that considers not just ML algorithms but also complex network measures while offering a medical interpretation of the results produced.
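To make the idea of complex network measures concrete, the following is a minimal sketch of how a connectivity matrix could be turned into a graph and summarized. The threshold value, the use of absolute correlation, the choice of measures (degree, density, local clustering coefficient), and the function names `binarize` and `network_measures` are all assumptions made for illustration; this section does not specify which measures or thresholds the paper uses.

```python
import numpy as np


def binarize(conn, threshold):
    """Keep an edge where |connectivity| >= threshold (assumed rule)."""
    conn = np.asarray(conn, dtype=float)
    adj = (np.abs(conn) >= threshold).astype(int)
    np.fill_diagonal(adj, 0)  # no self-loops
    return adj


def network_measures(adj):
    """Degree, density, and local clustering of an undirected binary graph."""
    adj = np.asarray(adj)
    n = adj.shape[0]
    degree = adj.sum(axis=1)
    # Each undirected edge appears twice in a symmetric adjacency matrix.
    density = adj.sum() / (n * (n - 1))
    # diag(A^3) counts closed walks of length 3: two per triangle at a node.
    triangles = np.diagonal(adj @ adj @ adj) / 2
    possible = degree * (degree - 1) / 2
    clustering = np.divide(triangles, possible,
                           out=np.zeros(n), where=possible > 0)
    return {"degree": degree, "density": density, "clustering": clustering}


# Toy 4-region connectivity matrix (assumed values).
conn = np.array([[1.0, 0.9, 0.8, 0.7],
                 [0.9, 1.0, 0.6, 0.1],
                 [0.8, 0.6, 1.0, 0.2],
                 [0.7, 0.1, 0.2, 1.0]])

measures = network_measures(binarize(conn, threshold=0.5))
print(measures)
```

In this toy graph, regions 0, 1, and 2 form a triangle and region 3 attaches only to region 0, so the clustering coefficient is high for regions 1 and 2 and zero for region 3. Higher average clustering is one common way to quantify segregation.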
Brain regions of interest (ROIs), rather than the entire set of BOLD time series obtained from each voxel of the brain image, are considered. A brain atlas containing these ROIs is used; therefore, only the BOLD time series of the voxels in these ROIs were adopted. Among the numerous predefined atlases, Bootstrap Analysis of Stable Clusters (BASC) was chosen since it was the map with the best performance for distinguishing ASD patients by a deep learning model, according to ref. 22. It was proposed in ref. 49 and generated from group brain parcellation by the BASC method, which is a k-means clustering-based algorithm that identifies brain networks with coherent activity in resting-state fMRI50. The BASC map with 122 ROIs was used here (see Fig. 1). The preprocessed BOLD time series extracted for the 122 regions can be found in the Supplementary Information.
Methodology to obtain the connectivity matrices. In (a), the time series of 122 ROIs are extracted from fMRI data with the use of the BASC BOLD atlas (highlighted in blue, purple, and orange). In (b), the time series are correlated by pairwise statistical metrics (Pearson correlation was used in this example) to form the connectivity matrices, where each row and column corresponds to one of the Brodmann areas, for a patient with ASD and for one with TD. The same highlighted matrices are arranged in two-dimensional and three-dimensional brain schematics for better visualization.
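Step (b) of the figure can be sketched as follows for the Pearson case. The data here are synthetic random numbers standing in for real BOLD series, and the number of timepoints (200), the function name `pearson_connectivity`, and the flattening of the upper triangle into a feature vector are assumptions made for illustration rather than details confirmed by this section.

```python
import numpy as np


def pearson_connectivity(bold):
    """Return the (regions, regions) Pearson correlation matrix.

    `bold` is a (timepoints, regions) array: one BOLD series per column.
    """
    bold = np.asarray(bold, dtype=float)
    return np.corrcoef(bold, rowvar=False)


# Synthetic stand-in for one subject: 200 timepoints, 122 ROIs (assumed).
rng = np.random.default_rng(0)
bold = rng.standard_normal((200, 122))

conn = pearson_connectivity(bold)

# The matrix is symmetric, so the strictly upper triangle holds every
# distinct region pair exactly once: 122 * 121 / 2 = 7381 values.
features = conn[np.triu_indices(conn.shape[0], k=1)]

print(conn.shape, features.shape)
```

Other pairwise metrics would replace only the body of `pearson_connectivity`; the rest of the pipeline sketched here would stay the same.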