Low birth weight (LBW) is one of the primary causes of child mortality and of several diseases later in life in developing countries, especially in Southern Asia. The main objective of this study is to determine the risk factors of LBW and to predict LBW babies using machine learning (ML) algorithms.
Copyright: 2022 Islam Pollob et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: This study was based on an analysis of existing public domain survey datasets that are freely available online with all identifier information removed. The survey was approved by the Ethics Committee in Bangladesh. The authors were granted permission to use the data for independent research purposes. The link of the dataset is _Standard-DHS_2014.cfm?flag=0.
Several studies have sought to identify the most informative risk factors and to predict LBW using various ML algorithms. Eliyati et al. [28] used the Indonesia Demographic and Health Survey (DHS) 2012 dataset, with 12,055 respondents and eight LBW factors. They did not use any feature selection method to identify the high-risk factors of LBW. They took 80% of the dataset as the training set and the remaining 20% as the test set. They used two classifiers to predict LBW, namely logistic regression (LR) and support vector machine (SVM) with four kernels: Gaussian radial basis (GRB), polynomial (Poly), linear, and hyperbolic tangent (HT). They found that LR achieved the higher AUC, 0.56. Senthilkumar and Paulraj [29] used the maternal details of 189 respondents, 59 of whom had LBW babies. They used different ML algorithms, such as LR, naive Bayes (NB), random forest (RF), SVM, neural network (NN), and classification tree (CT), to predict LBW. CT achieved the highest accuracy, 89.9%. Hange et al. [30] worked with North Carolina State Centre for Health Statistics-2006 data, with 10,000 respondents and 131 variables. For predicting LBW, they used a synthetic minority oversampling technique with three ML-based classifiers: J48, random tree (RT), and REP Tree. They found that J48 gave an AUC of 0.90. Borson et al. [31] used the BDHS-2011 and 2014 datasets, each having 4498 respondents and eight predictors. They adopted 10-fold cross-validation (CV) and six classifiers, namely LR, NB, k-nearest neighbor (k-NN), RF, SVM, and multilayer perceptron (MLP), to predict LBW. Among them, LR achieved the highest AUC, 0.83.
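Most of the studies above are compared by AUC. As a minimal sketch of what that metric measures (not code from any of the cited studies), AUC can be read as the probability that a randomly chosen LBW case receives a higher predicted score than a randomly chosen non-LBW case. The function name `auc_score` and the toy labels and scores below are illustrative assumptions.

```python
def auc_score(labels, scores):
    """Rank-based (Mann-Whitney) AUC for binary labels (1 = LBW, 0 = not LBW)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count positive/negative pairs where the positive case is ranked higher;
    # tied scores count as half a win.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from any dataset mentioned in the text.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.8]
toy_auc = auc_score(labels, scores)
```

An AUC of 0.5 corresponds to random ranking, which is why a value such as 0.56 indicates only weak discrimination while 0.83 or 0.90 indicates a much more useful classifier.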
This study analyzed an existing public domain survey dataset that was freely available online with all identifier information removed. The ethics committee in Bangladesh approved the survey. The authors were permitted to use the data for independent research purposes.
Data for nominal and ordinal variables were expressed as percentages (%), whereas data for continuous variables were expressed as mean ± SD. We employed a chi-square test for nominal variables and an independent-samples t-test for continuous variables to examine the association between different factors and LBW. A p-value
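The descriptive and association statistics described in this paragraph can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' analysis code: the helper names `mean_sd` and `chi_square_2x2` and all counts are hypothetical, the chi-square test is shown only for a 2x2 table without continuity correction, and the t-test is not covered.

```python
import math
import statistics


def mean_sd(values):
    """Format a continuous variable as 'mean ± SD' (sample standard deviation)."""
    return f"{statistics.mean(values):.1f} ± {statistics.stdev(values):.1f}"


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: rows = twin / singleton, columns = LBW / not LBW.
table = [[20, 10], [30, 40]]
stat, p = chi_square_2x2(table)
summary = mean_sd([1.0, 2.0, 3.0])
```

For tables larger than 2x2 the same statistic applies, but the p-value requires the chi-square distribution with (rows - 1) × (columns - 1) degrees of freedom, which a statistics library would normally supply.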
Decision tree (DT) is one of the earliest tree-based supervised ML techniques [36]. The core objective of DT is to create a training model that predicts the class label (LBW or normal birth weight, NBW) while lowering the generalization error [37]. A DT consists of multiple levels: the top-most node is usually called the root node, every internal node (child node) denotes a test on an input predictor variable or factor, every branch denotes an outcome of that test, and every leaf (terminal) node denotes a class label. DT can handle both categorical and continuous data. It requires minimal data preparation and can analyze massive datasets quickly.
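The structure described above (root node, internal test nodes, branches, and leaf labels) can be illustrated with a small sketch. It is a toy implementation under stated assumptions, not the model used in the study: the Gini impurity splitting criterion, the `max_depth` limit, the function names, and the toy data are all assumptions introduced for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature_index, threshold) with lowest weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Build the tree recursively; each leaf stores the majority class label."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([rows[i] for i in left], [labels[i] for i in left],
                           depth + 1, max_depth),
        "right": build_tree([rows[i] for i in right], [labels[i] for i in right],
                            depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow branches from the root until a leaf is reached."""
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]


# Hypothetical toy data: [maternal height in metres, twin birth (0/1)].
rows = [[1.45, 0], [1.48, 1], [1.55, 0], [1.60, 0], [1.52, 1], [1.62, 0]]
labels = ["LBW", "LBW", "NBW", "NBW", "LBW", "NBW"]
tree = build_tree(rows, labels)
```

On this toy data the root node tests the first feature, and `predict(tree, [1.50, 0])` follows the left branch to a leaf. A production model would normally come from an established library rather than a hand-written tree.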
The baseline and demographic characteristics of the participants are shown in Table 2. The average prevalence of LBW in Bangladesh was 16.2%. Mothers of LBW babies had an average age of 24.8 ± 5.9, a height of 1.5 ± 0.1, and a weight of 51.49 ± 10.9. About 15.6% of LBW babies came from the Dhaka region, and 14.8% of LBW babies were delivered by cesarean section (CS). Table 2 indicates that region, education, wealth index, weight, height, twin child, child alive, and delivery by CS were significantly associated with LBW.
The LR results for the risk factors associated with LBW are presented in Table 3. Table 3 shows that the Chittagong region, mothers with no education, the poorest and middle wealth index categories, height, twin birth, and child alive were significant risk factors of LBW (p
The current study determined the significant risk factors for LBW in Bangladesh and used these risk factors to predict LBW babies with ML. We implemented the LR-based method to determine the most significant risk factors. The LR method demonstrated that region, education, wealth, height, twin birth, and whether the child is alive were the significant risk factors of LBW. Our study showed that the prevalence of LBW was 16.2%, which was consistent with the prevalence of LBW (16.0%) (
In future work, we may use other feature selection methods, such as random forest, principal component analysis, multilevel logistic regression, and stepwise logistic regression, instead of logistic regression. We may also adopt different classifiers, such as support vector machine, Gaussian process classification, artificial neural network, AdaBoost, and deep learning, for the prediction of LBW. We also want to examine how LBW changes over time.
According to the current study, various demographic features remain significant risk factors for LBW. We recommend that the government create opportunities for women to access higher education and take the necessary steps to improve the economic condition of poor people in Bangladesh.
Background: There are a myriad of language cues that indicate depression in written texts, and natural language processing (NLP) researchers have proven the ability of machine learning and deep learning approaches to detect these cues. However, to date, these approaches bridging NLP and the domain of mental health for Bengali literature are not comprehensive. The Bengali-speaking population can express emotions in their native language in greater detail.
Objective: Our goal is to detect the severity of depression using Bengali texts by generating a novel Bengali corpus of depressive posts. We collaborated with mental health experts to generate a clinically sound labeling scheme and an annotated corpus to train machine learning and deep learning models.
Methods: We conducted a study using Bengali text-based data from blogs and open source platforms. We constructed a procedure for annotated corpus generation and extraction of textual information from Bengali literature for predictive analysis. We developed our own structured data set and designed a clinically sound labeling scheme with the help of mental health professionals, adhering to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) during the process. We used 5 machine learning models for detecting the severity of depression: kernel support vector machine (SVM), random forest, logistic regression, K-nearest neighbor (KNN), and complement naive Bayes (NB). For the deep learning approach, we used long short-term memory (LSTM) units and gated recurrent units (GRUs) coupled with convolutional blocks or self-attention layers. Finally, we aimed for enhanced outcomes by using state-of-the-art pretrained language models.
Results: The independent recurrent neural network (RNN) models yielded the highest accuracies and weighted F1 scores. GRUs, in particular, produced 81% accuracy. The hybrid architectures could not surpass the RNNs in terms of performance. Kernel SVM with term frequency-inverse document frequency (TF-IDF) embeddings generated 78% accuracy on test data. We used validation and training loss curves to observe and report the performance of our architectures. Overall, the amount of available data remained the limitation of our experiment.
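The results above are reported as accuracy and weighted F1. A minimal sketch of how these two metrics are computed for a multi-class severity task is given here; the function names, the class names ("mild", "moderate", "severe"), and the toy predictions are hypothetical and are not taken from the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's share of y_true."""
    support = Counter(y_true)
    total = 0.0
    for cls, n_cls in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n_cls - tp
        # F1 = 2TP / (2TP + FP + FN); the denominator is positive because
        # TP + FN equals the class support, which is at least 1 here.
        f1 = 2 * tp / (2 * tp + fp + fn)
        total += f1 * n_cls
    return total / len(y_true)


# Hypothetical severity labels and predictions.
y_true = ["mild", "mild", "moderate", "severe", "severe", "severe"]
y_pred = ["mild", "moderate", "moderate", "severe", "severe", "mild"]
acc = accuracy(y_true, y_pred)
wf1 = weighted_f1(y_true, y_pred)
```

Weighting by class support matters when severity classes are imbalanced, since plain accuracy can look high even if a rare class is mostly misclassified.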
Conclusions: The findings from our experimental setup indicate that machine learning and deep learning models are fairly capable of assessing the severity of mental health issues from texts. For the future, we suggest more research endeavors to increase the volume of Bengali text data, in particular, so that modern architectures reach improved generalization capability.
Sign language is a communication medium for people with speech and hearing disabilities. It has various forms with different troublesome patterns, which are difficult for the general public to comprehend. Bengali sign language (BdSL) is one of the more difficult sign languages due to its immense number of letters, words, and expression techniques. Machine translation can make it easier for disabled people to communicate with the general public. Within the machine learning (ML) domain, computer vision can be the solution for them, and every ML solution requires an optimized model and a proper dataset. Therefore, in this research work, we have created a BdSL dataset named 'KU-BdSL', which consists of 30 classes describing 38 consonants ('banjonborno') of the Bengali alphabet. The dataset includes 1500 images of hand signs in total, each representing one or more Bengali consonants. Thirty-nine participants (30 males and 9 females) of different ages (21-38 years) participated in the creation of this dataset. We used smartphones to capture the images because of their readily available high-definition cameras. We believe that this dataset can be beneficial to the deaf and dumb (D&D) community. Identification of Bengali consonants of BdSL from images or videos is feasible using the dataset. It can also be employed in human-machine interfaces for disabled people. In the future, we will work on the vowels and the word level of BdSL.