Browse Articles

Optimizing data augmentation to improve machine learning accuracy on endemic frog calls

Anand et al. | Mar 09, 2025

Image credit: Anand and Sampath 2025

The Western Ghats mountain chain on the Indian peninsula, a UNESCO World Heritage site, is home to about 200 frog species, 89 of which are endemic. Because each frog species has a distinctive vocalization, calls can be used for species recognition. Manually surveying frogs at night, in the rain, in forests inhabited by elephants and big cats is difficult, so the ability to autonomously record ambient soundscapes and identify species is essential. An effective machine learning (ML) species classifier requires substantial training data from this area, but the available dataset of frog vocalizations from this region has only a minimal number of audio recordings per species, so enhancing an ML model’s performance with limited data is necessary. The goal of this study was to assess data augmentation techniques on this dataset. We analyzed the effects of four data augmentation techniques (Time Shifting, Noise Injection, Spectral Augmentation, and Test-Time Augmentation), individually and in combination, on the frog vocalization data and the public environmental sounds dataset (ESC-50). The relative accuracy gain from combining the techniques grew as the size of the dataset decreased. The combination of all four techniques improved the ML model’s classification accuracy on the frog calls dataset by 94%. This study established a data augmentation approach that maximizes classification accuracy with sparse frog call recordings, making it possible to build a real-world automated system for identifying frog species in the field. Such a system can significantly help in the conservation of frog species in this vital biodiversity hotspot.
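The four techniques named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, parameters, and the use of plain Python lists for waveforms and spectrograms are all assumptions made for illustration, and the `predict` callable stands in for any trained classifier that returns class probabilities.

```python
import random

def time_shift(signal, max_shift, rng):
    """Time Shifting: circularly shift the waveform by a random offset."""
    if not signal:
        return list(signal)
    k = rng.randint(-max_shift, max_shift) % len(signal)
    return signal[-k:] + signal[:-k] if k else list(signal)

def inject_noise(signal, sigma, rng):
    """Noise Injection: add zero-mean Gaussian noise to every sample."""
    return [x + rng.gauss(0.0, sigma) for x in signal]

def mask_spectrogram(spec, max_freq_width, max_time_width, rng):
    """Spectral Augmentation: zero one random band of frequency rows
    and one random band of time columns in a copy of the spectrogram."""
    n_freq, n_time = len(spec), len(spec[0])
    out = [row[:] for row in spec]
    fw = rng.randint(0, min(max_freq_width, n_freq))
    f0 = rng.randint(0, n_freq - fw)
    tw = rng.randint(0, min(max_time_width, n_time))
    t0 = rng.randint(0, n_time - tw)
    for f in range(n_freq):
        for t in range(n_time):
            if f0 <= f < f0 + fw or t0 <= t < t0 + tw:
                out[f][t] = 0.0
    return out

def predict_with_tta(predict, signal, n_views, max_shift, rng):
    """Test-Time Augmentation: average class probabilities over the
    original clip plus n_views randomly shifted copies."""
    views = [list(signal)]
    views += [time_shift(signal, max_shift, rng) for _ in range(n_views)]
    probs = [predict(v) for v in views]
    return [sum(col) / len(probs) for col in zip(*probs)]
```

The first three functions would be applied to training clips to enlarge a small dataset, while the last one is applied only at inference time, which is why it takes the classifier as an argument.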

Monitoring drought using explainable statistical machine learning models

Cheung et al. | Oct 28, 2024

Droughts have a wide range of effects, from ecosystems failing and crops dying, to increased illness and decreased water quality. Drought prediction is important because it can help communities, businesses, and governments plan and prepare for these detrimental effects. This study predicts drought conditions by using predictable weather patterns in machine learning models.

Automated classification of nebulae using deep learning & machine learning for enhanced discovery

Nair et al. | Feb 01, 2024

There are believed to be ~20,000 nebulae in the Milky Way Galaxy. However, humans have cataloged only ~1,800 of them, even though we have gathered 1.3 million nebula images. Classification of nebulae is important because it helps scientists understand the chemical composition of a nebula, which in turn helps them understand the material of the original star. Our research on nebulae classification aims to make the process of classifying new nebulae faster and more accurate using a hybrid of deep learning and machine learning techniques.

Prediction of preclinical Aβ deposit in Alzheimer’s disease mice using EEG and machine learning

Igarashi et al. | Nov 29, 2022

Alzheimer’s disease (AD) is a common disease affecting 6 million people in the U.S., but no cure exists. To develop a therapy for AD, it is critical to detect amyloid-β protein in the brain at an early stage of the disease, because the accumulation of amyloid-β over 20 years is believed to cause memory impairment. However, it is difficult to examine amyloid-β in patients’ brains. In this study, we hypothesized that we could accurately predict the presence of amyloid-β using EEG data and machine learning.

Predicting asthma-related emergency department visits and hospitalizations with machine learning techniques

Chatterjee et al. | Oct 25, 2021

Seeking to investigate the effects of ambient pollutants on human respiratory health, the authors used machine learning to examine asthma in Los Angeles County, an area with substantial pollution. Using machine learning models and classification techniques, the authors found that nitrogen dioxide and ozone levels were significantly correlated with asthma hospitalizations. Based on an identified seasonal surge in asthma hospitalizations, the authors suggest future directions to improve machine learning modeling to investigate these relationships.

Advancing pediatric cancer predictions through generative artificial intelligence and machine learning

Yadav et al. | Dec 21, 2024

Pediatric cancers pose unique challenges due to their rarity and distinct biological factors, emphasizing the need for accurate survival prediction to guide treatment. This study integrated generative AI and machine learning, including synthetic data, to analyze 9,184 pediatric cancer patients, identifying age at diagnosis, cancer types, and anatomical sites as significant survival predictors. The findings highlight the potential of AI-driven approaches to improve survival prediction and inform personalized treatment strategies, with broader implications for innovative healthcare applications.

Transfer Learning for Small and Different Datasets: Fine-Tuning A Pre-Trained Model Affects Performance

Gupta et al. | Oct 18, 2020

In this study, the authors seek to improve a machine learning algorithm used for image classification: labeling images as male or female. In addition to fine-tuning the classification model, they investigate how accuracy is affected by their changes (an important task when developing and updating algorithms). To determine accuracy, one set of images is used to train the model and a separate set of images is used for validation. They found that the validation accuracy was close to the training accuracy. This study contributes to the expanding areas of machine learning and its applications to image identification.
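The train-then-validate procedure described in this abstract can be sketched as follows. This is a generic illustration rather than the authors' code: `train_val_split`, `accuracy`, the `val_fraction` parameter, and the `predict` callable are hypothetical names, and examples are assumed to be (input, label) pairs.

```python
import random

def train_val_split(items, val_fraction, seed=0):
    """Shuffle a copy of the data and hold out a validation subset."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]

def accuracy(predict, examples):
    """Fraction of (input, label) pairs that the model labels correctly."""
    if not examples:
        return 0.0
    correct = sum(1 for x, y in examples if predict(x) == y)
    return correct / len(examples)
```

Computing `accuracy` on both subsets after training gives the two numbers the abstract compares; a validation accuracy close to the training accuracy suggests the model is not simply memorizing its training images.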

Deep learning for pulsar detection: Investigating hyperparameter effects on TensorFlow classification accuracy

Upadhyay et al. | Jan 31, 2026

This study investigates how the hyperparameters epochs and batch size affect the classification accuracy of a convolutional neural network (CNN) trained on pulsar candidate data. Our results reveal that accuracy improves with increasing number of epochs and smaller batch sizes, suggesting that with optimized hyperparameters, high accuracy may be achievable with minimal training. These findings offer insights that could help create more efficient machine learning classification models for pulsar signal detection, with the potential of accelerating pulsar discovery and advancing astrophysical research.
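As background on how the two hyperparameters interact in mini-batch training generally, the sketch below counts weight updates for each combination of epochs and batch size. It makes no claim about the study's own setup or its explanation of the results: the function names and the example numbers in the usage note are hypothetical.

```python
import math
from itertools import product

def optimizer_steps(n_samples, epochs, batch_size):
    """Number of weight updates: one per mini-batch, repeated every epoch."""
    return epochs * math.ceil(n_samples / batch_size)

def step_grid(n_samples, epoch_options, batch_options):
    """Map each (epochs, batch_size) pair to its total update count."""
    return {
        (e, b): optimizer_steps(n_samples, e, b)
        for e, b in product(epoch_options, batch_options)
    }
```

For example, with 1,000 training samples and 10 epochs, a batch size of 32 yields 320 updates while a batch size of 128 yields 80, so a smaller batch size gives the network more updates for the same number of passes over the data.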
