Machine Learning 21st October


Advanced Voice Analysis with Machine Learning Algorithms




Introduction to Machine Learning Algorithms for Audio Processing, Simone Scardapane, Dipartimento di Ingegneria dell'Informazione, Elettronica e Telecomunicazioni, Sapienza Università di Roma 

When dealing with high-dimensional data such as audio and video, classical programming struggles to implement even the simplest processing routines. Machine learning offers an alternative pathway, in which historical data on the problem to be solved are used to “train” a model that generalizes to new cases (e.g., learning to classify speech impairments from a set of diagnosed segments). In the first part of this presentation, we will present the general framework of machine learning, with several use cases and well-known algorithms (e.g., support vector machines). We will also see situations in which this type of classical machine learning is limited, especially when dealing directly with raw audio or video. In the second part, we will introduce deep learning, a class of machine learning algorithms that is quickly becoming the de facto standard in many vision and audio processing problems, including medical imaging for clinical support. Deep learning algorithms can generalize from vast amounts of data starting directly from the raw signal, enabling strong performance in problems ranging from automatic machine translation to audio synthesis and music generation. We will also mention some limitations and current research problems of interest to the general public, most notably the interpretability of the results and the fairness of the models.
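The train-then-generalize idea described above can be illustrated with a deliberately small sketch. It uses a nearest-centroid classifier rather than the support vector machines mentioned in the abstract, only to stay dependency-free; the function names, feature values and labels are all hypothetical and are not taken from the lecture.

```python
from math import dist
from statistics import fmean


def fit_centroids(features, labels):
    """Training step: average the feature vectors of each class."""
    grouped = {}
    for vector, label in zip(features, labels):
        grouped.setdefault(label, []).append(vector)
    return {
        label: [fmean(column) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(centroids, vector):
    """Generalization step: pick the class whose centroid is closest."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))


# Made-up two-dimensional features for already diagnosed voice segments.
train_x = [[0.2, 0.1], [0.3, 0.2], [0.9, 0.8], [1.0, 0.7]]
train_y = ["healthy", "healthy", "impaired", "impaired"]

model = fit_centroids(train_x, train_y)
prediction = predict(model, [0.85, 0.9])  # a segment not seen during training
```

The model never sees the new segment during training; it is classified only through what was summarized from the diagnosed examples, which is the sense of "generalize" used in the abstract.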


New Frontiers in Neurology: Advanced Voice Analysis, Antonio Suppa, Dipartimento di Neuroscienze Umane, Sapienza Università di Roma

The human voice is a complex biological signal made up of a large set of voice features. Historically, the first pioneering attempts to measure and characterize voice impairment in patients with various neurological disorders were made in the 1960s, using qualitative analysis of voice recordings. These seminal studies prompted great research interest in the field over the subsequent decades. Technological advances in the 1980s progressively improved the quality of audio recordings, allowing the objective analysis of voice disorders. Among advanced techniques, spectral analysis has allowed the objective extraction of several features, including those reflecting energy (e.g., fundamental frequency, f0) and spectrum envelope (e.g., shimmer, jitter, harmonic-to-noise ratio) of the recorded voice. More recently, a growing number of researchers have applied advanced voice analysis based on machine-learning algorithms to assess changes in voice features objectively and with higher accuracy than spectral analysis. In this lecture, we will first report our experimental paradigm, including the procedures for voice recording and machine-learning analysis, and then the statistical analysis used to calculate the diagnostic accuracy of the algorithm. We will also discuss recent advances in the objective diagnosis of voice disorders through machine-learning algorithms in several neurological diseases, including essential tremor (ET) and adductor-type spasmodic dysphonia (ASD). Finally, we will discuss the objective, machine-learning-based assessment of voice improvement following pharmacological treatment.
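As a rough illustration of two quantities named above, the sketch below computes jitter and shimmer in their common "local" form (mean absolute cycle-to-cycle difference divided by the mean) and derives sensitivity, specificity and accuracy from predicted labels. It assumes that glottal cycle periods and peak amplitudes have already been extracted from the recording; the numbers, the function names and the "patient"/"control" labels are hypothetical, and the lecture's actual pipeline and statistics are not described in this announcement.

```python
from statistics import fmean


def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    differences = [abs(b - a) for a, b in zip(values, values[1:])]
    return fmean(differences) / fmean(values)


def diagnostic_metrics(true_labels, predicted_labels, positive="patient"):
    """Sensitivity, specificity and accuracy for a two-class problem.

    Assumes both classes occur in true_labels, otherwise a ratio is undefined.
    """
    tp = fp = tn = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == positive:
            if guess == positive:
                tp += 1
            else:
                fn += 1
        elif guess == positive:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Made-up cycle periods (ms) and peak amplitudes for one sustained vowel.
periods_ms = [5.0, 5.1, 4.9, 5.0]
amplitudes = [0.50, 0.52, 0.48, 0.50]
jitter = local_perturbation(periods_ms)   # applied to periods
shimmer = local_perturbation(amplitudes)  # applied to amplitudes

# Made-up clinical diagnoses versus classifier output.
truth = ["patient", "patient", "patient", "control", "control"]
output = ["patient", "patient", "control", "control", "patient"]
metrics = diagnostic_metrics(truth, output)
```

The same perturbation formula serves both features; only the input sequence changes. Which class counts as "positive" is a convention that should be fixed before reporting sensitivity and specificity.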



Wednesday 21 October 2020, 15:00 - 17:00

To attend, join via Google Meet at the link

