Please use this identifier to cite or link to this item:
http://hdl.handle.net/11375/29455
Title: Explainable Learning of Long-term Dependencies through Machine Learning
Authors: Martinez-Garcia, Fernando
Advisor: Down, Douglas
Department: Computing and Software
Keywords: Explainable, Interpretable, AI, RNN, LSTM, Machine Learning, Time series, Adaptive Predictive Control, E-LSTM, GI-LSTM
Publication Date: 2024
Abstract: Machine learning-based models have yielded remarkable results across a wide range of applications, revolutionizing industries over the last few decades. However, technical challenges such as the drastic increase in model size and complexity have become a barrier to their portability and human interpretation. This work focuses on enhancing specific machine learning models used in the time-series forecasting domain. The study begins by demonstrating the effectiveness of a simple, interpretable-by-design machine learning model on a real-world, industry-related time-series problem. This model incorporates new data while dynamically forgetting previous information, thus promoting continuous learning and adaptability and laying the groundwork for practical applications in industries where real-time, interpretable adaptation is crucial. Then, the well-established LSTM neural network, an advanced but less interpretable model able to learn longer and more complex time dependencies, is modified to produce a model, named E-LSTM, with extended temporal connectivity that better captures long-term dependencies. Experimental results demonstrate improved performance with no significant increase in model size across various datasets, showcasing the potential to balance performance and model size. Finally, a new LSTM architecture, called the Generalized Interpretable LSTM (GI-LSTM), is proposed; it builds on the E-LSTM's increased temporal connectivity while embedding interpretability. This architecture is designed to offer a more holistic interpretation of its learned long-term dependencies, providing semi-local interpretability through insights into the relevance detected across the time-series data. Furthermore, the GI-LSTM outperforms alternative models, generally produces smaller models, and shows that performance does not necessarily come at the cost of interpretability.
URI: http://hdl.handle.net/11375/29455
Appears in Collections: Open Access Dissertations and Theses
Files in This Item:
File | Description | Size | Format
---|---|---|---
Martinez_Fernando_2023December_PhD.pdf | | 6.03 MB | Adobe PDF
Items in MacSphere are protected by copyright, with all rights reserved, unless otherwise indicated.