Geometric Deep Learning for Financial Time Series and Efficient Fine-Tuning of Foundation Models
Abstract
This thesis presents two research contributions: the first improves the adaptation of large language models (LLMs) through parameter-efficient fine-tuning (PEFT); the second addresses the modelling of history-dependent stochastic processes, specifically Volterra processes, which are widely used in quantitative finance.
In the first part, I introduce a lightweight adaptation pipeline that brings the performance of a standard foundation model substantially closer to that of a fully fine-tuned, task-specific version, while using far less compute and memory and keeping data private. The pipeline leverages existing learnable low-rank adapters (LoRA) trained on known datasets and predicts adapter values for new datasets from this readily available information; a sketch of the idea follows below. Its main practical advantage is that it runs on a standard laptop without a GPU, so data never leaves the device. The method closes roughly half of the performance gap between an untuned base model and a fully fine-tuned one, making specialized models accessible to researchers, practitioners, and everyday users who lack expensive infrastructure or who handle sensitive data on devices such as smartphones.
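The abstract does not specify how adapter values are predicted, so the following is a minimal illustrative sketch in Python, assuming a library of LoRA adapter pairs indexed by dataset embeddings and combined by similarity. All names, dimensions, and the weighting scheme are assumptions for illustration, not the thesis's actual pipeline.

    # Minimal sketch (hypothetical names): predict LoRA weights for a new dataset
    # as a similarity-weighted combination of adapters trained on known datasets.
    import numpy as np

    rng = np.random.default_rng(0)

    # Library of known datasets: each has a descriptor embedding and a trained
    # LoRA pair (A_i, B_i) for one weight matrix of the base model.
    d_model, rank, n_known, d_feat = 64, 4, 5, 16
    embeddings = rng.normal(size=(n_known, d_feat))            # dataset descriptors
    adapters = [(rng.normal(size=(rank, d_model)),             # A_i
                 rng.normal(size=(d_model, rank)))             # B_i
                for _ in range(n_known)]

    def predict_adapter(new_embedding):
        """Softmax-weighted average of known adapters by dataset similarity."""
        sims = embeddings @ new_embedding                      # similarity scores
        w = np.exp(sims - sims.max()); w /= w.sum()            # softmax weights
        A = sum(wi * Ai for wi, (Ai, _) in zip(w, adapters))
        B = sum(wi * Bi for wi, (_, Bi) in zip(w, adapters))
        return A, B

    A_new, B_new = predict_adapter(rng.normal(size=d_feat))

    # Adapted forward pass: W x + B (A x), the standard LoRA update, applied
    # at inference with the predicted adapter.
    W = rng.normal(size=(d_model, d_model))
    x = rng.normal(size=d_model)
    y = W @ x + B_new @ (A_new @ x)
    print(y.shape)                                             # (64,)

Because the prediction step reduces to a few matrix operations over pre-trained adapters, it runs comfortably on CPU, consistent with the laptop-only, data-local setting described above.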
The second part addresses the computational challenge of translating the non-Markovian Volterra process into a form suitable for computation: the dimension of the history affecting the current state grows with the length of the path. I propose a two-step approach to make this tractable: first, the Volterra process is mapped onto a simpler, lower-dimensional manifold; second, a geometric deep learning model, a "hypernetwork" designed for the manifold's structure, is applied. We provide mathematical and computational evidence of the model's effectiveness and practicality (proofs developed by co-authors are available in the main paper), together with extensive parameter studies that validate the approach; a sketch of the two-step idea follows below.
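As a concrete illustration of the two-step idea, the sketch below approximates a Volterra process X_t = ∫_0^t K(t - s) dW_s with an exponential-mixture kernel, which collapses the growing path history into a finite-dimensional Markovian state, and then applies a tiny hypernetwork that generates the weights of a readout over that state. The exponential lift, the network sizes, and all variable names are assumptions for illustration; the thesis's actual manifold construction and hypernetwork architecture may differ.

    # Minimal sketch (hypothetical kernel and dimensions) of the two-step idea.
    import numpy as np

    rng = np.random.default_rng(1)

    # Step 1: low-dimensional lift. Approximating the kernel as
    # K(t) ~= sum_j c_j * exp(-lam_j * t) turns the history dependence into
    # m ODE-driven factors Y^j with dY^j = -lam_j Y^j dt + dW_t.
    lam = np.array([0.5, 2.0, 8.0])        # decay rates of the mixture
    c = np.array([0.6, 0.3, 0.1])          # mixture weights
    m, n_steps, dt = len(lam), 1000, 1e-3

    Y = np.zeros(m)                        # lifted (low-dimensional) state
    dW = rng.normal(scale=np.sqrt(dt), size=n_steps)
    states = []
    for dw in dW:
        Y = Y - lam * Y * dt + dw          # Euler step of each factor
        states.append(Y.copy())
    states = np.array(states)              # shape (n_steps, m)
    X = states @ c                         # approximate Volterra path X_t

    # Step 2: hypernetwork. A tiny net maps a conditioning input (here, time)
    # to the weights of a linear readout applied to the lifted state.
    t_grid = np.linspace(dt, n_steps * dt, n_steps)[:, None]
    H = rng.normal(size=(1, 8)) * 0.1      # hypernetwork hidden layer
    G = rng.normal(size=(8, m)) * 0.1      # hidden features -> readout weights

    def readout(t, y):
        w = np.tanh(t @ H) @ G             # time-dependent weights, shape (n, m)
        return np.sum(w * y, axis=1)       # per-time linear readout of the state

    pred = readout(t_grid, states)
    print(X.shape, pred.shape)             # (1000,) (1000,)

The design point the sketch makes is that once the history is summarized by a fixed-size state, the downstream model no longer scales with path length, and a hypernetwork can adapt the readout to the geometry of that reduced representation.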