Monday, September 1, 2025

Integrating Structured and Unstructured Data with LLMs and RAG

Traditional quantitative methods rely mostly on structured data, such as time series. With the emergence of Large Language Models (LLMs), it is now practical to process unstructured data, such as news text, as well. A new line of research focuses on integrating unstructured data analysis into traditional quantitative frameworks.

Along this line, Reference [1] proposes using LLMs together with retrieval-augmented generation (RAG) to process structured and unstructured data jointly. Specifically, the authors developed a system that first applies LLMs to detect regime shifts using time-series techniques, then employs RAG to bring external knowledge into the model’s decision-making process. By retrieving relevant information from a vector database and supplying it to the model as context, RAG is intended to improve both the interpretability and the effectiveness of the resulting trading strategies.
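The retrieval step can be illustrated with a minimal sketch. It is not the authors' implementation: this post does not specify the embedding model, the vector database, or the prompt format used in Reference [1], so the bag-of-words `embed` function, the in-memory `NEWS` list, and the `build_prompt` layout below are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' (assumption: stands in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(regime_label, query, documents, k=2):
    """Combine a detected regime label with retrieved text (hypothetical prompt layout)."""
    lines = [f"Detected regime: {regime_label}", "Relevant context:"]
    lines += [f"- {doc}" for doc in retrieve(query, documents, k)]
    lines.append(f"Question: {query}")
    return "\n".join(lines)

# Assumption: a tiny in-memory list plays the role of the vector database.
NEWS = [
    "central bank raises interest rates to fight inflation",
    "tech earnings beat expectations as cloud revenue grows",
    "oil prices fall on weak demand outlook",
]

if __name__ == "__main__":
    print(build_prompt("high-volatility", "inflation and interest rates", NEWS))
```

In a production setting, the toy embedding would be replaced by a learned embedding model and the list by an actual vector store; the overall flow (embed the query, rank stored documents by similarity, prepend the top matches to the prompt) stays the same.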

The authors pointed out,

"This study demonstrates the integration of fine-tuned LLMs with RAG for adaptive trading strategies and portfolio management. By combining numerical time series data and textual insights from news and macroeconomic indicators, the proposed framework addresses the challenges of multimodal data integration in financial markets.

"The experimental results highlight the value of fine-tuning smaller LLMs, such as GPT-4o Mini, which improves regime shift detection and trading decision accuracy while maintaining computational efficiency. The application of SAX enhances the compatibility of time series data with LLMs, while the CoT framework ensures transparency and robustness in decision-making. This proof-of-concept establishes a solid foundation for integrating advanced LLMs in quantitative finance."
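For readers unfamiliar with SAX (Symbolic Aggregate approXimation), the sketch below shows the standard textbook procedure: z-normalize the series, average it over equal-width segments, then map each segment mean to a letter using breakpoints that split the standard normal distribution into equiprobable bins. The resulting string is what makes a numeric series easy to place in an LLM prompt. The function name `sax`, its parameters, and the sample data are illustrative assumptions; the segment count and alphabet size used in Reference [1] are not given in this post.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev

def sax(series, n_segments, alphabet="abcd"):
    """Convert a numeric series into a SAX word of length n_segments."""
    if not 0 < n_segments <= len(series):
        raise ValueError("n_segments must be between 1 and len(series)")

    # Step 1: z-normalize (a constant series maps to all zeros).
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma else 0.0 for x in series]

    # Step 2: piecewise aggregate approximation (mean of each segment).
    n = len(z)
    paa = [
        mean(z[i * n // n_segments:(i + 1) * n // n_segments])
        for i in range(n_segments)
    ]

    # Step 3: breakpoints that cut N(0, 1) into equiprobable regions.
    a = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / a) for j in range(1, a)]

    # Step 4: map each segment mean to the letter of the region it falls in.
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)

if __name__ == "__main__":
    # Hypothetical, steadily rising series.
    print(sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4))  # abcd
```

A larger alphabet preserves more amplitude detail, and more segments preserve more temporal detail, at the cost of a longer string in the prompt.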

In short, incorporating RAG into the framework enhances the model’s ability to interpret complex macroeconomic environments and adapt trading strategies as conditions evolve. The reported experimental results show gains in predictive accuracy and risk-adjusted returns, supporting the practical value of combining fine-tuned LLMs with RAG in finance.

Let us know what you think in the comments below or in the discussion forum.

References

[1] Li, C., Chan, C.H.R., Huang, S.H., Choi, P.M.S. (2025). Integrating LLM-Based Time Series and Regime Detection with RAG for Adaptive Trading Strategies and Portfolio Management. In: Choi, P.M.S., Huang, S.H. (eds) Finance and Large Language Models. Blockchain Technologies. Springer, Singapore.

source https://harbourfronts.com/integrating-structured-unstructured-data-llms-rag/