Authors
Bingyan Lu, Caleb Reinking, Paul Brenner, and Nick Laneman
Abstract
Spectrum policy documents, such as notices and comments in Federal Communications Commission (FCC) and National Telecommunications and Information Administration (NTIA) proceedings, are often lengthy and complex, making them difficult to access and understand. Combining large language models (LLMs) with retrieval-augmented generation (RAG) offers a promising way to address these challenges by improving information retrieval and processing. This paper evaluates the use of RAG in spectrum policy analysis. We establish an open-source knowledge dataset of policy comments, cleaned and annotated to work effectively with RAG systems, and develop a corresponding test dataset of question-answer pairs to evaluate system performance. By benchmarking four RAG-based systems, we show that RAG techniques significantly enhance the ability of LLMs to interpret and respond to complex spectrum policy queries, demonstrating their potential to make these important documents more accessible.