COA-llaboration: Where Different LLM Techniques Will Fit Into Military Planning
LLMs are refined and extended through a variety of techniques, and those choices will shape how they are used in military planning.
In my first article, I explored the potential of AI-powered Course of Action (COA) analysis in military planning, examining foundational aspects of integrating artificial intelligence into mission-critical frameworks. My second article delved into the role of Large Language Models (LLMs) alongside established tools like tabletop exercises, optimization, and simulation, highlighting how LLMs can complement, and in some cases surpass, traditional methods.
This final piece focuses on four techniques that tune or extend the power of LLMs. Each approach brings specific strengths and shortcomings depending on the scenario, and understanding both is essential for maximizing the utility of LLMs within military decision-making processes.
Introduction
Historically, military organizations have relied on tools like tabletop exercises, optimization models, and simulations to develop and refine strategies. While these tools remain invaluable, they face significant challenges in adapting to the speed and complexity of modern Multi-Domain Operations (MDO), which require coordinated action across land, air, maritime, space, and cyber domains. The static nature of traditional tools limits their utility in real-time, dynamic environments. To bridge this gap, new tools are needed—ones that are dynamic, adaptable, and intelligent enough to synthesize vast data streams and deliver nuanced, real-time recommendations.
The integration of LLMs into COA frameworks brings both immense promise and complex challenges. Deploying LLMs and machine learning models in military contexts requires grappling with issues of data reliability, operational transparency, and adaptability. Past research, such as Black and Darken’s (2023) work on AI scaling in digital wargaming, underscores the rigor and precision needed for AI systems to meet military standards.
LLMs show promise in the field of wargaming, but how they will be used and augmented is an outstanding question. Since LLMs have exploded into the mainstream, they are often discussed and treated as a singular tool, one that can universally handle a range of tasks. However, the way an LLM is trained, tuned, and partnered with other technologies makes a profound difference in its effectiveness, especially in high-stakes, nuanced fields like military planning. Each technique for adapting or augmenting LLMs—whether through data retrieval, fine-tuning, custom prompts, or structured reasoning—brings unique strengths and trade-offs. By exploring these distinct approaches, we can better understand how to maximize the utility of LLMs in COA analysis and align AI-driven recommendations with mission-critical goals.
Approaches to Enhancing COA Analysis with AI
Below are four LLM approaches that offer unique strengths for COA analysis, along with a summary of their benefits, challenges, and ideal use cases.
RAG (Retrieval-Augmented Generation): Real-Time Knowledge for COA
RAG integrates external data retrieval with a generative language model to enhance response accuracy and relevance. This approach enables COA analysis to benefit from real-time, contextually relevant data, grounding AI-driven insights in the latest information. By drawing from established sources like tactics, techniques, and procedures (TTPs) or doctrinal guidelines, RAG helps LLMs avoid hallucinations and aligns better with commanders’ intent.
Benefits: RAG’s ability to ground information in credible documentation is particularly advantageous in COA contexts. By pulling in real-time data streams, it helps create recommendations that are not only accurate but timely.
Challenges: Implementing RAG effectively requires a robust data infrastructure and rigorous data curation, as the quality of recommendations depends on reliable data sources.
Ideal Use Case: RAG is optimal for high-level coordination efforts, especially when organizing and synthesizing information across various organizations and structures where current, accurate intelligence is essential. For example, if the United States Indo-Pacific Command (INDOPACOM) wants to organize information across command structures, an enterprise-grade search-and-chat solution built on RAG will carry it a very long way. A minimal sketch of the pattern follows.
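To make the retrieve-then-generate control flow concrete, here is a minimal Python sketch. The toy corpus, the bag-of-words scoring, and the `call_llm` stub are illustrative assumptions; a fielded system would use a real vector store, embedding model, and model endpoint.

```python
# Minimal retrieve-then-generate loop. The toy corpus, bag-of-words scoring,
# and the `call_llm` stub stand in for a real vector store, embedding model,
# and model endpoint.
from collections import Counter
import math

doctrine_corpus = [
    "TTP 3-1: river crossings require engineer support and smoke cover.",
    "Doctrine 2-4: maintain mutual support between maneuver elements.",
    "Intel brief 0600: adversary armor massing near the eastern bridgehead.",
]

def bag_of_words(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    # Rank the corpus by similarity to the query and keep the top k sources.
    q = bag_of_words(query)
    return sorted(doctrine_corpus, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)[:k]

def call_llm(prompt: str) -> str:
    raise NotImplementedError("swap in your model endpoint here")  # hypothetical stub

def answer(question: str) -> str:
    # Ground the generation step in the retrieved sources.
    context = "\n".join(retrieve(question))
    prompt = (
        "Using ONLY the sources below, draft a COA recommendation.\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)

print(retrieve("armor massing near the eastern bridgehead"))
```

The key design point is that the model never answers from memory alone: every response is conditioned on sources that can be cited back to the retrieval step.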
Fine-Tuning Models: Customization for Strategic Precision
Fine-tuning adapts models to specific domains by training them on relevant data, enabling them to generate insights tailored to unique mission parameters. Techniques like Low-Rank Adaptation (LoRA) and Functionally Invariant Paths (FIP) can make fine-tuning more efficient by balancing resource demands with performance needs. FIP, which I developed with my co-founders at Yurts, leverages differential geometry to maintain prior knowledge while adapting to new tasks (Raghavan et al., 2024).
Benefits: Fine-tuning allows models to internalize and reflect nuanced, domain-specific language and strategic priorities, producing highly tailored outputs that align with established frameworks.
Challenges: Fine-tuning requires considerable resources and may risk overfitting if not carefully managed, potentially reducing flexibility in unfamiliar scenarios.
Ideal Use Case: Fine-tuning is not a panacea for LLMs in COA analysis. A fine-tuned model may encode strategy, doctrine, and scenario-specific knowledge more faithfully and produce more on-point recommendations, but it is incomplete by itself. It performs best when used alongside RAG, which grounds its outputs in real-time data; a sketch of the LoRA idea follows.
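As a rough illustration of why LoRA keeps fine-tuning affordable, the sketch below wraps a frozen linear layer with a trainable low-rank update. This is a from-scratch toy, not a production recipe; real deployments would typically apply adapters across a model's attention projections via a library such as Hugging Face PEFT.

```python
# A from-scratch LoRA adapter: the frozen base weight is augmented with a
# trainable low-rank update B @ A, so fine-tuning touches only
# rank * (d_in + d_out) parameters instead of d_in * d_out.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base projection plus the scaled low-rank correction.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Wrap one projection, then train only the adapter parameters:
layer = LoRALinear(nn.Linear(4096, 4096))
trainable = [p for p in layer.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)
```

Because only A and B receive gradients, adapters stay small enough to version per mission set while the frozen base model is shared across them.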
Prompt Baking: Rapid, Cost-Effective Customization
Prompt baking embeds specific COA rules and values directly into prompts, enabling rapid response customization without extensive retraining. By carefully crafting prompts that reflect COA frameworks, this method offers a streamlined solution for efficiently guiding model outputs.
Benefits: This approach enables quick and inexpensive adjustments to model outputs, allowing COA analysis to adapt as requirements evolve.
Challenges: Prompt baking has limited adaptability in highly dynamic scenarios and requires expert input to avoid cognitive bias.
Ideal Use Case: Useful for rapidly evolving COA situations where guidelines may need to be reconfigured based on new intelligence, but where retraining or fine-tuning isn’t feasible.
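Here is a minimal sketch of the prompt-side pattern described above: COA rules are embedded as a fixed preamble so every query is answered under the same constraints. The rule text is an invented placeholder, and note that Bhargava et al. (2024) go further by distilling such prompts into weight updates.

```python
# A "baked" COA prompt: mission rules are embedded as a fixed preamble so
# every query is answered under the same constraints. The specific rules
# below are illustrative placeholders, not real doctrine.

COA_RULES = """You are a COA analysis assistant. Always:
1. Cite the doctrinal source for each recommendation.
2. Flag any recommendation that exceeds delegated authority.
3. Present at least two alternative courses of action."""

def baked_prompt(update: str, task: str) -> str:
    # Combine the standing rules, the latest intelligence, and the task.
    return f"{COA_RULES}\n\nLatest intelligence update:\n{update}\n\nTask: {task}"

print(baked_prompt(
    update="Adversary air defenses repositioned south of the river.",
    task="Recommend adjustments to the air corridor plan.",
))
```

Because the rules live in text rather than in weights, an analyst can revise them in minutes when guidance changes, at the cost of consuming context length on every call.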
Reasoning: Enhanced Depth and Orchestration
Reasoning models are essential for creating robust, logically sound COAs, especially in high-stakes situations. Open-source frameworks like OpenR integrate process reward models (PRMs) and guided search to encourage a more coherent, rigorous reasoning process. This approach combines reinforcement learning with structured, step-by-step inference processes, which improves both the accuracy and defensibility of recommendations.
Benefits: Emphasizing structured reasoning enables models to produce transparent, logically sound outputs that are easier to scrutinize and defend. This is particularly useful for scenarios where decisions need to align strictly with strategic guidelines.
Challenges: Reasoning models can be computationally demanding and may slow down response times, which can be a limitation in time-sensitive scenarios.
Ideal Use Case: This approach excels in high-stakes situations where COA insights require thorough scrutiny and justification for presentation to senior leaders or external stakeholders. Step-by-step reasoning ensures that each recommendation aligns with COA frameworks and provides a defensible rationale for each suggestion.
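The sketch below shows one simple form of process-reward-guided reasoning: sample several candidate next steps, score each with a PRM, and keep the best. `propose_steps` and `prm_score` are hypothetical stubs standing in for a generator model and a trained reward model; frameworks like OpenR implement far richer search strategies.

```python
# Greedy process-reward-guided reasoning: at each step, generate candidate
# continuations, score them with a PRM, and keep the highest-scoring one.
# Both stubs below are placeholders for real model calls.
import random

def propose_steps(partial_chain: list[str], n: int = 4) -> list[str]:
    # Placeholder: a real system samples n continuations from an LLM.
    return [f"candidate step {i} after {len(partial_chain)} steps" for i in range(n)]

def prm_score(partial_chain: list[str], step: str) -> float:
    # Placeholder: a real PRM scores whether this step is logically sound.
    return random.random()

def greedy_prm_search(goal: str, max_steps: int = 5) -> list[str]:
    chain = [f"Goal: {goal}"]
    for _ in range(max_steps):
        candidates = propose_steps(chain)
        best = max(candidates, key=lambda s: prm_score(chain, s))
        chain.append(best)  # keep the step the PRM judges most defensible
    return chain

for step in greedy_prm_search("Secure the eastern bridgehead"):
    print(step)
```

The chain itself is the audit artifact: each retained step carries the score that justified it, which is what makes the final recommendation defensible to reviewers.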
A Path Forward: A Unified Approach
The optimal solution for advanced COA analysis is likely a hybrid strategy that leverages the strengths of each technique. Each approach offers benefits on its own, but combined they create an adaptable and holistic system for tackling the complexities of multi-domain military operations. Here’s how these techniques can work together:
RAG for Information Retrieval and Knowledge Coordination
RAG serves as the backbone for broad knowledge sharing and natural language-based interactions. It enables the system to ground COA recommendations in factual, relevant information, reducing the risk of hallucinations and improving decision reliability. RAG is particularly crucial in scenarios like high-level operational coordination (e.g., cross-command collaborations), where organizing vast information streams across agencies is critical.
Fine-Tuning Models for Customization and Strategic Alignment
Fine-tuning aligns the LLM with the nuanced language, doctrine, and strategic priorities specific to a mission or organization. While not a standalone solution, fine-tuning enhances RAG by encoding the organization’s distinct operational nuances into the model, making recommendations that are not only accurate but contextually tailored. Fine-tuning is instrumental in embedding enduring principles and values, enabling the model to maintain strategic coherence over longer-term analyses and aligning outputs with specific doctrinal objectives. It’s particularly useful for encoding high-level strategies that may remain consistent even as data inputs fluctuate.
Prompt Baking for Agile Adjustments
Prompt baking facilitates the fast and cost-effective customization of responses, allowing COA analysis to adapt dynamically as operational requirements shift. This approach complements fine-tuning by embedding specific, mission-critical values directly within the prompts, making it ideal for rapidly adjusting to tactical changes without a full model retrain. For example, in quickly evolving scenarios, such as during live operations or intelligence updates, prompt baking allows for swift recalibrations to align the model’s behavior with immediate needs, keeping COA outputs relevant.
Reasoning Models for Structured, Justifiable Recommendations
Reasoning provides the necessary logical structure for high-stakes, scrutinized decisions, bringing defensibility and coherence to COA recommendations. In complex, multi-domain operations, reasoning acts as an orchestrator, creating control loops that ensure the alignment and consistency of COA outputs with overarching mission goals. Reasoning models are vital for justifying COAs to high-level stakeholders, especially in contexts that demand strong accountability. These models offer transparent, step-by-step reasoning that ensures each decision is traceable, making them ideal for environments where scrutiny and defensibility are non-negotiable.
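To show how the four techniques might compose, here is a deliberately simplified orchestration sketch. Every function and name is a stub invented for illustration; none of this represents a real API or a specific fielded system.

```python
# Orchestration sketch: each stage is a stub standing in for the components
# described above (retriever, baked rules, fine-tuned model, PRM reasoning).
from dataclasses import dataclass, field

@dataclass
class COARecommendation:
    answer: str
    sources: list = field(default_factory=list)
    reasoning_trace: list = field(default_factory=list)

def retrieve(q: str) -> list[str]:             # RAG stub
    return ["TTP 3-1 ...", "Intel brief 0600 ..."]

def baked_rules() -> str:                      # prompt-baking stub
    return "Cite sources; flag authority limits; give two alternatives."

def finetuned_generate(prompt: str) -> str:    # fine-tuned model stub
    return "Recommend COA Bravo with engineer support at the crossing."

def prm_reason(q: str) -> list[str]:           # PRM-guided reasoning stub
    return ["step 1: assess crossing sites", "step 2: weigh air defense risk"]

def unified_coa(question: str) -> COARecommendation:
    sources = retrieve(question)                              # ground in current data
    prompt = (f"{baked_rules()}\n\nSources:\n" + "\n".join(sources)
              + f"\n\nTask: {question}")                      # baked constraints
    return COARecommendation(
        answer=finetuned_generate(prompt),                    # domain-tuned draft
        sources=sources,
        reasoning_trace=prm_reason(question),                 # defensible rationale
    )

print(unified_coa("Adjust the air corridor plan."))
```

The division of labor mirrors the discussion above: RAG supplies currency, baked prompts supply standing constraints, fine-tuning supplies domain fluency, and the reasoning trace supplies accountability.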
By integrating these techniques into a cohesive system, the COA analysis framework will go beyond conventional military planning methods, better supporting the demands of modern, multi-domain operations and providing military decision-makers with precise, adaptable insights that are strategically aligned with mission-critical objectives.
This is the future of COA and LLMs: a robust, multi-faceted AI framework capable of supporting the intricacies of modern warfare.
References
Bhargava, A., Witkowski, C., Detkov, A., & Thomson, M. (2024). Prompt Baking. arXiv preprint. https://arxiv.org/pdf/2409.13697
Black, S., & Darken, C. (2023). Scaling Artificial Intelligence for Digital Wargaming in Support of Decision-Making. Naval Postgraduate School, Monterey, CA, USA.
Raghavan, G., Tharwat, B., Hari, S. N., Satani, D., Liu, R., & Thomson, M. (2024). Engineering flexible machine learning systems by traversing functionally invariant paths. Nature Machine Intelligence, 6(10), 1179–1196. https://doi.org/10.1038/s42256-024-00902-x
Yu, S., Zhu, W., & Wang, Y. (2023). Research on Wargame Decision-Making Method Based on Multi-Agent Deep Deterministic Policy Gradient. Applied Sciences, 13(7), 4569. https://doi.org/10.3390/app13074569