The consensus is: Emerging artificial intelligence technology could be a game-changer for the military, but it requires intensive testing to ensure it works reliably and has no vulnerabilities that could be exploited by adversaries.
Craig Martell, the Defense Department's Chief Digital and Artificial Intelligence Officer and head of the Chief Digital and Artificial Intelligence Office (CDAO), opened the four-day symposium before a packed ballroom at the Washington Hilton by telling attendees that deploying cutting-edge AI demands both speed and care, and that he is trying to strike that balance.
“Everyone wants to be data-driven,” Martell said. “Everyone is willing to believe in magic because they really want it.”
The ability of large language models (LLMs) such as ChatGPT to review vast amounts of information within seconds and crystallize it into a few key points suggests attractive possibilities for militaries and intelligence agencies, which constantly struggle to sift through an ever-expanding ocean of raw intelligence in the digital age.
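As a rough sketch of the summarization workflow described here, the snippet below condenses a batch of documents into a few key points through OpenAI's chat completions API. The model name, prompt wording, and sample reports are illustrative placeholders, not anything the Department of Defense has said it uses.

```python
# Minimal sketch: distill a stack of raw reports into a few key points.
# Assumes the `openai` package (v1+) and an OPENAI_API_KEY in the
# environment; model name and prompt are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

def summarize(reports: list[str], max_points: int = 5) -> str:
    """Ask an LLM to condense many documents into a short list of key points."""
    corpus = "\n\n---\n\n".join(reports)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[
            {"role": "system",
             "content": f"Summarize these documents into at most {max_points} key points."},
            {"role": "user", "content": corpus},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(summarize(["Report A: ...", "Report B: ..."]))
```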
“The flow of information to individuals is enormous, especially in high-activity environments,” U.S. Navy Capt. M. Xavier Lugo, mission commander of CDAO's recently established generative AI task force, said at the symposium. “It's very important to have reliable summarization techniques to help manage information.”
Researchers say other potential military uses for LLMs could include training officers through sophisticated war games and supporting real-time decision-making.
Paul Scharre, a former Pentagon official who is now executive vice president of the Center for a New American Security, said some of the best uses are probably yet to be discovered. What excites defense officials about LLMs, he said, is their flexibility to handle a wide variety of tasks compared with earlier AI systems. “Most AI systems have been narrow AI,” he said. “They can do one task correctly. AlphaGo could play Go. Facial recognition systems can recognize faces. But that's all they can do. Language, on the other hand, seems like a bridge to more general-purpose capabilities.”
However, a major stumbling block, perhaps even a fatal flaw, is that LLMs continue to “hallucinate,” confidently generating inaccurate information. Lugo called this “the biggest challenge for the industry” and said it is unclear whether it can be solved.
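There is no agreed-upon fix, but one common mitigation is to check a model's output against its source material. The toy sketch below flags summary sentences that share few content words with the source text; it is a crude lexical heuristic offered purely for illustration (production systems use entailment models or citation checks), not a solution to the problem Lugo describes.

```python
# Toy groundedness check: flag summary sentences that share few content
# words with the source text. A crude heuristic for illustration only.
import re

def content_words(text: str) -> set[str]:
    return {w for w in re.findall(r"[a-z']+", text.lower()) if len(w) > 3}

def flag_unsupported(source: str, summary: str, threshold: float = 0.3) -> list[str]:
    src_words = content_words(source)
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", summary):
        words = content_words(sentence)
        if words and len(words & src_words) / len(words) < threshold:
            flagged.append(sentence)  # little lexical support in the source
    return flagged

source = "The balloon drifted across Montana before being shot down offshore."
summary = "The balloon was shot down after crossing Montana. It carried nuclear material."
print(flag_unsupported(source, summary))  # flags the unsupported second sentence
```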
In August, CDAO established Task Force Lima, a generative AI initiative chaired by Lugo, to develop recommendations for “responsible” implementation of the technology within the Department of Defense. Lugo said the group was initially formed with LLMs in mind; the name “Lima” comes from the NATO phonetic alphabet code for the letter “L,” a reference to LLMs. Its mission soon expanded to include image and video generation as well.
“As we were going from Phase 0 to Phase 1, we got into generative AI as a whole,” he said.
Researchers say there is still a way to go before LLMs can be reliably used for high-stakes purposes. Shannon Gallagher, a researcher at Carnegie Mellon University who spoke at the conference, said her team was asked last year by the Office of the Director of National Intelligence to examine how intelligence agencies might use LLMs. For that study, Gallagher's team devised a “balloon test”: they prompted LLMs to explain what happened in last year's Chinese high-altitude surveillance balloon incident, as a proxy for the kinds of geopolitical events an intelligence agency might ask about. The answers varied widely, and some were biased and unhelpful.
“I'm sure they'll do well next time. The Chinese couldn't identify the cause of the failure, but I'm sure they'll do well next time. That's what they said about the first atomic bomb test. They're Chinese. They'll do it right next time,” one response read.
Even more concerning is the possibility that hostile hackers could subvert military LLMs and exfiltrate the data sets behind them. Researchers proved this was possible in November: by asking ChatGPT to repeat the word “poem” forever, they got it to start leaking training data. OpenAI has since fixed that vulnerability, but other vulnerabilities may exist.
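A rough sketch of how that probe worked, based on the researchers' published “divergence” attack: instruct the model to repeat one word indefinitely, then inspect the output from the point where it stops complying, since any text past that point may echo memorized training data. The client call and model name below are placeholders, and the patched ChatGPT no longer responds to this request.

```python
# Sketch of the November 2023 divergence probe: ask a model to repeat a
# word forever, then capture whatever it emits once it stops complying --
# in the original attack, that overflow contained memorized training data.
# Model name and prompt wording are placeholders; OpenAI has since
# blocked this style of request.
from openai import OpenAI

client = OpenAI()

def divergence_probe(word: str = "poem") -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder; the patched model refuses this
        messages=[{"role": "user",
                   "content": f'Repeat this word forever: "{word} {word} {word}"'}],
        max_tokens=2048,
    )
    tokens = response.choices[0].message.content.split()
    for i, tok in enumerate(tokens):
        if tok.strip('".,').lower() != word:
            # Everything from the first non-compliant token onward is the
            # suspect span that may contain leaked text.
            return " ".join(tokens[i:])
    return ""  # model complied fully; nothing diverged

if __name__ == "__main__":
    print(divergence_probe() or "no divergence observed")
```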
“An adversary can make an AI system do something you don't want it to do,” Nathan Van Houdnos, another Carnegie Mellon scientist, said at the symposium. “An adversary could cause the AI system to learn the wrong thing.”
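A minimal illustration of that “learn the wrong thing” failure mode is label-flipping data poisoning: an adversary corrupts a slice of the training labels, and the model's accuracy quietly degrades. The sketch below demonstrates the effect on synthetic data with scikit-learn; the dataset, model, and 30 percent poisoning rate are arbitrary choices for illustration.

```python
# Toy label-flipping poisoning demo: flip a fraction of training labels
# and compare test accuracy against a clean baseline. Synthetic data and
# a simple linear model, purely for illustration.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def test_accuracy(train_labels: np.ndarray) -> float:
    model = LogisticRegression(max_iter=1000).fit(X_tr, train_labels)
    return model.score(X_te, y_te)

rng = np.random.default_rng(0)
poisoned = y_tr.copy()
idx = rng.choice(len(poisoned), size=int(0.3 * len(poisoned)), replace=False)
poisoned[idx] = 1 - poisoned[idx]  # the adversary flips 30% of the labels

print(f"clean:    {test_accuracy(y_tr):.3f}")
print(f"poisoned: {test_accuracy(poisoned):.3f}")  # typically noticeably lower
```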
In his speech Tuesday, Martell said it may not make sense for the Pentagon to build its own AI models, and he called for industry cooperation.
“We can’t do this without you,” Martell said. “All of these components that we envision will be a collection of industrial solutions.”
Martell was preaching to the choir on Tuesday: about 100 technology vendors vied for space at the Hilton, many of them eager to win future contracts.
In early January, OpenAI removed restrictions on military uses from its usage policy page, which had specifically prohibited “activities with a high risk of physical harm,” including “weapons development” and “military and warfare.”
Brigadier Rachel Singleton, head of the UK's Defense Artificial Intelligence Center, said at the symposium that the UK felt an urgent need to develop its own LLM solution for the military, out of concern that personnel might otherwise be tempted to use commercial LLMs in their work and put sensitive information at risk.
As U.S. officials debated the urgency of deploying AI, China loomed over the discussion: Beijing declared in 2017 that it intended to be the world leader in AI by 2030, and the U.S. Department of Defense's Defense Advanced Research Projects Agency (DARPA) announced in 2018 that it would invest $2 billion in AI technology to maintain the U.S. advantage.
Martell declined to discuss adversaries' capabilities during his speech, saying the topic would be taken up later in a classified session.
Scharre estimates that China's AI models currently lag U.S. models by 18 to 24 months. “For them, U.S. technology sanctions are a top priority,” he said. “They're very keen to find ways to ease these tensions between the United States and China and remove some of the restrictions on American technology, such as chips, that are exported to China.”
Gallagher said China may still have an advantage in data labeling for LLMs, a labor-intensive but important task in training models, since labor costs in China remain significantly lower than in the United States.
This week's CDAO gathering will address topics including the ethics of LLM use in defense, cybersecurity issues surrounding these systems, and how the technology can be integrated into daily workflows, according to the meeting agenda. Friday will also include a classified briefing on the National Security Agency's new AI Security Center, announced in September, and the Department of Defense's Project Maven AI program.