Wednesday, June 12, 2024

SD Times Open-Source Project of the Week: STORM

STORM is an open-source project that researches a topic and generates a full report on it, complete with sources. It originated from a paper out of Stanford.

According to the Stanford researchers, there is a disconnect online between “the vast amounts of accessible information and what an individual can realistically assimilate.” They explained that while generative AI models can generate information, they actually add to the problem by creating even more information to sift through.

“This research project explores building a knowledge curation agent that can proactively research a topic, organize the information, and present the most pertinent insights in a reader-friendly way. Hopefully, STORM can provide a good starting point for knowledge discovery and liberate in-depth learning from the laborious search process,” the Stanford website states.

STORM breaks down the report generation process into two stages: pre-writing and writing. In the pre-writing stage, STORM gathers information from the Internet, collects citations, and creates an outline. In the writing stage, STORM uses that outline and those citations to create a Wikipedia-like article.
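
To make the two-stage split concrete, here is a minimal Python sketch. It is not STORM's actual code or API: the names `PreWritingResult`, `pre_write`, `write_article`, and `fake_search` are hypothetical, the search step is a stub that returns made-up example.org links, and the "writing" step only formats headings with their sources, where the real system would call a language model.

```python
from dataclasses import dataclass, field


@dataclass
class PreWritingResult:
    """Output of the pre-writing stage: an outline plus collected sources."""
    outline: list[str]
    citations: dict[str, list[str]] = field(default_factory=dict)


def pre_write(topic: str, search) -> PreWritingResult:
    """Stage 1: gather information, collect citations, build an outline.

    `search` is a hypothetical callable: query -> list of (heading, url) pairs.
    """
    citations: dict[str, list[str]] = {}
    for heading, url in search(topic):
        citations.setdefault(heading, []).append(url)
    # The outline here is simply the headings in first-seen order.
    return PreWritingResult(outline=list(citations), citations=citations)


def write_article(topic: str, prep: PreWritingResult) -> str:
    """Stage 2: turn the outline into sections that cite the collected sources."""
    lines = [topic]
    for heading in prep.outline:
        refs = ", ".join(prep.citations.get(heading, []))
        lines.append(f"{heading} [sources: {refs}]")
    return "\n".join(lines)


def fake_search(query: str):
    # Stand-in for a real web search; returns fixed, made-up results.
    return [
        ("Background", "https://example.org/a"),
        ("Applications", "https://example.org/b"),
        ("Background", "https://example.org/c"),
    ]


prep = pre_write("Example topic", fake_search)
article = write_article("Example topic", prep)
print(article)
```

The point of the split is that the second stage never searches on its own: everything it writes is tied to an outline heading and the citations gathered in the first stage.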

STORM conducts research using two strategies. One is to look at articles on similar topics and then use that information to fine-tune the question-asking process. The other is to simulate a conversation between a Wikipedia writer and a topic expert, which allows the model to continuously update its understanding of the topic and ask pertinent follow-up questions.
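
The simulated conversation can be pictured as a simple loop in which each new question depends on the answers collected so far. The Python sketch below is an illustration under stated assumptions, not STORM's implementation: `simulate_conversation`, `toy_writer`, and `toy_expert` are hypothetical names, and the two toy functions stand in for language-model calls (the expert would normally ground its answers in retrieved sources).

```python
def simulate_conversation(topic, ask, answer, max_turns=3):
    """Alternate writer questions and expert answers, keeping a transcript.

    `ask(topic, history)` returns the next question, or None to stop.
    `answer(question)` returns the expert's reply.
    """
    history = []
    for _ in range(max_turns):
        question = ask(topic, history)
        if question is None:
            break  # the writer has nothing further to ask
        history.append((question, answer(question)))
    return history


def toy_writer(topic, history):
    # Hypothetical stand-in for a model playing the Wikipedia writer:
    # it picks the first planned question that has not been asked yet.
    planned = [f"What is {topic}?", f"Who uses {topic}?"]
    asked = {question for question, _ in history}
    for question in planned:
        if question not in asked:
            return question
    return None


def toy_expert(question):
    # Hypothetical stand-in for a model playing the topic expert.
    return f"(answer to: {question})"


transcript = simulate_conversation("STORM", toy_writer, toy_expert)
for question, reply in transcript:
    print(question, "->", reply)
```

Passing the running `history` back to the writer is what lets follow-up questions build on earlier answers; a real writer model would read those answers rather than work from a fixed list.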

STORM was introduced in the paper “Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models” by Yijia Shao, Yucheng Jiang, Theodore A. Kanell, Peter Xu, Omar Khattab, and Monica S. Lam.

As of this writing, the project's GitHub repository has 3.3k stars and has been forked 323 times.
