AI: Striving to Become a Trusted 'Future Advisor'

Explore how AI is evolving into a reliable future advisor through advanced prediction technologies across various sectors.

What will mature “predictive technology” look like? When the foundational capabilities of general large models, the precision of specialized predictive models, the practical value of external tools, and the assurance of trustworthy mechanisms are organically integrated, AI will gain genuinely new insight into the future. This will position it as a trusted “future advisor” in critical areas such as financial risk control, weather forecasting, public governance, and industrial production, providing intelligent support for humanity in grasping future trends and becoming an important force in empowering social development and serving the modernization of national governance.

Four Technical Paths for ‘Predicting the Future’

Faced with increasingly complex predictive demands in the real world, researchers have developed two core lines and four specific technical paths around large model predictive technology. These paths are not competing alternatives but complement each other in different scenarios, collectively constructing a complete research framework for large model predictions.

The essential difference between the two core lines lies in whether a dedicated model is tailored for the prediction task: one line is to “borrow a boat to go to sea,” skillfully leveraging existing, mature large language models for prediction; the other is to “build a ship for long voyages,” constructing dedicated foundational models for prediction from the ground up. The two lines advance in parallel, adapting to diverse task requirements.

Directly invoking large language models is the easiest entry point for large model prediction. Researchers convert various prediction tasks into ordinary natural-language questions, supplying historical information, event background, and constraints so the model can directly assess future trends and output a prediction. This approach has a low barrier to entry and requires no significant modification of the model, merely a new way of applying existing tools, and it performs impressively in news event analysis and business trend assessment. However, it falls short in the high-precision numerical predictions required in fields such as meteorology and finance, owing to the inherent limitations of large language models in numerical computation and factual accuracy.
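
To make the idea concrete, the sketch below shows how a forecasting question might be recast as a plain-language prompt. The function call_llm is a hypothetical stand-in for whichever chat-completion client is available, and the sales figures and wording are purely illustrative.

```python
# Hypothetical stand-in for any chat-completion client.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your preferred LLM client here")

def build_forecast_prompt(series, background, horizon):
    """Turn history, context, and constraints into a plain-language question."""
    history = ", ".join(f"{v:.1f}" for v in series)
    return (
        f"Background: {background}\n"
        f"Monthly sales for the past {len(series)} months: {history}\n"
        f"Question: estimate sales for the next {horizon} months, "
        "briefly explain the main drivers, and state your uncertainty."
    )

prompt = build_forecast_prompt(
    series=[102.0, 108.5, 111.2, 97.8, 120.4, 125.0],
    background="A retail chain entering its seasonal promotion period.",
    horizon=3,
)
# answer = call_llm(prompt)  # free-text trend judgment, not a guaranteed number
```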

Time series tokenization modeling represents a cross-domain “intelligent borrowing.” It introduces classic natural language processing ideas into time series analysis: through scaling, discretization, and quantization, continuous time series are transformed into token representations analogous to words in a language, and the result is trained with a language-model-like architecture. The representative model, Chronos, achieves probabilistic prediction and cross-dataset generalization by mapping time series onto a fixed vocabulary, significantly reducing development costs. This convenience comes at a cost, however: the transformation inevitably loses numerical detail and introduces quantization error, much as coarse machining wears away the detail of a precision part, which affects prediction accuracy.
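
A minimal sketch of the tokenization idea, in the spirit of Chronos but not its official implementation: the series is mean-scaled, clipped, and binned into a fixed vocabulary of integer tokens, and the lossy round trip makes the quantization error visible. The bin count and clipping range here are illustrative assumptions.

```python
import numpy as np

def tokenize_series(series, n_bins=4096, clip=15.0):
    """Map a real-valued series to integer token IDs via mean scaling + uniform binning."""
    series = np.asarray(series, dtype=float)
    scale = np.mean(np.abs(series)) + 1e-8          # mean absolute scaling
    scaled = np.clip(series / scale, -clip, clip)   # bound the range before binning
    edges = np.linspace(-clip, clip, n_bins - 1)    # uniform bin edges -> fixed vocabulary
    return np.digitize(scaled, edges), scale        # one token ID per time step

def detokenize(tokens, scale, n_bins=4096, clip=15.0):
    """Approximate inverse: map token IDs back to bin centers (lossy by construction)."""
    centers = np.linspace(-clip, clip, n_bins)
    return centers[tokens] * scale

tokens, scale = tokenize_series([3.2, 3.5, 4.1, 3.9, 5.0])
approx = detokenize(tokens, scale)   # the gap to the original values is the quantization error
```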

Building dedicated time series foundational models marks a shift in large model predictive research from “borrowing strength” to independent innovation. Researchers no longer treat time series as pseudo-text but design pre-training schemes and model architectures around the essential laws of time series data and the core needs of prediction tasks. Google’s TimesFM employs a decoder-only architecture and shows strong zero-shot prediction capability; Lag-Llama, developed by a consortium of universities and research institutions, focuses on probabilistic prediction and cross-domain generalization; and Moirai, from an American AI company, boldly attempts a unified training approach to cover more scenarios. These models are like custom-forged “armor” for prediction: more closely aligned with the characteristics of the task, they achieve higher-precision numerical predictions and have become the preferred choice for high-precision prediction scenarios.
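
The sketch below illustrates the patching idea common to such dedicated models: the series is cut into fixed-length patches that a decoder-only transformer would consume as tokens. The shapes and patch length are hypothetical, and real architectures such as TimesFM differ in detail.

```python
import numpy as np

def make_patches(series, patch_len=32):
    """Split a series into fixed-length patches; each patch is later embedded as one 'token'."""
    series = np.asarray(series, dtype=float)
    usable = (len(series) // patch_len) * patch_len
    return series[:usable].reshape(-1, patch_len)    # shape: (num_patches, patch_len)

patches = make_patches(np.sin(np.linspace(0.0, 20.0, 256)), patch_len=32)
# Each patch row would be linearly projected into the model dimension and fed to a
# decoder-only transformer that autoregressively emits future patches as the forecast.
```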

Reprogramming large language models and multimodal integration offer a low-cost line of thinking for large model prediction. Research around Time-LLM shows that retraining a massive dedicated time series model is unnecessary: by reprogramming inputs to align time series with text prototypes, a “frozen” large language model can take part in prediction tasks. This opens a feasible channel for the “general large model + specialized adaptation” route and further promotes the deep integration of textual, numerical, and contextual knowledge, making predictions better aligned with the complex and variable demands of real-world scenarios.
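
A rough numerical sketch of the reprogramming idea, with made-up dimensions: time series patch embeddings attend over a small set of text prototypes, so what reaches the frozen language model already lies in its own embedding space. This illustrates the mechanism only; it is not Time-LLM’s actual code.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 768                                  # frozen LLM embedding width (illustrative)
patch_emb = rng.normal(size=(8, d_model))      # 8 time-series patches, already embedded
prototypes = rng.normal(size=(100, d_model))   # text prototypes distilled from the LLM vocabulary

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Cross-attention-style alignment: each patch becomes a weighted mix of text prototypes,
# so the input handed to the frozen language model lies in its native embedding space.
scores = patch_emb @ prototypes.T / np.sqrt(d_model)   # shape: (8, 100)
reprogrammed = softmax(scores) @ prototypes             # shape: (8, d_model)
```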

These four technical paths have no absolute superiority over one another; they are like different keys fitting different locks. When a prediction task requires combining general knowledge and textual context to make open-ended trend judgments, the large language model routes act as master keys and hold the advantage; when a task pursues high-precision numerical output and stable cross-domain generalization, dedicated time series foundational models become the precisely matched custom keys. Under different research-resource conditions and task requirements, the paths support and reinforce one another, jointly advancing large model predictive technology.

Moving Towards Real Application Scenarios

In the research arena of large model predictive technology, international studies started earlier, have developed a more systematic technical framework, and have gone deeper in foundational research and frontier exploration. Although domestic research started later, it has caught up rapidly, forming distinctive advantages in scenario adaptation, open-source ecosystems, and application deployment.

International academic research on large model predictions has evolved from text reasoning to diverse predictions. Early studies mainly focused on using large language models for text reasoning and event development judgments, akin to meticulous cultivation in a small domain. In recent years, it has gradually broken boundaries, expanding into time series, spatiotemporal data, and even scientific predictions, initiating a new phase of “expanding territory.” In the more complex field of scientific predictions, Microsoft’s ClimaX has pioneered the establishment of a foundational model framework for weather and climate tasks, while Aurora, also developed by Microsoft, extends the foundational model concept to the Earth system, capable of simultaneously handling various prediction tasks such as weather, air quality, and wave forecasts, akin to equipping the Earth with an intelligent early warning system, showcasing the immense potential of scientific foundational models in complex system predictions.

Notably, the international academic community maintains a rational and cautious attitude toward the predictive capabilities of large models. Studies have found that excellent performance on standardized tests does not equate to reliability in predicting real-world future events: GPT-4, for instance, performed worse than the median of human forecasters in open-world prediction competitions. To address this core issue, international researchers have pursued forecasting-competition studies, retrieval-augmentation studies, and uncertainty-detection studies, giving international research a distinctive profile of “model capability enhancement + prediction result validation + trustworthy mechanism construction” and laying a solid foundation for the practical application of the technology.

Domestic research, riding the rapid development of general large models, has achieved an impressive latecomer advantage, gradually forming a virtuous pattern of rapid iteration of general large models, systematic survey research, and steady progress in application deployment. In building the general-model ecosystem, different participants play to their strengths: Qianwen 3 (Qwen3) has established a complete system for multilingual support and reasoning-efficiency optimization, like building a multilingual intelligent bridge; DeepSeek-V3 has achieved breakthroughs in high-performance open-source models, making core technology more accessible; Wenxin 4.5 (ERNIE 4.5) continues to refine multimodal integration and engineering deployment, staying close to practical application needs. Although these general large models are not focused solely on prediction, they provide a solid capability foundation for domestic large model predictive research, enabling researchers to conduct more targeted studies while standing on the shoulders of “giants.”

In terms of application deployment, domestic efforts are actively exploring how to bring large model predictive technology out of the “ivory tower” and into real industries. Some studies deeply integrate expert knowledge with large language models for strategic early warning, achieving accurate trend judgment and risk identification in complex situations; others closely combine large models with meteorological monitoring data to improve the accuracy and timeliness of short-term precipitation forecasts. Although this work is not pure numerical time series forecasting, it signals that domestic large model predictive technology is moving from theoretical discussion to practical application and beginning to explore technical paths that meet local needs and fit industry realities.

Overall, international research has gone deeper into dedicated predictive foundational models and scientific prediction, forming a relatively complete technical system, like excavating extensive tunnels underground; domestic research has shown distinctive strengths in adapting to Chinese scenarios, building low-cost open-source ecosystems, and deploying industry applications, like constructing high-rise buildings suited to local conditions above ground. As high-quality time series data and industry-specific data continue to accumulate in China, and as specialized evaluation systems gradually improve, domestic foundational models for prediction tasks still have immense room for improvement and will contribute unique and valuable Chinese wisdom to the development of global large model predictive technology.

Bridging the Gap from ‘Powerful’ to ‘Trustworthy’

Compared with traditional predictive methods, large model predictive technology has undergone a profound transformation from “point calculations” to “comprehensive judgments,” evolving from a cold mechanical calculation tool into an intelligent agent that can understand context, weigh factors, and offer reasoned judgments. This capability stems from its inherent core advantages, yet the technology is still like a rising star that has not finished growing: it is steadily evolving toward “trustworthiness,” striving to become a reliable “future advisor” for humanity.

The core advantages of large model predictive technology are built in, and they are particularly prominent in practical applications. First, strong cross-task transfer: a traditional agricultural yield model cannot be applied directly to stock market trend analysis, and switching domains means starting from scratch, whereas large models draw on the general representations learned in large-scale pre-training to adapt quickly to agriculture, finance, industry, and other fields with only a handful of samples. Second, great potential for handling complex dependencies: predicting river water levels in flood season, for example, is influenced by rainfall, upstream flooding, topography, and other interacting factors that traditional models struggle to capture, whereas time series foundational models can learn such patterns across long context windows, as if endowed with the “keen insight” to see the connections behind the data. Third, strength in multi-source information fusion: traditional meteorological forecasting relies solely on numerical monitoring data, while large models can integrate satellite cloud imagery, textual weather reports, geographic information, and other sources, turning prediction from a keyhole view into panoramic observation. Finally, strong predictive explanation and decision support: a model can not only forecast the trend of a particular stock but also explain the factors behind it, such as industry policy and market supply and demand, and even offer risk-control suggestions, serving as a professional intelligent partner for decision-makers.

Despite these significant advantages, large model predictive technology is not without flaws; a gap remains between the laboratory and real-world application that urgently needs to be bridged. First, a model’s generation and reasoning abilities do not equate to genuine predictive ability. Some models excel in simulated meteorological prediction tests yet repeatedly “fail” in real severe convective weather warnings, often because the test answers were embedded in the training data, whereas real prediction requires judging events that have never been observed: easy to theorize about, hard to do in practice. Second, retrieval augmentation treats symptoms rather than the root cause. Pairing a model with information retrieval does improve prediction accuracy, but it also reveals that, on its own, the model relies solely on its training-time memory, like a librarian guarding an aging collection, and struggles to keep pace with real-world change; acquiring the latest knowledge in real time is therefore essential. Furthermore, hallucination and factual instability remain core obstacles, like hidden time bombs. Finally, constraints of cost, data, and evaluation make large-scale application difficult: training a high-precision model demands enormous computational resources and drives development costs up; real-world time series data is fragmented and lacks uniform labeling, so it is hard to produce good results from poor raw material; and existing evaluation systems emphasize numerical error while neglecting factual stability, so many models look excellent on paper yet struggle to deploy in practice.
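
A minimal sketch of the retrieval-augmented pattern described above, assuming hypothetical helpers search_news and call_llm: fresh evidence is fetched at query time and prepended to the prompt, so the model is not limited to its training-time memory.

```python
# Hypothetical helpers: replace with a real search API and LLM client.
def search_news(query: str, top_k: int = 5) -> list[str]:
    raise NotImplementedError

def call_llm(prompt: str) -> str:
    raise NotImplementedError

def forecast_with_retrieval(question: str, k: int = 5) -> str:
    """Fetch fresh evidence at query time and fold it into the prediction prompt."""
    snippets = search_news(question, top_k=k)             # latest articles, reports, data
    evidence = "\n".join(f"- {s}" for s in snippets)
    prompt = (
        f"Recent evidence:\n{evidence}\n\n"
        f"Question: {question}\n"
        "Give a probability estimate and state which evidence items you relied on."
    )
    return call_llm(prompt)
```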

Looking ahead, the development direction of large model predictive technology is clear and focused, centering on “from powerful to trustworthy,” aiming to create a mature technical system that can stably serve real-world decision-making. Firstly, general large models will evolve into specialized foundational models for predictions, demonstrating stronger competitiveness in high-precision demand scenarios such as meteorology and finance. Secondly, tool enhancement will become an important direction, enabling models to autonomously invoke external tools like search and simulation, akin to equipping intelligent agents with treasure chests to better tackle complex scenarios. Thirdly, trustworthiness, controllability, and explainability will become research priorities; future prediction systems must not only achieve numerical precision but also quantify risks and trace judgment bases, which is key for high-risk scenario implementations. Fourthly, low-cost deployment and industrialization will accelerate; as inference costs decrease and open-source ecosystems improve, technology will transition from being exclusive assets of a few institutions to common tools across various industries. Lastly, domestic research will focus on localized adaptations, creating specialized models that align with the Chinese context and local data, ensuring that large models are more accurate, stable, and trustworthy in domestic financial risk control and government early warning scenarios.
