Authors: Xiangyu Zhao, Chengqian Ma
Category: Computation and Language, Artificial Intelligence
ArXiv: http://arxiv.org/abs/2408.01423v1
Abstract: Large Language Models (LLMs) exhibit remarkable proficiency in addressing a diverse array of tasks within the Natural Language Processing (NLP) domain, with various prompt design strategies significantly augmenting their capabilities. However, these prompts, while beneficial, each possess inherent limitations. The primary prompt design methodologies are twofold: The first, exemplified by the Chain of Thought (CoT), involves manually crafting prompts specific to individual datasets, hence termed Expert-Designed Prompts (EDPs). Once these prompts are established, they are unalterable, and their effectiveness is capped by the expertise of the human designers. When applied to LLMs, the static nature of EDPs results in a uniform approach to both simple and complex problems within the same dataset, leading to the inefficient use of tokens on straightforward problems. The second method involves prompts autonomously generated by the LLM, known as LLM-Derived Prompts (LDPs), which provide tailored solutions to specific problems, mitigating the limitations of EDPs. However, LDPs may suffer a decline in performance when tackling complex problems due to the potential for error accumulation during the solution planning process. To address these challenges, we have conceived a novel Prompt Recursive Search (PRS) framework that leverages the LLM to generate solutions specific to the problem, thereby conserving tokens. The framework incorporates an assessment of problem complexity and an adjustable structure, ensuring a reduction in the likelihood of errors. We have substantiated the efficacy of the PRS framework through extensive experiments using LLMs with different numbers of parameters across a spectrum of datasets in various domains. Compared to the CoT method, the PRS method increased accuracy on the BBH dataset by 8% using the Llama3-7B model, achieving a 22% improvement.
Summary: The author identifies issues with traditional Large Language Model (LLM) prompt design methods, which fall into two categories: Expert-Designed Prompts (EDPs) and LLM-Derived Prompts (LDPs). EDPs rely heavily on human expertise, which is problematic because of their immutable nature and the redundant computational resources they expend on simpler problems. Conversely, LDPs, which are generated and optimized entirely by LLMs, are susceptible to error accumulation during the prompt generation process because of a lack of effective supervision. To address these issues, the author proposes a new framework called Prompt Recursive Search (PRS), inspired by the differentiation process of human stem cells. The PRS framework enables automatic prompt design and conserves computational resources by supervising problem complexity and preventing error accumulation during reasoning. It allows LLMs to avoid overly complex planning for simple problems and shifts the prompt design workload onto the model, thereby reducing dependency on human experts. The effectiveness of the PRS method is validated through comparisons with traditional prompt design approaches such as CoT on the BBH dataset across multiple domains, using models with varying numbers of parameters. Ablation experiments with the Yi-34B model further confirm the effectiveness of the PRS framework, showing that it successfully integrates the advantages of both EDPs and LDPs while overcoming their individual shortcomings.
Thinking Direction:
The author identified problems with current Large Language Model (LLM) prompt design approaches. Expert-Designed Prompts (EDPs) rely on human expertise and are inflexible: their complex, static structures make them inefficient on simple problems. LLM-Derived Prompts (LDPs) are dynamic and can be optimized by the LLM itself, but are prone to error accumulation because the prompt generation process lacks sufficient oversight.
To solve these problems, the author proposed a new framework called Prompt Recursive Search (PRS). This framework leverages the strengths of both EDP and LDP types to tackle complex tasks by assessing problem complexity and, if necessary, breaking down complex problems into simpler parts. LLMs with varying parameter sizes are employed to solve these simplified problems, which reduces the dependency on human expertise and the likelihood of error propagation.
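The decomposition loop described above can be sketched in code. This is a minimal, hypothetical illustration of the idea, not the paper's actual implementation: the prompt wordings, the SIMPLE/COMPLEX verdict format, the newline-separated decomposition format, and the `max_depth` recursion budget are all assumptions introduced here for clarity. The `llm` parameter stands in for any text-in, text-out model call.

```python
from typing import Callable

def prs_solve(problem: str, llm: Callable[[str], str], max_depth: int = 3) -> str:
    """Hypothetical sketch of a Prompt Recursive Search style loop:
    assess complexity, answer simple problems directly, and recursively
    decompose complex ones before synthesizing a final answer."""
    # Step 1: ask the model to assess problem complexity (assumed prompt wording).
    verdict = llm(f"Is the following problem SIMPLE or COMPLEX?\n{problem}")
    if max_depth == 0 or "SIMPLE" in verdict.upper():
        # Simple problems get a direct answer, conserving tokens.
        return llm(f"Solve directly:\n{problem}")
    # Step 2: decompose into subproblems, one per line (assumed output format).
    decomposition = llm(
        f"Break this problem into simpler subproblems, one per line:\n{problem}"
    )
    subproblems = [s.strip() for s in decomposition.splitlines() if s.strip()]
    # Step 3: recurse on each subproblem with a reduced depth budget,
    # which bounds error propagation during planning.
    sub_answers = [prs_solve(s, llm, max_depth - 1) for s in subproblems]
    # Step 4: synthesize a final answer from the partial solutions.
    return llm("Combine these partial answers into one solution:\n" + "\n".join(sub_answers))
```

The explicit complexity check is what distinguishes this loop from unconditional decomposition: a simple problem takes a single model call, while a complex one is split and supervised at each level of recursion.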
The PRS method was validated by comparing it against traditional prompt design approaches on the BBH dataset using models of different parameter sizes. The result is a prompt design framework that is both efficient, conserving computational resources, and error-resistant, since it supervises problem complexity and avoids over-reliance on human-designed prompts.