AI Agent Survey

Papers

  • Toolformer: Language Models Can Teach Themselves to Use Tools: A language model trained in a self-supervised way to decide which external APIs to call, when to call them, and how to incorporate the results, bridging the gap between LLMs' ability to solve complex tasks and their struggles with basic functionality such as arithmetic or factual lookup.
  • ReAct: Synergizing Reasoning and Acting in Language Models: Presents ReAct, which prompts large language models to interleave reasoning traces with task-specific actions, improving performance on language and decision-making tasks while increasing human interpretability and trustworthiness (a minimal sketch of the loop appears after this list).
  • Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models: Introduces Plan-and-Solve prompting, which improves zero-shot reasoning by first asking the model to devise a plan that divides the task into subtasks, then to carry out the subtasks step by step (see the prompt sketch after this list).
  • Gorilla: Large Language Model Connected with Massive APIs: A fine-tuned LLM that excels at writing API calls; it significantly improves the accuracy of API usage and reduces hallucination errors, demonstrating the potential to increase the reliability and applicability of LLM output.
  • HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face: A multi-model collaborative system that automates tasks by leveraging machine learning models from Hugging Face through an LLM-controlled four-stage pipeline of task planning, model selection, task execution, and response generation (sketched after this list).
  • Voyager: An Open-Ended Embodied Agent with Large Language Models: The paper introduces an LLM-powered agent designed for lifelong learning in Minecraft, which proposes tasks based on its skills and state, refines skills based on environmental feedback, and continually explores the world.
  • ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs: Introduces ToolLLM, a framework for enhancing the tool-use capabilities of LLMs, together with the ToolBench dataset. The resulting ToolLLaMA (ToolLLM applied to LLaMA) shows robust performance in executing complex instructions and generalizing to unseen APIs, contributing to the democratization of AI technologies.
  • Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models: Demonstrates that tool documentation, rather than demonstrations, can effectively teach LLMs new tools, simplifying the process, reducing bias, and improving scalability; performance even improves with the comprehensiveness of the documentation, up to a certain length.
  • TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents: Presents a structured framework for LLM-based AI agents and a novel approach to flexible problem-solving. Evaluations show that these agents, particularly those built on ChatGPT, have significant potential in task planning, tool usage, and managing complex tasks, highlighting promising prospects for future research and real-world applications.
  • AgentBench: Evaluating LLMs as Agents: This paper introduces AgentBench, a comprehensive benchmark for evaluating Large Language Models (LLMs) as agents across diverse real-world tasks, revealing a significant performance gap between top-tier and open-source models, and providing a toolkit for future research and development in the field.
  • BOLAA: Benchmarking and Orchestrating LLM-Augmented Autonomous Agents: Explores LLM-augmented autonomous agents (LAAs) and introduces BOLAA, a strategy for orchestrating multiple LAAs. The BOLAA architecture outperforms alternatives on complex tasks, with the best results observed with Llama-2-70b, suggesting the value of fine-tuning multiple smaller specialized LAAs and of pairing the LLM with the optimal agent architecture.
  • Tree of Thoughts: Deliberate Problem Solving with Large Language Models: The Tree of Thoughts (ToT) framework enables more deliberate decision-making and strategic planning with large language models (LLMs) by exploring and evaluating multiple reasoning paths, demonstrating superior performance on challenging tasks while improving the interpretability of model decisions (a minimal search sketch appears after this list).
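
Sketches

The snippets below illustrate four of the approaches above. They are minimal sketches under stated assumptions, not the papers' released implementations.

A ReAct-style loop: the model alternates reasoning traces with tool actions. The `llm` completion function and the `tools` dictionary of callables are hypothetical stand-ins for whatever model and tool set you use.

```python
import re

def react_loop(question, llm, tools, max_steps=8):
    """Alternate Thought / Action / Observation steps until the model finishes.

    `llm` takes a prompt string and returns a completion string; `tools` maps
    action names to functions of one string argument. Both are assumptions.
    """
    prompt = (
        "Solve the question by alternating Thought and Action lines.\n"
        "Actions: " + ", ".join(tools) + ", finish[answer]\n"
        f"Question: {question}\n"
    )
    for _ in range(max_steps):
        step = llm(prompt + "Thought:")              # model emits thought + action
        prompt += "Thought:" + step + "\n"
        match = re.search(r"Action:\s*(\w+)\[(.*?)\]", step)
        if match is None:
            continue                                 # no parsable action; try again
        name, arg = match.groups()
        if name == "finish":                         # terminal action: return answer
            return arg
        observation = tools[name](arg)               # execute the chosen tool
        prompt += f"Observation: {observation}\n"    # feed the result back
    return None                                      # step budget exhausted
```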
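
Plan-and-Solve prompting reduces to a prompt template. The trigger wording below paraphrases the paper's zero-shot instruction, so treat the exact phrasing as an approximation.

```python
# Plan-and-Solve prompting: swap the plain zero-shot trigger for one that
# asks the model to plan before solving. Wording paraphrases the paper.

PLAN_AND_SOLVE_TRIGGER = (
    "Let's first understand the problem and devise a plan to solve it. "
    "Then, let's carry out the plan and solve the problem step by step."
)

def plan_and_solve_prompt(problem: str) -> str:
    """Wrap a problem statement in the plan-then-solve instruction."""
    return f"Q: {problem}\nA: {PLAN_AND_SOLVE_TRIGGER}"
```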
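
HuggingGPT's four-stage pipeline, reduced to its control flow. `llm`, `model_hub.select`, and `run_model` are hypothetical stand-ins, not the project's actual interfaces.

```python
def hugginggpt_pipeline(request, llm, model_hub, run_model):
    """Route a user request through plan -> select -> execute -> respond."""
    # 1. Task planning: the LLM decomposes the request into subtasks,
    #    assumed here to come back one per line.
    tasks = llm(f"List the ML subtasks needed to handle: {request}").splitlines()
    results = []
    for task in tasks:
        # 2. Model selection: choose a Hugging Face model suited to the subtask.
        model = model_hub.select(task)
        # 3. Task execution: run the chosen model on the subtask.
        results.append(run_model(model, task))
    # 4. Response generation: the LLM turns raw results into a final answer.
    return llm(f"Summarize these results for the user: {results}")
```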
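
Tree of Thoughts, in its breadth-first variant, is a beam search over partial "thoughts". The `propose` (thought generator) and `score` (state evaluator) callables stand in for the paper's LLM-backed components.

```python
def tree_of_thoughts(root, propose, score, beam_width=3, depth=3):
    """Expand candidate thoughts level by level, keeping only the best few."""
    frontier = [root]
    for _ in range(depth):
        # Generate successor thoughts for every state on the frontier.
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            break
        # Evaluate and prune: keep the most promising partial solutions.
        frontier = sorted(candidates, key=score, reverse=True)[:beam_width]
    # Return the highest-scoring state reached.
    return max(frontier, key=score)
```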
