Microsoft’s TaskWeaver is a code-first framework for planning and executing complex data analytics tasks. It converts user requests into executable code snippets, manages the needed plugins, executes the code, and returns a human-readable answer. The framework can handle complex tasks that require multiple steps and plugins, such as data cleaning, transformation, visualization, and modeling.
The project is open-source. The code repository includes usage instructions and two walkthrough examples.
TaskWeaver is designed to be easy to use and customizable. It can handle different data formats, such as pandas DataFrames, tables, graphs, images, etc., instead of just text strings. You can encapsulate your own algorithms into plugins and create multiple AI assistants that work together to solve complex tasks.
The model can use examples to learn domain-specific knowledge and enhance the LLM’s capacity to create precise plans and code for complex tasks.
Because large language models (LLMs) can be expensive to run, TaskWeaver lets each module be configured with a different LLM. For less demanding tasks, users can deploy more cost-effective models such as GPT-3.5, Claude, or Llama.
TaskWeaver verifies the generated code before execution, detects potential issues, and suggests corrective measures.
The model
The model consists of three main components, as depicted in the next picture:
- the Planner, which interacts with the user and breaks the user request down into subtasks
- the Code Generator (CG), which generates executable code for each subtask using LLMs and plugins
- the Code Executor (CE), which runs the code and maintains the state of the session
CG and CE form the stateful Code Interpreter, which keeps the execution context and variables in the same session.
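To make the idea of a stateful session concrete, here is a minimal sketch in plain Python. It is not TaskWeaver's implementation: the `StatefulInterpreter` class and its `run` method are hypothetical names, and the only assumption is that every snippet executes against the same namespace, so variables defined by one snippet remain visible to the next.

```python
class StatefulInterpreter:
    """Toy stand-in for a session-scoped interpreter (hypothetical, not TaskWeaver's class)."""

    def __init__(self):
        # One namespace per session: every snippet reads from and writes to it.
        self.namespace = {}

    def run(self, snippet: str) -> None:
        # Execute the snippet with the shared namespace as its globals.
        exec(snippet, self.namespace)


session = StatefulInterpreter()
session.run("values = [3, 5, 10]")                   # first snippet defines a variable
session.run("average = sum(values) / len(values)")   # second snippet reuses it
print(session.namespace["average"])                  # prints 6.0
```

A production executor would also isolate the process that runs the code. The sketch only illustrates why later subtasks can build on earlier results.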
1. The Planner is the system’s entry and end point and interacts with the user. It is responsible for planning (dividing the user’s request into subtasks and managing their execution) and responding (converting the execution result into a readable response for the user).
2. The Code Generator (CG) uses LLMs to generate code in different languages, such as Python, SQL, R, etc. It can also use existing plugins for tasks such as plotting a chart, performing sentiment analysis, or connecting to a database. The CG calls the plugins as functions in the code snippets and passes the data and the parameters as arguments.
For example, if the user asks for the average sales of each product category, the agent may generate a code snippet like this:
import pandas as pd
# Load the data from a CSV file
df = pd.read_csv('sales.csv')
# Group the data by category and calculate the mean of the numeric columns
df.groupby('category').mean(numeric_only=True)
3. The Code Executor (CE) runs the code snippets. It verifies the code before execution, to check for any errors or safety issues, runs the code and returns the execution results to the Planner.
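One way such a pre-execution check could work is to parse the snippet into a syntax tree and look for disallowed constructs. The sketch below uses Python's standard `ast` module. The deny-lists and the `find_violations` function are illustrative assumptions, not TaskWeaver's actual rules or API.

```python
import ast

# Hypothetical deny-lists, chosen only for illustration.
BLOCKED_MODULES = {"os", "subprocess", "shutil"}
BLOCKED_CALLS = {"eval", "exec", "open"}


def find_violations(source: str) -> list[str]:
    """Return a list of problems found in the snippet, without running it."""
    try:
        tree = ast.parse(source)
    except SyntaxError as err:
        return [f"syntax error: {err.msg}"]

    violations = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # e.g. "import os" or "import os.path"
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    violations.append(f"blocked import: {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            # e.g. "from subprocess import run"
            if node.module and node.module.split(".")[0] in BLOCKED_MODULES:
                violations.append(f"blocked import: {node.module}")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Direct calls by name, e.g. "eval(...)"
            if node.func.id in BLOCKED_CALLS:
                violations.append(f"blocked call: {node.func.id}")
    return violations


print(find_violations("import os\nos.remove('data.csv')"))
print(find_violations("total = sum([1, 2, 3])"))
```

The first call reports the blocked import and the second returns an empty list. A static check like this catches only what its rules describe, so it complements sandboxed execution rather than replacing it.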
In a continuous cycle of reasoning and action (ReAct), the Planner observes these results and updates the plan if needed. It may ask for additional information from the user or refine the overall strategy. The cycle repeats for each subsequent subtask until the complete plan is successfully executed.
Finally, the Planner converts the results into a response for the user.
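The plan, generate, execute, and observe cycle described above can be sketched as a simple loop. Everything in this example is a stand-in: `plan` and `generate_code` return canned values where the real system would call an LLM, and the function names are hypothetical. The sketch also stops at the first failure, whereas the Planner described above would revise the plan.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    namespace: dict = field(default_factory=dict)  # shared execution state
    log: list = field(default_factory=list)        # (subtask, observation) pairs


def plan(request: str) -> list[str]:
    # Stand-in for the LLM-backed Planner: a fixed two-step plan.
    return ["load data", "compute average"]


def generate_code(subtask: str) -> str:
    # Stand-in for the Code Generator: canned snippets instead of LLM output.
    snippets = {
        "load data": "sales = [120, 80, 100]",
        "compute average": "result = sum(sales) / len(sales)",
    }
    return snippets[subtask]


def execute(code: str, session: Session) -> str:
    # Stand-in for the Code Executor: run the snippet and report an observation.
    try:
        exec(code, session.namespace)
        return "ok"
    except Exception as err:
        return f"error: {err}"


def handle(request: str) -> str:
    session = Session()
    for subtask in plan(request):
        observation = execute(generate_code(subtask), session)
        session.log.append((subtask, observation))
        if observation != "ok":
            # A real planner would update the plan here; the sketch just stops.
            return f"Stopped at '{subtask}': {observation}"
    # Final step: turn the execution result into a readable response.
    return f"The average is {session.namespace['result']}"


print(handle("What is the average of the sales figures?"))
```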
Extension to multi-agents
TaskWeaver can be scaled up to a multi-agent architecture where a complex project is divided into multiple agents, with each agent in charge of a specific set of functions. This ability enables:
- modular project design: complex tasks are broken down into smaller agents, which is helpful for projects with numerous plugins
- flexible task expansion: new functions can be added to an existing project without modifying existing code
- easy integration: the model can be used as a service or a library and called by other multi-agent frameworks, such as AutoGen (a framework for developing LLM applications in which multiple agents converse with each other to solve tasks)
As shown in the following figure, there are two ways to extend TaskWeaver to a multi-agent environment. In one approach, a TaskWeaver-powered agent directly invokes other agents via plugins. In the other, TaskWeaver-powered agents are embedded into a pre-existing multi-agent framework, such as AutoGen.
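The first approach, in which one agent reaches another through a plugin, can be illustrated with a toy example. The `Agent` class, its `register` and `call` methods, and the keyword-based sentiment rule are all hypothetical and unrelated to the TaskWeaver or AutoGen APIs. The point is only that the second agent is exposed to the first as an ordinary callable.

```python
from typing import Callable


class Agent:
    """Toy agent that dispatches requests to its registered plugins (hypothetical)."""

    def __init__(self, name: str):
        self.name = name
        self.plugins: dict[str, Callable[[str], str]] = {}

    def register(self, plugin_name: str, func: Callable[[str], str]) -> None:
        self.plugins[plugin_name] = func

    def call(self, plugin_name: str, payload: str) -> str:
        return self.plugins[plugin_name](payload)


# A specialised agent with a single, deliberately naive capability.
sentiment_agent = Agent("sentiment")
sentiment_agent.register(
    "classify",
    lambda text: "positive" if "good" in text.lower() else "negative",
)

# The main agent sees the other agent as just another plugin.
main_agent = Agent("main")
main_agent.register(
    "ask_sentiment_agent",
    lambda text: sentiment_agent.call("classify", text),
)

print(main_agent.call("ask_sentiment_agent", "Good quarter for sales"))  # prints positive
```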
Design verification
TaskWeaver’s design verification process checks that the code it generates is accurate and secure. The authors tested the framework’s ability to reason and act (ReAct), decompose tasks, and perform safe and stateful coding and execution. The experiments showed that TaskWeaver can successfully perform these tasks and avoid unsafe or prohibited operations, such as file deletion or secret key leakage.
How to use TaskWeaver
- you can interact with it through the command-line interface
- you can use it in your own project: clone the TaskWeaver repository and install it as a library
- you can also run it as a service that is called by other programs
For more details, visit the GitHub repository.
Conclusion
TaskWeaver is a code-first agent framework that can plan and execute data analytics tasks based on user requests. It integrates the coding capability of LLMs with domain-specific knowledge through examples. It includes a wide range of tools that make it easier for users to customize the system.
You can use the model for different tasks, such as finding the most popular topics in a collection of news articles, plotting charts, performing sentiment analysis, or connecting to databases.
Learn more:
- Research paper: “TaskWeaver: A Code-First Agent Framework” (on arXiv)
- The GitHub repository