In Part 1 of this tutorial series, we introduced AI Agents, autonomous programs that perform tasks, make decisions, and communicate with others.
In Part 2 of this tutorial series, we understood how to make the Agent try and retry until the task is completed through Iterations and Chains.
A single Agent can usually operate effectively using a tool, but it can be less effective when using many tools simultaneously. One way to tackle complicated tasks is through a “divide-and-conquer” approach: create a specialized Agent for each task and have them work together as a Multi-Agent System (MAS).
In a MAS, multiple Agents collaborate to achieve common goals, often tackling challenges that are too difficult for a single Agent to handle alone. There are two main ways they can interact:
- Sequential flow – The Agents do their work in a specific order, one after the other. For example, Agent 1 finishes its task, and then Agent 2 uses the result to do its task. This is useful when tasks depend on each other and must be done step-by-step.
- Hierarchical flow – Usually, one higher-level Agent manages the whole process and gives instructions to lower-level Agents which focus on specific tasks. This is useful when the final output requires some back-and-forth.
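To make the two flows concrete before involving any LLM, here is a minimal sketch in which each Agent is just a Python function from text to text. The helper names (run_sequential, run_hierarchical) and the toy agents are my own placeholders for illustration, not part of Ollama or of the code used later in this article.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for an Agent: a plain function from text to text.
Agent = Callable[[str], str]

def run_sequential(agents: List[Agent], task: str) -> str:
    """Sequential flow: each Agent receives the previous Agent's output."""
    result = task
    for agent in agents:
        result = agent(result)
    return result

def run_hierarchical(route: Callable[[str], str], workers: Dict[str, Agent], task: str) -> str:
    """Hierarchical flow: a manager picks the worker, then reports its result."""
    worker_name = route(task)             # the manager decides who handles the task
    output = workers[worker_name](task)   # the chosen worker does the job
    return f"{worker_name}: {output}"

# Toy agents, used only for illustration
def describe(text: str) -> str:
    return f"description of {text}"

def search(text: str) -> str:
    return f"search results for {text}"

sequential_output = run_sequential([describe, search], "photo.jpg")
hierarchical_output = run_hierarchical(
    lambda task: "search" if "find" in task else "describe",
    {"describe": describe, "search": search},
    "find the location",
)
print(sequential_output)
print(hierarchical_output)
```

In the sequential case the order of the list is the whole design, while in the hierarchical case the routing function carries the decision logic that a Lead Agent would provide.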
In this tutorial, I’m going to show how to build from scratch different types of Multi-Agent Systems, from simple to more advanced. I will present some useful Python code that can be easily applied in other similar cases (just copy, paste, run) and walk through every line of code with comments so that you can replicate this example (link to full code at the end of the article).
Setup
Please refer to Part 1 for the setup of Ollama and the main LLM.
import ollama
llm = "qwen2.5"
In this example, I will ask the model to process images, therefore I’m also going to need a Vision LLM. It is a specialized version of a Large Language Model that, integrating NLP with CV, is designed to understand visual inputs, such as images and videos, in addition to text.
LLaVA, an open-source model co-developed by Microsoft Research, is an efficient choice as it can also run without a GPU.
After the model download is completed (for instance, by running ollama pull llava in a terminal), you can move on to Python and start writing code. Let’s load an image so that we can try out the Vision LLM.
from matplotlib import image as pltimg, pyplot as plt
image_file = "draghi.jpeg"
plt.imshow(pltimg.imread(image_file))
plt.show()
In order to test the Vision LLM, you can just pass the image as an input:
import ollama
ollama.generate(model="llava",
                prompt="describe the image",
                images=[image_file])["response"]
Sequential Multi-Agent System
I shall build two Agents that will work in a sequential flow, one after the other, where the second takes the output of the first as an input, just like a Chain.
- The first Agent must process an image provided by the user and return a verbal description of what it sees.
- The second Agent will search the internet and try to understand where and when the picture was taken, based on the description provided by the first Agent.
Both Agents shall use one Tool each. The first Agent will have the Vision LLM as a Tool. Please remember that with Ollama, in order to use a Tool, the function must be described in a dictionary.
def process_image(path: str) -> str:
    return ollama.generate(model="llava", prompt="describe the image", images=[path])["response"]

tool_process_image = {'type':'function', 'function':{
    'name': 'process_image',
    'description': 'Load an image for a given path and describe what you see',
    'parameters': {'type': 'object',
                   'required': ['path'],
                   'properties': {
                       'path': {'type':'str', 'description':'the path of the image'},
}}}}
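Writing these dictionaries by hand gets repetitive once there are several Tools. As a hedged sketch, the helper below derives the same structure from a function signature with the standard inspect module. The helper name build_tool_schema and the stub function are hypothetical, every parameter is assumed to be text, and the sketch emits the JSON Schema type name "string" where the dictionaries in this article write 'str'.

```python
import inspect

def build_tool_schema(func, description: str, param_descriptions: dict) -> dict:
    """Build a tool dictionary (same layout as above) from a function signature."""
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Assumption: every parameter is text
        properties[name] = {"type": "string",
                            "description": param_descriptions.get(name, "")}
        if param.default is inspect.Parameter.empty:
            required.append(name)   # no default value, so the argument is mandatory
    return {"type": "function",
            "function": {"name": func.__name__,
                         "description": description,
                         "parameters": {"type": "object",
                                        "required": required,
                                        "properties": properties}}}

def process_image_stub(path: str) -> str:
    # Placeholder body: the real Tool would call the Vision LLM
    return f"description of {path}"

tool_schema = build_tool_schema(process_image_stub,
                                "Load an image from a given path and describe what you see",
                                {"path": "the path of the image"})
print(tool_schema)
```

The benefit is mostly consistency: the Tool name in the dictionary can never drift from the name of the Python function.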
The second Agent should have a web-searching Tool. In the previous articles of this tutorial series, I showed how to leverage the DuckDuckGo package for searching the web. So, this time, we can use a new Tool: Wikipedia (pip install wikipedia==1.4.0). You can directly use the original library or import the LangChain wrapper.
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

def search_wikipedia(query:str) -> str:
    return WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper()).run(query)

tool_search_wikipedia = {'type':'function', 'function':{
    'name': 'search_wikipedia',
    'description': 'Search on Wikipedia by passing some keywords',
    'parameters': {'type': 'object',
                   'required': ['query'],
                   'properties': {
                       'query': {'type':'str', 'description':'The input must be short keywords, not a long text'},
}}}}
## test
search_wikipedia(query="draghi")
First, you need to write a prompt to describe the task of each Agent (the more detailed, the better), and that will be the first message in the chat history with the LLM.
prompt = '''
You are a photographer that analyzes and describes images in detail.
'''
messages_1 = [{"role":"system", "content":prompt}]
One important decision to make when building a MAS is whether the Agents should share the chat history or not. The management of chat history depends on the design and objectives of the system:
- Shared chat history – Agents have access to a common conversation log, allowing them to see what other Agents have said or done in previous interactions. This can enhance the collaboration and the understanding of the overall context.
- Separate chat history – Agents only have access to their own interactions, focusing only on their own communication. This design is typically used when independent decision-making is important.
I recommend keeping the chats separate unless it is necessary to do otherwise. LLMs have a limited context window, so it’s better to keep the history as light as possible.
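The difference between the two designs is easy to see with plain lists of messages. The snippet below is only an illustration: the hand_over helper is a hypothetical name, and the image description is a made-up placeholder rather than real model output.

```python
# Separate histories: each Agent only sees its own system prompt and messages.
history_photographer = [{"role": "system", "content": "You are a photographer."}]
history_detective = [{"role": "system", "content": "You are a detective."}]

# Shared history: every Agent appends to, and reads from, the same list.
shared_history = [{"role": "system", "content": "You are part of a team of Agents."}]

def hand_over(result: str, target_history: list) -> None:
    """Pass only the final result of one Agent to another Agent's history."""
    target_history.append({"role": "user", "content": result})

# The photographer has its own exchange (placeholder description, not real output)
history_photographer.append({"role": "user", "content": "draghi.jpeg"})
history_photographer.append({"role": "assistant", "content": "A man speaking at a podium."})

# Separate design: the detective receives a single message with the result
hand_over(history_photographer[-1]["content"], history_detective)

# Shared design: the whole exchange becomes visible to everyone
shared_history.extend(history_photographer[1:])

print(len(history_detective), len(shared_history))
```

With separate histories the second Agent's context grows by one message per hand-over, whereas a shared log grows with every turn of every Agent.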
prompt = '''
You are a detective. You read the image description provided by the photographer, and you search Wikipedia to understand when and where the picture was taken.
'''
messages_2 = [{"role":"system", "content":prompt}]
For convenience, I shall use the function defined in the previous articles to process the model’s response.
def use_tool(agent_res:dict, dic_tools:dict) -> dict:
    ## defaults, so the return below never hits an undefined name
    res, t_name, t_inputs = '', '', ''
    ## use tool
    if "tool_calls" in agent_res["message"].keys():
        for tool in agent_res["message"]["tool_calls"]:
            t_name, t_inputs = tool["function"]["name"], tool["function"]["arguments"]
            if f := dic_tools.get(t_name):
                ### calling tool
                print('🔧 >', f"\x1b[1;31m{t_name} -> Inputs: {t_inputs}\x1b[0m")
                ### tool output
                t_output = f(**tool["function"]["arguments"])
                print(t_output)
                ### final res
                res = t_output
            else:
                print('🤬 >', f"\x1b[1;31m{t_name} -> NotFound\x1b[0m")
    ## don't use tool
    if agent_res['message']['content'] != '':
        res = agent_res["message"]["content"]
        t_name, t_inputs = '', ''
    return {'res':res, 'tool_used':t_name, 'inputs_used':t_inputs}
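Because this function only reads a dictionary, its logic can be checked without running any model. The sketch below uses a condensed, hypothetical variant (dispatch_tool_calls) and two hand-written responses that mimic the shape used in this article; both the response shape and the toy Tool are assumptions made for the example.

```python
def dispatch_tool_calls(agent_res: dict, dic_tools: dict) -> dict:
    """Condensed variant of use_tool: run the requested tools, else return the text."""
    res, t_name, t_inputs = "", "", ""
    for call in agent_res["message"].get("tool_calls") or []:
        t_name = call["function"]["name"]
        t_inputs = call["function"]["arguments"]
        func = dic_tools.get(t_name)
        if func is not None:
            res = func(**t_inputs)        # the tool output becomes the result
    content = agent_res["message"].get("content", "")
    if content:                           # a plain-text answer overrides tool info
        res, t_name, t_inputs = content, "", ""
    return {"res": res, "tool_used": t_name, "inputs_used": t_inputs}

# Hand-written responses that imitate the dictionary shape (an assumption)
fake_tool_response = {"message": {"content": "", "tool_calls": [
    {"function": {"name": "shout", "arguments": {"text": "hello"}}}]}}
fake_text_response = {"message": {"content": "No tool needed."}}

toy_tools = {"shout": lambda text: text.upper()}   # toy Tool for illustration
tool_result = dispatch_tool_calls(fake_tool_response, toy_tools)
text_result = dispatch_tool_calls(fake_text_response, toy_tools)
print(tool_result)
print(text_result)
```

Testing with fake responses like these is a cheap way to catch mistakes in the dispatch logic before spending time on real model calls.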
As we already did in previous tutorials, the interaction with the Agents can be started with a while loop. The user is asked to provide an image that the first Agent will process.
dic_tools = {'process_image':process_image,
             'search_wikipedia':search_wikipedia}

while True:
    ## user input
    try:
        q = input('📷 > give me the image to analyze:')
    except EOFError:
        break
    if q == "quit":
        break
    if q.strip() == "":
        continue
    messages_1.append( {"role":"user", "content":q} )

    plt.imshow(pltimg.imread(q))
    plt.show()

    ## Agent 1
    agent_res = ollama.chat(model=llm,
                            tools=[tool_process_image],
                            messages=messages_1)
    dic_res = use_tool(agent_res, dic_tools)
    res, tool_used, inputs_used = dic_res["res"], dic_res["tool_used"], dic_res["inputs_used"]

    print("👽📷 >", f"\x1b[1;30m{res}\x1b[0m")
    messages_1.append( {"role":"assistant", "content":res} )

The first Agent used the Vision LLM Tool and recognized text within the image. Now, the description will be passed to the second Agent, which shall extract some keywords to search Wikipedia.
    ## Agent 2 (still inside the while loop)
    messages_2.append( {"role":"system", "content":"-Picture: "+res} )

    agent_res = ollama.chat(model=llm,
                            tools=[tool_search_wikipedia],
                            messages=messages_2)
    dic_res = use_tool(agent_res, dic_tools)
    res, tool_used, inputs_used = dic_res["res"], dic_res["tool_used"], dic_res["inputs_used"]
The second Agent used the Tool and extracted information from the web, based on the description provided by the first Agent. Now, it can process everything and give a final answer.
    if tool_used == "search_wikipedia":
        messages_2.append( {"role":"system", "content":"-Wikipedia: "+res} )
        agent_res = ollama.chat(model=llm, tools=[], messages=messages_2)
        dic_res = use_tool(agent_res, dic_tools)
        res, tool_used, inputs_used = dic_res["res"], dic_res["tool_used"], dic_res["inputs_used"]
    else:
        messages_2.append( {"role":"assistant", "content":res} )

    print("👽📖 >", f"\x1b[1;30m{res}\x1b[0m")
This is literally perfect! Let’s move on to the next example.
Hierarchical Multi-Agent System
Imagine having a squad of Agents that operates with a hierarchical flow, just like a human team, with distinct roles to ensure smooth collaboration and efficient problem-solving. At the top, a manager oversees the overall strategy, talking to the customer (the user), making high-level decisions, and guiding the team toward the goal. Meanwhile, other team members handle operative tasks. Just like humans, Agents can work together and delegate tasks appropriately.
I shall build a tech team of 3 Agents with the objective of querying a SQL database per user’s request. They must work in a hierarchical flow:
- The Lead Agent talks to the user and understands the request. Then, it decides which team member is the most appropriate for the task.
- The Junior Agent has the job of exploring the db and building SQL queries.
- The Senior Agent shall review the SQL code, correct it if necessary, and execute it.
LLMs know how to code because they are exposed to a large corpus of both code and natural language text, from which they learn the patterns, syntax, and semantics of programming languages. The model learns the relationships between different parts of the code by predicting the next token in a sequence. In short, LLMs can generate SQL code but can’t execute it; Agents can.
First of all, I am going to create a database and connect to it, then I shall prepare a series of Tools to execute SQL code.
## Read dataset
import pandas as pd
dtf = pd.read_csv('http://bit.ly/kaggletrain')
dtf.head(3)

## Create db
import sqlite3
dtf.to_sql(index=False, name="titanic",
           con=sqlite3.connect("database.db"),
           if_exists="replace")

## Connect db
from langchain_community.utilities.sql_database import SQLDatabase
db = SQLDatabase.from_uri("sqlite:///database.db")
Let’s start with the Junior Agent. LLMs don’t need Tools to generate SQL code, but the Agent doesn’t know the table names and structure. Therefore, we need to provide Tools to investigate the database.
from langchain_community.tools.sql_database.tool import ListSQLDatabaseTool

def get_tables() -> str:
    return ListSQLDatabaseTool(db=db).invoke("")

tool_get_tables = {'type':'function', 'function':{
    'name': 'get_tables',
    'description': 'Returns the name of the tables in the database.',
    'parameters': {'type': 'object',
                   'required': [],
                   'properties': {}
}}}
## test
get_tables()
That will show the available tables in the db, and this will print the columns in a table.
from langchain_community.tools.sql_database.tool import InfoSQLDatabaseTool

def get_schema(tables: str) -> str:
    tool = InfoSQLDatabaseTool(db=db)
    return tool.invoke(tables)

tool_get_schema = {'type':'function', 'function':{
    'name': 'get_schema',
    'description': 'Returns the name of the columns in the table.',
    'parameters': {'type': 'object',
                   'required': ['tables'],
                   'properties': {'tables': {'type':'str', 'description':'table name. Example Input: table1, table2, table3'}}
}}}
## test
get_schema(tables='titanic')
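If you prefer not to depend on the LangChain wrappers, similar information can be read with the standard sqlite3 module alone. The following is a rough sketch under that assumption: the function names list_tables and list_columns are my own, and the in-memory table is a tiny made-up stand-in for the Titanic data.

```python
import sqlite3

# In-memory database with a tiny, made-up table (not the real Titanic data)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titanic (PassengerId INTEGER, Name TEXT, Survived INTEGER)")

def list_tables(conn: sqlite3.Connection) -> str:
    """Return the table names as a comma-separated string."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name").fetchall()
    return ", ".join(r[0] for r in rows)

def list_columns(conn: sqlite3.Connection, table: str) -> str:
    """Return 'column TYPE' pairs for one table using PRAGMA table_info."""
    # Only accept names that exist in the db, since PRAGMA cannot take bound parameters
    if table not in list_tables(conn).split(", "):
        return f"Table '{table}' not found"
    rows = conn.execute(f"PRAGMA table_info('{table}')").fetchall()
    return ", ".join(f"{r[1]} {r[2]}" for r in rows)   # r[1] = name, r[2] = type

print(list_tables(conn))
print(list_columns(conn, "titanic"))
```

Both functions return strings on purpose, so that their output can be appended to the chat history exactly like the output of the Tools above.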
Since this Agent must use more than one Tool, any of which might fail, I will write a solid prompt, following the structure of the previous article.
prompt_junior = '''
[GOAL] You are a data engineer who builds efficient SQL queries to get data from the database.
[RETURN] You must return a final SQL query based on user's instructions.
[WARNINGS] Use your tools only once.
[CONTEXT] In order to generate the correct SQL query, you need to know the name of the table and the schema.
First ALWAYS use the tool 'get_tables' to find the name of the table.
Then, you MUST use the tool 'get_schema' to get the columns in the table.
Finally, based on the information you got, generate an SQL query to answer user question.
'''
Moving to the Senior Agent. Code checking doesn’t require any particular trick; you can just use the LLM.
def sql_check(sql: str) -> str:
    p = f'''Double check if the SQL query is correct: {sql}. You MUST return just the SQL code without comments'''
    res = ollama.generate(model=llm, prompt=p)["response"]
    return res.replace('sql','').replace('```','').replace('\n',' ').strip()

tool_sql_check = {'type':'function', 'function':{
    'name': 'sql_check',
    'description': 'Before executing a query, always review the SQL query and correct the code if necessary',
    'parameters': {'type': 'object',
                   'required': ['sql'],
                   'properties': {'sql': {'type':'str', 'description':'SQL code'}}
}}}
## test
sql_check(sql='SELECT * FROM titanic TOP 3')
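One caveat of cleaning the reply with chained string replacements is that every lowercase occurrence of the substring "sql" is removed, including inside identifiers. As a possible alternative, and only as a sketch, a regular expression can pull the query out of a Markdown code fence when there is one. The function name extract_sql and the sample strings are made up for this example.

```python
import re

FENCE = "`" * 3   # three backticks, built this way to keep the example readable

def extract_sql(llm_output: str) -> str:
    """Return the SQL inside a Markdown code fence, or the raw text if there is none."""
    pattern = FENCE + r"(?:sql)?\s*(.*?)" + FENCE
    match = re.search(pattern, llm_output, flags=re.DOTALL | re.IGNORECASE)
    sql = match.group(1) if match else llm_output
    return " ".join(sql.split())   # collapse newlines and repeated spaces

# Made-up model replies, for illustration only
fenced = f"Here is the fix:\n{FENCE}sql\nSELECT *\nFROM titanic\nLIMIT 3;\n{FENCE}"
plain = "SELECT sql_mode FROM settings"

print(extract_sql(fenced))
print(extract_sql(plain))
```

The second call shows the point of the change: a column called sql_mode survives, whereas a blanket replacement of "sql" would turn it into _mode.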
Executing code on the database is a different story: LLMs can’t do that alone.
from langchain_community.tools.sql_database.tool import QuerySQLDataBaseTool

def sql_exec(sql: str) -> str:
    return QuerySQLDataBaseTool(db=db).invoke(sql)

tool_sql_exec = {'type':'function', 'function':{
    'name': 'sql_exec',
    'description': 'Execute a SQL query',
    'parameters': {'type': 'object',
                   'required': ['sql'],
                   'properties': {'sql': {'type':'str', 'description':'SQL code'}}
}}}
## test
sql_exec(sql='SELECT * FROM titanic LIMIT 3')
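To see what "executing" means without LangChain, here is a small sketch based only on the standard sqlite3 module. It assumes an in-memory database with three made-up rows, and the function name run_sql is hypothetical. Errors are returned as text instead of being raised, so that an Agent could read them and react.

```python
import sqlite3

# Tiny in-memory table with made-up rows (not the real Titanic data)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titanic (PassengerId INTEGER, Survived INTEGER)")
conn.executemany("INSERT INTO titanic VALUES (?, ?)", [(1, 0), (2, 1), (3, 1)])

def run_sql(conn: sqlite3.Connection, sql: str) -> str:
    """Execute a query and return the rows, or the error text, as a string."""
    try:
        return str(conn.execute(sql).fetchall())
    except sqlite3.Error as err:
        return f"Error: {err}"

ok_result = run_sql(conn, "SELECT * FROM titanic ORDER BY PassengerId LIMIT 2")
bad_result = run_sql(conn, "SELECT * FROM titanic TOP 3")   # TOP is not valid in SQLite

print(ok_result)
print(bad_result)
```

The second query is the same invalid statement used to test the checker above, which is exactly the kind of input the review step is meant to catch before execution.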
And of course, a good prompt.
prompt_senior = '''
[GOAL] You are a senior data engineer who reviews and executes the SQL queries written by others.
[RETURN] You must return data from the database.
[WARNINGS] Use your tools only once.
[CONTEXT] ALWAYS check the SQL code before executing on the database.
First ALWAYS use the tool 'sql_check' to review the query. The output of this tool is the correct SQL query.
You MUST use ONLY the correct SQL query when you use the tool 'sql_exec'.
'''
Finally, we shall create the Lead Agent. It has the most important job: invoking other Agents and telling them what to do. There are many ways to achieve that, but I find creating a simple Tool the most accurate one.
def invoke_agent(agent:str, instructions:str) -> str:
    return agent+" - "+instructions if agent in ['junior','senior'] else f"Agent '{agent}' Not Found"

tool_invoke_agent = {'type':'function', 'function':{
    'name': 'invoke_agent',
    'description': 'Invoke another Agent to work for you.',
    'parameters': {'type': 'object',
                   'required': ['agent', 'instructions'],
                   'properties': {
                       'agent': {'type':'str', 'description':'the Agent name, one of "junior" or "senior".'},
                       'instructions': {'type':'str', 'description':'detailed instructions for the Agent.'}
                   }
}}}
## test
invoke_agent(agent="intern", instructions="build a query")
Describe in the prompt what kind of behavior you’re expecting. Try to be as detailed as possible, for hierarchical Multi-Agent Systems can get very confusing.
prompt_lead = '''
[GOAL] You are a tech lead.
You have a team with one junior data engineer called 'junior', and one senior data engineer called 'senior'.
[RETURN] You must return data from the database based on user's requests.
[WARNINGS] You are the only one that talks to the user and gets the requests from the user.
The 'junior' data engineer only builds queries.
The 'senior' data engineer checks the queries and executes them.
[CONTEXT] First ALWAYS ask the users what they want.
Then, you MUST use the tool 'invoke_agent' to pass the instructions to the 'junior' for building the query.
Finally, you MUST use the tool 'invoke_agent' to pass the instructions to the 'senior' for retrieving the data from the database.
'''
I shall keep chat history separate so each Agent will know only a specific part of the whole process.
dic_tools = {'get_tables':get_tables,
             'get_schema':get_schema,
             'sql_exec':sql_exec,
             'sql_check':sql_check,
             'invoke_agent':invoke_agent}

messages_junior = [{"role":"system", "content":prompt_junior}]
messages_senior = [{"role":"system", "content":prompt_senior}]
messages_lead = [{"role":"system", "content":prompt_lead}]
Everything is ready to start the workflow. After the user starts the chat, the first to respond is the Lead Agent, which is the only one that directly interacts with the human.
while True:
    ## user input
    q = input('🙂 >')
    if q == "quit":
        break
    messages_lead.append( {"role":"user", "content":q} )

    ## Lead Agent
    agent_res = ollama.chat(model=llm, messages=messages_lead, tools=[tool_invoke_agent])
    dic_res = use_tool(agent_res, dic_tools)
    res, tool_used, inputs_used = dic_res["res"], dic_res["tool_used"], dic_res["inputs_used"]
    agent_invoked = res.split("-")[0].strip() if len(res.split("-")) > 1 else ''
    instructions = res.split("-")[1].strip() if len(res.split("-")) > 1 else ''

    ###-->CODE TO INVOKE OTHER AGENTS HERE<--###

    ## Lead Agent final response
    print("👩💼 >", f"\x1b[1;30m{res}\x1b[0m")
    messages_lead.append( {"role":"assistant", "content":res} )
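A detail worth knowing about the parsing step: splitting the Tool output on every hyphen keeps only the text up to the second hyphen, so instructions that contain a hyphen themselves get cut short. A possible alternative, sketched below with a hypothetical helper called parse_invocation, is to split only on the first " - " separator and to accept only the two known Agent names.

```python
def parse_invocation(res: str) -> tuple:
    """Split 'agent - instructions' on the first ' - ' separator only."""
    agent, sep, instructions = res.partition(" - ")
    if not sep or agent.strip() not in ("junior", "senior"):
        return "", ""            # not a valid invocation, e.g. a plain-text answer
    return agent.strip(), instructions.strip()

# Made-up Tool output whose instructions contain a hyphen
example = "junior - build a query for first-class passengers"

parsed = parse_invocation(example)
naive = example.split("-")[1].strip()    # what splitting on every hyphen keeps

print(parsed)
print(naive)
print(parse_invocation("What would you like to know?"))
```

Here the naive split returns only "build a query for first", while the partition keeps the full instructions.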
The Lead Agent decided to invoke the Junior Agent giving it some instruction, based on the interaction with the user. Now the Junior Agent shall start working on the query.
    ## Invoke Junior Agent
    if agent_invoked == "junior":
        print("😎 >", f"\x1b[1;32mReceived instructions: {instructions}\x1b[0m")
        messages_junior.append( {"role":"user", "content":instructions} )

        ### use the tools
        available_tools = {"get_tables":tool_get_tables, "get_schema":tool_get_schema}
        context = ''
        while available_tools:
            agent_res = ollama.chat(model=llm, messages=messages_junior,
                                    tools=[v for v in available_tools.values()])
            dic_res = use_tool(agent_res, dic_tools)
            res, tool_used, inputs_used = dic_res["res"], dic_res["tool_used"], dic_res["inputs_used"]
            if tool_used:
                available_tools.pop(tool_used)
            context = context + f"\nTool used: {tool_used}. Output: {res}" #->add tool usage context
            messages_junior.append( {"role":"user", "content":context} )

        ### response
        agent_res = ollama.chat(model=llm, messages=messages_junior)
        dic_res = use_tool(agent_res, dic_tools)
        res = dic_res["res"]
        print("😎 >", f"\x1b[1;32m{res}\x1b[0m")
        messages_junior.append( {"role":"assistant", "content":res} )
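The loop above keeps asking the model until every Tool has been used once, which means it could keep running if the model never calls a Tool. The sketch below shows the same "each Tool only once" idea with a cap on the number of turns. To stay runnable without a model, the call to the LLM is replaced by a scripted function; run_tool_loop, the toy Tools, and max_steps are all assumptions of this example.

```python
def run_tool_loop(choose_tool, tools: dict, max_steps: int = 5) -> str:
    """Let a model use each Tool at most once, with a cap on the number of turns."""
    available = dict(tools)          # copy, so the caller's dictionary is untouched
    context = ""
    for _ in range(max_steps):
        if not available:
            break                    # every Tool has been used once
        name = choose_tool(list(available), context)   # stand-in for the model call
        func = available.pop(name, None)               # unknown names are ignored
        if func is not None:
            context += f"\nTool used: {name}. Output: {func()}"
    return context.strip()

# Toy Tools and two scripted "models", for illustration only
toy_tools = {"get_tables": lambda: "titanic",
             "get_schema": lambda: "PassengerId, Survived"}

def pick_first(names, context):
    return names[0]                  # always calls the first available Tool

def stubborn(names, context):
    return "no_such_tool"            # never calls a valid Tool

full_context = run_tool_loop(pick_first, toy_tools)
empty_context = run_tool_loop(stubborn, toy_tools)
print(full_context)
print(repr(empty_context))
```

With the cap in place, a model that refuses to use its Tools produces an empty context after max_steps turns instead of blocking the whole workflow.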
The Junior Agent activated all its Tools to explore the database and collected the necessary information to generate some SQL code. Now, it must report back to the Lead.
        ## update Lead Agent
        context = "Junior already wrote this query: "+res+ "\nNow invoke the Senior to review and execute the code."
        print("👩💼 >", f"\x1b[1;30m{context}\x1b[0m")
        messages_lead.append( {"role":"user", "content":context} )

        agent_res = ollama.chat(model=llm, messages=messages_lead, tools=[tool_invoke_agent])
        dic_res = use_tool(agent_res, dic_tools)
        res, tool_used, inputs_used = dic_res["res"], dic_res["tool_used"], dic_res["inputs_used"]
        agent_invoked = res.split("-")[0].strip() if len(res.split("-")) > 1 else ''
        instructions = res.split("-")[1].strip() if len(res.split("-")) > 1 else ''
The Lead Agent received the output from the Junior and asked the Senior Agent to review and execute the SQL query.
    ## Invoke Senior Agent
    if agent_invoked == "senior":
        print("🧓 >", f"\x1b[1;34mReceived instructions: {instructions}\x1b[0m")
        messages_senior.append( {"role":"user", "content":instructions} )

        ### use the tools
        available_tools = {"sql_check":tool_sql_check, "sql_exec":tool_sql_exec}
        context = ''
        while available_tools:
            agent_res = ollama.chat(model=llm, messages=messages_senior,
                                    tools=[v for v in available_tools.values()])
            dic_res = use_tool(agent_res, dic_tools)
            res, tool_used, inputs_used = dic_res["res"], dic_res["tool_used"], dic_res["inputs_used"]
            if tool_used:
                available_tools.pop(tool_used)
            context = context + f"\nTool used: {tool_used}. Output: {res}" #->add tool usage context
            messages_senior.append( {"role":"user", "content":context} )

        ### response
        print("🧓 >", f"\x1b[1;34m{res}\x1b[0m")
        messages_senior.append( {"role":"assistant", "content":res} )
The Senior Agent executed the query on the db and got an answer. Finally, it can report back to the Lead which will give the final answer to the user.
        ### update Lead Agent
        context = "Senior agent returned this output: "+res
        print("👩💼 >", f"\x1b[1;30m{context}\x1b[0m")
        messages_lead.append( {"role":"user", "content":context} )
Conclusion
This article has covered the basic steps of creating Multi-Agent Systems from scratch using only Ollama. With these building blocks in place, you are already equipped to start developing your own MAS for different use cases.
Stay tuned for Part 4, where we will dive deeper into more advanced examples.
Full code for this article: GitHub
I hope you enjoyed it! Feel free to contact me for questions and feedback or just to share your interesting projects.
All images, unless otherwise noted, are by the author