The OP has probably resolved this error long ago, but for anyone else coming to this question with a similar problem: the error means exactly what it says. Some of the input variables that the prompt expects were never filled in by the chain. To resolve the error, pass the correct variables to ChatPromptTemplate.
In the OP's particular example, the first object in the LCEL chain passes the input query down the chain under the key "question", but the ChatPromptTemplate expects the user input under the name "input"; these must share a common name. Moreover, since documents are retrieved from a vector store retriever, they should also be passed into the prompt template as an input variable. The following prompt template, whose expected input variables are "question" and "context", solves that issue.
Lastly, if you want the ChatPromptTemplate to expect the "agent_scratchpad" key, it should also be passed down from the immediately preceding dict. However, note that "agent_scratchpad" is a special key under which intermediate steps in an AgentExecutor are stored. If you just have an LLM chain, this key will not be useful, so you can remove it from the prompt template altogether.
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant who retrieves information from documents."),
        ("user", "{question}\nContext: {context}"),  # <--- "question" and "context" are filled in here
    ]
)
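For a plain RAG chain (no agent), a minimal sketch of how the retriever output could be wired into "context" might look like the following; retriever and llm are assumed to already exist (e.g. retriever = vectorstore.as_retriever()), so adjust the names to your setup:

from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# "retriever" and "llm" are assumed to exist already, e.g.
# retriever = vectorstore.as_retriever() and llm = ChatOpenAI(...)
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}  # the query string goes to both
    | prompt
    | llm
    | StrOutputParser()
)
print(rag_chain.invoke("What do the documents say about apples?"))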
However, if you do have an AgentExecutor with its tools, a function-calling LLM, etc., then you need to pass the "agent_scratchpad" key, whose value should be a function that formats the tool messages; essentially, all intermediate steps are combined into a single list of messages. The following is an example agent executor equipped with tools to add and subtract integers.
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.agents.format_scratchpad.tools import format_to_tool_messages
from langchain.agents import AgentExecutor
from langchain_core.tools import tool
from langchain.agents.output_parsers.openai_tools import OpenAIToolsAgentOutputParser
from dotenv import load_dotenv

load_dotenv()  # load OPENAI_API_KEY from a .env file

@tool
def add(x: int, y: int) -> int:
    """Add two numbers together."""
    return x + y

@tool
def sub(x: int, y: int) -> int:
    """Subtract y from x."""
    return x - y

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant."),
        ("user", "{question}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),  # <--- intermediate steps are injected here
    ]
)

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

rag_pipe = (
    {
        "question": lambda x: x["input"],
        # format the (action, observation) tuples into a single list of tool messages
        "agent_scratchpad": lambda x: format_to_tool_messages(x["intermediate_steps"]),
    }
    | prompt
    | llm.bind_tools([add, sub])  # expose the tools to the model
    | OpenAIToolsAgentOutputParser()
)

agent_executor = AgentExecutor(agent=rag_pipe, tools=[add, sub], return_intermediate_steps=True)

text = "I had 12 apples. I ate 2 but got 9 from Jane. Then I sold 7 of them. How many apples do I have now?"
result_dict = agent_executor.invoke({"input": text})
ans = result_dict["output"]
print(ans)
In the above code, if you print result_dict["intermediate_steps"], you can see why "agent_scratchpad" was necessary: it shows which actions were taken to reach the final output and how the intermediate tool calls are stored.
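Each intermediate step is stored as an (AgentAction, observation) tuple, so you can inspect the tool calls like this:

for action, observation in result_dict["intermediate_steps"]:
    print(action.tool, action.tool_input, "->", observation)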
The error occurs because the second object in the LLM chain (rag_pipe) is a prompt object that accepts a dict whose keys must match the prompt's input variables; the dict's values are then filled into the prompt to create a list of messages. In that sense, the workflow is similar to the following:
s = "{input} {agent_scratchpad}"
d = {"input": 1, "agent_scratchpad": "value"}
v = s.format(**d)
In this example, if you pass d = {"question": 1, "context": "value"} instead, you'll get a KeyError because the dict keys don't match the placeholders in s. The error in the title is akin to this case.
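To see the failure concretely:

d = {"question": 1, "context": "value"}
v = s.format(**d)  # raises KeyError: 'input'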