...
A description of its purpose that will allow other AI agents to identify and collaborate with it.
Instructions indicating the context, the objectives of the AI agent, the data sources to query, the tasks to be performed, the examples to be used, the checks to be carried out, etc.
The use of a small or large multimodal generative model (commonly called a Large Language Model or LLM) to interpret the instructions and generate the result.
The multimodal generative model will process an input that a human will produce in a conversational interaction.
Similarly, the generative model will produce an output that will be evaluated by the human in co-pilot mode. This output can also be shared with another AI agent.
The agent will rely on the short-term memory of the multimodal generative model and on its long-term memory, which is persistent and can be used to personalize future interactions, as well as on planning, decision-making, and reasoning capabilities.
Access to tools, which may include access to enterprise applications and search engines. The multimodal generative model will decide which tools to use and in what sequence to achieve its objective.
Access to external data sources in addition to the training data of the model.
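To make these characteristics concrete, here is a minimal sketch of how they might fit together in code. It is an illustration under stated assumptions, not a reference implementation: the `Agent` class, the `fake_model` stand-in for a real generative model, the `search` tool, and the `TOOL <name> <argument>` reply convention are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    description: str                      # lets other agents identify this one
    instructions: str                     # context, objectives, checks, etc.
    model: Callable[[str], str]           # stand-in for a generative model call
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    short_term: List[str] = field(default_factory=list)   # current conversation
    long_term: Dict[str, str] = field(default_factory=dict)  # persists across runs

    def _prompt(self) -> str:
        # Instructions plus the conversation so far form the model input.
        return "\n".join([self.instructions, *self.short_term])

    def run(self, user_input: str) -> str:
        self.short_term.append(f"user: {user_input}")
        decision = self.model(self._prompt())
        # Assumed convention: the model requests a tool with "TOOL <name> <argument>".
        if decision.startswith("TOOL "):
            _, name, argument = decision.split(" ", 2)
            result = self.tools[name](argument)
            self.short_term.append(f"tool {name}: {result}")
            decision = self.model(self._prompt())
        self.short_term.append(f"agent: {decision}")
        self.long_term["last_request"] = user_input  # kept for future interactions
        return decision


def fake_model(prompt: str) -> str:
    # Hypothetical model: asks for a search first, then answers from the result.
    if "tool search:" in prompt:
        return "ANSWER based on " + prompt.splitlines()[-1]
    return "TOOL search quarterly revenue"


def search(query: str) -> str:
    # Hypothetical tool standing in for an enterprise application or search engine.
    return f"3 documents found for '{query}'"


agent = Agent(
    description="Answers finance questions",
    instructions="You are a finance assistant.",
    model=fake_model,
    tools={"search": search},
)
reply = agent.run("What was revenue last quarter?")
print(reply)
```

In this sketch the model, not the surrounding code, chooses whether a tool is needed, which mirrors the idea that the generative model decides which tools to use and in what sequence. A real agent would replace `fake_model` with a call to an actual model and would loop over several tool calls rather than at most one.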
...