{"id":663038,"date":"2024-11-14T16:22:42","date_gmt":"2024-11-14T10:52:42","guid":{"rendered":"https:\/\/www.digit.in\/?p=663038"},"modified":"2024-11-14T16:22:46","modified_gmt":"2024-11-14T10:52:46","slug":"ai-agents-explained-why-openai-google-and-microsoft-are-building-smarter-ai-agents","status":"publish","type":"post","link":"https:\/\/www.digit.in\/features\/general\/ai-agents-explained-why-openai-google-and-microsoft-are-building-smarter-ai-agents.html","title":{"rendered":"AI agents explained: Why OpenAI, Google and Microsoft are building smarter AI agents"},"content":{"rendered":"\n<p>If 2022 was the birth of AI chatbots as we know them, thanks to OpenAI\u2019s ChatGPT, then by all indications 2025 will see a lot of AI agents coming out into the open from their current secretive research bubble \u2013 and no, I\u2019m not talking about the garden variety agents referenced in <a href=\"https:\/\/en.wikipedia.org\/wiki\/Agent_(The_Matrix)\">The Matrix<\/a>! How they will change our world is anyone\u2019s guess at this point, but the dawn of AI agents certainly promises to inject some excitement into an AI landscape that\u2019s becoming \u2013 dare I say \u2013 more drab by the day.<\/p>\n\n\n\n<p>In the last two years, the world has seen breakneck advancement in the Generative AI space, from text-to-text and text-to-image to text-to-video capabilities. And all of that has been a stepping stone to the next big AI breakthrough \u2013 AI agents. 
According to <a href=\"https:\/\/www.bloomberg.com\/news\/articles\/2024-11-13\/openai-nears-launch-of-ai-agents-to-automate-tasks-for-users\">Bloomberg<\/a>, OpenAI is preparing to launch its first autonomous AI agent, which is codenamed \u2018Operator,\u2019 as early as January 2025.\u00a0<\/p>\n\n\n\n<p>Also read: <a href=\"https:\/\/www.digit.in\/features\/general\/prithvi-nasa-ibm-free-ai-model-for-better-weather-prediction.html\">Meet Prithvi, NASA &amp; IBM\u2019s free AI model for better weather prediction<\/a><\/p>\n\n\n\n<p>Apparently, this OpenAI agent \u2013 or Operator, as it\u2019s codenamed \u2013 is designed to perform complex tasks independently. By understanding user commands through voice or text, this AI agent will seemingly be able to control different applications on the computer, send emails, book flights, and no doubt do other cool things. Stuff that ChatGPT, Copilot, Google Gemini or any other LLM-based chatbot just can\u2019t do on its own. Knowing full well that I\u2019m getting way ahead of myself, are you ready for J.A.R.V.I.S., Tony Stark&#8217;s intelligent AI assistant from Iron Man, or Samantha from Her, a significantly more advanced AI operating system than what we\u2019ve experienced till now?<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"what-are-ai-agents\">What are AI agents<\/h2>\n\n\n\n<p>Simply put, an AI agent is a slightly more advanced AI program that can perform certain autonomous tasks that aren\u2019t just limited to its own base program. ChatGPT or Gemini can write code for you if you ask for it, but neither can go on to turn that code into a live website with its own domain name or an app published on an app store. 
An AI agent will be able to do these things \u2013 I\u2019m not saying these exact tasks that I suggested above, but these AI agents will have the ability to not just show what needs to be done but also go ahead and do some of that work.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/static.digit.in\/IBM-AI-Agent.png\"><img decoding=\"async\" width=\"1024\" height=\"506\" src=\"https:\/\/static.digit.in\/IBM-AI-Agent-1024x506.png\" alt=\"\" class=\"wp-image-663055\" srcset=\"https:\/\/static.digit.in\/IBM-AI-Agent-1024x506.png 1024w, https:\/\/static.digit.in\/IBM-AI-Agent-300x148.png 300w, https:\/\/static.digit.in\/IBM-AI-Agent-768x379.png 768w, https:\/\/static.digit.in\/IBM-AI-Agent-1536x758.png 1536w, https:\/\/static.digit.in\/IBM-AI-Agent-2048x1011.png 2048w, https:\/\/static.digit.in\/IBM-AI-Agent-304x150.png 304w, https:\/\/static.digit.in\/IBM-AI-Agent-100x49.png 100w, https:\/\/static.digit.in\/IBM-AI-Agent-709x350.png 709w, https:\/\/static.digit.in\/IBM-AI-Agent-788x389.png 788w, https:\/\/static.digit.in\/IBM-AI-Agent-600x296.png 600w, https:\/\/static.digit.in\/IBM-AI-Agent-150x74.png 150w, https:\/\/static.digit.in\/IBM-AI-Agent.png 1183w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n\n\n\n<p>According to Amazon\u2019s official <a href=\"https:\/\/aws.amazon.com\/what-is\/ai-agents\/\">AWS blog<\/a>, humans will set broad goals for any given AI-related task, where the AI agent will independently choose the best actions it needs to perform to achieve those goals. Amazon further explains how in a customer service scenario, a future AI agent will automatically try to satisfy a calling customer\u2019s query \u2013 by looking up internal information, by asking different questions to the human customer, by taking stock of the situation and responding with a solution that solves the calling customer\u2019s problem. 
In this scenario, the AI agent handles the customer\u2019s call on its own \u2013 without passing the call to a human customer support expert. In fact, whether or not to transfer a call to a human customer support expert is determined automatically by the AI agent.<\/p>\n\n\n\n<p>AI agents will be superior to simple AI chatbots thanks to their advanced reasoning capabilities, suggests <a href=\"https:\/\/www.ibm.com\/think\/topics\/ai-agents\">IBM\u2019s blog post on AI agents<\/a>. Unlike traditional AI chatbots like ChatGPT or Gemini, which give highly scripted responses to user queries, AI agents will have the ability to plan, think through and adapt to new information, enabling them to handle much more complex tasks with minimal human intervention or supervision.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"difference-between-ai-agents-and-ai-chatbots\">Difference between AI agents and AI chatbots<\/h2>\n\n\n\n<p>There\u2019s a lot of sophistication baked into AI agents, which AI chatbots simply don\u2019t have. One way to look at AI chatbots is that they\u2019re knowledgeable in all the theories of various subjects, whereas AI agents not only have the knowledge but also the expertise to apply their learnings in different applications. Given below are three key differences between AI agents and chatbots\u2026<\/p>\n\n\n\n<p>Being autonomous is the name of the game here. AI agents are self-directed, capable of making their own decisions based on given human instructions by carrying out tasks like scheduling meetings or managing emails \u2013 without the need for constant human intervention. 
This is in stark contrast to AI chatbots like ChatGPT or Gemini that rely on constant user prompts to generate responses, where they lack the ability to initiate actions on their own independently.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/static.digit.in\/Nvidia-Apple-and-Microsoft-in-talks-to-invest-in-OpenAI.png\"><img decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/static.digit.in\/Nvidia-Apple-and-Microsoft-in-talks-to-invest-in-OpenAI-1024x576.png\" alt=\"OpenAI\" class=\"wp-image-618389\" srcset=\"https:\/\/static.digit.in\/Nvidia-Apple-and-Microsoft-in-talks-to-invest-in-OpenAI-1024x576.png 1024w, https:\/\/static.digit.in\/Nvidia-Apple-and-Microsoft-in-talks-to-invest-in-OpenAI-300x169.png 300w, https:\/\/static.digit.in\/Nvidia-Apple-and-Microsoft-in-talks-to-invest-in-OpenAI-768x432.png 768w, https:\/\/static.digit.in\/Nvidia-Apple-and-Microsoft-in-talks-to-invest-in-OpenAI-1536x864.png 1536w, https:\/\/static.digit.in\/Nvidia-Apple-and-Microsoft-in-talks-to-invest-in-OpenAI-2048x1152.png 2048w, https:\/\/static.digit.in\/Nvidia-Apple-and-Microsoft-in-talks-to-invest-in-OpenAI-267x150.png 267w, https:\/\/static.digit.in\/Nvidia-Apple-and-Microsoft-in-talks-to-invest-in-OpenAI-100x56.png 100w, https:\/\/static.digit.in\/Nvidia-Apple-and-Microsoft-in-talks-to-invest-in-OpenAI-622x350.png 622w, https:\/\/static.digit.in\/Nvidia-Apple-and-Microsoft-in-talks-to-invest-in-OpenAI-788x443.png 788w, https:\/\/static.digit.in\/Nvidia-Apple-and-Microsoft-in-talks-to-invest-in-OpenAI-599x337.png 599w, https:\/\/static.digit.in\/Nvidia-Apple-and-Microsoft-in-talks-to-invest-in-OpenAI-150x84.png 150w, https:\/\/static.digit.in\/Nvidia-Apple-and-Microsoft-in-talks-to-invest-in-OpenAI.png 1280w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><figcaption class=\"wp-element-caption\">OpenAI<\/figcaption><\/figure>\n\n\n\n<p>Their ability to break down complex tasks and execute is another key differentiator. 
AI agents are equipped to tackle complex, multi-faceted tasks by drawing on information from diverse sources and making informed decisions. On the other hand, AI chatbots are generally restricted to providing information or answering queries based on their pre-existing trained knowledge base.<\/p>\n\n\n\n<p>According to experts, another key point of difference is the following: AI agents have the ability to learn from experience and adapt their behaviour over time to match a set of assigned tasks, enhancing their performance in ever-changing conditions. AI chatbots, however, typically lack this level of adaptability and are unable to learn anything beyond what\u2019s already in their existing knowledge base. These are some of the top differences between AI agents and chatbots as we know them.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"different-type-of-ai-agents\">Different types of AI agents<\/h2>\n\n\n\n<p>Just as different AI chatbots have varying levels of competency across different tasks, AI agents come in all shapes and sizes \u2013 in a manner of speaking, of course. An AI agent can be simple or complex, depending on its programming and the quantum of tasks it\u2019s expected to execute. Given this scope, here\u2019s how AI agents are being classified into three main types.<\/p>\n\n\n\n<p>Also read: <a href=\"https:\/\/www.digit.in\/features\/general\/slm-vs-llm-why-smaller-gen-ai-models-maybe-better.html\">SLM vs LLM: Why smaller Gen AI models are better<\/a><\/p>\n\n\n\n<p>Firstly, there are so-called goal-based agents, which are designed to achieve specific objectives by evaluating various action sequences and selecting the most effective path to reach their goals. Unlike simple agents, goal-based agents carefully consider future outcomes and plan their actions accordingly. 
An example of this goal-based AI agent is a navigation system that identifies the fastest route to a destination by analysing multiple pathways and selecting the one that minimises travel time.<\/p>\n\n\n\n<p>After goal-based agents come what\u2019s known as utility-based agents, which extend the functionality of goal-based agents by not only aiming to achieve a goal but also optimising the quality of its final intended outcome. These AI agents use a utility function to assign a value to each potential outcome, thereby choosing actions that maximise overall satisfaction or performance of any given task. This approach is especially useful when multiple paths can lead to the same goal, allowing the AI agent to select the most advantageous one based on predefined criteria. Imagine a travel booking system that recommends flights not only based on reaching the destination but also considering factors like ticket price, travel time, and layovers to provide the most cost-effective and convenient option \u2013 this is what a utility-based AI agent will be able to perform as part of its tasks.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/static.digit.in\/anthropic-ai-agent.png\"><img decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/static.digit.in\/anthropic-ai-agent-1024x576.png\" alt=\"\" class=\"wp-image-663056\" srcset=\"https:\/\/static.digit.in\/anthropic-ai-agent-1024x576.png 1024w, https:\/\/static.digit.in\/anthropic-ai-agent-300x169.png 300w, https:\/\/static.digit.in\/anthropic-ai-agent-768x432.png 768w, https:\/\/static.digit.in\/anthropic-ai-agent-1536x864.png 1536w, https:\/\/static.digit.in\/anthropic-ai-agent-2048x1152.png 2048w, https:\/\/static.digit.in\/anthropic-ai-agent-267x150.png 267w, https:\/\/static.digit.in\/anthropic-ai-agent-100x56.png 100w, https:\/\/static.digit.in\/anthropic-ai-agent-622x350.png 622w, https:\/\/static.digit.in\/anthropic-ai-agent-788x443.png 788w, 
https:\/\/static.digit.in\/anthropic-ai-agent-599x337.png 599w, https:\/\/static.digit.in\/anthropic-ai-agent-150x84.png 150w, https:\/\/static.digit.in\/anthropic-ai-agent.png 1280w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n\n\n\n<p>Finally, there are learning agents. These AI agents possess the ability to improve their performance over time by learning from experience. By continuously interacting with their environment and incorporating feedback, learning agents adapt to new situations and refine their decision-making processes, making them suitable for dynamic and complex domains. There are also hierarchical agents, which are simply groups of AI agents arranged in multiple tiers. In such a hierarchical structure, higher-level agents break down complex tasks and assign them to individual lower-level AI agents. These lower-level agents run their tasks independently and hand over their results to the higher-level agents up the chain.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"who-all-are-developing-ai-agents\">Who is developing AI agents<\/h2>\n\n\n\n<p>All the big movers and shakers of the AI industry are planning to release their own AI agents to the public in 2025, if they haven\u2019t done so already by late 2024.<\/p>\n\n\n\n<p>As I mentioned earlier, OpenAI is reportedly working on getting its AI agent, codenamed <a href=\"https:\/\/www.theverge.com\/2024\/11\/13\/24295879\/openai-agent-operator-autonomous-ai\">Operator<\/a>, out into the open for everyone to check out by January 2025. The AI agent is expected to be capable of autonomously carrying out certain tasks on your computer, like booking flight tickets and writing code, among other things. 
Google seems to be working on several AI agent projects, one of which is known as Project Jarvis, which recently leaked on the <a href=\"https:\/\/timesofindia.indiatimes.com\/technology\/tech-news\/googles-project-jarvis-leaked-on-chrome-web-store-how-this-ai-agent-can-take-over-web-browsing\/articleshow\/115091151.cms\">Chrome Web Store<\/a>. This AI agent will supposedly reside within the Google Chrome browser, with the ability to not only automate and execute tasks within the browser but also operate other apps on the host computer. While there\u2019s no set release date for Project Jarvis yet, Google\u2019s Gemini 2.0 AI model is expected to have AI agents built in to offer enhanced capabilities \u2013 it\u2019s expected to release later in 2024.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/static.digit.in\/gemini-ai-agent.png\"><img decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/static.digit.in\/gemini-ai-agent-1024x576.png\" alt=\"\" class=\"wp-image-663057\" srcset=\"https:\/\/static.digit.in\/gemini-ai-agent-1024x576.png 1024w, https:\/\/static.digit.in\/gemini-ai-agent-300x169.png 300w, https:\/\/static.digit.in\/gemini-ai-agent-766x431.png 766w, https:\/\/static.digit.in\/gemini-ai-agent-1536x864.png 1536w, https:\/\/static.digit.in\/gemini-ai-agent-2048x1152.png 2048w, https:\/\/static.digit.in\/gemini-ai-agent-267x150.png 267w, https:\/\/static.digit.in\/gemini-ai-agent-100x56.png 100w, https:\/\/static.digit.in\/gemini-ai-agent-622x350.png 622w, https:\/\/static.digit.in\/gemini-ai-agent-788x443.png 788w, https:\/\/static.digit.in\/gemini-ai-agent-599x337.png 599w, https:\/\/static.digit.in\/gemini-ai-agent-150x84.png 150w, https:\/\/static.digit.in\/gemini-ai-agent.png 1300w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n\n\n\n<p>Microsoft is also working aggressively on AI agents, something that it announced earlier in 2024. 
According to its <a href=\"https:\/\/blogs.microsoft.com\/blog\/2024\/10\/21\/new-autonomous-agents-scale-your-team-like-never-before\/\">official blog<\/a>, new capabilities in Copilot Studio will allow Microsoft customers to create powerful autonomous AI agents. Some of these capabilities are in public preview at the moment, where AI agents can draw upon work or business data from different Microsoft Office 365 apps to undertake a variety of assistive tasks \u2013 like IT help desk, employee onboarding, coordinating sales and service, and more.<\/p>\n\n\n\n<p>Anthropic, an AI startup competing with OpenAI, has already released its AI agent for people to try. According to <a href=\"https:\/\/techcrunch.com\/2024\/10\/22\/anthropics-new-ai-can-control-your-pc\/\">TechCrunch<\/a>, Anthropic has made significant upgrades to its Claude 3.5 Sonnet AI model, which can now use the host computer \u2013 yes, it can interact with computers in a way that mimics humans. It can move the cursor around the screen, click on apps and buttons, and potentially interact with other software installed on your PC to autonomously execute various tasks. How scary and cool is that?!<\/p>\n\n\n\n<p>If the world wasn\u2019t prepared for Generative AI back in 2022, then let me tell you it\u2019s certainly not prepared for AI agents and all the various ways they can impact our lives \u2013 for better or worse. 
Let\u2019s hope these AI agents don\u2019t turn out to be the dystopian versions as depicted in an iconic movie 25 years ago, for your sake and mine, eh?<\/p>\n\n\n\n<p>Also read: <a href=\"https:\/\/www.digit.in\/features\/general\/meta-ai-manifesto-the-ai-assisted-resurrection-of-mark-zuckerberg.html\">Meta AI manifesto: The AI-assisted resurrection of Mark Zuckerberg<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>If 2022 was the birth of AI chatbots as we know it, thanks to OpenAI\u2019s ChatGPT, then by all indications 2025 will see a lot of AI agents coming out into the open from their current secretive research bubble \u2013 and no I\u2019m not talking about the garden variety agents referenced in The Matrix! How [&hellip;]<\/p>\n","protected":false},"author":1934,"featured_media":663045,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_gspb_post_css":"","footnotes":""},"categories":[186989],"tags":[217587,222570,213064,241459],"contenttype":[205],"digitlang":[165350],"dealstore":[],"offerexpiration":[],"acf":[],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.digit.in\/wp-json\/wp\/v2\/posts\/663038"}],"collection":[{"href":"https:\/\/www.digit.in\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.digit.in\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.digit.in\/wp-json\/wp\/v2\/users\/1934"}],"replies":[{"embeddable":true,"href":"https:\/\/www.digit.in\/wp-json\/wp\/v2\/comments?post=663038"}],"version-history":[{"count":2,"href":"https:\/\/www.digit.in\/wp-json\/wp\/v2\/posts\/663038\/revisions"}],"predecessor-version":[{"id":663058,"href":"https:\/\/www.digit.in\/wp-json\/wp\/v2\/posts\/663038\/revisions\/663058"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.digit.in\/wp-json\/wp\/v2\/media\/663045"}],"wp:attachment":[{"href":"https:\/\/www.digit.in\/wp-json\/wp\/v2\/media?parent=663038"}],"wp:term":[{"taxonomy":"ca
tegory","embeddable":true,"href":"https:\/\/www.digit.in\/wp-json\/wp\/v2\/categories?post=663038"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.digit.in\/wp-json\/wp\/v2\/tags?post=663038"},{"taxonomy":"contenttype","embeddable":true,"href":"https:\/\/www.digit.in\/wp-json\/wp\/v2\/contenttype?post=663038"},{"taxonomy":"digitlang","embeddable":true,"href":"https:\/\/www.digit.in\/wp-json\/wp\/v2\/digitlang?post=663038"},{"taxonomy":"dealstore","embeddable":true,"href":"https:\/\/www.digit.in\/wp-json\/wp\/v2\/dealstore?post=663038"},{"taxonomy":"offerexpiration","embeddable":true,"href":"https:\/\/www.digit.in\/wp-json\/wp\/v2\/offerexpiration?post=663038"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}