From 8b1d2fa78bcf1efb9ec9119a6c4a69221795d060 Mon Sep 17 00:00:00 2001
From: Mandeep Singh
Date: Tue, 27 May 2025 13:17:29 -0700
Subject: [PATCH 1/8] Initial draft

---
 .../AI_Research_Assistant_Cookbook.ipynb      | 352 ++++++++++++++++++
 1 file changed, 352 insertions(+)
 create mode 100644 examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb

diff --git a/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb b/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb
new file mode 100644
index 0000000000..4b8e11f9c9
--- /dev/null
+++ b/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb
@@ -0,0 +1,352 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "85b66af9",
+   "metadata": {},
+   "source": [
+    "# Building an **AI Research Assistant** with the OpenAI Agents SDK\n",
+    "\n",
+    "This notebook provides a reference pattern for implementing a multi‑agent AI Research Assistant that can plan, search, curate, and draft high‑quality reports with citations.\n",
+    "\n",
+    "While the Deep Research feature is available in ChatGPT, individuals and companies may want to implement their own API-based solution for more fine-grained control over the output.\n",
+    "\n",
+    "With support for agents and built-in tools such as Code Interpreter, Web Search, and File Search, the Responses API makes building your own Research Assistant fast and easy. "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0dcd3942",
+   "metadata": {},
+   "source": [
+    "## Table of Contents\n",
+    "1. [Overview](#overview)\n",
+    "2. [Solution Workflow](#workflow)\n",
+    "3. [High‑Level Architecture](#architecture)\n",
+    "4. [Agent Definitions (Pseudo Code)](#agents)\n",
+    "   * Research Planning Agent\n",
+    "   * Web Search Agent\n",
+    "   * Knowledge Assistant Agent\n",
+    "   * Report Creation Agent\n",
+    "   * Data Analysis Agent (optional)\n",
+    "   * Image‑Gen Agent (optional)\n",
+    "5. [Guardrails & Best Practices](#best-practices)\n",
+    "6. [Risks & Mitigation](#risks)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a32e358e",
+   "metadata": {},
+   "source": [
+    "### 1 — Overview \n",
+    "The AI Research Assistant helps drive better research quality and faster turnaround for knowledge content.\n",
+    "\n",
+    "1. **Performs autonomous Internet research** to gather the most recent sources.\n",
+    "2. **Incorporates internal data sources** such as a company's proprietary knowledge sources. \n",
+    "3. **Reduces analyst effort from days to minutes** by automating search, curation, and first‑draft writing.\n",
+    "4. **Produces draft reports with citations** and built‑in hallucination detection."
+ ] + }, + { + "cell_type": "markdown", + "id": "33cb6ce3", + "metadata": {}, + "source": [ + "### 2 — Solution Workflow \n", + "The typical workflow consists of five orchestrated steps: \n", + "\n", + "| Step | Purpose | Model |\n", + "|------|---------|-------|\n", + "| **Query Expansion** | Draft multi‑facet prompts / hypotheses | `gpt‑4o` |\n", + "| **Search‑Term Generation** | Expand/clean user query into rich keyword list | `gpt‑4o` |\n", + "| **Conduct Research** | Run web & internal searches, rank & summarise results | `gpt‑4o` + tools |\n", + "| **Draft Report** | Produce first narrative with reasoning & inline citations | `o1` / `gpt‑4o` |\n", + "| **Report Expansion** | Polish formatting, add charts / images / appendix | `gpt‑4o` + tools |" + ] + }, + { + "cell_type": "markdown", + "id": "dcb4e6dc", + "metadata": {}, + "source": [ + "### 3 — High‑Level Architecture \n", + "The following diagram groups agents and tools:\n", + "\n", + "* **Research Planning Agent** – interprets the user request and produces a research plan/agenda.\n", + "* **Knowledge Assistant Agent** – orchestrates parallel web & file searches via built‑in tools, curates short‑term memory.\n", + "* **Web Search Agent(s)** – perform Internet queries, deduplicate, rank and summarise pages.\n", + "* **Report Creation Agent** – consumes curated corpus and drafts the structured report.\n", + "* **(Optional) Data Analysis Agent** – executes code for numeric/CSV analyses via the Code Interpreter tool.\n", + "* **(Optional) Image‑Gen Agent** – generates illustrative figures.\n", + "\n", + "Input/output guardrails wrap user prompts and final content for policy, safety and citation checks." + ] + }, + { + "cell_type": "markdown", + "id": "d3464739", + "metadata": {}, + "source": [ + "### 4 — Pre-requisites \n", + "\n", + "Create a virual environment \n", + "\n", + "Install dependencies " + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "3a16ac1f", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Note: you may need to restart the kernel to use updated packages.\n" + ] + } + ], + "source": [ + "%pip install openai openai-agents --quiet" + ] + }, + { + "cell_type": "markdown", + "id": "69135215", + "metadata": {}, + "source": [ + "### 5 — Agents (Pseudo Code) \n", + "Below are skeletal class definitions illustrating how each agent’s policy and tool‑usage might look." + ] + }, + { + "cell_type": "markdown", + "id": "b9f3062e", + "metadata": {}, + "source": [ + "#### Step 1 - Query Expansion" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b576089c", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Draft a comprehensive research report analyzing the evolution and impact of artificial intelligence (AI) over the past five years. This report should investigate key trends that have emerged during this period, including advancements in machine learning models like GPT and BERT, the rise of AI in industries such as healthcare, finance, and autonomous vehicles, and the ethical considerations surrounding AI development and implementation. Delve into how these trends have influenced technological growth, business strategies, and regulatory measures globally. Evaluate the societal and economic implications of these advancements and provide insights into future directions AI might take. 
Use a variety of sources, including scholarly articles, industry reports, and expert interviews, to support your analysis and conclusions.\n" + ] + } + ], + "source": [ + "from agents import Agent, Runner\n", + "\n", + "query_expansion_agent = Agent(\n", + " name=\"Query Expansion Agent\",\n", + " instructions=\"\"\"You are a helpful agent who is given a research prompt from the user as input. \n", + " Your task is to expand the prompt into a more complete and actionable research prompt. Do not write the research \n", + " paper, just improve the prompt in about one paragraph. Only respond with the expanded prompt no qualifiers.\"\"\",\n", + " tools=[],\n", + " model=\"gpt-4o\", \n", + ")\n", + "\n", + "result = await Runner.run(query_expansion_agent, \"Draft a research report on latest trends in persona auto insurance in the US\")\n", + "\n", + "expanded_prompt = result.final_output \n", + "\n", + "print(expanded_prompt)" + ] + }, + { + "cell_type": "markdown", + "id": "6b1b10e7", + "metadata": {}, + "source": [ + "#### Step 2 - Web Search Terms " + ] + }, + { + "cell_type": "markdown", + "id": "725969cb", + "metadata": {}, + "source": [ + "Generate the web search terms. You can customize the number of search terms generated to a give level of depth. " + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "d3b4d4af", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Search_Queries=['Evolution of AI technology 2020-2025', 'Impact of machine learning models GPT and BERT on industries', 'AI advancements in healthcare and finance 2025', 'Ethics and AI development 2025', 'Future directions for artificial intelligence in various sectors']\n", + "(0, 'Evolution of AI technology 2020-2025')\n", + "(1, 'Impact of machine learning models GPT and BERT on industries')\n", + "(2, 'AI advancements in healthcare and finance 2025')\n", + "(3, 'Ethics and AI development 2025')\n", + "(4, 'Future directions for artificial intelligence in various sectors')\n" + ] + } + ], + "source": [ + "from pydantic import BaseModel\n", + "\n", + "class SearchTerms(BaseModel):\n", + " \"\"\"Structured output model for search-terms suggestions.\"\"\"\n", + " Search_Queries: list[str]\n", + "\n", + "\n", + "search_terms_agent = Agent(\n", + " name=\"Search Terms Agent\",\n", + " instructions=\"\"\"You are a helpful agent assigned a research task. Your job is to provide the top \n", + " 5 Search Queries relevant to the given topic in this year (2025). 
The output should be in JSON format.\n", + "\n", + " Example format provided below:\n", + " \n", + " {\n", + " \"Search_Queries\": [\n", + " \"Top ranked auto insurance companies US 2025 by market capitalization\",\n", + " \"Geico rates and comparison with other auto insurance companies\",\n", + " \"Insurance premiums of top ranked companies in the US in 2025\", \n", + " \"Total cost of insuring autos in US 2025\", \n", + " \"Top customer service feedback for auto insurance in 2025\"\n", + " ]\n", + " }\n", + " \n", + " \"\"\",\n", + " tools=[],\n", + " model=\"gpt-4o\", \n", + " output_type=SearchTerms,\n", + ")\n", + "\n", + "result = await Runner.run(search_terms_agent, expanded_prompt)\n", + "\n", + "search_terms_raw = result.final_output\n", + "\n", + "print(search_terms_raw)\n", + "\n", + "\n", + "for query in enumerate(search_terms_raw.Search_Queries):\n", + " print(f\"{query}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a477b6a8", + "metadata": {}, + "outputs": [], + "source": [ + "class KnowledgeAssistantAgent:\n", + " \"\"\"Curates short‑term memory of research snippets.\"\"\"\n", + " def run(self, web_snippets, file_snippets):\n", + " corpus = web_snippets + file_snippets\n", + " # Vector‑embed & cluster (pseudo)\n", + " # ...\n", + " # Return pruned, deduplicated corpus\n", + " return corpus[:50] # top‑N" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "de168943", + "metadata": {}, + "outputs": [], + "source": [ + "class ReportCreationAgent:\n", + " \"\"\"Drafts the first complete report with citations.\"\"\"\n", + " def run(self, curated_corpus, outline):\n", + " report = client.chat(\n", + " model='gpt-4o',\n", + " system_prompt='Write a research report following the outline. Cite sources in IEEE style.',\n", + " user_prompt=str({'outline': outline, 'corpus': curated_corpus}),\n", + " )\n", + " return report.text" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "40c6bbca", + "metadata": {}, + "outputs": [], + "source": [ + "# --- Orchestration skeleton ---\n", + "def generate_research_paper(topic: str):\n", + " plan = ResearchPlanningAgent().run(topic)\n", + "\n", + " web_results = WebSearchAgent().run(plan['search_terms'])\n", + " # TODO: file_results via file_search if internal corpus available\n", + " file_results = []\n", + "\n", + " curated = KnowledgeAssistantAgent().run(web_results, file_results)\n", + " draft = ReportCreationAgent().run(curated, plan['outline'])\n", + " return draft\n" + ] + }, + { + "cell_type": "markdown", + "id": "fb69c797", + "metadata": {}, + "source": [ + "### 5 — Guardrails & Best Practices \n", + "* **Crawl → Walk → Run**: start with a single agent, then expand into a swarm. \n", + "* **Expose intermediate reasoning** (“show the math”) to build user trust. \n", + "* **Parameterise UX** so analysts can tweak report format and source mix. \n", + "* **Native OpenAI tools first** (web browsing, file ingestion) before reinventing low‑level retrieval. 
" + ] + }, + { + "cell_type": "markdown", + "id": "1bdcab82", + "metadata": {}, + "source": [ + "### 6 — Risks & Mitigation \n", + "| Pitfall | Mitigation |\n", + "|---------|------------|\n", + "| Scope‑creep & endless roadmap | Narrow MVP & SMART milestones | fileciteturn1file4L23-L24 |\n", + "| Hallucinations & weak guardrails | Golden‑set evals, RAG with citation checks | fileciteturn1file4L25-L26 |\n", + "| Run‑away infra costs | Cost curve modelling; efficient models + autoscaling | fileciteturn1file4L27-L28 |\n", + "| Talent gaps | Upskill & leverage Agents SDK to offload core reasoning | fileciteturn1file4L29-L30 |" + ] + }, + { + "cell_type": "markdown", + "id": "5b40dcf3", + "metadata": {}, + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.13.1" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} From 0c7b7caff997568d5060c515f3b6a733328728ad Mon Sep 17 00:00:00 2001 From: Mandeep Singh Date: Fri, 6 Jun 2025 14:24:23 -0700 Subject: [PATCH 2/8] Utils and functionality --- .../AI_Research_Assistant_Cookbook.ipynb | 350 ++++++++++++------ .../query_expansion_agent.py | 100 +++++ .../web_page_summary_agent.py | 52 +++ .../web_search_terms_generation_agent.py | 77 ++++ .../guardrails/topic_content_guardrail.py | 57 +++ .../utils/web_search_and_util.py | 207 +++++++++++ examples/agents_sdk/web_search.py | 11 + 7 files changed, 748 insertions(+), 106 deletions(-) create mode 100644 examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/query_expansion_agent.py create mode 100644 examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/web_page_summary_agent.py create mode 100644 examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/web_search_terms_generation_agent.py create mode 100644 examples/agents_sdk/ai_research_assistant_resources/guardrails/topic_content_guardrail.py create mode 100644 examples/agents_sdk/ai_research_assistant_resources/utils/web_search_and_util.py create mode 100644 examples/agents_sdk/web_search.py diff --git a/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb b/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb index 4b8e11f9c9..1c60946f0c 100644 --- a/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb +++ b/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb @@ -97,18 +97,10 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": null, "id": "3a16ac1f", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Note: you may need to restart the kernel to use updated packages.\n" - ] - } - ], + "outputs": [], "source": [ "%pip install openai openai-agents --quiet" ] @@ -127,40 +119,107 @@ "id": "b9f3062e", "metadata": {}, "source": [ - "#### Step 1 - Query Expansion" + "#### Step 1 - Query Expansion\n", + "\n", + "The query expansion step ensures the subsequent agents conducting research have sufficient context of user's inquiry. \n", + "\n", + "The first step is to understand user's intent, and make sure the user has provided sufficinet details for subsequent agents to search the web, build a knowledge repository, and prepare a deepdive report. 
The agent defined in `query_expansion_agent.py` accomplishes this with a prompt that outlines the minimum information needed from the user to generate a report. This could include timeframe, industry, target audience, etc. The prompt can be tailored to the needs of your deep research assistant. The agent sets `is_task_clear` to yes or no; when it is no, it prompts the user with additional questions, and when sufficient information is available, it outputs the expanded prompt. \n",
+    "\n",
+    "This is also an opportunity to enforce input guardrails for any research topics that you'd like to restrict the user from researching based on your usage policies. "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f2618f60",
+   "metadata": {},
+   "source": [
+    "##### Input Guardrails with Agents SDK \n",
+    "Let's assume our fictitious guardrail is to prevent the user from generating a report on a non-AI-related topic. For this we will define a guardrail agent. The guardrail agent `topic_content_guardrail.py` checks whether the topic is related to AI; if not, it raises an exception. The function `ai_topic_guardrail` is passed to the `QueryExpansionAgent()` as `input_guardrails`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "620f9e40",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "🚫 Guardrail tripped – not an AI topic: The request is about trends in the luxury goods market, which is not focused on artificial intelligence.\n"
+     ]
+    }
+   ],
+   "source": [
+    "from ai_research_assistant_resources.agents_tools_registry.query_expansion_agent import QueryExpansionAgent\n",
+    "from agents import InputGuardrailTripwireTriggered\n",
+    "\n",
+    "query_expansion_agent_guardrail_check = QueryExpansionAgent()\n",
+    "\n",
+    "try:\n",
+    "\n",
+    "    result = await query_expansion_agent_guardrail_check.task(\"Write a research report on the latest trends in luxury goods market\")\n",
+    "\n",
+    "except InputGuardrailTripwireTriggered as e:\n",
+    "    reason = e.guardrail_result.output.output_info.reasoning\n",
+    "    #          └─────┬─────┘\n",
+    "    #      GuardrailFunctionOutput\n",
+    "    print(\"🚫 Guardrail tripped – not an AI topic:\", reason)\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "77364239",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "The task is not clear. The agent asks:\n",
+      " Could you please specify the timeframe you have in mind for the research report (e.g., current year, last 5 years, or another period)? Additionally, should the report focus on any specific geographic region or subfields within AI developments (e.g., machine learning, natural language processing) or cover the topic broadly?\n",
+      "\n",
+      "\n",
+      "user input: within the last 1 year, in the US and around ehtical AI development \n",
+      "\n",
+      "Expanded query:\n",
+      " Draft a research report that examines the latest trends in ethical AI development within the United States over the last year, providing an analysis of emerging practices, challenges, and regulatory considerations unique to this timeframe and region.\n"
+     ]
+    }
+   ],
+   "source": [
+    "from ai_research_assistant_resources.agents_tools_registry.query_expansion_agent import QueryExpansionAgent\n",
+    "\n",
+    "query_expansion_agent = QueryExpansionAgent()\n",
+    "\n",
+    "# Initial prompt to the agent\n",
+    "prompt: str = \"Draft a research report on the latest trends in AI developments\"\n",
+    "expanded_query = \"\" \n",
+    "\n",
+    "try: \n",
+    "\n",
+    "    while True:\n",
+    "        # Execute the agent with the current prompt\n",
+    "        result = await query_expansion_agent.task(prompt)\n",
+    "\n",
+    "        # When the task is clear, show the expanded query and exit.\n",
+    "        if result.is_task_clear == \"yes\":\n",
+    "            expanded_query = result.expanded_query\n",
+    "            print(\"\\nExpanded query:\\n\", expanded_query)\n",
+    "            break\n",
+    "\n",
+    "        # Otherwise, display the clarifying questions and ask the user for input.\n",
+    "        print(\"\\nThe task is not clear. The agent asks:\\n\", result.questions)\n",
+    "        prompt = input(\"Please provide the missing details so I can refine the query: \")\n",
+    "        print(\"\\n\")\n",
+    "        print(\"user input: \", prompt)\n",
+    "    \n",
+    "\n",
+    "except Exception as e:\n",
+    "    print(\"Non-AI topic guardrail tripped!\", e)"
+   ]
+  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "b576089c",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Draft a comprehensive research report analyzing the evolution and impact of artificial intelligence (AI) over the past five years. This report should investigate key trends that have emerged during this period, including advancements in machine learning models like GPT and BERT, the rise of AI in industries such as healthcare, finance, and autonomous vehicles, and the ethical considerations surrounding AI development and implementation. Delve into how these trends have influenced technological growth, business strategies, and regulatory measures globally. Evaluate the societal and economic implications of these advancements and provide insights into future directions AI might take. Use a variety of sources, including scholarly articles, industry reports, and expert interviews, to support your analysis and conclusions.\n"
-     ]
-    }
-   ],
-   "source": [
-    "from agents import Agent, Runner\n",
-    "\n",
-    "query_expansion_agent = Agent(\n",
-    "    name=\"Query Expansion Agent\",\n",
-    "    instructions=\"\"\"You are a helpful agent who is given a research prompt from the user as input. \n",
-    "    Your task is to expand the prompt into a more complete and actionable research prompt. Do not write the research \n",
-    "    paper, just improve the prompt in about one paragraph. 
Only respond with the expanded prompt no qualifiers.\"\"\",\n", - " tools=[],\n", - " model=\"gpt-4o\", \n", - ")\n", + "query_expansion_agent_guardrail_check = QueryExpansionAgent()\n", "\n", - "result = await Runner.run(query_expansion_agent, \"Draft a research report on latest trends in persona auto insurance in the US\")\n", + "try:\n", "\n", - "expanded_prompt = result.final_output \n", + " result = await query_expansion_agent_guardrail_check.task(\"Write a research report on the latest trends in luxury goods market\")\n", "\n", - "print(expanded_prompt)" + "except InputGuardrailTripwireTriggered as e:\n", + " reason = e.guardrail_result.output.output_info.reasoning\n", + " # └─────┬─────┘\n", + " # GuardrailFunctionOutput\n", + " print(\"🚫 Guardrail tripped – not an AI topic:\", reason)\n" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "77364239", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "The task is not clear. The agent asks:\n", + " Could you please specify the timeframe you have in mind for the research report (e.g., current year, last 5 years, or another period)? Additionally, should the report focus on any specific geographic region or subfields within AI developments (e.g., machine learning, natural language processing) or cover the topic broadly?\n", + "\n", + "\n", + "user input: within the last 1 year, in the US and around ehtical AI development \n", + "\n", + "Expanded query:\n", + " Draft a research report that examines the latest trends in ethical AI development within the United States over the last year, providing an analysis of emerging practices, challenges, and regulatory considerations unique to this timeframe and region.\n" + ] + } + ], + "source": [ + "from ai_research_assistant_resources.agents_tools_registry.query_expansion_agent import QueryExpansionAgent\n", + "\n", + "query_expansion_agent = QueryExpansionAgent()\n", + "\n", + "# Initial prompt to the agent\n", + "prompt: str = \"Draft a research report on the latest trends in AI developments\"\n", + "expanded_query = \"\" \n", + "\n", + "try: \n", + "\n", + " while True:\n", + " # Execute the agent with the current prompt\n", + " result = await query_expansion_agent.task(prompt)\n", + "\n", + " # When the task is clear, show the expanded query and exit.\n", + " if result.is_task_clear == \"yes\":\n", + " expanded_query = result.expanded_query\n", + " print(\"\\nExpanded query:\\n\", expanded_query)\n", + " break\n", + "\n", + " # Otherwise, display the clarifying questions and ask the user for input.\n", + " print(\"\\nThe task is not clear. The agent asks:\\n\", result.questions)\n", + " prompt = input(\"Please provide the missing details so I can refine the query: \")\n", + " print(\"\\n\")\n", + " print(\"user input: \", prompt)\n", + " \n", + "\n", + "except Exception as e:\n", + " print(\"Non-AI topic guardrail tripped!\", e)" ] }, { @@ -176,124 +235,129 @@ "id": "725969cb", "metadata": {}, "source": [ - "Generate the web search terms. You can customize the number of search terms generated to a give level of depth. " + "Conducting Web search is typically an integral part of the deep research process. First we generate web search terms relevant to the research report. In the next step we will search the web and build a knowledge repository of the data.\n", + "\n", + "The `WebSearchTermsGenerationAgent` takes as input the the expanded prompt, and generates succient search terms. 
You can structure the search-term generation prompt according to your users' typical requirements, such as including adjacent industries or competitors in the search terms. Additionally, you can control how much data you want to gather, e.g., the number of search terms to generate. In our case, we will limit it to 3 search terms. "
+   ]
+  },
   {
    "cell_type": "code",
-   "execution_count": 13,
-   "id": "d3b4d4af",
+   "execution_count": 5,
+   "id": "f15e0c10",
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "Search_Queries=['Evolution of AI technology 2020-2025', 'Impact of machine learning models GPT and BERT on industries', 'AI advancements in healthcare and finance 2025', 'Ethics and AI development 2025', 'Future directions for artificial intelligence in various sectors']\n",
-      "(0, 'Evolution of AI technology 2020-2025')\n",
-      "(1, 'Impact of machine learning models GPT and BERT on industries')\n",
-      "(2, 'AI advancements in healthcare and finance 2025')\n",
-      "(3, 'Ethics and AI development 2025')\n",
-      "(4, 'Future directions for artificial intelligence in various sectors')\n"
+      "1. Ethical AI development trends USA 2025\n",
+      "2. Challenges in AI ethics and regulations in 2025\n",
+      "3. Emerging AI practices and legal considerations in the US 2025\n"
      ]
     }
    ],
    "source": [
-    "from pydantic import BaseModel\n",
-    "\n",
-    "class SearchTerms(BaseModel):\n",
-    "    \"\"\"Structured output model for search-terms suggestions.\"\"\"\n",
-    "    Search_Queries: list[str]\n",
-    "\n",
-    "\n",
-    "search_terms_agent = Agent(\n",
-    "    name=\"Search Terms Agent\",\n",
-    "    instructions=\"\"\"You are a helpful agent assigned a research task. Your job is to provide the top \n",
-    "    5 Search Queries relevant to the given topic in this year (2025). The output should be in JSON format.\n",
-    "\n",
-    "    Example format provided below:\n",
-    "    \n",
-    "    {\n",
-    "    \"Search_Queries\": [\n",
-    "        \"Top ranked auto insurance companies US 2025 by market capitalization\",\n",
-    "        \"Geico rates and comparison with other auto insurance companies\",\n",
-    "        \"Insurance premiums of top ranked companies in the US in 2025\", \n",
-    "        \"Total cost of insuring autos in US 2025\", \n",
-    "        \"Top customer service feedback for auto insurance in 2025\"\n",
-    "    ]\n",
-    "    }\n",
-    "    \n",
-    "    \"\"\",\n",
-    "    tools=[],\n",
-    "    model=\"gpt-4o\", \n",
-    "    output_type=SearchTerms,\n",
-    ")\n",
-    "\n",
-    "result = await Runner.run(search_terms_agent, expanded_prompt)\n",
-    "\n",
-    "search_terms_raw = result.final_output\n",
-    "\n",
-    "print(search_terms_raw)\n",
-    "\n",
-    "\n",
-    "for query in enumerate(search_terms_raw.Search_Queries):\n",
-    "    print(f\"{query}\")"
+    "placeholder_query = \"Draft a research report that examines the latest trends in ethical AI development within the United States over the last year, providing an analysis of emerging practices, challenges, and regulatory considerations unique to this timeframe and region.\"\n",
+    "\n",
+    "from ai_research_assistant_resources.agents_tools_registry.web_search_terms_generation_agent import WebSearchTermsGenerationAgent\n",
+    "\n",
+    "search_terms_agent = WebSearchTermsGenerationAgent(3)\n",
+    "\n",
+    "result = await search_terms_agent.task(placeholder_query)\n",
+    "\n",
+    "search_terms_raw = result\n",
+    "\n",
+    "for i, query in enumerate(search_terms_raw.Search_Queries, start=1):\n",
+    "    print(f\"{i}. {query}\")"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "3feeaae8",
+   "metadata": {},
+   "source": [
+    "#### Step 3 - Scrape the web and build an inventory of data sources \n",
+    "\n",
+    "We will use custom web search to identify knowledge content to form the baseline for our report. You can learn more about building custom web search and retrieval here: 
[Building a Bring Your Own Browser (BYOB) Tool for Web Browsing and Summarization](https://cookbook.openai.com/examples/third_party/web_search_with_google_api_bring_your_own_browser_tool). You will also need a Google Custom Search API key and Custom Search Engine ID (CSE ID) in a .env file at the root. \n",
+    "\n",
+    "NOTE: The reason for using custom web search is to provide more fine-grained control over which information is retrieved, and to apply guardrails such as excluding competitors' content from your report. \n",
+    "\n",
+    "This is a 3-step process: \n",
+    "\n",
+    "1. Obtain the search results (top 10 pages)\n",
+    "2. Scrape the pages, and summarize the key points \n",
+    "3. Output guardrails to weed out irrelevant or undesirable results (e.g., the timeframe of the content doesn't align with the user's need, or it mentions a competitor)\n",
+    "\n",
+    "Prerequisite: `pip install nest_asyncio`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "7b7260c5",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "1. Ethical AI development trends USA 2025\n",
+      "2. Challenges in AI ethics and regulations in 2025\n",
+      "3. Emerging AI practices and legal considerations in the US 2025\n",
+      "Results written to research_results.json\n"
+     ]
+    }
+   ],
+   "source": [
+    "from ai_research_assistant_resources.utils.web_search_and_util import get_results_for_search_term\n",
+    "import json\n",
+    "from dotenv import load_dotenv\n",
+    "import os\n",
+    "\n",
+    "load_dotenv('.env')\n",
+    "\n",
+    "api_key = os.getenv('API_KEY')\n",
+    "cse_id = os.getenv('CSE_ID')\n",
+    "\n",
+    "if not api_key or not cse_id:\n",
+    "    raise ValueError(\"API_KEY and CSE_ID must be set as environment variables or in a .env file\")\n",
+    "\n",
+    "research_results = []\n",
+    "\n",
+    "for i, query in enumerate(search_terms_raw.Search_Queries, start=1):\n",
+    "    print(f\"{i}. {query}\")\n",
+    "    results = get_results_for_search_term(query)\n",
+    "    research_results.append(results)\n",
+    "\n",
+    "# Pretty-print the JSON response (or a friendly message if no results).\n",
+    "if results:\n",
+    "    # Write results to a file\n",
+    "    with open(\"research_results.json\", \"w\", encoding=\"utf-8\") as f:\n",
+    "        json.dump(research_results, f, indent=2, ensure_ascii=False)\n",
+    "    print(\"Results written to research_results.json\")\n",
+    "else:\n",
+    "    print(\"No results returned. Check your API credentials or search term.\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e9758743",
+   "metadata": {},
+   "source": [
+    "### Step-4: "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2a890c80",
+   "metadata": {},
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "255e0a06",
+   "metadata": {},
+   "source": []
+  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "a477b6a8",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "class KnowledgeAssistantAgent:\n",
-    "    \"\"\"Curates short‑term memory of research snippets.\"\"\"\n",
-    "    def run(self, web_snippets, file_snippets):\n",
-    "        corpus = web_snippets + file_snippets\n",
-    "        # Vector‑embed & cluster (pseudo)\n",
-    "        # ...\n",
-    "        # Return pruned, deduplicated corpus\n",
-    "        return corpus[:50] # top‑N"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "de168943",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "class ReportCreationAgent:\n",
-    "    \"\"\"Drafts the first complete report with citations.\"\"\"\n",
-    "    def run(self, curated_corpus, outline):\n",
-    "        report = client.chat(\n",
-    "            model='gpt-4o',\n",
-    "            system_prompt='Write a research report following the outline. 
Cite sources in IEEE style.',\n", - " user_prompt=str({'outline': outline, 'corpus': curated_corpus}),\n", - " )\n", - " return report.text" + "### Step-4: " ] }, { - "cell_type": "code", - "execution_count": null, - "id": "40c6bbca", + "cell_type": "markdown", + "id": "2a890c80", "metadata": {}, - "outputs": [], - "source": [ - "# --- Orchestration skeleton ---\n", - "def generate_research_paper(topic: str):\n", - " plan = ResearchPlanningAgent().run(topic)\n", - "\n", - " web_results = WebSearchAgent().run(plan['search_terms'])\n", - " # TODO: file_results via file_search if internal corpus available\n", - " file_results = []\n", - "\n", - " curated = KnowledgeAssistantAgent().run(web_results, file_results)\n", - " draft = ReportCreationAgent().run(curated, plan['outline'])\n", - " return draft\n" - ] + "source": [] + }, + { + "cell_type": "markdown", + "id": "255e0a06", + "metadata": {}, + "source": [] }, { "cell_type": "markdown", @@ -307,6 +371,80 @@ "* **Native OpenAI tools first** (web browsing, file ingestion) before reinventing low‑level retrieval. " ] }, + { + "cell_type": "code", + "execution_count": null, + "id": "91eed7c2", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "API_KEY and CSE_ID are: AIzaSyCQH3GUXJwnqOmvBp9U12P54eScvMJLH7c 50c7decc940664df9\n" + ] + }, + { + "ename": "TypeError", + "evalue": "An asyncio.Future, a coroutine or an awaitable is required", + "output_type": "error", + "traceback": [ + "\u001b[31m---------------------------------------------------------------------------\u001b[39m\n", + "\u001b[31mTypeError\u001b[39m Traceback (most recent call last)\n", + "\u001b[36mCell\u001b[39m\u001b[36m \u001b[39m\u001b[32mIn[1]\u001b[39m\u001b[32m, line 19\u001b[39m\n", + "\u001b[32m 16\u001b[39m \u001b[38;5;28mprint\u001b[39m(\u001b[33m\"\u001b[39m\u001b[33mAPI_KEY and CSE_ID are: \u001b[39m\u001b[33m\"\u001b[39m, api_key, cse_id)\n", + "\u001b[32m 18\u001b[39m nest_asyncio.apply()\n", + "\u001b[32m---> \u001b[39m\u001b[32m19\u001b[39m results = \u001b[43masyncio\u001b[49m\u001b[43m.\u001b[49m\u001b[43mrun\u001b[49m\u001b[43m(\u001b[49m\u001b[43mget_results_for_search_term\u001b[49m\u001b[43m(\u001b[49m\u001b[33;43m\"\u001b[39;49m\u001b[33;43mAI Trends\u001b[39;49m\u001b[33;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\u001b[43m)\u001b[49m\n", + "\u001b[32m 21\u001b[39m \u001b[38;5;66;03m# Pretty-print the JSON response (or a friendly message if no results).\u001b[39;00m\n", + "\u001b[32m 22\u001b[39m \u001b[38;5;28;01mif\u001b[39;00m results:\n", + "\n", + "\u001b[36mFile \u001b[39m\u001b[32m~/workspace-28/openai-cookbook/.venv/lib/python3.13/site-packages/nest_asyncio.py:28\u001b[39m, in \u001b[36m_patch_asyncio..run\u001b[39m\u001b[34m(main, debug)\u001b[39m\n", + "\u001b[32m 26\u001b[39m loop = asyncio.get_event_loop()\n", + "\u001b[32m 27\u001b[39m loop.set_debug(debug)\n", + "\u001b[32m---> \u001b[39m\u001b[32m28\u001b[39m task = \u001b[43masyncio\u001b[49m\u001b[43m.\u001b[49m\u001b[43mensure_future\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmain\u001b[49m\u001b[43m)\u001b[49m\n", + "\u001b[32m 29\u001b[39m \u001b[38;5;28;01mtry\u001b[39;00m:\n", + "\u001b[32m 30\u001b[39m \u001b[38;5;28;01mreturn\u001b[39;00m loop.run_until_complete(task)\n", + "\n", + "\u001b[36mFile \u001b[39m\u001b[32m/opt/homebrew/Cellar/python@3.13/3.13.1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/tasks.py:742\u001b[39m, in \u001b[36mensure_future\u001b[39m\u001b[34m(coro_or_future, loop)\u001b[39m\n", + "\u001b[32m 
740\u001b[39m should_close = \u001b[38;5;28;01mFalse\u001b[39;00m\n", + "\u001b[32m 741\u001b[39m \u001b[38;5;28;01melse\u001b[39;00m:\n", + "\u001b[32m--> \u001b[39m\u001b[32m742\u001b[39m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m(\u001b[33m'\u001b[39m\u001b[33mAn asyncio.Future, a coroutine or an awaitable \u001b[39m\u001b[33m'\u001b[39m\n", + "\u001b[32m 743\u001b[39m \u001b[33m'\u001b[39m\u001b[33mis required\u001b[39m\u001b[33m'\u001b[39m)\n", + "\u001b[32m 745\u001b[39m \u001b[38;5;28;01mif\u001b[39;00m loop \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n", + "\u001b[32m 746\u001b[39m loop = events.get_event_loop()\n", + "\n", + "\u001b[31mTypeError\u001b[39m: An asyncio.Future, a coroutine or an awaitable is required" + ] + } + ], + "source": [ + "from ai_research_assistant_resources.utils.web_search_and_util import get_results_for_search_term\n", + "import json\n", + "from dotenv import load_dotenv\n", + "import os\n", + "import asyncio\n", + "import nest_asyncio\n", + "\n", + "load_dotenv('.env')\n", + "\n", + "api_key = os.getenv('API_KEY')\n", + "cse_id = os.getenv('CSE_ID')\n", + "\n", + "if not api_key or not cse_id:\n", + " raise ValueError(\"API_KEY and CSE_ID must be set as environment variables or in a .env file\")\n", + "else:\n", + " print(\"API_KEY and CSE_ID are: \", api_key, cse_id)\n", + "\n", + "nest_asyncio.apply()\n", + "results = asyncio.run(get_results_for_search_term(\"AI Trends\"))\n", + "\n", + "# Pretty-print the JSON response (or a friendly message if no results).\n", + "if results:\n", + " print(json.dumps(results, indent=2))\n", + "else:\n", + " print(\"No results returned. Check your API credentials or search term.\")" + ] + }, { "cell_type": "markdown", "id": "1bdcab82", diff --git a/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/query_expansion_agent.py b/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/query_expansion_agent.py new file mode 100644 index 0000000000..7fdcfd3472 --- /dev/null +++ b/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/query_expansion_agent.py @@ -0,0 +1,100 @@ +# query_expansion_agent.py +from agents import Agent, Runner +from pydantic import BaseModel +from typing import Literal +from agents.run import RunConfig +from typing import Optional +from ..guardrails.topic_content_guardrail import ai_topic_guardrail + +class QueryExpansion(BaseModel): + """Structured output model for the query-expansion agent.""" + + expanded_query: str + questions: str + is_task_clear: Literal["yes", "no"] + + +class QueryExpansionAgent: + """A wrapper class that internally creates an `agents.Agent` instance on construction. + + Example + ------- + >>> query_agent = QueryExpansionAgent() + >>> prompt = "Draft a research report on the latest trends in AI in the last 5 years." + >>> expanded_prompt = query_agent.task(prompt) + >>> print(expanded_prompt) + """ + + _DEFAULT_NAME = "Query Expansion Agent" + + _DEFAULT_INSTRUCTIONS = """You are a helpful agent who receives a raw research prompt from the user. + + 1. Determine whether the task is sufficiently clear to proceed (“task clarity”). + 2. If the task is missing timeframe of the research or the industry/domain, ask the user for the missing information. + 3. 
If the task *is* clear, expand the prompt into a single, well-structured paragraph that makes the research request specific, actionable, and self-contained • Do **not** add any qualifiers or meta-commentary; write only the improved prompt itself. + 4. If the task *is not* clear, generate concise clarifying questions that will make the request actionable. Prioritize questions about: + • Domain / industry focus + • Timeframe (e.g., current year 2025, last 5 years, last 10 years, or all time) + • Any other missing constraints or desired perspectives + + Return your response **exclusively** as a JSON object with these exact fields: + + ```json + { + "expanded_query": "", + "questions": "", + "is_task_clear": "yes" | "no" + } + ``` + + When "is_task_clear" is "yes", populate "expanded_query" with the enhanced one-paragraph prompt and leave "questions" empty. + + When "is_task_clear" is "no", populate "questions" with your list of clarifying questions and leave "expanded_query" empty. + + """ + + def __init__(self, *, model: str = "o3-mini", tools: list | None = None, name: str | None = None, + instructions: str | None = None, input_guardrails: list | None = None): + + # Initialise the underlying `agents.Agent` with a structured `output_type` so it + # returns a validated `QueryExpansion` instance instead of a raw string. + self._agent = Agent( + name=name or self._DEFAULT_NAME, + instructions=instructions or self._DEFAULT_INSTRUCTIONS, + tools=tools or [], + model=model, + output_type=QueryExpansion, + input_guardrails=input_guardrails or [ai_topic_guardrail], + ) + self._last_result = None + + # ------------------------------------------------------------------ + # Public helpers + # ------------------------------------------------------------------ + + @property + def agent(self) -> Agent: # type: ignore[name-defined] + """Return the underlying ``agents.Agent`` instance.""" + return self._agent + + async def task(self, prompt: str) -> QueryExpansion: + """Fire-and-forget API that auto-threads each call.""" + + cfg = RunConfig(tracing_disabled=True) + + # ── First turn ────────────────────────────────────────────── + if self._last_result is None: + self._last_result = await Runner.run( + self._agent, prompt, run_config=cfg + ) + # ── Later turns ───────────────────────────────────────────── + else: + new_input = ( + self._last_result.to_input_list() + + [{"role": "user", "content": prompt}] + ) + self._last_result = await Runner.run( + self._agent, new_input, run_config=cfg + ) + + return self._last_result.final_output diff --git a/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/web_page_summary_agent.py b/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/web_page_summary_agent.py new file mode 100644 index 0000000000..39c8e2bc20 --- /dev/null +++ b/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/web_page_summary_agent.py @@ -0,0 +1,52 @@ +# web_page_summary_agent.py + +from agents import Agent, Runner, RunConfig + + +class WebPageSummaryAgent: + _DEFAULT_NAME = "Web Page Summary Agent" + + # NOTE: Placeholders will be filled at construction time so that the agent carries + # all contextual instructions. The task method will then only need the raw content. + _DEFAULT_INSTRUCTIONS = """You are an AI assistant tasked with summarising provided web page content. + "Return a concise summary in {character_limit} characters or less focused on '{search_term}'. 
" + "Only output the summary text; do not include any qualifiers or metadata.""" + + def __init__( + self, + search_term: str, + character_limit: int = 1000, + *, + model: str = "gpt-4o", + tools: list | None = None, + name: str | None = None, + instructions: str | None = None, + ) -> None: + # Format instructions with the dynamic values unless explicitly overridden. + formatted_instructions = instructions or self._DEFAULT_INSTRUCTIONS.format( + search_term=search_term, character_limit=character_limit + ) + + self._agent = Agent( + name=name or self._DEFAULT_NAME, + instructions=formatted_instructions, + tools=tools or [], + model=model, + ) + + # ------------------------------------------------------------------ + # Public helpers + # ------------------------------------------------------------------ + + @property + def agent(self) -> Agent: # type: ignore[name-defined] + """Return the underlying :class:`agents.Agent` instance.""" + + return self._agent + + async def task(self, content: str) -> str: + + cfg = RunConfig(tracing_disabled=True) + + result = await Runner.run(self._agent, content, run_config=cfg) + return result.final_output \ No newline at end of file diff --git a/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/web_search_terms_generation_agent.py b/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/web_search_terms_generation_agent.py new file mode 100644 index 0000000000..0484ce117f --- /dev/null +++ b/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/web_search_terms_generation_agent.py @@ -0,0 +1,77 @@ +from pydantic import BaseModel +from agents import Agent, Runner, RunConfig + +_NUM_SEARCH_TERMS = 5 + +class SearchTerms(BaseModel): + """Structured output model for search-terms suggestions.""" + + Search_Queries: list[str] + + +class WebSearchTermsGenerationAgent: + _DEFAULT_NAME = "Search Terms Agent" + + # Template for building default instructions at runtime. + # NOTE: Double braces are used so that ``str.format`` leaves the JSON braces intact. + _DEFAULT_INSTRUCTIONS_TEMPLATE = """You are a helpful agent assigned a research task. Your job is to provide the top + {num_search_terms} Search Queries relevant to the given topic in this year (2025). The output should be in JSON format. + + Example format provided below: + + {{ + "Search_Queries": [ + "Top ranked auto insurance companies US 2025 by market capitalization", + "Geico rates and comparison with other auto insurance companies", + "Insurance premiums of top ranked companies in the US in 2025", + "Total cost of insuring autos in US 2025", + "Top customer service feedback for auto insurance in 2025" + ] + }} + + """ + + + def __init__( + self, + num_search_terms: int = _NUM_SEARCH_TERMS, + *, + model: str = "gpt-4o", + tools: list | None = None, + name: str | None = None, + instructions: str | None = None, + ): + # Store number of search terms for potential future use + self._num_search_terms = num_search_terms + + # Build default instructions if none were provided. 
+ final_instructions = ( + instructions + or self._DEFAULT_INSTRUCTIONS_TEMPLATE.format( + num_search_terms=num_search_terms + ) + ) + + self._agent = Agent( + name=name or self._DEFAULT_NAME, + instructions=final_instructions, + tools=tools or [], + model=model, + output_type=SearchTerms, + ) + + # ------------------------------------------------------------------ + # Public helpers + # ------------------------------------------------------------------ + + @property + def agent(self) -> Agent: # type: ignore[name-defined] + """Return the underlying ``agents.Agent`` instance.""" + + return self._agent + + async def task(self, prompt: str) -> SearchTerms: # type: ignore[name-defined] + """Execute the agent synchronously and return the structured ``SearchTerms`` output.""" + cfg = RunConfig(tracing_disabled=True) + result = await Runner.run(self._agent, prompt, run_config=cfg) + return result.final_output \ No newline at end of file diff --git a/examples/agents_sdk/ai_research_assistant_resources/guardrails/topic_content_guardrail.py b/examples/agents_sdk/ai_research_assistant_resources/guardrails/topic_content_guardrail.py new file mode 100644 index 0000000000..74b6209d80 --- /dev/null +++ b/examples/agents_sdk/ai_research_assistant_resources/guardrails/topic_content_guardrail.py @@ -0,0 +1,57 @@ +# topic_guardrail.py +from typing import List + +from pydantic import BaseModel + +from agents import ( # type: ignore (SDK imports) + Agent, + GuardrailFunctionOutput, + RunContextWrapper, + Runner, + TResponseInputItem, + input_guardrail, +) + +# --------------------------------------------------------------------------- +# 1. Tiny classifier agent → “Is this prompt about AI?” +# --------------------------------------------------------------------------- + +class TopicCheckOutput(BaseModel): + """Structured result returned by the classifier.""" + is_about_ai: bool # True → prompt is AI-related + reasoning: str # short rationale (useful for logs) + +topic_guardrail_agent = Agent( + name="Topic guardrail (AI)", + instructions=( + "You are a binary classifier. " + "Reply with is_about_ai = true **only** when the user's request is mainly " + "about artificial-intelligence research, applications, tooling, ethics, " + "policy, or market trends. " + "Return is_about_ai = false for all other domains (finance, biology, history, etc.)." + ), + model="gpt-4o-mini", # lightweight, fast + output_type=TopicCheckOutput, +) + +# --------------------------------------------------------------------------- +# 2. 
Guardrail function (decorated) that wraps the classifier +# --------------------------------------------------------------------------- + +@input_guardrail +async def ai_topic_guardrail( + ctx: RunContextWrapper[None], + agent: Agent, + input: str | List[TResponseInputItem], +) -> GuardrailFunctionOutput: + result = await Runner.run(topic_guardrail_agent, input, context=ctx.context) + + output = GuardrailFunctionOutput( + output_info=result.final_output, + tripwire_triggered=not result.final_output.is_about_ai, + ) + + return output + +# Optional: tidy public surface +__all__ = ["ai_topic_guardrail", "TopicCheckOutput"] \ No newline at end of file diff --git a/examples/agents_sdk/ai_research_assistant_resources/utils/web_search_and_util.py b/examples/agents_sdk/ai_research_assistant_resources/utils/web_search_and_util.py new file mode 100644 index 0000000000..bcc3dd6cd5 --- /dev/null +++ b/examples/agents_sdk/ai_research_assistant_resources/utils/web_search_and_util.py @@ -0,0 +1,207 @@ +# web_search_and_util.py + +from bs4 import BeautifulSoup +import requests +from dotenv import load_dotenv +import os + +load_dotenv('.env') + +api_key = os.getenv('API_KEY') +cse_id = os.getenv('CSE_ID') + +TRUNCATE_SCRAPED_TEXT = 50000 # Adjust based on your model's context window +SEARCH_DEPTH = 2 # Default depth for Google Custom Search queries + +# ------------------------------------------------------------------ +# Optional: patch asyncio to allow nested event loops (e.g., inside Jupyter) +# ------------------------------------------------------------------ + +try: + import nest_asyncio # type: ignore + + # ``nest_asyncio`` monkey-patches the running event-loop so that further + # calls to ``asyncio.run`` or ``loop.run_until_complete`` do **not** raise + # ``RuntimeError: This event loop is already running``. This makes the + # synchronous helper functions below safe to call in notebook cells while + # still working unchanged in regular Python scripts. + + nest_asyncio.apply() +except ImportError: # pragma: no cover + # ``nest_asyncio`` is an optional dependency. If it is unavailable we + # simply skip patching – the helper functions will still work in regular + # Python scripts but may raise ``RuntimeError`` when called from within + # environments that already run an event-loop (e.g., Jupyter). 
+ pass + +def search(search_item, api_key, cse_id, search_depth=SEARCH_DEPTH, site_filter=None): + service_url = 'https://www.googleapis.com/customsearch/v1' + + params = { + 'q': search_item, + 'key': api_key, + 'cx': cse_id, + 'num': search_depth + } + + try: + response = requests.get(service_url, params=params) + response.raise_for_status() + results = response.json() + + # Check if 'items' exists in the results + if 'items' in results: + if site_filter is not None: + + # Filter results to include only those with site_filter in the link + filtered_results = [result for result in results['items'] if site_filter in result['link']] + + if filtered_results: + return filtered_results + else: + print(f"No results with {site_filter} found.") + return [] + else: + if 'items' in results: + return results['items'] + else: + print("No search results found.") + return [] + + except requests.exceptions.RequestException as e: + print(f"An error occurred during the search: {e}") + return [] + + + + +def retrieve_content(url, max_tokens=TRUNCATE_SCRAPED_TEXT): + try: + headers = {'User-Agent': 'Mozilla/5.0'} + response = requests.get(url, headers=headers, timeout=10) + response.raise_for_status() + + soup = BeautifulSoup(response.content, 'html.parser') + for script_or_style in soup(['script', 'style']): + script_or_style.decompose() + + text = soup.get_text(separator=' ', strip=True) + characters = max_tokens * 4 # Approximate conversion + text = text[:characters] + return text + except requests.exceptions.RequestException as e: + print(f"Failed to retrieve {url}: {e}") + return None + + + +async def get_search_results(search_items, search_term: str, character_limit: int = 500): + # Generate a summary of search results for the given search term + results_list = [] + for idx, item in enumerate(search_items, start=1): + url = item.get('link') + + snippet = item.get('snippet', '') + web_content = retrieve_content(url, TRUNCATE_SCRAPED_TEXT) + + if web_content is None: + print(f"Error: skipped URL: {url}") + else: + summary = summarize_content(web_content, search_term, character_limit) + result_dict = { + 'order': idx, + 'link': url, + 'title': snippet, + 'Summary': summary + } + results_list.append(result_dict) + return results_list + +# ------------------------------------------------------------------ +# Helper using WebPageSummaryAgent for content summarisation +# ------------------------------------------------------------------ +# NOTE: +# ``WebPageSummaryAgent`` is an agent wrapper that internally spins up an +# ``agents.Agent`` instance with the correct system prompt for Web page +# summarisation. Because the ``task`` method on the wrapper is *async*, we +# provide a small synchronous wrapper that takes care of running the coroutine +# irrespective of whether the caller is inside an active event-loop (e.g. +# Jupyter notebooks) or not. + +from ai_research_assistant_resources.agents_tools_registry.web_page_summary_agent import WebPageSummaryAgent +import asyncio + + +def summarize_content(content: str, search_term: str, character_limit: int = 2000) -> str: # noqa: D401 + + # Instantiate the agent with the dynamic instructions. + agent = WebPageSummaryAgent(search_term=search_term, character_limit=character_limit) + + # Run the agent task, making sure we properly handle the presence (or + # absence) of an already-running event-loop. 
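+    # NOTE: the RuntimeError fallback below re-enters the current event loop via
+    # run_until_complete, which only succeeds when nest_asyncio has patched the
+    # loop (attempted at module import above); on an unpatched running loop
+    # (e.g., plain Jupyter without nest_asyncio) it will still raise.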
+ try: + return asyncio.run(agent.task(content)) + except RuntimeError: + # We are *probably* inside an existing event-loop (common in notebooks + # or async frameworks). In that case fall back to using the current + # loop instead of creating a new one. + loop = asyncio.get_event_loop() + return loop.run_until_complete(agent.task(content)) + + +# ------------------------------------------------------------------ +# High-level convenience API +# ------------------------------------------------------------------ + + +def get_results_for_search_term( + search_term: str, + *, + character_limit: int = 2000, + search_depth: int = SEARCH_DEPTH, + site_filter: str | None = None, +) -> list[dict]: + """Search the Web for *search_term* and return enriched result dictionaries. + + The function handles the entire workflow: + + 1. Perform a Google Custom Search using the provided credentials. + 2. Retrieve and clean the contents of each result page. + 3. Generate a concise summary of each page focused on *search_term* using + :pyfunc:`summarize_content`. + + The returned value is a ``list`` of ``dict`` objects with the following + keys: ``order``, ``link``, ``title`` and ``Summary``. + """ + + # Step 1 – search. + search_items = search( + search_term, + api_key=api_key, + cse_id=cse_id, + search_depth=search_depth, + site_filter=site_filter, + ) + + # Step 2 & 3 – scrape pages and summarise. + # ``get_search_results`` is an *async* coroutine. Execute it and + # return its result, transparently handling the presence (or absence) + # of an already-running event loop (e.g. in notebooks). + + try: + # Prefer ``asyncio.run`` which creates and manages a fresh event + # loop. This is the most robust option for regular Python + # scripts. + import asyncio # local import to avoid polluting module top-level + + return asyncio.run( + get_search_results(search_items, search_term, character_limit) + ) + except RuntimeError: + # We probably find ourselves inside an existing event loop (for + # instance when this helper is invoked from within a Jupyter + # notebook). Fall back to re-using the current loop. + loop = asyncio.get_event_loop() + return loop.run_until_complete( + get_search_results(search_items, search_term, character_limit) + ) \ No newline at end of file diff --git a/examples/agents_sdk/web_search.py b/examples/agents_sdk/web_search.py new file mode 100644 index 0000000000..4e5cfd7954 --- /dev/null +++ b/examples/agents_sdk/web_search.py @@ -0,0 +1,11 @@ +from openai import OpenAI +client = OpenAI() + +response = client.responses.create( + model="gpt-4.1", + tools=[{"type": "web_search_preview", "search_context_size": "high",}], + input="Impact of telematics and AI in auto insurance 2025" +) + +import json +print(json.dumps(response.model_dump(), indent=2)) \ No newline at end of file From e19f5fce057cd2457c7265e75c3e2dd488a1edba Mon Sep 17 00:00:00 2001 From: Mandeep Singh Date: Fri, 6 Jun 2025 14:25:35 -0700 Subject: [PATCH 3/8] Resolve error --- .../AI_Research_Assistant_Cookbook.ipynb | 74 ------------------- 1 file changed, 74 deletions(-) diff --git a/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb b/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb index 1c60946f0c..d7d8c26f95 100644 --- a/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb +++ b/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb @@ -371,80 +371,6 @@ "* **Native OpenAI tools first** (web browsing, file ingestion) before reinventing low‑level retrieval. 
" ] }, - { - "cell_type": "code", - "execution_count": null, - "id": "91eed7c2", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "API_KEY and CSE_ID are: AIzaSyCQH3GUXJwnqOmvBp9U12P54eScvMJLH7c 50c7decc940664df9\n" - ] - }, - { - "ename": "TypeError", - "evalue": "An asyncio.Future, a coroutine or an awaitable is required", - "output_type": "error", - "traceback": [ - "\u001b[31m---------------------------------------------------------------------------\u001b[39m\n", - "\u001b[31mTypeError\u001b[39m Traceback (most recent call last)\n", - "\u001b[36mCell\u001b[39m\u001b[36m \u001b[39m\u001b[32mIn[1]\u001b[39m\u001b[32m, line 19\u001b[39m\n", - "\u001b[32m 16\u001b[39m \u001b[38;5;28mprint\u001b[39m(\u001b[33m\"\u001b[39m\u001b[33mAPI_KEY and CSE_ID are: \u001b[39m\u001b[33m\"\u001b[39m, api_key, cse_id)\n", - "\u001b[32m 18\u001b[39m nest_asyncio.apply()\n", - "\u001b[32m---> \u001b[39m\u001b[32m19\u001b[39m results = \u001b[43masyncio\u001b[49m\u001b[43m.\u001b[49m\u001b[43mrun\u001b[49m\u001b[43m(\u001b[49m\u001b[43mget_results_for_search_term\u001b[49m\u001b[43m(\u001b[49m\u001b[33;43m\"\u001b[39;49m\u001b[33;43mAI Trends\u001b[39;49m\u001b[33;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\u001b[43m)\u001b[49m\n", - "\u001b[32m 21\u001b[39m \u001b[38;5;66;03m# Pretty-print the JSON response (or a friendly message if no results).\u001b[39;00m\n", - "\u001b[32m 22\u001b[39m \u001b[38;5;28;01mif\u001b[39;00m results:\n", - "\n", - "\u001b[36mFile \u001b[39m\u001b[32m~/workspace-28/openai-cookbook/.venv/lib/python3.13/site-packages/nest_asyncio.py:28\u001b[39m, in \u001b[36m_patch_asyncio..run\u001b[39m\u001b[34m(main, debug)\u001b[39m\n", - "\u001b[32m 26\u001b[39m loop = asyncio.get_event_loop()\n", - "\u001b[32m 27\u001b[39m loop.set_debug(debug)\n", - "\u001b[32m---> \u001b[39m\u001b[32m28\u001b[39m task = \u001b[43masyncio\u001b[49m\u001b[43m.\u001b[49m\u001b[43mensure_future\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmain\u001b[49m\u001b[43m)\u001b[49m\n", - "\u001b[32m 29\u001b[39m \u001b[38;5;28;01mtry\u001b[39;00m:\n", - "\u001b[32m 30\u001b[39m \u001b[38;5;28;01mreturn\u001b[39;00m loop.run_until_complete(task)\n", - "\n", - "\u001b[36mFile \u001b[39m\u001b[32m/opt/homebrew/Cellar/python@3.13/3.13.1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/tasks.py:742\u001b[39m, in \u001b[36mensure_future\u001b[39m\u001b[34m(coro_or_future, loop)\u001b[39m\n", - "\u001b[32m 740\u001b[39m should_close = \u001b[38;5;28;01mFalse\u001b[39;00m\n", - "\u001b[32m 741\u001b[39m \u001b[38;5;28;01melse\u001b[39;00m:\n", - "\u001b[32m--> \u001b[39m\u001b[32m742\u001b[39m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m(\u001b[33m'\u001b[39m\u001b[33mAn asyncio.Future, a coroutine or an awaitable \u001b[39m\u001b[33m'\u001b[39m\n", - "\u001b[32m 743\u001b[39m \u001b[33m'\u001b[39m\u001b[33mis required\u001b[39m\u001b[33m'\u001b[39m)\n", - "\u001b[32m 745\u001b[39m \u001b[38;5;28;01mif\u001b[39;00m loop \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n", - "\u001b[32m 746\u001b[39m loop = events.get_event_loop()\n", - "\n", - "\u001b[31mTypeError\u001b[39m: An asyncio.Future, a coroutine or an awaitable is required" - ] - } - ], - "source": [ - "from ai_research_assistant_resources.utils.web_search_and_util import get_results_for_search_term\n", - "import json\n", - "from dotenv import load_dotenv\n", - "import os\n", - "import asyncio\n", - "import nest_asyncio\n", - "\n", - 
"load_dotenv('.env')\n", - "\n", - "api_key = os.getenv('API_KEY')\n", - "cse_id = os.getenv('CSE_ID')\n", - "\n", - "if not api_key or not cse_id:\n", - " raise ValueError(\"API_KEY and CSE_ID must be set as environment variables or in a .env file\")\n", - "else:\n", - " print(\"API_KEY and CSE_ID are: \", api_key, cse_id)\n", - "\n", - "nest_asyncio.apply()\n", - "results = asyncio.run(get_results_for_search_term(\"AI Trends\"))\n", - "\n", - "# Pretty-print the JSON response (or a friendly message if no results).\n", - "if results:\n", - " print(json.dumps(results, indent=2))\n", - "else:\n", - " print(\"No results returned. Check your API credentials or search term.\")" - ] - }, { "cell_type": "markdown", "id": "1bdcab82", From 74aa22e644e1f049b1030831af2ac85fa3aadd93 Mon Sep 17 00:00:00 2001 From: Mandeep Singh Date: Sun, 8 Jun 2025 21:04:47 -0700 Subject: [PATCH 4/8] Agents for research --- REPORT_DRAFT.md | 88 ++++++++ .../AI_Research_Assistant_Cookbook.ipynb | 204 +++++++++++++++--- .../report_writing_agent.py | 109 ++++++++++ .../utils/web_search_and_util.py | 38 ++-- examples/agents_sdk/research_results.json | 38 ++++ 5 files changed, 432 insertions(+), 45 deletions(-) create mode 100644 REPORT_DRAFT.md create mode 100644 examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/report_writing_agent.py create mode 100644 examples/agents_sdk/research_results.json diff --git a/REPORT_DRAFT.md b/REPORT_DRAFT.md new file mode 100644 index 0000000000..426c645e6c --- /dev/null +++ b/REPORT_DRAFT.md @@ -0,0 +1,88 @@ +# Artificial Intelligence in Healthcare 2020-2025 +A Comprehensive Analysis of Technological, Ethical, and Operational Trends + +## 1. Introduction + +### Scope and Methodology +Over the past five years, artificial intelligence (AI) has shifted from pilot projects to enterprise-wide enablers of clinical excellence, administrative efficiency, and patient engagement. This report synthesizes findings from industry analyses, peer-reviewed research, and corporate trend outlooks published between 2022 and 2025 to chart the trajectory of AI adoption in healthcare. The discussion spans machine learning (ML), deep learning (DL), natural language processing (NLP), medical imaging, and ambient intelligence, while also interrogating the regulatory, ethical, and operational consequences of rapid innovation. + +The evidence base is drawn exclusively from the reference resources supplied: a 2025 health-tech trend overview, a 2025 employer-centric healthcare forecast, two scholarly reviews on medical-imaging AI, and a 2025 ethics commentary. Insights from these sources were triangulated to extract common drivers, recurring challenges, and emerging opportunities. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare https://newsroom.cigna.com/top-health-care-trends-of-2025 https://www.nature.com/articles/s41746-022-00592-y https://pmc.ncbi.nlm.nih.gov/articles/PMC10740686/ https://www.alation.com/blog/ethics-of-ai-in-healthcare-privacy-bias-trust-2025/ + +## 2. Advances in Machine Learning and Deep Learning + +### Evolving Algorithmic Landscape +Since 2020, DL architectures—particularly convolutional neural networks (CNNs), transformers, and generative adversarial networks (GANs)—have dominated AI innovation in healthcare. These models now underpin applications ranging from tumor detection on MRIs to 3-D surgical planning, delivering faster inference and improved accuracy across diverse imaging modalities. 
https://pmc.ncbi.nlm.nih.gov/articles/PMC10740686/ Their maturation coincided with a growing emphasis on retrieval-augmented generation (RAG) and synthetic data pipelines that reinforce model generalizability while safeguarding patient privacy. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare + +### From Bench to Bedside +Clinical translation, however, remains uneven. A 2022 review highlighted systemic issues—dataset bias, publication-driven incentives, and evaluation gaps—that limit real-world impact despite impressive benchmark scores. https://www.nature.com/articles/s41746-022-00592-y Health systems counter these pitfalls by collaborating with technology partners and instituting rigorous validation protocols that tether algorithmic performance to measurable patient outcomes and return on investment. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare + +## 3. AI-Powered Medical Imaging + +### Technical Progress +The marriage of CNNs and emerging vision transformers (ViTs) has redefined image segmentation, anomaly detection, and resolution enhancement. GANs now generate synthetic CT, MRI, and PET scans that supplement limited datasets, boosting training efficiency and reducing over-fitting risks. https://pmc.ncbi.nlm.nih.gov/articles/PMC10740686/ These advances facilitate earlier disease detection and personalized treatment planning, expanding AI’s footprint from academic radiology suites to community hospitals. + +### Addressing Clinical Challenges +Despite technical strides, the imaging domain grapples with biased training corpora and opaque model reporting, undermining clinician trust and equitable care. https://www.nature.com/articles/s41746-022-00592-y To mitigate these issues, developers are adopting larger, demographically diverse datasets and promoting transparent documentation practices that explicate decision boundaries and failure modes. Ethical frameworks that prioritize fairness and reproducibility are now integral to procurement and deployment decisions. https://www.alation.com/blog/ethics-of-ai-in-healthcare-privacy-bias-trust-2025/ + +## 4. Natural Language Processing & Ambient Intelligence + +### Automating Clinical Documentation +Large language models (LLMs) paired with ambient listening devices have begun to transcribe and structure patient–provider conversations in real time, slashing manual documentation burdens. Early adopters report measurable gains in clinician satisfaction and time reclaimed for direct patient care. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare Retrieval-augmented generation further enriches these transcripts with context from electronic health records (EHRs), increasing factual accuracy and auditability. + +### Conversational AI for Engagement +Outside the exam room, generative AI chatbots personalize health-plan navigation and preventive-care nudges, mirroring the customer-centric experiences popularized by consumer technology firms. Employers cite these NLP-driven interfaces as pivotal to improving benefit utilization and containing costs. https://newsroom.cigna.com/top-health-care-trends-of-2025 + +## 5. Operational Transformation and Patient Experience + +### Streamlining Workflows +AI now orchestrates scheduling, prior authorization, and claims adjudication, delivering cost savings through reduced administrative overhead. 
Hospitals leverage predictive analytics to forecast patient census and allocate staffing dynamically, aligning resources with demand. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare + +### IoMT and Machine Vision on the Ward +Computer-vision sensors integrated into the Internet of Medical Things (IoMT) detect unsafe patient movements, triggering proactive fall-prevention interventions. The resulting reduction in adverse events demonstrates AI’s capacity to extend clinical vigilance beyond human line-of-sight. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare + +## 6. Regulatory Landscape & Governance + +### Increasing Scrutiny and Standards +Regulators are intensifying oversight of AI tools, mandating interoperability, audit trails, and evidence of clinical efficacy. U.S. and EU agencies alike are signaling that AI-enabled devices will face approval pathways comparable to traditional medical technologies. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare + +### Compliance Strategies +Healthcare organizations respond by embedding governance committees, adopting international standards for data handling, and engaging multidisciplinary review boards. Balanced deployment models that retain human-in-the-loop supervision are favored to satisfy both legal requirements and public expectations. https://newsroom.cigna.com/top-health-care-trends-of-2025 https://www.alation.com/blog/ethics-of-ai-in-healthcare-privacy-bias-trust-2025/ + +## 7. Ethical Considerations: Privacy, Bias, and Trust + +### Data Protection & Security +LLMs and analytics engines depend on vast troves of sensitive data, heightening exposure to cyber-attacks. Encryption, federated learning, and strict access controls are now table stakes for vendors hoping to gain market acceptance. https://www.alation.com/blog/ethics-of-ai-in-healthcare-privacy-bias-trust-2025/ + +### Algorithmic Fairness +Biases encoded in historical datasets can propagate inequities, disproportionately affecting marginalized populations. Continuous monitoring, inclusive data collection, and bias audits during the model-development lifecycle are emerging best practices to ensure equitable care delivery. https://www.alation.com/blog/ethics-of-ai-in-healthcare-privacy-bias-trust-2025/ https://www.nature.com/articles/s41746-022-00592-y + +## 8. Emerging Research Frontiers + +### Retrieval-Augmented Generation & Synthetic Data +RAG frameworks combine real-time database querying with generative text to improve accuracy and transparency, a feature increasingly prized in clinical decision support. Simultaneously, synthetically generated tabular and imaging data sets allow researchers to stress-test models without exposing personally identifiable information, accelerating innovation while upholding privacy norms. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare + +### Generative Models & Synthetic Imaging +GAN-based pipelines now enhance image resolution, fill modality gaps, and create rare-disease exemplars, thereby democratizing access to high-quality training corpora. This synthetic augmentation is particularly valuable for low-resource settings and niche specialties with sparse data. https://pmc.ncbi.nlm.nih.gov/articles/PMC10740686/ + +## 9. Case Studies + +### Ambient Listening for Documentation Relief +A large U.S. 
health system deployed microphone-enabled LLM solutions in outpatient clinics, achieving a marked reduction in after-hours charting and elevating provider satisfaction scores. The initiative demonstrated how RAG-enhanced NLP could deliver structured notes that integrate seamlessly with EHR templates. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare + +### Vision-Based Fall Prevention +A network of hospitals installed ceiling-mounted cameras coupled with ML algorithms to detect unsupervised bed-exits. Alerts routed to nursing stations cut in-room fall rates and associated costs, illustrating AI’s tangible impact on patient safety KPIs. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare + +### Transformer-Driven COVID-19 Screening +During the pandemic’s peak, researchers fine-tuned vision transformers for lung-field segmentation on chest X-rays, enabling rapid triage in overcrowded emergency departments. The model’s high sensitivity underscored DL’s agility in responding to emergent public-health crises. https://pmc.ncbi.nlm.nih.gov/articles/PMC10740686/ + +## 10. Recommendations and Conclusion + +### Strategic Roadmap for Stakeholders +1. Invest in modular, interoperable data infrastructures that can support RAG, federated learning, and continuous monitoring pipelines. +2. Formalize multidisciplinary AI governance bodies to align technical innovation with ethical imperatives and regulatory mandates. +3. Prioritize equitable dataset curation and deploy bias-mitigation tooling throughout the model lifecycle to foster trust among diverse patient cohorts. +4. Expand public-private partnerships to generate high-quality synthetic data and open benchmarks that accelerate safe innovation. + +### Concluding Thoughts +AI has progressed from experimental adjunct to indispensable infrastructure in healthcare, offering unprecedented gains in diagnostic accuracy, operational efficiency, and patient engagement. Sustaining this momentum will require vigilant governance, rigorous validation, and a steadfast commitment to equity and transparency. By integrating technical excellence with ethical stewardship, the healthcare industry can fully realize AI’s promise to enhance outcomes and democratize access to high-quality care. + diff --git a/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb b/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb index d7d8c26f95..62a954ef2d 100644 --- a/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb +++ b/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb @@ -139,18 +139,10 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": null, "id": "620f9e40", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "🚫 Guardrail tripped – not an AI topic: The request is about trends in the luxury goods market, which is not focused on artificial intelligence.\n" - ] - } - ], + "outputs": [], "source": [ "from ai_research_assistant_resources.agents_tools_registry.query_expansion_agent import QueryExpansionAgent\n", "from agents import InputGuardrailTripwireTriggered\n", @@ -170,7 +162,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 1, "id": "77364239", "metadata": {}, "outputs": [ @@ -180,13 +172,13 @@ "text": [ "\n", "The task is not clear. The agent asks:\n", - " Could you please specify the timeframe you have in mind for the research report (e.g., current year, last 5 years, or another period)? 
Additionally, should the report focus on any specific geographic region or subfields within AI developments (e.g., machine learning, natural language processing) or cover the topic broadly?\n", + " Could you please specify the timeframe for the research report (e.g., current year, last 5 years, last 10 years, or all time)? Also, are there specific sectors or applications within AI (such as machine learning, robotics, natural language processing, etc.) you want to focus on?\n", "\n", "\n", - "user input: within the last 1 year, in the US and around ehtical AI development \n", + "user input: 5 years healthcare\n", "\n", "Expanded query:\n", - " Draft a research report that examines the latest trends in ethical AI development within the United States over the last year, providing an analysis of emerging practices, challenges, and regulatory considerations unique to this timeframe and region.\n" + " Draft a comprehensive research report analyzing the latest trends in artificial intelligence (AI) developments within the healthcare industry over the past five years. The report should evaluate advancements in machine learning, deep learning, natural language processing, medical imaging, and other relevant AI applications, while also examining regulatory, ethical, and operational impacts on healthcare delivery. Include detailed case studies, emerging research areas, and recommendations for future innovation in the industry.\n" ] } ], @@ -242,7 +234,7 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 2, "id": "f15e0c10", "metadata": {}, "outputs": [ @@ -250,20 +242,18 @@ "name": "stdout", "output_type": "stream", "text": [ - "1. Ethical AI development trends USA 2025\n", - "2. Challenges in AI ethics and regulations in 2025\n", - "3. Emerging AI practices and legal considerations in the US 2025\n" + "1. AI trends in healthcare 2025\n", + "2. Machine learning advancements in medical imaging\n", + "3. Ethical impacts of AI in healthcare 2025\n" ] } ], "source": [ - "placeholder_query = \"Draft a research report that examines the latest trends in ethical AI development within the United States over the last year, providing an analysis of emerging practices, challenges, and regulatory considerations unique to this timeframe and region.\"\n", - "\n", "from ai_research_assistant_resources.agents_tools_registry.web_search_terms_generation_agent import WebSearchTermsGenerationAgent\n", "\n", "search_terms_agent = WebSearchTermsGenerationAgent(3)\n", "\n", - "result = await search_terms_agent.task(placeholder_query)\n", + "result = await search_terms_agent.task(expanded_query)\n", "\n", "search_terms_raw = result\n", "\n", @@ -293,7 +283,7 @@ }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 3, "id": "7b7260c5", "metadata": {}, "outputs": [ @@ -301,9 +291,11 @@ "name": "stdout", "output_type": "stream", "text": [ - "1. Ethical AI development trends USA 2025\n", - "2. Challenges in AI ethics and regulations in 2025\n", - "3. Emerging AI practices and legal considerations in the US 2025\n", + "Search Query 1. AI trends in healthcare 2025\n", + "Search Query 2. Machine learning advancements in medical imaging\n", + "Search Query 3. 
Ethical impacts of AI in healthcare 2025\n", + "Failed to retrieve https://www.unesco.org/en/artificial-intelligence/recommendation-ethics: ('Connection aborted.', ConnectionResetError(54, 'Connection reset by peer'))\n", + "Error: skipped URL: https://www.unesco.org/en/artificial-intelligence/recommendation-ethics\n", "Results written to research_results.json\n" ] } @@ -325,7 +317,7 @@ "research_results = []\n", "\n", "for i, query in enumerate(search_terms_raw.Search_Queries, start=1):\n", - " print(f\"{i}. {query}\")\n", + " print(f\"Search Query {i}. {query}\")\n", " results = get_results_for_search_term(query)\n", " research_results.append(results)\n", "\n", @@ -344,7 +336,165 @@ "id": "e9758743", "metadata": {}, "source": [ - "### Step-4: " + "### Step-4: Create a report" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "076a6f22", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "✅ Report written to REPORT_DRAFT.md\n" + ] + } + ], + "source": [ + "from ai_research_assistant_resources.agents_tools_registry.report_writing_agent import (\n", + " ReportWritingAgent,\n", + ")\n", + "import json\n", + "from pathlib import Path\n", + "\n", + "# ------------------------------------------------------------------\n", + "# 1. Load research results\n", + "# ------------------------------------------------------------------\n", + "with open(\"research_results.json\", \"r\", encoding=\"utf-8\") as f:\n", + " research_results = f.read()\n", + "\n", + "# ------------------------------------------------------------------\n", + "# 2. Draft the report\n", + "# ------------------------------------------------------------------\n", + "outline = \"\"\" Draft a comprehensive research report analyzing the latest trends in artificial intelligence (AI) developments within the healthcare industry over the past five years. The report should evaluate advancements in machine learning, deep learning, natural language processing, medical imaging, and other relevant AI applications, while also examining regulatory, ethical, and operational impacts on healthcare delivery. Include detailed case studies, emerging research areas, and recommendations for future innovation in the industry.\"\"\" # ← customise as needed\n", + "\n", + "report_agent = ReportWritingAgent(research_resources=research_results)\n", + "\n", + "draft_md = await report_agent.task(outline)\n", + "\n", + "# ------------------------------------------------------------------\n", + "# 4. Persist to file\n", + "# ------------------------------------------------------------------\n", + "Path(\"REPORT_DRAFT.md\").write_text(draft_md, encoding=\"utf-8\")\n", + "print(\"✅ Report written to REPORT_DRAFT.md\")" + ] + }, + { + "cell_type": "markdown", + "id": "d0f4f1a8", + "metadata": {}, + "source": [ + "### Step-5: Report Expansion or Scouting for additional data points (OPTIONAL)\n", + "\n", + "If you have a large corpus of data, you may have a secondary report expansion agent review each section of the report, and add content that may have been overlooked in the first pass by the report writer. This can be selectively done for a section, or for all sections based on your use case. \n", + "\n", + "While it is beyond the purview of this Cookbook, the overall architecture is as follows. 
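A minimal sketch of that expansion pass, reusing the `Agent`/`Runner` pattern from `report_writing_agent.py`, could look like the code below; the agent name, instructions, model choice, and section-splitting logic are illustrative assumptions rather than part of the cookbook's resources.

```python
# Illustrative sketch of a section-level report expansion pass.
from agents import Agent, Runner, RunConfig

expansion_agent = Agent(
    name="Report Expansion Agent",  # assumed name, not defined in the cookbook
    instructions=(
        "You revise a single section of a research report. Using only the "
        "REFERENCE RESOURCES included in the prompt, add relevant details the "
        "section may have overlooked, preserve the existing inline citations, "
        "and return the revised section in Markdown."
    ),
    model="gpt-4.1",
)


async def expand_report(draft_md: str, research_results: str) -> str:
    """Run a second pass over each level-2 section of the draft."""
    cfg = RunConfig(tracing_disabled=True)
    sections = draft_md.split("\n## ")
    revised = []
    for i, section in enumerate(sections):
        # Re-attach the header marker stripped by split(); the first chunk is
        # the title block and keeps its original formatting.
        section_md = section if i == 0 else "## " + section
        prompt = (
            f"<REFERENCE RESOURCES>\n{research_results}\n</REFERENCE RESOURCES>\n\n"
            f"Section to review:\n{section_md}"
        )
        result = await Runner.run(expansion_agent, prompt, run_config=cfg)
        revised.append(result.final_output)
    return "\n\n".join(revised)
```

The expanded draft would then be awaited and persisted the same way as the first pass, for example `expanded_md = await expand_report(draft_md, research_results)` followed by writing it back to `REPORT_DRAFT.md`.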
" + ] + }, + { + "cell_type": "markdown", + "id": "586ee533", + "metadata": {}, + "source": [ + "### Step-6: Organize the with References and Table of Content \n", + "\n", + "We let the LLM focus on generating the content, the content formatting such as creating a Table of Content upfront, and move references to the end. \n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7ad9315b", + "metadata": {}, + "outputs": [], + "source": [ + "import re\n", + "\n", + "def update_references(file_path, search_results_json):\n", + " \"\"\"\n", + " Update the references in a Markdown file by extracting unique URLs from tags and creating a\n", + " References section at the end of the file.\n", + "\n", + " :param file_path: The path to the Markdown file.\n", + " \"\"\"\n", + " global content\n", + " # Read the markdown_content of the MD file\n", + "\n", + " global url_to_title\n", + " # Load the search results\n", + " # Create a dictionary for quick lookup of titles by URL\n", + " url_to_title = {entry[\"URL\"]: entry[\"title\"] for entry in search_results_json}\n", + "\n", + " with open(file_path, 'r') as file:\n", + " content = file.read()\n", + " # Remove the existing References section if it exists\n", + " content = re.sub(r'\\n## References[\\s\\S]*', '', content)\n", + " content = re.sub(r'\\n### References[\\s\\S]*', '', content)\n", + "\n", + " # Find all tags and extract the URLs\n", + " sources = re.findall(r'(.*?)', content)\n", + " # Eliminate duplicates while maintaining order\n", + " unique_sources = []\n", + " unique_references = {}\n", + " for source in sources:\n", + " if source not in unique_sources:\n", + " unique_sources.append(source)\n", + " unique_references[source] = unique_sources.index(source) + 1\n", + " # Create the References section\n", + " references_section = \"\\n\\n## References\\n\"\n", + "\n", + " for i, source in enumerate(unique_sources, start=1):\n", + " title = url_to_title.get(source, \"Source not found in the search results\")\n", + " references_section += f\"{i}. [{title}]({source})\\n\"\n", + " # references_section += f\"{i}. 
{source}\\n\"\n", + "\n", + " # Replace tags with [reference #]\n", + " for source, reference_number in unique_references.items():\n", + " # markdown_content = markdown_content.replace(f'{source}', f'[reference {reference_number}]')\n", + " content = content.replace(f'{source}', f'[[{reference_number}]({source})]')\n", + " # Append the References section to the markdown_content\n", + " content += references_section\n", + " # Save the modified markdown_content back to the file\n", + " with open(file_path, 'w') as file:\n", + " file.write(content)\n", + " \n", + " \n", + "def add_toc_to_markdown(file_path):\n", + " \"\"\"\n", + " Add a Table of Contents (TOC) to a Markdown file by generating links to the headings in the file.\n", + "\n", + " :param file_path: The path to the Markdown file.\n", + " \"\"\"\n", + "\n", + " def generate_toc_line(line):\n", + " level = line.count('#') - 2\n", + " heading = line.strip().lstrip('#').strip()\n", + " link = heading.lower().replace(' ', '-').replace('.', '').replace(',', '')\n", + " return f\"{' ' * level}- [{heading}](#{link})\\n\"\n", + "\n", + " with open(file_path, 'r') as file:\n", + " lines = file.readlines()\n", + "\n", + " toc_lines = []\n", + " content_start_index = 0\n", + " for i, line in enumerate(lines):\n", + " if line.startswith('## '):\n", + " content_start_index = i\n", + " break\n", + "\n", + " for line in lines[content_start_index:]:\n", + " if line.startswith('## ') or line.startswith('### '):\n", + " toc_lines.append(generate_toc_line(line))\n", + "\n", + " toc_content = \"# Table of Contents\\n\" + ''.join(toc_lines) + \"\\n---\\n\\n\"\n", + " new_content = toc_content + ''.join(lines)\n", + "\n", + " with open(file_path, 'w') as file:\n", + " file.write(new_content)" ] }, { diff --git a/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/report_writing_agent.py b/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/report_writing_agent.py new file mode 100644 index 0000000000..b0a7075a9a --- /dev/null +++ b/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/report_writing_agent.py @@ -0,0 +1,109 @@ +from agents import Agent, Runner, RunConfig + + +class ReportWritingAgent: + _DEFAULT_NAME = "Report Writing Agent" + + # NOTE: Double braces around RESOURCES so that str.format leaves them intact when formatting other placeholders. + REPORT_WRITER = """You are a report writer that that writes a detailed report in Markdown file format on a report outline \ + provided by the user citing sources in . Use a formal tone. + + Follow the following rules: + + 1. Take your time to analyze the topic provided by the user, and review all the sources provided in REFERENCE RESOURCES. + + 2. Do you own synthesis of sources presented in REFERENCE RESOURCES. Draw conclusions from the sources. + + 3. For each section of the report write 2 to 3 paragraphs using as many REFERENCE RESOURCES as applicable. + + 4. Use Markdown header # for title, ## for section header, and ### for detailed content in the sections. Keep detailed \ + sections to between 6 and 10. + + 5. Cite sources that you use from REFERENCE RESOURCES in tags. You don't need to provide title of the \ + source. + + For example: + Robust integration within Apple's ecosystem, set it apart from competitors + https://618media.com/en/blog/apple-vision-pro-vs-competitors-a-comparison + + 6. Do not include any additional sources not provided in REFERENCE RESOURCES below. 
+ + The resources are provide in a JSON format below: + + + {{{RESOURCES}}} + + """ + + def __init__( + self, + research_resources: str, + *, + model: str = "o3", + tools: list | None = None, + name: str | None = None, + instructions: str | None = None, + ) -> None: + """Create an agent capable of writing a detailed Markdown report. + + Parameters + ---------- + research_resources: str + A string containing reference resources (e.g., URLs, snippets) that will be injected into the default + prompt, replacing the ``{{{RESOURCES}}}`` placeholder. + model: str, optional + Name of the model to use. Defaults to ``"o3"``. + tools: list, optional + Optional list of tools to pass into the underlying ``agents.Agent``. + name: str, optional + Custom name for the agent instance. + instructions: str, optional + If provided, these instructions override the default prompt. + """ + + # ------------------------------------------------------------------ + # Handle different input formats for `research_resources`. + # If a string is provided, we use it directly; otherwise, convert JSON + # objects (list / dict) into the required Markdown-friendly string. + # ------------------------------------------------------------------ + + resources_str: str = research_resources + + # Use custom instructions if supplied, otherwise format the template with the given resources. + formatted_instructions = instructions or self.REPORT_WRITER.replace( + "{{{RESOURCES}}}", resources_str + ) + + self._agent = Agent( + name=name or self._DEFAULT_NAME, + instructions=formatted_instructions, + tools=tools or [], + model=model, + ) + + # ------------------------------------------------------------------ + # Public helpers + # ------------------------------------------------------------------ + + @property + def agent(self) -> Agent: # type: ignore[name-defined] + """Return the underlying :class:`agents.Agent` instance.""" + return self._agent + + async def task(self, outline: str) -> str: + """Generate a Markdown report based on the provided outline. + + Parameters + ---------- + outline: str + The outline or prompt describing the desired report structure. + + Returns + ------- + str + A Markdown formatted report following the provided outline and referencing the supplied resources. 
+ """ + + cfg = RunConfig(tracing_disabled=True) + result = await Runner.run(self._agent, outline, run_config=cfg) + return result.final_output \ No newline at end of file diff --git a/examples/agents_sdk/ai_research_assistant_resources/utils/web_search_and_util.py b/examples/agents_sdk/ai_research_assistant_resources/utils/web_search_and_util.py index bcc3dd6cd5..e3ca75a168 100644 --- a/examples/agents_sdk/ai_research_assistant_resources/utils/web_search_and_util.py +++ b/examples/agents_sdk/ai_research_assistant_resources/utils/web_search_and_util.py @@ -43,30 +43,32 @@ def search(search_item, api_key, cse_id, search_depth=SEARCH_DEPTH, site_filter= 'cx': cse_id, 'num': search_depth } + + if api_key is None or cse_id is None: + raise ValueError("API key and CSE ID are required") try: response = requests.get(service_url, params=params) response.raise_for_status() results = response.json() - # Check if 'items' exists in the results - if 'items' in results: - if site_filter is not None: - - # Filter results to include only those with site_filter in the link - filtered_results = [result for result in results['items'] if site_filter in result['link']] - - if filtered_results: - return filtered_results - else: - print(f"No results with {site_filter} found.") - return [] - else: - if 'items' in results: - return results['items'] - else: - print("No search results found.") - return [] + # ------------------------------------------------------------------ + # Robust handling – always return a *list* (never ``None``) + # ------------------------------------------------------------------ + items = results.get("items", []) + + # Optional site filtering + if site_filter: + items = [itm for itm in items if site_filter in itm.get("link", "")] + if not items: + print(f"No results with {site_filter} found.") + + # Graceful handling of empty results + if not items: + print("No search results found.") + return [] + + return items except requests.exceptions.RequestException as e: print(f"An error occurred during the search: {e}") diff --git a/examples/agents_sdk/research_results.json b/examples/agents_sdk/research_results.json new file mode 100644 index 0000000000..fccddd7e02 --- /dev/null +++ b/examples/agents_sdk/research_results.json @@ -0,0 +1,38 @@ +[ + [ + { + "order": 1, + "link": "https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare", + "title": "Jan 6, 2025 ... Which AI Solutions Will Healthcare Organizations Adopt in 2025? · Ambient Listening Reduces Clinical Documentation · Pushing for Increased ...", + "Summary": "In 2025, AI is significantly influencing healthcare, driven by the widespread adoption of generative AI, primarily through large language models (LLMs). Healthcare organizations are increasingly exploring AI to enhance clinical and administrative workflows, improve patient care, and achieve cost efficiencies. Notable trends include the utilization of ambient listening technologies, where machine learning-powered audio solutions reduce clinical documentation burdens by analyzing patient-provider conversations for clinical notes.\n\nMore healthcare institutions are experimenting with retrieval-augmented generation (RAG) to increase the accuracy and transparency of AI. RAG combines traditional databases with LLMs, enhancing the accuracy of generative AI tools by accessing up-to-date organizational data. 
Synthetic data development also garners interest for improving AI testing and validation, underlining a trend towards better model assurance and performance scrutiny.\n\nMachine vision is employed to enhance patient care with sensors and cameras in patient rooms, alerting staff to patient movements and potential fall risks, thereby improving proactive care and clinical workflows. As these AI technologies advance, they are expected to integrate more seamlessly into Internet of Medical Things (IoMT) systems.\n\nRegulatory scrutiny is anticipated to increase, necessitating a balance between innovation and compliance with new and existing regulations, such as interoperability standards set by health information technology governance.\n\nEffective AI adoption requires robust IT infrastructure and data governance to ensure smooth integration and maximize return on investment. Organizations must manage AI implementation carefully, aligning solutions with clear business needs and ensuring cultural readiness.\n\nEngaging with experienced technology partners can aid healthcare organizations in preparing for AI adoption, ensuring initiatives are sustainable and beneficial. AI's role in healthcare is poised to expand further, driving efficiency, improving patient outcomes, and reshaping the industry's technological landscape." + }, + { + "order": 2, + "link": "https://newsroom.cigna.com/top-health-care-trends-of-2025", + "title": "Jan 2, 2025 ... Additionally, generative AI will play a pivotal role in shaping strategy and growth within the health care industry. A study published in the ...", + "Summary": "In 2025, the healthcare industry is poised for transformative changes driven by key trends, especially impacting U.S. employers. One major trend is the emphasis on enhancing customer experience in healthcare, mirroring expectations set by consumer brands like Apple and Amazon. Personalization and seamless digital interactions will be critical, facilitated by advanced AI that predicts patient needs and suggests preventive measures. Such digital transformations aim to streamline healthcare processes and improve patient engagement and outcomes.\n\nGenerative AI is another pivotal trend, poised to significantly shape strategy and growth in healthcare. It's expected to enhance diagnostic accuracy and efficiency, though balancing AI use with human oversight is vital due to potential inaccuracies. Data security and legal governance around AI applications will be crucial, as organizations incorporate these technologies to improve efficiency and decision-making.\n\nBehavioral health care will see an evolution focusing on personalization, navigation, and measurement-based care, addressing the rising mental health needs in the U.S. Integrating mental health into primary care and expanding virtual behavioral care will enhance access and support for diverse populations, including young people and their families.\n\nThe focus on clinical excellence, particularly in women’s health and condition-specific care, will continue to grow. Integrated care models will aim to reduce costs and improve outcomes through personalized, holistic care. 
For high-cost conditions like musculoskeletal issues and cardiodiabesity, early identification and personalized treatment will be prioritized, supported by predictive models and digital engagement tools.\n\nOverall, these trends promise a more dynamic healthcare environment, pushing employers to adopt innovative strategies to mitigate rising costs and improve employee health outcomes." + } + ], + [ + { + "order": 1, + "link": "https://www.nature.com/articles/s41746-022-00592-y", + "title": "Apr 12, 2022 ... Research in computer analysis of medical images bears many promises to improve patients' health. However, a number of systematic challenges ...", + "Summary": "The article addresses challenges and future directions in applying machine learning (ML) to medical imaging. It highlights several systemic issues, such as data biases, publication biases, and methodological shortcomings that impede progress. These issues include the reliance on datasets that do not fully reflect clinical environments, leading to biases and overfitting. Many studies focus on achieving high performance on benchmarks rather than addressing clinical needs, with evaluation processes often failing to capture practical significance.\n\nThe review outlines the misalignment between the abundance of research and real-world clinical impact. It criticizes the incentive structures that prioritize novel methods over clinically relevant improvements, and points out that many advancements in ML for medical imaging offer marginal gains overshadowed by evaluation noise.\n\nSeveral recommendations are made to improve the field. These include using larger, more diverse datasets, strengthening evaluation methods, and ensuring that model performance translates into clinical benefits. The article advocates for better documentation and transparency in publishing, suggesting that complex methods should not obscure reproducibility. Encouraging registered reports and focusing on patient outcomes rather than mere predictive accuracy are also recommended. Implementing these changes could help align ML research with clinical objectives, ultimately enhancing patient care." + }, + { + "order": 2, + "link": "https://pmc.ncbi.nlm.nih.gov/articles/PMC10740686/", + "title": "The innovation segment explores cutting-edge developments in AI, such as deep learning algorithms, convolutional neural networks, and generative adversarial ...", + "Summary": "The integration of machine learning, particularly deep learning and AI, has significantly advanced medical imaging, transforming healthcare delivery. Innovations like deep learning algorithms, convolutional neural networks (CNNs), and generative adversarial networks (GANs) have enhanced the accuracy and efficiency of medical image analysis, enabling rapid and precise detection of abnormalities such as tumors from radiological exams. These technologies have revolutionized early disease detection, personalized treatment planning, and improved patient outcomes.\n\nKey advancements include the use of CNNs and transformers, allowing for improved feature extraction from medical images, applicable to diverse modalities like CT, MRI, and PET scans. Transformers, such as vision transformers (ViTs), offer enhanced modeling capabilities for medical image processing, capturing complex patterns and relationships. 
These have been critical in applications ranging from early disease detection to surgical planning.\n\nGANs and other generative models have enabled the creation of synthetic medical images, addressing the challenges of limited datasets in medical imaging, enhancing model training, and improving diagnostic accuracy. Their applications range from synthesizing new image data for training AI models to improving image resolution and quality, thus broadening AI adoption in clinical practice.\n\nMoreover, AI has played a pivotal role in enhancing image segmentation, essential for precision in detecting disease and planning treatments. Innovations in image processing, such as the segmentation of lung fields on X-rays, have improved disease screening, notably during the COVID-19 pandemic.\n\nAI's role in surgical planning is equally transformative, allowing for the accurate modeling of anatomical structures and facilitating customized interventions. This integration with technologies like 3D printing has further enhanced surgical precision.\n\nThe application of machine learning to medical imaging continues to expand, with ongoing research making strides in improving diagnostic efficiency, supporting clinical decisions, and personalizing patient care, promising further impact on healthcare outcomes." + } + ], + [ + { + "order": 1, + "link": "https://www.alation.com/blog/ethics-of-ai-in-healthcare-privacy-bias-trust-2025/", + "title": "Jan 15, 2025 ... Ethics of AI in Healthcare: Navigating Privacy, Bias, and Trust in 2025 · Unauthorized access: Data breaches and cyberattacks on AI systems put ...", + "Summary": "In 2025, the integration of AI in healthcare is rapidly evolving, presenting significant ethical challenges related to privacy, bias, and trust. Safeguarding patient data is critical, as AI systems require vast amounts of sensitive information, heightening the risk of unauthorized access, data breaches, and misuse. Strategies like data anonymization, encryption, and regulatory oversight are essential to mitigate these privacy risks.\n\nAlgorithmic bias is another pressing concern. AI systems often reflect biases present in the data they learn from, leading to disparities in healthcare outcomes, particularly affecting marginalized populations. To combat this, inclusive data collection and continuous monitoring are necessary to ensure AI tools provide equitable care.\n\nBuilding trust in AI technologies is also vital. Patients express concerns about device reliability, lack of transparency, and data privacy. Healthcare organizations can build trust by offering transparent communication, ensuring regulatory safeguards, and educating providers about AI's role in enhancing, not replacing, human judgment.\n\nRegulatory frameworks remain fragmented globally, posing challenges for establishing consistent ethical standards. Effective AI governance in healthcare requires collaborative oversight, patient-centered policies, and stronger industry standards to ensure that AI systems improve patient outcomes and adhere to ethical practices.\n\nThe focus on purpose-built AI tools highlights the necessity for clear regulations and real-world efficacy testing to ensure these technologies deliver tangible improvements in healthcare. Strengthening regulations and industry-led standards can play a pivotal role in aligning AI applications with ethical guidelines.\n\nLooking forward, AI has the potential to revolutionize healthcare by enhancing diagnostics, treatments, and operational efficiencies. 
However, these advancements must be grounded in robust ethical considerations to foster fairness and equity. Enhanced equity, improved transparency, and stronger governance are crucial to realizing AI's promise while safeguarding patient rights and trust in healthcare systems." + } + ] +] \ No newline at end of file From 27146829820fb5e613d5fc1ab1b5138d2f45d4c0 Mon Sep 17 00:00:00 2001 From: Mandeep Singh Date: Sun, 8 Jun 2025 21:07:32 -0700 Subject: [PATCH 5/8] Update notebook --- examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb b/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb index 62a954ef2d..73ea2bce43 100644 --- a/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb +++ b/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb @@ -401,7 +401,7 @@ "source": [ "### Step-6: Organize the with References and Table of Content \n", "\n", - "We let the LLM focus on generating the content, the content formatting such as creating a Table of Content upfront, and move references to the end. \n", + "We let the LLM focus on generating the content, the content formatting such as creating a Table of Content upfront, and move references to the end. \n", "\n" ] }, From f1555ce5bfaf7db8c175817313c12f21d63fa46f Mon Sep 17 00:00:00 2001 From: Mandeep Singh Date: Sun, 8 Jun 2025 21:07:44 -0700 Subject: [PATCH 6/8] Notebook updates --- examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb b/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb index 73ea2bce43..728704cbb3 100644 --- a/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb +++ b/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb @@ -516,7 +516,7 @@ "source": [ "### 5 — Guardrails & Best Practices \n", "* **Crawl → Walk → Run**: start with a single agent, then expand into a swarm. \n", - "* **Expose intermediate reasoning** (“show the math”) to build user trust. \n", + "* **Expose intermediate reasoning** (“show the math”) to build user trust. \n", "* **Parameterise UX** so analysts can tweak report format and source mix. \n", "* **Native OpenAI tools first** (web browsing, file ingestion) before reinventing low‑level retrieval. " ] From 02159ed371f5c0931a9843e27e224083f1987a01 Mon Sep 17 00:00:00 2001 From: Mandeep Singh Date: Mon, 9 Jun 2025 08:37:05 -0700 Subject: [PATCH 7/8] Report draft --- examples/agents_sdk/REPORT_DRAFT.md | 58 +++++++++++++++++++++++++++++ 1 file changed, 58 insertions(+) create mode 100644 examples/agents_sdk/REPORT_DRAFT.md diff --git a/examples/agents_sdk/REPORT_DRAFT.md b/examples/agents_sdk/REPORT_DRAFT.md new file mode 100644 index 0000000000..ebdd1fdc8e --- /dev/null +++ b/examples/agents_sdk/REPORT_DRAFT.md @@ -0,0 +1,58 @@ +# Artificial Intelligence in Healthcare: Five-Year Trend Analysis and Future Outlook + +## Introduction +### +Over the past five years, artificial intelligence (AI) has moved from experimental pilots to foundational infrastructure across clinical, administrative, and consumer-facing domains of healthcare. Large language models (LLMs), advanced computer vision, and generative architectures have accelerated performance gains while simultaneously raising new questions about safety, governance, and workforce realignment. 
Market analysts note that health systems now evaluate AI not as a peripheral innovation but as a core capability for cost containment, experience improvement, and population-health management. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare + +At the same time, employers and payers are pressuring providers to deliver personalized, digitally streamlined care that mirrors the frictionless experiences of technology giants. Generative AI sits at the center of these expectations, promising both predictive insight and operational efficiency—yet demanding rigorous oversight to avoid inaccuracies and data-security pitfalls. https://newsroom.cigna.com/top-health-care-trends-of-2025 + +## Machine Learning and Deep Learning Advances +### +Deep learning techniques—particularly convolutional neural networks (CNNs) and transformers such as Vision Transformers (ViTs)—have markedly improved feature extraction across radiology, pathology, and multi-modal datasets. These architectures enable earlier disease detection, more granular stratification of tumor heterogeneity, and data-driven surgical planning, reinforcing AI’s role in precision medicine. https://pmc.ncbi.nlm.nih.gov/articles/PMC10740686/ + +Generative adversarial networks (GANs) now supplement limited clinical datasets by producing realistic synthetic images, thereby addressing sample-size constraints that historically hampered algorithm generalization. When combined with retrieval-augmented generation (RAG) pipelines, synthetic data also bolsters transparency by linking model outputs to verifiable source material. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare + +## Natural Language Processing and Ambient Intelligence +### +Natural language processing (NLP) has matured from simple entity extraction to context-aware LLMs capable of composing full clinical notes through ambient listening. In exam rooms, speech-recognition engines capture conversational nuances between clinicians and patients, automatically converting them into structured documentation and reducing burnout associated with electronic health records (EHR) data entry. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare + +Beyond documentation, generative AI personalizes engagement by predicting patient queries, surfacing preventive-care nudges, and tailoring behavioral-health content—a capability increasingly adopted by employer-sponsored health programs striving for consumer-grade experiences. Balancing these benefits with human oversight remains essential, given the risk of hallucinations and mis-triaged recommendations. https://newsroom.cigna.com/top-health-care-trends-of-2025 + +## Medical Imaging Innovations +### +AI-driven imaging has delivered some of the clearest clinical wins, yet research reveals systemic obstacles that undermine reproducibility and equity. A comprehensive review points to dataset bias, publication incentives favoring marginal performance gains, and evaluation metrics that do not reflect bedside impact—all factors that can stall translation into routine practice. https://www.nature.com/articles/s41746-022-00592-y + +Nevertheless, practical milestones abound. During the COVID-19 pandemic, automated segmentation of lung fields on chest X-rays facilitated rapid triage, while 3-D reconstructions integrated with printing technologies optimized orthopedic implant positioning. 
Such case studies demonstrate how properly validated algorithms can shorten diagnostic cycles and personalize interventions when built on diverse, well-curated data. https://pmc.ncbi.nlm.nih.gov/articles/PMC10740686/ + +## Regulatory and Ethical Landscape +### +Escalating deployment has triggered intensified regulatory scrutiny in 2024-2025. Global frameworks remain fragmented, but common priorities include privacy preservation, algorithm explainability, and bias mitigation. Health-data breaches involving AI systems underscore the urgency of encryption, federated-learning designs, and strict access controls to protect sensitive information. https://www.alation.com/blog/ethics-of-ai-in-healthcare-privacy-bias-trust-2025/ + +Algorithmic bias is equally pressing. Skewed training sets can exacerbate disparities for marginalized populations, eroding trust in both institutions and technology. Inclusive data collection, continuous performance audits, and transparent communication are emerging as baseline requirements to secure regulatory approval and public confidence. https://www.alation.com/blog/ethics-of-ai-in-healthcare-privacy-bias-trust-2025/ + +## Operational Integration and Workflow Transformation +### +Healthcare organizations that treat AI as a co-worker rather than a bolt-on tool realize measurable gains in throughput and quality. Machine-vision cameras linked to real-time alert systems, for instance, now monitor patient movement to preempt falls, allowing nurses to prioritize high-risk rooms without constant rounding. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare + +Strategically, providers align AI implementations with integrated-care models targeting high-cost conditions such as cardiodiabesity and musculoskeletal disease. Predictive algorithms identify early deterioration, while digital-engagement platforms deliver condition-specific education—cutting downstream utilization and enhancing patient satisfaction. https://newsroom.cigna.com/top-health-care-trends-of-2025 + +## Emerging Research Frontiers +### +Synthetic data, once a stopgap for small cohorts, is maturing into a discipline focused on statistical fidelity and privacy assurance. Coupled with RAG methods, it offers a path to explainable generative outputs anchored in verifiable evidence. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare + +Meanwhile, cross-modal transformers integrate imaging, genomics, and clinical notes, pointing toward holistic patient models capable of multi-task reasoning. Researchers are also experimenting with lightweight edge-AI deployments within the Internet of Medical Things (IoMT), enabling on-device inference for wearables and in-room sensors that reduce latency and preserve data locality. https://pmc.ncbi.nlm.nih.gov/articles/PMC10740686/ + +## Recommendations for Future Innovation +### +1. Strengthen Data Governance: Adopt federated learning, robust anonymization, and role-based access to mitigate privacy risks while maximizing dataset diversity. https://www.alation.com/blog/ethics-of-ai-in-healthcare-privacy-bias-trust-2025/ + +2. Prioritize Clinical Relevance: Incentivize studies that measure patient outcomes instead of benchmark scores, aligning academic success with bedside impact. https://www.nature.com/articles/s41746-022-00592-y + +3. Embrace Retrieval-Augmented Generation: Pair generative models with curated knowledge bases to reduce hallucinations and improve auditability of AI-produced content. 
https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare + +4. Foster Multidisciplinary Teams: Combine data scientists, clinicians, ethicists, and operations leaders to ensure solutions address workflow realities and ethical obligations. https://newsroom.cigna.com/top-health-care-trends-of-2025 + +## Conclusion +### +AI’s trajectory in healthcare over the last half-decade reflects a maturing ecosystem: algorithms are more powerful, deployment scenarios more diverse, and governance structures more sophisticated. Yet success hinges on reconciling technical capability with ethical stewardship and operational pragmatism. Addressing bias, fortifying privacy, and aligning incentives toward demonstrable patient benefit will dictate whether AI continues as a transformative force or stalls under the weight of unresolved challenges. The next five years will reward organizations that combine rigorous science, transparent governance, and patient-centered design to unlock AI’s full potential in advancing global health. + From 5e9a1b1afe5a3301c171bef0a535cd8d0079c1cf Mon Sep 17 00:00:00 2001 From: moustafa-openai Date: Wed, 11 Jun 2025 23:56:37 -0700 Subject: [PATCH 8/8] use openai web saerch, update models used (#1894) --- authors.yaml | 5 + .../AI_Research_Assistant_Cookbook.ipynb | 143 +++++------ examples/agents_sdk/REPORT_DRAFT.md | 66 ++--- .../query_expansion_agent.py | 2 +- .../web_page_summary_agent.py | 2 +- .../web_search_terms_generation_agent.py | 2 +- .../guardrails/topic_content_guardrail.py | 15 +- .../utils/web_search_and_util.py | 241 +++--------------- examples/agents_sdk/research_results.json | 231 ++++++++++++++--- 9 files changed, 355 insertions(+), 352 deletions(-) diff --git a/authors.yaml b/authors.yaml index 4d0a696afe..a253b6265e 100644 --- a/authors.yaml +++ b/authors.yaml @@ -3,6 +3,11 @@ # You can optionally customize how your information shows up cookbook.openai.com over here. # If your information is not present here, it will be pulled from your GitHub profile. 
+moustafa-openai: + name: "Moustafa Elhadary" + website: "https://www.linkedin.com/in/moustafaelhadary/" + avatar: "https://avatars.githubusercontent.com/u/198829901?v=4" + theophile-openai: name: "Theophile Sautory" website: "https://www.linkedin.com/in/theophilesautory" diff --git a/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb b/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb index 728704cbb3..1c6cec8fb3 100644 --- a/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb +++ b/examples/agents_sdk/AI_Research_Assistant_Cookbook.ipynb @@ -5,11 +5,11 @@ "id": "85b66af9", "metadata": {}, "source": [ - "# Building an **AI Research Assistant** with the OpenAI Agents SDK\n", + "# Build a **Multi‑Agent AI Research Assistant** with the OpenAI Agents SDK & Responses API\n", "\n", "This notebook provides a reference patterns for implementing a multi‑agent AI Research Assistant that can plan, search, curate, and draft high‑quality reports with citations.\n", "\n", - "While the Deep Research feature is avaialble in ChatGPT, however, individual and companies may want to implement their own API based solution for a more finegrained control over the output.\n", + "While the Deep Research feature is available in ChatGPT, however, individual and companies may want to implement their own API based solution for a more fine grained control over the output.\n", "\n", "With support for Agents, and built-in tools such as Code Interpreter, Web Search, and File Search, - Responses API makes building your own Research Assistant fast and easy. " ] @@ -58,11 +58,11 @@ "\n", "| Step | Purpose | Model |\n", "|------|---------|-------|\n", - "| **Query Expansion** | Draft multi‑facet prompts / hypotheses | `gpt‑4o` |\n", - "| **Search‑Term Generation** | Expand/clean user query into rich keyword list | `gpt‑4o` |\n", - "| **Conduct Research** | Run web & internal searches, rank & summarise results | `gpt‑4o` + tools |\n", - "| **Draft Report** | Produce first narrative with reasoning & inline citations | `o1` / `gpt‑4o` |\n", - "| **Report Expansion** | Polish formatting, add charts / images / appendix | `gpt‑4o` + tools |" + "| **Query Expansion** | Draft multi‑facet prompts / hypotheses | `o4-mini` |\n", + "| **Search‑Term Generation** | Expand/clean user query into rich keyword list | `gpt‑4.1` |\n", + "| **Conduct Research** | Run web & internal searches, rank & summarize results | `gpt‑4.1` + tools |\n", + "| **Draft Report** | Produce first narrative with reasoning & inline citations | `o3` |\n", + "| **Report Expansion** | Polish formatting, add charts / images / appendix | `gpt‑4.1` + tools |" ] }, { @@ -75,7 +75,7 @@ "\n", "* **Research Planning Agent** – interprets the user request and produces a research plan/agenda.\n", "* **Knowledge Assistant Agent** – orchestrates parallel web & file searches via built‑in tools, curates short‑term memory.\n", - "* **Web Search Agent(s)** – perform Internet queries, deduplicate, rank and summarise pages.\n", + "* **Web Search Agent(s)** – perform Internet queries, deduplicate, rank and summarize pages.\n", "* **Report Creation Agent** – consumes curated corpus and drafts the structured report.\n", "* **(Optional) Data Analysis Agent** – executes code for numeric/CSV analyses via the Code Interpreter tool.\n", "* **(Optional) Image‑Gen Agent** – generates illustrative figures.\n", @@ -97,10 +97,18 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 1, "id": "3a16ac1f", "metadata": {}, - "outputs": [], + "outputs": [ + { + 
"name": "stdout", + "output_type": "stream", + "text": [ + "Note: you may need to restart the kernel to use updated packages.\n" + ] + } + ], "source": [ "%pip install openai openai-agents --quiet" ] @@ -123,9 +131,9 @@ "\n", "The query expansion step ensures the subsequent agents conducting research have sufficient context of user's inquiry. \n", "\n", - "The first step is to understand user's intent, and make sure the user has provided sufficinet details for subsequent agents to search the web, build a knowledge repository, and prepare a deepdive report. The `query_expansion_agent.py` accomplishes this with the prompt that outlines minimum information needed from the user to generate a report. This could include timeframe, industry, target audience, etc. The prompt can be tailored to the need of your deepresearch assistant. The agent will put a `is_task_clear` yes or no, when its no, it would prompt the user with additional questions, if sufficent information is available, it would output the expanded prompt. \n", + "The first step is to understand user's intent, and make sure the user has provided sufficient details for subsequent agents to search the web, build a knowledge repository, and prepare a deep dive report. The `query_expansion_agent.py` accomplishes this with the prompt that outlines minimum information needed from the user to generate a report. This could include timeframe, industry, target audience, etc. The prompt can be tailored to the need of your deep research assistant. The agent will put a `is_task_clear` yes or no, when its no, it would prompt the user with additional questions, if sufficient information is available, it would output the expanded prompt. \n", "\n", - "This is also an opportunity to enforce input guardrails for any research topics that you'd like to restrict the user from reserarching based on your usage policies. " + "This is also an opportunity to enforce input guardrails for any research topics that you'd like to restrict the user from researching based on your usage policies. " ] }, { @@ -134,15 +142,23 @@ "metadata": {}, "source": [ "##### Input Guardrails with Agents SDK \n", - "Let's assume our ficticious guardrail is to prevent the user from generating a non-AI releated topic report. For this we will define a guardrail agent. The guardrail agent `topic_guradrail.py` checks whether the topic is related to AI, if not, it raises an execption. The function `ai_topic_guardrail` is passed to the `QueryExpansionAgent()` as `input_guardrails`" + "Let's assume our fictitious guardrail is to prevent the user from generating a non-AI related topic report. For this we will define a guardrail agent. The guardrail agent `topic_content_guardrail.py` checks whether the topic is related to AI, if not, it raises an exception. The function `ai_topic_guardrail` is passed to the `QueryExpansionAgent()` as `input_guardrails`" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 2, "id": "620f9e40", "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "🚫 Guardrail tripped – not an AI topic: The user's request focuses on the luxury goods market, which pertains to market trends in the luxury sector rather than artificial intelligence. 
Therefore, it is not about AI.\n" + ] + } + ], "source": [ "from ai_research_assistant_resources.agents_tools_registry.query_expansion_agent import QueryExpansionAgent\n", "from agents import InputGuardrailTripwireTriggered\n", @@ -162,7 +178,7 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 3, "id": "77364239", "metadata": {}, "outputs": [ @@ -172,13 +188,17 @@ "text": [ "\n", "The task is not clear. The agent asks:\n", - " Could you please specify the timeframe for the research report (e.g., current year, last 5 years, last 10 years, or all time)? Also, are there specific sectors or applications within AI (such as machine learning, robotics, natural language processing, etc.) you want to focus on?\n", + " 1. What timeframe should the report cover (e.g., the past year, the past five years, up to current date)?\n", + "2. Should the report focus on specific AI subfields (e.g., natural language processing, computer vision, reinforcement learning) or provide a general overview?\n", + "3. Are there particular industries or application domains (e.g., healthcare, finance, manufacturing) you want the report to emphasize?\n", + "4. What length or depth do you expect for the report (e.g., a brief summary, a detailed 20-page analysis)?\n", + "5. Who is the target audience for the report (e.g., technical researchers, business executives, policymakers)?\n", "\n", "\n", - "user input: 5 years healthcare\n", + "user input: 5 years, AI in healthcare, exec summary \n", "\n", "Expanded query:\n", - " Draft a comprehensive research report analyzing the latest trends in artificial intelligence (AI) developments within the healthcare industry over the past five years. The report should evaluate advancements in machine learning, deep learning, natural language processing, medical imaging, and other relevant AI applications, while also examining regulatory, ethical, and operational impacts on healthcare delivery. Include detailed case studies, emerging research areas, and recommendations for future innovation in the industry.\n" + " Draft an executive summary research report on the latest trends in AI developments in healthcare over the past five years. Summarize key advancements across major subfields such as diagnostic imaging, predictive analytics, natural language processing for clinical documentation, and personalized medicine. Highlight impactful case studies, emerging technologies, regulatory considerations, and potential challenges. Provide high-level insights on market adoption, ROI metrics, and strategic recommendations for healthcare executives.\n" ] } ], @@ -229,12 +249,12 @@ "source": [ "Conducting Web search is typically an integral part of the deep research process. First we generate web search terms relevant to the research report. In the next step we will search the web and build a knowledge repository of the data.\n", "\n", "The `WebSearchTermsGenerationAgent` takes as input the expanded prompt, and generates succinct search terms. 
You can structure the search term generation prompt according to your user's typical requirements, such as including adjacent industries in the search terms, including competitors, etc. Additionally, you can control how much data you want to gather, e.g., the number of search terms to generate. In our case, we will limit this to 3 search terms. " ] }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 4, "id": "f15e0c10", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "1. AI trends in healthcare 2025\n", - "2. Machine learning advancements in medical imaging\n", - "3. Ethical impacts of AI in healthcare 2025\n" + "1. Latest AI trends in healthcare 2025 report\n", + "2. Advancements in AI for diagnostic imaging and predictive analytics 2020-2025\n", + "3. Impactful AI case studies in healthcare and market adoption analysis 2025\n" ] } ], @@ -266,24 +286,26 @@ "id": "3feeaae8", "metadata": {}, "source": [ - "#### Step 3 - Scroll the Web build a inventory of data sources \n", + "#### Step 3 - Web Search: Build an Inventory of Data Sources\n", "\n", - "We will use custom web search to identify and knowledge content to form the baseline for our report. You can learn more about building custom web search and retreival here. [Building a Bring Your Own Browser (BYOB) Tool for Web Browsing and Summarization](https://cookbook.openai.com/examples/third_party/web_search_with_google_api_bring_your_own_browser_tool). You will also need a Google Custom Search API key and Custom Search Engine ID (CSE ID) in a .env file at the root. \n", + "In this step, we will use the OpenAI web search tool that is integrated into the `responses` API to identify and collect knowledge content that will form the baseline for our report. This tool allows you to search the web and retrieve relevant information and citations directly within your workflow, without needing to set up any external search APIs or browser automation.\n", "\n", - "NOTE: The reason for using custom web search is provide more finegrained control over which information is retreived, and guardrails such as excluding competitor's content from your report. \n", + "You can learn more about the OpenAI web search tool here: [OpenAI Web Search Tool Documentation](https://platform.openai.com/docs/guides/tools-web-search?api-mode=responses).\n", "\n", - "This is a 3 step process: \n", + "The OpenAI web search tool is a convenient, out-of-the-box solution for most research use cases. However, if you require more fine-grained control over the information retrieved (for example, to exclude certain sources, apply custom filters, or use a specific search engine), you can also build and use your own browser-based or Google Custom Search integration. For an example of building a custom web search and retrieval pipeline, see [Building a Bring Your Own Browser (BYOB) Tool for Web Browsing and Summarization](https://cookbook.openai.com/examples/third_party/web_search_with_google_api_bring_your_own_browser_tool).\n", "\n", - "1. Obtain the search results (top 10 pages)\n", - "2. Scroll the pages, and summarize the key points \n", - "3. Output guardrails to weedout irrelevant or undesirable results (e.g., the timeframe of the content doesn't align with user's need, or mentions a competitor)\n", + "The process for building your research data inventory using the OpenAI web search tool is as follows:\n", "\n", - "prerequisite pip install nest_asyncio" + "1. 
Obtain the search results (e.g., top 10 relevant pages) for each search term.\n", + "2. Extract and summarize the key points from each result.\n", + "3. Optionally, apply output guardrails to filter out irrelevant or undesirable results (for example, based on publication date, source, or content).\n", + "#\n", + "If you choose to implement your own custom search or browser-based retrieval, you may need additional setup such as API keys or environment configuration." ] }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 5, "id": "7b7260c5", "metadata": {}, "outputs": [ @@ -291,11 +313,9 @@ "name": "stdout", "output_type": "stream", "text": [ - "Search Query 1. AI trends in healthcare 2025\n", - "Search Query 2. Machine learning advancements in medical imaging\n", - "Search Query 3. Ethical impacts of AI in healthcare 2025\n", - "Failed to retrieve https://www.unesco.org/en/artificial-intelligence/recommendation-ethics: ('Connection aborted.', ConnectionResetError(54, 'Connection reset by peer'))\n", - "Error: skipped URL: https://www.unesco.org/en/artificial-intelligence/recommendation-ethics\n", + "Search Query 1: Latest AI trends in healthcare 2025 report\n", + "Search Query 2: Advancements in AI for diagnostic imaging and predictive analytics 2020-2025\n", + "Search Query 3: Impactful AI case studies in healthcare and market adoption analysis 2025\n", "Results written to research_results.json\n" ] } @@ -303,32 +323,19 @@ "source": [ "from ai_research_assistant_resources.utils.web_search_and_util import get_results_for_search_term\n", "import json\n", - "from dotenv import load_dotenv\n", - "import os\n", - "\n", - "load_dotenv('.env')\n", - "\n", - "api_key = os.getenv('API_KEY')\n", - "cse_id = os.getenv('CSE_ID')\n", - "\n", - "if not api_key or not cse_id:\n", - " raise ValueError(\"API_KEY and CSE_ID must be set as environment variables or in a .env file\")\n", "\n", "research_results = []\n", "\n", - "for i, query in enumerate(search_terms_raw.Search_Queries, start=1):\n", - " print(f\"Search Query {i}. {query}\")\n", - " results = get_results_for_search_term(query)\n", - " research_results.append(results)\n", + "for idx, query in enumerate(search_terms_raw.Search_Queries, 1):\n", + " print(f\"Search Query {idx}: {query}\")\n", + " research_results.append(get_results_for_search_term(query))\n", "\n", - "# Pretty-print the JSON response (or a friendly message if no results).\n", - "if results:\n", - " # Write results to a file\n", + "if research_results: \n", " with open(\"research_results.json\", \"w\", encoding=\"utf-8\") as f:\n", " json.dump(research_results, f, indent=2, ensure_ascii=False)\n", " print(\"Results written to research_results.json\")\n", "else:\n", - " print(\"No results returned. Check your API credentials or search term.\")" + " print(\"No results returned.\")\n" ] }, { @@ -341,7 +348,7 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 6, "id": "076a6f22", "metadata": {}, "outputs": [ @@ -369,7 +376,7 @@ "# ------------------------------------------------------------------\n", "# 2. Draft the report\n", "# ------------------------------------------------------------------\n", - "outline = \"\"\" Draft a comprehensive research report analyzing the latest trends in artificial intelligence (AI) developments within the healthcare industry over the past five years. 
The report should evaluate advancements in machine learning, deep learning, natural language processing, medical imaging, and other relevant AI applications, while also examining regulatory, ethical, and operational impacts on healthcare delivery. Include detailed case studies, emerging research areas, and recommendations for future innovation in the industry.\"\"\" # ← customise as needed\n", + "outline = \"\"\" Draft a comprehensive research report analyzing the latest trends in artificial intelligence (AI) developments within the healthcare industry over the past five years. The report should evaluate advancements in machine learning, deep learning, natural language processing, medical imaging, and other relevant AI applications, while also examining regulatory, ethical, and operational impacts on healthcare delivery. Include detailed case studies, emerging research areas, and recommendations for future innovation in the industry.\"\"\" # ← customize as needed\n", "\n", "report_agent = ReportWritingAgent(research_resources=research_results)\n", "\n", @@ -407,7 +414,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 7, "id": "7ad9315b", "metadata": {}, "outputs": [], @@ -497,18 +504,6 @@ " file.write(new_content)" ] }, - { - "cell_type": "markdown", - "id": "2a890c80", - "metadata": {}, - "source": [] - }, - { - "cell_type": "markdown", - "id": "255e0a06", - "metadata": {}, - "source": [] - }, { "cell_type": "markdown", "id": "fb69c797", @@ -517,7 +512,7 @@ "### 5 — Guardrails & Best Practices \n", "* **Crawl → Walk → Run**: start with a single agent, then expand into a swarm. \n", "* **Expose intermediate reasoning** (“show the math”) to build user trust. \n", - "* **Parameterise UX** so analysts can tweak report format and source mix. \n", + "* **Parameterize UX** so analysts can tweak report format and source mix. \n", "* **Native OpenAI tools first** (web browsing, file ingestion) before reinventing low‑level retrieval. " ] }, @@ -544,7 +539,7 @@ ], "metadata": { "kernelspec": { - "display_name": ".venv", + "display_name": "openai", "language": "python", "name": "python3" }, @@ -558,7 +553,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.13.1" + "version": "3.12.9" } }, "nbformat": 4, diff --git a/examples/agents_sdk/REPORT_DRAFT.md b/examples/agents_sdk/REPORT_DRAFT.md index ebdd1fdc8e..9798aca68c 100644 --- a/examples/agents_sdk/REPORT_DRAFT.md +++ b/examples/agents_sdk/REPORT_DRAFT.md @@ -1,58 +1,62 @@ -# Artificial Intelligence in Healthcare: Five-Year Trend Analysis and Future Outlook +# Artificial Intelligence in Healthcare: A Five-Year Trend Analysis (2020 – 2025) -## Introduction +## 1. Introduction and Scope ### -Over the past five years, artificial intelligence (AI) has moved from experimental pilots to foundational infrastructure across clinical, administrative, and consumer-facing domains of healthcare. Large language models (LLMs), advanced computer vision, and generative architectures have accelerated performance gains while simultaneously raising new questions about safety, governance, and workforce realignment. Market analysts note that health systems now evaluate AI not as a peripheral innovation but as a core capability for cost containment, experience improvement, and population-health management. 
https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare +Over the past half-decade, artificial intelligence (AI) has evolved from a set of promising proofs-of-concept to a foundational technology across nearly every layer of healthcare delivery. Spanning machine learning (ML), deep learning (DL), natural language processing (NLP), computer vision, and hybrid approaches, AI systems now routinely assist in diagnostics, therapeutics, administration, and population health management. Market analysts estimate that the global AI-in-healthcare sector will expand from roughly USD 39 billion in 2025 to almost USD 491 billion by 2032, reflecting a 43 % compound annual growth rate and underscoring the strategic importance of continued innovation and governance in this domain https://www.fortunebusinessinsights.com/industry-reports/artificial-intelligence-in-healthcare-market-100534?utm_source=openai. -At the same time, employers and payers are pressuring providers to deliver personalized, digitally streamlined care that mirrors the frictionless experiences of technology giants. Generative AI sits at the center of these expectations, promising both predictive insight and operational efficiency—yet demanding rigorous oversight to avoid inaccuracies and data-security pitfalls. https://newsroom.cigna.com/top-health-care-trends-of-2025 +This report synthesizes peer-reviewed studies, market analyses, and recent case studies from 2020-2025 to map the trajectory of AI innovation. It evaluates advancements in ML/DL algorithms, NLP, medical imaging, surgical and drug-discovery applications, and operational use cases. Further, it critically examines regulatory, ethical, and organizational considerations, before proposing recommendations to accelerate responsible adoption. -## Machine Learning and Deep Learning Advances +## 2. Machine Learning & Deep Learning for Diagnostics ### -Deep learning techniques—particularly convolutional neural networks (CNNs) and transformers such as Vision Transformers (ViTs)—have markedly improved feature extraction across radiology, pathology, and multi-modal datasets. These architectures enable earlier disease detection, more granular stratification of tumor heterogeneity, and data-driven surgical planning, reinforcing AI’s role in precision medicine. https://pmc.ncbi.nlm.nih.gov/articles/PMC10740686/ +ML and DL have markedly improved diagnostic accuracy by learning complex patterns in multimodal data. Google’s DeepMind, for instance, achieved a 94.5 % accuracy rate in detecting eye diseases, outperforming seasoned specialists and illustrating the ability of convolutional neural networks to generalize across large ophthalmology datasets https://www.voiceoc.com/blogs/ai-healthcare-trends-for-future?utm_source=openai. Parallel advances in oncology leverage DL to analyze histopathology slides, predicting tumor aggressiveness and supporting precision medicine workflows. -Generative adversarial networks (GANs) now supplement limited clinical datasets by producing realistic synthetic images, thereby addressing sample-size constraints that historically hampered algorithm generalization. When combined with retrieval-augmented generation (RAG) pipelines, synthetic data also bolsters transparency by linking model outputs to verifiable source material. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare +Equally significant are predictive analytics models that comb electronic health records (EHRs) to identify high-risk patients. 
AI-driven tools now detect sepsis hours before clinical manifestation and forecast multiple-sclerosis progression months in advance, enabling prophylactic interventions and lowering readmission rates https://blog.medicai.io/en/future-of-medical-imaging/?utm_source=openai. These gains, however, demand rigorous validation to mitigate automation bias and ensure equitable performance across demographic subgroups. -## Natural Language Processing and Ambient Intelligence +## 3. Natural Language Processing & Conversational AI ### -Natural language processing (NLP) has matured from simple entity extraction to context-aware LLMs capable of composing full clinical notes through ambient listening. In exam rooms, speech-recognition engines capture conversational nuances between clinicians and patients, automatically converting them into structured documentation and reducing burnout associated with electronic health records (EHR) data entry. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare +NLP has transitioned from rule-based systems to transformer models capable of contextual understanding. Healthcare chatbots and virtual health assistants deliver 24/7 triage, medication adherence reminders, and mental-health coaching; the global chatbot market alone is on track for a 21.5 % CAGR, approaching USD 544 million by 2030 https://www.voiceoc.com/blogs/ai-healthcare-trends-for-future?utm_source=openai. On the provider side, ambient “digital scribe” platforms such as Heidi Health automatically generate SOAP notes and discharge summaries, now supporting over one million patient encounters weekly across several countries https://en.wikipedia.org/wiki/Heidi_Health?utm_source=openai. -Beyond documentation, generative AI personalizes engagement by predicting patient queries, surfacing preventive-care nudges, and tailoring behavioral-health content—a capability increasingly adopted by employer-sponsored health programs striving for consumer-grade experiences. Balancing these benefits with human oversight remains essential, given the risk of hallucinations and mis-triaged recommendations. https://newsroom.cigna.com/top-health-care-trends-of-2025 +Clinical documentation has further been enhanced by generative-AI models that outperform surgeons in drafting post-operative reports—cutting time, reducing omissions, and freeing clinicians for direct patient care https://www.reuters.com/business/healthcare-pharmaceuticals/health-rounds-ai-tops-surgeons-writing-post-operative-reports-2025-02-14/?utm_source=openai. Yet, integrating these models into electronic workflows raises questions about data provenance, liability, and clinician oversight. -## Medical Imaging Innovations +## 4. AI in Medical Imaging & Predictive Analytics ### -AI-driven imaging has delivered some of the clearest clinical wins, yet research reveals systemic obstacles that undermine reproducibility and equity. A comprehensive review points to dataset bias, publication incentives favoring marginal performance gains, and evaluation metrics that do not reflect bedside impact—all factors that can stall translation into routine practice. https://www.nature.com/articles/s41746-022-00592-y +Imaging has been one of AI’s earliest and most profound success stories. 
DL algorithms now match or surpass expert radiologists in detecting lung nodules, breast lesions, and cerebral hemorrhages, while real-time image fusion (e.g., PET-MRI) allows dynamic visualization of tumor metabolism alongside anatomical context https://www.forbes.com/councils/forbestechcouncil/2025/03/11/ai-driven-medical-imaging-revolutionizing-diagnostics-and-treatment/?utm_source=openai. Techniques such as Deep Tomographic Reconstruction reduce noise and radiation exposure, enabling low-dose CT and rapid MRI without sacrificing resolution https://en.wikipedia.org/wiki/Deep_Tomographic_Reconstruction?utm_source=openai. -Nevertheless, practical milestones abound. During the COVID-19 pandemic, automated segmentation of lung fields on chest X-rays facilitated rapid triage, while 3-D reconstructions integrated with printing technologies optimized orthopedic implant positioning. Such case studies demonstrate how properly validated algorithms can shorten diagnostic cycles and personalize interventions when built on diverse, well-curated data. https://pmc.ncbi.nlm.nih.gov/articles/PMC10740686/ +Predictive models built on longitudinal imaging plus clinical data anticipate strokes, cardiac events, and treatment responses. Integrations with EHRs provide clinicians with risk-stratified dashboards, accelerating time to intervention and enhancing population-health initiatives https://pmc.ncbi.nlm.nih.gov/articles/PMC11799571/?utm_source=openai. Nonetheless, scalability hinges on standardized data pipelines and interoperability standards, areas where many health systems still lag. -## Regulatory and Ethical Landscape +## 5. Surgical Robotics, Drug Discovery & Remote Monitoring ### -Escalating deployment has triggered intensified regulatory scrutiny in 2024-2025. Global frameworks remain fragmented, but common priorities include privacy preservation, algorithm explainability, and bias mitigation. Health-data breaches involving AI systems underscore the urgency of encryption, federated-learning designs, and strict access controls to protect sensitive information. https://www.alation.com/blog/ethics-of-ai-in-healthcare-privacy-bias-trust-2025/ +AI-assisted robotic platforms have catalyzed a new era of minimally invasive surgery. The AI-enabled surgical robotics market is projected to top USD 14.7 billion by 2026, with systems delivering sub-millimeter precision, reduced complication rates, and shortened recovery times https://www.voiceoc.com/blogs/ai-healthcare-trends-for-future?utm_source=openai. Outside the operating room, generative models expedite drug discovery; Insilico Medicine identified a fibrosis candidate molecule in just 46 days, exemplifying the compression of traditional R&D timelines https://www.voiceoc.com/blogs/ai-healthcare-trends-for-future?utm_source=openai. -Algorithmic bias is equally pressing. Skewed training sets can exacerbate disparities for marginalized populations, eroding trust in both institutions and technology. Inclusive data collection, continuous performance audits, and transparent communication are emerging as baseline requirements to secure regulatory approval and public confidence. https://www.alation.com/blog/ethics-of-ai-in-healthcare-privacy-bias-trust-2025/ +Remote patient monitoring (RPM) leverages AI to analyze data from wearables, flagging arrhythmias or glycemic excursions before they escalate. 
The RPM market is expanding 20 % annually toward USD 175 billion by 2027, signaling a shift from episodic to continuous care https://www.voiceoc.com/blogs/ai-healthcare-trends-for-future?utm_source=openai. Such distributed models offer scalability but demand robust cybersecurity and reimbursement frameworks. -## Operational Integration and Workflow Transformation +## 6. Operational & Administrative Transformation ### -Healthcare organizations that treat AI as a co-worker rather than a bolt-on tool realize measurable gains in throughput and quality. Machine-vision cameras linked to real-time alert systems, for instance, now monitor patient movement to preempt falls, allowing nurses to prioritize high-risk rooms without constant rounding. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare +Beyond clinical tasks, AI optimizes revenue-cycle management, scheduling, and supply-chain logistics. Wolters Kluwer highlights AI-enabled clinical-decision support and administrative automation as top healthcare technology trends for 2025, citing reductions in clinician burnout and operating costs https://www.wolterskluwer.com/en/news/wolters-kluwer-25-for-25-report-predicts-key-healthcare-technology-trends?utm_source=openai. -Strategically, providers align AI implementations with integrated-care models targeting high-cost conditions such as cardiodiabesity and musculoskeletal disease. Predictive algorithms identify early deterioration, while digital-engagement platforms deliver condition-specific education—cutting downstream utilization and enhancing patient satisfaction. https://newsroom.cigna.com/top-health-care-trends-of-2025 +India’s Apollo Hospitals illustrates tangible impact: AI-driven documentation and clinical-support tools free two to three hours of daily staff time, improving throughput amid nursing shortages and a planned 33 % bed-capacity expansion https://www.reuters.com/business/healthcare-pharmaceuticals/indias-apollo-hospitals-bets-ai-tackle-staff-workload-2025-03-13/?utm_source=openai. Venture investment echoes these gains; startups secured over USD 35 billion in the first half of 2024 for ambient documentation and revenue-cycle platforms https://www.futuremarketinsights.com/reports/artificial-intelligence-in-healthcare-market?utm_source=openai. -## Emerging Research Frontiers +## 7. Regulatory, Ethical & Data-Security Considerations ### -Synthetic data, once a stopgap for small cohorts, is maturing into a discipline focused on statistical fidelity and privacy assurance. Coupled with RAG methods, it offers a path to explainable generative outputs anchored in verifiable evidence. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare +Rapid deployment has intensified scrutiny on privacy, bias, and accountability. Scholars warn of “automation bias,” where clinicians over-rely on algorithmic output without adequate verification—a risk magnified in diagnostic AI systems that lack transparent reasoning pathways https://arxiv.org/abs/2502.16732?utm_source=openai. Concurrently, integrating AI with blockchain offers immutable audit trails, promoting HIPAA compliance and mitigating data-breach risks https://arxiv.org/abs/2501.02169?utm_source=openai. -Meanwhile, cross-modal transformers integrate imaging, genomics, and clinical notes, pointing toward holistic patient models capable of multi-task reasoning. 
Researchers are also experimenting with lightweight edge-AI deployments within the Internet of Medical Things (IoMT), enabling on-device inference for wearables and in-room sensors that reduce latency and preserve data locality. https://pmc.ncbi.nlm.nih.gov/articles/PMC10740686/ +Global regulators are responding with draft frameworks that emphasize human-in-the-loop oversight, performance monitoring, and equitable dataset curation. Ethical guidelines also stress the need to evaluate social determinants of health to prevent perpetuating disparities, particularly in mental-health AI applications https://en.wikipedia.org/wiki/Artificial_intelligence_in_mental_health?utm_source=openai. Still, absence of a clear liability model for AI-assisted diagnosis remains a formidable barrier to widespread clinical trust. -## Recommendations for Future Innovation +## 8. Case Studies Demonstrating Real-World Impact ### -1. Strengthen Data Governance: Adopt federated learning, robust anonymization, and role-based access to mitigate privacy risks while maximizing dataset diversity. https://www.alation.com/blog/ethics-of-ai-in-healthcare-privacy-bias-trust-2025/ +Several implementations confirm AI’s capacity to enhance outcomes and efficiency. The University of Pittsburgh and Leidos channeled USD 10 million into AI tools that cut leukemia reporting times and support underserved communities, showcasing how generative models can reduce diagnostic disparities https://www.axios.com/local/pittsburgh/2025/04/18/pitt-leidos-use-ai-to-fight-cancer-and-health-disparities?utm_source=openai. Meanwhile, UK-based C2-Ai applies risk stratification and behavioral coaching to surgical wait-lists, lowering complication rates sixfold and halving readmissions across 2,000 patients https://www.ft.com/content/37b79af4-116f-46e5-9bbd-b814aa4c95af?utm_source=openai. -2. Prioritize Clinical Relevance: Incentivize studies that measure patient outcomes instead of benchmark scores, aligning academic success with bedside impact. https://www.nature.com/articles/s41746-022-00592-y +Imaging networks also benefit: Mercy health system integrated Aidoc’s aiOS platform to triage radiology studies, accelerating time-critical findings and standardizing workflows across sites https://en.wikipedia.org/wiki/Aidoc?utm_source=openai. Collectively, these cases evidence financial and clinical ROI, yet emphasize the necessity for robust change-management strategies and continuous algorithm auditing. -3. Embrace Retrieval-Augmented Generation: Pair generative models with curated knowledge bases to reduce hallucinations and improve auditability of AI-produced content. https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare - -4. Foster Multidisciplinary Teams: Combine data scientists, clinicians, ethicists, and operations leaders to ensure solutions address workflow realities and ethical obligations. https://newsroom.cigna.com/top-health-care-trends-of-2025 - -## Conclusion +## 9. Emerging Research Directions & Recommendations ### -AI’s trajectory in healthcare over the last half-decade reflects a maturing ecosystem: algorithms are more powerful, deployment scenarios more diverse, and governance structures more sophisticated. Yet success hinges on reconciling technical capability with ethical stewardship and operational pragmatism. 
Addressing bias, fortifying privacy, and aligning incentives toward demonstrable patient benefit will dictate whether AI continues as a transformative force or stalls under the weight of unresolved challenges. The next five years will reward organizations that combine rigorous science, transparent governance, and patient-centered design to unlock AI’s full potential in advancing global health. +Next-generation AI research is converging on multimodal foundation models capable of ingesting text, images, genomics, and physiologic signals to produce holistic patient representations. Combining such models with federated-learning architectures may enable cross-institutional insights while preserving data sovereignty https://www.voiceoc.com/blogs/ai-healthcare-trends-for-future?utm_source=openai. Additionally, explainable-AI (XAI) techniques, from saliency maps to counterfactual reasoning, are poised to enhance clinician trust and regulatory compliance. + +To sustain momentum, stakeholders should: +1. Establish standardized benchmarks and real-world evidence repositories for continual algorithm validation. +2. Incentivize development of bias-mitigation toolkits and ethical-impact assessments. +3. Expand reimbursement pathways for AI-enabled preventive care and RPM. +4. Foster public-private partnerships to pilot blockchain-secured data-exchange frameworks. +5. Cultivate AI literacy among clinicians through integrated education modules, ensuring human oversight remains central to patient care. +By embedding these principles, the healthcare ecosystem can harness AI’s transformative potential while safeguarding equity, safety, and trust. \ No newline at end of file diff --git a/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/query_expansion_agent.py b/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/query_expansion_agent.py index 7fdcfd3472..9042dc4b49 100644 --- a/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/query_expansion_agent.py +++ b/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/query_expansion_agent.py @@ -53,7 +53,7 @@ class QueryExpansionAgent: """ - def __init__(self, *, model: str = "o3-mini", tools: list | None = None, name: str | None = None, + def __init__(self, *, model: str = "o4-mini", tools: list | None = None, name: str | None = None, instructions: str | None = None, input_guardrails: list | None = None): # Initialise the underlying `agents.Agent` with a structured `output_type` so it diff --git a/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/web_page_summary_agent.py b/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/web_page_summary_agent.py index 39c8e2bc20..46c991a6d6 100644 --- a/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/web_page_summary_agent.py +++ b/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/web_page_summary_agent.py @@ -17,7 +17,7 @@ def __init__( search_term: str, character_limit: int = 1000, *, - model: str = "gpt-4o", + model: str = "gpt-4.1", tools: list | None = None, name: str | None = None, instructions: str | None = None, diff --git a/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/web_search_terms_generation_agent.py b/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/web_search_terms_generation_agent.py index 0484ce117f..5e23a6f853 100644 --- 
a/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/web_search_terms_generation_agent.py +++ b/examples/agents_sdk/ai_research_assistant_resources/agents_tools_registry/web_search_terms_generation_agent.py @@ -36,7 +36,7 @@ def __init__( self, num_search_terms: int = _NUM_SEARCH_TERMS, *, - model: str = "gpt-4o", + model: str = "gpt-4.1", tools: list | None = None, name: str | None = None, instructions: str | None = None, diff --git a/examples/agents_sdk/ai_research_assistant_resources/guardrails/topic_content_guardrail.py b/examples/agents_sdk/ai_research_assistant_resources/guardrails/topic_content_guardrail.py index 74b6209d80..ec67fbb9e9 100644 --- a/examples/agents_sdk/ai_research_assistant_resources/guardrails/topic_content_guardrail.py +++ b/examples/agents_sdk/ai_research_assistant_resources/guardrails/topic_content_guardrail.py @@ -16,10 +16,13 @@ # 1. Tiny classifier agent → “Is this prompt about AI?” # --------------------------------------------------------------------------- + class TopicCheckOutput(BaseModel): """Structured result returned by the classifier.""" - is_about_ai: bool # True → prompt is AI-related - reasoning: str # short rationale (useful for logs) + + is_about_ai: bool # True → prompt is AI-related + reasoning: str # short rationale (useful for logs) + topic_guardrail_agent = Agent( name="Topic guardrail (AI)", @@ -30,7 +33,7 @@ class TopicCheckOutput(BaseModel): "policy, or market trends. " "Return is_about_ai = false for all other domains (finance, biology, history, etc.)." ), - model="gpt-4o-mini", # lightweight, fast + model="gpt-4.1-mini", # lightweight, fast output_type=TopicCheckOutput, ) @@ -38,8 +41,9 @@ class TopicCheckOutput(BaseModel): # 2. Guardrail function (decorated) that wraps the classifier # --------------------------------------------------------------------------- + @input_guardrail -async def ai_topic_guardrail( +async def ai_topic_guardrail( ctx: RunContextWrapper[None], agent: Agent, input: str | List[TResponseInputItem], @@ -53,5 +57,6 @@ async def ai_topic_guardrail( return output + # Optional: tidy public surface -__all__ = ["ai_topic_guardrail", "TopicCheckOutput"] \ No newline at end of file +__all__ = ["ai_topic_guardrail", "TopicCheckOutput"] diff --git a/examples/agents_sdk/ai_research_assistant_resources/utils/web_search_and_util.py b/examples/agents_sdk/ai_research_assistant_resources/utils/web_search_and_util.py index e3ca75a168..0e58b9fa22 100644 --- a/examples/agents_sdk/ai_research_assistant_resources/utils/web_search_and_util.py +++ b/examples/agents_sdk/ai_research_assistant_resources/utils/web_search_and_util.py @@ -1,209 +1,44 @@ -# web_search_and_util.py - -from bs4 import BeautifulSoup -import requests -from dotenv import load_dotenv -import os - -load_dotenv('.env') - -api_key = os.getenv('API_KEY') -cse_id = os.getenv('CSE_ID') - -TRUNCATE_SCRAPED_TEXT = 50000 # Adjust based on your model's context window -SEARCH_DEPTH = 2 # Default depth for Google Custom Search queries - -# ------------------------------------------------------------------ -# Optional: patch asyncio to allow nested event loops (e.g., inside Jupyter) -# ------------------------------------------------------------------ - -try: - import nest_asyncio # type: ignore - - # ``nest_asyncio`` monkey-patches the running event-loop so that further - # calls to ``asyncio.run`` or ``loop.run_until_complete`` do **not** raise - # ``RuntimeError: This event loop is already running``. 
This makes the - # synchronous helper functions below safe to call in notebook cells while - # still working unchanged in regular Python scripts. - - nest_asyncio.apply() -except ImportError: # pragma: no cover - # ``nest_asyncio`` is an optional dependency. If it is unavailable we - # simply skip patching – the helper functions will still work in regular - # Python scripts but may raise ``RuntimeError`` when called from within - # environments that already run an event-loop (e.g., Jupyter). - pass - -def search(search_item, api_key, cse_id, search_depth=SEARCH_DEPTH, site_filter=None): - service_url = 'https://www.googleapis.com/customsearch/v1' - - params = { - 'q': search_item, - 'key': api_key, - 'cx': cse_id, - 'num': search_depth - } - - if api_key is None or cse_id is None: - raise ValueError("API key and CSE ID are required") - - try: - response = requests.get(service_url, params=params) - response.raise_for_status() - results = response.json() - - # ------------------------------------------------------------------ - # Robust handling – always return a *list* (never ``None``) - # ------------------------------------------------------------------ - items = results.get("items", []) - - # Optional site filtering - if site_filter: - items = [itm for itm in items if site_filter in itm.get("link", "")] - if not items: - print(f"No results with {site_filter} found.") - - # Graceful handling of empty results - if not items: - print("No search results found.") - return [] - - return items - - except requests.exceptions.RequestException as e: - print(f"An error occurred during the search: {e}") - return [] - - - - -def retrieve_content(url, max_tokens=TRUNCATE_SCRAPED_TEXT): - try: - headers = {'User-Agent': 'Mozilla/5.0'} - response = requests.get(url, headers=headers, timeout=10) - response.raise_for_status() - - soup = BeautifulSoup(response.content, 'html.parser') - for script_or_style in soup(['script', 'style']): - script_or_style.decompose() - - text = soup.get_text(separator=' ', strip=True) - characters = max_tokens * 4 # Approximate conversion - text = text[:characters] - return text - except requests.exceptions.RequestException as e: - print(f"Failed to retrieve {url}: {e}") - return None - - - -async def get_search_results(search_items, search_term: str, character_limit: int = 500): - # Generate a summary of search results for the given search term - results_list = [] - for idx, item in enumerate(search_items, start=1): - url = item.get('link') - - snippet = item.get('snippet', '') - web_content = retrieve_content(url, TRUNCATE_SCRAPED_TEXT) - - if web_content is None: - print(f"Error: skipped URL: {url}") - else: - summary = summarize_content(web_content, search_term, character_limit) - result_dict = { - 'order': idx, - 'link': url, - 'title': snippet, - 'Summary': summary - } - results_list.append(result_dict) - return results_list - -# ------------------------------------------------------------------ -# Helper using WebPageSummaryAgent for content summarisation -# ------------------------------------------------------------------ -# NOTE: -# ``WebPageSummaryAgent`` is an agent wrapper that internally spins up an -# ``agents.Agent`` instance with the correct system prompt for Web page -# summarisation. Because the ``task`` method on the wrapper is *async*, we -# provide a small synchronous wrapper that takes care of running the coroutine -# irrespective of whether the caller is inside an active event-loop (e.g. -# Jupyter notebooks) or not. 
- -from ai_research_assistant_resources.agents_tools_registry.web_page_summary_agent import WebPageSummaryAgent -import asyncio - - -def summarize_content(content: str, search_term: str, character_limit: int = 2000) -> str: # noqa: D401 - - # Instantiate the agent with the dynamic instructions. - agent = WebPageSummaryAgent(search_term=search_term, character_limit=character_limit) +from openai import OpenAI + +client = OpenAI() + + +def openai_web_search( + query: str, + model: str = "gpt-4.1", + search_context_size: str = "high", +) -> dict: + resp = client.responses.create( + model=model, + tools=[ + {"type": "web_search_preview", "search_context_size": search_context_size} + ], + input=f"Search the web for the following information and provide citations: {query}", + ) - # Run the agent task, making sure we properly handle the presence (or - # absence) of an already-running event-loop. - try: - return asyncio.run(agent.task(content)) - except RuntimeError: - # We are *probably* inside an existing event-loop (common in notebooks - # or async frameworks). In that case fall back to using the current - # loop instead of creating a new one. - loop = asyncio.get_event_loop() - return loop.run_until_complete(agent.task(content)) + answer = "" + citations = [] + for item in resp.output: + if item.type == "message": + for part in item.content: + if part.type == "output_text": + answer = part.text + for ann in part.annotations or []: + if ann.type == "url_citation": + citations.append( + { + "url": ann.url, + "title": getattr(ann, "title", None), + "start_index": getattr(ann, "start_index", None), + "end_index": getattr(ann, "end_index", None), + } + ) -# ------------------------------------------------------------------ -# High-level convenience API -# ------------------------------------------------------------------ + return {"answer": answer, "citations": citations} def get_results_for_search_term( - search_term: str, - *, - character_limit: int = 2000, - search_depth: int = SEARCH_DEPTH, - site_filter: str | None = None, -) -> list[dict]: - """Search the Web for *search_term* and return enriched result dictionaries. - - The function handles the entire workflow: - - 1. Perform a Google Custom Search using the provided credentials. - 2. Retrieve and clean the contents of each result page. - 3. Generate a concise summary of each page focused on *search_term* using - :pyfunc:`summarize_content`. - - The returned value is a ``list`` of ``dict`` objects with the following - keys: ``order``, ``link``, ``title`` and ``Summary``. - """ - - # Step 1 – search. - search_items = search( - search_term, - api_key=api_key, - cse_id=cse_id, - search_depth=search_depth, - site_filter=site_filter, - ) - - # Step 2 & 3 – scrape pages and summarise. - # ``get_search_results`` is an *async* coroutine. Execute it and - # return its result, transparently handling the presence (or absence) - # of an already-running event loop (e.g. in notebooks). - - try: - # Prefer ``asyncio.run`` which creates and manages a fresh event - # loop. This is the most robust option for regular Python - # scripts. - import asyncio # local import to avoid polluting module top-level - - return asyncio.run( - get_search_results(search_items, search_term, character_limit) - ) - except RuntimeError: - # We probably find ourselves inside an existing event loop (for - # instance when this helper is invoked from within a Jupyter - # notebook). Fall back to re-using the current loop. 
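# A minimal, self-contained sketch of the "run a coroutine from scripts and notebooks alike"
# pattern that the comments above describe. The coroutine below is a hypothetical stand-in for
# an agent call such as WebPageSummaryAgent.task(...); the notebook fallback additionally
# assumes nest_asyncio.apply() has already patched the running event loop.
import asyncio


async def _example_task() -> str:
    # Placeholder for an async agent call that returns a summary string.
    return "summary"


def run_sync(coro):
    """Run *coro* whether or not an event loop is already running."""
    try:
        # Regular Python script: no loop is running yet, so create a fresh one.
        return asyncio.run(coro)
    except RuntimeError:
        # An event loop is already running (e.g. inside Jupyter); reuse it.
        # This path relies on nest_asyncio having been applied beforehand.
        return asyncio.get_event_loop().run_until_complete(coro)


print(run_sync(_example_task()))  # prints "summary"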
- loop = asyncio.get_event_loop() - return loop.run_until_complete( - get_search_results(search_items, search_term, character_limit) - ) \ No newline at end of file + search_term: str, *, search_context_size: str = "high" +) -> dict: + return openai_web_search(search_term, search_context_size=search_context_size) diff --git a/examples/agents_sdk/research_results.json b/examples/agents_sdk/research_results.json index fccddd7e02..752cc36643 100644 --- a/examples/agents_sdk/research_results.json +++ b/examples/agents_sdk/research_results.json @@ -1,38 +1,197 @@ [ - [ - { - "order": 1, - "link": "https://healthtechmagazine.net/article/2025/01/overview-2025-ai-trends-healthcare", - "title": "Jan 6, 2025 ... Which AI Solutions Will Healthcare Organizations Adopt in 2025? · Ambient Listening Reduces Clinical Documentation · Pushing for Increased ...", - "Summary": "In 2025, AI is significantly influencing healthcare, driven by the widespread adoption of generative AI, primarily through large language models (LLMs). Healthcare organizations are increasingly exploring AI to enhance clinical and administrative workflows, improve patient care, and achieve cost efficiencies. Notable trends include the utilization of ambient listening technologies, where machine learning-powered audio solutions reduce clinical documentation burdens by analyzing patient-provider conversations for clinical notes.\n\nMore healthcare institutions are experimenting with retrieval-augmented generation (RAG) to increase the accuracy and transparency of AI. RAG combines traditional databases with LLMs, enhancing the accuracy of generative AI tools by accessing up-to-date organizational data. Synthetic data development also garners interest for improving AI testing and validation, underlining a trend towards better model assurance and performance scrutiny.\n\nMachine vision is employed to enhance patient care with sensors and cameras in patient rooms, alerting staff to patient movements and potential fall risks, thereby improving proactive care and clinical workflows. As these AI technologies advance, they are expected to integrate more seamlessly into Internet of Medical Things (IoMT) systems.\n\nRegulatory scrutiny is anticipated to increase, necessitating a balance between innovation and compliance with new and existing regulations, such as interoperability standards set by health information technology governance.\n\nEffective AI adoption requires robust IT infrastructure and data governance to ensure smooth integration and maximize return on investment. Organizations must manage AI implementation carefully, aligning solutions with clear business needs and ensuring cultural readiness.\n\nEngaging with experienced technology partners can aid healthcare organizations in preparing for AI adoption, ensuring initiatives are sustainable and beneficial. AI's role in healthcare is poised to expand further, driving efficiency, improving patient outcomes, and reshaping the industry's technological landscape." - }, - { - "order": 2, - "link": "https://newsroom.cigna.com/top-health-care-trends-of-2025", - "title": "Jan 2, 2025 ... Additionally, generative AI will play a pivotal role in shaping strategy and growth within the health care industry. A study published in the ...", - "Summary": "In 2025, the healthcare industry is poised for transformative changes driven by key trends, especially impacting U.S. employers. 
One major trend is the emphasis on enhancing customer experience in healthcare, mirroring expectations set by consumer brands like Apple and Amazon. Personalization and seamless digital interactions will be critical, facilitated by advanced AI that predicts patient needs and suggests preventive measures. Such digital transformations aim to streamline healthcare processes and improve patient engagement and outcomes.\n\nGenerative AI is another pivotal trend, poised to significantly shape strategy and growth in healthcare. It's expected to enhance diagnostic accuracy and efficiency, though balancing AI use with human oversight is vital due to potential inaccuracies. Data security and legal governance around AI applications will be crucial, as organizations incorporate these technologies to improve efficiency and decision-making.\n\nBehavioral health care will see an evolution focusing on personalization, navigation, and measurement-based care, addressing the rising mental health needs in the U.S. Integrating mental health into primary care and expanding virtual behavioral care will enhance access and support for diverse populations, including young people and their families.\n\nThe focus on clinical excellence, particularly in women’s health and condition-specific care, will continue to grow. Integrated care models will aim to reduce costs and improve outcomes through personalized, holistic care. For high-cost conditions like musculoskeletal issues and cardiodiabesity, early identification and personalized treatment will be prioritized, supported by predictive models and digital engagement tools.\n\nOverall, these trends promise a more dynamic healthcare environment, pushing employers to adopt innovative strategies to mitigate rising costs and improve employee health outcomes." - } - ], - [ - { - "order": 1, - "link": "https://www.nature.com/articles/s41746-022-00592-y", - "title": "Apr 12, 2022 ... Research in computer analysis of medical images bears many promises to improve patients' health. However, a number of systematic challenges ...", - "Summary": "The article addresses challenges and future directions in applying machine learning (ML) to medical imaging. It highlights several systemic issues, such as data biases, publication biases, and methodological shortcomings that impede progress. These issues include the reliance on datasets that do not fully reflect clinical environments, leading to biases and overfitting. Many studies focus on achieving high performance on benchmarks rather than addressing clinical needs, with evaluation processes often failing to capture practical significance.\n\nThe review outlines the misalignment between the abundance of research and real-world clinical impact. It criticizes the incentive structures that prioritize novel methods over clinically relevant improvements, and points out that many advancements in ML for medical imaging offer marginal gains overshadowed by evaluation noise.\n\nSeveral recommendations are made to improve the field. These include using larger, more diverse datasets, strengthening evaluation methods, and ensuring that model performance translates into clinical benefits. The article advocates for better documentation and transparency in publishing, suggesting that complex methods should not obscure reproducibility. Encouraging registered reports and focusing on patient outcomes rather than mere predictive accuracy are also recommended. 
Implementing these changes could help align ML research with clinical objectives, ultimately enhancing patient care." - }, - { - "order": 2, - "link": "https://pmc.ncbi.nlm.nih.gov/articles/PMC10740686/", - "title": "The innovation segment explores cutting-edge developments in AI, such as deep learning algorithms, convolutional neural networks, and generative adversarial ...", - "Summary": "The integration of machine learning, particularly deep learning and AI, has significantly advanced medical imaging, transforming healthcare delivery. Innovations like deep learning algorithms, convolutional neural networks (CNNs), and generative adversarial networks (GANs) have enhanced the accuracy and efficiency of medical image analysis, enabling rapid and precise detection of abnormalities such as tumors from radiological exams. These technologies have revolutionized early disease detection, personalized treatment planning, and improved patient outcomes.\n\nKey advancements include the use of CNNs and transformers, allowing for improved feature extraction from medical images, applicable to diverse modalities like CT, MRI, and PET scans. Transformers, such as vision transformers (ViTs), offer enhanced modeling capabilities for medical image processing, capturing complex patterns and relationships. These have been critical in applications ranging from early disease detection to surgical planning.\n\nGANs and other generative models have enabled the creation of synthetic medical images, addressing the challenges of limited datasets in medical imaging, enhancing model training, and improving diagnostic accuracy. Their applications range from synthesizing new image data for training AI models to improving image resolution and quality, thus broadening AI adoption in clinical practice.\n\nMoreover, AI has played a pivotal role in enhancing image segmentation, essential for precision in detecting disease and planning treatments. Innovations in image processing, such as the segmentation of lung fields on X-rays, have improved disease screening, notably during the COVID-19 pandemic.\n\nAI's role in surgical planning is equally transformative, allowing for the accurate modeling of anatomical structures and facilitating customized interventions. This integration with technologies like 3D printing has further enhanced surgical precision.\n\nThe application of machine learning to medical imaging continues to expand, with ongoing research making strides in improving diagnostic efficiency, supporting clinical decisions, and personalizing patient care, promising further impact on healthcare outcomes." - } - ], - [ - { - "order": 1, - "link": "https://www.alation.com/blog/ethics-of-ai-in-healthcare-privacy-bias-trust-2025/", - "title": "Jan 15, 2025 ... Ethics of AI in Healthcare: Navigating Privacy, Bias, and Trust in 2025 · Unauthorized access: Data breaches and cyberattacks on AI systems put ...", - "Summary": "In 2025, the integration of AI in healthcare is rapidly evolving, presenting significant ethical challenges related to privacy, bias, and trust. Safeguarding patient data is critical, as AI systems require vast amounts of sensitive information, heightening the risk of unauthorized access, data breaches, and misuse. Strategies like data anonymization, encryption, and regulatory oversight are essential to mitigate these privacy risks.\n\nAlgorithmic bias is another pressing concern. 
AI systems often reflect biases present in the data they learn from, leading to disparities in healthcare outcomes, particularly affecting marginalized populations. To combat this, inclusive data collection and continuous monitoring are necessary to ensure AI tools provide equitable care.\n\nBuilding trust in AI technologies is also vital. Patients express concerns about device reliability, lack of transparency, and data privacy. Healthcare organizations can build trust by offering transparent communication, ensuring regulatory safeguards, and educating providers about AI's role in enhancing, not replacing, human judgment.\n\nRegulatory frameworks remain fragmented globally, posing challenges for establishing consistent ethical standards. Effective AI governance in healthcare requires collaborative oversight, patient-centered policies, and stronger industry standards to ensure that AI systems improve patient outcomes and adhere to ethical practices.\n\nThe focus on purpose-built AI tools highlights the necessity for clear regulations and real-world efficacy testing to ensure these technologies deliver tangible improvements in healthcare. Strengthening regulations and industry-led standards can play a pivotal role in aligning AI applications with ethical guidelines.\n\nLooking forward, AI has the potential to revolutionize healthcare by enhancing diagnostics, treatments, and operational efficiencies. However, these advancements must be grounded in robust ethical considerations to foster fairness and equity. Enhanced equity, improved transparency, and stronger governance are crucial to realizing AI's promise while safeguarding patient rights and trust in healthcare systems." - } - ] + { + "answer": "Artificial Intelligence (AI) is profoundly transforming the healthcare sector, with several key trends emerging in 2025:\n\n**1. AI-Powered Diagnostics**\n\nAI algorithms are enhancing diagnostic accuracy by analyzing medical images and patient data. For instance, Google's DeepMind achieved a 94.5% accuracy rate in diagnosing eye diseases, surpassing human specialists. ([voiceoc.com](https://www.voiceoc.com/blogs/ai-healthcare-trends-for-future?utm_source=openai))\n\n**2. Conversational AI and Virtual Health Assistants**\n\nAI-driven chatbots and virtual assistants are improving patient engagement by providing 24/7 support, managing appointments, and offering personalized health information. The global healthcare chatbot market is projected to grow at a 21.5% CAGR, reaching approximately $543.65 million by 2030. ([voiceoc.com](https://www.voiceoc.com/blogs/ai-healthcare-trends-for-future?utm_source=openai))\n\n**3. Generative AI in Drug Discovery**\n\nGenerative AI is accelerating drug discovery by predicting molecular behaviors and simulating patient populations, reducing development time and costs. For example, Insilico Medicine developed a potential drug for fibrosis in just 46 days using AI. ([voiceoc.com](https://www.voiceoc.com/blogs/ai-healthcare-trends-for-future?utm_source=openai))\n\n**4. Remote Patient Monitoring and Telemedicine**\n\nAI enhances remote patient monitoring by analyzing data from wearables to predict health issues, enabling early interventions. The remote patient monitoring market is expected to grow 20% annually, reaching $175.2 billion by 2027. ([voiceoc.com](https://www.voiceoc.com/blogs/ai-healthcare-trends-for-future?utm_source=openai))\n\n**5. AI in Surgery**\n\nAI-powered robotic systems are assisting in surgeries, improving precision and reducing complications. 
The AI-assisted surgical robotics market is projected to reach $14.7 billion by 2026. ([voiceoc.com](https://www.voiceoc.com/blogs/ai-healthcare-trends-for-future?utm_source=openai))\n\n**6. Predictive Analytics for Disease Prevention**\n\nAI-driven predictive analytics are identifying individuals at risk of diseases, enabling proactive interventions. For instance, AI models can analyze behavioral and clinical data to predict and prevent suicide risks. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Artificial_intelligence_in_mental_health?utm_source=openai))\n\n**7. AI in Healthcare Administration**\n\nAI is streamlining administrative tasks such as clinical documentation and patient scheduling, reducing clinician burnout. In 2025, AI is expected to further integrate into healthcare workflows, enhancing efficiency and productivity. ([wolterskluwer.com](https://www.wolterskluwer.com/en/news/wolters-kluwer-25-for-25-report-predicts-key-healthcare-technology-trends?utm_source=openai))\n\n**8. AI in Mental Health**\n\nAI applications in mental health include chatbots providing cognitive behavioral therapy and predictive analytics identifying individuals at risk of mental health disorders, facilitating early interventions. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Artificial_intelligence_in_mental_health?utm_source=openai))\n\n**9. AI and Blockchain for Data Security**\n\nIntegrating AI with blockchain technology enhances the security and privacy of healthcare data, ensuring compliance with regulations like HIPAA. This combination addresses challenges related to data breaches and unauthorized access. ([arxiv.org](https://arxiv.org/abs/2501.02169?utm_source=openai))\n\n**10. Ethical Considerations and AI Regulation**\n\nAs AI adoption grows, addressing ethical concerns such as data privacy, algorithmic bias, and regulatory compliance becomes crucial. Establishing robust ethical frameworks is essential for responsible AI implementation in healthcare. ([arxiv.org](https://arxiv.org/abs/2501.01639?utm_source=openai))\n\nThese trends highlight AI's transformative impact on healthcare, offering opportunities to improve patient outcomes, enhance operational efficiency, and address critical challenges in the industry.\n\n\n## Recent Developments in AI Transforming Healthcare:\n- [How we can use AI to create a better society](https://www.ft.com/content/33ed8ad0-f8ad-42ed-983a-54d5b9eb2d27?utm_source=openai)\n- [Our Healthcare System Is Broken. 
Can Technology Help Heal It?](https://time.com/7203635/our-healthcare-system-is-broken-can-technology-help-heal-it/?utm_source=openai)\n- [Health Rounds: AI tops surgeons in writing post-operative reports](https://www.reuters.com/business/healthcare-pharmaceuticals/health-rounds-ai-tops-surgeons-writing-post-operative-reports-2025-02-14/?utm_source=openai) ", + "citations": [ + { + "url": "https://www.voiceoc.com/blogs/ai-healthcare-trends-for-future?utm_source=openai", + "title": "Top AI Trends in Healthcare 2025: Transforming the Future of Medical Innovation - Voiceoc", + "start_index": 368, + "end_index": 464 + }, + { + "url": "https://www.voiceoc.com/blogs/ai-healthcare-trends-for-future?utm_source=openai", + "title": "Top AI Trends in Healthcare 2025: Transforming the Future of Medical Innovation - Voiceoc", + "start_index": 816, + "end_index": 912 + }, + { + "url": "https://www.voiceoc.com/blogs/ai-healthcare-trends-for-future?utm_source=openai", + "title": "Top AI Trends in Healthcare 2025: Transforming the Future of Medical Innovation - Voiceoc", + "start_index": 1203, + "end_index": 1299 + }, + { + "url": "https://www.voiceoc.com/blogs/ai-healthcare-trends-for-future?utm_source=openai", + "title": "Top AI Trends in Healthcare 2025: Transforming the Future of Medical Innovation - Voiceoc", + "start_index": 1583, + "end_index": 1679 + }, + { + "url": "https://www.voiceoc.com/blogs/ai-healthcare-trends-for-future?utm_source=openai", + "title": "Top AI Trends in Healthcare 2025: Transforming the Future of Medical Innovation - Voiceoc", + "start_index": 1892, + "end_index": 1988 + }, + { + "url": "https://en.wikipedia.org/wiki/Artificial_intelligence_in_mental_health?utm_source=openai", + "title": "Artificial intelligence in mental health", + "start_index": 2259, + "end_index": 2369 + }, + { + "url": "https://www.wolterskluwer.com/en/news/wolters-kluwer-25-for-25-report-predicts-key-healthcare-technology-trends?utm_source=openai", + "title": "Wolters Kluwer “25 for ‘25” report predicts key healthcare technology trends | Wolters Kluwer", + "start_index": 2645, + "end_index": 2797 + }, + { + "url": "https://en.wikipedia.org/wiki/Artificial_intelligence_in_mental_health?utm_source=openai", + "title": "Artificial intelligence in mental health", + "start_index": 3035, + "end_index": 3145 + }, + { + "url": "https://arxiv.org/abs/2501.02169?utm_source=openai", + "title": "The Integration of Blockchain and Artificial Intelligence for Secure Healthcare Systems", + "start_index": 3424, + "end_index": 3489 + }, + { + "url": "https://arxiv.org/abs/2501.01639?utm_source=openai", + "title": "Implications of Artificial Intelligence on Health Data Privacy and Confidentiality", + "start_index": 3775, + "end_index": 3840 + }, + { + "url": "https://www.ft.com/content/33ed8ad0-f8ad-42ed-983a-54d5b9eb2d27?utm_source=openai", + "title": "How we can use AI to create a better society", + "start_index": 4098, + "end_index": 4227 + }, + { + "url": "https://time.com/7203635/our-healthcare-system-is-broken-can-technology-help-heal-it/?utm_source=openai", + "title": "Our Healthcare System Is Broken. 
Can Technology Help Heal It?", + "start_index": 4230, + "end_index": 4398 + }, + { + "url": "https://www.reuters.com/business/healthcare-pharmaceuticals/health-rounds-ai-tops-surgeons-writing-post-operative-reports-2025-02-14/?utm_source=openai", + "title": "Health Rounds: AI tops surgeons in writing post-operative reports", + "start_index": 4401, + "end_index": 4621 + } + ] + }, + { + "answer": "Between 2020 and 2025, artificial intelligence (AI) has significantly advanced diagnostic imaging and predictive analytics, enhancing disease detection, treatment planning, and patient outcomes.\n\n**Advancements in AI for Diagnostic Imaging:**\n\n1. **Enhanced Image Interpretation:** AI algorithms, particularly deep learning models, have improved the accuracy and speed of interpreting medical images. For instance, convolutional neural networks (CNNs) now achieve diagnostic accuracy comparable to senior radiologists for specific pathologies, such as detecting lung nodules and breast cancer lesions. ([forbes.com](https://www.forbes.com/councils/forbestechcouncil/2025/03/11/ai-driven-medical-imaging-revolutionizing-diagnostics-and-treatment/?utm_source=openai))\n\n2. **Real-Time Imaging and Image Fusion:** AI has facilitated real-time imaging techniques and image fusion, allowing for dynamic visualization of physiological processes. This includes advancements in 4D imaging, capturing moving structures like blood flow patterns through cardiac cycles, and hybrid imaging modalities like PET-MRI fusion, enhancing tumor metabolism assessment and anatomical mapping. ([blog.medicai.io](https://blog.medicai.io/en/future-of-medical-imaging/?utm_source=openai))\n\n3. **Automated Image Analysis:** AI-driven tools have automated the analysis of medical images, reducing radiologists' workload and improving diagnostic consistency. For example, AI systems can prioritize urgent cases by detecting critical findings in imaging studies, leading to faster intervention. ([forbes.com](https://www.forbes.com/councils/forbestechcouncil/2025/03/11/ai-driven-medical-imaging-revolutionizing-diagnostics-and-treatment/?utm_source=openai))\n\n4. **Improved Imaging Techniques:** AI has enhanced imaging techniques by reducing noise and improving resolution, resulting in clearer images with lower radiation doses. This is particularly beneficial in modalities like low-dose CT and fast MRI, where traditional methods faced challenges with image quality. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Deep_Tomographic_Reconstruction?utm_source=openai))\n\n**Advancements in AI for Predictive Analytics:**\n\n1. **Early Disease Detection:** AI models have been developed to predict disease progression by analyzing patterns in imaging data. For instance, machine learning algorithms can forecast multiple sclerosis progression with high accuracy months in advance, enabling early intervention. ([blog.medicai.io](https://blog.medicai.io/en/future-of-medical-imaging/?utm_source=openai))\n\n2. **Personalized Treatment Planning:** AI has been instrumental in developing personalized treatment plans by analyzing patient-specific data. In oncology, AI systems assess tumor characteristics and predict responses to therapies, facilitating tailored treatment strategies. ([forbes.com](https://www.forbes.com/councils/forbestechcouncil/2025/03/11/ai-driven-medical-imaging-revolutionizing-diagnostics-and-treatment/?utm_source=openai))\n\n3. 
**Risk Prediction Models:** AI-driven predictive models analyze various data sources, including imaging and clinical records, to identify patients at risk of adverse events like strokes or heart attacks. This proactive approach allows for timely preventive measures. ([time.com](https://time.com/6227623/ai-medical-imaging-radiology/?utm_source=openai))\n\n4. **Integration with Electronic Health Records (EHRs):** AI systems have been integrated with EHRs to analyze large datasets, identifying patterns and predicting patient outcomes. This integration supports clinical decision-making and enhances patient care by providing comprehensive insights. ([pmc.ncbi.nlm.nih.gov](https://pmc.ncbi.nlm.nih.gov/articles/PMC11799571/?utm_source=openai))\n\nThese advancements underscore AI's transformative role in diagnostic imaging and predictive analytics, leading to more accurate diagnoses, personalized treatments, and improved patient outcomes. ", + "citations": [ + { + "url": "https://www.forbes.com/councils/forbestechcouncil/2025/03/11/ai-driven-medical-imaging-revolutionizing-diagnostics-and-treatment/?utm_source=openai", + "title": "AI-Driven Medical Imaging: Revolutionizing Diagnostics And Treatment", + "start_index": 602, + "end_index": 765 + }, + { + "url": "https://blog.medicai.io/en/future-of-medical-imaging/?utm_source=openai", + "title": "Future of Medical Imaging in 2025", + "start_index": 1171, + "end_index": 1263 + }, + { + "url": "https://www.forbes.com/councils/forbestechcouncil/2025/03/11/ai-driven-medical-imaging-revolutionizing-diagnostics-and-treatment/?utm_source=openai", + "title": "AI-Driven Medical Imaging: Revolutionizing Diagnostics And Treatment", + "start_index": 1566, + "end_index": 1729 + }, + { + "url": "https://en.wikipedia.org/wiki/Deep_Tomographic_Reconstruction?utm_source=openai", + "title": "Deep Tomographic Reconstruction", + "start_index": 2042, + "end_index": 2143 + }, + { + "url": "https://blog.medicai.io/en/future-of-medical-imaging/?utm_source=openai", + "title": "Future of Medical Imaging in 2025", + "start_index": 2480, + "end_index": 2572 + }, + { + "url": "https://www.forbes.com/councils/forbestechcouncil/2025/03/11/ai-driven-medical-imaging-revolutionizing-diagnostics-and-treatment/?utm_source=openai", + "title": "AI-Driven Medical Imaging: Revolutionizing Diagnostics And Treatment", + "start_index": 2851, + "end_index": 3014 + }, + { + "url": "https://time.com/6227623/ai-medical-imaging-radiology/?utm_source=openai", + "title": "How AI Is Changing Medical Imaging to Improve Patient Care", + "start_index": 3286, + "end_index": 3372 + }, + { + "url": "https://pmc.ncbi.nlm.nih.gov/articles/PMC11799571/?utm_source=openai", + "title": "Use of AI in Diagnostic Imaging and Future Prospects - PMC", + "start_index": 3669, + "end_index": 3763 + } + ] + }, + { + "answer": "Artificial intelligence (AI) is increasingly transforming healthcare by enhancing diagnostic accuracy, streamlining operations, and improving patient outcomes. Below are several impactful AI case studies in healthcare, followed by an analysis of market adoption trends as of 2025.\n\n**Impactful AI Case Studies in Healthcare**\n\n1. **Apollo Hospitals' AI Integration in India**\n\n Apollo Hospitals, one of India's largest healthcare providers, has significantly invested in AI to alleviate the workload of medical staff. By automating routine tasks such as medical documentation, the hospital aims to free up two to three hours daily for doctors and nurses. 
The AI tools assist with patient diagnoses, test recommendations, treatment suggestions, and transcription of doctors' notes. Additionally, Apollo is developing an AI system to prescribe the most effective antibiotics, addressing challenges like high nurse attrition rates and plans to expand bed capacity by one-third over four years. ([reuters.com](https://www.reuters.com/business/healthcare-pharmaceuticals/indias-apollo-hospitals-bets-ai-tackle-staff-workload-2025-03-13/?utm_source=openai))\n\n2. **University of Pittsburgh and Leidos Collaboration**\n\n The University of Pittsburgh (Pitt) partnered with Leidos in a $10 million, five-year initiative to combat cancer and heart disease using AI, focusing on underserved communities. Leveraging Pitt's Computational Pathology and AI Center of Excellence (CPACE), the collaboration has developed AI tools that enhance diagnostic speed and accuracy. For instance, generative AI applications have expedited leukemia reporting, reducing errors and alleviating pressures from staffing shortages. The initiative emphasizes ethical AI implementation with continuous human oversight. ([axios.com](https://www.axios.com/local/pittsburgh/2025/04/18/pitt-leidos-use-ai-to-fight-cancer-and-health-disparities?utm_source=openai))\n\n3. **C2-Ai's AI-Powered Surgical Support**\n\n UK-based company C2-Ai developed an AI system to assist patients during long waits for surgery, improving outcomes and reducing healthcare costs. The technology assesses individual patient risks using a dataset of approximately 500 million global patient cases. Patients receive guidance on diet, exercise, and mental health support through applications like Surgery Hero. This proactive approach has led to a sixfold decrease in complications and halved readmission rates among about 2,000 patients in Cheshire and Merseyside. The technology is also being adopted in countries like Canada, Italy, and Sweden. ([ft.com](https://www.ft.com/content/37b79af4-116f-46e5-9bbd-b814aa4c95af?utm_source=openai))\n\n4. **Mercy's Integration of Aidoc's AI Platform**\n\n In February 2025, Mercy, a healthcare organization, began integrating Aidoc’s AI-driven aiOS platform into its imaging services to enhance speed and accuracy in patient care. This initiative is part of Mercy's broader strategy to incorporate AI into clinical workflows, aiming to improve diagnostic efficiency and patient outcomes. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Aidoc?utm_source=openai))\n\n5. **Heidi Health's AI Medical Scribe**\n\n Australian company Heidi Health provides AI-based medical scribe software that automates clinical documentation for healthcare professionals. The software transcribes patient consultations into clinical notes, case histories, and other medical documents, reducing administrative burdens and allowing clinicians to focus more on patient care. As of 2025, Heidi Health supports over one million patient interactions per week across multiple countries. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Heidi_Health?utm_source=openai))\n\n**Market Adoption Analysis as of 2025**\n\nThe AI in healthcare market has experienced substantial growth, driven by technological advancements and increasing demand for efficient healthcare solutions. Key trends include:\n\n- **Market Growth**: The global AI in healthcare market is projected to grow from USD 39.34 billion in 2025 to USD 490.96 billion by 2032, exhibiting a compound annual growth rate (CAGR) of 43.4%. 
([fortunebusinessinsights.com](https://www.fortunebusinessinsights.com/industry-reports/artificial-intelligence-in-healthcare-market-100534?utm_source=openai))\n\n- **Investment Surge**: In the first two quarters of 2024, total investments in AI-driven healthcare startups surpassed USD 35 billion. Notable examples include Abridge securing USD 150 million for ambient clinical documentation tools and AKASA raising USD 120 million to enhance revenue cycle automation. ([futuremarketinsights.com](https://www.futuremarketinsights.com/reports/artificial-intelligence-in-healthcare-market?utm_source=openai))\n\n- **Technological Innovations**: AI is being integrated into various healthcare applications, such as AI-driven drug discovery, personalized medicine, augmented reality in surgery, and AI-powered genome sequencing. These innovations are driving efficiency, improving patient outcomes, and transforming healthcare delivery models. ([pharmiweb.com](https://www.pharmiweb.com/press-release/2025-02-14/artificial-intelligence-in-healthcare-market-2024-2035-trends-competitive-landscape-and-future-outlook?utm_source=openai))\n\n- **Regulatory and Ethical Considerations**: The widespread adoption of AI in healthcare raises critical regulatory and ethical challenges, particularly regarding accountability in AI-assisted diagnosis and the risk of automation bias. The absence of a well-defined liability framework underscores the need for policies that ensure AI functions as an assistive tool rather than an autonomous decision-maker. ([arxiv.org](https://arxiv.org/abs/2502.16732?utm_source=openai))\n\nIn summary, AI is making significant strides in healthcare, with numerous case studies demonstrating its potential to enhance patient care and operational efficiency. The market is poised for continued growth, driven by technological advancements and substantial investments. However, addressing regulatory and ethical challenges remains crucial to ensure responsible and effective AI integration in healthcare. 
", + "citations": [ + { + "url": "https://www.reuters.com/business/healthcare-pharmaceuticals/indias-apollo-hospitals-bets-ai-tackle-staff-workload-2025-03-13/?utm_source=openai", + "title": "India's Apollo Hospitals bets on AI to tackle staff workload", + "start_index": 993, + "end_index": 1153 + }, + { + "url": "https://www.axios.com/local/pittsburgh/2025/04/18/pitt-leidos-use-ai-to-fight-cancer-and-health-disparities?utm_source=openai", + "title": "Pitt, Leidos use AI to fight cancer and health disparities", + "start_index": 1787, + "end_index": 1927 + }, + { + "url": "https://www.ft.com/content/37b79af4-116f-46e5-9bbd-b814aa4c95af?utm_source=openai", + "title": "AI generated advice eases long waits for surgery", + "start_index": 2586, + "end_index": 2679 + }, + { + "url": "https://en.wikipedia.org/wiki/Aidoc?utm_source=openai", + "title": "Aidoc", + "start_index": 3067, + "end_index": 3142 + }, + { + "url": "https://en.wikipedia.org/wiki/Heidi_Health?utm_source=openai", + "title": "Heidi Health", + "start_index": 3638, + "end_index": 3720 + }, + { + "url": "https://www.fortunebusinessinsights.com/industry-reports/artificial-intelligence-in-healthcare-market-100534?utm_source=openai", + "title": "Artificial Intelligence [AI] in Healthcare Market Size | Share, 2032", + "start_index": 4140, + "end_index": 4299 + }, + { + "url": "https://www.futuremarketinsights.com/reports/artificial-intelligence-in-healthcare-market?utm_source=openai", + "title": "AI in Healthcare Market Size, Trends & Forecast 2025-2035", + "start_index": 4607, + "end_index": 4744 + }, + { + "url": "https://www.pharmiweb.com/press-release/2025-02-14/artificial-intelligence-in-healthcare-market-2024-2035-trends-competitive-landscape-and-future-outlook?utm_source=openai", + "title": "Artificial Intelligence in Healthcare Market [2024-2035]: Trends, Competitive Landscape, and Future Outlook - PharmiWeb.com", + "start_index": 5076, + "end_index": 5266 + }, + { + "url": "https://arxiv.org/abs/2502.16732?utm_source=openai", + "title": "DeepSeek reshaping healthcare in China's tertiary hospitals", + "start_index": 5676, + "end_index": 5741 + } + ] + } ] \ No newline at end of file