Data Search

Data Search covers single-page scraping, search engine queries, and asynchronous deep crawls with polling helpers.

Web Scraping

result = client.data_search.scrape(
    "https://example.com/article",
    formats=["markdown", "html"],
    screenshot=True,
    wait_for=2,
)
 
print(result.url)
print(result.markdown)
print(len(result.links))
print(result.screenshot is not None)

Web Search

results = client.data_search.search(
    "latest AI papers 2026",
    limit=10,
    lang="en",
)
 
print(results.query)
print(results.total_results)
for item in results.results:
    print(item.title, item.url)

Deep Crawl

# Start an asynchronous crawl; the call returns a CrawlTask without waiting.
task = client.data_search.deep_crawl(
    "https://docs.example.com",
    max_depth=2,
    max_pages=50,
    strategy="bfs",
    same_domain_only=True,
    exclude_patterns=["*/login*", "*/admin*"],
)
 
# Check the crawl's progress using the task ID returned above.
status = client.data_search.deep_crawl_status(task.task_id)
 
# Or start a crawl and block while the client polls until it completes.
crawl = client.data_search.deep_crawl_and_wait(
    "https://docs.example.com",
    max_depth=2,
    max_pages=50,
    poll_interval=2.0,
    timeout=120.0,
)
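
The exclude_patterns values passed to deep_crawl() above look like shell-style globs. The sketch below shows how such patterns would behave under that assumption, using Python's standard fnmatch module. The helper name is_excluded is hypothetical, and the service's actual matching rules are not documented here, so treat this as an illustration rather than the crawler's real logic.

```python
from fnmatch import fnmatchcase


def is_excluded(url: str, patterns: list[str]) -> bool:
    """Return True if the URL matches any glob pattern (assumed fnmatch semantics)."""
    # fnmatchcase is case-sensitive, and "*" matches any run of characters,
    # including "/" characters.
    return any(fnmatchcase(url, pattern) for pattern in patterns)


patterns = ["*/login*", "*/admin*"]

for url in [
    "https://docs.example.com/login?next=/",
    "https://docs.example.com/guide/intro",
    "https://docs.example.com/administration",
]:
    print(url, is_excluded(url, patterns))
```

Under these assumed semantics, "*/admin*" also matches a path such as /administration, so broad patterns can exclude more pages than intended.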

Method                           Return Type       Description
scrape(url, ...)                 ScrapeResult      Scrape a single web page
search(query, ...)               WebSearchResult   Run web search
deep_crawl(url, ...)             CrawlTask         Start an async deep crawl
deep_crawl_status(task_id)       CrawlStatus       Poll crawl status
deep_crawl_and_wait(url, ...)    CrawlStatus       Start and auto-poll until complete

deep_crawl() returns immediately with a CrawlTask; check progress with deep_crawl_status(task.task_id). For backend jobs that only need a single blocking call, use deep_crawl_and_wait().
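
For readers who want to control the polling loop themselves, here is a minimal sketch of a generic poll-with-timeout helper. It is not the SDK's implementation of deep_crawl_and_wait(); the function name poll_until, its parameters, and the status attribute used in the commented usage are all assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    poll_interval: float = 2.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call fetch() repeatedly until is_done(result) is true or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        # Give up if the next wait would run past the deadline.
        if clock() + poll_interval > deadline:
            raise TimeoutError(f"not finished after {timeout} seconds")
        sleep(poll_interval)


# Hypothetical usage with the client shown earlier. The `status` attribute and
# its values are assumptions, not confirmed fields of CrawlStatus:
#
# task = client.data_search.deep_crawl("https://docs.example.com", max_depth=2)
# final = poll_until(
#     lambda: client.data_search.deep_crawl_status(task.task_id),
#     lambda s: s.status in ("completed", "failed"),
# )
```

Passing sleep and clock as parameters keeps the helper easy to test without real delays.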
