Data Search

Data Search covers single-page scraping, search engine queries, and asynchronous deep crawls with polling helpers.

Web Scraping

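# `client` is assumed to be an already-initialized SDK client.
result = client.data_search.scrape(
    "https://example.com/article",
    formats=["markdown", "html"],  # output formats to request
    screenshot=True,               # also capture a screenshot
    wait_for=2,
)

print(result.url)
print(result.markdown)
print(len(result.links))              # number of links found on the page
print(result.screenshot is not None)  # True when a screenshot was returned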

Web Search

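results = client.data_search.search(
    "latest AI papers 2026",
    limit=10,   # maximum number of results to return
    lang="en",  # result language
)

print(results.query)
print(results.total_results)
# Each result item exposes at least a title and a URL.
for item in results.results:
    print(item.title, item.url)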

Deep Crawl

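# Start a crawl; this returns a CrawlTask immediately.
task = client.data_search.deep_crawl(
    "https://docs.example.com",
    max_depth=2,
    max_pages=50,
    strategy="bfs",                             # breadth-first traversal
    same_domain_only=True,
    exclude_patterns=["*/login*", "*/admin*"],  # skip URLs matching these patterns
)

# Check on the running crawl by task ID.
status = client.data_search.deep_crawl_status(task.task_id)

# Alternatively, start a crawl and block until it completes.
crawl = client.data_search.deep_crawl_and_wait(
    "https://docs.example.com",
    max_depth=2,
    max_pages=50,
    poll_interval=2.0,
    timeout=120.0,
)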
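
The `exclude_patterns` values in the `deep_crawl()` call above look like shell-style globs. How the crawler actually matches them is not specified here. As a rough sketch, assuming semantics similar to Python's `fnmatch` (where `*` matches any run of characters, including `/`), the filter would behave as follows. The helper name `is_excluded` is hypothetical and not part of the SDK.

```python
from fnmatch import fnmatchcase

# Same patterns as in the deep_crawl() call above.
EXCLUDE_PATTERNS = ["*/login*", "*/admin*"]


def is_excluded(url: str, patterns: list[str]) -> bool:
    """Return True if the URL matches any pattern (assumed fnmatch-style globs)."""
    return any(fnmatchcase(url, pattern) for pattern in patterns)


urls = [
    "https://docs.example.com/guide/intro",
    "https://docs.example.com/login?next=/guide",
    "https://docs.example.com/admin/users",
]

# Keep only the URLs that no exclusion pattern matches.
kept = [u for u in urls if not is_excluded(u, EXCLUDE_PATTERNS)]
print(kept)
```

Under that assumption, only the `/guide/intro` URL survives the filter. If the real matcher treats `*` differently (for example, by stopping at `/`), the patterns would need adjusting.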

| Method | Return Type | Description |
| --- | --- | --- |
| `scrape(url, ...)` | `ScrapeResult` | Scrape a single web page |
| `search(query, ...)` | `WebSearchResult` | Run a web search |
| `deep_crawl(url, ...)` | `CrawlTask` | Start an async deep crawl |
| `deep_crawl_status(task_id)` | `CrawlStatus` | Poll crawl status |
| `deep_crawl_and_wait(url, ...)` | `CrawlStatus` | Start a crawl and auto-poll until complete |

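`deep_crawl()` returns immediately with a `CrawlTask`; check progress by passing its `task_id` to `deep_crawl_status()`. For backend jobs that need a single blocking call, use `deep_crawl_and_wait()`, which starts the crawl and polls until it completes.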
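
To make the difference concrete, the sketch below shows the kind of polling loop a blocking helper has to run on top of a start call and a status call. It is not the SDK's implementation: `wait_for_crawl`, the `state` field, and the `"completed"` / `"failed"` values are assumptions for illustration, and a stub status function stands in for a real client so the block runs on its own.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class FakeStatus:
    """Stand-in for CrawlStatus; the `state` field is an assumption."""
    state: str
    pages_crawled: int = 0


def wait_for_crawl(
    get_status: Callable[[str], FakeStatus],
    task_id: str,
    poll_interval: float = 2.0,
    timeout: float = 120.0,
) -> FakeStatus:
    """Poll get_status(task_id) until a terminal state, or raise on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(task_id)
        if status.state in ("completed", "failed"):  # assumed terminal states
            return status
        # Give up if the next sleep would run past the deadline.
        if time.monotonic() + poll_interval > deadline:
            raise TimeoutError(f"crawl {task_id} did not finish within {timeout}s")
        time.sleep(poll_interval)


# Stub that reports "running" twice, then "completed".
_responses = iter([
    FakeStatus("running", 10),
    FakeStatus("running", 30),
    FakeStatus("completed", 50),
])


def fake_deep_crawl_status(task_id: str) -> FakeStatus:
    return next(_responses)


final = wait_for_crawl(fake_deep_crawl_status, "task-123", poll_interval=0.01, timeout=1.0)
print(final.state, final.pages_crawled)
```

With a real client, `client.data_search.deep_crawl_status` and `task.task_id` would take the place of the stub and the hard-coded ID, once you have checked which fields and values `CrawlStatus` actually exposes.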

Deep Research Memory