Data Search
Data Search covers single-page scraping, search engine queries, and asynchronous deep crawls with polling helpers.
Web Scraping
result = client.data_search.scrape(
    "https://example.com/article",
    formats=["markdown", "html"],
    screenshot=True,
    wait_for=2,
)
print(result.url)
print(result.markdown)
print(len(result.links))
print(result.screenshot is not None)

Web Search
results = client.data_search.search(
    "latest AI papers 2026",
    limit=10,
    lang="en",
)
print(results.query)
print(results.total_results)
for item in results.results:
    print(item.title, item.url)

Deep Crawl
# Start an async crawl; this returns a task handle immediately.
task = client.data_search.deep_crawl(
    "https://docs.example.com",
    max_depth=2,
    max_pages=50,
    strategy="bfs",
    same_domain_only=True,
    exclude_patterns=["*/login*", "*/admin*"],
)

# Poll the crawl once by task ID.
status = client.data_search.deep_crawl_status(task.task_id)

# Alternatively, start a crawl and block until it completes.
crawl = client.data_search.deep_crawl_and_wait(
    "https://docs.example.com",
    max_depth=2,
    max_pages=50,
    poll_interval=2.0,
    timeout=120.0,
)

| Method | Return Type | Description |
|---|---|---|
| scrape(url, ...) | ScrapeResult | Scrape a single web page |
| search(query, ...) | WebSearchResult | Run a web search |
| deep_crawl(url, ...) | CrawlTask | Start an async deep crawl |
| deep_crawl_status(task_id) | CrawlStatus | Poll crawl status |
| deep_crawl_and_wait(url, ...) | CrawlStatus | Start and auto-poll until complete |
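
The methods in the table can be combined. As a rough sketch, the helper below runs a search and then scrapes each hit, using only the calls and attributes shown earlier on this page (search(query, limit=...), results.results, item.url, scrape(url, formats=...), result.url, result.markdown). The helper name scrape_top_results and the FakeSearchBackend class are hypothetical: the fake is an in-memory stand-in so the example runs without a real client, and it does not reflect how the real service behaves.

```python
from types import SimpleNamespace


def scrape_top_results(client, query, limit=3):
    """Search for `query`, then scrape each hit. Returns (url, markdown) pairs."""
    results = client.data_search.search(query, limit=limit)
    pages = []
    for item in results.results:
        page = client.data_search.scrape(item.url, formats=["markdown"])
        pages.append((page.url, page.markdown))
    return pages


class FakeSearchBackend:
    """Hypothetical stand-in for client.data_search (no network access)."""

    def search(self, query, limit=10, lang="en"):
        hits = [
            SimpleNamespace(title=f"Hit {i}", url=f"https://example.com/{i}")
            for i in range(limit)
        ]
        return SimpleNamespace(query=query, total_results=len(hits), results=hits)

    def scrape(self, url, formats=None, **options):
        return SimpleNamespace(url=url, markdown=f"# Page at {url}")


fake_client = SimpleNamespace(data_search=FakeSearchBackend())
pages = scrape_top_results(fake_client, "latest AI papers 2026", limit=2)
for url, markdown in pages:
    print(url, len(markdown))
```

With a real client, pass it in place of fake_client. Error handling is omitted because this page does not document which exceptions the SDK raises.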
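deep_crawl() returns immediately with a CrawlTask; pass its task_id to deep_crawl_status() to poll for progress. For backend jobs where a single blocking call is simpler, use deep_crawl_and_wait(), which starts the crawl and polls until it completes.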
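
To make the relationship between the non-blocking and blocking calls concrete, here is a minimal sketch of a polling loop built from deep_crawl() and deep_crawl_status(). It is not the SDK's implementation of deep_crawl_and_wait(). The helper name crawl_and_wait, the is_done predicate, and the status field with the values "completed" and "failed" are all assumptions, since this page does not document the fields of CrawlStatus. FakeCrawlBackend is a hypothetical in-memory stand-in so the sketch runs without network access.

```python
import time
from types import SimpleNamespace


def crawl_and_wait(client, url, *, poll_interval=2.0, timeout=120.0,
                   is_done=None, **crawl_options):
    """Start a deep crawl, then poll its status until is_done(status) is true."""
    if is_done is None:
        # Assumed field name and values; check the real CrawlStatus object.
        is_done = lambda s: getattr(s, "status", None) in ("completed", "failed")

    task = client.data_search.deep_crawl(url, **crawl_options)
    deadline = time.monotonic() + timeout
    while True:
        status = client.data_search.deep_crawl_status(task.task_id)
        if is_done(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"crawl {task.task_id} not finished after {timeout}s")
        time.sleep(poll_interval)


class FakeCrawlBackend:
    """Hypothetical stand-in: reports 'running' twice, then 'completed'."""

    def __init__(self):
        self.polls = 0

    def deep_crawl(self, url, **options):
        return SimpleNamespace(task_id="task-1")

    def deep_crawl_status(self, task_id):
        self.polls += 1
        state = "completed" if self.polls >= 3 else "running"
        return SimpleNamespace(task_id=task_id, status=state)


fake_client = SimpleNamespace(data_search=FakeCrawlBackend())
final = crawl_and_wait(fake_client, "https://docs.example.com",
                       poll_interval=0.01, timeout=1.0, max_depth=2)
print(final.status, fake_client.data_search.polls)  # completed 3
```

The loop uses time.monotonic() for the deadline so that system clock adjustments cannot shorten or extend the wait. In application code, the built-in deep_crawl_and_wait() is the simpler choice; a hand-written loop like this is only useful when you need custom behavior between polls, such as logging progress.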