IMPORTANT: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the current release documentation.

Elastic web crawler

IMPORTANT: This documentation is no longer updated. Refer to Elastic's version policy and the latest documentation.

Looking for the App Search web crawler? See the App Search documentation.

To compare the web crawler with the App Search web crawler, see the reference table on this page.

This feature is not available at all Elastic subscription levels. Refer to the Elastic subscriptions pages for Elastic Cloud and self-managed deployments.

Overview

Use the web crawler to programmatically discover, extract, and index searchable content from websites and knowledge bases. When you ingest data with the web crawler, a search-optimized Elasticsearch index is created to hold and sync webpage content.

The web crawler is a native Elasticsearch solution. It reads and writes directly to Elasticsearch indices in a format that enables developers to build intuitive, relevant search experiences using App Search engines and the Search UI library.
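
Because crawled content lives in an ordinary Elasticsearch index, it can in principle be queried with the standard Elasticsearch `_search` API. The following is a minimal sketch, not an official client example. The cluster URL, the index name `search-my-website`, and the field names `title`, `body_content`, and `url` are assumptions chosen for illustration; check your own index mapping before relying on them. The function only builds the request and does not send it.

```python
import json
import urllib.request

# Assumptions (not taken from this page): the cluster URL, the index name,
# and the field names "title", "body_content", and "url" are placeholders.
ES_URL = "http://localhost:9200"
INDEX = "search-my-website"


def build_search_request(query_text, size=5):
    """Build (but do not send) a _search request against the crawler index."""
    body = {
        "size": size,
        # Full-text match across two assumed text fields.
        "query": {
            "multi_match": {
                "query": query_text,
                "fields": ["title", "body_content"],
            }
        },
        # Return only the fields a results page would typically need.
        "_source": ["title", "url"],
    }
    return urllib.request.Request(
        url=f"{ES_URL}/{INDEX}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To run the query, pass the returned object to `urllib.request.urlopen` against a reachable cluster, adding whatever authentication your deployment requires.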

Web crawler documentation:

Getting started with website search: Concrete guide to building a website search experience, using the crawler UI.
Managing crawls: Detailed reference for managing crawls using the Kibana UI. Learn how to:
- Manage duplicated documents
- Extract binary content such as PDFs from webpages
- Schedule automated crawls
Optimizing web content: Optimize your web content source files for the web crawler, to manage webpage discovery and content extraction.
Custom fields using proxy: How to extract custom fields from webpages using a proxy server.
Troubleshooting crawls: Detailed troubleshooting reference.
Web crawler events logs reference: Detailed reference for web crawler events logs.
View web crawler events logs: How to view web crawler events logs in Kibana.

Appendix: Compare the web crawler and App Search web crawler

Feature	App Search web crawler	Web crawler
Interface	GUI / API	GUI-only
Binary content extraction	Yes	Yes
Search	App Search	Elasticsearch / App Search using Elasticsearch search API for App Search
Ingest pipelines	Yes	Yes
Monitoring	Yes	Yes
APM	Yes	Yes
Audit logging	Yes	No
Event logging	Yes	Yes
Public REST API	Yes	No
