App Search APIs
Initializing the Client
The AppSearch client can either be configured directly:

# Use the AppSearch client directly:
from elastic_enterprise_search import AppSearch

app_search = AppSearch(
    "http://localhost:3002",
    http_auth="private-..."
)

# Now call API methods
app_search.search(...)

…or can be used via a configured EnterpriseSearch.app_search instance:

from elastic_enterprise_search import EnterpriseSearch

ent_search = EnterpriseSearch("http://localhost:3002")

# Configure authentication of the AppSearch instance
ent_search.app_search.http_auth = "private-..."

# Now call API methods
ent_search.app_search.search(...)
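The examples above pass the key as a string literal. As a minimal sketch of keeping the key out of source code, the helper below reads it from an environment variable. The variable name APP_SEARCH_API_KEY and the function load_api_key() are hypothetical and not part of the client library.

```python
import os

# Hypothetical variable name; use whatever your deployment defines.
API_KEY_ENV_VAR = "APP_SEARCH_API_KEY"


def load_api_key(environ=os.environ):
    """Return the App Search API key stored in the environment."""
    key = environ.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        # Fail early instead of sending unauthenticated requests.
        raise RuntimeError(f"Set {API_KEY_ENV_VAR} before creating the client")
    return key
```

The returned value can then be passed as http_auth, for example AppSearch("http://localhost:3002", http_auth=load_api_key()).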
API Key Privileges
Using the APIs requires a key with read, write, or search access, depending on the action. If you’re receiving an UnauthorizedError, make sure the key you’re using in http_auth has the proper privileges.
Engine APIs
Engines index documents and perform search functions. To use App Search you must first create an Engine.
Create Engine
Let’s create an Engine named national-parks that uses English as its language:

# Request:
app_search.create_engine(
    engine_name="national-parks",
    language="en",
)

# Response:
{
    "name": "national-parks",
    "type": "default",
    "language": "en"
}
Get Engine
Once we’ve created an Engine we can look at it:

# Request:
app_search.get_engine(engine_name="national-parks")

# Response:
{
    "document_count": 0,
    "language": "en",
    "name": "national-parks",
    "type": "default"
}
List Engines
We can see all our Engines in the App Search instance:

# Request:
app_search.list_engines()

# Response:
{
    "meta": {
        "page": {
            "current": 1,
            "size": 25,
            "total_pages": 1,
            "total_results": 1
        }
    },
    "results": [
        {
            "document_count": 0,
            "language": "en",
            "name": "national-parks",
            "type": "default"
        }
    ]
}
Delete Engine
To delete the Engine and all the documents inside it, use the delete_engine() method:

# Request:
app_search.delete_engine(engine_name="national-parks")

# Response:
{
    "deleted": True
}
Document APIs
Create and index Documents
Once you’ve created an Engine you can start adding documents with the index_documents() method:

# Request:
app_search.index_documents(
    engine_name="national-parks",
    documents=[{
        "id": "park_rocky-mountain",
        "title": "Rocky Mountain",
        "nps_link": "https://www.nps.gov/romo/index.htm",
        "states": ["Colorado"],
        "visitors": 4517585,
        "world_heritage_site": False,
        "location": "40.4,-105.58",
        "acres": 265795.2,
        "date_established": "1915-01-26T06:00:00Z"
    }, {
        "id": "park_saguaro",
        "title": "Saguaro",
        "nps_link": "https://www.nps.gov/sagu/index.htm",
        "states": ["Arizona"],
        "visitors": 820426,
        "world_heritage_site": False,
        "location": "32.25,-110.5",
        "acres": 91715.72,
        "date_established": "1994-10-14T05:00:00Z"
    }]
)

# Response:
[
    {
        "errors": [],
        "id": "park_rocky-mountain"
    },
    {
        "errors": [],
        "id": "park_saguaro"
    }
]
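The response carries one entry per document, each with its own errors list, so a partially failed batch can be detected by scanning for non-empty lists. A minimal sketch, assuming the response shape shown above; failed_documents() is a hypothetical helper and the error message in the sample data is made up for illustration:

```python
def failed_documents(response):
    """Map each document id to its error messages, skipping clean documents."""
    return {
        item["id"]: item["errors"]
        for item in response
        if item.get("errors")
    }


# Same shape as the index_documents() response; the second entry is an
# invented failure used only to exercise the helper.
response = [
    {"errors": [], "id": "park_rocky-mountain"},
    {"errors": ["Invalid field value"], "id": "park_saguaro"},
]
print(failed_documents(response))  # {'park_saguaro': ['Invalid field value']}
```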
List Documents
Both of our new documents indexed without errors.
Now we can look at our indexed documents in the engine:

# Request:
app_search.list_documents(engine_name="national-parks")

# Response:
{
    "meta": {
        "page": {
            "current": 1,
            "size": 100,
            "total_pages": 1,
            "total_results": 2
        }
    },
    "results": [
        {
            "acres": "91715.72",
            "date_established": "1994-10-14T05:00:00Z",
            "id": "park_saguaro",
            "location": "32.25,-110.5",
            "nps_link": "https://www.nps.gov/sagu/index.htm",
            "states": ["Arizona"],
            "title": "Saguaro",
            "visitors": "820426",
            "world_heritage_site": "false"
        },
        {
            "acres": "265795.2",
            "date_established": "1915-01-26T06:00:00Z",
            "id": "park_rocky-mountain",
            "location": "40.4,-105.58",
            "nps_link": "https://www.nps.gov/romo/index.htm",
            "states": ["Colorado"],
            "title": "Rocky Mountain",
            "visitors": "4517585",
            "world_heritage_site": "false"
        }
    ]
}
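The meta.page block in the response indicates that results are paginated. The sketch below walks every page until total_pages is reached. It assumes list_documents() accepts current_page and page_size keyword arguments, which is not shown on this page; iter_all_documents() is a hypothetical helper.

```python
def iter_all_documents(app_search, engine_name, page_size=100):
    """Yield every document in an Engine, one page at a time."""
    current_page = 1
    while True:
        resp = app_search.list_documents(
            engine_name=engine_name,
            current_page=current_page,  # assumed keyword argument
            page_size=page_size,        # assumed keyword argument
        )
        yield from resp["results"]
        # Stop once the last page reported by the server has been read.
        if current_page >= resp["meta"]["page"]["total_pages"]:
            break
        current_page += 1
```

With a configured client this could be used as: for doc in iter_all_documents(app_search, "national-parks"): print(doc["id"]).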
Get Documents by ID
You can also retrieve a set of documents by their id with the get_documents() method:

# Request:
app_search.get_documents(
    engine_name="national-parks",
    document_ids=["park_rocky-mountain"]
)

# Response:
[
    {
        "acres": "265795.2",
        "date_established": "1915-01-26T06:00:00Z",
        "id": "park_rocky-mountain",
        "location": "40.4,-105.58",
        "nps_link": "https://www.nps.gov/romo/index.htm",
        "states": ["Colorado"],
        "title": "Rocky Mountain",
        "visitors": "4517585",
        "world_heritage_site": "false"
    }
]
Update existing Documents
You can update documents with the put_documents() method:

# Request:
resp = app_search.put_documents(
    engine_name="national-parks",
    documents=[{
        "id": "park_rocky-mountain",
        "visitors": 10000000
    }]
)

# Response:
[
    {
        "errors": [],
        "id": "park_rocky-mountain"
    }
]
Delete Documents
You can delete documents from an Engine with the delete_documents() method:

# Request:
resp = app_search.delete_documents(
    engine_name="national-parks",
    document_ids=["park_rocky-mountain"]
)

# Response:
[
    {
        "deleted": True,
        "id": "park_rocky-mountain"
    }
]
Schema APIs
Now that we’ve indexed some data we should take a look at the way the data is being indexed by our Engine.
Get Schema
First take a look at the existing Schema inferred from our data:

# Request:
resp = app_search.get_schema(
    engine_name="national-parks"
)

# Response:
{
    "acres": "text",
    "date_established": "text",
    "location": "text",
    "nps_link": "text",
    "states": "text",
    "title": "text",
    "visitors": "text",
    "world_heritage_site": "text"
}
Update Schema
Looks like the date_established field wasn’t indexed as a date as desired. Update the type of the date_established field:

# Request:
resp = app_search.put_schema(
    engine_name="national-parks",
    schema={
        "date_established": "date"
    }
)

# Response:
{
    "acres": "number",
    "date_established": "date",  # Type has been updated!
    "location": "geolocation",
    "nps_link": "text",
    "square_km": "number",
    "states": "text",
    "title": "text",
    "visitors": "number",
    "world_heritage_site": "text"
}
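The put_schema() request above sends only the field whose type changes. When several fields need retyping, the fields to send can be computed by comparing the current schema with the desired one. A minimal sketch; schema_changes() is a hypothetical helper, and the desired types below are illustrative:

```python
def schema_changes(current, desired):
    """Return only the fields whose type differs from the current schema."""
    return {
        field: field_type
        for field, field_type in desired.items()
        if current.get(field) != field_type
    }


# Abbreviated from the get_schema() response shown earlier.
current = {"date_established": "text", "title": "text", "visitors": "text"}
desired = {"date_established": "date", "title": "text", "visitors": "number"}
print(schema_changes(current, desired))
# {'date_established': 'date', 'visitors': 'number'}
```

The resulting dictionary could be passed as the schema argument of put_schema().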
Search APIs
Once documents are ingested and the Schema is set properly, you can use the search() method to search through an Engine for matching documents.
The Search API has many options; see the Search API documentation for the full list.
Single Search
# Request:
resp = app_search.search(
    engine_name="national-parks",
    body={
        "query": "rock"
    }
)

# Response:
{
    "meta": {
        "alerts": [],
        "engine": {
            "name": "national-parks-demo",
            "type": "default"
        },
        "page": {
            "current": 1,
            "size": 10,
            "total_pages": 2,
            "total_results": 15
        },
        "request_id": "6266df8b-8b19-4ff0-b1ca-3877d867eb7d",
        "warnings": []
    },
    "results": [
        {
            "_meta": {
                "engine": "national-parks-demo",
                "id": "park_rocky-mountain",
                "score": 6776379.0
            },
            "acres": {"raw": 265795.2},
            "date_established": {"raw": "1915-01-26T06:00:00+00:00"},
            "id": {"raw": "park_rocky-mountain"},
            "location": {"raw": "40.4,-105.58"},
            "nps_link": {"raw": "https://www.nps.gov/romo/index.htm"},
            "square_km": {"raw": 1075.6},
            "states": {"raw": ["Colorado"]},
            "title": {"raw": "Rocky Mountain"},
            "visitors": {"raw": 4517585.0},
            "world_heritage_site": {"raw": "false"}
        }
    ]
}
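Each field in a search result is wrapped in a dictionary with a "raw" key, and the score lives under "_meta". The sketch below unwraps a result into a flat dictionary, assuming the response shape shown above; flatten_result() is a hypothetical helper and "_score" is a key name chosen for this example.

```python
def flatten_result(result):
    """Collapse {"field": {"raw": value}} into {"field": value}."""
    flat = {
        field: value["raw"]
        for field, value in result.items()
        if field != "_meta" and isinstance(value, dict) and "raw" in value
    }
    # Keep the relevance score alongside the document fields.
    flat["_score"] = result["_meta"]["score"]
    return flat


# Abbreviated from the search response shown above.
result = {
    "_meta": {"id": "park_rocky-mountain", "score": 6776379.0},
    "id": {"raw": "park_rocky-mountain"},
    "title": {"raw": "Rocky Mountain"},
    "states": {"raw": ["Colorado"]},
}
print(flatten_result(result)["title"])  # Rocky Mountain
```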
Multi Search
Multiple searches can be executed at the same time with the multi_search() method:

# Request:
resp = app_search.multi_search(
    engine_name="national-parks",
    body={
        "queries": [
            {"query": "rock"},
            {"query": "lake"}
        ]
    }
)

# Response:
[
    {
        "meta": {
            "alerts": [],
            "engine": {
                "name": "national-parks-demo",
                "type": "default"
            },
            "page": {
                "current": 1,
                "size": 1,
                "total_pages": 15,
                "total_results": 15
            },
            "warnings": []
        },
        "results": [
            {
                "_meta": {
                    "engine": "national-parks",
                    "id": "park_rocky-mountain",
                    "score": 6776379.0
                },
                "acres": {"raw": 265795.2},
                "date_established": {"raw": "1915-01-26T06:00:00+00:00"},
                "id": {"raw": "park_rocky-mountain"},
                "location": {"raw": "40.4,-105.58"},
                "nps_link": {"raw": "https://www.nps.gov/romo/index.htm"},
                "square_km": {"raw": 1075.6},
                "states": {"raw": ["Colorado"]},
                "title": {"raw": "Rocky Mountain"},
                "visitors": {"raw": 4517585.0},
                "world_heritage_site": {"raw": "false"}
            }
        ]
    },
    ...
]
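The multi_search() response is a list with one entry per query. The sketch below pairs each query string with the ids of its results. It assumes the entries come back in the same order as the submitted queries, which this page does not state explicitly; results_by_query() is a hypothetical helper and the second response entry is invented sample data.

```python
def results_by_query(queries, responses):
    """Map each query string to the ids of the documents it matched."""
    return {
        query["query"]: [hit["id"]["raw"] for hit in response["results"]]
        for query, response in zip(queries, responses)
    }


queries = [{"query": "rock"}, {"query": "lake"}]
# Abbreviated responses; the empty "lake" result is made up for illustration.
responses = [
    {"results": [{"id": {"raw": "park_rocky-mountain"}}]},
    {"results": []},
]
print(results_by_query(queries, responses))
# {'rock': ['park_rocky-mountain'], 'lake': []}
```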
Curation APIs
Curations hide or promote result content for pre-defined search queries.
Create Curation
# Request:
resp = app_search.create_curation(
    engine_name="national-parks",
    queries=["rocks", "rock", "hills"],
    promoted_doc_ids=["park_rocky-mountain"],
    hidden_doc_ids=["park_saguaro"]
)

# Response:
{
    "id": "cur-6011f5b57cef06e6c883814a"
}

Get Curation
# Request:
resp = app_search.get_curation(
    engine_name="national-parks",
    curation_id="cur-6011f5b57cef06e6c883814a"
)

# Response:
{
    "hidden": ["park_saguaro"],
    "id": "cur-6011f5b57cef06e6c883814a",
    "promoted": ["park_rocky-mountain"],
    "queries": ["rocks", "rock", "hills"]
}
List Curations
# Request:
app_search.list_curations(
    engine_name="national-parks"
)
Delete Curation
# Request:
app_search.delete_curation(
    engine_name="national-parks",
    curation_id="cur-6011f5b57cef06e6c883814a"
)
Meta Engine APIs
A Meta Engine is an Engine that has no documents of its own. Instead, it combines multiple other Engines so that they can be searched together as if they were a single Engine.
The Engines that comprise a Meta Engine are referred to as "Source Engines".
Create Meta Engine
To create a Meta Engine, use the create_engine() method and set the type parameter to "meta".

# Request:
app_search.create_engine(
    engine_name="meta-engine",
    type="meta",
    source_engines=["national-parks"]
)

# Response:
{
    "document_count": 1,
    "name": "meta-engine",
    "source_engines": ["national-parks"],
    "type": "meta"
}
Searching Documents from a Meta Engine
# Request:
app_search.search(
    engine_name="meta-engine",
    body={
        "query": "rock"
    }
)

# Response:
{
    "meta": {
        "alerts": [],
        "engine": {
            "name": "meta-engine",
            "type": "meta"
        },
        "page": {
            "current": 1,
            "size": 10,
            "total_pages": 1,
            "total_results": 1
        },
        "request_id": "aef3d3d3-331c-4dab-8e77-f42e4f46789c",
        "warnings": []
    },
    "results": [
        {
            "_meta": {
                "engine": "national-parks",
                "id": "park_black-canyon-of-the-gunnison",
                "score": 2.43862
            },
            "id": {"raw": "national-parks|park_black-canyon-of-the-gunnison"},
            "nps_link": {"raw": "https://www.nps.gov/blca/index.htm"},
            "square_km": {"raw": 124.4},
            "states": {"raw": ["Colorado"]},
            "title": {"raw": "Black Canyon of the Gunnison"},
            "world_heritage_site": {"raw": "false"}
        }
    ]
}
Notice that the id of the result we receive (national-parks|park_black-canyon-of-the-gunnison) is prefixed with the name of the Source Engine the result came from. The prefix distinguishes it from results that have the same id but come from a different Source Engine.
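To recover the Source Engine name and the original document id from such a prefixed id, the string can be split at the first "|" character. A minimal sketch based on the id format shown in the response above; split_meta_engine_id() is a hypothetical helper. The same two values are also available under the result's "_meta" key.

```python
def split_meta_engine_id(prefixed_id):
    """Split "engine|document_id" into (engine, document_id)."""
    engine, separator, document_id = prefixed_id.partition("|")
    if not separator:
        raise ValueError(f"No Source Engine prefix in id: {prefixed_id!r}")
    return engine, document_id


print(split_meta_engine_id("national-parks|park_black-canyon-of-the-gunnison"))
# ('national-parks', 'park_black-canyon-of-the-gunnison')
```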
Adding Source Engines to an existing Meta Engine
If we have an existing Meta Engine named meta-engine we can add additional Source Engines to it with the add_meta_engine_source() method. Here we add the state-parks Engine:

# Request:
app_search.add_meta_engine_source(
    engine_name="meta-engine",
    source_engines=["state-parks"]
)

# Response:
{
    "document_count": 1,
    "name": "meta-engine",
    "source_engines": ["national-parks", "state-parks"],
    "type": "meta"
}
Removing Source Engines from a Meta Engine
If we change our mind about state-parks being a Source Engine for meta-engine we can use the delete_meta_engine_source() method:

# Request:
app_search.delete_meta_engine_source(
    engine_name="meta-engine",
    source_engines=["state-parks"]
)

# Response:
{
    "document_count": 1,
    "name": "meta-engine",
    "source_engines": ["national-parks"],
    "type": "meta"
}
Web Crawler APIs
Domains
# Create a domain
resp = app_search.create_crawler_domain(
    engine_name="crawler-engine",
    body={
        "name": "https://example.com"
    }
)
domain_id = resp["id"]

# Get a domain
app_search.get_crawler_domain(
    engine_name="crawler-engine",
    domain_id=domain_id
)

# Update a domain
app_search.put_crawler_domain(
    engine_name="crawler-engine",
    domain_id=domain_id,
    body={ ... }
)

# Delete a domain
app_search.delete_crawler_domain(
    engine_name="crawler-engine",
    domain_id=domain_id
)

# Validate a domain
app_search.get_crawler_domain_validation_result(
    body={
        "url": "https://example.com",
        "checks": [
            "dns",
            "robots_txt",
            "tcp",
            "url",
            "url_content",
            "url_request"
        ]
    }
)

# Extract content from a URL
app_search.get_crawler_url_extraction_result(
    engine_name="crawler-engine",
    body={
        "url": "https://example.com"
    }
)

# Trace a URL
app_search.get_crawler_url_tracing_result(
    engine_name="crawler-engine",
    body={
        "url": "https://example.com"
    }
)
Crawls
# Get the active crawl
app_search.get_crawler_active_crawl_request(
    engine_name="crawler-engine",
)

# Start a crawl
app_search.create_crawler_crawl_request(
    engine_name="crawler-engine"
)

# Cancel the active crawl
app_search.delete_crawler_active_crawl_request(
    engine_name="crawler-engine"
)
Entry Points
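# Create an entry point
# (entry points belong to a domain, so domain_id is passed as in the
# crawl rule and sitemap examples)
resp = app_search.create_crawler_entry_point(
    engine_name="crawler-engine",
    domain_id=domain_id,
    body={
        "value": "/blog"
    }
)
entry_point_id = resp["id"]

# Delete an entry point
app_search.delete_crawler_entry_point(
    engine_name="crawler-engine",
    domain_id=domain_id,
    entry_point_id=entry_point_id
)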
Crawl Rules
# Create a crawl rule
resp = app_search.create_crawler_crawl_rule(
    engine_name="crawler-engine",
    domain_id=domain_id,
    body={
        "policy": "deny",
        "rule": "ends",
        "pattern": "/dont-crawl"
    }
)
crawl_rule_id = resp["id"]

# Delete a crawl rule
app_search.delete_crawler_crawl_rule(
    engine_name="crawler-engine",
    domain_id=domain_id,
    crawl_rule_id=crawl_rule_id
)
Sitemaps
# Create a sitemap
resp = app_search.create_crawler_sitemap(
    engine_name="crawler-engine",
    domain_id=domain_id,
    body={
        "url": "https://example.com/sitemap.xml"
    }
)
sitemap_id = resp["id"]

# Delete a sitemap
app_search.delete_crawler_sitemap(
    engine_name="crawler-engine",
    domain_id=domain_id,
    sitemap_id=sitemap_id
)