Search API Reference

The Custom search experiences guide provides a conceptual walkthrough of the steps involved in issuing search requests on behalf of users via OAuth.

Search API Overview

POST http://localhost:3002/api/ws/v1/search

access_token

required

Must be included in the HTTP Authorization header. Acquire it through an OAuth authorization flow.

query

optional

The search query

page

optional

An object with optional size and current keys, which set the number of results per page and which page the API returns.

sort

optional

Sort results ASC or DESC for a field

search_fields

optional

Fields used for full-text matching

result_fields

optional

Fields returned in the JSON response

filters

optional

Query modifiers used to refine a query

facets

optional

Faceting configuration


Search Query

access_token

required

Must be included in the HTTP Authorization header. Acquire it through an OAuth authorization flow.

query

optional

A string or number used to find related documents

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "denali"
}'
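
For readers calling the API from code rather than curl, the request above could be assembled roughly as follows. This is a minimal sketch using only the Python standard library: `build_search_request` is a hypothetical helper name, the URL is the localhost address used throughout this reference, and the request is built but not sent.

```python
import json
import urllib.request

# Endpoint as shown in this reference; adjust for your deployment.
SEARCH_URL = "http://localhost:3002/api/ws/v1/search"


def build_search_request(access_token, body, url=SEARCH_URL):
    """Build (but do not send) a POST request for the Search API."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_search_request("example-token", {"query": "denali"})
# To send it, pass `request` to urllib.request.urlopen() and parse the JSON reply.
```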

Pagination

access_token

required

Must be included in the HTTP Authorization header. Acquire it through an OAuth authorization flow.

page

optional

An object with optional size and current keys, which set the number of results per page and which page the API returns.

size

optional

Specifies the number of results per page. Must be greater than or equal to 1 and less than or equal to 100.

current

optional

Specifies which page of results to retrieve for the query. Must be greater than or equal to 1 and less than or equal to 100.

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "denali",
  "page": {
    "size": 1,
    "current": 1
  }
}'
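
The documented bounds on size and current can be checked before a request is sent. The sketch below is one way to do that; `page_params` is a hypothetical helper, and it leaves out any key that is not supplied so that the API applies its own default.

```python
def page_params(size=None, current=None):
    """Return a `page` object, rejecting values outside the documented 1-100 range."""
    page = {}
    for name, value in (("size", size), ("current", current)):
        if value is None:
            continue  # optional key: omit it and let the API use its default
        if not 1 <= value <= 100:
            raise ValueError(f"page.{name} must be between 1 and 100, got {value}")
        page[name] = value
    return page


body = {"query": "denali", "page": page_params(size=1, current=1)}
```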

Limit on results per query

As described in Pagination, Workplace Search limits page.size to 100 and page.current to 100. Therefore, Workplace Search effectively limits each search query to ten thousand (10000) results.

Work around this limitation by requesting fewer documents. Divide a large set of documents into smaller sets by filtering, perhaps by content source, type, or timestamps (e.g. updated, created). Choose filters that create sets smaller than 10000.
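
One way to apply this workaround is to split a date range into consecutive windows and run one paginated query per window. The sketch below only generates the range filters. The field name `last_updated`, the dates, and the 30-day window are illustrative assumptions, and each window must still contain fewer than 10000 documents.

```python
from datetime import date, timedelta


def date_window_filters(field, start, end, days):
    """Yield range filters covering [start, end) in windows of `days` days.

    `from` is inclusive and `to` is exclusive, so adjacent windows do not overlap.
    """
    step = timedelta(days=days)
    lower = start
    while lower < end:
        upper = min(lower + step, end)
        yield {field: {"from": lower.isoformat(), "to": upper.isoformat()}}
        lower = upper


filters = list(
    date_window_filters("last_updated", date(2020, 1, 1), date(2020, 3, 1), days=30)
)
```

Each generated filter can then be sent as the filters value of its own search request and paged through with page.current.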


Sorting

access_token

required

Must be included in the HTTP Authorization header. Acquire it through an OAuth authorization flow.

sort

optional

Sort results in ascending (ASC) or descending (DESC) order by one or more fields

field

field

Name of the field used for sorting

value

direction

ASC or DESC direction for sorting

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "denali",
  "sort": [
    { "square_km": "desc" },
    { "date_established": "asc" }
  ]
}'

Search Fields

access_token

required

Must be included in the HTTP Authorization header. Acquire it through an OAuth authorization flow.

search_fields

optional

Fields used for full-text matching

field

field

Name of the field used for full-text matching

weight

integer

The relative importance of fields in a query. A higher value represents greater importance for algorithmic scoring.

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "denali",
  "search_fields": {
    "title": {
      "weight": 10
    },
    "description": {}
  }
}'

Result Fields

access_token

required

Must be included in the HTTP Authorization header. Acquire it through an OAuth authorization flow.

result_fields

optional

Fields returned in the JSON response

field

field

Children of the result_fields object. Name of the field to be returned in the response

raw

result type

Value of the field as originally indexed

snippet

result type

Value of the field with highlighting markup added to visually distinguish where the match occurred

size

integer

Character length of the returned value

fallback

boolean

For snippet only. Returns the raw value as snippet even when no match was highlighted

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "america",
  "result_fields": {
    "title": {
      "raw": {},
      "snippet": {}
    },
    "description": {
      "raw": {
        "size": 50
      },
      "snippet": {
        "fallback": true,
        "size": 50
      }
    },
    "states": {
      "snippet": {
        "size": 50
      }
    }
  }
}'

{
  "meta": {
    ...
  },
  "results": [
    {
      "title": {
        "raw": "American Samoa",
        "snippet": "<em>American</em> Samoa"
      },
      "_meta": {
        "source": "custom",
        "last_updated": "2020-03-27T20:10:33+00:00",
        "content_source_id": "5e7e5d911897c6dbb7e3e72a",
        "id": "park_american-samoa",
        "score": 6.359234
      },
      "source": {
        "raw": "custom"
      },
      "states": {
        "snippet": "<em>American</em> Samoa"
      },
      "description": {
        "raw": "The southernmost National Park is on three Samoan",
        "snippet": "The southernmost National Park is on three Samoan"
      },
      "last_updated": {
        "raw": "2020-03-27T20:10:33+00:00"
      },
      "content_source_id": {
        "raw": "5e7e5d911897c6dbb7e3e72a"
      },
      "id": {
        "raw": "park_american-samoa"
      }
    },
    {
      "title": {
        "raw": "Denali",
        "snippet": null
      },
      "_meta": {
        "source": "custom",
        "last_updated": "2020-03-27T20:10:33+00:00",
        "content_source_id": "5e7e5d911897c6dbb7e3e72a",
        "id": "park_denali",
        "score": 6.357545
      },
      "source": {
        "raw": "custom"
      },
      "states": {
        "snippet": null
      },
      "description": {
        "raw": "Centered on Denali, the tallest mountain in North",
        "snippet": " <em>America</em>, Denali is serviced by a single road"
      },
      "last_updated": {
        "raw": "2020-03-27T20:10:33+00:00"
      },
      "content_source_id": {
        "raw": "5e7e5d911897c6dbb7e3e72a"
      },
      "id": {
        "raw": "park_denali"
      }
    },
    ...
  ]
}
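
As the response shows, a snippet is null when no match was highlighted and fallback was not requested. A client that wants something to display can fall back to the raw value itself. The following sketch assumes the response shape shown above; `display_value` is a hypothetical helper.

```python
def display_value(result, field):
    """Prefer the highlighted snippet; fall back to the raw value, else None."""
    values = result.get(field, {})
    snippet = values.get("snippet")
    return snippet if snippet is not None else values.get("raw")


# Trimmed copies of the two results in the sample response.
results = [
    {"title": {"raw": "American Samoa", "snippet": "<em>American</em> Samoa"}},
    {"title": {"raw": "Denali", "snippet": None}, "states": {"snippet": None}},
]
titles = [display_value(result, "title") for result in results]
```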

Value Filters

access_token

required

Must be included in the HTTP Authorization header. Acquire it through an OAuth authorization flow.

filters

optional

Query modifiers used to refine a query

field

field

Name of field upon which to apply your filter

field value

field value

The value upon which to filter. The value must match exactly, including case: True does not match true.

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "filters": {
    "states": ["California", "Washington"]
  }
}'

Range Filters

access_token

required

Must be included in the HTTP Authorization header. Acquire it through an OAuth authorization flow.

filters

optional

Query modifiers used to refine a query

field

field

Name of field upon which to apply your filter

from

optional

Inclusive lower bound of the range. Is required if to is not provided.

to

optional

Exclusive upper bound of the range. Is required if from is not provided.

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "filters": {
    "visitors": {
      "from": 100
    }
  }
}'
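
The bounds are asymmetric: from is inclusive and to is exclusive. The small sketch below restates that rule as a client-side check. It is an illustration of the documented semantics, not the server's implementation, and `in_range` is a hypothetical name.

```python
def in_range(value, lower=None, upper=None):
    """Mirror the documented bounds: `from` is inclusive, `to` is exclusive."""
    if lower is None and upper is None:
        raise ValueError("at least one of from/to is required")
    if lower is not None and value < lower:
        return False
    if upper is not None and value >= upper:
        return False
    return True
```

For example, with "from": 100 a document with exactly 100 visitors matches, while with "to": 10000 a document with exactly 10000 does not.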

Geo Filters

access_token

required

Must be included in the HTTP Authorization header. Acquire it through an OAuth authorization flow.

filters

optional

Query modifiers used to refine a query

field

field

Name of field upon which to apply your filter

center

required

The point from which distance is measured, as a string in "[latitude], [longitude]" format.

unit

required

The base unit of measurement: mm, cm, m (meters), km, in, ft, yd, or mi (miles).

distance

optional

A number giving the distance from the center, expressed in unit. Required if neither from nor to is given.

from

optional

Inclusive lower bound of the range. Is required if to is not provided.

to

optional

Exclusive upper bound of the range. Is required if from is not provided.

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "filters": {
    "location": {
      "unit": "km",
      "center": "47.6062, -122.3321",
      "distance": 1000
    }
  }
}'

Combining Filters

access_token

required

Must be included in the HTTP Authorization header. Acquire it through an OAuth authorization flow.

filters

optional

Query modifiers used to refine a query

all

array

All of the filters must match. This functions as an AND condition.

any

array

At least one of the filters must match. This functions as an OR condition.

none

array

None of the filters may match. This functions as a NOT condition.

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "filters": {
    "all": [
      {
        "states": ["California", "Washington"]
      },
      {
        "visitors": {
          "from": 100
        }
      },
      {
        "date_established": {
          "to": "1900-01-01"
        }
      }
    ],
    "any": [
      {
        "location": {
          "unit": "km",
          "center": "47.6062, -122.3321",
          "distance": 1000
        }
      },
      {
        "location": {
          "unit": "km",
          "center": "37.7749, -122.4194",
          "from": 100,
          "to": 10000
        }
      }
    ],
    "none": {
      "world_heritage_site": "true"
    }
  }
}'
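
The boolean logic of all, any, and none can be restated as a short client-side sketch. It is not how the server evaluates filters. Each clause is modeled as a plain predicate on a document, and two points are assumptions rather than documented behavior: that the three groups are ANDed together, and that an empty group imposes no constraint.

```python
def matches(doc, all_of=(), any_of=(), none_of=()):
    """Evaluate all/any/none groups, each clause being a predicate on the document."""
    return (
        all(clause(doc) for clause in all_of)                        # AND
        and (not any_of or any(clause(doc) for clause in any_of))    # OR
        and not any(clause(doc) for clause in none_of)               # NOT
    )


# Hypothetical document loosely modeled on the parks examples.
park = {"states": ["California"], "visitors": 500, "world_heritage_site": "false"}

is_match = matches(
    park,
    all_of=[lambda d: "California" in d["states"], lambda d: d["visitors"] >= 100],
    any_of=[lambda d: d["visitors"] >= 1000, lambda d: "California" in d["states"]],
    none_of=[lambda d: d["world_heritage_site"] == "true"],
)
```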

Value Facets

access_token

required

Must be included in the HTTP Authorization header. Acquire it through an OAuth authorization flow.

facets

optional

Faceting configuration

field

field

Name of field upon which to apply your facet

type

required

Type of facet, in this case value

name

optional

Name given to facet

size

optional

Number of facet values to return. Must be between 1 and 250. Defaults to 10.

sort

optional

JSON object where the key is either count or value and the value is asc or desc. The default is sorting by descending count.

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "facets": {
    "states": {
      "type": "value",
      "name": "top-five-states",
      "sort": { "count": "desc" },
      "size": 5
    }
  }
}'

{
  "meta": {
    ...
  },
  "results": [
    ...
  ],
  "facets": {
    "states": [
      {
        "type": "value",
        "data": [
          {
            "value": "Alaska",
            "count": 5
          },
          {
            "value": "Utah",
            "count": 2
          },
          {
            "value": "Colorado",
            "count": 2
          },
          {
            "value": "California",
            "count": 2
          },
          {
            "value": "Washington",
            "count": 1
          }
        ],
        "name": "top-five-states"
      }
    ]
  }
}
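
A client will usually flatten this structure before rendering it. The sketch below turns the facets for one field into a mapping from facet name to value counts. It assumes the response shape shown above; `facet_counts` is a hypothetical helper, and the sample data is a trimmed copy of that response.

```python
def facet_counts(response, field):
    """Map each facet's name to a {value: count} dict for one faceted field."""
    return {
        facet.get("name"): {item["value"]: item["count"] for item in facet["data"]}
        for facet in response.get("facets", {}).get(field, [])
    }


response = {
    "facets": {
        "states": [
            {
                "type": "value",
                "data": [
                    {"value": "Alaska", "count": 5},
                    {"value": "Utah", "count": 2},
                ],
                "name": "top-five-states",
            }
        ]
    }
}
counts = facet_counts(response, "states")
```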

Range Facets

access_token

required

Must be included in the HTTP Authorization header. Acquire it through an OAuth authorization flow.

facets

optional

Faceting configuration

field

field

Name of field upon which to apply your facet

type

required

Type of facet, in this case range

name

optional

Name given to facet

ranges

optional

An array of range objects

from

optional

Inclusive lower bound of the range. Is required if to is not provided.

to

optional

Exclusive upper bound of the range. Is required if from is not provided.

name

optional

Name given to range

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "facets": {
    "acres": [
      {
        "type": "range",
        "name": "min-and-max-range",
        "ranges": [
          { "from": 1, "to": 10000 },
          { "from": 10000 }
        ]
      }
    ],
    "date_established": {
      "type": "range",
      "name": "half-century",
      "ranges": [
        { "from": "1900-01-01T12:00:00+00:00", "to": "1950-01-01T00:00:00+00:00" }
      ]
    }
  }
}'

{
  "meta": {
    ...
  },
  "results": [
    ...
  ],
  "facets": {
    "acres": [
      {
        "type": "range",
        "data": [
          {
            "key": "5e7e6cbd1897c6aa79a79c95",
            "from": 1,
            "to": 10000,
            "count": 1
          },
          {
            "key": "5e7e6cbd1897c6aa79a79c96",
            "from": 10000,
            "count": 19
          }
        ],
        "name": "min-and-max-range"
      }
    ],
    "date_established": [
      {
        "type": "range",
        "data": [
          {
            "key": "5e7e6cbd1897c6aa79a79c97",
            "from": "1900-01-01T12:00:00.000Z",
            "to": "1950-01-01T00:00:00.000Z",
            "count": 11
          }
        ],
        "name": "half-century"
      }
    ]
  }
}

Geolocation Facets

access_token

required

Must be included in the HTTP Authorization header. Acquire it through an OAuth authorization flow.

facets

optional

Faceting configuration

field

field

Name of field upon which to apply your facet

type

required

Type of facet, in this case range

name

optional

Name given to facet

ranges

optional

An array of range objects

from

optional

Inclusive lower bound of the range. Is required if to is not provided.

to

optional

Exclusive upper bound of the range. Is required if from is not provided.

center

required

The point from which distance is measured, as a string in "[latitude], [longitude]" format.

unit

required

The base unit of measurement: mm, cm, m (meters), km, in, ft, yd, or mi (miles).

name

optional

Name given to range

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "facets": {
    "location": {
      "type": "range",
      "name": "geo-range-from-san-francisco",
      "center": "37.386483, -122.083842",
      "unit": "m",
      "ranges": [
        { "from": 0, "to": 100000, "name": "Nearby" },
        { "from": 100000, "to": 300000, "name": "A longer drive." },
        { "from": 300000, "name": "Perhaps fly?" }
      ]
    }
  }
}'

{
  "meta": {
    ...
  },
  "results": [
    ...
  ],
  "facets": {
    "location": [
      {
        "type": "range",
        "data": [
          {
            "key": "Nearby",
            "from": 0,
            "to": 100000,
            "count": 0
          },
          {
            "key": "A longer drive.",
            "from": 100000,
            "to": 300000,
            "count": 0
          },
          {
            "key": "Perhaps fly?",
            "from": 300000,
            "count": 20
          }
        ],
        "name": "geo-range-from-san-francisco"
      }
    ]
  }
}

Boosts

Boosting allows you to control the relevance of a document based on criteria for the value of a field (or fields) within a document.

Different boosts are applied to different field types.

  • Value boosts: text, number, date
  • Functional boosts: number
  • Proximity boosts: number, geolocation
  • Recency boosts: date

The general format for a single boost looks like so:

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
    "boosts": {
        "(field)": { (boost parameters) },
        "(field_2)": ...,
        ...
    }
}'

The general format for multiple boosts looks like so:

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
    "boosts": {
        "(field)": [
            { (boost parameter 1) },
            { (boost parameter 2) },
            ...
        ]
    }
}'

Value boosts

A value boost will boost the score of a document based on a direct value match. Available on text, number, and date fields. A document’s overall score will only be boosted once.

access_token

required

Must be included in the HTTP Authorization header. Acquire it through an OAuth authorization flow.

boosts

optional

Boosting configuration

type

required

Type of boost. For a value boost, this should be value.

value

required

The value to exact match on, or use an array to match on multiple values.

operation

optional

The arithmetic operation used to combine the original document score with your boost value. Can be add or multiply. Defaults to add.

factor

optional

Factor to alter the impact of a boost on the score of a document. Must be between 0.0 and 10.0. Defaults to 1.0.

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "",
  "sort": { "_score": "desc" },
  "boosts": {
    "states": {
      "type": "value",
      "value": "California",
      "operation": "multiply",
      "factor": 2
    }
  },
  "result_fields": {
    "title": { "raw": {} },
    "visitors": { "raw": {} },
    "states": { "raw": {} }
  },
  "page": {
    "size": 100
  }
}'

{
    "meta": {
      ...
    },
    "results": [
        {
            "title": {
                "raw": "Channel Islands"
            },
            "_meta": {
                "source": "custom",
                "last_updated": "2020-10-22T17:20:58+00:00",
                "content_source_id": "5f91bf7888f9297b02a3d80c",
                "id": "park_channel-islands",
                "score": 2.0
            },
            "source": {
                "raw": "custom"
            },
            "states": {
                "raw": [
                    "California"
                ]
            },
            "last_updated": {
                "raw": "2020-10-22T17:20:58+00:00"
            },
            "visitors": {
                "raw": 364807.0
            },
            "content_source_id": {
                "raw": "5f91bf7888f9297b02a3d80c"
            },
            "id": {
                "raw": "park_channel-islands"
            }
        },
        {
            "title": {
                "raw": "Death Valley"
            },
            "_meta": {
                "source": "custom",
                "last_updated": "2020-10-22T17:20:58+00:00",
                "content_source_id": "5f91bf7888f9297b02a3d80c",
                "id": "park_death-valley",
                "score": 2.0
            },
            "source": {
                "raw": "custom"
            },
            "states": {
                "raw": [
                    "California",
                    "Nevada"
                ]
            },
            "last_updated": {
                "raw": "2020-10-22T17:20:58+00:00"
            },
            "visitors": {
                "raw": 1296283.0
            },
            "content_source_id": {
                "raw": "5f91bf7888f9297b02a3d80c"
            },
            "id": {
                "raw": "park_death-valley"
            }
        },
        ...
    ]
}

Functional boosts

A functional boost will apply a function to the overall document score based on the value of the numeric field. Only available on number fields.

access_token

required

Must be included in the HTTP Authorization header. Acquire it through an OAuth authorization flow.

boosts

optional

Boosting configuration

type

required

Type of boost. For a functional boost, this should be functional.

function

required

Type of function to calculate the boost value. Can be linear, exponential, or logarithmic.

operation

optional

The arithmetic operation used to combine the original document score with your boost value. Can be add or multiply. Defaults to add.

factor

optional

Factor to alter the impact of a boost on the score of a document. Must be between 0.0 and 10.0. Defaults to 1.0.

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "",
  "filters": { "states": ["California"] },
  "sort": { "_score": "desc" },
  "boosts": {
    "visitors": {
      "type": "functional",
      "function": "linear",
      "operation": "multiply",
      "factor": 2
    }
  },
  "result_fields": {
    "title": { "raw": {} },
    "visitors": { "raw": {} }
  }
}'

{
    "meta": {
      ...
    },
    "results": [
        {
            "title": {
                "raw": "Yosemite"
            },
            "_meta": {
                "source": "custom",
                "last_updated": "2020-10-22T17:20:58+00:00",
                "content_source_id": "5f91bf7888f9297b02a3d80c",
                "id": "park_yosemite",
                "score": 1.0057736E7
            },
            "source": {
                "raw": "custom"
            },
            "last_updated": {
                "raw": "2020-10-22T17:20:58+00:00"
            },
            "visitors": {
                "raw": 5028868.0
            },
            "content_source_id": {
                "raw": "5f91bf7888f9297b02a3d80c"
            },
            "id": {
                "raw": "park_yosemite"
            }
        },
        {
            "title": {
                "raw": "Joshua Tree"
            },
            "_meta": {
                "source": "custom",
                "last_updated": "2020-10-22T17:20:58+00:00",
                "content_source_id": "5f91bf7888f9297b02a3d80c",
                "id": "park_joshua-tree",
                "score": 5010572.0
            },
            "source": {
                "raw": "custom"
            },
            "last_updated": {
                "raw": "2020-10-22T17:20:58+00:00"
            },
            "visitors": {
                "raw": 2505286.0
            },
            "content_source_id": {
                "raw": "5f91bf7888f9297b02a3d80c"
            },
            "id": {
                "raw": "park_joshua-tree"
            }
        },
        ...
    ]
}
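
The scores in this response are consistent with a simple reading of the parameters. With an empty query each document appears to start from a base score of 1.0, and a linear boost with operation multiply and factor 2 yields 1.0 × (2 × visitors). The sketch below reproduces that arithmetic. The formula is inferred from the sample output above rather than taken from the scoring implementation, and the base score of 1.0 and the behavior of add are assumptions.

```python
def linear_boost(base_score, field_value, factor=1.0, operation="add"):
    """Combine a base score with a linear functional boost (inferred formula)."""
    boost = factor * field_value
    if operation == "multiply":
        return base_score * boost
    if operation == "add":
        return base_score + boost
    raise ValueError(f"unsupported operation: {operation}")


# Visitor counts taken from the sample response.
yosemite = linear_boost(1.0, 5028868.0, factor=2, operation="multiply")
joshua_tree = linear_boost(1.0, 2505286.0, factor=2, operation="multiply")
```

Here `yosemite` evaluates to 10057736.0 (shown in the response as 1.0057736E7) and `joshua_tree` to 5010572.0, matching the returned scores.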

Proximity boosts

A proximity boost raises a document's score according to how close a field value is to the value given in the center parameter. Available on number and geolocation fields.

For date fields see recency boosts.

access_token

required

Must be included in the HTTP Authorization header. Acquire it through an OAuth authorization flow.

boosts

optional

Boosting configuration

type

required

Type of boost. For a proximity boost, this should be proximity.

function

required

Type of function to calculate the boost value. Can be linear, exponential, or gaussian.

center

required

The mode of the distribution. Should be a number or a set of geolocation coordinates, like 25.32, -80.93.

factor

optional

Factor to alter the impact of a boost on the score of a document. Must be between 0.0 and 10.0. Defaults to 1.0.

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "boosts": {
    "location": {
      "type": "proximity",
      "function": "linear",
      "center": "25.32, -80.93",
      "factor": 8
    }
  },
  "query": "old growth"
}'

Recency boosts

A recency boost is a proximity boost whose center is a point in time instead of a number or coordinate. Recency boosts use the same syntax as proximity boosts but operate only on date/time fields. In addition, the center can be the keyword "now", which sets the center, or origin, of the boost function to the current date/time.

Only applies to date fields.

access_token

required

Must be included in the HTTP Authorization header. Acquire it through an OAuth authorization flow.

boosts

optional

Boosting configuration

type

required

Type of boost. For a recency boost, this should be proximity.

function

required

Type of function to calculate the boost value. Can be linear, exponential, or gaussian.

center

required

A point in time. Use now to measure recency from the present, or an ISO 8601 date/time value such as 1974-01-13T05:15:12.65Z.

factor

optional

Factor to alter the impact of a boost on the score of a document. Must be between 0.0 and 10.0. Defaults to 1.0.

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "boosts": {
    "date_established": {
      "type": "proximity",
      "function": "linear",
      "center": "now",
      "factor": 8
    }
  },
  "query": "old growth"
}'
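
A recency boost body can also be assembled programmatically. The sketch below builds the boost object with either the keyword now or a timezone-aware datetime converted to an ISO 8601 string. `recency_boost` is a hypothetical helper, and the field name `date_established` is borrowed from the filter examples in this reference as an assumed date field.

```python
from datetime import datetime, timezone

ALLOWED_FUNCTIONS = ("linear", "exponential", "gaussian")


def recency_boost(center="now", function="linear", factor=1.0):
    """Build a recency boost: a proximity boost centered on a point in time."""
    if isinstance(center, datetime):
        # Expects a timezone-aware datetime; normalize it to UTC with a Z suffix.
        center = center.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")
    if function not in ALLOWED_FUNCTIONS:
        raise ValueError(f"function must be one of {ALLOWED_FUNCTIONS}")
    if not 0.0 <= factor <= 10.0:
        raise ValueError("factor must be between 0.0 and 10.0")
    return {"type": "proximity", "function": function, "center": center, "factor": factor}


body = {
    "query": "old growth",
    "boosts": {"date_established": recency_boost(factor=8)},
}
fixed = recency_boost(datetime(1974, 1, 13, 5, 15, 12, tzinfo=timezone.utc))
```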

Automatic query refinements

Automatic query refinements are on by default.

To disable automatic query refinements, set the top-level automatic_query_refinement field to false:

{
  …
  "automatic_query_refinement": {true|false, default: true}
  …
}

automatic_query_refinement

optional

Can be true or false. Defaults to true.

curl -X POST http://localhost:3002/api/ws/v1/search \
-H "Authorization: Bearer $ACCESS_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "query": "documents updated last week",
  "automatic_query_refinement": true
}'

{
  "meta": {
    ...
    "query_refinements": {
      "submitted_query": "",
      "decorated_query_html": "<span class=\"tag tag-filter active highlighted\">documents</span> updated <span class=\"tag tag-filter active highlighted\">last week</span>",
      "refinements": [
        {
          "term": "documents",
          "position": [0, 8],
          "trigger_type": "filter",
          "trigger_filter_type": "static",
          "filter": {
            "mime_type": ["application/iwork-keynote-sffkey", "application/iwork-numbers-sffnumbers", ... (full list abbreviated here for clarity) ... ]
          }
        },
        {
          "term": "updated last week",
          "position": [10, 27],
          "trigger_type": "filter",
          "trigger_filter_type": "static",
          "filter": {
            "last_updated": {
              "from": "2020-04-10"
            }
          }
        }
      ]
    }
    ...
  },
  ...
}

The response fields break down as follows:

submitted_query

The actual query that is submitted to the underlying Elasticsearch instance. May be transformed based on filter settings.

decorated_query_html

The query with the triggered terms and phrases highlighted. You might use this to display the query via HTML with stylistic decoration.

refinements

Metadata about each filter or query refinement that was created. Includes the term or phrase, the start and end character position in the original query, the filter type, and the actual filter that was built. You can re-use the returned filter fields in new search queries as they appear.

trigger_filter_type

Can be: (1) static: a static filter based on a value or range, (2) dynamic: a filter based on dynamic fields such as a status, tag, assigned person, or similar, in the document itself, (3) me: a specific filter pertaining to the logged in user, or (4) content: a filter based on the content source, like GitHub, Slack, and so on.
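
Because the returned filter objects can be reused as they appear, a client can collect them from one response and attach them to a later query. The sketch below assumes the response shape shown above. `filters_from_refinements` is a hypothetical helper, the follow-up query text is made up, and wrapping the collected filters in all is one possible choice, not a documented requirement.

```python
def filters_from_refinements(meta):
    """Collect the filter objects returned under meta.query_refinements."""
    refinements = meta.get("query_refinements", {}).get("refinements", [])
    return [
        refinement["filter"]
        for refinement in refinements
        if refinement.get("trigger_type") == "filter" and "filter" in refinement
    ]


# Trimmed copy of the `meta` object from the sample response.
meta = {
    "query_refinements": {
        "refinements": [
            {
                "term": "updated last week",
                "position": [10, 27],
                "trigger_type": "filter",
                "trigger_filter_type": "static",
                "filter": {"last_updated": {"from": "2020-04-10"}},
            }
        ]
    }
}
follow_up = {"query": "roadmap", "filters": {"all": filters_from_refinements(meta)}}
```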