Custom Sources Indexing API Reference

This is a technical API reference. Refer to the Custom API sources for a conceptual walkthrough.

Custom Source API Endpoints and Operations

The Custom Source API supports traditional RESTful operations:

Creating or updating a document:

POST /api/ws/v1/sources/[KEY]/documents/bulk_create

Deleting a document:

POST /api/ws/v1/sources/[KEY]/documents/bulk_destroy

Authenticating requests with the Custom Source API

Each API call requires authentication with an auth_token and identification with a Custom Source key.

The auth_token is your private API key. The key identifies the Custom Source in which documents will be indexed, updated, or deleted.

curl -X POST http://localhost:3002/api/ws/v1/sources/[KEY]/documents/bulk_create \
-H "Authorization: Bearer [AUTH_TOKEN]" \
-H "Content-Type: application/json" \
-d '
  ...
'

Create a new Custom Source or navigate to the Details area of an existing Custom Source from the Workplace Search administrative dashboard to locate these keys.

The auth_token is shared amongst all Custom Sources. The key value is a unique identifier for each Custom Source.
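
The authentication scheme described above can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the helper name build_request and the placeholder values are assumptions, and the request is built but never sent.

```python
import json
import urllib.request


def build_request(base_url, source_key, auth_token, operation, payload):
    """Build (but do not send) an authenticated Custom Source API request.

    build_request is a hypothetical helper name; the URL layout and the
    headers mirror the curl example above.
    """
    url = f"{base_url}/api/ws/v1/sources/{source_key}/documents/{operation}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # The auth_token is shared by all Custom Sources.
            "Authorization": f"Bearer {auth_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; substitute the key and token from the dashboard.
request = build_request(
    "http://localhost:3002", "SOURCE_KEY", "AUTH_TOKEN", "bulk_create", []
)
print(request.get_method(), request.full_url)
```

Sending the request is left to the caller, for example with urllib.request.urlopen(request).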

Schema Management and Configuration

Every Custom Source has its own unique schema, allowing you to create document repositories that truly represent the nature of the information you want your team to access via Workplace Search. Read the Custom API sources guide for a walkthrough of the process.

The following guidelines may help you create a maintainable and scalable schema:

  1. A Custom Source schema can be configured with up to 64 fields.
  2. Always index values for a field as the same type that field has in existing documents.

    • e.g. an existing date field should not receive geolocation data.
  3. Arrays are supported, but nested field objects are not supported.
  4. Fields cannot be deleted once they have been created.
  5. Reserved fields cannot be created:

    • external_id
    • source
    • content_source_id
    • updated_at
    • last_updated
    • highlight
    • any, all, none, or, and, not
    • engine_id
    • _allow_permissions and _deny_permissions
  6. A field name can only contain lowercase letters, numbers, and underscores.
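
Several of the guidelines above can be checked before any document is sent. The following is a rough client-side sketch, assuming a hypothetical helper named validate_field_names; it is not part of the API, and the server remains the authority on what it accepts.

```python
import re

# Reserved names copied from the list above.
RESERVED_FIELDS = {
    "external_id", "source", "content_source_id", "updated_at",
    "last_updated", "highlight", "any", "all", "none", "or", "and",
    "not", "engine_id", "_allow_permissions", "_deny_permissions",
}
MAX_FIELDS = 64
# Lowercase letters, numbers, and underscores only.
FIELD_NAME_PATTERN = re.compile(r"[a-z0-9_]+")


def validate_field_names(field_names):
    """Return a list of human-readable problems; empty means no issue found."""
    problems = []
    if len(field_names) > MAX_FIELDS:
        problems.append(f"{len(field_names)} fields exceeds the limit of {MAX_FIELDS}")
    for name in field_names:
        if name in RESERVED_FIELDS:
            problems.append(f"'{name}' is a reserved field name")
        elif not FIELD_NAME_PATTERN.fullmatch(name):
            problems.append(f"'{name}' contains characters other than a-z, 0-9, and _")
    return problems


print(validate_field_names(["title", "Body", "source"]))
```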

Schema Data Types

Custom Source fields can be one of four different types:

text Fields

Text fields are at the heart of search. They are analyzed fields and are used for full-text matching in information retrieval. Any group of characters or text that you want to search over should be text.

Example: A description of an object, the name of a product, the content of a review.

text is the default type for all new fields.

number Fields

Number fields represent a finite double-precision floating-point value, such as 3.14 or 42. They enable fine-grained sorting, filtering, faceting, and boosting.

Example: A price, a review score, the number of visitors, or a size.

date Fields

Dates must be in ISO 8601 format, for example "2013-02-27T18:09:19Z" or "2013-02-27T17:09:19+01:00".

Example: A product release or publish date, birth date, an air date.
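
Both date shapes shown above can be produced from a timezone-aware datetime. This is one possible sketch; the helper name to_iso8601 is an assumption, and writing UTC with a trailing Z is a stylistic choice made here to match the first example.

```python
from datetime import datetime, timedelta, timezone


def to_iso8601(moment):
    """Serialize a timezone-aware datetime as an ISO 8601 string.

    UTC values are written with a trailing "Z"; other offsets keep the
    "+HH:MM" form. Naive datetimes are rejected because their offset is
    ambiguous.
    """
    if moment.utcoffset() is None:
        raise ValueError("expected a timezone-aware datetime")
    text = moment.replace(microsecond=0).isoformat()
    if text.endswith("+00:00"):
        return text[:-6] + "Z"
    return text


utc_value = datetime(2013, 2, 27, 18, 9, 19, tzinfo=timezone.utc)
cet_value = datetime(2013, 2, 27, 17, 9, 19, tzinfo=timezone(timedelta(hours=1)))
print(to_iso8601(utc_value))  # 2013-02-27T18:09:19Z
print(to_iso8601(cet_value))  # 2013-02-27T17:09:19+01:00
```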

geolocation Fields

Geographic coordinates use the geolocation field type. A location is specified as a string containing the latitude and longitude, in that order, separated by a comma, such as "37.7894758, -122.3940638". The separating space after the comma may be omitted: "37.7894758,-122.3940638".

Example: A store where a product is located, location of a venue.
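
A small helper can keep coordinate strings well-formed. The sketch below assumes the string is ordered latitude first, as the example value above implies (its second number, -122.3940638, lies outside the valid latitude range of -90 to 90). The helper name to_geolocation is hypothetical.

```python
def to_geolocation(latitude, longitude):
    """Format coordinates as a "latitude, longitude" string.

    Assumption: latitude comes first, matching the example value in the
    text. Out-of-range values are rejected before formatting.
    """
    if not -90 <= latitude <= 90:
        raise ValueError(f"latitude out of range: {latitude}")
    if not -180 <= longitude <= 180:
        raise ValueError(f"longitude out of range: {longitude}")
    return f"{latitude}, {longitude}"


print(to_geolocation(37.7894758, -122.3940638))  # 37.7894758, -122.3940638
```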

Indexing and Updating Documents

Index new objects into a Custom Source or update existing documents.

Request limits: Maximum 100 documents per request

POST /api/ws/v1/sources/[KEY]/documents/bulk_create

key (required): Unique key for a Custom Source, provided upon creation of a Custom Source.

auth_token (required): Must be included in the HTTP Authorization header.

id (optional): ID unique to a document, used to identify, modify, or delete the record at a later time. If you do not provide an id, a BSON id will be created for you. Learn more about document IDs with the Workplace Search API reference.

_allow_permissions (optional): Used for document-level security. When a value is set within a document, only users with a matching permission will be able to view it.

_deny_permissions (optional): Used for document-level security. When a value is set within a document, users with the matching permission will be unable to view it. Read the Document permissions for Custom Sources to learn more.

curl -X POST http://localhost:3002/api/ws/v1/sources/[KEY]/documents/bulk_create \
-H "Authorization: Bearer [AUTH_TOKEN]" \
-H "Content-Type: application/json" \
-d '[
  {
    "_allow_permissions": ["permission1"],
    "_deny_permissions": [],
    "id" : 1234,
    "title" : "The Meaning of Time",
    "body" : "Not much. It is a made up thing.",
    "url" : "https://example.com/meaning/of/time",
    "created_at": "2019-06-01T12:00:00+00:00",
    "type": "list"
  },
  {
    "_allow_permissions": [],
    "_deny_permissions": ["permission2"],
    "id" : 1235,
    "title" : "The Meaning of Sleep",
    "body" : "Rest, recharge, and connect to the Ether.",
    "url" : "https://example.com/meaning/of/sleep",
    "created_at": "2019-06-01T12:00:00+00:00",
    "type": "list"
  },
  {
    "_allow_permissions": ["permission1"],
    "_deny_permissions": ["permission2"],
    "id" : 1236,
    "title" : "The Meaning of Life",
    "body" : "Be excellent to each other.",
    "url" : "https://example.com/meaning/of/life",
    "created_at": "2019-06-01T12:00:00+00:00",
    "type": "list"
  }
]'

{
  "results": [
    {
      "id": "1235",
      "errors": []
    }
  ]
}
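
Because a request accepts at most 100 documents, larger sets need to be split before indexing. The sketch below shows one way to batch documents and to pick out results that report errors. The helper names are assumptions, and the response shape is inferred only from the example response above.

```python
MAX_DOCUMENTS_PER_REQUEST = 100  # request limit stated in this section


def batch_documents(documents, batch_size=MAX_DOCUMENTS_PER_REQUEST):
    """Yield consecutive slices of at most batch_size documents."""
    for start in range(0, len(documents), batch_size):
        yield documents[start:start + batch_size]


def ids_with_errors(response_body):
    """Return the ids of results whose "errors" list is not empty.

    Assumes the parsed response is a dict with a "results" list, as in
    the example response; the content of an error entry is not
    specified here.
    """
    return [
        result["id"]
        for result in response_body.get("results", [])
        if result.get("errors")
    ]


documents = [{"id": n, "title": f"Document {n}"} for n in range(250)]
print([len(batch) for batch in batch_documents(documents)])  # [100, 100, 50]
```

Each batch would then be sent as the JSON body of its own bulk_create request.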

Deleting Documents

Remove documents from a Custom Source.

POST /api/ws/v1/sources/[KEY]/documents/bulk_destroy

key (required): Unique key for a Custom Source, provided upon creation of a Custom Source.

auth_token (required): Must be included in the HTTP Authorization header.

id (required): An array of IDs of the documents to delete.

curl -X POST http://localhost:3002/api/ws/v1/sources/[KEY]/documents/bulk_destroy \
-H "Authorization: Bearer [AUTH_TOKEN]" \
-H "Content-Type: application/json" \
-d '[
  [DOCUMENT_ID_1], [DOCUMENT_ID_2]
]'

{
  "results": [
    {
      "id": 1234,
      "success": true
    },
    {
      "id": 1235,
      "success": true
    }
  ]
}
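
The delete request body is a plain JSON array of IDs, and each result carries a success flag. A minimal sketch of building the payload and sorting the results follows. The helper names are hypothetical, and the response shape is taken from the example response above.

```python
import json


def build_destroy_payload(document_ids):
    """Serialize document IDs as the JSON array bulk_destroy expects."""
    return json.dumps(list(document_ids))


def split_destroy_results(response_body):
    """Separate IDs reported as deleted from IDs reported as not deleted."""
    deleted, failed = [], []
    for result in response_body.get("results", []):
        target = deleted if result.get("success") else failed
        target.append(result["id"])
    return deleted, failed


print(build_destroy_payload([1234, 1235]))  # [1234, 1235]
```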

Understanding Document IDs

Each document within a content source must have a unique id. If you do not provide an id, a BSON id will be created for you. Two documents in two separate content sources may have the same id.

You can update an existing document by sending a POST request to the bulk_create endpoint with a document that uses an existing id.

If the id does not exist, a new document is created. It is up to you to maintain the integrity of your id for each document within each Custom API Source.

We recommend that you avoid SHAs or any identifier derived from the content of a document. Any modification of the original data will alter the value, making it difficult to identify the document in the search index. This can lead to record duplication.
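
The difference between a content-derived identifier and a stable one can be illustrated briefly. In this sketch the function names, the "wiki" system name, and the record key 42 are all made up for illustration. The point is only that the hash changes when the body is edited, while an ID built from the source system's own record key does not.

```python
import hashlib


def content_hash_id(document):
    """Discouraged: an ID derived from the document body."""
    return hashlib.sha1(document["body"].encode("utf-8")).hexdigest()


def stable_id(system_name, record_key):
    """Preferred: an ID built from a key that survives content edits."""
    return f"{system_name}_{record_key}"


original = {"body": "Be excellent to each other."}
edited = {"body": "Be excellent to each other, always."}

# Editing the body changes the hash, so the update would be indexed as
# a second document instead of replacing the first.
print(content_hash_id(original) == content_hash_id(edited))  # False

# The stable ID depends only on the record key, so it is unchanged.
print(stable_id("wiki", 42) == stable_id("wiki", 42))  # True
```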

Synchronizing Document-Level Permissions for Custom Sources

Custom Sources allow you to define at the document level which user may or may not access the result as part of the search experience. Two reserved fields (_allow_permissions and _deny_permissions) accept array-type values. Using proper user mapping, you can generate sophisticated document access controls.

Deny permissions take precedence.

Read more in the Document permissions for Custom Sources guide.
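
The precedence rule can be modeled in a few lines. The following is an illustrative sketch of the rules as stated in this reference, not the actual Workplace Search implementation. The function name can_view is hypothetical, and treating a document with an empty _allow_permissions list as unrestricted is an assumption that this reference does not confirm.

```python
def can_view(document, user_permissions):
    """Model the documented allow/deny rules for a single document."""
    allowed = set(document.get("_allow_permissions", []))
    denied = set(document.get("_deny_permissions", []))
    user = set(user_permissions)

    # Deny permissions take precedence over allow permissions.
    if user & denied:
        return False
    # When allow permissions are set, the user needs a matching one.
    if allowed:
        return bool(user & allowed)
    # Assumption: no allow permissions means no allow-side restriction.
    return True


document = {
    "id": 1236,
    "_allow_permissions": ["permission1"],
    "_deny_permissions": ["permission2"],
}
print(can_view(document, ["permission1"]))                 # True
print(can_view(document, ["permission1", "permission2"]))  # False
```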