Connecting GitHub

Connector configuration instructions provided in this guide apply both to GitHub and GitHub Enterprise Server.

GitHub is a development, version control, and collaboration platform for teams of all sizes. From open source to business, you can host and review code, manage projects, and build software across departments and continents. The GitHub connectors provided with Workplace Search automatically capture, sync, and index the following items:

Issues

Including ID, Content, Status, Repository, Created By, Comments, Comment authors and timestamps

Pull Requests

Including ID, Content, Status, Repository, Created By, Comments, Comment authors and timestamps

Repositories

Including ID, Name, and README content as description. Repository content (e.g. code, files, wiki) is excluded.

Workplace Search supports GitHub Enterprise Server version 2.19.4.

Configuring the GitHub Connector

Configuring the GitHub connector is the first step prior to connecting the GitHub service to Workplace Search, and requires that you create an OAuth App from the GitHub platform. To get started, first log in to GitHub and access your administrative dashboard:


Step 1. Locate the Account drop-down menu in the top-right area and navigate to Settings:

Figure 22. Connecting GitHub

Step 2. From here, you’ll see a set of menu items on the left under Personal Settings. Click Developer Settings:

Figure 23. Connecting GitHub

Workplace Search needs an OAuth App with which to interface. There are two important things to understand before you create one:

  1. The app can stay in developer mode. You do not need to publish it.
  2. Make sure that you create this app with a trusted and stable GitHub account.

We recommend creating a team-owned account used only for this app. If access to the account is lost, a new OAuth App must be created and the configuration updated in Workplace Search.


Step 3. Click New OAuth App:

Figure 24. Connecting GitHub

You’ll need to fill in specific values:

  • Application Name: A name to help you identify the application. It’s best to make it explicit, for example Workplace Search.
  • Description (Optional): More information will help you remember the application’s purpose.
  • Homepage URL: The base URL of the user interface used to manage Workplace Search (scheme + host). This is affected by which user interface you are using to manage Enterprise Search. Enterprise Search in Kibana and standalone Enterprise Search use different base URLs. See user interfaces for details on each UI.

    For Enterprise Search in Kibana, this should correspond to the value of kibana.external_url in your enterprise-search.yml.

    For standalone Enterprise Search this will be the base URL of Enterprise Search.

    Examples:

    # Deployment using a custom domain name
    https://www.example.com
    
    # Deployment using a default Elastic Cloud domain name and Standalone Enterprise Search
    https://c3397e558e404195a982cb68e84fbb42.ent-search.us-east-1.aws.found.io:443
    
    # Unsecured local development environment for Standalone Enterprise Search
    http://localhost:3002
    
    # Deployment using a default Elastic Cloud domain name and Enterprise Search in Kibana
    https://c3397e558e404195a982cb68e84fbb42.kb.us-east-1.aws.found.io:443
  • Authorization callback URL: Among other factors, the Authorization callback URL is affected by which user interface you are using to manage Enterprise Search. Enterprise Search in Kibana and standalone Enterprise Search use different callback URLs. See user interfaces for details on each UI.

    When using Standalone Enterprise Search, use the following URL, substituting <WS_BASE_URL> with the base URL at which Workplace Search is hosted (scheme + host, no path).

    <WS_BASE_URL>/ws/

    Examples:

    # Deployment using a custom domain name
    https://www.example.com/ws/
    
    # Deployment using a default Elastic Cloud domain name
    https://c3397e558e404195a982cb68e84fbb42.ent-search.us-east-1.aws.found.io:443/ws/
    
    # Unsecured local development environment
    http://localhost:3002/ws/

    When using Enterprise Search in Kibana, use the following URL, substituting <KIBANA_BASE_URL> with the base URL of your Kibana instance. This should correspond with the value of kibana.external_url in your enterprise-search.yml:

    <KIBANA_BASE_URL>/app/enterprise_search/workplace_search/sources/added

    Examples:

    # Deployment using a custom domain name for Kibana
    https://www.example.com/app/enterprise_search/workplace_search/sources/added
    
    # Deployment using a default Elastic Cloud domain name for Kibana
    https://c3397e558e404195a982cb68e84fbb42.kb.us-east-1.aws.found.io:443/app/enterprise_search/workplace_search/sources/added
    
    # Unsecured local Kibana environment
    http://localhost:5601/app/enterprise_search/workplace_search/sources/added
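The two callback URL patterns above can be sketched as a small helper. This is only an illustration of the substitution rules described in this guide: the function name `callback_url` and the `ui` labels `"standalone"` and `"kibana"` are hypothetical, not part of Workplace Search.

```python
def callback_url(base_url: str, ui: str) -> str:
    """Build the Authorization callback URL from a base URL (scheme + host).

    `ui` is a hypothetical label for this sketch:
      "standalone" -> <WS_BASE_URL>/ws/
      "kibana"     -> <KIBANA_BASE_URL>/app/enterprise_search/workplace_search/sources/added
    """
    # Drop any trailing slash so the joined URL has exactly one separator.
    base = base_url.rstrip("/")
    if ui == "standalone":
        return f"{base}/ws/"
    if ui == "kibana":
        return f"{base}/app/enterprise_search/workplace_search/sources/added"
    raise ValueError(f"unknown ui: {ui!r}")


if __name__ == "__main__":
    print(callback_url("http://localhost:3002", "standalone"))
    print(callback_url("http://localhost:5601", "kibana"))
```

When using Enterprise Search in Kibana, the base URL passed in should be the same value as kibana.external_url in your enterprise-search.yml.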

Step 4. Once the form is complete, click Register Application:

Figure 25. Connecting GitHub

Step 5. The app is created, and we may now retrieve the Client ID and Client Secret.

Figure 26. Connecting GitHub

Step 6. From the Workplace Search administrative dashboard’s Sources area, locate GitHub and provide both the Client ID and Client Secret.

Voilà! The GitHub connector is now configured and ready to synchronize content. To capture data, you must now connect a GitHub instance with the appropriate authentication credentials.

Connecting GitHub to Workplace Search

Once the GitHub connector has been configured, you may connect a GitHub instance to your organization.


Step 1. Head to your organization’s Workplace Search administrative dashboard, and locate the Sources tab.


Step 2. Click Add a new source.


Step 3. Select GitHub (or GitHub Enterprise) in the Configured Sources list, and follow the GitHub authentication flow as presented.


Step 4. After a successful authentication flow, you are redirected to Workplace Search and prompted to select the Organization you would like to synchronize.

GitHub content will now be captured and will be ready for search gradually as it is synced. Once successfully configured and connected, the GitHub synchronization automatically occurs every 2 hours.

Document-level permissions

You can synchronize document access permissions from GitHub to Workplace Search. This will ensure the right people see the right documents.

See Document-level permissions for GitHub.

Adding GitHub requires that you belong to and have OAuth permissions within a GitHub organization, usually as a GitHub organization admin-level user.

Limiting the content to be indexed

If you don’t need to index all the available content, you can specify indexing rules via the API. This helps shorten indexing times and limits the size of the index. See Customizing indexing. For GitHub and GitHub Enterprise, the applicable rule types are path_template and object_type. When writing path_template rules, note that GitHub document paths generally follow their URL value:

https://github.com/elastic
  /elastic

https://github.com/elastic/elasticsearch
  /elastic/elasticsearch

https://github.com/elastic/elasticsearch/projects/2
  /elastic/elasticsearch/projects/2

https://github.com/elastic/elasticsearch/issues/77316
  /elastic/elasticsearch/issues/77316

https://github.com/elastic/elasticsearch/pull/77320
  /elastic/elasticsearch/pull/77320
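The URL-to-path mapping shown above can be approximated in a few lines, which may help when drafting path_template rules. This is a minimal sketch that assumes the document path is simply the path component of the item's URL; the helper name `document_path` is hypothetical, and since paths only "generally" follow the URL, actual indexed paths should be checked before relying on a rule.

```python
from urllib.parse import urlsplit


def document_path(url: str) -> str:
    """Return the path component of a GitHub URL, without a trailing slash.

    Assumption for this sketch: the Workplace Search document path matches
    the URL path, as in the examples listed in this guide.
    """
    path = urlsplit(url).path.rstrip("/")
    # A bare host (no path) is treated as the root path.
    return path or "/"


if __name__ == "__main__":
    for url in (
        "https://github.com/elastic",
        "https://github.com/elastic/elasticsearch/issues/77316",
        "https://github.com/elastic/elasticsearch/pull/77320",
    ):
        print(url, "->", document_path(url))
```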

Synchronized fields

The following table lists the fields synchronized from the connected source to Workplace Search. The attributes in the table apply to the default search application, as follows:

  • Display name - The label used when displayed in the UI
  • Field name - The name of the underlying field attribute
  • Faceted filter - whether the field is a faceted filter by default, or can be enabled (see also: Customizing filters)
  • Automatic query refinement preceding phrases - The default list of phrases that must precede a value of this field in a search query in order to automatically trigger query refinement. If "None," a value from this field may trigger refinement regardless of where it is found in the query string. If '', a value from this field must be the first token(s) in the query string. If N.A., automatic query refinement is not available for this field by default. All fields that have a faceted filter (default or configurable) can also be configured for automatic query refinement; see also Update a content source, Get a content source’s automatic query refinement details and Customizing filters.

Display name   Field name        Faceted filter   Automatic query refinement preceding phrases
-------------  ----------------  ---------------  -----------------------------------------------
Id             id                No               N.A.
URL            url               No               N.A.
Title          title             No               N.A.
Type           type              Default          None
Body           body              No               N.A.
Created at     created_at        No               N.A.
Updated at     updated_at        No               N.A.
Last updated   last_updated      No               N.A.
Slug           slug              No               N.A.
Status         status            Default          [with status, status is, in state, '']
Repository     repository        Default          N.A.
Created by     created_by        Default          [creator is, created by, edited by, modified by]
Assigned to    assigned_to       Default          [assigned to]
Private        private           Configurable     N.A.
Path           path              No               N.A.
Commented by   comment_authors   No               [commented by]
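
The preceding-phrase rules described above can be illustrated with a small sketch. This is not how Workplace Search implements query refinement; it only models the three cases from the description, using simple whitespace tokenization. The function name `triggers_refinement` is hypothetical. In this sketch, `None` stands for the "None" setting (a value may match anywhere), an empty string in the list stands for '' (the value must start the query), and N.A. fields are represented by an empty list, which never matches.

```python
from typing import Optional, Sequence


def triggers_refinement(query: str, value: str,
                        preceding: Optional[Sequence[str]]) -> bool:
    """Return True if `value` appears in `query` in a position allowed by
    the field's preceding phrases (illustrative model only)."""
    q = query.lower().split()
    v = value.lower().split()
    if not v:
        return False
    # Token positions at which the field value occurs in the query.
    positions = [i for i in range(len(q) - len(v) + 1) if q[i:i + len(v)] == v]
    if not positions:
        return False
    if preceding is None:
        # "None": the value may appear anywhere in the query string.
        return True
    for i in positions:
        for phrase in preceding:
            p = phrase.lower().split()
            if not p:
                # '': the value must be the first token(s) of the query.
                if i == 0:
                    return True
            elif i >= len(p) and q[i - len(p):i] == p:
                # The phrase immediately precedes the value.
                return True
    return False


if __name__ == "__main__":
    status_phrases = ["with status", "status is", "in state", ""]
    print(triggers_refinement("issues with status open", "open", status_phrases))
    print(triggers_refinement("find open issues", "open", status_phrases))
```

With the Status phrases from the table, a query such as "issues with status open" matches because "with status" directly precedes the value, while "find open issues" does not, since the value is neither preceded by a listed phrase nor at the start of the query.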