Getting Started with Kibana

Now that you have Kibana installed, you can step through this tutorial to get fast hands-on experience with key Kibana functionality. By the end of this tutorial, you will have:

  • Loaded a sample data set into your Elasticsearch installation
  • Defined at least one index pattern
  • Used the Discover functionality to explore your data
  • Set up some visualizations to graphically represent your data
  • Assembled visualizations into a Dashboard

The material in this section assumes you have a working Kibana install connected to a working Elasticsearch install.

Before You Start: Loading Sample Data

edit

The tutorials in this section rely on the following data sets:

  • The complete works of William Shakespeare, suitably parsed into fields. Download shakespeare.json.
  • A set of fictitious accounts with randomly generated data. Download accounts.zip.
  • A set of randomly generated log files. Download logs.jsonl.gz.

Two of the data sets are compressed. Use the following commands to extract the files:

unzip accounts.zip
gunzip logs.jsonl.gz

The Shakespeare data set is organized in the following schema:

{
    "line_id": INT,
    "play_name": "String",
    "speech_number": INT,
    "line_number": "String",
    "speaker": "String",
    "text_entry": "String",
}
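
For reference, a single document in this shape might look like the following. The values here are made up for illustration and are not taken from the actual file:

{
    "line_id": 42,
    "play_name": "Hamlet",
    "speech_number": 3,
    "line_number": "1.2.34",
    "speaker": "HAMLET",
    "text_entry": "A sample line of dialogue."
}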

The accounts data set is organized in the following schema:

{
    "account_number": INT,
    "balance": INT,
    "firstname": "String",
    "lastname": "String",
    "age": INT,
    "gender": "M or F",
    "address": "String",
    "employer": "String",
    "email": "String",
    "city": "String",
    "state": "String"
}
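
Again purely for illustration (these values are invented, not taken from accounts.zip), an account document is a flat object:

{
    "account_number": 0,
    "balance": 16623,
    "firstname": "Jane",
    "lastname": "Doe",
    "age": 32,
    "gender": "F",
    "address": "123 Example Street",
    "employer": "Acme Corp",
    "email": "jane.doe@example.com",
    "city": "Springfield",
    "state": "IL"
}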

The schema for the logs data set has dozens of different fields, but the notable ones used in this tutorial are:

{
    "memory": INT,
    "geo.coordinates": "geo_point"
    "@timestamp": "date"
}
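
Here geo.coordinates is a path into a nested geo object, as the mapping later in this section makes explicit, and @timestamp holds a date. An illustrative document, with invented values, might look like this:

{
    "memory": 212580,
    "geo": {
        "coordinates": {
            "lat": 37.77,
            "lon": -122.42
        }
    },
    "@timestamp": "2015-05-18T12:00:00.000Z"
}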

Before we load the Shakespeare data set, we need to set up a mapping for the fields. Mapping divides the documents in the index into logical groups and specifies each field’s characteristics, such as whether the field is searchable and whether it is tokenized, that is, broken up into separate words.

Use the following command to set up a mapping for the Shakespeare data set:

curl -XPUT http://localhost:9200/shakespeare -d '
{
 "mappings" : {
  "_default_" : {
   "properties" : {
    "speaker" : {"type": "string", "index" : "not_analyzed" },
    "play_name" : {"type": "string", "index" : "not_analyzed" },
    "line_id" : { "type" : "integer" },
    "speech_number" : { "type" : "integer" }
   }
  }
 }
}
';

This mapping specifies the following qualities for the data set:

  • The speaker field is a string that isn’t analyzed. The string in this field is treated as a single unit, even if there are multiple words in the field.
  • The same applies to the play_name field.
  • The line_id and speech_number fields are integers.
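
If you want to confirm that the mapping was applied as expected, you can retrieve it back from Elasticsearch. This check is optional and not part of the original steps:

curl -XGET 'localhost:9200/shakespeare/_mapping?pretty'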

The logs data set requires a mapping to label the latitude/longitude pairs in the logs as geographic locations by applying the geo_point type to those fields.

Use the following command to establish geo_point mapping for the logs:

curl -XPUT http://localhost:9200/logstash-2015.05.18 -d '
{
 "mappings" : {
  "log" : {
   "properties" : {
    "geo" : {
     "properties" : {
      "coordinates" : {
       "type" : "geo_point"
      }
     }
    }
   }
  }
 }
}
';

Because the logs data set is in three indices, one for each day in a three-day period, run the mapping command two more times, changing the index name to logstash-2015.05.19 and then to logstash-2015.05.20, or use a shell loop like the one below.
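
If you prefer not to edit and re-run the command by hand, a small shell loop can apply the same mapping to the remaining two indices. This is just a sketch; adjust the host if your Elasticsearch instance is not on localhost:9200:

# apply the same geo_point mapping to the remaining daily indices
for day in 2015.05.19 2015.05.20; do
 curl -XPUT "http://localhost:9200/logstash-$day" -d '
 {
  "mappings" : {
   "log" : {
    "properties" : {
     "geo" : {
      "properties" : {
       "coordinates" : { "type" : "geo_point" }
      }
     }
    }
   }
  }
 }
 '
done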

The accounts data set doesn’t require any mappings, so at this point we’re ready to use the Elasticsearch bulk API to load the data sets with the following commands:

curl -XPOST 'localhost:9200/accounts/account/_bulk?pretty' --data-binary @accounts.json
curl -XPOST 'localhost:9200/shakespeare/_bulk?pretty' --data-binary @shakespeare.json
curl -XPOST 'localhost:9200/_bulk?pretty' --data-binary @logs.jsonl
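
Each of these files is newline-delimited JSON in the Elasticsearch bulk format: an action line precedes each document and can name the target index (otherwise the index in the request URL is used). A simplified sketch of one entry, not the literal contents of the downloaded files, looks like this:

{"index":{"_index":"shakespeare"}}
{"line_id": 1, "play_name": "Hamlet", "speech_number": 1, "line_number": "1.1.1", "speaker": "BERNARDO", "text_entry": "A sample line of dialogue."}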

These commands may take some time to execute, depending on the computing resources available.

These commands assume your Elasticsearch cluster is not using the Shield security plugin. If your Elasticsearch cluster has Shield configured, pass credentials with curl's -u option, and add -k if curl needs to skip TLS certificate verification, as in the example below.
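
For example, loading the Shakespeare data into a Shield-protected cluster might look like the following. The username and password are placeholders; substitute credentials that are valid for your cluster:

# placeholder credentials; add -k only if curl must skip certificate verification
curl -u es_admin:password -XPOST 'localhost:9200/shakespeare/_bulk?pretty' --data-binary @shakespeare.json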

Verify successful loading with the following command:

curl 'localhost:9200/_cat/indices?v'

You should see output similar to the following:

health status index               pri rep docs.count docs.deleted store.size pri.store.size
yellow open   accounts              5   1       1000            0    418.2kb        418.2kb
yellow open   shakespeare           5   1     111396            0     17.6mb         17.6mb
yellow open   logstash-2015.05.18   5   1       4631            0     15.6mb         15.6mb
yellow open   logstash-2015.05.19   5   1       4624            0     15.7mb         15.7mb
yellow open   logstash-2015.05.20   5   1       4750            0     16.4mb         16.4mb