Outcold Solutions LLC

Monitoring Kubernetes - Version 5

Installation

With our solution for Monitoring Kubernetes, you can start monitoring your clusters in under 10 minutes, including forwarding metadata-enriched container logs, host logs, and metrics. You can request an evaluation license that is valid for 30 days.

Splunk configuration

Install Monitoring Kubernetes application

Install the latest version of the Monitoring Kubernetes application from Splunkbase. You only need to install it on Search Heads.
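If you prefer the Splunk command line to Splunk Web, you can also install the downloaded package with the Splunk CLI; a minimal sketch (the package filename and credentials below are examples):

$ $SPLUNK_HOME/bin/splunk install app ./monitoring-kubernetes.tgz -auth admin:changeme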

If you created a dedicated index that is not searchable by default, modify the macro macro_kubernetes_base to include this index.

macro_kubernetes_base = (index=kubernetes)
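For example, if your dedicated index is named kubernetes_prod (a hypothetical name), extend the macro so both indexes are searched:

macro_kubernetes_base = (index=kubernetes OR index=kubernetes_prod)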

Enable HTTP Event Collector in Splunk

Outcold Solutions' Collector sends data to Splunk using the HTTP Event Collector, which Splunk does not enable by default. Please read the HTTP Event Collector walkthrough to learn more.
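For reference, HTTP Event Collector can also be enabled through configuration files instead of Splunk Web; a minimal inputs.conf sketch, assuming you manage $SPLUNK_HOME/etc/apps/splunk_httpinput/local/inputs.conf directly (the token name and value are examples):

[http]
disabled = 0

[http://kubernetes]
token = B5A79AAD-D822-46CC-80D1-819F80D7BFB0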

The minimum requirement is Splunk Enterprise or Splunk Cloud 6.5. If you are managing Splunk clusters below version 6.5, please read our FAQ on how to set up a Heavy Forwarder in between.

After enabling HTTP Event Collector, you need to find the correct URL for it and generate an HTTP Event Collector token. If your Splunk instance runs on hostname hec.example.com, listens on port 8088 with SSL enabled, and the token is B5A79AAD-D822-46CC-80D1-819F80D7BFB0, you can test it with curl as shown in the example below.

$ curl -k https://hec.example.com:8088/services/collector/event/1.0 -H "Authorization: Splunk B5A79AAD-D822-46CC-80D1-819F80D7BFB0" -d '{"event": "hello world"}'
{"text": "Success", "code": 0}

The -k flag is necessary for self-signed certificates.

If you are using Splunk Cloud, the URL is not the same as the URL for Splunk Web. See Send data to HTTP Event Collector on Splunk Cloud instances for details.
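On Splunk Cloud, the HTTP Event Collector endpoint typically follows the pattern below, where <stack> is your Splunk Cloud stack name (verify the exact form for your instance type in the documentation linked above):

https://http-inputs-<stack>.splunkcloud.com:443/services/collector/event/1.0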

If you use an index that is not searchable by default, please read our documentation on how to configure indices at Splunk and inside the collector in Splunk Indexes.

Install Collector for Kubernetes

For Docker UCP installation, see the blog post Monitoring Docker Universal Control Plane (UCP) with Splunk Enterprise and Splunk Cloud.

Prerequisites

Collector works out of the box with CRI-O, containerd, and Docker as container runtimes.

The most important part is to configure log rotation for your container logs. Some Kubernetes providers set a very low number of rotated files and a small maximum file size. For example, AKS defaults to 5 files of 10MiB each. These files are your safety buffer: if one of your containers writes 10MiB in an hour, you have 5 hours (5 files of 10MiB) to fix any issue between Collectord and Splunk HEC (for example, a connectivity issue); if a container writes 10MiB in a minute, you only have 5 minutes.

Please consult your Kubernetes provider to check the default configurations for log rotation.

Usually, the configuration can be applied with the KubeletConfiguration (or with command-line arguments for the Kubelet). Please make sure to change the values to at least 128Mi and 5 rotated files (the right values depend on the amount of logs your Pods produce).

kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
...
containerLogMaxSize: 128Mi
containerLogMaxFiles: 5
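To check the values currently in effect on a node, you can inspect the Kubelet configuration file; a quick sketch, assuming the common default path /var/lib/kubelet/config.yaml (your distribution may store it elsewhere or pass these values as command-line flags):

$ grep -E 'containerLogMax(Size|Files)' /var/lib/kubelet/config.yaml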

Installation

Use the latest Kubernetes configuration file collectorforkubernetes.yaml. This configuration deploys multiple workloads under the collectorforkubernetes namespace.

Open it in your favorite editor, set the Splunk HTTP Event Collector URL and token, configure a certificate if required, and review and accept the license agreement. Include the license key (you can request an evaluation license key with this automated form).

[general]

acceptLicense = false

license =

fields.kubernetes_cluster = -

...

# Splunk output
[output.splunk]

# Splunk HTTP Event Collector url
url =

# Splunk HTTP Event Collector Token
token =

# Allow invalid SSL server certificate
insecure = false

# Path to CA certificate
caPath =

# CA Name to verify
caName =

Based on the example above, you will need to modify the lines as shown below.

[general]

acceptLicense = true

license = ...

fields.kubernetes_cluster = development

...

# Splunk output
[output.splunk]

# Splunk HTTP Event Collector url
url = https://hec.example.com:8088/services/collector/event/1.0

# Splunk HTTP Event Collector Token
token = B5A79AAD-D822-46CC-80D1-819F80D7BFB0

# Allow invalid SSL server certificate
insecure = true

If you plan to deploy Collectord on a cluster that has been running for a while and has accumulated a lot of logs on disk, Collectord will forward all of those logs, which can put significant load on your cluster. You can set thruputPerSecond or tooOldEvents under [general] to limit how much log data Collectord forwards per second and to skip events older than a given age.
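A sketch of both settings (the values below are examples; tune them to your log volume and retention needs):

[general]

# forward at most 512KB of log data per second
thruputPerSecond = 512Kb

# skip events older than 7 days
tooOldEvents = 168h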

To apply these changes to your Kubernetes cluster, use kubectl:

$ kubectl apply -f ./collectorforkubernetes.yaml

After running the above command, verify the workloads using:

$ kubectl get all --namespace collectorforkubernetes

After all the pods are deployed, give them a few moments to download the images and start the containers. Then go to the Monitoring Kubernetes application in Splunk, and you should see data on the dashboards.

By default, the collector forwards container logs, host logs (including syslog), and metrics for hosts, pods, containers, and processes.

Next steps

  • Review predefined alerts.
  • Verify configuration by using our troubleshooting instructions.
  • Enable Audit Logs. Kubernetes does not enable Audit Logs by default; if you want to audit activities on the Kubernetes API Server, you need to enable them manually.
  • Verify Prometheus Metrics. Our configuration works out of the box in most cases. If you find that some data is not available for the Control Plane, verify that you are receiving all the Prometheus metrics and that all of our configurations work in your cluster.
  • To learn how to forward application logs, please read our documentation on annotations.
  • We send the data to the default HTTP Event Collector index. For better performance, we recommend splitting logs and metrics into separate indexes. You can find information on how to configure indexes in our guide Splunk Indexes.
  • We provide a flexible scheme that allows you to define search-time extraction for logs in your containers. Follow the guide Splunk fields extraction for container logs to learn more.
  • With annotations for pods, you can define patterns for multi-line log events, override indexes, sources, and source types for logs and metrics, extract fields, redirect some log lines to /dev/null, and hide sensitive information from logs; see the sketch after this list.
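A minimal sketch of such annotations, assuming the collectord.io annotation prefix used by Collectord (the annotation names and values below are illustrative; consult the annotations documentation for the exact set):

apiVersion: v1
kind: Pod
metadata:
  name: example-app
  annotations:
    # forward this pod's logs to a dedicated index (hypothetical index name)
    collectord.io/logs-index: kubernetes_apps
    # override the sourcetype for this pod's logs (illustrative annotation)
    collectord.io/logs-sourcetype: example_app
spec:
  containers:
    - name: app
      image: example/app:latest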

About Outcold Solutions

Outcold Solutions provides solutions for monitoring Kubernetes, OpenShift, and Docker clusters in Splunk Enterprise and Splunk Cloud. We offer certified Splunk applications that give you insights across all container environments. We help businesses reduce the complexity of logging and monitoring by providing easy-to-use, easy-to-deploy solutions for Linux and Windows containers. We deliver applications that help developers monitor their applications and help operators keep their clusters healthy. With the power of Splunk Enterprise and Splunk Cloud, we offer one solution to keep all your metrics and logs in one place, allowing you to quickly address complex questions on container performance.