Installation
With our solution for Monitoring OpenShift, you can start monitoring your clusters in under 10 minutes, including forwarding metadata-enriched container logs, host logs, and metrics. You can request an evaluation license that is valid for 30 days.
Splunk configuration
Install Monitoring OpenShift application
Install the latest version of the Monitoring OpenShift application from Splunkbase. You need to install it on Search Heads only.
If you created a dedicated index that is not searchable by default, modify the macro macro_openshift_base to include this index.
macro_openshift_base = (index=openshift)
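For example, if you also forward container logs to a dedicated index, the macro can be extended to cover both. The index name openshift_logs below is purely illustrative; substitute your own.

```conf
# macros.conf of the Monitoring OpenShift application (sketch)
# "openshift_logs" is a hypothetical second index - use your own name.
[macro_openshift_base]
definition = (index=openshift OR index=openshift_logs)
```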
Enable HTTP Event Collector in Splunk
Outcold Solutions' Collector sends data to Splunk using the HTTP Event Collector. By default, Splunk does not enable the HTTP Event Collector. Please read the HTTP Event Collector walkthrough to learn more about it.
The minimum requirement is Splunk Enterprise or Splunk Cloud 6.5. If you are managing Splunk clusters with a version below 6.5, please read our FAQ on how to set up a Heavy Weight Forwarder in between.
After enabling the HTTP Event Collector, you need to find the correct URL for the HTTP Event Collector and generate an HTTP Event Collector token.
If your Splunk instance runs on hostname hec.example.com, listens on port 8088 using SSL, and the token is B5A79AAD-D822-46CC-80D1-819F80D7BFB0, you can test it with the curl command as in the example below.

$ curl -k https://hec.example.com:8088/services/collector/event/1.0 -H "Authorization: Splunk B5A79AAD-D822-46CC-80D1-819F80D7BFB0" -d '{"event": "hello world"}'
{"text": "Success", "code": 0}
The -k flag is necessary for self-signed certificates. If you are using Splunk Cloud, the URL is not the same as the URL for Splunk Web; see Send data to HTTP Event Collector on Splunk Cloud instances for details.
If you use an index that is not searchable by default, please read our documentation on how to configure indices in Splunk and inside the collector at Splunk Indexes.
OpenShift preparation
To use our solution and get all of its benefits, you need to perform preparation on every OpenShift node in your cluster.
OpenShift 4.x
By default, OpenShift keeps at most 5 files of 50MiB each for every container. When you configure log rotation for your nodes, keep in mind that those files are your safety buffer. If one of your containers writes 50MiB in an hour, you have 5 hours (5 files of 50MiB) to fix any issues between Collectord and Splunk HEC (for example, connectivity issues). If one of the containers writes 50MiB in a minute, you only have 5 minutes to fix the issue.
When increasing the number of files or their size, keep in mind how much storage you need to support those log files. If you change the configuration to 5 files of 500MiB and run on average about 20 containers per node, you need 50GiB of disk space per node to support those log files.
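The sizing above can be checked with simple arithmetic; the values below are the ones from the paragraph.

```shell
# 5 rotated files x 500MiB each x ~20 containers per node
files=5
size_mib=500
containers=20
total_mib=$((files * size_mib * containers))
echo "${total_mib} MiB per node (~$((total_mib / 1024)) GiB)"
# prints: 50000 MiB per node (~48 GiB)
```

That is roughly the 50GiB quoted above; round up to leave headroom for other workloads on the node.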
Please follow Modification of log rotation of CRI-O in Openshift 4 to learn how to modify log rotation.
OpenShift 3.x - Docker logging driver
OpenShift 4+ uses CRI-O, so these steps are no longer required.
When you set up your OpenShift cluster, verify that Docker uses the json-file logging driver. RHEL configures Docker with journald by default. Depending on your Linux distribution, you can find this configuration in various places. In the case of the latest RHEL Server 7.5, you can find it under /etc/sysconfig/docker.
Replace --log-driver=journald with --log-driver=json-file --log-opt max-size=10M --log-opt max-file=3. It is important to limit the size and number of the log files; see Managing Container Logs for details.
$ sed -i 's/--log-driver=journald/--log-driver=json-file --log-opt max-size=10M --log-opt max-file=3/' /etc/sysconfig/docker
$ systemctl restart docker
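Before editing a node, you can preview the same sed expression against a sample copy of the configuration. The OPTIONS line below is illustrative; the exact contents of /etc/sysconfig/docker vary by host.

```shell
# Illustrative sample of /etc/sysconfig/docker (the real OPTIONS line differs per host).
cat > /tmp/docker.sample <<'EOF'
OPTIONS='--selinux-enabled --log-driver=journald --signature-verification=false'
EOF

# The same substitution as in the instructions, applied to the sample copy.
sed -i 's/--log-driver=journald/--log-driver=json-file --log-opt max-size=10M --log-opt max-file=3/' /tmp/docker.sample
cat /tmp/docker.sample
# prints: OPTIONS='--selinux-enabled --log-driver=json-file --log-opt max-size=10M --log-opt max-file=3 --signature-verification=false'
```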
If you are using the Red Hat Container Development Kit, it pre-configures minishift with the journald logging driver. You can change it when you start minishift for the first time with minishift start --docker-opt log-driver=json-file.
Installation
Verify that you are in the context of a user who can perform admin operations (the cluster-admin role).
$ oc login -u system:admin
Use the latest OpenShift configuration file collectorforopenshift.yaml. This configuration deploys multiple workloads under the collectorforopenshift namespace.
Open it in your favorite editor and set the Splunk HTTP Event Collector URL and token, configure a certificate if required, review and accept the license agreement, and include the license key (request an evaluation license key with this automated form).
Optionally, you can name your cluster if you plan to monitor multiple clusters. That helps you identify the nodes from a specific cluster within the application.
[general]

acceptLicense = false

license =

fields.openshift_cluster = -

...

# Splunk output
[output.splunk]

# Splunk HTTP Event Collector url
url =

# Splunk HTTP Event Collector Token
token =

# Allow invalid SSL server certificate
insecure = false

# Path to CA certificate
caPath =

# CA Name to verify
caName =
Based on the example above, you will need to modify the lines as follows.
[general]

acceptLicense = true

license = ...

fields.openshift_cluster = development

...

# Splunk output
[output.splunk]

# Splunk HTTP Event Collector url
url = https://hec.example.com:8088/services/collector/event/1.0

# Splunk HTTP Event Collector Token
token = B5A79AAD-D822-46CC-80D1-819F80D7BFB0

# Allow invalid SSL server certificate
insecure = true
If you plan to deploy Collectord on a cluster that has been running for a while and has a lot of logs stored on disk, Collectord will forward all of those logs, which can disturb your cluster. You can configure the values thruputPerSecond or tooOldEvents under [general] to limit the amount of logs forwarded per second and to tell Collectord which events to skip.
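For example, both settings go in the same [general] section shown above. The values below are illustrative assumptions, not recommendations; check the Collectord configuration reference for the accepted formats.

```conf
[general]
# Limit forwarding throughput (illustrative value)
thruputPerSecond = 512Kb
# Skip events older than this age (illustrative value)
tooOldEvents = 168h
```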
Apply this change to your OpenShift cluster with oc
$ oc apply -f ./collectorforopenshift.yaml
If you are running OpenShift version 3.11 or below, you need to add the privileged security context to the Service Account used by the collector. For OpenShift versions 4.x and above, we have configured SecurityContextConstraints in the YAML definition. Our workloads need access to host mounts and must run in privileged mode to make some of the syscalls required to collect metrics.
$ oc adm policy add-scc-to-user privileged system:serviceaccount:collectorforopenshift:collectorforopenshift
If you see the error message the server could not find the requested resource, it is possible that you are using a mismatched version of the oc tool and the server. You can accomplish the same by using the command oc edit securitycontextconstraints privileged and adding system:serviceaccount:collectorforopenshift:collectorforopenshift to the list of users.
If you are using Red Hat certified images from registry.connect.redhat.com, make sure to specify the secret for pulling the image. See instructions on the Configuration Reference page.
Verify the workloads.
$ oc get all --namespace collectorforopenshift
If the collectorforopenshift Pods aren't deployed, follow the Troubleshooting steps.
Give it a few moments to download the image and start the containers. After all the pods are deployed, go to the Monitoring OpenShift application in Splunk; you should see data on the dashboards.
By default, the collector forwards container logs, host logs (including syslog), and metrics for hosts, pods, containers, and processes.
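To confirm events are arriving, you can run a quick search in Splunk using the application's base macro (a sketch; macro_openshift_base is the macro configured in the Splunk configuration section above):

```spl
`macro_openshift_base` | head 10
```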
Next steps
- Integrate Web Console with Monitoring OpenShift application
- Review predefined alerts.
- Verify configuration by using our troubleshooting instructions.
- Enable Audit Logs. By default, OpenShift does not enable audit logs. If you want to be able to audit activities on the OpenShift API Server, you need to enable audit logs manually.
- Verify Prometheus metrics. Our configuration works out of the box in most cases. If you find that some data is not available for the Control Plane, verify that you get all the Prometheus metrics and that all our configurations work in your cluster.
- To learn how to forward application logs, please read our documentation on annotations.
- We send data to the default HTTP Event Collector index. For better performance, we recommend at least splitting logs and metrics into separate indices. You can find how to configure indexes in our guide Splunk Indices.
- We provide a flexible schema that allows you to define search-time extractions for logs in your containers. Follow the guide Splunk fields extraction for container logs to learn more.
- You can define specific patterns for multi-line log lines; override indexes, sources, and source types for logs and metrics; extract fields; redirect some log lines to /dev/null; and hide sensitive information from logs with annotations for pods.
Links
- Installation
  - Start monitoring your OpenShift environments in under 10 minutes.
  - Automatically forward host, container and application logs.
  - Test our solution with the embedded 30-day evaluation license.
- Collector Configuration
  - Collector configuration reference.
- Annotations
  - Changing index, source, sourcetype for namespaces, workloads and pods.
  - Forwarding application logs.
  - Multi-line container logs.
  - Fields extraction for application and container logs (including timestamp extractions).
  - Hiding sensitive data, stripping terminal escape codes and colors.
  - Forwarding Prometheus metrics from Pods.
- Audit Logs
  - Configure audit logs.
  - Forwarding audit logs.
- Prometheus metrics
  - Collect metrics from the control plane (etcd cluster, API server, kubelet, scheduler, controller).
  - Configure the collector to forward metrics from services in Prometheus format.
- Configuring Splunk Indexes
  - Using a non-default HTTP Event Collector index.
  - Configuring the Splunk application to use indexes that are not searchable by default.
- Splunk fields extraction for container logs
  - Configure search-time fields extractions for container logs.
  - Container logs source pattern.
- Configurations for Splunk HTTP Event Collector
  - Configure multiple HTTP Event Collector endpoints for load balancing and fail-over.
  - Secure HTTP Event Collector endpoint.
  - Configure a proxy for the HTTP Event Collector endpoint.
- Monitoring multiple clusters
  - Learn how you can monitor multiple clusters.
  - Learn how to set up ACLs in Splunk.
- Streaming OpenShift Objects from the API Server
  - Learn how you can stream all changes from the OpenShift API Server.
  - Stream changes and objects from the OpenShift API Server, including Pods, Deployments or ConfigMaps.
- License Server
  - Learn how you can configure a remote license URL for Collectord.
- Monitoring GPU
- Alerts
- Troubleshooting
- Release History
- Upgrade instructions
- Security
- FAQ and the common questions
- License agreement
- Pricing
- Contact