Monitoring Kubernetes - Release History

5.23.432 - 2025-02-12

Collectord updates:

Upgrade SQLite to 3.47.2.
Upgrade golang to 1.23.6.
Bug fix: Collectord verify command can result in panic when Collectord uses License Server.

5.23.431 - 2024-11-18

Supports collectorforkubernetes version 5.23.x and below

Update application for Splunk Cloud compatibility

Collectord updates:

Upgrade SQLite to 3.47.0.
Upgrade golang to 1.23.3.

5.23.430 - 2024-10-28

Supports collectorforkubernetes version 5.23.x and below

To better support installations with a large number of nodes and containers, most dashboards now by default require pressing a Submit button after selecting filters.
Overview Dashboard - new table with Not Ready Containers.
Pod Dashboard - include container statuses table.
Audit Dashboard - include user agent, and update compatibility with latest audit formats.
Audit Dashboards - small performance improvement for the new installations.
Host dashboard - show node conditions table.
Host dashboard - show only external eth* interfaces in network stats.

Collectord updates:

Implement new and improved watch mechanism for Kubernetes resources to handle large clusters.
Change the default pipe join configuration to have max size of 1MB instead of 100KB.
Allow to define outputs for prometheus metrics defined with annotations.
When the HTTP server is enabled for Collectord, it writes every call to stdout; this is now configurable.
Bug fix: Collectord did not respect proxyBasicAuth for the splunk output.
Bug fix: Collectord verify command can incorrectly report the status of containerd runtime.
Upgrade SQLite to 3.46.1.
Upgrade golang to 1.23.2.

5.22.422 - 2024-06-17

Bug fix: Fix issue with calculating values on Resource Quota dashboard.

Collectord updates:

Upgrade SQLite to 3.46.0.
Upgrade golang to 1.22.4.

5.22.421 - 2024-05-13

Collectord updates:

Allow spawning journald log reader in a separate process, to prevent corrupted logs from crashing the main process.
Upgrade golang to 1.22.3.

5.22.420 - 2024-04-22

Supports collectorforkubernetes version 5.22.x and below

Workload dashboard - add Pod OwnerKind and OwnerName, PriorityClass, and Pod Requests/Limits
Address too many data points in host and workload dashboard in network graphs
Additional CPU Metrics: CPU IOWait, Steal and Idle in Top Hosts dashboards.
Showing CPU IOWait in Host dashboard.
Alert Container CPU Throttled - exclude container with low CPU usage.
New dashboard Review->Disk Stats for the host.
Exclude virtual ethernet interfaces from host dashboard.
Support memory limits and requests expressed in milli-bytes.
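
The milli-bytes item above refers to Kubernetes quantities that carry the m suffix (one thousandth of a unit), for example 128974848000m. A minimal sketch in Python of normalizing such values to bytes, assuming the standard Kubernetes quantity suffixes; the function name quantity_to_bytes is hypothetical and this is not the application's actual search logic:

```python
from decimal import Decimal

# Power-of-two suffixes (checked first, since they end in "i").
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30,
           "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}
# Power-of-ten suffixes, including "m" for one thousandth.
_DECIMAL = {"m": Decimal("0.001"), "k": 10**3, "M": 10**6, "G": 10**9,
            "T": 10**12, "P": 10**15, "E": 10**18}


def quantity_to_bytes(quantity: str) -> Decimal:
    """Convert a quantity string such as '128Mi' or '1500m' to bytes."""
    quantity = quantity.strip()
    for suffixes in (_BINARY, _DECIMAL):
        for suffix, factor in suffixes.items():
            if quantity.endswith(suffix):
                return Decimal(quantity[: -len(suffix)]) * factor
    return Decimal(quantity)  # no suffix: already a number of bytes


print(quantity_to_bytes("128Mi"))
print(quantity_to_bytes("128974848000m"))
```

Decimal is used instead of float so that large values with the m suffix convert without rounding error.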

Collectord updates:

Allow disabling IP address Lookup in net_socket_table input.
Better handling of zombie processes in proc_stats input.
Allow configuring user Splunk outputs using CRD SplunkOutput.
Allow blacklisting labels from forwarded metadata.
When onVolumeDatabase is used Collectord verifies that volume supports locking.
Add additional metrics CPU IOWait, Steal and Idle.
Monitoring disk stats for the host.
Add input disk_stats.
New diagnostic - CPU Vulnerabilities.
Improve check for the Kubernetes API endpoint in verify command.
Deprecate diagnostic for entropy.
Upgrade default API Version to 1.24 for Docker endpoints.
License Client - allow configuring the proxy.
Bug fix: ignore containers with completed status.
Bug fix: don't include completed (init) containers in the Pod requests and limits.
Bug fix: if a container does not generate a lot of logs, some messages can get stuck in the queue while waiting for more messages.
Bug fix: Collectord describe command can crash if user fields are defined with annotations on the pod.
Upgrade golang to 1.22.2.
Upgrade sqlite3 to 3.45.3.

5.21.412 - 2024-01-08

Collectord updates:

Add libdl.so.2 library to the scratch image for compatibility with Aqua Security
Upgrade SQLite to 3.44.2
Upgrade Go language runtime to 1.21.5

5.21.411 - 2023-11-28

Collectord updates:

Bug fix: Collectord might send events without timestamps
Upgrade Go language runtime to 1.21.4

5.21.410 - 2023-10-16

Supports collectorforkubernetes version 5.21.x and below

Compatibility updates for the version 5.21 of Collectord
New Dashboard: Review -> CPU (Throttled, Limits, Requests)
Alert update: High amount of GRPC errors
Alert update: Container CPU Throttled
Network tables update: show UDP connections for Host, Workloads, Containers, and Pods
Network Connection Dashboard: allows filtering by namespaces
Show maximum and average number of Pods per cluster in Clusters (Allocations and usage) dashboard
Update Resource Quota dashboard to support comparing milli-cores and cores

Collectord updates:

Support for global replace configurations for Collectord, allowing to sanitize data before forwarding to Splunk
Support journald as logging driver for container logs
When both volatile and persistent journald destination exist, Collectord will identify which has the most recent data
Support for configuring modify values for specific namespaces when streaming objects
Support for arrays in modify values for the streaming objects from Kubernetes/OpenShift API server
Allow sending to Splunk more precise timestamps for the events
Collectord can automatically refresh tokens when they are expired for API Server
Compatibility updates for latest versions of Kubernetes
Upgrade Go language runtime to 1.21.3
Upgrade sqlite3 library to 3.43.1
Upgrade libc and common base libraries to debian:bookworm

5.20.403 - 2023-07-31

Collectord updates:

Improvements for working with NFS shares and closed file handlers.
Improvements for streaming Pods from Kubernetes API server.
Collectord reports when the Splunk HEC Collector does not reply with the correct response with 200 status code.
Upgrade go runtime to version 1.20.6.
Bug fix: Collectord might report invalid memory usage for the stopped containers.
Bug fix: If Collectord fails to initialize the on-volume database, that might crash the whole Collectord instance.

5.20.402 - 2023-06-06

Collectord updates:

Bug fix: onvolumedatabase annotation does not work when ignoreCSIMountFolderForDiscovery is enabled
Bug fix: Splunk output might send event_id field when includeEventID is not enabled
Allow configuring timeout-seconds for collecting diag

5.20.401 - 2023-05-22

Collectord updates:

Upgrade go runtime to version 1.20.4
Allow users to configure how many events Collectord can have in the output pipeline to lower memory footprint
Include iNode and DevID in the info.txt in diag
Bug fix: Collectord cannot collect performance metrics in diag
Bug fix: Collectord can start forwarding logs from the older file position than in the acknowledgement database

5.20.400 - 2023-04-17

Supports collectorforkubernetes version 5.20.x and below

Show Pod conditions on the Pod dashboard
Bug fix: Pods dashboard filters out pods not on the host network.
Compatibility updates for the version 5.20 of Collectord

Collectord updates:

Multi-architecture images for amd64 and arm64
Allow sending logs to multiple Splunk HEC endpoints simultaneously
New annotation collectord.io/volume.{N}-logs-onvolumedatabase to keep acknowledgement information about forwarded logs on the volume
Allow including placeholder templates in the annotation collectord.io/volume.{N}-logs-glob
Support for new outputs (ElasticSearch and OpenSearch)
Collectord produces diag file without performance data, if flag --include-performance-profiles is not set
Use IMDSv2 for AWS metadata
Performance improvements for an acknowledgement database
Improvements for the acknowledgement database on how long Collectord keeps the data by refreshing the state, if file still exists on the disk
Upgrade Go language runtime to 1.20.3
Collectord verifies that only one Collectord instance can access the data folder, where Collectord stores its state
Remove automatic watching for Docker runtime on Kubernetes/OpenShift hosts
Add a verify step for Containerd runtime for the verify command
Add ability to send events with event_id, unique identifier for the messages generated from logs
Bug fix: Collectord might assign processes running outside of the containers on the host to the Collectord container
Bug fix: CPU-based license tries to connect to the license server, when running verify command
Bug fix: Collectord might not set a source to the log files for non-default splunk output

5.19.391 - 2023-03-07

Collectord updates:

Upgrade go runtime to 1.19.7
For CSI volumes, Collectord allows to ignore the "mount" subdirectory with configuration ignoreCSIMountFolderForDiscovery under input.app_logs

5.19.390 - 2022-10-17

Supports collectorforkubernetes version 5.19.x and below

Update dashboards for latest changes in the metric names for API Server, Controller and Scheduler
Update Kubelet dashboard to support various container runtimes
Audit (users and namespaces) dashboard: show access to non-namespaces resources
Logs dashboard: show container and pod as separate filters
New alert for Collectord alarms for node diagnostics (reboot required, and entropy)
Bug fix: misprint in "Cluster Warning: container cpu is throttled" alert

Collectord updates:

Splunk output supports maximumMessageLength to truncate messages exceeding this size
Splunk output supports requireExplicitIndex to ignore all events that don't have explicit index defined
Collectord monitors if node requires reboot
Input Kubernetes watch now allows hashing or removing values from JSON before sending them to Splunk
Collectord now reads its own clusterrole and implements a gate that prevents it from sending requests to the API server that it does not have access to
Instead of using automatic gate based on clusterrole, admin can define list of objects Collectord should use to load metadata for the Pods
Update configurations for latest versions of Kubernetes to support various CRI runtimes
Update configuration to use control-plane role instead of master (as the last one is deprecated)
Improved support for CSI volumes, automatically discover additional sub directory "mount"
Allow to force override annotations from cluster level configurations
Upgrade go runtime to 1.19.2
Beta: weighted splunk output algorithm when multiple threads used
Bug fix: if docker runtime is not installed, Collectord can clog the output with warnings
Bug fix: verify command can report an error with journald, when it properly works
Bug fix: Collectord can clog the output if cgroupv2 is used, and blkio is not enabled
Bug fix: Collectord can crash if default output.splunk is not configured, now it shows the error
Bug fix: If output is not defined for Kubernetes Watch input, it should use default output
Bug fix: if Kubernetes watch connection fails, Collectord can generate a lot of requests to API Server

5.18.381 - 2022-05-17

Collectord updates:

Update go runtime to 1.17.11
When Splunk HEC is slow and cannot process the events, Collectord might hold on to the files in a PVC volume, preventing kubelet from stopping the application pod. Collectord now has a configuration for how long it can keep the file descriptors open after the pod is terminated.
Bug fix: When Splunk HEC is unavailable, Collectord can start closing dedicated Splunk outputs for Indexes
Bug fix: When Splunk HEC returns code 4xx, unrecognized by the format of Splunk HEC, Collectord might incorrectly skip the event
Bug fix: Collectord builds an incorrect path for the Kubernetes API service when watching for some objects, like gateway
Bug fix: Verify command does not respect cgroup v2

5.18.380 - 2022-04-19

Supports collectorforkubernetes version 5.18.x and below

Cluster filter on Events dashboard
Rewrite CPU throttled alert to make it less verbose
Memory usage now reports memory without caches and memory that can be freed.
Support cgroupv2

Collectord updates:

Support cgroupv2
New ability to specify the message field name for the logs extraction with annotations extractionMessageField
Collectord improves grace period for expired licenses allowing to bootstrap new nodes for 14 days
Support of journald database written with systemd library 247+
Upgrade go runtime to 1.17.9
Bug fix: cleanup the diag, exclude the real license key
Bug fix: collectord reports high CPU usage for just started containers or hosts
Bug fix: update pods/container labels when user updates them (prior restart was required)
Bug fix: set now as a date for container logs with corrupted log files instead of 0 timestamp
Bug fix: include the values of whitelists and blacklists in diag
Bug fix: verify command might incorrectly show that it cannot find container logs with CRIO runtime

5.17.370 - 2021-10-20

Supports collectorforkubernetes version 5.17.x and below

Show millicores/cores CPU usage instead of percents
New dashboard: Review - Resource Quotas
Review - Projects: filter by project name
Review - Clusters: filter by node label
Review - Clusters: include max and avg usage
Bug fix: storage dashboard might not render in some Splunk versions
Bug fix: Namespaces dashboard shows only one namespace label

Collectord updates:

Upgrade to Go 1.17.2
Support query in Prometheus URLs for metrics
Collectord now reports source and source type for the events with incorrect index
Support for licensing server
Support for CPU-based licenses
Allow to specify multiple values for blacklist and whitelist for host logs
Bug fix: Collectord clogs the output with WARN messages for stopped containers running with Containerd
Bug fix: Containers with not set requests might show 1core request by default
Bug fix: Collectord clogs the output with WARN messages about closed Splunk outputs
Bug fix: parse commas in the timestamps for logs
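
Some loggers write fractional seconds after a comma rather than a dot (Python's logging module does this by default, for example 2021-10-20 12:34:56,789). A minimal sketch in Python of accepting both forms; the function name parse_log_timestamp and the timestamp layout are assumptions for illustration, and this is not Collectord's actual parser:

```python
from datetime import datetime


def parse_log_timestamp(value: str) -> datetime:
    """Parse 'YYYY-MM-DD HH:MM:SS' followed by ',fff' or '.fff'."""
    # Normalize the first comma to a dot so one format string covers both.
    normalized = value.replace(",", ".", 1)
    return datetime.strptime(normalized, "%Y-%m-%d %H:%M:%S.%f")


print(parse_log_timestamp("2021-10-20 12:34:56,789"))
print(parse_log_timestamp("2021-10-20 12:34:56.789"))
```

Both calls return the same datetime value, so downstream code does not need to know which separator the application used.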

5.16.363 - 2021-05-26

Bug fix: Put in parentheses source selection in macro_openshift_prometheus_metrics

Collectord updates:

Upgrade go runtime to 1.16.3
Bug fix: fix verbose logging for docker watcher with messages "failed to get next event"
Bug fix: NetworkPolicy cannot be watched, as Collectord does not convert it in plural form properly
Bug fix: Verify command fails on Containerd runtime
Bug fix: DefaultIdleConnTimeout is ignored for HTTP clients
Bug fix: Put in parentheses source selection in macro_kubernetes_prometheus_metrics

5.16.361 - 2021-03-16

Supports collectorforkubernetes version 5.16.x and below

Overview dashboard filters respect filters (show only namespaces from selected cluster)
Bug fix: use correct units for Memory and Storage (MiB, MB, Mb)
Bug fix: compatibility with new format of Events from API server (FirstSeen, LastSeen, Source could be shown as null)
Bug fix: Collectord metrics request time shows the summary on the period, not the individual request times

Collectord updates:

ARM64 image
Allow removing managed fields from events (enabled with new configurations by default)
Upgrade to Go 1.16.2
Bug fix: precise time to Splunk HEC, sending with milliseconds instead of nanoseconds (which are incorrectly rounded by HEC)
Bug fix: first sample of the container can record above 100% of the CPU usage, as the values are pretty low
Bug fix: verify command does not respect glob patterns for Prometheus inputs (certs, tokens)
Bug fix: trim spaces in token value for Prometheus inputs
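
The precise-time fix above concerns the resolution of the time value sent to Splunk HEC. A minimal sketch in Python of building an event payload whose time has millisecond resolution; the function name hec_event is hypothetical, the time is passed as a string here only to avoid float formatting, and the sketch truncates rather than rounds, which may differ from what Collectord does:

```python
import json


def hec_event(message: str, epoch_ns: int) -> str:
    """Build an HEC-style JSON event with time as <seconds>.<milliseconds>."""
    seconds, nanos = divmod(epoch_ns, 1_000_000_000)
    millis = nanos // 1_000_000  # drop everything below one millisecond
    # Integer arithmetic avoids float rounding of large epoch values.
    time_field = f"{seconds}.{millis:03d}"
    return json.dumps({"time": time_field, "event": message})


print(hec_event("container started", 1615852800123456789))
```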

5.16.353 - 2021-02-11

Collectord updates:

Bug fix: collectord can report parse int errors on the stderr
Upgrade go runtime to 1.15.8

5.16.351 - 2021-01-04

Collectord updates:

Bug fix: host file inputs can raise a fatal error: concurrent map writes

5.16.350 - 2020-12-14

Supports collectorforkubernetes version 5.16.x and below

New dashboard: Collectord metrics
Compatibility for Kubernetes 1.20
Bug fix: broken link in Allocatable Resources dashboard

Collectord updates:

Annotations for collecting prometheus metrics: authorization keys and CAName for SSL certificates
Improvement for DNS resolutions of Splunk output FQDN
Export internal collectord metrics in Prometheus format
Forwarding internal collectord metrics to Splunk
Allow hiding managed fields in the watch objects inputs
In the diag include all open file descriptors
Upgrade go runtime to 1.14.13
Remove \0 symbol from the labels values in the prometheus metrics
Allow to filter host logs with blacklist and whitelist
Bug fix: less verbose warnings about not being able to load resources from API server
Bug fix: performance improvements for Ack DB
Bug fix: custom prometheus metrics forwarded by Collectord do not include cluster field or custom user fields
Bug fix: addon pod terminates faster
Bug fix: verify command trying to post to all outputs with all indexes specified in the configuration
Bug fix: crash in AckDB
Bug fix: input system stats does not recognize outputs specified for the host and cgroup
Bug fix: verify command runs recursively all the time for host logs even when recursively is set to false

5.15.305 - 2021-01-04

Collectord updates:

Upgrade go runtime to 1.14.13
Bug fix: host file inputs can raise a fatal error: concurrent map writes

5.15.303 - 2020-08-12

Collectord updates:

Upgrade golang to 1.14.7 to fix the hang in runtime

5.15.301 - 2020-06-24

Collectord updates:

Bug fix: verify command broken for addon pod

5.15.300 - 2020-06-01

Supports collectorforkubernetes version 5.15.x and below

Events dashboard: filters depend on selection of cluster and node labels
Support for Kubernetes 1.18+
Improvement for alert "Cluster Warning: high number of errors to Kubernetes API" (only alert on 5xx errors)
Bug fix: node events aren't visible in Events tab

Collectord updates:

Support for annotations to add custom user fields to data
Support for blacklisting and whitelisting Prometheus metrics (significantly reducing the indexing cost of data)
Verify command improvements - verify proper configurations for cgroup (memory/memory.use_hierarchy is 1)
Bug fix: fix bug in prometheus metrics parser, empty fields can be filled with previous fields
Bug fix: occasionally addon can report warnings about trying to delete expired keys from ack db
Bug fix: better handle of connections to metrics endpoints exported in Prometheus format
Bug fix: http connections improvements for when Splunk is unresponsive
Bug fix: broken diag

5.14.285 - 2020-08-12

Collectord updates:

Upgrade golang to 1.14.7 to fix the hang in runtime

5.14.284 - 2020-03-23

Collectord updates:

New annotation to configure whitelist pattern for log messages
Allow to override Kubernetes service URL
Bug fix: panic in output for addon
Bug fix: performance and memory usage improvement for ack db

5.14.280 - 2020-01-27

Supports collectorforkubernetes version 5.14.x and below

Logs dashboard: filters depend on selection
Overview dashboard: namespace counter for list of projects

Collectord updates:

Support templates in the index, source and sourcetype
Allow to exclude indexed fields when forwarding to Splunk
Support annotation for stats interval for containers
Support containerd runtime
Bug fix: verify command can show incorrect error about verifying journald input
Bug fix: index on namespace should set index for application logs
Bug fix: warning about not being able to retrieve node information

5.12.273 - 2019-11-18

Collectord updates:

Bug fix: panic in application logs discovering for PVC volumes

5.12.272 - 2019-11-08

Collectord updates:

Bug fix: in case when the rotated files are reusing FileID/DevID Collectord stops forwarding rotated files

5.12.271 - 2019-11-07

Supports collectorforkubernetes version 5.12.x and below

Improvements for the macros for backward compatibility

Collectord updates:

Bug fix: when an event pattern is used for joining multi-line events, an error raised by the input in the pipeline might not be shown.
Bug fix: reduce warnings failed to get the new event in pipeline - submitted
Stability improvements

5.12.270 - 2019-10-22

Supports collectorforkubernetes version 5.12.x and below

Compact metrics (pre-calculated on Collectord side)
Switched stats for host and cgroup in different macros
Use base macro for alerts
Improved command extraction for exec in Audit Logs
Add cluster name in the alert results

Collectord updates:

Watch namespaces and workloads for changes
Global configurations with Custom Resources and selectors
Describe command to see applied annotations for pods
Bug fix: panic when pipe join configuration is removed
Bug fix: panic when proc stats is enabled and cgroup stats is disabled
Bug fix: support ProxyBasicAuthorization for license server checks
Bug fix: Fix for collecting first sample (can show high CPU usage for first sample)
Bug fix: if list of URLs is used for Splunk output, the empty URL is still required
Beta: dynamic index, source and sourcetype names based on the metafields
Beta: cluster diagnostics with one rule: node entropy

5.11.266 - 2020-10-15

Collectord updates:

Upgrade golang to 1.14.10 to fix the hang in runtime

5.11.265 - 2020-06-24

Collectord updates:

Bug fix: memory improvement for large ackdb files

5.11.264 - 2019-11-08

Collectord updates:

Bug fix: in case when the rotated files are reusing FileID/DevID Collectord stops forwarding rotated files

5.11.261 - 2019-09-13

Collectord update:

Bug fix: improves discovery for the PVC volumes
Bug fix: delay loading for the PVC volumes
Bug fix: improves logging for the directory walker

5.11.260 - 2019-09-09

Supports collectorforkubernetes version 5.11.x and below

GPU Monitoring (NVIDIA)

Collectord updates:

Support for PVC volumes for application logs
Bug fix: small memory leak in addon
Bug fix: duplicate events when pipeline is getting throttled
Bug fix: don't use throttling for devnull output
Bug fix: better recovery for ack db corruption
Bug fix: crash on journald input initialization when ack db is corrupted
Bug fix: annotations joinmultiline requires joinpartial
Bug fix: configurations for stdout only with annotations can crash collectord
Set events = 50 by default for Splunk output batches

5.10.255 - 2019-11-20

Collectord updates:

Bug fix: better recovery for ack db corruption
Bug fix: crash on journald input initialization when ack db is corrupted

5.10.253 - 2019-07-31

Collectord update:

Bug fix: collectord can pick up compressed json logs (*.gz)
Bug fix: too verbose warnings from the docker watcher about retries

5.10.252 - 2019-07-24

Collectord update:

Support for configuring the thruput (general and with annotations for container logs)
Support for configuring too old or too new events (general and with annotations for container logs)

5.10.251 - 2019-06-20

Collectord update:

Ability to configure Acknowledgement database for collectord.

5.10.250 - 2019-06-18

Supports collectorforkubernetes version 5.10.x and below

Security dashboard: Access: access to host via ssh, sudo, exec commands, failed access
Security dashboard: Audit (users and namespaces)
Security dashboard: Network (traffic)
Security dashboard: Network (connections)
Security dashboard: Objects (pods) - review pods with host network, age of pods, image pull policy, attached host paths, security context and restart policies
Review dashboard: Clusters (allocations and usage)
Cluster field filters
Base macro for overriding macros for other macros

Collectord updates:

Support for volatile and persistent journald storage with default configuration
Updated YAML configuration to include most common resources
Better support for overriding sourcetype, that does not require to update the Splunk macros
Bug fix: rarely when collectord fails to post to HEC it can panic
Bug fix: better support for Kubernetes 1.14 and CRI-O storage
Bug fix: space characters in index annotations can break the pipeline

5.9.244 - 2019-05-20

Collectord update:

Bug fix: support for CRI-O in Kubernetes 1.14

5.9.240 - 2019-05-14

Supports collectorforkubernetes version 5.9.x and below

Visual improvements on the graphs for the number of logs and events
New alerts for the CPU and Memory reservation

Collectord updates:

Support for multiple Splunk destinations (outputs)
Support subdomains for annotations (to deploy multiple collectord instances)
Support for streaming objects from Kubernetes API to Splunk
Bug fix: journald input keeps fd open to the rotated files
Bug fix: fix in the annotation parser for the interval annotations
Bug fix: fix splunk url selection configuration for multiple splunk URLs

5.8.231 - 2019-04-25

Bug fix: Collectord usage report shows trial licenses for all instances

5.8.230 - 2019-04-22

Supports collectorforkubernetes version 5.8.x and below

Use multiselect filters for most dashboards and filters with possibility to input custom filters.
Reduce dedup usage to improve performance on dashboards.
Add critical pod annotations for Kubernetes ...1.13, and priority class for Kubernetes 1.14...
Fix: statefulset dashboard does not show data with filters.
Add graph of number of pods per namespace on Overview dashboard.

Collectord updates:

Bug fix: clogging collectord output with errors when incorrect index is used.
Bug fix: short-lived containers can result in duplicate logs.
Bug fix: clogging collectord output with warnings when kernel reports incorrect VmRss size.
Bug fix: annotations cannot override timestamp location for fields extraction.
Bug fix: verify command reports Journald input in incorrect place.
Better support for cgroup symlinks, automatically discover correct location.

5.7.220 - 2019-03-18

Supports collectorforkubernetes version 5.7.x and below

Review savedsearches/alerts to support indexing delay (start searches from 2 minutes behind) and run them in more random time.
Workload dashboard - change CPU (of host) in table to real CPU
Fixed single value memory panel on host dashboard (missed span)
Use SEGMENTATION=none for stats events to use less disk space (needs to be moved to indexers)

Collectord updates:

Support hostname formatting with environment variables in configuration
New rotated file logic uses less file descriptors and frees rotated files quicker
Allow to specify a default sampling value for container logs
Reimplemented shutdown sequence to stop collectord faster
Allow to override sampling percent with annotations
New Input: journald

5.6.213 - 2019-03-03

Collectord: Fix panic, when collectord does not have access to docker socket, and information about this container does not exist on the disk.

5.6.212 - 2019-02-19

Supports collectorforkubernetes version 5.6.x and below

New: Alert: high CPU usage on the host.
Fixed: Splunk usage dashboard - charts do not show the data when the used indexes aren't searchable by default.
New: Support Dark theme.
New: Free text search in Logs dashboard.
New: Add auto-refresh options to the dashboard.
Fixed: Revisited CPU limits and requests for Pods and Containers.
New: add CPU Max, Memory Max and Project/Namespace labels to the Review-Namespaces dashboard.
Fixed: Show deleted events

Collectord updates:

Fixed: auto-recovery from the corrupted write-ahead-log in acknowledgment database.
New: support sampling (random and hash-based) for container/application and host logs.
New: when running multiple collectord on one host (with different output) - count that as one licensed host, change InstanceID format.
Fixed: when container is scheduled with remove flag lock the file till collectord processes it completely.
Fixed: collectord reports rare warning about unparsable uint64 max value from proc filesystem.
Fixed: collectord reports rare warning about unparsable line from proc/io files.
New: allow to include annotations in the forwarding data.
Fixed: if collectord cannot access the API - report the warning less often
Fixed: do not report docker warnings for verify command, if there is no container scheduled outside of the Kubernetes.
New: splunk output - allow to limit the output batch by the number of events in payload.
Fixed: attach namespace labels to the forwarded logs.
Fixed: attach openshift_namespace field to the events.
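
The two sampling modes mentioned above differ in which events survive. A minimal sketch in Python of the general techniques, assuming a percentage setting; the function names, the use of CRC32, and the choice of key are hypothetical and do not describe Collectord's internals:

```python
import random
import zlib


def random_sample(percent: float, rng: random.Random) -> bool:
    """Keep roughly `percent` of events, chosen independently per event."""
    return rng.random() * 100 < percent


def hash_sample(key: str, percent: float) -> bool:
    """Keep an event if its key falls into one of the first `percent` buckets.

    The same key always lands in the same bucket, so all events sharing a
    key (for example a user or request ID) are kept or dropped together.
    """
    bucket = zlib.crc32(key.encode("utf-8")) % 100
    return bucket < percent


rng = random.Random(0)
kept = sum(random_sample(20, rng) for _ in range(10_000))
print(f"random sampling kept {kept} of 10000 events")
print("hash sampling decision for user-42:", hash_sample("user-42", 20))
```

Random sampling reduces volume evenly, while hash-based sampling keeps complete traces for a subset of keys.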

5.5.205 - 2019-01-25

Collectord fix: collectord could stop sending container file logs when the original file has been truncated (using the same Node ID as previously used log file).

5.5.203 - 2019-01-25

Collectord fix: collectord could send an empty X-Splunk-Request-Channel header to Splunk.

5.5.202 - 2019-01-24

Supports collectorforkubernetes version 5.5.202

New: Dashboard Review -> namespaces. Review allocations and requests for namespaces and pods.
Fixed: kubernetes_stats_cpu_request_percent - is divided by the number of CPU.

Collectord updates:

Fixed: Interval 0 in prometheus input can crash the collectord.
Fixed: When both glob and match are set for the application logs, the glob pattern can block the match pattern from finding the files in the volume.

5.4.201 - 2018-12-19

Supports collectorforkubernetes version 5.4.x and below

Fixed: Alerts for licenses issued with AWS Subscriptions

Collectord updates:

Fixed: Better handling rotated files (less open fd)
Fixed: Events input can hang in the err loop.

5.4 - 2018-12-17

Supports collectorforkubernetes version 5.x and below

New: CoreDNS dashboard.
New: CoreDNS alerts.
Improved: etcd metrics representation for bucket values.
Compatibility update for collectord 5.4.

Collectord updates:

New: Attach EC2 metadata fields
New: Basic Auth for Proxy (License Server and Splunk)
Fixed: Collectord verify reports CRI-O as unsupported runtime.
Fixed: Rare crash on Prometheus metrics definition.
Fixed: Better handling of acknowledgment database corruption.
Fixed: When handling incorrect indexes, collectord can send an index with an empty string, which Splunk recognizes as an incorrect index

5.3 - 2018-11-19

Supports collectorforkubernetes version 5.x and below

Fixed: Improved Workload dashboard. Allows to filter by namespace, see all Pods in a specific namespace, filter by workload label.
New: Alert for showing when Collectord reports errors in Processing pipelines (as an example if it failed to extract fields).
New: Alert for showing when Collectord reports warnings.
Fixed: Add node labels filter to Storage Dashboard and Control Plane Dashboards.
New: Alert if lag in the indexing of the data.
New: Splunk Usage (License usage, number of events) report under Setup.
Fixed: adjusted high amount of errors to Kubernetes API dashboard to make it less verbose.
Fixed: misprint in the search for showing alerts
Fixed: lookup with alerts causing frequent replication activity on SHC
Fixed: changed search time for few alerts that cause false positives with indexing lag on large installations

Collectord updates:

Fixed: high memory usage with Gzip compression enabled (reduced memory usage).
New: Allow to disable pipe.join with annotations.
Fixed: Under a high volume of logs (10,000 events per second) Collectord can read incomplete lines, which can break JSON logs.
Fixed: When collectord writes a Warning that it failed to post to Splunk, it will write a Success message after retry.
New: Allow to hash sensitive data with annotations.
Fixed: Group network socket tables to reduce the amount of forwarded data (4 times reducing the amount of data)
Fixed: Identify when glob and match pattern require recursive directory traversal.
Fixed: Make it possible to add annotations for specific containers inside the same Pod.
New: Annotation for complete disabling of the handling and forwarding logs for containers.
Fixed: Performance improvements for CRI-O logs.
Fixed: Collectord showed few Debug messages on start.
Fixed: Performance improvements for log forwarding (up to 35% in high amount of logs).
Fixed: reduce duplication of the Kubernetes events, forwarded to Splunk.
Fixed: Do not generate a WARN when API Server results in 404. Usually this caused by the owner object being deleted.
Fixed: Failed to parse proc name from the stat file with the not paired parentheses.
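
Hashing sensitive data, as mentioned in the list above, replaces a matched value with a digest so events can still be correlated without exposing the original. A minimal sketch in Python of the idea; the function name hash_matches, the SHA-256 choice, and the IP address pattern are assumptions for illustration, not Collectord's annotation syntax or implementation:

```python
import hashlib
import re


def hash_matches(line: str, pattern: str) -> str:
    """Replace every match of `pattern` in `line` with its SHA-256 hex digest."""

    def _digest(match):
        return hashlib.sha256(match.group(0).encode("utf-8")).hexdigest()

    return re.sub(pattern, _digest, line)


# Hypothetical example: hide IPv4 addresses in a log line.
print(hash_matches("login from 10.0.0.1", r"\d+\.\d+\.\d+\.\d+"))
```

The same input value always produces the same digest, so searches that group by the hashed field keep working.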

5.2 - 2018-10-15

Supports collectorforkubernetes version 5.x and below

New: Review/Storage dashboard based on storage metrics and PVC metrics.
New: predefined alerts to help you monitor the health of the clusters and performance of the applications.
Fixed: Performance improvements

Collector updates:

New: runtime storage metrics (usage, available, inodes)
New: image is built on top of SCRATCH image.
New: verify and diag commands for troubleshooting.
New: support /dev/null output for logs
New: override source/sourcetype and index based on regexp pattern for container logs.
Fixed: do not send empty docker_labels
New: support docker JSON tags and labels
Fixed: allowing a new license to unblock collector with the expired license.
Fixed: Prometheus parser fails to parse metrics with labels that end with a comma.
Fixed: Performance improvements
New: Prometheus parser supports basic authentication
Fixed: Workaround for a bug in HTTP Event Collector, that can return an incorrect index of failed event
New: Prometheus autodiscover support host network
Fixed: remove node info and limit metadata from logs
Fixed: documentation / default configuration update - mount /etc/localtime to allow collector to use host tz (when not UTC)
Fixed: documentation / default configuration update - use dnsPolicy: ClusterFirstWithHostNet for pods mounted on host network

5.1 - 2018-09-17

Supports collectorforkubernetes version 5.x and below

New: Network metrics (MB, Packets, Drops and Errors) for host and containers.
New: Network socket tables (list of ports that containers and hosts are listening on, connections to external resources).
New: Network review dashboard to see the list of connections to public services and in the private network.
Improvement: Replace python-based lookup with macro written with eval.
Improvement: Visual improvement for showing when the object was Last Seen (highlighting and showing minutes ago).
New: discovering Prometheus metrics in Pods with annotations.
New: attaching pod metadata to metrics collected from prometheus metrics exposed from pods.
Improvement: Changed source of proc stats to proc root filesystem, to keep minimum list of unique sources.
New: Support for Splunk multi-threads outputs (for forwarding more than 3000 events per second).
Improvement: Performance improvements for Prometheus parsing.
Improvement: Reduce amount of metrics forwarded with proc_stats by excluding system threads.
Improvement: Configuration for gzip compression.
Improvement: Calculate checksums for first bytes of files, to better identify new files with reused iNode.
Bug: Process metrics could be collected 2 times.
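
The 5.1 item about checksums for the first bytes of files can be illustrated with a small sketch. This is not the collector's actual implementation: the function name file_fingerprint, the 256-byte prefix and the SHA-256 hash are all assumptions chosen for illustration. The idea is that an inode number alone cannot tell a new file from an old one when the inode is reused, while a checksum of the leading bytes usually can.

```python
import hashlib
import os


def file_fingerprint(path, head_bytes=256):
    """Return (device, inode, checksum of the first bytes) for a file.

    Hypothetical helper: the prefix size and hash are illustrative choices.
    """
    st = os.stat(path)
    with open(path, "rb") as f:
        head = f.read(head_bytes)
    return (st.st_dev, st.st_ino, hashlib.sha256(head).hexdigest())


def is_same_file(saved, current):
    """Treat the file as already known only if every component matches."""
    return saved == current
```

A reader that stores the fingerprint next to its read offset could restart from the beginning whenever is_same_file returns False. A real implementation would also need to handle files that are still shorter than the prefix, since appending to them changes the checksum.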

5.0 - 2018-09-03

Supports collectorforkubernetes version 5.x and below

New dashboard: Events
Added events panel to the Workload and Pod dashboards.
Labels on Workload and Hosts dashboards.
Auto-discover and forward Application logs from host mounts or local volumes.
Annotations for containers to change per container configurations (index, source, join rules, replaces and more).
Escaping terminal sequences from container logs.
Redirecting logs to /dev/null for specific patterns.
Replace patterns in container and application logs (hiding sensitive or not important information).
Support for extracting fields from the container logs, including timestamps.
Include Memory and CPU limits for container lists.
Visual updates for the panels, highlighting high CPU and Memory usages
Filter cgroup stats, forward only container and host metrics.
Support for multiple Splunk HTTP Event Collector endpoints (support fail-over and load-balancing).
Handle HTTP Event Collector errors with the incorrect index. Multiple options to redirect to default index, drop or wait.
Add retry logic to license client to reduce amount of false positive warnings.
Add HTTP read timeouts (handle gateway timeouts, 504).
Fixed: fail to parse the latest line in the JSON log.
Better error handling for incorrect configurations.
Deprecating Join rules in favour of annotations.
Support for HTTP Event Collector client certificates.
Support CRI-O runtime.
Fixed: limit directory walkers for depth (fixing issues when directory has a mount to itself)
Fixed: add a limit on the maximum line size that collector can read at once (defaults to 1MB).
Fixed: acknowledgement database now stores NodeID, DevID and a parent folder identifier. That way, if a NodeID is reused right away, the file is identified as a new one when it is in a different location.
Change: docker_stream field has been renamed to stream for compatibility with other container runtimes.
Change: prometheus metrics have default sourcetype=kubernetes_prometheus (macro supports backward compatibility)

Upgrade from version 4 to 5

4.0.24 - 2018-05-05

Supports collectorforkubernetes version 4.x and below

New dashboard: Cluster/Audit
New dashboard: Cluster/Kubernetes API Server
New dashboard: Cluster/Kubelet
New dashboard: Cluster/etcd
New dashboard: Cluster/Scheduler
New dashboard: Cluster/Controller Manager.
Include image name when listing containers.
Added syslog component to the list of host logs.
Fixed: Include Daemon Set on Overview dashboard, list of namespaces.

Collector updates (4.0.171):

Collecting metrics from Prometheus format.
Add HTTP read timeouts (handle gateway timeouts, 504).
Correctly parse HTTP Event Responses when one of a few events fails to be indexed (as an example, wrong index).
Performance optimizations.
Optimize payloads for higher write throughput.
Fixed: reduce the number of calls to Kubernetes API Server.
Fixed: fail to parse the latest line in the JSON log.
Better error handling for incorrect configurations.
Failed to parse memory limits (Failed to parse memory=000k for the container).
Collecting Kubernetes events from the cluster once by using collector addon.

collectorforkubernetes 4.0.172

Fixed: Messages "WARN ... proc.go:441: Unparsable line from /rootfs/proc/X/status" caused by new Linux kernel that reports empty line in proc file system.
Fixed: Incorrectly parsed Limits for the Kubernetes pods. 5m and 500m both resulted in 0.500.
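
To make the Limits fix concrete: in Kubernetes CPU quantities the m suffix means millicores, so 5m is 0.005 of a core and 500m is 0.5 of a core. A minimal sketch of that conversion follows. The function name parse_cpu_quantity is hypothetical, and this is not the collector's code.

```python
def parse_cpu_quantity(value: str) -> float:
    """Convert a Kubernetes CPU quantity such as "500m" or "2" to cores.

    Hypothetical helper for illustration; only plain numbers and the
    millicore ("m") suffix are handled.
    """
    value = value.strip()
    if value.endswith("m"):
        # "m" stands for millicores: 1000m equals one full core.
        return int(value[:-1]) / 1000
    return float(value)


print(parse_cpu_quantity("5m"))    # 0.005
print(parse_cpu_quantity("500m"))  # 0.5
```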

collectorforkubernetes 4.0.173

Fixed: significant memory usage with events larger than 512KB, caused by Splunk issue SPL-156315 (unable to parse events larger than 512KB, regression in 7.x).

collectorforkubernetes 4.0.174.180730

Show the index name in the output, when Splunk reports incorrect index.

3.0.23 - 2018-02-17

Supports collectorforkubernetes version 3.x and below

Bug: Memory view on Workload dashboard had a max limit set to 100.
Bug: Events view on overview dashboard had a max limit set to 100.

3.0.22 - 2018-02-07

Supports collectorforkubernetes version 3.x and below

Added support for containers deployed without Kubernetes.
Added CPU Quota, CPU Shares, Throttled and Memory Limit and Request Overlays on Container and Pod Dashboards.
Indexing Kubernetes events in sourcetype kubernetes_events
Performance improvement on Dashboards by combining multiple charts using one common search.
New "Review/Allocatable Resources" dashboard to track limits and requests for CPU and Memory.
New "Review/Privileged containers and enabled capabilities" dashboard to list all privileged containers and enabled security capabilities for containers.
New Overview dashboard to easily navigate within the application.
New Aggregated metrics dashboard for specific Workload.
Fixed bug on Process Dashboard, some charts did not filter by host.
"Setup: Collectors" now supports collectorforkubernetes images distributed via private registries.
"Overview: Process" dashboard did not use Span token for timechart dashboards.
"Top: Containers" fixed incorrect memory usage (showed double size)
Added alerts in application for notification about outdated collector versions and expired licenses for collector.
Hide Wait Read/Write IO panels, when this data is not available.
In process Dashboard show VmRSS with RssAnon, RssFile, and RssShmem.

Collector updates:

Support for Splunk indexing acknowledgment.
Watching for Kubernetes/OpenShift events.
HTTP Proxy support for License server and Splunk output.
Allow to configure destination indices for different types of data in collector configuration (stats, logs, host logs, proc stats and events).
Handling responses from HTTP Event Collector to skip invalid events (will be logged).
If container is running, but Kubernetes does not provide metadata, allow to wait for metadata.
Collect security capabilities and uid/gid.
For Kubernetes/OpenShift environments recognize containers scheduled outside of Pods and load metadata directly from docker.
Support for custom labels, specified with collector configuration.
Support OpenShift/Kubernetes annotations "collectord.io/..." to configure destination indices, sourcetypes and sources for pods, workloads and namespaces.
Support for partial logs without join rules.
Bug. Use local timezone by default for local syslog files.
Bug. Fix small memory leak on deleted containers.
Bug. When collector was failing to send data to Splunk, it was impossible to stop collector with terminate.

2.1 - 2017-10-22

Supports collectorforkubernetes version 2.1.59.x and below

Implemented collectors dashboard to track number of collectors, their versions and used licenses.
Fallback to the process IO statistics when blkio is not available.
Fix IO statistics graphs, which showed the average when the sum should be used.
Fields extraction support for nginx ingress 0.9 and above.
collector* - Improved resistance for storage failures.
collector* - License checks reporting.
collector* - Better support for openshift environment (default configuration).

2.0 - 2017-10-22

Supports collectorforkubernetes version 2.0.37.x and below

Better labels support in Dashboards. Collector has a breaking change: the format for labels changes from kubernetes_node_labels_LABEL1=VALUE1 to kubernetes_node_labels=[LABEL1=VALUE1,LABEL2=VALUE2].
Process level metrics.
Uptime for hosts and processes.
Fields extraction for kubernetes controller manager and scheduler.
Fields extraction and support in dashboards for main kubernetes components (setup host logs collection with collector).
New top-like dashboards allow monitoring Hosts/Pods/Containers/Processes in real-time.
Rewritten Kubernetes Objects Dashboards with support of Events and Labels.
Improved dashboards navigation.
Support for host logs.
Other bugs and improvements based on user feedback.
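
The label format change in 2.0 can be sketched as a conversion from one field per label to a single field holding all labels. This is only an illustration: the function name to_new_label_format is hypothetical, and it assumes the bracketed notation denotes a list of LABEL=VALUE strings.

```python
OLD_PREFIX = "kubernetes_node_labels_"


def to_new_label_format(fields: dict) -> dict:
    """Fold kubernetes_node_labels_<LABEL> fields into one list field.

    Hypothetical helper: assumes the new field is a list of LABEL=VALUE
    strings; other fields are passed through unchanged.
    """
    labels = []
    result = {}
    for name, value in fields.items():
        if name.startswith(OLD_PREFIX):
            labels.append(f"{name[len(OLD_PREFIX):]}={value}")
        else:
            result[name] = value
    if labels:
        result["kubernetes_node_labels"] = sorted(labels)
    return result
```

Searches written against the old per-label field names would need to be updated to match values inside the single kubernetes_node_labels field.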

Links

Outcold Solutions provides solutions for monitoring Kubernetes, OpenShift and Docker clusters in Splunk Enterprise and Splunk Cloud. We offer certified Splunk applications, which give you insights across all container environments. We help businesses reduce complexity related to logging and monitoring by providing easy-to-use and easy-to-deploy solutions for Linux and Windows containers. We deliver applications that help developers monitor their applications and help operators keep their clusters healthy. With the power of Splunk Enterprise and Splunk Cloud, we offer one solution to help you keep all the metrics and logs in one place, allowing you to quickly address complex questions on container performance.
