Set Up Nagios Alerts to Monitor SSL Certificate Expiry in Java Key Store

May 14, 2022

The background

Nagios is a monitoring tool [1] for systems, networks, and infrastructure. The goal was to add a Nagios alert for SSL certificates stored in a Java Key Store (JKS) file. Since the Nagios server was already set up and running, this post covers only the alert setup.

The procedure

For security and efficiency, we decided to use a cron-triggered script that reads the certificate expiry dates and writes them to a location readable by the Nagios Remote Plugin Executor (NRPE) [2] agent. The NRPE agent then runs a script that reads the file and exits with the appropriate exit status [3].
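Nagios plugins report their result through the exit status: 0 means OK, 1 WARNING, 2 CRITICAL, and 3 UNKNOWN. As a minimal sketch of how days-until-expiry could map onto those codes, consider the helper below. The name `nagios_status` and the example thresholds are assumptions made for illustration; they are not part of Nagios or NRPE.

```shell
#!/bin/bash
# Map days-until-expiry to a Nagios status label and exit code.
# nagios_status is a hypothetical helper, not something Nagios ships.
nagios_status() {
  local days="$1" warn="$2" crit="$3"
  if (( days < crit )); then
    echo "CRITICAL"; return 2
  elif (( days < warn )); then
    echo "WARNING"; return 1
  fi
  echo "OK"; return 0
}

# Example: 30 days left, warn below 45 days, critical below 10 days.
rc=0
label=$(nagios_status 30 45 10) || rc=$?
echo "$label (exit status $rc)"
```

The critical threshold is checked first so that a certificate inside both windows is reported with the more severe status.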

On the server where the JKS in question is available, the following cron job was added.

# Write expiry date of jks ssl certs for nrpe to read
10 0 * * * JAVA_HOME/bin/keytool -list -v -keystore ABSOLUTE_PATH_TO_JKS/bizao_prod_tmp.jks -alias JKS_ALIAS_HERE -storepass PASS 2>/dev/null | grep "until:" | sed 's/.*until: //' > /opt/keystore_until_dates.txt && chmod 755 /opt/keystore_until_dates.txt
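To see what ends up in the dates file, the filtering stage of that pipeline can be tried on a single sample line. The line below only imitates the shape of the `Valid from: ... until: ...` line that `keytool -list -v` prints for each certificate; the dates in it are invented for illustration.

```shell
#!/bin/bash
# Sample line in the shape printed by keytool -list -v; the dates are made up.
sample='Valid from: Tue May 10 00:00:00 UTC 2022 until: Wed May 10 00:00:00 UTC 2023'

# Same filter as in the cron job: keep lines containing "until:",
# then strip everything up to and including "until: ".
until_date=$(printf '%s\n' "$sample" | grep "until:" | sed 's/.*until: //')
echo "$until_date"
```

Each certificate in the alias's chain contributes one such line, so the file can hold several dates, one per line.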

In the Nagios server configuration for monitored services, the following was added.

# JKS SSL Certificate Check 
define service{
  use                     generic-service
  host_name               SERVER-NAME-HERE
  service_description     JKS SSL Certificate Expiration - JKS_ALIAS_HERE
  check_command           check_nrpe_args!check_jks_cert_args!45!10
}
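The service definition relies on two commands that are not shown here: `check_nrpe_args` on the Nagios server and `check_jks_cert_args` on the NRPE agent. A plausible sketch of both follows. The body of `check_nrpe_args` and the script path are assumptions, not the configuration actually used, and passing arguments requires an NRPE build with command arguments enabled.

```
# On the Nagios server: assumed definition of check_nrpe_args.
# $ARG1$ is the NRPE command name; $ARG2$ and $ARG3$ are the warning and critical days.
define command{
  command_name    check_nrpe_args
  command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$ $ARG3$
}

# On the monitored server (nrpe.cfg): accept arguments and map the command to the script.
# The script path below is hypothetical.
dont_blame_nrpe=1
command[check_jks_cert_args]=/usr/local/nagios/libexec/check_jks_cert.sh $ARG1$ $ARG2$
```

With this mapping, the `45` and `10` in the service definition reach the script as its first and second arguments.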

The script [4] that reads the expiry dates from the file and exits with the appropriate exit status was as follows.

#!/bin/bash

# Script to check expiry date of certs within an alias in a Java keytool
#
# maintainer: MAINTAINER_EMAIL_HERE


# Absolute path, matching the file written by the cron job
DATES_FILE="/opt/keystore_until_dates.txt"
WARNING_DAYS="$1"
CRITICAL_DAYS="$2"

# Exit with error if DATES_FILE is missing
if [ ! -f "$DATES_FILE" ]; then
  echo "WARNING - Could not find $DATES_FILE"
  # Warning
  exit 1
fi

readarray -t KEYSTORE_UNTIL_DATES < "${DATES_FILE}"

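# Start from the first certificate (bash arrays are zero-indexed)
DATE="${KEYSTORE_UNTIL_DATES[0]}"
KEYSTORE_UNTIL_EPOCH=$(date +%s --date="${DATE}")
NOW_EPOCH=$(date +%s)
seconds_to_expiry=$(($KEYSTORE_UNTIL_EPOCH-$NOW_EPOCH))
days_to_expiry_min=$(( $seconds_to_expiry / 86400 ))
# Remember its date so the message is never left blank
date_min="${DATE}"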

for DATE in "${KEYSTORE_UNTIL_DATES[@]}"; do
  KEYSTORE_UNTIL_EPOCH=$(date +%s --date="${DATE}")
  NOW_EPOCH=$(date +%s)
  seconds_to_expiry=$(($KEYSTORE_UNTIL_EPOCH-$NOW_EPOCH))
  days_to_expiry=$(( $seconds_to_expiry / 86400 ))
  
  if (( $days_to_expiry_min > $days_to_expiry )); then 
    days_to_expiry_min=$days_to_expiry
    date_min="${DATE}"
  fi
done

msg="JKS SSL Certificate Expires in $days_to_expiry_min days at $date_min - JKS_ALIAS_HERE"

if (( $days_to_expiry_min < $CRITICAL_DAYS )); then
  echo "CRITICAL - $msg"
  # Critical
  exit 2
fi
if (( $days_to_expiry_min < $WARNING_DAYS )); then
  echo "WARNING - $msg" 
  # Warning
  exit 1
fi

echo "OK - $msg"
# OK
exit 0

Please note that the above bash script is not optimized, and the setup as a whole leaves much room for improvement.


  1. https://en.wikipedia.org/wiki/Nagios

  2. https://en.wikipedia.org/wiki/Nagios#NRPE

  3. https://en.wikipedia.org/wiki/Exit_status

  4. Shoutout to Songrong for https://songrgg.github.io/operation/how-to-check-and-monitor-tls-jks-certificates-with-telegraf/
