How jq can save your job in simple ways

While checking my company's ticketing system I noticed a new reply on a ticket that had been in hiatus for quite a while (hiatus as in: it was opened and nobody looked at it for whatever reason; maybe everyone was too busy, and no, it wasn't in the backlog queue). The new message said something like: 'Hey guys? Is this happening? Ever?'

So, I had some free time and decided to act on it. An API needed to be monitored. The API had several endpoints defined, all of which gave crucial information about the application instance deployed on the server. The information is presented in JSON format, with a value entry holding the figures that needed monitoring.

Now, I'm not an expert on JSON, but the format isn't hard to understand, and after checking some documentation and a couple of Google searches it's easily manageable. Add to that a general understanding of how Bash works, an understanding that's basically summarized in two words: google-fu (99% of the sysadmins I know are too afraid to admit this publicly, so I will).

Analyzing

After checking the API's URL with curl I could tell there were several endpoints exposed; in fact, all of the endpoints the developers needed monitored.

Note: If you don't want an explanation for this and just wanna see the script, it's posted at the end of this article.

To do this, all you need is a one-liner in the terminal, such as this:

curl -s http://amazingapi.local/api

Note: you can use curl with any URL, be it a domain name or a bare IP without DNS; if you need more details on the command, consult its man pages with man curl
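For instance, against a server that's only reachable by an internal IP (the address and port here are made up for illustration):

curl -s http://192.0.2.10:8080/api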

Working with JSON

After checking the output from the curl command we get something like this:

curl -s http://amazingapi.local/api

{"names":["jvm.memory.max","jdbc.connections.active","process.files.max","jvm.gc.memory.promoted","tomcat.cache.hit","system.load.average.1m","tomcat.cache.access","jvm.memory.used","jvm.gc.max.data.size","jdbc.connections.max","jdbc.connections.min","jvm.gc.pause","jvm.memory.committed","http.server.requests","system.cpu.count","tomcat.global.sent","jvm.buffer.memory.used","tomcat.sessions.created","jvm.threads.daemon","system.cpu.usage","jvm.gc.memory.allocated","tomcat.global.request.max","hikaricp.connections.idle","hikaricp.connections.pending","tomcat.global.request","tomcat.sessions.expired","hikaricp.connections","jvm.threads.live","jvm.threads.peak","tomcat.global.received","hikaricp.connections.active","hikaricp.connections.creation","process.uptime","tomcat.sessions.rejected","process.cpu.usage","tomcat.threads.config.max","jvm.classes.loaded","hikaricp.connections.max","hikaricp.connections.min","jvm.classes.unloaded","tomcat.global.error","tomcat.sessions.active.current","tomcat.sessions.alive.max","jvm.gc.live.data.size","tomcat.servlet.request.max","hikaricp.connections.usage","tomcat.threads.current","tomcat.servlet.request","hikaricp.connections.timeout","process.files.open","jvm.buffer.count","jvm.buffer.total.capacity","tomcat.sessions.active.max","hikaricp.connections.acquire","tomcat.threads.busy","process.start.time","tomcat.servlet.error"]}

Not many people will read that entire output. Even though it's JSON, it's minified, so it's not easy to follow. This is where our new best friend for working with JSON comes in: jq. If we pipe the output from the curl command into jq we see something like this:

curl -s http://amazingapi.local/api | jq

{
  "names": [
    "jvm.memory.max",
    "http.server.requests",
    "jdbc.connections.active",
    "process.files.max",
    "jvm.gc.memory.promoted",
    "tomcat.cache.hit",
    "system.load.average.1m",
    "tomcat.cache.access",
    "jvm.memory.used",
    "jvm.gc.max.data.size",
    "jdbc.connections.max",
    "jdbc.connections.min",
    "jvm.gc.pause",
    "jvm.memory.committed",
    "system.cpu.count",
    "tomcat.global.sent",
    "jvm.buffer.memory.used",
    "tomcat.sessions.created",
    "jvm.threads.daemon",
    "system.cpu.usage",
    "jvm.gc.memory.allocated",
    "tomcat.global.request.max",
    "hikaricp.connections.idle",
    "hikaricp.connections.pending",
    "tomcat.global.request",
    "tomcat.sessions.expired",
    "hikaricp.connections",
    "jvm.threads.live",
    "jvm.threads.peak",
    "tomcat.global.received",
    "hikaricp.connections.active",
    "hikaricp.connections.creation",
    "process.uptime",
    "tomcat.sessions.rejected",
    "process.cpu.usage",
    "tomcat.threads.config.max",
    "jvm.classes.loaded",
    "hikaricp.connections.max",
    "hikaricp.connections.min",
    "jvm.classes.unloaded",
    "tomcat.global.error",
    "tomcat.sessions.active.current",
    "tomcat.sessions.alive.max",
    "jvm.gc.live.data.size",
    "tomcat.servlet.request.max",
    "hikaricp.connections.usage",
    "tomcat.threads.current",
    "tomcat.servlet.request",
    "hikaricp.connections.timeout",
    "process.files.open",
    "jvm.buffer.count",
    "jvm.buffer.total.capacity",
    "tomcat.sessions.active.max",
    "hikaricp.connections.acquire",
    "tomcat.threads.busy",
    "process.start.time",
    "tomcat.servlet.error"
  ]
}

If you need to install jq, it's as simple as apt install jq -y on Debian-based distros or yum install jq -y on RHEL-based distros.

This is A LOT easier to understand, and we immediately spot a small issue: there are a lot of endpoints. There are a few ways to script monitoring for an API, and most of them involve writing a single script for a single API; however, there are many APIs to monitor, both in this company's infrastructure and in any other we might want to explore.

Reasoning

In this specific case a few developers told me that not all of the APIs had the same endpoints, but most of them didn't need credentials since access was filtered by IP, which made the task a lot easier. I won't go into detail on how to do this against an authenticated API, maybe in another article; but adding token-based or credentials-based authentication to the script shouldn't take more than a couple of lines.
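For the record, curl makes either style a one-flag change. The token and credentials below are placeholders, since our API doesn't actually ask for them:

curl -s -H "Authorization: Bearer $TOKEN" http://amazingapi.local/api
curl -s -u monitoring_user:secret http://amazingapi.local/api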

Back to the API. As you can see, there are a lot of endpoints, and we don't know whether they'll change some day, some deleted, some added. We need a script that can adapt, and that can be set up in our monitoring tool to pick up those changes. Also, this API happens to be served by a Tomcat application, but that won't always be the case.

Since we can get the endpoint list from the curl command we executed earlier, we can also query a specific endpoint. For the first entry, jvm.memory.max, we can do this:

Selecting values

curl -s http://amazingapi.local/api/jvm.memory.max | jq

{
  "name": "jvm.memory.max",
  "description": "The maximum amount of memory in bytes that can be used for memory management",
  "baseUnit": "bytes",
  "measurements": [
    {
      "statistic": "VALUE",
      "value": 33096204287
    }
  ],
  "availableTags": [
    {
      "tag": "area",
      "values": [
        "heap",
        "nonheap"
      ]
    },
    {
      "tag": "id",
      "values": [
        "Compressed Class Space",
        "PS Survivor Space",
        "PS Old Gen",
        "Metaspace",
        "PS Eden Space",
        "Code Cache"
      ]
    }
  ]
}

Awesome: we see the value the API provides, as well as some extra data that doesn't serve much purpose for our needs right now, so we won't dwell on it too much. The main attribute we want from the JSON output is the VALUE entry, which lives in the measurements array:

  "measurements": [
    {
      "statistic": "VALUE",
      "value": 33096204287
    }
  ],

Options for jq

With jq we can pull an array out of an object by quoting the filter in single quotes and adding brackets to iterate over the array, like this:

curl -s http://amazingapi.local/api/jvm.memory.max | jq '.measurements[]'

{
  "statistic": "VALUE",
  "value": 33096204287
}

If we want a specific field from the entries in the array, we can go further in and get the exact value we need:

curl -s http://amazingapi.local/api/jvm.memory.max | jq '.measurements[]' | jq '.value'

33096204287
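Piping jq into jq works, but the two filters can just as well be combined into a single one, which is the form we'll use in the script later:

curl -s http://amazingapi.local/api/jvm.memory.max | jq '.measurements[].value'

33096204287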

That number, as stated by the previous query against the API, is "The maximum amount of memory in bytes that can be used for memory management". It could be converted in a bunch of different ways from the script, but we won't do that here, because we want the script to stay as frugal as possible so it can be reused with many APIs and many endpoints.
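If you ever do need a human-readable figure while debugging, GNU coreutils ships numfmt, which converts raw byte counts on the spot:

numfmt --to=iec 33096204287

31G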

Now that we know how to get a specific value from an endpoint using curl, we just need to turn it into a script that monitoring software can run. Since most monitoring software can execute bash scripts, or sh, or ksh, or whatever shell is installed on the monitoring host, we'll just use Bash.
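To give an idea of where we're heading: many monitoring tools follow the Nagios convention of one status line plus exit code 0, 1, or 2 for OK, WARNING, and CRITICAL. Here's a minimal sketch of how the check_api.sh script we're about to write could be wrapped for such a tool; the endpoint choice and thresholds are invented for illustration:

#!/bin/bash
# Hypothetical Nagios-style wrapper around the check_api.sh script built below.
value=$(./check_api.sh http://amazingapi.local/api jvm.memory.used)
# Made-up thresholds: warn above ~25 GB used, go critical above ~30 GB
if (( value > 30000000000 )); then
    echo "CRITICAL - jvm.memory.used is $value bytes"; exit 2
elif (( value > 25000000000 )); then
    echo "WARNING - jvm.memory.used is $value bytes"; exit 1
else
    echo "OK - jvm.memory.used is $value bytes"; exit 0
fi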

Scripting

Please note that this script relies on some text manipulation that can be understood through the man pages on your Linux console. The script might look kinda hard to read at first (sorry, not sorry), but it's pretty simple once you actually go through it line by line.

Let's start with the first segment:

#!/bin/bash
if [[ -z $1 || -z $2 ]] ; then
    echo "This script has to be used with two parameters"
    echo "Usage:"
    echo "check_api.sh <api_url> <endpoint>"
    echo "e.g.:"
    echo "  check_api.sh http://exampleurl.com/api cpu.sys.util"
    exit 1
fi

We just initialize the script and put in a usage description: if the script doesn't receive both parameters it expects on startup, it's going to show a help message with an example and exit.
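Run without arguments (or with just one), the script prints that help text and bails out:

./check_api.sh

This script has to be used with two parameters
Usage:
check_api.sh <api_url> <endpoint>
e.g.:
  check_api.sh http://exampleurl.com/api cpu.sys.util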

Afterwards we define a few variables to be used:

url=$1
probe=$2

# jq lists the endpoint names; awk strips the quotes jq prints around them
endpoints=$(curl -s "$url" | \
jq '.names[]' | awk -F\" '{print $2}')

These variables do simple jobs. The first two receive the parameters from the script invocation; the third holds the output of the command that pulls the endpoint list from the API server (like we discovered earlier), with awk stripping the quotes jq leaves around each name.
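Running just that pipeline against our example API prints one bare endpoint name per line, which is exactly the shape we want for the next step:

curl -s http://amazingapi.local/api | jq '.names[]' | awk -F\" '{print $2}'

jvm.memory.max
jdbc.connections.active
process.files.max
...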

Then we load the endpoint list into an array named endpointsarray:

# word splitting turns the newline-separated list into array elements
endpointsarray=( $endpoints )
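If you want to sanity-check that the array was actually populated, bash can print its element count:

echo "${#endpointsarray[@]}"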

And finally the actual magic:

for i in "${endpointsarray[@]}"
do
        # query each endpoint and extract its measurement value(s)
        specs=$(curl -s "$url/$i" | \
        jq '.measurements[].value')
        # word splitting flattens multi-line values onto one line;
        # grep keeps only the requested endpoint, cut drops its name
        echo $i $specs | grep "$probe" | cut -d " " -f2-
done

This for loop just goes through the array and fetches the value for every endpoint. The text manipulation is there because the loop produces output we don't need: grep keeps only the line for the endpoint we asked about, and cut drops the endpoint name so only the value remains. If you wanna see some of that discarded output, feel free to experiment with the script; try seeing what the cut command does on it.
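For instance, dropping the cut at the end of that echo line makes the loop print the endpoint name alongside its value, something like:

jvm.memory.max 33096204287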

Hopefully this has shed some light on a few things that needed it. The full script follows.

Final Script

TL;DR: a script that fetches all the endpoints from an API and prints the value of a single endpoint:

#!/bin/bash
# check_api.sh - query every endpoint of a JSON API and print the value
# of the endpoint passed as the second parameter
if [[ -z $1 || -z $2 ]] ; then
    echo "This script has to be used with two parameters"
    echo "Usage:"
    echo "check_api.sh <api_url> <endpoint>"
    echo "e.g.:"
    echo "  check_api.sh http://exampleurl.com/api cpu.sys.util"
    exit 1
fi

url=$1
probe=$2

# jq lists the endpoint names; awk strips the quotes jq prints around them
endpoints=$(curl -s "$url" | \
jq '.names[]' | awk -F\" '{print $2}')

# word splitting turns the newline-separated list into array elements
endpointsarray=( $endpoints )

for i in "${endpointsarray[@]}"
do
        # query each endpoint and extract its measurement value(s)
        specs=$(curl -s "$url/$i" | \
        jq '.measurements[].value')
        # grep keeps only the requested endpoint; cut drops its name
        echo $i $specs | grep "$probe" | cut -d " " -f2-
done
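Pointed at our example API, a run looks something like this (the number will obviously differ on your instance):

./check_api.sh http://amazingapi.local/api jvm.memory.max

33096204287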