Saving All Query Results to a File with a Bash Script Using the Scroll API
The logs kept in Apinizer may need to be transferred to other environments or examined with other products.
In such cases, the data stored in Apinizer's log database, Elasticsearch, must be queried and saved to a file. Due to the structure of Elasticsearch, a single search request returns at most 10,000 records (the default index.max_result_window).
When the total number of records exceeds this limit, the data must be retrieved with the Scroll API.
This process should be done in a loop, because each batch returned by the Scroll API may need to be processed before the next one is requested.
You can find an implementation of this loop as a Bash script below.
Prerequisite: Installing jq (JSON Processor)
The jq package must be installed on the server for the Bash script to work properly.
You can follow the steps below for this setup:
1. Install the EPEL repository
yum install epel-release -y
2. Update your server
yum update -y
3. Install the jq (JSON Processor) tool
yum install jq -y
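After installation, jq can be verified with a quick sanity check:

```shell
# Show the installed jq version
jq --version

# Extract a field from an inline JSON document to confirm jq
# parses and filters correctly
echo '{"status":"ok"}' | jq -r '.status'
# prints: ok
```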
Scrolling Script
The script below should be saved in a directory as script.sh and made executable with chmod +x script.sh.
#!/bin/bash
es_url='http://172.16.0.49:9200'
index='apinizer-log-apiproxy-lelc'

# Initial search: opens a scroll context that stays alive for 1 minute
response=$(curl -X GET -s "$es_url/$index/_search?scroll=1m" -H 'Content-Type: application/json' -d @query.json)
scroll_id=$(echo "$response" | jq -r '._scroll_id')
hits_count=$(echo "$response" | jq -r '.hits.hits | length')
hits_so_far=$hits_count
echo "Got initial response with $hits_count hits and scroll ID $scroll_id"

# process first page of results here (ex. put the response into result.json)
echo "$response" | jq . >> result.json

# Request the next page until an empty page is returned
while [ "$hits_count" != "0" ]; do
  response=$(curl -X GET -s "$es_url/_search/scroll" -H 'Content-Type: application/json' -d "{ \"scroll\": \"1m\", \"scroll_id\": \"$scroll_id\" }")
  scroll_id=$(echo "$response" | jq -r '._scroll_id')
  hits_count=$(echo "$response" | jq -r '.hits.hits | length')
  hits_so_far=$((hits_so_far + hits_count))
  echo "Got response with $hits_count hits (hits so far: $hits_so_far), new scroll ID $scroll_id"

  # process page of results (ex. put the response into result.json)
  echo "$response" | jq . >> result.json
done
echo "Done!"
#script reference: https://gist.github.com/toripiyo/8b14e8a387069bae372d49296b0077d7
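One optional addition: every scroll search keeps a search context open on the cluster until its 1m timeout expires. If many exports are run back to back, the context can be released explicitly once the loop finishes. A minimal sketch, assuming the same es_url as in script.sh and the last scroll ID printed by it (the placeholder value below is illustrative):

```shell
#!/bin/bash
es_url='http://172.16.0.49:9200'          # same endpoint as in script.sh
scroll_id='REPLACE_WITH_LAST_SCROLL_ID'   # last ID printed by script.sh

# Build the request body with jq so the ID is escaped safely
body=$(jq -n --arg id "$scroll_id" '{ scroll_id: $id }')

# DELETE releases the scroll context immediately instead of letting it
# expire; the fallback message covers an unreachable cluster
curl -X DELETE -s --max-time 10 "$es_url/_search/scroll" \
  -H 'Content-Type: application/json' \
  -d "$body" || echo "Could not reach Elasticsearch at $es_url"
```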
Example Query
The following query must be saved as query.json in the same directory as script.sh. The file contains only the JSON body of the query; the script reads it with curl's @query.json option and sends it to the Elasticsearch address defined in script.sh, so the address, index name, and the field values below (the uok username, the pi proxy ID, and the @timestamp range) must be adjusted for your environment. Note that with the Scroll API the size field sets the number of records per batch, not the total; by default it cannot exceed 10,000.
{
  "from": 0,
  "size": 1000,
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "filter": [
              {
                "bool": {
                  "filter": [
                    {
                      "match": {
                        "uok": {
                          "query": "username",
                          "operator": "OR",
                          "prefix_length": 0,
                          "max_expansions": 50,
                          "fuzzy_transpositions": true,
                          "lenient": false,
                          "zero_terms_query": "NONE",
                          "auto_generate_synonyms_phrase_query": true,
                          "boost": 1.0
                        }
                      }
                    }
                  ],
                  "adjust_pure_negative": true,
                  "boost": 1.0
                }
              }
            ],
            "adjust_pure_negative": true,
            "boost": 1.0
          }
        },
        {
          "bool": {
            "filter": [
              {
                "bool": {
                  "should": [
                    {
                      "term": {
                        "pi": {
                          "value": "6130d19b59f2007bff548d29",
                          "boost": 1.0
                        }
                      }
                    }
                  ],
                  "adjust_pure_negative": true,
                  "boost": 1.0
                }
              },
              {
                "range": {
                  "@timestamp": {
                    "from": "now-4320m/m",
                    "to": "now/m",
                    "include_lower": true,
                    "include_upper": true,
                    "boost": 1.0
                  }
                }
              }
            ],
            "adjust_pure_negative": true,
            "boost": 1.0
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  },
  "_source": {
    "includes": [
      "@timestamp",
      "uok",
      "fcrb",
      "sc",
      "pet",
      "rt",
      "tch",
      "tcb",
      "hr1ra",
      "et",
      "fcrh"
    ],
    "excludes": []
  }
}
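Before running the script, it is worth confirming that query.json parses as valid JSON; jq fails with a parse error on a malformed file. A small check, using a minimal stand-in query rather than the full one above:

```shell
# Minimal stand-in for query.json (the real file holds the full
# query shown above)
cat > query.json <<'EOF'
{ "size": 1000, "query": { "match_all": {} } }
EOF

# jq -e exits non-zero if the file is not valid JSON
jq -e . query.json > /dev/null && echo "query.json is valid"
```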
You can visit this page to see what the fields in this query mean.
Running the Script
To run the script, execute ./script.sh from the directory that contains it.
After that, informational messages will start to appear as below, and the results will accumulate in the result.json file.
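Since each scroll page is appended to result.json as a separate JSON document, jq can read the file as a stream afterwards. A short sketch with demo data standing in for two scroll pages (field names taken from the _source list above):

```shell
# Demo data: two search responses, as script.sh would append them
cat > result.json <<'EOF'
{"hits":{"hits":[{"_source":{"uok":"user1","sc":200}}]}}
{"hits":{"hits":[{"_source":{"uok":"user2","sc":200}},{"_source":{"uok":"user3","sc":404}}]}}
EOF

# Print every stored record (the _source of each hit), one per line
jq -c '.hits.hits[]._source' result.json

# Count the total number of records across all pages
jq -s '[ .[].hits.hits | length ] | add' result.json
# prints: 3
```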