The DigitalOcean platform was used as the infrastructure for the Apinizer load tests because of its ease of use and fast support.

The load test topology was as follows:

To create this topology and run our load tests, these steps were followed:

1. Setup of "Load Test Server" and Configuration of JMeter

Load Test Server specifications were:


The following steps were performed as the root user:


  • Java 1.8 was installed
# yum install java-1.8.0-openjdk -y
BASH
  • The Java version was checked
# java -version
openjdk version "1.8.0_275"
OpenJDK Runtime Environment (build 1.8.0_275-b01)
OpenJDK 64-Bit Server VM (build 25.275-b01, mixed mode)
BASH
  • "wget" command was installed for various file download operations
# yum install wget -y
BASH
  • The JMeter archive was downloaded to the server
# wget http://apache.stu.edu.tw//jmeter/binaries/apache-jmeter-5.2.1.tgz
BASH
  • The JMeter tar file was extracted
# tar -xf apache-jmeter-5.2.1.tgz
BASH
  • The Linux environment was configured for JMeter
# vim ~/.bashrc
BASH
  • The following lines were added to .bashrc
export JMETER_HOME=/root/apache-jmeter-5.2.1
export PATH=$JMETER_HOME/bin:$PATH
BASH
  • The .bashrc file was reloaded with the source command
# source ~/.bashrc
BASH


After the configuration was completed, the load test script was prepared using the JMeter GUI. The number of threads and the test duration were defined as parameters.

Sample JMeter configuration file:

<?xml version="1.0" encoding="UTF-8"?>
<jmeterTestPlan version="1.2" properties="5.0" jmeter="5.2">
  <hashTree>
    <TestPlan guiclass="TestPlanGui" testclass="TestPlan" testname="Test Plan" enabled="true">
      <stringProp name="TestPlan.comments"></stringProp>
      <boolProp name="TestPlan.functional_mode">false</boolProp>
      <boolProp name="TestPlan.tearDown_on_shutdown">true</boolProp>
      <boolProp name="TestPlan.serialize_threadgroups">false</boolProp>
      <elementProp name="TestPlan.user_defined_variables" elementType="Arguments" guiclass="ArgumentsPanel" testclass="Arguments" testname="User Defined Variables" enabled="true">
        <collectionProp name="Arguments.arguments"/>
      </elementProp>
      <stringProp name="TestPlan.user_define_classpath"></stringProp>
    </TestPlan>
    <hashTree>
      <ThreadGroup guiclass="ThreadGroupGui" testclass="ThreadGroup" testname="Thread Group" enabled="true">
        <stringProp name="ThreadGroup.on_sample_error">continue</stringProp>
        <elementProp name="ThreadGroup.main_controller" elementType="LoopController" guiclass="LoopControlPanel" testclass="LoopController" testname="Loop Controller" enabled="true">
          <boolProp name="LoopController.continue_forever">false</boolProp>
          <intProp name="LoopController.loops">-1</intProp>
        </elementProp>
        <stringProp name="ThreadGroup.num_threads">${__P(threads,10)}</stringProp>
        <stringProp name="ThreadGroup.ramp_time">5</stringProp>
        <boolProp name="ThreadGroup.scheduler">true</boolProp>
        <stringProp name="ThreadGroup.duration">${__P(seconds,30)}</stringProp>
        <stringProp name="ThreadGroup.delay"></stringProp>
        <boolProp name="ThreadGroup.same_user_on_next_iteration">true</boolProp>
      </ThreadGroup>
      <hashTree>
        <HTTPSamplerProxy guiclass="HttpTestSampleGui" testclass="HTTPSamplerProxy" testname="HTTP Request" enabled="true">
          <elementProp name="HTTPsampler.Arguments" elementType="Arguments" guiclass="HTTPArgumentsPanel" testclass="Arguments" testname="User Defined Variables" enabled="true">
            <collectionProp name="Arguments.arguments"/>
          </elementProp>
          <stringProp name="HTTPSampler.domain"><Change with Your IP></stringProp>
          <stringProp name="HTTPSampler.port">30080</stringProp>
          <stringProp name="HTTPSampler.protocol"></stringProp>
          <stringProp name="HTTPSampler.contentEncoding"></stringProp>
          <stringProp name="HTTPSampler.path">/apigateway/<Change with Your Path></stringProp>
          <stringProp name="HTTPSampler.method">GET</stringProp>
          <boolProp name="HTTPSampler.follow_redirects">true</boolProp>
          <boolProp name="HTTPSampler.auto_redirects">false</boolProp>
          <boolProp name="HTTPSampler.use_keepalive">true</boolProp>
          <boolProp name="HTTPSampler.DO_MULTIPART_POST">false</boolProp>
          <stringProp name="HTTPSampler.embedded_url_re"></stringProp>
          <stringProp name="HTTPSampler.connect_timeout"></stringProp>
          <stringProp name="HTTPSampler.response_timeout"></stringProp>
        </HTTPSamplerProxy>
        <hashTree>
          <ResultCollector guiclass="SummaryReport" testclass="ResultCollector" testname="Summary Report" enabled="true">
            <boolProp name="ResultCollector.error_logging">false</boolProp>
            <objProp>
              <name>saveConfig</name>
              <value class="SampleSaveConfiguration">
                <time>true</time>
                <latency>true</latency>
                <timestamp>true</timestamp>
                <success>true</success>
                <label>true</label>
                <code>true</code>
                <message>true</message>
                <threadName>true</threadName>
                <dataType>true</dataType>
                <encoding>false</encoding>
                <assertions>true</assertions>
                <subresults>true</subresults>
                <responseData>false</responseData>
                <samplerData>false</samplerData>
                <xml>false</xml>
                <fieldNames>true</fieldNames>
                <responseHeaders>false</responseHeaders>
                <requestHeaders>false</requestHeaders>
                <responseDataOnError>false</responseDataOnError>
                <saveAssertionResultsFailureMessage>true</saveAssertionResultsFailureMessage>
                <assertionsResultsToSave>0</assertionsResultsToSave>
                <bytes>true</bytes>
                <sentBytes>true</sentBytes>
                <url>true</url>
                <threadCounts>true</threadCounts>
                <idleTime>true</idleTime>
                <connectTime>true</connectTime>
              </value>
            </objProp>
            <stringProp name="filename"></stringProp>
          </ResultCollector>
          <hashTree/>
        </hashTree>
      </hashTree>
    </hashTree>
  </hashTree>
</jmeterTestPlan>


XML

The <Change with Your IP> placeholder must be replaced with the IP address to which the load will be sent.

The <Change with Your Path> placeholder must be replaced with the request path that follows /apigateway/ on that address.

The "threads" and "seconds" values are parameters supplied at runtime. If they are not supplied, the test plan falls back to its defaults of 10 threads and 30 seconds.

Screenshot of this configuration:

  • Thread Group:

  • HTTP Request:


  • The following command was run with different thread counts and test durations, and the results were recorded.
# jmeter -Jthreads=1000 -Jseconds=600 -n -t ./conf/configurable.jmx
BASH
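
The run above can be repeated for a series of thread counts. The following is a minimal sketch, assuming Python 3.8 or later is available on the Load Test Server. The function name build_jmeter_command and the list of thread counts are illustrative and not part of the original setup; only the -J, -n, and -t options from the command above are used.

```python
import shlex


def build_jmeter_command(threads, seconds, plan="./conf/configurable.jmx"):
    """Return the JMeter argument list for one non-GUI run."""
    return [
        "jmeter",
        f"-Jthreads={threads}",  # read by the __P(threads,10) function in the plan
        f"-Jseconds={seconds}",  # read by the __P(seconds,30) function in the plan
        "-n",                    # non-GUI mode
        "-t", plan,              # test plan file
    ]


if __name__ == "__main__":
    # Illustrative thread counts; 600 seconds matches the 10-minute runs.
    for threads in (50, 100, 250, 500, 1000):
        print(shlex.join(build_jmeter_command(threads, 600)))
```

Each returned list can be passed to subprocess.run(cmd, check=True) to execute the runs one after another.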


2. Setup of "NGINX Server" and Configuration of NGINX

NGINX was used to simulate the backend service.

NGINX Server specifications were:


The following steps were performed as the root user:


  • EPEL repository was installed
# yum install epel-release
BASH
  • NGINX was installed
# yum install nginx
BASH
  • NGINX was started after installation
# systemctl start nginx
BASH
  • To verify that NGINX was working, the configured address was opened in a browser
http://server_domain_name_or_IP/
BASH
  • After a successful result was seen, NGINX was enabled to run as a service on Linux
# systemctl enable nginx
BASH
  • The following settings were added to or edited in the configuration file so that NGINX could run under high load
worker_processes 4;
worker_connections 8192;
worker_rlimit_nofile 40000;
BASH
  • NGINX was configured to return status code 200 with the response body 'OK'.
location / {
    return 200 'OK';
    add_header Content-Type text/plain;
}
BASH
  • After all changes were made, the NGINX configuration file was as follows:
# vi /etc/nginx/nginx.conf

user nginx; 
worker_processes 4;
worker_rlimit_nofile 40000;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;


include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 8192;
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    include /etc/nginx/conf.d/*.conf;

    server {
        listen       80 default_server;
        listen       [::]:80 default_server;
        server_name  _;
        root         /usr/share/nginx/html;


        include /etc/nginx/default.d/*.conf;

        location / {
            return 200 'OK';
            add_header Content-Type text/plain;
        }

        error_page 404 /404.html;
            location = /40x.html {
        }

        error_page 500 502 503 504 /50x.html;
            location = /50x.html {
        }
    }

}
BASH
  • The updated NGINX configuration was reloaded with the following command:
# nginx -s reload
BASH
  • To verify that NGINX had loaded the latest settings, the server address was opened in a browser and the text 'OK' was displayed.
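
As a rough sanity check on the high-load settings above: NGINX can hold at most worker_processes × worker_connections connections at once, and worker_rlimit_nofile sets the open-file limit of each worker process. The following minimal sketch works through that arithmetic for the values used here; the function names are illustrative.

```python
def max_connections(worker_processes, worker_connections):
    """Upper bound on simultaneous connections across all worker processes."""
    return worker_processes * worker_connections


def fd_limit_is_sufficient(worker_connections, worker_rlimit_nofile):
    """Each open connection needs at least one file descriptor in its worker."""
    return worker_rlimit_nofile >= worker_connections


print(max_connections(4, 8192))             # 32768
print(fd_limit_is_sufficient(8192, 40000))  # True
```

With these values the connection ceiling is well above the largest thread count used in the tests.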

3. Setup of "Apinizer and Log Server" and Installation and Configuration of Kubernetes, MongoDB, and Elasticsearch

Kubernetes master and MongoDB server specification:

Kubernetes worker server specification:

Log database (Elasticsearch) server specification:



In this section, the Kubernetes master and worker, MongoDB, Elasticsearch, and Apinizer were installed by following the steps described at https://docs.apinizer.com/


4. Key Points of Load Test

Points to consider while testing:

  • Apinizer asynchronously stores all request and response messages and metrics in the Elasticsearch log database. During the tests, these logging operations continued as they would in normal operation.
  • In all our tests, we used internal IPs to reduce network latency and see the real impact of Apinizer.
  • We specifically verified that Kubernetes did not restart any pods during the tests. The number of restarts is an important indicator because it reflects overload, congestion, or fault conditions.
  • The following four configurations were set up from the Gateway Environments and Elasticsearch Clusters screens and used in the tests:




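| Worker Settings: Core | Worker Settings: Ram (gb) | Worker Settings: IO Threads | Worker Settings: Min. Thread Count | Worker Settings: Max. Thread Count | Worker Settings: Http Max Connections | Routing Connection Pool: Max Conn. Per Route | Routing Connection Pool: Max Conn. Total | Elastic Search Client: IO Thread Count | Elastic Search Client: Max Conn. Per Route | Elastic Search Client: Max Conn. Total |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 512 | 1024 | 1024 | 512 | 1024 | 4 | 32 | 64 |
| 2 | 2 | 2 | 1024 | 2048 | 4096 | 2048 | 4096 | 16 | 64 | 128 |
| 4 | 4 | 4 | 1024 | 4096 | 8192 | 4096 | 8192 | 32 | 64 | 128 |
| 8 | 8 | 16 | 1024 | 8192 | 8192 | 4096 | 8192 | 32 | 128 | 256 |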


  • The following values were added to the JVM parameters: -server -XX:MaxRAMPercentage=90
  • The four configurations above were tested with GET and POST requests. Request bodies of 5 KB and 50 KB were used for the POST requests.
  • Each test case was run for 10 minutes.
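
The -XX:MaxRAMPercentage=90 option lets the JVM size its maximum heap as a percentage of the memory available to it. The following rough sketch shows the resulting ceilings, assuming the Ram (gb) values of the four configurations are the memory visible to each worker JVM. That is an assumption, since the container memory limits are not stated here, and the function name is illustrative.

```python
def max_heap_gb(ram_gb, max_ram_percentage=90):
    """Approximate heap ceiling implied by -XX:MaxRAMPercentage."""
    return ram_gb * max_ram_percentage / 100


# Ram (gb) values of the four tested configurations.
for ram_gb in (1, 2, 4, 8):
    print(f"{ram_gb} GB RAM -> about {max_heap_gb(ram_gb):.1f} GB max heap")
```

The remaining 10 percent is left for non-heap memory such as thread stacks and native buffers.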

5. Monitoring System Resources


Since Apinizer runs on Kubernetes, two methods were used to monitor resource consumption:

  1. Kubernetes dashboard, the setup of which is described on this page,
  2. JConsole

Monitoring resources via Kubernetes Dashboard was relatively easy, giving instantaneous CPU and RAM status of the server.

However, the drawback of this method was that it could not monitor resources over the long term and did not show details:


For this reason, JConsole proved more useful.


The following settings were made so that JConsole could access the Java application running inside the pod in Kubernetes:

  • The Java startup parameters were set as follows:
-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=30180 -Dcom.sun.management.jmxremote.rmi.port=30180 -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=<External Access IP of the Worker Server>
BASH
  • To expose the JMX service, which listens on port 30180 of the worker pod, to the internet, the following Service was defined in Kubernetes:
apiVersion: v1
kind: Service
metadata:
  name: worker-jmx-service
  namespace: prod
  labels:
    app: worker
spec:
  selector:
    app: worker
  type: NodePort
  ports:
    - name: http
      port: 30180
      targetPort: 30180
      nodePort: 30180
YML


Once the settings were complete, JConsole was started and connected to the JVM running on the worker by entering the external access address of the worker server and port 30180.
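
The JMX settings above can be summarized in a small helper. This is a minimal sketch, not part of the original setup: the function names are illustrative, and 203.0.113.10 is a placeholder documentation address standing in for the worker server's external IP. It assembles the startup flags listed above, builds the standard JMX service URL that JConsole accepts as an alternative to host:port, and checks that the port falls within the default Kubernetes NodePort range (30000-32767).

```python
def jmx_flags(host, port=30180):
    """JVM flags for unauthenticated, non-SSL remote JMX on a single port."""
    return [
        "-Dcom.sun.management.jmxremote",
        f"-Dcom.sun.management.jmxremote.port={port}",
        f"-Dcom.sun.management.jmxremote.rmi.port={port}",
        "-Dcom.sun.management.jmxremote.local.only=false",
        "-Dcom.sun.management.jmxremote.authenticate=false",
        "-Dcom.sun.management.jmxremote.ssl=false",
        f"-Djava.rmi.server.hostname={host}",
    ]


def jmx_service_url(host, port=30180):
    """Standard JMX-over-RMI service URL for the given host and port."""
    return f"service:jmx:rmi:///jndi/rmi://{host}:{port}/jmxrmi"


def in_default_nodeport_range(port):
    """Kubernetes allocates NodePorts from 30000-32767 unless reconfigured."""
    return 30000 <= port <= 32767


host = "203.0.113.10"  # placeholder, replace with the worker's external IP
print(" ".join(jmx_flags(host)))
print(jmx_service_url(host))
print(in_default_nodeport_range(30180))
```

Using the same value for the JMX port, the RMI port, and the nodePort keeps the connection on a single port, so only one port has to be exposed.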


6. Results





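| No | Thread Count | GET Throughput | GET Avg | POST 5Kb Throughput | POST 5Kb Avg | POST 50Kb Throughput | POST 50Kb Avg |
|---|---|---|---|---|---|---|---|
| A | 50 | 1133 | 43 | 1002 | 49 | 675 | 73 |
| A | 100 | 1100 | 90 | 983 | 101 | 653 | 152 |
| A | 250 | 1025 | 242 | 852 | 292 | 554 | 448 |
| A | 500 | 963 | 516 | - | - | - | - |
| B | 50 | 2232 | 22 | 1868 | 26 | 1437 | 34 |
| B | 100 | 2169 | 45 | 1768 | 56 | 1409 | 70 |
| B | 250 | 2089 | 119 | 1456 | 170 | 1223 | 203 |
| B | 500 | 1915 | 259 | 1398 | 355 | 1149 | 432 |
| B | 1000 | 1762 | 564 | 1229 | 809 | 877 | 1134 |
| B | 1500 | 1631 | 915 | 1199 | 1245 | - | - |
| B | 2000 | 1379 | 1441 | - | - | - | - |
| C | 50 | 8090 | 6 | 7353 | 6 | 4679 | 10 |
| C | 100 | 7816 | 12 | 7257 | 13 | 4675 | 21 |
| C | 250 | 7011 | 35 | 7138 | 34 | 4020 | 61 |
| C | 500 | 6759 | 73 | 7141 | 69 | 3221 | 154 |
| C | 1000 | 6742 | 147 | 7011 | 141 | 2962 | 335 |
| C | 1500 | 6683 | 223 | 6935 | 215 | - | - |
| C | 2000 | 6692 | 297 | - | - | - | - |
| C | 4000 | 6448 | 617 | - | - | - | - |
| D | 50 | 15420 | 3 | 13396 | 3 | 4683 | 10 |
| D | 100 | 15812 | 6 | 13482 | 7 | 4671 | 21 |
| D | 250 | 15614 | 15 | 13587 | 18 | 4382 | 56 |
| D | 500 | 15664 | 31 | 13611 | 36 | 3496 | 142 |
| D | 1000 | 15454 | 64 | 13562 | 73 | 3046 | 326 |
| D | 1500 | 15026 | 99 | 13208 | 112 | 2853 | 522 |
| D | 2000 | 14839 | 133 | 13179 | 150 | 2794 | 710 |
| D | 4000 | 14356 | 276 | 12792 | 309 | - | - |
| D | 8000 | 11603 | 655 | 11115 | 701 | - | - |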

Throughput & Concurrent Users

Average Response Time & Concurrent Users

Key observations:


A very common mistake when examining the results is to confuse the number of sessions with the number of simultaneous requests. A request is an HTTP request for a specific destination with a specific HTTP method. There can be zero or more requests per session. For example, having 50K sessions in a web application does not mean that there will be 50K simultaneous requests; the probability of 50K requests arriving at the same time is very low. Keeping and using sessions on gateways is very rare, and access to backend services is usually stateless. Therefore, it is more meaningful to measure the number of simultaneous requests and the latency.
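
The relationship between simultaneous requests, throughput, and latency can be checked against the results table with Little's law: requests in flight ≈ throughput × average response time. The following is a minimal sketch, assuming throughput is in requests per second and the average is in milliseconds (the table does not state its units). The function name is illustrative.

```python
def requests_in_flight(throughput_per_s, avg_ms):
    """Little's law: mean number of requests being processed at once."""
    return throughput_per_s * avg_ms / 1000


# (thread count, GET throughput, GET average) taken from configuration D.
rows = [(500, 15664, 31), (1000, 15454, 64), (8000, 11603, 655)]
for threads, throughput, avg in rows:
    print(threads, round(requests_in_flight(throughput, avg)))
# 500 486
# 1000 989
# 8000 7600
```

Because each JMeter thread in this test plan sends its next request as soon as the previous one completes, the computed number of requests in flight stays close to the thread count.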

As the number of concurrent users increases, throughput increases up to a certain limit and then starts to decline. This natural course indicates that there is a limit to vertical scaling. To support more concurrent users with acceptable response times, horizontal and vertical scaling must be considered together. With other gateways, horizontal scaling requires placing a load balancer in front of two or more gateways, whereas in Apinizer this can be configured very easily and quickly because it runs on Kubernetes infrastructure.
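
One way to see this limit is to find, for each configuration, the thread count at which GET throughput peaks. The following minimal sketch uses the GET throughput values copied from the results table; the helper name peak is illustrative.

```python
# GET throughput per thread count, copied from the results table.
get_throughput = {
    "A": {50: 1133, 100: 1100, 250: 1025, 500: 963},
    "B": {50: 2232, 100: 2169, 250: 2089, 500: 1915,
          1000: 1762, 1500: 1631, 2000: 1379},
    "C": {50: 8090, 100: 7816, 250: 7011, 500: 6759,
          1000: 6742, 1500: 6683, 2000: 6692, 4000: 6448},
    "D": {50: 15420, 100: 15812, 250: 15614, 500: 15664, 1000: 15454,
          1500: 15026, 2000: 14839, 4000: 14356, 8000: 11603},
}


def peak(series):
    """Return (thread_count, throughput) at the highest throughput."""
    threads = max(series, key=series.get)
    return threads, series[threads]


for name, series in get_throughput.items():
    print(name, peak(series))
# A (50, 1133)
# B (50, 2232)
# C (50, 8090)
# D (100, 15812)
```

In every configuration the peak occurs at 50 or 100 threads, and adding more threads beyond that point only lengthens response times.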

As message size increases, the processing power required increases and throughput decreases, so response times become longer. Although request sizes are usually around 1 KB in real-life scenarios, we found it worthwhile to examine 5 KB and 50 KB POST requests because there was little difference between 1 KB POST requests and GET requests in our tests. Although the results are naturally lower than for GET requests, we were pleased that throughput dropped to only about one quarter when the request size increased tenfold.
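
The effect of the tenfold size increase can be read directly from the results table. The following minimal sketch computes the fraction of POST throughput retained when moving from 5 KB to 50 KB bodies, using the configuration D values; the function name is illustrative.

```python
# POST throughput per thread count for configuration D, from the results table.
post_5kb = {50: 13396, 100: 13482, 250: 13587, 500: 13611,
            1000: 13562, 1500: 13208, 2000: 13179}
post_50kb = {50: 4683, 100: 4671, 250: 4382, 500: 3496,
             1000: 3046, 1500: 2853, 2000: 2794}


def retained_fraction(small, large):
    """Throughput with large bodies divided by throughput with small bodies."""
    return {t: large[t] / small[t] for t in small if t in large}


ratios = retained_fraction(post_5kb, post_50kb)
print(round(min(ratios.values()), 2))  # 0.21
print(round(max(ratios.values()), 2))  # 0.35
```

For configuration D the retained throughput ranges from about 35 percent at 50 threads down to about 21 percent at 2000 threads.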

RAM usage was very consistent throughout the load tests. Although the size of the requests increased tenfold, there was no significant increase in RAM usage. This confirmed that OpenJ9 was the right choice for Apinizer.

A snapshot of the VM for configuration "D" with 8000 threads and GET requests:

Effect of adding a Policy 

Each policy added to the gateway affects its performance according to the policy's complexity and dependencies.

Now let's add a "basic authentication" policy to Apinizer. We test this configuration only for case "D" rather than for all cases, which should be enough to give us an idea:



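| No | Thread Count | GET Throughput | GET Avg | GET with Policy Throughput | GET with Policy Avg |
|---|---|---|---|---|---|
| D | 50 | 15420 | 3 | 14760 | 3 |
| D | 100 | 15812 | 6 | 14843 | 6 |
| D | 250 | 15614 | 15 | 14891 | 16 |
| D | 500 | 15664 | 31 | 14748 | 33 |
| D | 1000 | 15454 | 64 | 14285 | 68 |
| D | 1500 | 15026 | 99 | 14373 | 102 |
| D | 2000 | 14839 | 133 | 14280 | 136 |
| D | 4000 | 14356 | 276 | 13795 | 279 |
| D | 8000 | 11603 | 655 | 11437 | 672 |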

Throughput & Concurrent Users

Average Response Time & Concurrent Users

As we can see, the effect on performance was small. However, if a policy that requires high processing power, such as "content filtering", or a policy that requires an external connection and adds network latency, such as "LDAP Authentication", were added, performance would decrease much more sharply. The important thing here is to know how much burden each policy will bring and to choose the design accordingly.
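
The size of the effect can be quantified from the table above. The following minimal sketch computes the percentage drop in GET throughput after adding the policy, using the configuration D values; the function name is illustrative.

```python
# GET throughput for configuration D without and with the policy.
get_plain = {50: 15420, 100: 15812, 250: 15614, 500: 15664, 1000: 15454,
             1500: 15026, 2000: 14839, 4000: 14356, 8000: 11603}
get_policy = {50: 14760, 100: 14843, 250: 14891, 500: 14748, 1000: 14285,
              1500: 14373, 2000: 14280, 4000: 13795, 8000: 11437}


def drop_percent(before, after):
    """Relative throughput loss in percent."""
    return (before - after) / before * 100


drops = {t: round(drop_percent(get_plain[t], get_policy[t]), 1)
         for t in get_plain}
print(min(drops.values()), max(drops.values()))  # 1.4 7.6
```

Across all thread counts the basic authentication policy costs between roughly 1.4 and 7.6 percent of GET throughput.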