Thursday, February 23, 2023

Test Log Analytics with Elasticsearch and Kibana

Software Test Analytics is the process of collecting and analyzing data from software testing activities to improve the quality and efficiency of the testing process. This can include metrics such as test coverage, defect density, and test execution time, as well as data on test automation and test case management. The goal of software test analytics is to identify trends and patterns in the data that can be used to make informed decisions about how to improve the testing process, such as where to focus testing efforts, which tests to automate, and how to optimize test case design.

Test log analysis is the process of collecting, analyzing, and interpreting data from test logs in order to identify patterns, trends, and issues that can help improve the quality and performance of software systems. This can include data such as test results, error messages, performance metrics, and other relevant information. The goal of test log analysis is to help identify and resolve issues that may be impacting the performance or functionality of the system, and to improve the overall quality of the software. Common techniques used in test log analysis include statistical analysis, machine learning, and data visualization.

Insights that can be gained from analyzing test case output logs are listed below: 

1) Identifying which test cases are passing and which are failing, and the reasons for the failures. This can help you to focus your testing efforts on the areas of the application that need the most attention.

2) Understanding the performance of the application under test. This can include metrics such as response time, memory usage, and CPU usage, which can help you to identify and fix performance bottlenecks.

3) Identifying patterns in the test case data that indicate potential issues with the application under test. For example, if a large number of test cases are failing in a particular module, it could indicate a problem with that module that needs to be investigated.

4) Identifying areas of the application that are not being adequately tested. This can help you to create new test cases to cover these areas, and to ensure that the application is thoroughly tested before release.

5) Identifying where automation can improve the testing process. With the help of logs, you can identify test cases that are repetitive, time-consuming, or prone to human error, and automate them.

6) To get the most value out of test case output logs, it's important to have a systematic and automated way of collecting and analyzing the data. This can include using tools such as log analyzers, data visualization tools, and automated reporting tools.

Open-source frameworks that can be used for test result analytics: 

ELK Stack: ELK stands for Elasticsearch, Logstash, and Kibana. Elasticsearch is a search engine, Logstash is a log aggregator, and Kibana is a data visualization tool. By using the ELK stack, you can collect, store, and analyze large volumes of test result data in real-time.


The ELK stack (Elasticsearch, Logstash, and Kibana) can be integrated with a CI (Continuous Integration) framework in several ways to produce test analytics.

1. Logstash: Logstash can be used to collect and parse log files generated by the CI framework. You can configure Logstash to read log files from the CI server and to parse the data into a format that can be indexed by Elasticsearch.

2. Kibana: Kibana can be used to visualize the data collected by Logstash. You can create a dashboard in Kibana that displays metrics such as build time, test execution time, and pass/fail rate.

3. Elasticsearch: Elasticsearch can be used to store and index the data collected by Logstash. You can use Elasticsearch to search and analyze the data, and to create complex queries and visualizations.

4. Integrate with CI/CD tool: You can integrate ELK stack with your CI/CD tool, for example, Jenkins, Travis or CircleCI. You can configure the CI tool to send log files to Logstash, or directly to Elasticsearch.
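As an illustration of step 1, a minimal Logstash pipeline might look like the sketch below. The log path, the log line format assumed by the grok pattern, and the index name are all hypothetical, not taken from any particular CI setup:

```conf
input {
  file {
    # hypothetical location of CI test logs
    path => "/var/log/jenkins/test-results/*.log"
    start_position => "beginning"
  }
}

filter {
  grok {
    # assumes log lines like: "2023-02-23 10:15:02 login_test failed Invalid credentials"
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{WORD:test_name} %{WORD:status} %{GREEDYDATA:error_message}" }
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "test_logs"
  }
}
```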

-------------------

After installing and configuring the ELK stack and Jenkins, the next step for analytics on test report logs is to start analyzing and visualizing the data. Here are some steps you can take:

1. Verify data collection: Ensure that the data is being collected correctly and that the logs are being indexed in Elasticsearch.

2. Create visualizations: Use Kibana to create visualizations such as line charts, bar charts, and pie charts to represent the data in a meaningful way. These visualizations can be used to represent metrics such as build time, test execution time, and pass/fail rate.

3. Create dashboards: Use Kibana to create dashboards that display multiple visualizations in a single view. These dashboards can be used to monitor the build and test results in real-time and to analyze the data over time.

4. Define alerts: Set up alerts in Kibana to notify you when certain conditions are met, such as a high number of test failures.

5. Analyze the data: Use Elasticsearch to create complex queries and to analyze the data in more detail. This can be used to identify patterns and trends in the data that can help improve the testing process.

6. Improve test coverage: Use the data to identify areas of the application that are not being adequately tested and to focus your testing efforts on those areas.

7. Identify and fix defects: Use the data to identify the root cause of test failures and to fix defects in the application.
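As a sketch of step 5, the aggregation body below counts passed and failed runs per test case, and a small helper turns Elasticsearch's bucket response into a failure rate per test. The field names match the sample document indexed later in this post; the response fragment is hand-written for illustration (a live query would be `es.search(index='test_logs_sachin', body=query)`):

```python
# Aggregation body for Elasticsearch: bucket documents by test name,
# then by status, giving each test case its own pass/fail breakdown.
query = {
    "size": 0,  # return only aggregation buckets, not raw hits
    "aggs": {
        "per_test": {
            "terms": {"field": "test_name.keyword"},
            "aggs": {"per_status": {"terms": {"field": "status.keyword"}}},
        }
    },
}

def failure_rate(buckets):
    """Turn the nested terms-aggregation buckets into a fail rate per test."""
    rates = {}
    for test in buckets:
        counts = {b["key"]: b["doc_count"] for b in test["per_status"]["buckets"]}
        total = sum(counts.values())
        rates[test["key"]] = counts.get("failed", 0) / total if total else 0.0
    return rates

# Illustrative response fragment, shaped like Elasticsearch's aggregation output:
sample_buckets = [
    {"key": "login_test", "doc_count": 4,
     "per_status": {"buckets": [{"key": "failed", "doc_count": 3},
                                {"key": "passed", "doc_count": 1}]}},
]
print(failure_rate(sample_buckets))  # {'login_test': 0.75}
```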

------------------------

We can also ingest data into Elasticsearch directly with Python instead of using Logstash (which is better suited for real-time log ingestion). Prerequisites: get the Elasticsearch client packages:

python -m pip install elasticsearch
python -m pip install elasticsearch-async

Some commonly used Elasticsearch Python client APIs are listed below:

  1. es.index: Used to index a document into an index.
  2. es.search: Used to search for documents in an index based on a query.
  3. es.get: Used to retrieve a document from an index by its ID.
  4. es.delete: Used to delete a document from an index by its ID.
  5. es.update: Used to update a document in an index.
  6. es.count: Used to count the number of documents that match a query without returning the actual documents.
  7. es.exists: Used to check if a document exists in an index.
  8. es.bulk: Used to execute multiple index, update, or delete requests in a single HTTP request.
  9. es.create: Used to index a document with an explicit ID, failing if a document with that ID already exists. (A new index is created with es.indices.create.)
  10. es.indices.delete: Used to delete an existing index.
  11. es.indices.get_mapping: Used to retrieve the mapping of an index.
  12. es.indices.put_mapping: Used to define or update the mapping of an index.
  13. es.cluster.health: Used to retrieve information about the health of the Elasticsearch cluster.
  14. es.cluster.state: Used to retrieve the current state of the Elasticsearch cluster.
  15. es.nodes.info: Used to retrieve information about the nodes in the Elasticsearch cluster.
  16. es.nodes.stats: Used to retrieve statistics about the nodes in the Elasticsearch cluster.
  17. es.termvectors: Used to retrieve information about the terms in a document.
  18. es.mtermvectors: Used to retrieve information about the terms in multiple documents.
  19. es.explain: Used to explain how a particular document matches a query.
  20. es.mget: Used to retrieve multiple documents from an index by their IDs.
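For example, es.bulk (item 8) sends many operations in one request; with the Python client, the elasticsearch.helpers.bulk helper takes a list of action dictionaries. A sketch of building such a list (the index and field names are illustrative):

```python
# Actions for elasticsearch.helpers.bulk: each dict names the target index
# and carries the document under "_source".
# (Sent with: from elasticsearch import Elasticsearch, helpers
#             helpers.bulk(es, actions))
test_results = [
    ("login_test",  "failed", "Invalid credentials"),
    ("signup_test", "passed", ""),
    ("logout_test", "passed", ""),
]

actions = [
    {
        "_index": "test_logs_sachin",
        "_source": {"test_name": name, "status": status, "error_message": err},
    }
    for name, status, err in test_results
]

print(len(actions))                     # 3
print(actions[0]["_source"]["status"])  # failed
```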

Example: Python code that uses the Elasticsearch Python client to index a test log report into an Elasticsearch index:

Pre-requisite: Elasticsearch Python client installed (pip install elasticsearch)

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

# cat my_test_dataToES.py

#Let me create a sample test report data and push it to ElasticsearchDB

from datetime import datetime
from elasticsearch import Elasticsearch
# create an Elasticsearch client instance
es = Elasticsearch("http://sachin.bengaluru.com:9200")

# create an Elasticsearch index to store the test log report
es.indices.create(index='test_logs_sachin', ignore=400)  # ignore=400 suppresses the error if the index already exists

# define the test log report as a Python dictionary
test_log = {
    'test_name': 'login_test',
    'status': 'failed',
    'error_message': 'Invalid credentials',
    'timestamp': datetime.now()
}
# index the test log report into the Elasticsearch index
es.index(index='test_logs_sachin', body=test_log)
#

Execute the Python script:

# python3 my_test_dataToES.py

#

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Check the Kibana dashboard and create an index pattern for the new index ==> Create index pattern


Visualizing testing results can help you to quickly and easily identify patterns and trends in the data, and can provide valuable insights into the performance and quality of the application under test. Some useful visualizations:

1) Test Execution Progress: Creating a graph or chart that shows the progress of test execution over time can help you to identify trends in test case pass/fail rates, and to identify areas of the application that are not being adequately tested.

2) Test Case Pass/Fail Rates: Creating a graph or chart that shows the pass/fail rate for each test case can help you to quickly identify which test cases are passing and which are failing. This can help you to focus your testing efforts on the areas of the application that need the most attention.

3) Defect Density: Creating a graph or chart that shows the number of defects per unit of code can help you to identify areas of the application that are prone to defects and to identify patterns in the types of defects that are being found.

4) Test Execution Time: Creating a graph or chart that shows the execution time for each test case can help you to identify performance bottlenecks and to optimize test case design.

5) Test Automation: Creating a graph or chart that shows the percentage of test cases that are automated can help you to identify areas of the application that can benefit from test automation.

6) Test Coverage: Creating a graph or chart that shows how much of the application is being tested by your test suite can help you identify the areas that are not being covered and focus on increasing test coverage.
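Before building charts, it can help to sanity-check the underlying numbers. A minimal sketch that computes a few of the metrics above (pass/fail rate, average execution time, automation percentage) from hand-made test records:

```python
# Toy test-suite records (illustrative) and the summary numbers the charts
# above would plot.
results = [
    {"name": "login_test",  "status": "failed", "time_s": 2.4, "automated": True},
    {"name": "signup_test", "status": "passed", "time_s": 1.1, "automated": True},
    {"name": "report_test", "status": "passed", "time_s": 8.9, "automated": False},
]

passed = sum(r["status"] == "passed" for r in results)
pass_rate = passed / len(results)
avg_time = sum(r["time_s"] for r in results) / len(results)
automation_pct = 100 * sum(r["automated"] for r in results) / len(results)

print(f"pass rate: {pass_rate:.0%}")           # pass rate: 67%
print(f"avg execution time: {avg_time:.1f}s")  # avg execution time: 4.1s
print(f"automated: {automation_pct:.0f}%")     # automated: 67%
```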

-----------------------------------------------

NOTE: 

Arrays: https://www.elastic.co/guide/en/elasticsearch/reference/current/array.html

In Elasticsearch, there is no dedicated array data type. Any field can contain zero or more values by default, however, all values in the array must be of the same data type. For instance:

an array of strings: [ "one", "two" ]

an array of integers: [ 1, 2 ]

an array of arrays: [ 1, [ 2, 3 ]] which is the equivalent of [ 1, 2, 3 ]

an array of objects: [ { "name": "Mary", "age": 12 }, { "name": "John", "age": 10 }]

Arrays of objects: Arrays of objects do not work as you would expect: you cannot query each object independently of the other objects in the array. If you need to be able to do this then you should use the nested data type instead of the object data type. This is explained in more detail in Nested.
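A sketch of a mapping that declares such an array as nested, so each object in it can be queried independently (the index and field names are hypothetical; with the Python client the mapping would be passed to es.indices.create):

```python
# Mapping that stores each entry of the "steps" array as its own hidden
# sub-document, so a query can match name AND status within a single step.
# (Applied with: es.indices.create(index="test_logs_nested", body=mapping))
mapping = {
    "mappings": {
        "properties": {
            "test_name": {"type": "keyword"},
            "steps": {
                "type": "nested",  # instead of the default "object"
                "properties": {
                    "name": {"type": "keyword"},
                    "status": {"type": "keyword"},
                },
            },
        }
    }
}

print(mapping["mappings"]["properties"]["steps"]["type"])  # nested
```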

Detailed insights from test result logs :

1) Root Cause Analysis: By analyzing the logs of failed test cases, you can identify the root cause of the failure, such as an issue with the application under test, a problem with the test case design, or an environment issue. This can help you to quickly fix the problem and prevent similar issues in the future.

2) Correlation Analysis: By analyzing the logs of multiple test cases, you can identify patterns and correlations between test results, such as the relationship between test execution time and the number of defects found. This can help you to identify areas of the application that are prone to defects and to optimize test case design.

3) Regression Analysis: By analyzing the logs of test cases that have been executed over time, you can identify trends in test case pass/fail rates and identify areas of the application that are not being adequately tested. This can help you to focus your testing efforts on the areas of the application that need the most attention.

4) Log Parsing: By parsing the logs, you can extract relevant information such as test case name, status, execution time, error messages, and stack trace. This information can be further analyzed to identify trends and patterns that can help improve the testing process.

5) Anomaly Detection: By analyzing the logs, you can identify anomalies or unexpected behavior in the test results. This can help you to identify potential issues with the application under test and to quickly fix them before they become major problems.

6) Machine Learning: You can use machine learning techniques such as clustering, classification, or prediction to analyze test result logs. This can help you to identify patterns and insights that would be difficult to discover manually.

7) Natural Language Processing: By using NLP techniques, you can extract useful information from unstructured test result logs. This information can be used to identify patterns and insights that would be difficult to discover manually.

These techniques can be implemented using machine learning libraries and frameworks such as scikit-learn, TensorFlow, or PyTorch. It is also important to understand the data and to clean and preprocess it before training a model.
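As a minimal illustration of the anomaly-detection idea, a plain z-score check on execution times (the numbers are synthetic, not output from any real test suite):

```python
import statistics

def flag_anomalies(times, threshold=2.0):
    """Flag execution times more than `threshold` standard deviations
    from the mean of the sample."""
    mean = statistics.mean(times)
    stdev = statistics.stdev(times)
    return [t for t in times if stdev and abs(t - mean) / stdev > threshold]

# Nine ordinary runs around 1.2s and one 9.8s outlier (synthetic data).
exec_times = [1.1, 1.2, 1.3, 1.2, 1.1, 1.2, 1.3, 1.2, 1.1, 9.8]
print(flag_anomalies(exec_times))  # [9.8]
```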
