УДК 004.415.538

A model-based testing approach using coloured Petri nets for distributed quorum-based systems

Лашбаев Мадияр Муратович – магистрант Казахстанско-Британского технического университета (Республика Казахстан, Алматы).

Abstract: The aim of the research is to develop an effective testing method that allows to detect errors and deficiencies in quorum-based distributed systems, and to guarantee their reliability and consistency. For this purpose, simulation using coloured Petri nets is used, which allows to abstract the complexity of the system and to carry out formal verification.

As part of the research, a model-based testing approach has been developed which involves simulating the behaviour of distributed systems using coloured Petri nets and generating test scenarios based on these models.

In order to evaluate the effectiveness of the proposed approach, a number of experiments were conducted on different distributed quorum-based systems. The results show that modelling using coloured Petri nets allows the identification of potential problems and vulnerabilities in the systems, leading to their improvement and increased reliability.

The proposed model-based testing approach using coloured Petri nets is a promising tool to ensure the reliability and consistency of quorum-based distributed systems. It can be applied to a variety of applications where distributed systems need to perform reliably and securely.

Аннотация: Целью исследования является разработка эффективного метода тестирования, позволяющего обнаружить ошибки и недостатки в распределенных системах на основе кворума, а также гарантировать их надежность и согласованность. Для этого используется моделирование с помощью раскрашенных сетей Петри, что позволяет абстрагироваться от сложности системы и провести формальную верификацию.

В рамках исследования был разработан подход к тестированию на основе моделей, который предполагает моделирование поведения распределенных систем с помощью цветных сетей Петри и генерацию тестовых сценариев на основе этих моделей.

Для того чтобы оценить эффективность предложенного подхода, был проведен ряд экспериментов на различных распределенных системах, основанных на кворуме. Результаты показывают, что моделирование с использованием цветных сетей Петри позволяет выявить потенциальные проблемы и уязвимости в системах, что приводит к их улучшению и повышению надежности.

Предложенный подход к тестированию на основе моделей с использованием цветных сетей Петри является перспективным инструментом для обеспечения надежности и согласованности распределенных систем на основе кворума. Он может быть применен в различных приложениях, где распределенные системы должны работать надежно и безопасно.

Keywords: Model-Based testing, coloured Petri nets, Gorums, quorum function, Paxos consensus protocol.

Ключевые слова: тестирование на основе моделей, цветные сети Петри, Горумс, функция кворума, протокол консенсуса Paxos.

Introduction

Distributed quorum-based systems play an important role in today's information infrastructure by ensuring reliability, resilience and consistency of data. However, the development and testing of such systems are complex tasks that require special attention to detect errors and flaws that can lead to failures or malfunctions.

In recent years, the use of models to test distributed systems has become a popular approach. However, existing modelling and verification techniques are not always able to deal effectively with the complexity of quorum-based distributed systems.

We hypothesise that the use of models based on coloured Petri nets will identify potential problems and vulnerabilities in quorum-based distributed systems, and ensure that they work correctly. We will conduct a number of experiments to evaluate the effectiveness of the proposed approach and compare it with existing testing methods.

The contribution of this paper is to propose an Model-Based testing (MBT) approach using coloured Petri nets (CPNs) for distributed quorum-based applications implemented using the Gorums framework. To illustrate the application of our approach, we show in detail how it can be used in a Gorums-based implementation of a single write, multi read distributed storage system. The distributed storage system is implemented with read quorum calls and write quorum calls that clients can use to access the distributed storage. Distributed storage can return multiple responses to a quorum call. To simplify client access to the storage, Gorums uses a user-defined quorum function to merge responses into a single response, which can then be returned to the client. For this particular storage system, we use majority quorum. By developing a CPN model of such distributed storage, we can generate test cases consisting of read and write quorum calls that test an implementation of the Gorums framework. For evaluation, we report the results obtained on the distributed storage system of the system, and present the results obtained on a more complex example in the form of the Paxos consensus protocol [5].

Methodology and Experiment

Distributed algorithms are usually used to implement replicated services, and they rely on quorum system [8] to achieve fault tolerance. That is, in order to access a replicated state, a process has to contact only a quorum, e.g. most of the processes. In this way, the system can provide services despite the failure of individual processes. However, the interaction and handling of responses from multiple processes often complicates the implementation of the protocol. Gorums [3] was developed to facilitate development efforts to build advanced distributed algorithms such as Paxos and distributed storage [2].

The Gorums framework reduces the complexity of implementing distributed quorum-based systems by providing two main abstractions: (a) a flexible and simple quorum call abstraction that is used to communicate with a set of processes and collect their responses, and (b) a quorum function abstraction that is used to process responses. These abstractions help to simplify the basic flow of control in protocol implementations. Figure 1 illustrates the interaction between the basic abstractions provided by Gorums. Gorums consists of a runtime library and a code generator that extends the remote gRPC procedure [10].

The test cases created for quorum functions are unit tests, while the test cases created for quorum calls are system tests consisting of simultaneous and alternating read and write quorum calls. The latter case tests both the implementation of quorum calls and the implementation of the Gorums framework. In addition to the test cases, we also generate a test oracle for each test case to determine if the test passed.

Modular tests for quorum functions

Quorum Function Module Tests are a software testing method aimed at checking individual modules or components responsible for quorum-related operations in the context of distributed systems.

Quorum is a mechanism used in distributed systems to ensure data consistency and reliability. It relies on agreement among a subset of nodes or servers. Quorum functions include operations such as writing data, reading data and making decisions through voting.

Unit tests for quorum functions aim to verify that the individual modules or components responsible for processing the specified operations operate correctly. These tests isolate and test each quorum function in the monitored environment to ensure that they function correctly and meet the requirements and expectations.

In developing unit tests for quorum functions, their authors consider different scenarios and possible variants of operations. Examples of such tests may include verifying the processing of various combinations of votes in the system, as well as verifying correct voting and decision-making, and ensuring data consistency between the nodes involved in the quorum.

The test cases are created based on the detection of transition cases. This is done in the same way for both detection approaches. Specifically, we rely on a detection function which should take the value true whenever a certain transition is detected. When this happens, the generator function is called to generate a real test case. The state space for the distributed storage service CPN test model is relatively small and we can obtain all test cases based on the discovery in the state space[2]. The Paxos consensus protocol is more complex, so we rely on model-based discovery to generate test cases.

1

Listing 1. Example of generated test cases for read quorum function.

Listing 1 shows an example of how our test cases are represented using XML. The test case for the ReadQF quorum function corresponds to two responses (one with value 0 and timestamp 0, and one with value 42 and timestamp 1). With three servers, this constitutes quorum, and the value returned by the quorum function should be 42 with a timestamp of 1.

System Tests of Quorum Calls

The generation of test cases and expected results is based on the QuorumCalls substitution transition submodule. This module acts as a test driver for the system, defining scenarios for read and write quorum calls to the underlying quorum system. By varying this module, different read and write quorum call scenarios can be generated.

Figure 1 shows an example of a test driver in which a client makes one read and one write quorum call, which is simulated by an InvokeRDWR transition. After completing these two calls, the server fails and a new read and write call is invoked (modelled by the InvokeRDWRFailures transition). Server failures are modelled by a colour change of the server tokens in the ServerStatus place which is used along with the ServerStatus place in the HandleWriteRequest. Each quorum call has a unique identifier (1, 2, 3 and 4) to identify the call. Each write call also has a value (in this case 42 and 7) to write to the allocated storage.

2

Figure 1. The QuorumCalls module.

To make test case generation independent of the particular test driver module, we use the fact that read and write quorum calls made at the execution time of the CPN model can be observed as tokens at the Read and Write socket locations. When there is a READ token READINVOKED(i) for some integer i at the READ location, it means that a read quorum call identified by i has been invoked. When the read quorum call completes, a READRESULT(i,v) token will appear at the Read location, where v is the value read by the Read call. call. Calling and completing write quorum calls can be identified in a similar way by looking at tokens with the colours WRITEINVOKED(i,v) and WRITERESULT(i,b) in place of Write, where the boolean value b indicates whether or not the value v has been written.

3

Listing 2. Example of a generated test cases for the concurrent and sequential execution of read and write calls.

Based on this, we can generate XML test cases defining both simultaneous and sequential execution. simultaneous and sequential execution of read and write calls. Listing 2 (discussed below) shows an example where a read and a write are initiated first and a new read call is initiated after these two calls are completed.

Results on Distributed Storage

To evaluate our test case generation based on the model, we examine the code obtained with different test drivers for parallel and sequential execution of quorum calls in the client application. Our experimental evaluation includes both successful scenarios and scenarios involving server failures and programming errors.

Table 1. Experimental results for distributed storage – successful scenarios.

Test driver

Test case generation

Test case execution

System

Unit

ID

Name

Nodes

Arcs

Time

(sec)

QC

QF

Gorums library

QCs

QFs

RD

WR

RD

WR

S1

RD

39

72

<1

1

3

24.6

84.4

0

100

0

S2

WR

39

72

<1

1

3

24.6

0

84.4

0

100

S3

RD,WR

254

543

<1

1

7

39.1

84.4

84.4

100

100

The table shows number of nodes/arc in the CPN model state space of the model with this test driver, state space and test case generation time in seconds, number of test cases generated for quorum calls (QC), number of test cases generated for quorum functions (QF). For test case execution, we show the code coverage (in percentage) that was obtained for system level and unit tests. The results for successful execution scenarios show that the operator coverage for read (RD-QF) and write (WR-QF) quorum functions is 100% for both system-level and modular tests, provided that read and write calls are involved. Operator coverage for read (RD-QC) and write (WR-QC) quorum functions reaches 84.4%.

The above test cases confirm that the distributed storage and Gorums framework implementation works correctly when there is no server failure.

Conclusion

The main contribution of our work is the MBT approach, which can be used to test quorum distributed systems implemented with Gorums. Our approach includes simulation templates, test case generation algorithms and test case execution infrastructure. As case studies, we have applied our approach to a distributed storage system and Paxos implementation to illustrate and evaluate its applicability. The results are promising, as we achieved high code coverage by considering both normal execution scenarios and failure scenarios. Furthermore, the results were obtained using relatively simple test drivers and a small number of test cases. We have shown that in addition to obtaining code coverage results, the unit and system tests we have generated are capable of detecting programming errors.

The work presented in this paper opens up several avenues for future work. We have obtained good coverage results for quorum functions and quorum calls using the current testing model, considering both successful execution of scenarios and scenarios involving server failures and programming errors. However, to increase coverage and account for more Gorums code paths, we need to test quorum calls under additional failure scenarios and adverse conditions, such as network errors. This will require extensions to the model, for example to generate timeouts, which in turn need to be written into the test cases. This in turn will require extending the test adapter so that the environment can reproduce these events during the execution of the test cases.

References

  1. Apache Software Foundation. Apache Cassandra. http://cassandra.apache.org
  2. CPN Tools. CPN Tools. http://www.cpntools.org 5.
  3. CPN Testing Model for Gorum-based Distributed Storage, July 2018. http://home. hib.no/ansatte/lmkr/DistributedStorage.xml.
  4. Jorgensen, P.: The Craft of Model-Based Testing. CRC Press, Boca Raton (2017).
  5. Lea, T.E., Jehl, L., Meling, H.: Towards new abstractions for implementing quorum-based systems. In: Proceedings of 37th IEEE International Conference on Distributed Computing Systems (ICDCS), pp. 2380–2385 (2017).
  6. . Kristensen, L.M., Veiset, V.: Transforming CPN models into code for TinyOS: a case study of the RPL protocol. In: Kordon, F., Moldt, D. (eds.) PETRI NETS 2016. LNCS, vol. 9698, pp. 135–154. Springer, Cham (2016). https://doi.org/10. 1007/978-3-319-39086-4 10.
  7. Google Inc., Protocol Buffers. http://developers.google.com/protocol-buffers.
  8. Artho, Q. Gros, G. Rousset, K. Banzai, L. Ma, T. Kitamura, M. Hagiya, Y. Tanabe, and M. Yamamoto, “Model-based API testing of Apache ZooKeeper,” in 2017 IEEE Intl. Conf. on Software Testing, Verification and Validation (ICST). IEEE, 2017, pp. 288–298.
  9. M. Rojas, J. Campos, M. Vivanti, G. Fraser, and A. Arcuri, “Combining multiple coverage criteria in search-based unit test generation,” in Search-Based Software Engineering, M. Barros and Y. Labiche, Eds. Springer, 2015, pp. 93–108.
  10. Google Inc. gRPC Remote Procedure Calls. http://www.grpc.io.

Интересная статья? Поделись ей с другими: