Achour Mostefaoui | Researchain

Publication

Featured research published by Achour Mostefaoui.

International Symposium on Distributed Computing | 1999

Solving Consensus Using Chandra-Toueg's Unreliable Failure Detectors: A General Quorum-Based Approach

Achour Mostefaoui; Michel Raynal

This paper addresses the Consensus problem in asynchronous distributed systems (made up of n processes, at most f of which may crash) equipped with unreliable failure detectors. A generic Consensus protocol is presented: it is quorum-based and works with any failure detector belonging to the class S (provided that f ≤ n - 1) or to the class ⋄S (provided that f < n/2). This quorum-based generic approach for solving the Consensus problem is new (to our knowledge). Moreover, the proposed protocol is conceptually simple, allows early decision, and uses shorter messages than previous solutions. The generic dimension and the surprising design simplicity of the proposed protocol provide a better understanding of the basic algorithmic structures and principles that make it possible to solve the Consensus problem with the help of unreliable failure detectors.
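
The f < n/2 requirement for the class ⋄S is commonly explained by a counting argument: if every process waits for messages from n - f processes, any two such sets must share a process exactly when 2(n - f) > n. The following is a minimal sketch of that argument only, not of the protocol itself. The function names are hypothetical, and the assumption that a quorum is any set of n - f processes is ours, since the abstract does not define its quorums.

```python
from itertools import combinations


def quorum_size(n: int, f: int) -> int:
    """Assumed quorum size: a process waits for n - f messages."""
    return n - f


def quorums_always_intersect(n: int, f: int) -> bool:
    """Brute-force check that every pair of (n - f)-sized quorums shares a process."""
    q = quorum_size(n, f)
    quorums = [set(c) for c in combinations(range(n), q)]
    # all() over an empty sequence is True, which covers the single-quorum case f = 0.
    return all(a & b for a, b in combinations(quorums, 2))
```

With n = 5 and f = 2 the check succeeds, while with n = 4 and f = 2 it fails because {0, 1} and {2, 3} are disjoint. This matches the closed form 2(n - f) > n, which is equivalent to f < n/2.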

Dependable Systems and Networks | 2003

Asynchronous implementation of failure detectors

Achour Mostefaoui; Eric Mourgaya; Michel Raynal

Unreliable failure detectors, introduced by Chandra and Toueg, are abstract mechanisms that provide information on process failures. On the one hand, failure detectors make it possible to state the minimal requirements on process failures under which problems that cannot be solved in purely asynchronous systems become solvable. On the other hand, they cannot be implemented in such systems: their implementation requires that the underlying distributed system be enriched with additional assumptions. The usual failure detector implementations rely on additional synchrony assumptions (e.g., partial synchrony). This paper proposes a new look at the implementation of failure detectors and more specifically at Chandra-Toueg's failure detectors. The proposed approach does not rely on synchrony assumptions (e.g., it allows the communication delays to always increase). It is based on a query-response mechanism and assumes that the query/response messages exchanged obey a pattern where the responses from some processes to a query arrive among the first n − f ones (n being the total number of processes, f the maximum number of them that can crash, with 1 ≤ f < n). When we consider the particular case f = 1 and the implementation of a failure detector of the class denoted ⋄S (the weakest class that makes it possible to solve the consensus problem), the additional assumption the underlying system has to satisfy boils down to a simple channel property: there is eventually a pair of processes (p_i, p_j) such that the channel connecting them is never the slowest among the channels connecting p_i or p_j to the other processes. A probabilistic analysis shows that this requirement is practically met in asynchronous distributed systems.
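
To make the "first n − f responses" pattern concrete, here is a small sketch that, given hypothetical arrival times for the responses to one query, returns the processes whose responses are among the first n − f. The function name, the process labels, and the idea of representing arrivals as a dictionary of timestamps are illustrative assumptions. The paper's actual detector logic is not described in the abstract and is not reproduced here.

```python
def winning_responders(arrival_times: dict[str, float], n: int, f: int) -> set[str]:
    """Return the processes whose responses are among the first n - f to arrive.

    arrival_times maps a (hypothetical) process name to the time its response
    to a single query was received.
    """
    if not 1 <= f < n:
        raise ValueError("requires 1 <= f < n")
    # Sort responders by arrival time and keep the earliest n - f.
    ordered = sorted(arrival_times, key=arrival_times.get)
    return set(ordered[: n - f])
```

For example, with n = 4, f = 1 and arrival times p1: 0.0, p2: 3.0, p3: 1.5, p4: 9.0, the result is {p1, p2, p3}: the querier stops waiting after three responses, however slow the fourth channel is.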

Parallel Processing Letters | 2001

Leader-Based Consensus

Achour Mostefaoui; Michel Raynal

It is now well recognized that consensus is a fundamental problem one has to solve to implement reliable applications on top of unreliable asynchronous distributed systems prone to failures. It has been shown that this problem cannot be solved if the underlying asynchronous system does not satisfy additional assumptions. This paper presents a new consensus protocol based on a leader oracle (denoted Ω in the literature). Although this protocol uses asynchronous rounds, it is not based on the rotating coordinator paradigm. As a consequence, it does not suffer from drawbacks inherent to ⋄S-based consensus protocols that explicitly use this paradigm. As Ω and ⋄S are equivalent, the proposed protocol does not require assumptions stronger or weaker than the ones abstracted in ⋄S. Hence, it also requires f < n/2 (where n is the number of processes and f an upper bound on the number of processes that may crash). From a design point of view, the proposed protocol is surprisingly simple. From an efficiency point of view, it allows the processes to agree in a single round when the oracle provides the processes with the same leader (a common case in practice). It is also shown that the time and message costs of the protocol can be reduced when f < n/3. Moreover, when, in addition to the leader oracle, the system is equipped with a random oracle, the proposed protocol can be extended to provide a hybrid consensus protocol at no additional message cost.

IEEE International Symposium on Fault-Tolerant Computing | 1997

A communication-induced checkpointing protocol that ensures rollback-dependency trackability

Roberto Baldoni; Jean-Michel Hélary; Achour Mostefaoui; Michel Raynal

Considering an application in which processes take local checkpoints independently (called basic checkpoints), this paper develops a protocol that forces them to take some additional local checkpoints (called forced checkpoints) so that the resulting checkpoint and communication pattern satisfies the Rollback Dependency Trackability (RDT) property. This property states that all dependencies between local checkpoints are on-line trackable by using a transitive dependency vector. Compared to other protocols ensuring the RDT property, the proposed protocol is less conservative in the sense that it takes fewer additional local checkpoints. It attains this goal by a subtle tracking of causal dependencies on already taken checkpoints; this tracking is then used to prevent the occurrence of hidden dependencies. As indicated by a simulation study, the proposed protocol compares favorably with other protocols; moreover, it associates on-the-fly with each local checkpoint C the minimum global checkpoint to which C belongs.
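
As an illustration of what a transitive dependency vector is, the sketch below shows the usual bookkeeping: each process keeps one entry per process, piggybacks the vector on every message, and merges received vectors with a component-wise maximum. This is a generic sketch under our own conventions (class and method names are hypothetical, interval indices start at 0). It deliberately omits the rule that decides when a forced checkpoint is taken, which the abstract does not spell out.

```python
class Process:
    """Hypothetical process that tracks checkpoint dependencies with a vector."""

    def __init__(self, pid: int, n: int) -> None:
        self.pid = pid
        # tdv[j]: highest checkpoint interval of process j that the current
        # state of this process (transitively) depends on.
        self.tdv = [0] * n

    def take_checkpoint(self) -> list[int]:
        """Save the current vector with the checkpoint, then open a new interval."""
        saved = list(self.tdv)
        self.tdv[self.pid] += 1
        return saved

    def send(self) -> list[int]:
        """Return the vector to piggyback on an outgoing message."""
        return list(self.tdv)

    def receive(self, piggybacked: list[int]) -> None:
        """Merge the sender's dependencies into the local vector."""
        self.tdv = [max(a, b) for a, b in zip(self.tdv, piggybacked)]
```

If p0 takes a checkpoint and then sends a message to p1, which in turn sends a message to p2, the vector of p2 becomes [1, 0, 0]: p2 has learned of its dependency on p0 without ever communicating with it directly.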

Distributed Computing | 2000

Communication-based prevention of useless checkpoints in distributed computations

Jean-Michel Hélary; Achour Mostefaoui; Robert H. B. Netzer; Michel Raynal

Summary. A useless checkpoint is a local checkpoint that cannot be part of a consistent global checkpoint. This paper addresses the following problem. Given a set of processes that take (basic) local checkpoints in an independent and unknown way, the problem is to design communication-induced checkpointing protocols that direct processes to take additional local (forced) checkpoints to ensure no local checkpoint is useless. The paper first proves two properties related to integer timestamps which are associated with each local checkpoint. The first property is a necessary and sufficient condition that these timestamps must satisfy for no checkpoint to be useless. The second property provides an easy timestamp-based determination of consistent global checkpoints. Then, a general communication-induced checkpointing protocol is proposed. This protocol, derived from the two previous properties, actually defines a family of timestamp-based communication-induced checkpointing protocols. It is shown that several existing checkpointing protocols for the same problem are particular instances of the general protocol. The design of this general protocol is motivated by the use of communication-induced checkpointing protocols in “consistent global checkpoint”-based distributed applications such as the detection of stable or unstable properties and the determination of distributed breakpoints.

Symposium on Reliable Distributed Systems | 1998

Consensus in asynchronous systems where processes can crash and recover

Michel Hurfin; Achour Mostefaoui; Michel Raynal

The consensus problem is now well identified as being one of the most important problems encountered in the design and the construction of fault-tolerant distributed systems. This problem is defined as follows: processes have to reach a common decision, which depends on their inputs, despite failures. We consider the consensus problem in asynchronous distributed systems augmented with unreliable failure detectors. Several protocols have been proposed for these systems, when process crashes are assumed to be definitive. This paper addresses the consensus problem in a more practical asynchronous system model, namely in a context where processes can crash and recover. As a process crash entails the loss of its volatile memory, each process is equipped with stable storage. So, to be efficient, a consensus protocol has to log as little critical data as possible. The proposed protocol uses a new class of failure detectors suited to the crash/recovery model. It is particularly efficient when, whether there are crashes or not, the underlying failure detector makes few mistakes. Additionally, the proposed protocol tolerates message duplication and copes with some message losses.

Principles of Distributed Computing | 2000

k-set agreement with limited accuracy failure detectors

Achour Mostefaoui; Michel Raynal

Let the scope of the accuracy property of an unreliable failure detector be the number x of processes that may not suspect a correct process. The scope notion gives rise to new classes of failure detectors among which we consider S_x and ⋄S_x in this paper (usual failure detectors consider an implicit scope equal to n, the total number of processes). The k-set agreement problem generalizes the consensus problem: each correct process has to decide a value in such a way that a decided value is a proposed value, and the number of decided values is bounded by k. There exist protocols that solve this problem in asynchronous distributed systems when f < k (where f is the maximum number of processes that may crash). Moreover, it has been shown that there is no solution in those systems when f ≥ k. The paper considers asynchronous distributed systems equipped with limited scope accuracy failure detectors. It studies conditions on n, f, k and x that make it possible to solve the k-set agreement problem in those systems and presents two protocols. The first protocol solves the k-set agreement problem in asynchronous distributed systems augmented with a failure detector of the class S_x. It requires f < k + x - 1. The second protocol works with any failure detector of the class ⋄S_x. It actually defines a family of protocols. This family makes it possible to solve the k-set agreement problem when f < max(k, max_{1≤α≤k}(min(n - α⌊n/(α + 1)⌋, α + x - 1))). We conjecture that, when f ≥ k, these conditions are necessary to solve the k-set agreement problem in asynchronous distributed systems equipped with failure detectors of the class S_x or ⋄S_x, respectively.
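
The two resilience conditions stated in the abstract are easy to evaluate numerically. The sketch below computes the right-hand side of each inequality. The function names are hypothetical, and the code only evaluates the bounds as written above, without implementing either protocol.

```python
def rhs_S_x(k: int, x: int) -> int:
    """Right-hand side of the S_x condition f < k + x - 1."""
    return k + x - 1


def rhs_diamond_S_x(n: int, k: int, x: int) -> int:
    """Right-hand side of the diamond-S_x condition

    f < max(k, max over 1 <= a <= k of min(n - a * floor(n / (a + 1)), a + x - 1)).
    """
    inner = max(
        min(n - a * (n // (a + 1)), a + x - 1)
        for a in range(1, k + 1)
    )
    return max(k, inner)
```

For n = 10, k = 2 and x = 5, the first bound is 6 and the second is 5, so the conditions read f < 6 and f < 5 respectively. As a sanity check, with k = 1 and x = n = 7 the second bound is 4, i.e. f < 4, which coincides with the usual f < n/2 requirement for consensus.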

Symposium on Reliable Distributed Systems | 2005

From static distributed systems to dynamic systems

Achour Mostefaoui; Michel Raynal; Corentin Travers; Stacy Patterson; Divyakant Agrawal; Amr El Abbadi

A noteworthy advance in distributed computing is due to the recent development of peer-to-peer systems. These systems are essentially dynamic in the sense that no process can obtain global knowledge of the system structure. They mainly allow processes to look up data that can be dynamically added to or removed from a permanently evolving set of nodes. Although protocols have been developed for such dynamic systems, to our knowledge no computation model for dynamic systems has been proposed to date. Nevertheless, there is a strong demand for the definition of such models as soon as one wants to develop provably correct protocols suited to dynamic systems. This paper proposes a model for (a class of) dynamic systems. That dynamic model is defined by (1) a parameter (an integer denoted α) and (2) two basic communication abstractions (query-response and persistent reliable broadcast). The new parameter is a threshold value introduced to capture the liveness part of the system (it is the counterpart of the minimal number of processes that do not crash in a static system). To show the relevance of the model, the paper adapts an eventual leader protocol designed for the static model, and proves that the resulting protocol is correct within the proposed dynamic model. In that sense, the paper also has a methodological flavor, as it shows that simple modifications to existing protocols can allow them to work in dynamic systems.

IEEE Transactions on Computers | 2002

A versatile family of consensus protocols based on Chandra-Toueg's unreliable failure detectors

Michel Hurfin; Achour Mostefaoui; Michel Raynal

This paper is on consensus protocols for asynchronous distributed systems prone to process crashes, but equipped with Chandra-Toueg's (1996) unreliable failure detectors. It presents a unifying approach based on two orthogonal versatility dimensions. The first concerns the class of the underlying failure detector. An instantiation can consider any failure detector of the class S (provided that at least one process does not crash), or ⋄S (provided that a majority of processes do not crash). The second versatility dimension concerns the message exchange pattern used during each round of the protocol. This pattern (and, consequently, the round message cost) can be defined for each round separately, varying from O(n) (centralized pattern) to O(n²) (fully distributed pattern), n being the number of processes. The resulting versatile protocol has nice features and actually gives rise to a large and well-identified family of failure detector-based consensus protocols. Interestingly, this family includes at once new protocols and some well-known protocols (e.g., Chandra-Toueg's ⋄S-based protocol). The approach is also interesting from a methodological point of view. It provides a precise characterization of the two sets of processes that, during a round, have to receive messages for a decision to be taken (liveness) and for a single value to be decided (safety), respectively. Interestingly, the versatility of the protocol is not restricted to failure detectors: a simple timer-based instance provides a consensus protocol suited to partially synchronous systems.

IEEE Transactions on Parallel and Distributed Systems | 2000

Computing global functions in asynchronous distributed systems with perfect failure detectors

Jean-Michel Hélary; Michel Hurfin; Achour Mostefaoui; Michel Raynal; Frédéric Tronel

A Global Data is a vector with one entry per process. Each entry must be filled with an appropriate value provided by the corresponding process. Several distributed computing problems amount to computing a function on a global data. This paper proposes a protocol to solve such problems in the context of asynchronous distributed systems where processes may fail by crashing. The main problem that has to be solved lies in computing the global data and in providing each noncrashed process with a copy of it, despite the possible crash of some processes. To be consistent, the global data must contain, at least, all the values provided by the processes that do not crash. This defines the Global Data Computation (GDC) problem. To solve this problem, processes execute a sequence of asynchronous rounds during which they construct, in a decentralized way, the value of the global data, and eventually each process gets a copy of it. To cope with process crashes, the protocol uses a perfect failure detector. The proposed protocol has been designed to be time efficient: it allows early decision. Let t be the maximum number of processes that may crash (t < n, where n is the total number of processes) and f be the actual number of process crashes (f ≤ t). In the worst case, the protocol terminates in min(2f + 2, t + 1) rounds. Moreover, the protocol does not require processes to exchange information on their perception of crashes. The message size depends only on the size of the global data.
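
The early-decision bound can be tabulated directly from the formula in the abstract. The helper below is only an evaluation of min(2f + 2, t + 1); its name is hypothetical and it says nothing about how the protocol reaches that bound.

```python
def worst_case_rounds(f: int, t: int) -> int:
    """Worst-case number of rounds, min(2f + 2, t + 1), for f actual crashes out of at most t."""
    if not 0 <= f <= t:
        raise ValueError("requires 0 <= f <= t")
    return min(2 * f + 2, t + 1)
```

With t = 10, a failure-free run (f = 0) needs at most 2 rounds and a run with f = 2 crashes at most 6 rounds, whereas the t + 1 = 11 term only takes over once f ≥ 5. This is what "early decision" means here: the cost tracks the actual number of crashes rather than the tolerated maximum.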
