Consistency and Fault Tolerance of Distributed Storage Systems

Author: Kathrin Sobe

Publisher:

Published: 2013

Total Pages: 127

ISBN-13: 9783844258899

Benchmarking, Consistency, Distributed Database Management Systems, Distributed Systems, Eventual Consistency

Author: Bermbach, David

Publisher: KIT Scientific Publishing

Published: 2014-07-22

Total Pages: 202

ISBN-13: 3731501864

Cloud storage services and NoSQL systems typically offer only "Eventual Consistency", a rather weak guarantee covering a broad range of potential data consistency behavior. The degree of actual (in-)consistency, however, is unknown. This work presents novel solutions for determining the degree of (in-)consistency via simulation and benchmarking, as well as the necessary means to resolve inconsistencies leveraging this information.

Consistent Distributed Storage

Author: Vincent Gramoli

Publisher: Morgan & Claypool Publishers

Published: 2021-06-30

Total Pages: 194

ISBN-13: 1636390633

This is a presentation of several approaches for employing shared memory abstraction in distributed systems, a powerful tool for simplifying the design and implementation of software systems for networked platforms. These approaches enable system designers to work with abstract readable and writable objects without the need to deal with the complexity and dynamism of the underlying platform. The key property of shared memory implementations is the consistency guarantee that they provide under concurrent access to the shared objects. The most intuitive memory consistency model is atomicity because of its equivalence with a memory system where accesses occur serially, one at a time. Emulating shared atomic memory in distributed systems is an active area of research and development. The problem proves to be challenging, especially in distributed message-passing settings with unreliable components, as is often the case in networked systems. Several examples are provided for implementing shared memory services with the help of replication on top of message-passing distributed platforms subject to a variety of perturbations in the computing medium.

Consistent Distributed Storage

Author: Vincent Gramoli

Publisher: Springer Nature

Published: 2022-05-31

Total Pages: 176

ISBN-13: 3031020154

Providing a shared memory abstraction in distributed systems is a powerful tool that can simplify the design and implementation of software systems for networked platforms. This enables system designers to work with abstract readable and writable objects without the need to deal with the complexity and dynamism of the underlying platform. The key property of shared memory implementations is the consistency guarantee that they provide under concurrent access to the shared objects. The most intuitive memory consistency model is atomicity because of its equivalence with a memory system where accesses occur serially, one at a time. Emulating shared atomic memory in distributed systems is an active area of research and development. The problem proves to be challenging, especially in distributed message-passing settings with unreliable components, as is often the case in networked systems. We present several approaches to implementing shared memory services with the help of replication on top of message-passing distributed platforms subject to a variety of perturbations in the computing medium.

Protocol- and Situation-aware Distributed Storage Systems

Author: Ramnatthan Alagappan

Publisher:

Published: 2019

Total Pages: 220

ISBN-13:

We are dependent upon data in many aspects of our lives. Much of this data is stored and managed by distributed storage systems that run in data centers, powering many modern applications such as e-commerce, photo sharing, video streaming, search, social networking, messaging, collaborative editing, and even health-care and financial services. A distributed storage system stores copies of a piece of data on many nodes for fault tolerance: even when a few nodes fail, the system can still provide access to data. Each of these nodes depends upon a local storage stack to safely store and manage user data. The local storage stack is complex, consisting of many hardware and software components. Due to this complexity, the storage layer is a place for many potential problems to arise. This dissertation examines the reliability and performance challenges that arise at the interaction points between a distributed system and the local storage stack. In the first part of this thesis, we study how distributed storage systems react to storage faults: cases where the storage device may return corrupted data or errors. We focus on replicated state machine (RSM) systems, an important class of distributed systems. We find that none of the existing approaches used in current systems can safely handle storage faults, leading to data loss and unavailability. Using the insights gained in our study, we design corruption-tolerant replication (CTRL), a protocol-aware recovery approach for RSM systems. CTRL exploits protocol-specific knowledge of how RSM systems operate to ensure safety and high availability in the presence of storage faults without impacting performance. In the second part, we study the performance and reliability properties of replication protocols used by distributed systems. We find there exists a dichotomy with respect to how and where current approaches store system state. One approach writes data to the storage stack synchronously, whereas the other buffers the data in volatile memory. The choice of whether data is written synchronously to the storage device or not greatly influences the system's robustness to crash failures and its performance. We show that existing approaches provide either robustness to crashes or performance, but not both. Thus, we introduce situation-aware updates and crash recovery, a dynamic protocol that, depending upon the situation, writes either synchronously or asynchronously to the storage devices, achieving both strong reliability and high performance. In the final part of this thesis, we study the effects of file-system crash behaviors in distributed storage systems. We build the protocol-aware crash explorer (PACE), a tool that can model and reason about file-system crash behaviors in distributed systems under a special correlated crash failure scenario. Our study reveals that the correctness of update and recovery protocols of many distributed systems hinges upon how the local file-system state is updated by each replica. We perform a detailed analysis of the vulnerabilities, showing their serious consequences and prevalence on commonly used file systems. We finally point to possible solutions to the problems discovered.

Consistency-aware Durability

Author: Aishwarya Ganesan

Publisher:

Published: 2020

Total Pages: 0

ISBN-13:

Modern distributed storage systems are emerging as the primary choice for storing massive amounts of critical data that we generate today. A central goal of these systems is to ensure data durability, i.e., these systems must keep user data safe under all scenarios. To achieve high levels of durability, most modern systems store redundant copies of data on many machines. When a client wishes to update the data, the distributed system takes a set of actions to update these redundant copies, which we refer to as the system's durability model. At one end of the durability model spectrum, data is immediately replicated and persisted on many or all servers. While this immediate durability model offers strong guarantees, it suffers from poor performance. At the other end, data is only lazily replicated and persisted, eventually making it durable; this approach provides excellent performance but poor durability guarantees. The choice of durability model also influences what consistency models can be realized by the system. While immediate durability enables strong consistency, only weaker models can be realized upon eventual durability. Thus, in this dissertation, we seek to answer the following question: is it possible for a durability model to enable strong consistency guarantees, yet also deliver high performance? In the first part of this dissertation, we study the behavior of eight popular modern distributed systems and analyze whether they ensure data durability when the storage devices on the replicas fail partially, i.e., sometimes return corrupted data or errors. Our study reveals that redundancy does not provide fault tolerance; a single storage fault can result in catastrophic outcomes such as user-visible data loss, unavailability, and spread of corruption. In the second part, to address the fundamental tradeoff between consistency and performance, we propose consistency-aware durability (CAD), a new approach to achieving durability in distributed systems. The key idea behind CAD is to shift the point of durability from writes to reads. By delaying durability upon writes, CAD provides high performance; however, by ensuring the durability of data before serving reads, CAD enables the construction of strong consistency models. Finally, we introduce cross-client monotonic reads, a novel and strong consistency property that provides monotonic reads across failures and sessions. We show that this property can be efficiently realized upon CAD, while other durability models cannot enable this property with high performance. We also demonstrate the benefits of this new consistency model.
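
As a rough illustration of the idea described in this abstract (durability is deferred on writes and enforced before a read is served), the following toy sketch models a single node in Python. It is not taken from the dissertation: the class `ToyCadStore`, its fields, and the `crash` method are hypothetical names invented here, and the `durable` dictionary merely stands in for replicated, persisted state.

```python
class ToyCadStore:
    """Toy single-node model (hypothetical, for illustration only):
    writes are buffered in memory, reads persist the item first."""

    def __init__(self):
        self.volatile = {}  # acknowledged updates that are not yet durable
        self.durable = {}   # stand-in for replicated, persisted state
        self.flushes = 0    # how many times durability had to be enforced

    def write(self, key, value):
        # Acknowledge right away; the update is only buffered for now.
        self.volatile[key] = value

    def read(self, key):
        # Before serving the read, make any pending update for this key durable.
        if key in self.volatile:
            self.durable[key] = self.volatile.pop(key)
            self.flushes += 1
        return self.durable.get(key)

    def crash(self):
        # A crash loses everything that was never made durable.
        self.volatile.clear()


store = ToyCadStore()
store.write("k", "v1")
store.crash()               # "v1" was never read, so it may be lost
lost = store.read("k")      # None in this toy model

store.write("k", "v2")
seen = store.read("k")      # forces "v2" to become durable, then returns it
store.crash()
after_crash = store.read("k")  # a value that was once read survives the crash
```

In this sketch an update that nobody has read can disappear in a crash, while any value a reader has observed stays visible afterwards, which is the intuition behind monotonic reads across failures. A real system would replicate to multiple servers and persist to disk rather than move entries between two dictionaries.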

Fault Tolerance in Distributed Systems

Author: Pankaj Jalote

Publisher: Prentice Hall

Published: 1994

Total Pages: 456

ISBN-13:

Fault tolerance is an approach by which the reliability of a computer system can be increased beyond what can be achieved by traditional methods. Comprehensive and self-contained, this book explores the information available on software-supported fault tolerance techniques, with a focus on fault tolerance in distributed systems.

Design and Evaluation of Distributed Wide-area On-line Archival Storage Systems

Author: Hakim Weatherspoon

Publisher:

Published: 2006

Total Pages: 474

ISBN-13:

Replication Techniques in Distributed Systems

Author: Abdelsalam A. Helal

Publisher: Springer Science & Business Media

Published: 2005-12-29

Total Pages: 166

ISBN-13: 0306477963

Replication Techniques in Distributed Systems organizes and surveys the spectrum of replication protocols and systems that achieve high availability by replicating entities in failure-prone distributed computing environments. The entities discussed in this book vary from passive untyped data objects, to typed and complex objects, to processes and messages. Replication Techniques in Distributed Systems contains definitions and introductory material suitable for a beginner, theoretical foundations and algorithms, an annotated bibliography of commercial and experimental prototype systems, as well as short guides to recommended further readings in specialized subtopics. This book can be used as recommended or required reading in graduate courses in academia, as well as a handbook for designers and implementors of systems that must deal with replication issues in distributed systems.

Benchmarking Eventually Consistent Distributed Storage Systems

Author: David Bermbach

Publisher:

Published: 2020-10-09

Total Pages: 198

ISBN-13: 9781013280405

Cloud storage services and NoSQL systems typically offer only "Eventual Consistency", a rather weak guarantee covering a broad range of potential data consistency behavior. The degree of actual (in-)consistency, however, is unknown. This work presents novel solutions for determining the degree of (in-)consistency via simulation and benchmarking, as well as the necessary means to resolve inconsistencies leveraging this information. This work was published by Saint Philip Street Press pursuant to a Creative Commons license permitting commercial use. All rights not granted by the work's license are retained by the author or authors.