As mission-critical web applications and resources move to cloud environments
and cloud services grow rapidly, the need for stronger service assurance, in particular
fault tolerance that supports availability and reliability, has increased.
The priority now is to ensure a fault-tolerant environment that keeps systems
up and running. To minimize the impact of downtime or loss of access caused by
failures in systems, network devices, or hardware, such failures must be
anticipated and handled proactively, quickly, and intelligently. This chapter
discusses fault tolerance systems for cloud computing environments and analyzes
whether they are effective in such environments.
Keywords: Fault Tolerance, Replication, Redundancy, High Availability.