
Failover: Difference between revisions

From Wikipedia, the free encyclopedia
No edit summary
→‎See also: missing capitalization
 
(47 intermediate revisions by 38 users not shown)
Line 1: Line 1:
{{Short description|Automatic switching from failed computer system to standby computers}}
In [[computing]], '''failover''' is automatic switching to a [[redundancy (engineering)|redundant]] or standby [[computer]] [[Server (computing)|server]], [[system]], hardware component or [[computer network|network]] upon the failure or [[abnormal end|abnormal termination]] of the previously active [[Application software|application]],<ref>
'''Failover''' is switching to a [[redundancy (engineering)|redundant]] or standby [[computer]] [[Server (computing)|server]], [[system]], hardware component or network upon the failure or [[abnormal end|abnormal termination]] of the previously active [[Application software|application]],<ref>
For application-level failover, see for example {{cite book
For application-level failover, see for example {{cite book
|last= Jayaswal
|last= Jayaswal
|first= Kailash
|first= Kailash
|authorlink=
|title= Administering Data Centers: Servers, Storage, And Voice Over IP
|title= Administering Data Centers: Servers, Storage, And Voice Over IP
|url= http://books.google.com/books?id=W48oOMKU0RIC
|url= https://books.google.com/books?id=W48oOMKU0RIC
|accessdate= 2009-08-07
|access-date= 2009-08-07
|year= 2005
|year= 2005
|publisher= Wiley-India
|publisher= Wiley-India
Line 12: Line 12:
|page= 364
|page= 364
|chapter= 27
|chapter= 27
|chapter-url= https://books.google.com/books?id=W48oOMKU0RIC&pg=PA364
|trans_chapter=
|chapterurl= http://books.google.com/books?id=W48oOMKU0RIC&pg=PA364#v=onepage&q=&f=false
|quote= Although it is impossible to prevent some data loss during an application failover, certain steps can [...] minimize it.}}.
|quote= Although it is impossible to prevent some data loss during an application failover, certain steps can [...] minimize it.}}.
</ref> server, system, hardware component, or network. Failover and [[switchover]] are essentially the same operation, except that failover is automatic and usually operates without warning, while switchover requires human intervention.
</ref> server, system, hardware component, or network in a [[computer network]]. Failover and [[switchover]] are essentially the same operation, except that failover is automatic and usually operates without warning, while switchover requires human intervention.
[[File:Duckfone.png|thumb|4G cellular failover for network resiliency]]


[[Systems design]]ers usually provide failover capability in servers, systems or networks requiring continuous availability -the used term is [[High Availability]]- and a high degree of [[Reliability engineering|reliability]].
[[Systems design]]ers usually provide failover capability in servers, systems or networks requiring [[High availability|near-continuous availability]] and a high degree of [[Reliability engineering|reliability]].


At the server level, failover automation usually uses a "[[Heartbeat (computing)|heartbeat]]" system that connects two servers, either through using a separate cable (for example, [[RS-232]] serial ports/cable) or a network connection. In the most common design, as long as a regular "pulse" or "heartbeat" continues between the main server and the second server, the second server will not bring its systems online; however a few systems actively use all servers and can failover their work to remaining servers after a failure. There may also be a third "spare parts" server that has running spare components for "hot" switching to prevent downtime. The second server takes over the work of the first as soon as it detects an alteration in the "heartbeat" of the first machine. Some systems have the ability to send a notification of failover.
According to Nittin Choudhary,renowned OCA describes it as : In terms of database field,Failover means when primary database is not available due to planned or unplanned downtime then standby database will act as primary database.
On other hand, switchover means when primary database is loaded with queries, then offloading done by primary on standby database.


Certain systems, intentionally, do not failover entirely automatically, but require human intervention. This "automated with manual approval" configuration runs automatically once a human has approved the failover.
At server level, failover automation usually uses a "[[Heartbeat (program)|heartbeat]]" cable that connects two servers. As long as a regular "pulse" or "heartbeat" continues between the main server and the second server, the second server will not initiate its systems. There may also be a third "spare parts" server that has running spare components for "hot" switching to prevent downtime. The second server takes over the work of the first as soon as it detects an alteration in the "heartbeat" of the first machine. Some systems have the ability to send a notification of failover.


'''Failback''' is the process of restoring a system, component, or service previously in a state of failure back to its original, working state, and having the standby system go from functioning back to standby.
Some systems, intentionally, do not failover entirely automatically, but require human intervention. This "automated with manual approval" configuration runs automatically once a human has approved the failover.


The use of [[Platform virtualization|virtualization]] software has allowed failover practices to become less reliant on physical hardware through the process referred to as [[migration (virtualization)|migration]] in which a running virtual machine is moved from one physical host to another, with little or no disruption in service.
'''Failback''' is the process of restoring a system, component, or service in a state of failover back to its original state (before failure).


'''Failover''' and '''Failback''' technology are also regularly used in the Microsoft SQL Server database, in which SQL Server Failover Cluster Instance (FCI) is installed/configured on top of the '''Windows Server failover Cluster''' (WSFC). The SQL Server groups and resources running on WSFC can manually be [https://www.dbsection.com/how-to-failover-cluster-from-one-node-to-another/ failover to the second node] for any planned maintenance on the first node OR automatically failover to the second node in case of any issues on the first node. In the same way, a failback operation can be performed to the first node once the issue is resolved or maintenance is done on it.
The use of [[Platform virtualization|virtualization]] software has allowed failover practices to become less reliant on physical hardware; see also [[teleportation (virtualization)]].According to Nittin Choudhary,an OCA describes it as : In terms of database field,Failover means when primary database is not available due to planned or unplanned downtime then standby database will act as primary database.

On other hand, switchover means when primary database is loaded with queries, then offloading done by primary on standby database.
== History ==

The term "failover", although probably in use by engineers much earlier, can be found in a 1962 declassified [[NASA]] report.<ref>[https://archive.org/details/NasaAudioHighlightReels NASA Postlaunch Memorandum Report for Mercury-Atlas], June 15, 1962.</ref> The term "switchover" can be found in the 1950s<ref>Petroleum Engineer for Management - Volume 31 - Page D-40</ref> when describing '"Hot" and "Cold" Standby Systems', with the current meaning of immediate switchover to a running system (hot) and delayed switchover to a system that needs starting (cold). A conference proceedings from 1957 describes computer systems with both Emergency Switchover (i.e. failover) and Scheduled Failover (for maintenance).<ref>[https://books.google.com/books?id=qzxNAAAAYAAJ&q=switchover Proceedings of the Western Joint Computer Conference], Macmillan 1957</ref>


==See also==
==See also==
{{colbegin}}
* [[Data reliability]]
* [[Computer cluster]]
* [[High-availability cluster]]
* [[Data integrity]]
* [[Disaster recovery]]
* [[Disaster recovery]]
* [[Fault-tolerance]]
* [[Fault-tolerance]]
* [[Fencing (computing)]]
* [[Fencing (computing)]]
* [[High-availability cluster]]
* [[Load balancing (computing)|Load balancing]]
* [[Load balancing (computing)|Load balancing]]
* [[Log shipping]]
* [[Log shipping]]
* [[Safety engineering]]
* [[Safety engineering]]
* [[teleportation (virtualization)]]
* [[Teleportation (virtualization)]]
{{colend}}


== References ==
==References==
{{Reflist|2}}
{{Reflist|2}}


{{Authority control}}
[[Category:Computer networking]]
[[Category:Fault-tolerant computer systems]]

{{compu-network-stub}}
{{compu-network-stub}}


[[Category:Computer networking]]
[[de:Failover]]
[[Category:Fault-tolerant computer systems]]
[[es:Tolerancia a fallos]]
[[fr:Basculement (informatique)]]
[[ko:장애 극복 기능]]
[[hu:Feladatátvétel]]
[[ja:フェイルオーバー]]

Latest revision as of 06:26, 5 February 2024

Failover is the switching to a redundant or standby computer server, system, hardware component, or network upon the failure or abnormal termination of the previously active application,[1] server, system, hardware component, or network in a computer network. Failover and switchover are essentially the same operation, except that failover is automatic and usually happens without warning, while switchover requires human intervention.

4G cellular failover for network resiliency

Systems designers usually provide failover capability in servers, systems or networks requiring near-continuous availability and a high degree of reliability.

At the server level, failover automation usually uses a "heartbeat" system that connects two servers, either through a separate cable (for example, RS-232 serial ports/cable) or through a network connection. In the most common design, as long as a regular "pulse" or "heartbeat" continues between the main server and the second server, the second server will not bring its systems online; however, a few systems actively use all servers and can fail over their work to the remaining servers after a failure. There may also be a third "spare parts" server that has running spare components for "hot" switching to prevent downtime. The second server takes over the work of the first as soon as it detects an alteration in the "heartbeat" of the first machine. Some systems can also send a notification of failover.
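
The heartbeat design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular product works: the class name `HeartbeatMonitor`, its `timeout` parameter, and the rule that "an alteration in the heartbeat" means silence longer than the timeout are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Standby-side view of the primary's heartbeat (illustrative only)."""

    timeout: float            # seconds of silence tolerated before takeover
    last_beat: float = 0.0    # time the most recent heartbeat arrived
    active: bool = False      # True once the standby has taken over

    def beat(self, now: float) -> None:
        """Record a heartbeat received from the primary at time `now`."""
        self.last_beat = now

    def check(self, now: float) -> bool:
        """Take over if the primary has been silent longer than `timeout`."""
        if not self.active and now - self.last_beat > self.timeout:
            self.active = True   # stand-in for bringing the standby online
        return self.active


monitor = HeartbeatMonitor(timeout=3.0)
monitor.beat(now=0.0)
monitor.beat(now=1.0)
print(monitor.check(now=2.0))   # False: last beat was 1 s ago
print(monitor.check(now=5.0))   # True: 4 s of silence exceeds the 3 s timeout
```

Times are passed in explicitly so the sketch is deterministic; a real monitor would read a clock and receive beats over the cable or network link.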

Certain systems are intentionally designed not to fail over entirely automatically but to require human intervention. In this "automated with manual approval" configuration, the failover runs automatically once a human has approved it.
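
A minimal Python sketch of the "automated with manual approval" idea follows. The function `attempt_failover` and its callback parameters `approve` and `promote_standby` are hypothetical names introduced for illustration; the only point is that a single human decision gates an otherwise automatic sequence.

```python
from typing import Callable


def attempt_failover(primary_healthy: bool,
                     approve: Callable[[], bool],
                     promote_standby: Callable[[], None]) -> str:
    """Promote the standby only after a failure is seen AND a human approves."""
    if primary_healthy:
        return "no-action"
    if not approve():            # the single manual step
        return "awaiting-approval"
    promote_standby()            # everything from here on is automatic
    return "failed-over"


events = []
result = attempt_failover(
    primary_healthy=False,
    approve=lambda: True,        # stand-in for an operator confirming
    promote_standby=lambda: events.append("standby promoted"),
)
print(result, events)            # failed-over ['standby promoted']
```

In practice the `approve` callback might wrap a ticket, a pager acknowledgement, or a console prompt; that choice is outside what the article describes.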

Failback is the process of restoring a system, component, or service that was previously in a state of failure to its original, working state, and returning the standby system from active operation to standby.
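
The relationship between failover and failback can be illustrated as a small state model in Python. The class `ServerPair` and its method names are assumptions made for this sketch; it also assumes that failback is refused until the original system has been repaired, which is a design choice of the example rather than something the definition requires.

```python
from dataclasses import dataclass


@dataclass
class ServerPair:
    """Tracks which of two servers is serving traffic (illustrative only)."""

    active: str = "primary"
    primary_healthy: bool = True

    def failover(self) -> None:
        """The primary has failed: the standby takes over."""
        self.primary_healthy = False
        self.active = "standby"

    def repair_primary(self) -> None:
        """Mark the original system as working again."""
        self.primary_healthy = True

    def failback(self) -> None:
        """Return service to the primary; the standby goes back to standby."""
        if not self.primary_healthy:
            raise RuntimeError("primary has not been repaired yet")
        self.active = "primary"


pair = ServerPair()
pair.failover()
print(pair.active)       # standby
pair.repair_primary()
pair.failback()
print(pair.active)       # primary
```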

The use of virtualization software has allowed failover practices to become less reliant on physical hardware through a process referred to as migration, in which a running virtual machine is moved from one physical host to another with little or no disruption in service.

Failover and failback technology is also regularly used in the Microsoft SQL Server database, in which a SQL Server Failover Cluster Instance (FCI) is installed and configured on top of a Windows Server Failover Cluster (WSFC). The SQL Server groups and resources running on WSFC can be manually failed over to the second node for any planned maintenance on the first node, or automatically failed over to the second node in case of any issues on the first node. In the same way, a failback operation to the first node can be performed once the issue is resolved or the maintenance is done.

History


The term "failover", although probably in use by engineers much earlier, can be found in a 1962 declassified NASA report.[2] The term "switchover" can be found in the 1950s[3] in descriptions of '"Hot" and "Cold" Standby Systems', with the current meaning of immediate switchover to a running system (hot) and delayed switchover to a system that needs starting (cold). Conference proceedings from 1957 describe computer systems with both Emergency Switchover (i.e. failover) and Scheduled Failover (for maintenance).[4]

See also


References

  1. ^ For application-level failover, see for example Jayaswal, Kailash (2005). "27". Administering Data Centers: Servers, Storage, And Voice Over IP. Wiley-India. p. 364. ISBN 978-81-265-0688-0. Retrieved 2009-08-07. "Although it is impossible to prevent some data loss during an application failover, certain steps can [...] minimize it."
  2. ^ NASA Postlaunch Memorandum Report for Mercury-Atlas, June 15, 1962.
  3. ^ Petroleum Engineer for Management - Volume 31 - Page D-40
  4. ^ Proceedings of the Western Joint Computer Conference, Macmillan 1957