What Is Network Redundancy, and What Equipment and Technologies Can Implement It?
One of the important concepts in computer networking, and one that network administrators and IT teams should pay close attention to, is fault tolerance.
Fault tolerance means that a communication infrastructure keeps working after a problem or failure. The way to achieve it is redundancy.
When communication systems and networks are built on the redundancy principle, the failure of a single component does not disrupt the network, because redundant equipment takes over and keeps it working.
That is why large organizations and companies design networks based on the principles of redundancy and fault tolerance.
Redundancy of power supplies
Hard drives are prone to failure because they have rotating magnetic platters. However, disks are not the only component that may fail in a system. The power supply may also fail. For this reason, it is recommended that systems be built with dual power supplies so that if one power supply fails, the second takes over and the system continues to operate. On newer servers it is also possible to detect errors in a central processor, and the motherboard can transfer processing to a second central processor.
Memory errors can likewise be predicted and managed so that information is not lost if memory fails. Together, these features allow a network to keep operating and to warn the network administrator at the slightest sign of failure, without disrupting business operations.
If the power supply fails in one of the network's most important devices, such as a server, the server stops working first, and then the services that depend on it stop. Even with the best support contract, you will have to wait at least 4 hours for a new power supply to arrive before the server can operate again. Therefore, when designing networks, you should pay special attention to the principle of high availability.
Fortunately, you can purchase most network equipment with an optional second power supply. Dual power supplies work in a variety of ways, of which the following three are the most popular:
- Active/passive dual power supplies: The power supplies are divided into a primary and a secondary unit. The primary power supply is active, and the second is inactive or, more precisely, in standby mode, so only one power supply provides the power the server needs at any time. The potential problem is that the standby unit is only used once the primary fails, at which point the device's entire load is transferred to it. If the passive power supply has degraded over time, it is likely to fail when the load is handed over.
- Load-balancing dual power supplies: Both power supplies work in an active-active configuration and share the equipment's electrical load equally, which is why the approach is called "load balancing". This model has a problem similar to the active/passive model: if one power supply fails, the remaining one must carry the device's entire load.
- Load-shifting dual power supplies: This model is widely used in servers and data center equipment. It works like the active/passive model, except that the load is periodically transferred to the second power supply and then back to the primary. The advantage is that both power supplies are tested regularly, so problems can be identified before the system suffers a complete power failure.
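The sizing concern behind the dual power supply models can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the wattage figures, and the even split of the load are hypothetical and do not describe any vendor's hardware.

```python
def psu_load_share(load_w, psu_ratings_w, failed=()):
    """Split a device's load evenly across the power supplies still working.

    Returns a dict mapping PSU index -> watts carried by that PSU.
    """
    alive = [i for i in range(len(psu_ratings_w)) if i not in failed]
    if not alive:
        raise RuntimeError("no working power supply")
    per_psu = load_w / len(alive)
    return {i: per_psu for i in alive}


def survives_single_failure(load_w, psu_ratings_w):
    """For a dual-PSU device: can either unit carry the full load alone?"""
    return all(rating >= load_w for rating in psu_ratings_w)


# Hypothetical 600 W server with two 750 W power supplies.
normal = psu_load_share(600, [750, 750])                  # both units share the load
degraded = psu_load_share(600, [750, 750], failed=(0,))   # unit 0 has failed
print(normal, degraded)
print(survives_single_failure(600, [750, 750]))
print(survives_single_failure(600, [750, 500]))
```

Whichever mode is used, the point of the check is the same: each power supply must be rated for the device's full load, because after a failure the surviving unit carries all of it.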
Redundant array of independent disks (RAID)
When the operating system is installed on a hard drive or a Secure Digital (SD) card, a copy of the operating system should be kept on a secondary device. Figure 1 shows this arrangement. The most common approach is a redundant array of independent disks at level 1 (RAID-1), also called mirroring. It improves the system's fault tolerance if a drive or SD card fails.
Figure 1
An important point to consider about storage redundancy is the RAID level with which the drives are configured. For example, mirroring is expensive because every drive that holds data needs a second drive of the same capacity to hold a copy of it. That is why some companies use striping with parity (RAID-5), which requires three or more drives. This pattern is shown in Figure 2.
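The cost difference between mirroring and RAID-5 comes from how the redundant data is stored. The Python sketch below shows the XOR parity idea behind RAID-5 and compares usable capacity. It is a simplification: real RAID-5 rotates the parity blocks across all drives, and the function names, block contents, and drive sizes here are assumptions chosen for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def usable_capacity_tb(level, drives, drive_size_tb):
    """Usable capacity for a two-drive RAID-1 mirror or a RAID-5 set."""
    if level == 1:
        if drives != 2:
            raise ValueError("this sketch models a two-drive mirror only")
        return drive_size_tb                   # half of the raw capacity
    if level == 5:
        if drives < 3:
            raise ValueError("RAID-5 needs three or more drives")
        return (drives - 1) * drive_size_tb    # one drive's worth holds parity
    raise ValueError("unsupported RAID level")


# Three data blocks plus one parity block.
data = [b"NET1", b"NET2", b"NET3"]
parity = xor_blocks(data)

# Simulate losing the drive that held data[1] and rebuild it from the rest.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])

print(usable_capacity_tb(1, 2, 4))   # two 4 TB drives, mirrored
print(usable_capacity_tb(5, 3, 4))   # three 4 TB drives, RAID-5
```

With mirroring, half of the raw capacity is spent on redundancy; with RAID-5 only one drive's worth is, which is why it becomes cheaper per usable terabyte as drives are added.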
Clustering servers
Today's servers can be purchased with full component redundancy to keep the server functioning if parts fail. However, the vital thing to note about component redundancy is that it increases the burden of periodic maintenance, and some models lack software mechanisms that intelligently distribute heavy processing loads between different servers.
Accordingly, companies use a concept called clustering. Clustering refers to grouping servers so that they can balance workloads, simplify maintenance, and prevent the failure or loss of a network or set of services. For example, Windows Server 2019 and 2022 offer a feature called Failover Clustering, which provides high availability for applications.
Physical clustering was quite popular some 5 to 10 years ago, but today most companies use virtual clustering. In this case, if one server fails, running programs are automatically redirected to another server. However, to achieve this, the application must support the feature.
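The failover behaviour of a cluster can be sketched as follows. This is a minimal illustration and not how Failover Clustering or any hypervisor implements it; the Cluster class, the node and workload names, and the "fewest workloads" placement rule are assumptions made for the example.

```python
class Cluster:
    """Toy failover cluster: tracks which healthy node runs which workload."""

    def __init__(self, nodes):
        self.placement = {node: [] for node in nodes}

    def start(self, workload):
        """Place a workload on the healthy node with the fewest workloads."""
        node = min(self.placement, key=lambda n: len(self.placement[n]))
        self.placement[node].append(workload)
        return node

    def fail(self, node):
        """Mark a node as failed and restart its workloads elsewhere."""
        if len(self.placement) < 2:
            raise RuntimeError("no healthy node left to take over")
        orphans = self.placement.pop(node)
        for workload in orphans:
            self.start(workload)


cluster = Cluster(["node-a", "node-b"])
cluster.start("web")     # lands on node-a
cluster.start("db")      # lands on node-b
cluster.fail("node-a")   # "web" is restarted on node-b
print(cluster.placement)
```

In a real cluster the hard parts are detecting the failure reliably and giving the surviving node access to the same storage, which is why shared resources are central to clustering.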
Microsoft Hyper-V, VMware vSphere, and Citrix Hypervisor (XenServer) are a few examples of popular virtual clustering software used in data centers today.
These platforms can group hosts to provide access to virtual servers and workstations. Workstation access is usually delivered through a concept called virtual desktop infrastructure (VDI).
In this case, programs can be placed in containers on these platforms using Docker or Kubernetes. Outside Iran, this approach significantly reduces the cost of using applications (because companies do not need to buy expensive licenses). The program is made available to the user in much the same way as a virtual machine. Figure 3 shows an example of virtual clustering.
Figure 3
Typically, each host is similar to the other hosts in the cluster, runs its own hypervisor, and can access shared resources such as bandwidth, storage space, and processing power. Most virtualization vendors allow users to download their hypervisor and use it for free, as long as each host and its hypervisor operates independently of other hosts.
However, when hosts work together, they form a cluster, and the service that connects these hosts ensures high availability. High availability is one of the features that virtualization vendors promote most heavily.
Switches
Up to this point in the article, we have seen how to achieve high availability for computing and storage. However, another essential point is that the links between the switches that connect hosts and storage may themselves fail. So we need to address this problem to preserve high availability.
Therefore, we need redundant network connections so that communication between devices is not interrupted when a link fails. An important point to consider about switches is the Spanning Tree Protocol (STP), which blocks frames on redundant links so that broadcast storms and duplicate frames do not occur.
However, if a link fails, STP recalculates the topology and allows frames to be forwarded over a redundant link, although the recalculation causes a noticeable drop in network performance.
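The effect of STP on redundant links can be sketched with a breadth-first search that keeps one loop-free path to every switch and marks the remaining links as blocked. This is a simplified illustration and not the STP algorithm itself: real STP elects the root by bridge ID and compares path costs, while this sketch takes the root as given and treats all links as equal. The switch names are hypothetical.

```python
from collections import deque


def spanning_tree(switches, links, root):
    """Return (forwarding, blocked) sets of links for a loop-free topology."""
    neighbours = {s: [] for s in switches}
    for a, b in links:
        neighbours[a].append(b)
        neighbours[b].append(a)

    forwarding, seen, queue = set(), {root}, deque([root])
    while queue:
        current = queue.popleft()
        for peer in sorted(neighbours[current]):
            if peer not in seen:
                seen.add(peer)
                forwarding.add(frozenset((current, peer)))
                queue.append(peer)

    blocked = {frozenset(link) for link in links} - forwarding
    return forwarding, blocked


switches = ["SW1", "SW2", "SW3"]
links = [("SW1", "SW2"), ("SW1", "SW3"), ("SW2", "SW3")]   # a triangle

# Normal operation: the SW2-SW3 link is redundant, so it is blocked.
fwd, blocked = spanning_tree(switches, links, root="SW1")
print(blocked)

# The SW1-SW3 link fails: recalculating unblocks SW2-SW3.
remaining = [l for l in links if l != ("SW1", "SW3")]
fwd_after, blocked_after = spanning_tree(switches, remaining, root="SW1")
print(fwd_after, blocked_after)
```

The recalculation step in the sketch is instantaneous; on real switches it takes time to converge, which is the performance drop mentioned above.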
As shown in Figure 4, building redundancy into the network makes it possible to route around any single switch that fails. Sometimes a switch in the distribution layer or core layer may fail and interrupt the flow of information. Buying additional switches for these layers is usually costly, but if the daily business workload is heavy, problems must be resolved quickly. In such circumstances, it is redundancy that ensures that business activities continue in the event of failure.
Figure 4
Routers
If a router goes down for maintenance or because of a failure, it no longer responds to requests or routes packets. To achieve router redundancy, we need to look at first-hop redundancy protocols (FHRPs) such as the Hot Standby Router Protocol (HSRP) and the Virtual Router Redundancy Protocol (VRRP). These protocols let you configure a highly available default gateway by presenting a shared virtual router that is reached through a virtual IP address. Figure 5 shows how to achieve router redundancy.
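How a first-hop redundancy protocol decides which router answers for the virtual IP address can be sketched as an election. The Python example below follows the VRRP rule that the highest priority wins, with the higher IP address breaking a tie. It is a sketch of the election only: advertisements, timers, and preemption are left out, and the router names and addresses (taken from a documentation address range) are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass
class Router:
    name: str
    ip: IPv4Address
    priority: int = 100   # VRRP's default priority
    up: bool = True


def elect_master(routers):
    """Pick the router that should own the virtual IP, or None if all are down."""
    candidates = [r for r in routers if r.up]
    if not candidates:
        return None
    # Highest priority wins; the higher IP address breaks a tie.
    return max(candidates, key=lambda r: (r.priority, r.ip))


virtual_ip = IPv4Address("192.0.2.1")   # the default gateway hosts are configured with
r1 = Router("R1", IPv4Address("192.0.2.2"), priority=120)
r2 = Router("R2", IPv4Address("192.0.2.3"))

master_before = elect_master([r1, r2]).name
r1.up = False                           # R1 goes down for maintenance
master_after = elect_master([r1, r2]).name
print(virtual_ip, master_before, master_after)
```

Hosts keep using the same virtual IP address as their default gateway throughout, which is what makes the failover transparent to them.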
Figure 5
Firewalls
Firewalls are another network component for which high availability should be considered. The same FHRPs that routers use can be applied here. Figure 6 shows how to implement firewall redundancy.
When using an FHRP to make an Internet-facing firewall redundant, pay attention to whether the service provider supports this approach. We can use an FHRP for outbound traffic redundancy, but the provider must also support FHRP for inbound traffic redundancy.
If an organization uses two different providers, this is not easy to achieve. Another point is that both firewalls must have the same configuration to prevent potential security problems on the network.
Figure 6
Support, facilities, and infrastructure
When you buy network equipment, you need to research thoroughly the facilities and infrastructure that support it. This concept is known as Ping and Power and revolves around network management and remote monitoring of devices. Ping and Power is a broad term that covers heating, ventilation, and air conditioning (HVAC) systems, humidity monitoring, equipment performance, power outages, and even fire suppression.
Power outages are undoubtedly the most critical threat to networks, but proper maintenance of HVAC equipment and fire prevention are essential issues that should not be overlooked.
Uninterruptible Power Supply (UPS)
An uninterruptible power supply (UPS) is an emergency power system that detects fluctuations in the incoming electricity and automatically supplies the required power whenever utility power is cut off. The subtle point about a UPS is that it should carry the load only for a limited time, until a generator comes online as a sustained source of electricity.
For workstations and servers that are not connected to backup generators, a UPS gives you enough time to shut the systems down cleanly. UPSs are often mistakenly treated as an energy source for the whole duration of a power outage. A regular UPS can supply equipment only for a short period and is not suitable for long outages.
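Whether a UPS buys enough time for a clean shutdown can be estimated with simple arithmetic. The Python sketch below divides the usable battery energy by the load; the function names, the 90% inverter efficiency, and the example figures are assumptions for illustration, and real runtime also depends on battery age, temperature, and discharge rate, so the vendor's runtime chart should take precedence.

```python
def ups_runtime_minutes(battery_wh, load_w, inverter_efficiency=0.9):
    """Rough runtime estimate: usable battery energy divided by the load."""
    if load_w <= 0:
        raise ValueError("load must be positive")
    return battery_wh * inverter_efficiency / load_w * 60


def enough_for_shutdown(battery_wh, load_w, shutdown_minutes):
    """True if the estimated runtime covers a clean shutdown."""
    return ups_runtime_minutes(battery_wh, load_w) >= shutdown_minutes


# Hypothetical 1000 Wh UPS feeding a 600 W load.
runtime = ups_runtime_minutes(1000, 600)
print(round(runtime), "minutes")
print(enough_for_shutdown(1000, 600, shutdown_minutes=15))
print(enough_for_shutdown(1000, 600, shutdown_minutes=240))
```

An estimate like this makes the point above concrete: the battery covers a shutdown or a generator start, not an outage lasting several hours.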
The advantage of using a UPS together with a generator is that the equipment does not experience power fluctuations while the generator starts up after an outage. So with regard to redundancy, keep in mind that UPS systems should supply electricity only until a generator is running. There are many types of UPS systems today, of which the following three are the most common:
Standby UPS:
This is the most common type of UPS, typically placed under a desk to protect a personal computer. These UPSs work by transferring the load from the AC line to a battery-powered inverter, and capacitors in the load transfer unit help the system avoid severe power fluctuations during the switchover. These models perform well, but because of their limited capacity they are not used in server rooms.
Line-Interactive UPS:
These are commonly used for small server rooms and racks. The AC line supplies the load through the inverter. When a power outage occurs, the inverter is signalled to draw electricity from the batteries. This may sound similar to how a standby UPS works, but in this model no load transfer takes place. In a standby UPS, the load must be switched from the AC line to a completely separate circuit (the inverter). In a line-interactive UPS, by contrast, the inverter is always connected to the load but draws on the battery only during power outages.
Online UPS:
This is the standard option for data centers. AC power feeds a rectifier/charging circuit that keeps the batteries charged. The batteries then supply the inverter with constant DC power, and the inverter converts the DC back into AC to supply the load. The advantage of an online UPS is that electricity is constantly supplied from the batteries. When a power outage occurs, the unit keeps supplying the load without any transfer and delivers fully stable power.