Downtime is a period when a system is not available. It may apply to any computer or network, but is most commonly used in reference to servers. In particular, web server reliability is often measured in terms of downtime, where little to no downtime is ideal.
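As a rough illustration of how reliability can be expressed in terms of downtime, the sketch below converts a downtime total into an availability percentage. The function name `availability_percent` and the sample figures are assumptions made for this example, not a standard API.

```python
def availability_percent(downtime_seconds: float, period_seconds: float) -> float:
    """Return the share of a period during which the system was available, in percent."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    if not 0 <= downtime_seconds <= period_seconds:
        raise ValueError("downtime_seconds must be between 0 and period_seconds")
    return 100.0 * (period_seconds - downtime_seconds) / period_seconds


# Hypothetical figures: 30 minutes of downtime in a 30-day month.
month_seconds = 30 * 24 * 60 * 60
print(round(availability_percent(30 * 60, month_seconds), 3))  # 99.931
```

Under these assumed numbers, half an hour of downtime per month corresponds to roughly 99.93% availability.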
There are several reasons why a server may experience downtime:
- Server reboot - Restarting a server may require a few minutes of downtime because the system must shut down, reboot, then restart the necessary processes in order to respond to incoming requests.
- Software restart - Restarting a process, such as Apache on a web server, may cause a few seconds of downtime while the process is restarting.
- Network disconnect - If a server is physically disconnected from a network, it will not be reachable by the systems on the network.
- Network outage - If any part of a network (including the Internet) is not functioning between the server and client, the client will not be able to communicate with the server.
- Traffic overload - If a server receives more traffic than it can handle, it will not be able to respond to all the requests. Users may experience downtime until the traffic decreases. An overload may be caused by a spike in legitimate traffic or by a DDoS attack.
- Hardware failure - If an important hardware component, such as an HDD or SSD, fails, it may cause the server to stop functioning.
- Software failure - If a process on a server, such as the httpd (HTTP) service, stops running, it will cause the server to be unresponsive to requests until the process is restarted.
- Power outage - If the electrical power goes out and no backup power is available (for example, a generator or UPS), any affected systems will be offline until power is restored.
- Hacker attack - If an attacker gains control of a server, they may block access to the required services, causing the server to stop responding.
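Most of the causes above look the same from the client's side: the server does not answer. A minimal sketch of a reachability check, assuming a plain TCP connection attempt is an acceptable signal, might look like this. The function name `is_reachable` and the default timeout are choices made for this example.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `is_reachable("example.com", 80)` would report whether a web server at that placeholder address accepts connections. A failed check does not reveal which cause is responsible, and a successful connection does not guarantee that the service is answering requests correctly.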
To minimize downtime, server administrators must implement strong security measures and redundancy. Network security helps protect against malicious activity, such as unauthorized logins and DDoS attacks. Redundancy, such as RAID storage systems and backup power generators, helps prevent downtime due to hardware failure or power loss. In some cases, multiple servers may be configured so that a secondary server can take over if the primary server fails.
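The failover idea can be sketched as a simple selection rule: try servers in priority order and use the first one that passes a health check. This is only an illustration. The function `pick_active_server`, the host names, and the hard-coded health states are all hypothetical, and real failover setups usually rely on load balancers or cluster software rather than a loop like this.

```python
from typing import Callable, Optional, Sequence


def pick_active_server(
    servers: Sequence[str], is_healthy: Callable[[str], bool]
) -> Optional[str]:
    """Return the first healthy server in priority order, or None if all are down."""
    for server in servers:
        if is_healthy(server):
            return server
    return None


# Hypothetical health states: the primary is down, the secondary is up.
status = {"primary.example.com": False, "secondary.example.com": True}
active = pick_active_server(list(status), lambda name: status[name])
print(active)  # secondary.example.com
```

Passing the health check in as a function keeps the selection logic independent of how health is actually measured.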
While server admins try to minimize downtime as much as possible, sometimes downtime is unavoidable. For example, when performing a server migration, several minutes or even a few hours of downtime may be necessary. This type of "planned downtime" is typically scheduled for early morning or weekend hours when traffic levels are lowest.