Troubleshooting Guide

From Spry Wiki

The most common technical issues you may face with your Spry server involve the webserver, name servers, mail server, database server and FTP server. You may also face networking issues. A problem can be caused by the failure of one or several of those servers. The tools and techniques presented below will help you identify those failures. Most of the time an issue can be fixed temporarily, or at least pinpointed, by restarting the daemon concerned. Once the important service is back up and the pressure is relieved, you will have to determine WHY it started to go wrong.

Note to Readers

The methods presented in this guide are general. They assume that your system is configured the way most Linux systems are. However, control panels behave differently from one another and change the way things are done on your system. ALWAYS use your control panel to restart services unless it is unavailable. This guide's purpose is to help you troubleshoot simple issues; it will not help you set up a web, mail, FTP, database or other server. If you need help with those matters, refer to Spry's knowledge base and the links included at the end of the guide.

Starting Point

We will start by assuming that your server is up. You can verify that by pinging it. If it is not, follow these instructions. We will also assume that your local computer is successfully connected to the Internet and able to browse all websites (no firewall on the local side). Finally, we will assume that your nameservers are working properly and returning the correct information. If you think you are having an issue with your nameservers, see the Name servers section of this guide. Ping and nameservers are basic services. If they are failing, it will be more difficult to troubleshoot other services.

Common Servers

Web server

What is a webserver

The webserver is the software that serves webpages on request from a web browser. The main webserver used on the Internet and at Spry is Apache, and its daemon is httpd. It interacts with the nameservers and potentially with the database server (through PHP). It uses ports 80 (HTTP) and 443 (HTTPS).

How to identify an issue with the webserver

This simple test can help you diagnose a webserver issue. Obviously if something else is wrong at the same time it'll make things more difficult.

Try the following instructions to troubleshoot your webserver:

    *If loading your website shows an 'unable to connect' page while your domain name resolves correctly,
    *If accessing your IP from an Internet browser shows an 'unable to connect' page,
    *If your firewall isn't blocking port 80,
    *If your server isn't out of memory,

your webserver is probably not working properly.
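The connection part of the checklist above can also be tested from a script. The following is a minimal sketch, not part of any Spry tooling: the function names port_is_open and check_web_ports are made up for this example, and it only tells you whether something accepts TCP connections on ports 80 and 443, not whether Apache is serving pages correctly.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection resolves the host name and opens the TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_web_ports(host):
    """Check the two webserver ports named in this guide: 80 and 443."""
    return {port: port_is_open(host, port) for port in (80, 443)}
```

Calling check_web_ports with your own IP address returns a dictionary such as {80: True, 443: False}. If port 80 is reported closed from your local computer but open when you run the same check on the server itself, suspect the firewall rather than the webserver.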

How to restart the webserver

If you have a control panel, and if it is accessible, chances are there is a feature allowing you to restart Apache. If possible, use your control panel. Otherwise you will have to do it manually: open an SSH connection to your server as root and run the following command: /etc/init.d/httpd restart. This will stop and start Apache and, if the restart fails, it will usually give you a useful error message.

Database server

What is a database server

A database is software that lets you manipulate a large quantity of data. It can be used on its own or by other software: Plesk and HSPC use a database, and applications like osCommerce use one too (with Apache and PHP). A database consists of two pieces of software: the database server, which runs all the time on your system and accepts connections, and the database client, which connects and submits queries to the database server. The database client can be a program written in PHP, for example. The main database server used on Linux and at Spry is MySQL, and its daemon is called mysql or mysqld.

How to identify an issue with the database server

This simple test can help you diagnose a database server issue. Obviously if something else is wrong at the same time it'll make things more difficult. Try the following instructions to troubleshoot the database server:

   * if you see MySQL "can't connect" errors on your website (this only happens if error_reporting is enabled in php.ini), chances are that the database server is not working properly.

How to restart the database server

If you have a control panel, and if it is accessible, chances are there is a feature allowing you to restart MySQL. If possible, use your control panel. Otherwise you will have to do it manually: open an SSH connection to your server as root and run the following command: /etc/init.d/mysqld restart. This will stop and start MySQL and, if the restart fails, it will usually give you a useful error message.

FTP server

What is an FTP server

FTP is the File Transfer Protocol. It is used to transfer files from one computer to another. It can work with a username and password, or with anonymous access. The FTP server is the software running on your server. It accepts connections from an FTP client (typically running on your local computer). Several FTP servers are used on Linux: ProFTPD, Pure-FTPd and others. The FTP server can also run through xinetd. An FTP server uses several ports: 21 for the commands and dynamically assigned ports for the actual transfers.

How to identify an issue with the FTP server

This simple test can help you diagnose an FTP issue. Obviously if something else is wrong at the same time it'll make things more difficult. Try the following instructions to troubleshoot your FTP server:

   * if you are unable to open an FTP connection to your FTP server, using your domain name or your IP,
   * if your firewall isn't blocking port 21,
   * if resetting the password of your user doesn't help,
   * if creating a test user doesn't help,
   * if your diskspace is not totally used up,

chances are your FTP server isn't behaving properly.

How to restart the FTP server

If you have a control panel, and if it is accessible, chances are there is a feature allowing you to restart the FTP server. If possible, use your control panel. Otherwise you will have to do it manually: open an SSH connection to your server as root. You will have to identify the name of your FTP daemon. Run the following command: ls /etc/init.d. This will display a list of the services on your system. If you see one containing the letters ftp, chances are it is the script that controls your FTP server. In that case run /etc/init.d/"the command you found" restart. If you don't see anything containing the letters ftp, the FTP server is probably controlled by xinetd. In that case run /etc/init.d/xinetd restart. This will stop and start FTP and, if the restart fails, it will usually give you a useful error message.
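The "look for a name containing ftp, otherwise fall back to xinetd" rule above can be sketched in a few lines. This is only an illustration: the function names are made up, it assumes your init scripts live in /etc/init.d as described in this guide, and it only builds the command for you to review; it does not run anything.

```python
import os


def find_init_scripts(keyword, names):
    """Return the script names that contain keyword (case-insensitive), sorted."""
    return sorted(n for n in names if keyword.lower() in n.lower())


def restart_command(keyword, init_dir="/etc/init.d"):
    """Build the restart command for the first matching script, or for xinetd."""
    names = os.listdir(init_dir) if os.path.isdir(init_dir) else []
    matches = find_init_scripts(keyword, names)
    # No script with the keyword in its name: assume the service runs through xinetd.
    script = matches[0] if matches else "xinetd"
    return "{}/{} restart".format(init_dir, script)
```

For example, find_init_scripts("ftp", ["httpd", "proftpd", "named"]) returns ["proftpd"]. If several names match, check them by hand before restarting anything; the sketch simply picks the first one alphabetically.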

Email server

What is a mail server

The mail server is software running on your server. It accepts connections from other software, running locally on your server or elsewhere, and decides whether an email has to be delivered locally or forwarded to another mail server. There are many mail servers, such as Exim, qmail, Sendmail and Postfix. Mail servers use the SMTP communication protocol and listen on port 25. Mailboxes can also be accessed through IMAP, which uses port 143, and through the POP communication protocol. POP is used by your mail client (Outlook, for example) to download the content of your mailbox to your local computer. POP uses port 110.

How to identify an issue with the mail server

Mail issues are harder to troubleshoot. Identifying authentication issues requires more advanced tools that are beyond the scope of this guide. Here we will focus on whether or not you are able to connect to the mail server.

   * if you are able to resolve your domain name and its MX record,
   * if your firewall is not blocking ports 25, 110 and 143,
   * if resetting your user's password doesn't work,
   * if creating a test email account doesn't work,
   * if sending and receiving to and from a free email account like Gmail or Hotmail (not AOL, which is covered in its own section below) doesn't work,

your mail server is probably not working properly. If you are able to send email to Gmail but not to retrieve your email, the POP daemon needs to be restarted.

How to restart the mail server

If you have a control panel, and if it is accessible, chances are there is a feature allowing you to restart the mail server. If possible, use your control panel. Otherwise you will have to do it manually: open an SSH connection to your server as root. You will have to identify the name of your mail daemon. Run the following command: ls /etc/init.d. This will display a list of the daemons on your system. You will most likely find qmail, exim or sendmail. In that case run /etc/init.d/"the command you found" restart.

How to restart the POP server

If you can use your control panel to restart the POP daemon, use it. Otherwise the POP daemon may run as a standalone daemon or through xinetd. Run /etc/init.d/name_of_the_daemon restart or /etc/init.d/xinetd restart, depending on which applies. This will restart the daemon.

Email generally works but can't send to AOL

This is most likely because you don't have a PTR record linking your main IP to your hostname. Send us a reverse DNS request to have your IPs point to your hostname. The second possibility is that your IP address is blocked by AOL. Contact AOL support to find out whether that is the case.

Is port 25 blocked by your ISP?

You can use a program called telnet, available for Mac and Windows, to test whether your ISP is blocking port 25. On a Mac, open the terminal. On Windows, open the command prompt.

At the prompt enter
telnet example.com 25
where example.com is the domain name of the mail server that you want to connect to and hit ENTER.

If you get a connection failed message, it is either because the mail server you specified isn't accepting connections (not working) or because your ISP is blocking port 25. You can then try to connect to a server that you know to be working, like smtp1.google.com, to see whether it is your ISP or the first server you tried that is refusing the connection.
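If telnet is not installed, the same test can be approximated with a short script. This is a sketch only: smtp_probe and its status labels are invented for this example. It opens a TCP connection to port 25 and reads the greeting line that mail servers normally send first. A "refused" result usually means nothing is listening on the server, while a "timeout" is more typical of a port that is filtered somewhere along the way, though neither is conclusive on its own.

```python
import socket


def smtp_probe(host, port=25, timeout=10.0):
    """Connect like `telnet host 25` and return a (status, detail) tuple."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            try:
                # Mail servers normally greet the client with a "220 ..." line.
                banner = conn.recv(1024)
            except socket.timeout:
                banner = b""  # connected, but the server stayed silent
            return ("connected", banner.decode("ascii", errors="replace").strip())
    except socket.timeout:
        return ("timeout", "no answer; the port may be filtered")
    except ConnectionRefusedError:
        return ("refused", "nothing is accepting connections on that port")
    except OSError as exc:
        # Name resolution failures, unreachable networks, and so on.
        return ("error", str(exc))
```

Run smtp_probe("example.com") against your own mail server first, then against a server you know to be working, and compare the two results as described above.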

Networking

Traceroute

Traceroute is a tool that traces the path of a packet from one host computer to another on a network. It shows how many hops the packet requires to reach the other host and how long each hop takes. This is a good tool for testing outside network connectivity. For example, let's say you can browse www.yahoo.com but you cannot browse www.google.com, and you wonder what is going on. If you are a Windows user, you can run tracert by selecting Start -> Run. For example:

tracert www.yahoo.com

On UNIX you can type traceroute. For example:

traceroute www.yahoo.com

The traceroute results should be something like this.

# traceroute www.google.com
traceroute: Warning: www.google.com has multiple addresses; using 66.102.7.104
traceroute to www.l.google.com (66.102.7.104), 64 hops max, 40 byte packets
1 66.249.1.129 (66.249.1.129) 0.408 ms 0.419 ms 0.404 ms
2 ge-1-3.r01.sttlwa01.us.bb.gin.ntt.net (129.250.11.149) 94.282 ms 2.538 ms 3.493 ms
3 xe-1-2-0.r21.sttlwa01.us.bb.gin.ntt.net (129.250.3.38) 0.650 ms 0.459 ms 0.384 ms
4 if-1-0.core1.SEP-Seattle.Teleglobe.net (207.45.196.33) 0.572 ms 0.418 ms 0.390 ms
5 if-12-0.core1.SQN-SanJose.teleglobe.net (66.198.97.21) 18.745 ms 52.401 ms 38.620 ms
6 Vlan3.msfc1.SQN-SanJose.Teleglobe.net (64.86.82.194) 19.272 ms 18.559 ms 18.396 ms
7 Vlan154.msfc1.SQN-SanJose.teleglobe.net (209.58.3.6) 21.888 ms 21.926 ms 22.019 ms
8 66.249.94.0 (66.249.94.0) 21.880 ms 21.845 ms 21.624 ms
9 72.14.233.129 (72.14.233.129) 21.762 ms 66.249.94.226 (66.249.94.226) 33.933 ms 21.800 ms
10 72.14.233.129 (72.14.233.129) 21.890 ms 21.982 ms 72.14.233.144 (72.14.233.144) 21.823 ms
11 216.239.49.54 (216.239.49.54) 23.381 ms 24.830 ms 24.559 ms
12 66.102.7.104 (66.102.7.104) 22.288 ms 22.313 ms 22.257 ms

When google.com is unreachable, the hops that do not answer are shown as asterisks and your traceroute will look like this:

# traceroute www.google.com
traceroute: Warning: www.google.com has multiple addresses; using 66.102.7.104
traceroute to www.l.google.com (66.102.7.104), 64 hops max, 40 byte packets
1 66.249.1.129 (66.249.1.129) 0.408 ms 0.419 ms 0.404 ms
2 ge-1-3.r01.sttlwa01.us.bb.gin.ntt.net (129.250.11.149) 94.282 ms 2.538 ms 3.493 ms
3 xe-1-2-0.r21.sttlwa01.us.bb.gin.ntt.net (129.250.3.38) 0.650 ms 0.459 ms 0.384 ms
4 if-1-0.core1.SEP-Seattle.Teleglobe.net (207.45.196.33) 0.572 ms 0.418 ms 0.390 ms
5 if-12-0.core1.SQN-SanJose.teleglobe.net (66.198.97.21) 18.745 ms 52.401 ms 38.620 ms
6 Vlan3.msfc1.SQN-SanJose.Teleglobe.net (64.86.82.194) 19.272 ms 18.559 ms 18.396 ms
7 Vlan154.msfc1.SQN-SanJose.teleglobe.net (209.58.3.6) 21.888 ms 21.926 ms 22.019 ms
8 66.249.94.0 (66.249.94.0) 21.880 ms 21.845 ms 21.624 ms
9 * * *
10 * * *
11 * * *
12 * * *
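When you have a long trace, it can help to let a script find the point where the answers stop. The sketch below assumes the output format shown above (hop number first, then either addresses and timings or a row of asterisks); first_silent_hop is a made-up name, and the sample is the tail of the failing trace from this guide.

```python
import re

# Tail of the failing traceroute shown in this guide.
SAMPLE = """\
traceroute to www.l.google.com (66.102.7.104), 64 hops max, 40 byte packets
7 Vlan154.msfc1.SQN-SanJose.teleglobe.net (209.58.3.6) 21.888 ms 21.926 ms 22.019 ms
8 66.249.94.0 (66.249.94.0) 21.880 ms 21.845 ms 21.624 ms
9 * * *
10 * * *
"""


def first_silent_hop(traceroute_output):
    """Return the number of the first hop where every probe timed out, or None."""
    for line in traceroute_output.splitlines():
        match = re.match(r"\s*(\d+)\s+(.*)$", line)
        if not match:
            continue  # header or warning line, not a hop
        hop, fields = int(match.group(1)), match.group(2).split()
        if fields and all(field == "*" for field in fields):
            return hop
    return None


print(first_silent_hop(SAMPLE))  # prints 9
```

The last hop that did answer (hop 8 here) is the one to mention when you report the problem. Keep in mind that some routers simply do not reply to traceroute probes, so a few silent hops in the middle of an otherwise complete trace are not a failure.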

Ping

PING is another utility; it checks whether a host is alive. A host computer sends ping packets to another host, and the other host is expected to return them. When ping packets are not returned, it is often because a firewall is blocking them. An example of ping with return packets:

#ping www.google.com
PING www.l.google.com (66.102.7.104): 56 data bytes
64 bytes from 66.102.7.104: icmp_seq=0 ttl=246 time=23.361 ms
64 bytes from 66.102.7.104: icmp_seq=1 ttl=246 time=35.879 ms
--- www.l.google.com ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 23.361/29.620/35.879/6.259 ms

An example of ping with no return packets:

#ping www.spry.com
PING spry.com (66.249.3.5): 56 data bytes
--- www.spry.com ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
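If you run ping from a monitoring script, the summary line is the part worth extracting. This minimal sketch assumes the summary wording shown in the two examples above ("N% packet loss"); packet_loss_percent is a hypothetical helper name, and other ping implementations may word the summary differently.

```python
import re


def packet_loss_percent(ping_output):
    """Return the packet loss percentage from ping's summary line, or None if absent."""
    match = re.search(r"(\d+(?:\.\d+)?)% packet loss", ping_output)
    return float(match.group(1)) if match else None


# The two summary lines from the examples in this guide.
print(packet_loss_percent("2 packets transmitted, 2 packets received, 0% packet loss"))    # prints 0.0
print(packet_loss_percent("4 packets transmitted, 0 packets received, 100% packet loss"))  # prints 100.0
```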

Howto

SSH to your server

From Windows:

   * Download PuTTY from http://www.putty.nl/latest/x86/putty.exe and install it on your computer.
   * To use it you need at least:
    **to be able to ping your VPS
    **to know your root password
    **to have ssh running and accepting connections on the VPS
   * After installation, you will find a PuTTY icon on your desktop.
   * Double-click on the icon.
   * Enter your IP address in the Host Name (or IP address) box.
   * Leave the port and protocol on their defaults, unless you have a custom setup (in which case you probably don't need to read this).
   * Click on Open.
   * When prompted with login as: type root.
   * When asked for the password, enter your root password. The screen doesn't display any characters as you type, and the password is case sensitive.
   * The connection should now be established. Any command you type in the window will be executed on the server.
   * When you are done, type 'exit' without quotes to end the SSH connection.

From a Mac:

   * Open the terminal by going to Applications > terminal.
   * Enter the following command: ssh root@yourip and press enter.
   * You may have to accept the server's host key (only the first time you connect).
   * You will then be prompted for your root password.
   * You can start working.
   * Once done, you can quit the session by entering exit and pressing enter.


How to do a PTR record lookup

Checking your PTR record allows you to see whether people and other servers are able to get your hostname from your IP (useful for sending email to mail servers like AOL's).

From Windows:

   * open the command prompt in Windows by clicking on Start > programs > accessories > command prompt.
   * from the command line enter 'nslookup ipaddress' without quotes and replacing ipaddress by the IP address whose PTR record you want to check.
   * you should see an output telling you if the check was successful or not.

From your server:

   * Open an SSH connection to your server,
   * run the following command: dig -x your_ip_address
   * if you don't see a line with your hostname in the output, the PTR record isn't set up.
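To see what dig -x is doing, it helps to know that a PTR lookup queries a special name built from the IP address written backwards. The sketch below shows that name and performs the lookup through the system resolver. The function names are made up, and note one assumption: socket.gethostbyaddr may also consult the local hosts file, so on the server itself dig remains the more reliable check.

```python
import ipaddress
import socket


def ptr_query_name(ip):
    """Return the reverse-DNS name that a PTR lookup queries for this IP."""
    return ipaddress.ip_address(ip).reverse_pointer


def lookup_ptr(ip):
    """Return the hostname found for ip, or None if the reverse lookup fails."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:
        # Raised when no PTR record (or other reverse mapping) exists.
        return None


print(ptr_query_name("66.249.3.5"))  # prints 5.3.249.66.in-addr.arpa
```

If lookup_ptr returns None for your main IP, the PTR record is most likely missing.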

From the Internet:

   * go to http://www.dnsstuff.com.
   * enter your IP address in the Reverse DNS lookup box.
   * click on the RevDNS button.
   * the useful information is located after the answer line. It will either contain a domain name or tell you that no PTR record was found.

How to do an A record lookup

Checking the A record allows you to verify that your nameservers are set up properly and link your domain to the right IP.

From your server:

   * Open an SSH connection to your server,
   * enter the following command: dig domain.com
   * the answer must have a line with your IP.
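The same check can be scripted when you have several domains to verify. This is a rough sketch under stated assumptions: resolve_ipv4 and points_to are invented names, and the lookup goes through the system resolver (which may also read the local hosts file) rather than querying your authoritative nameservers directly as dig can.

```python
import socket


def resolve_ipv4(name):
    """Return the sorted IPv4 addresses that name resolves to (empty list on failure)."""
    try:
        infos = socket.getaddrinfo(name, None, family=socket.AF_INET,
                                   type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def points_to(name, expected_ip):
    """Return True if name resolves to expected_ip."""
    return expected_ip in resolve_ipv4(name)
```

For example, points_to("domain.com", "your_ip_address") should return True once the A record is in place and has propagated.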

From Windows:

   * open the command prompt in Windows by clicking on Start > programs > accessories > command prompt.
   * from the command line enter 'nslookup domain' without quotes, replacing domain with the domain you want to resolve.
   * It returns the IP address the domain resolves to (the A record).

From the Internet:

   * go to http://www.dnsstuff.com.
   * enter the domain you want to test, including any subdomain, in the DNS lookup box.
   * select the type of record you want to test in the pull down menu:
   * A record is the main kind of record. You need at least one A record linking your domain and your IP.
   * click the Lookup button.
   * the useful information is contained in the Answer column or under the Answer line.

How to do a CNAME record lookup

From your server:

   * Open an SSH connection to your server,
   * enter the following command: dig domain.com CNAME
   * if the name is an alias, the answer must have a line with the name it points to (a CNAME record returns a name, not an IP).

From the Internet:

   * go to http://www.dnsstuff.com.
   * enter the domain you want to test, including any subdomain, in the DNS lookup box.
   * CNAME records are aliases of A records. For example, an A record links domain.com to your IP, and a CNAME record links www.domain.com to domain.com.
   * click the Lookup button.
   * the useful information is contained in the Answer column or under the Answer line.

How to do an MX record lookup

From your server:

   * Open an SSH connection to your server,
   * enter the following command: dig domain.com MX
   * the answer must have a line with the hostname of your mail server (an MX record returns a name, not an IP).

From the Internet:

   * go to http://www.dnsstuff.com.
   * enter the domain you want to test, including any subdomain, in the DNS lookup box.
   * MX records define the mail server used for the domain. If you send an email to johndoe@domain.com, the system looks up the MX record for domain.com and contacts the server it gets as a result to deliver the email.
   * click the Lookup button.
   * the useful information is contained in the Answer column or under the Answer line.

How to do an NS record lookup

From your server:

   * Open an SSH connection to your server,
   * enter the following command: dig domain.com NS
   * the answer must have one line per nameserver, each with the hostname of an authoritative nameserver for the domain.

From the Internet:

   * go to http://www.dnsstuff.com.
   * enter the domain you want to test, including any subdomain, in the DNS lookup box.
   * NS records define the authoritative nameservers for the domain.
   * click the Lookup button.
   * the useful information is contained in the Answer column or under the Answer line.

How to turn off your firewall

The firewall is a tool allowing you to block certain types of traffic through certain ports on your server.

   * Open an SSH connection to your server as root,
   * run the following command to stop your firewall: /etc/init.d/iptables stop
   * you can start your firewall again with: /etc/init.d/iptables start

How to check your diskspace and inodes

Every file on your server takes an inode, and the number of inodes is limited. Normally you should never max out your inodes, but it can happen if you have a lot of very small files. If it does, you will not be able to create any more files on your server, which can cause issues in all aspects of your server.

  • Open an SSH connection to your server,
  • To get your diskspace usage, run: df -h
  • To get your inodes usage, run: df -i
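For a quick look at both figures at once, the numbers behind df can be read directly with os.statvfs (available on Linux and other UNIX systems). The sketch below is illustrative only: usage_percent is a made-up name, and its percentages can differ slightly from df, which accounts for the blocks reserved for root.

```python
import os


def usage_percent(path="/"):
    """Return (disk_percent_used, inode_percent_used) for the filesystem holding path."""
    st = os.statvfs(path)

    def percent(total, free):
        # Some filesystems report 0 total inodes; return None instead of dividing by zero.
        return None if total == 0 else round(100.0 * (total - free) / total, 1)

    return percent(st.f_blocks, st.f_bfree), percent(st.f_files, st.f_ffree)
```

Calling usage_percent("/") returns a pair of percentages. A disk figure well below 100 together with an inode figure at or near 100 is the "lots of very small files" situation described above.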

How to do a WHOIS lookup

Doing a WHOIS lookup can be useful to verify that your domain hasn't expired and that it is set up to use the correct nameservers.

How to do a WHOIS lookup from your server

  • Open an SSH connection to your server,
  • run the following command: whois domain where domain is yourdomain.com without the www.
  • Interesting lines are the expiration date and the nameservers.
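WHOIS output is long, so it can be convenient to filter out everything except the lines mentioned above. The sketch below is a rough filter, not a parser: the keyword list is a guess at common wordings, the sample text is invented for this example, and real registries format their output in many different ways.

```python
# Invented sample; real WHOIS output varies from one registry to another.
SAMPLE = """\
Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Name Server: NS1.EXAMPLE.COM
Name Server: NS2.EXAMPLE.COM
Expiration Date: 2030-01-01
"""


def interesting_whois_lines(whois_output):
    """Keep only the lines that mention an expiration date or a nameserver."""
    keywords = ("expir", "name server", "nserver")
    return [
        line.strip()
        for line in whois_output.splitlines()
        if any(keyword in line.lower() for keyword in keywords)
    ]


for line in interesting_whois_lines(SAMPLE):
    print(line)
```

With the sample above, this prints the two Name Server lines followed by the Expiration Date line.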

How to do a WHOIS lookup from the Internet

  • go to http://www.domaintools.com.
  • enter your domain without www., but with its extension (.com, .biz, and so on), in the "WHOIS Lookup" box.
  • click on the "WHOIS" button.
  • the format of the result varies, but usually the information you are interested in is on the lines expiration date and Name server (2 lines for name servers).

How to check your available resources

If your Spry server is a VPS, or a Sprybox (a single VPS running on a dedicated server), you can check your resource usage with the /proc/user_beancounters file. This file keeps track of every memory and socket allocation made on your server. It also keeps track of failed resource allocations (if a process tries to allocate more memory than is available, it counts as a failure). The last column of the file is the failure counter. Normally this file is never reset, except if your VPS is stopped for a long time. To check the /proc/user_beancounters file, follow these instructions:

  • open an SSH connection to your server as root,
  • run cat /proc/user_beancounters,
  • the file is displayed in the terminal, and each line represents a different resource. The first column is your current usage of the resource and the last one tells you how many times the system tried to allocate more resources than were available.
  • If the current usage of a resource is too close to the limit and is causing failures, you can kill some processes or restart services.
  • If at any time during the preceding steps you get "cannot allocate memory" or "cannot fork" errors and can't run any command, it is because your system is out of memory. In that case the only solution is to restart your VPS.
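Because the file has many lines, a short script can pick out only the resources that have recorded failures. The sketch below assumes the usual column layout of this file (resource, held, maxheld, barrier, limit, failcnt, with a container ID in front of the first resource); the function name and the sample figures are invented for illustration.

```python
# Invented sample in the assumed layout of /proc/user_beancounters.
SAMPLE = """\
Version: 2.5
       uid  resource           held    maxheld    barrier      limit    failcnt
      101:  kmemsize        2767856    3166208   14372700   14790164          0
            privvmpages       64411      65536      65536      69632         12
            numproc              45         60        240        240          0
"""


def failed_resources(text):
    """Return {resource: failcnt} for every resource whose failure counter is non-zero."""
    failures = {}
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0].endswith(":"):
            fields = fields[1:]  # drop the leading "101:" (or "Version:") token
        # A data row is a resource name followed by five numeric columns.
        if len(fields) != 6 or not fields[-1].isdigit():
            continue
        if int(fields[-1]) > 0:
            failures[fields[0]] = int(fields[-1])
    return failures


print(failed_resources(SAMPLE))  # prints {'privvmpages': 12}
```

On the server you would pass it the real file contents, for example failed_resources(open("/proc/user_beancounters").read()), as root. Since the counters are rarely reset, compare two readings taken some time apart to see whether failures are still occurring.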

Name servers

What is a name server

A nameserver's purpose is to convert a domain name into an IP address. When someone enters your domain in their Internet browser, the system finds the authoritative nameservers for the domain (the ones found in the WHOIS information) and queries them to find the correct IP. It then uses the IP to connect to the webserver and fetch the requested webpage. Behind any system using a domain or hostname (email, FTP connections made by domain name, scripts fetching information from a particular website, and so on) there is a nameserver. Nameservers are also used locally on your server to resolve external domains. When you send an email to gmail.com from your server, a local lookup finds gmail.com's IP. The main nameserver used on Linux and at Spry is BIND, and its daemon is called named. Nameservers use port 53.

How to identify an issue with the nameservers

This simple test can help you diagnose a nameserver issue. Obviously if something else is wrong at the same time it'll make things more difficult. Try the following instructions to troubleshoot your nameserver:

   * if you can't resolve your domain name,
   * if using your IP in an Internet browser shows a page other than 'unable to connect',
   * if your server is set up to use its own nameservers and can't resolve an external domain like google.com,
   * if you can ping your IP but not your domain,

chances are that your nameserver is not working properly.

How to restart the nameservers

If you have a control panel, and if it is accessible, chances are there is a feature allowing you to restart BIND. If possible, use your control panel. Otherwise you will have to do it manually: open an SSH connection to your server as root and run the following command: /etc/init.d/named restart. This will stop and start BIND and, if the restart fails, it will usually give you a useful error message.

Glossary

daemon

A daemon is a process (a process is a running program) running in the background on your server. It can create child processes. For example, Apache's daemon is called httpd. It is constantly running, and when one of your customers visits one of your webpages an httpd child process is created.

ports

Here is the Wikipedia definition of port: A Software Port (usually just called a 'port') is a virtual data connection that can be used by programs to exchange data directly, instead of going through a file or other temporary storage location. The most common of these are TCP and UDP ports which are used to exchange data between computers on the Internet.

xinetd

Xinetd is called the Internet super daemon. Let's say you have a POP and an FTP server running, but you only get sporadic FTP and POP traffic. It is a waste of memory to have the POP and FTP daemons running all the time for such low traffic. You can run them through xinetd instead. Xinetd will be the only daemon running, and when an incoming POP connection arrives at your server it will start the POP daemon. It will do the same for FTP or any other daemon running through it.

Links to go further

There are countless sources of documentation online specific to Linux and its different software. They will allow you to greatly increase your knowledge and system administration skills. I have listed below some good starting points.

Wikipedia http://wikipedia.org/ The free online encyclopedia, accessible at http://en.wikipedia.org, and its Information technology portal have a lot of information about the services and protocols used on the Internet and about Linux in general.

The Linux Documentation Project http://tldp.org/ The Linux Documentation Project is a website cataloging numerous HOWTOs on Linux and its main software.

Google Linux This version of Google returns only results related to Linux: http://www.google.com/linux

The Official Websites of the different software
