Linux: strange timeouts

^rooker
Site Admin
Posts: 1483
Joined: Fri Aug 29, 2003 8:39 pm


Post by ^rooker »

[SYMPTOMS]
Strange timeouts occurring all over the system. The easiest one to check/reproduce is probably logging in using SSH:

The "username" prompt appears immediately, but it takes ~20 sec. until the "password" prompt shows up. After a successful login, it takes another ~20 sec. until you get a shell prompt.

---------------
I've even had the exact same strange timeout when trying to connect to an Oracle database: the queries themselves were damn fast, but logging into the database took forever!
---------------

[SOLUTION]
Check if your system can "reach" (ping?) the nameservers listed in /etc/resolv.conf. If it can't, it will still try to contact them on every lookup and wait for the timeout. So if you cannot fix this connection problem (or you don't want to), simply DELETE all unreachable entries from /etc/resolv.conf - and that's it!
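A ping only tells you that the host is up, not that it answers DNS queries, so here is a rough sketch of a more direct check. It reads the "nameserver" lines from /etc/resolv.conf and sends each one a small DNS query over UDP. The function names, the test hostname example.com and the 3 second timeout are my own choices for illustration, and only UDP port 53 is tried, so treat it as a starting point and not as a definitive test.

```python
import socket
import struct


def parse_nameservers(resolv_conf_text):
    """Return the addresses on 'nameserver' lines, skipping comments."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet (type A, class IN) for hostname."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.strip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)


def nameserver_answers(server, hostname="example.com", timeout=3.0):
    """True if the server sends back any UDP reply on port 53 in time."""
    family = socket.AF_INET6 if ":" in server else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(hostname), (server, 53))
            sock.recvfrom(512)
            return True
        except OSError:
            # Covers timeouts as well as "unreachable"/"refused" errors.
            return False


if __name__ == "__main__":
    with open("/etc/resolv.conf") as f:
        for server in parse_nameservers(f.read()):
            status = "answers" if nameserver_answers(server) else "NO REPLY"
            print(f"{server}: {status}")
```

Any entry reported as "NO REPLY" is a candidate for removal from /etc/resolv.conf, as described above.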

personal note

Post by ^rooker »

I know that probably no one is really interested in the background of this short knowledge base entry, but I spent a WHOLE day finding this problem. Since it happened on an important server, I was about to lose my head if I couldn't fix it, so:

It took me ~3.5 hours to pinpoint the approximate location of the error, since there's a combination of Apache/Tomcat/Oracle running. The logs said that the database queries took a few msec, so I thought: it's not the database.

But the logs also said that loading a .jsp page took a few minutes (sometimes up to ~12 min), which made me think that Tomcat might be causing the problems.

Forcing Tomcat to recompile all JSPs did not fix anything, but in the Apache logs I saw strange "AJP13: broken pipe" errors, which made me think that it was a problem with mod_jk+AJP13.

I switched from AJP13 to the new Coyote connector, but this still wasn't it.

I upgraded Tomcat - no change.

I increased the logging "volume" and saw that it took up to 2 min to connect to the database - so I knew that the problem was somewhere in this corner.

I tried to connect manually to the database using sqlplus, and there I also had a strange slowness when establishing a connection - again: the queries were fast. This made me assume that it's not JDBC.

----------
What finally led to the solution was that the login with SSH had 2 "timeout" places:
1) between username and password
2) between password and shell.

Luckily, I'd had exactly the same slow SSH login on another server in the same net, and that server did NOT have the problem anymore once it was connected completely to the internet.

So I had to figure out what those 2 machines had in common:
It was their firewall-restricted outgoing access to the internet - and this led immediately to the assumption that there might be some "outgoing lookups" which fail - voilà.
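If you suspect the same kind of failing "outgoing lookups", you can time a name lookup directly on the affected machine. This is only a small sketch: the helper name timed_call, the hostname example.com and the address 192.0.2.1 are placeholders I picked, and that services like sshd do such lookups at login is my assumption, not something I verified on those servers.

```python
import socket
import time


def timed_call(func, *args):
    """Run func(*args); return (seconds_elapsed, result_or_exception)."""
    start = time.monotonic()
    try:
        outcome = func(*args)
    except OSError as exc:
        # socket.gaierror and socket.herror are both OSError subclasses.
        outcome = exc
    return time.monotonic() - start, outcome


if __name__ == "__main__":
    # One forward lookup (name -> address) and one reverse lookup
    # (address -> name), both going through the system resolver.
    for label, func, arg in [
        ("forward", socket.gethostbyname, "example.com"),
        ("reverse", socket.gethostbyaddr, "192.0.2.1"),
    ]:
        seconds, outcome = timed_call(func, arg)
        print(f"{label} lookup of {arg}: {seconds:.1f}s -> {outcome!r}")
```

A lookup that takes many seconds before it fails or succeeds points at the resolver configuration and not at the application that happens to be slow.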

----------

by the way...

Post by ^rooker »

I almost forgot:

The Oracle listener also took ages to start.