Finally having a job in my field is not without its challenges. I came across a situation I cannot explain, let alone understand. Mid-morning yesterday one of my colleagues informed me that the two computers at the circulation desk had, for whatever reason, dropped off the internet. As we had work being done in the area the night before, my first thought was that they had been disconnected, but on second thought that could not be the explanation: for one, it’s too easy, and for another, the systems could still reach the PDC (Primary Domain Controller), which lets users log in.
So my next step was to run a few ping tests. I could reach anything on the local network, 192.168.1.*, including the server and switch, but I could not ping the gateway, 192.168.3.1, or anything outside. I checked the cabling in the server room and everything was green-lighted (meaning all the right ports on the switch were connected and communicating).
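A side note for anyone retracing these steps: with addresses like these, it may be worth checking whether the gateway counts as "on-link" for the desk machines, since that changes how they try to reach it. The netmask is not stated anywhere above, so the prefix lengths below are assumptions, and 192.168.1.50 is a made-up address standing in for one of the circulation desk machines. A minimal Python sketch:

```python
import ipaddress


def gateway_is_on_link(host_ip: str, prefix_len: int, gateway_ip: str) -> bool:
    """Return True if gateway_ip falls inside the host's own subnet."""
    # strict=False lets us pass a host address rather than the network address.
    network = ipaddress.ip_network(f"{host_ip}/{prefix_len}", strict=False)
    return ipaddress.ip_address(gateway_ip) in network


if __name__ == "__main__":
    host = "192.168.1.50"    # hypothetical desk machine, not from the post
    gateway = "192.168.3.1"  # gateway address from the post
    # Assumed prefix lengths; the real netmask is unknown.
    for prefix_len in (24, 22, 16):
        print(f"/{prefix_len}: on-link = {gateway_is_on_link(host, prefix_len, gateway)}")
```

This does not diagnose anything by itself. It only shows that with a /24 mask the gateway would sit outside the desk machines' subnet, while with a wider mask such as /22 or /16 it would be on the same one.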
I checked other systems on the network (mainly the offices) and they were able to ping the gateway and the outside world without issue. So if it’s not a connection issue, and not a network-wide issue, then it must be a computer issue… But in my experience (ask me about Bell Tech Support sometime), if several computers drop off at the same time, it’s usually not the computers’ fault. And in this case, although I was only told of two, by the end of the day there were three, and I was told today of a fourth, all on the same network, all with the same problem. And in the end, all with the same solution.
Now, before I get ahead of myself, I have to go back to the question: what is happening here? If it’s not network based, not connection based, not computer based, what could it be? I checked all the wires: all green-lighted. I checked the server configs: all correct. I checked the firewall: no drop rules pertained to these IPs. I even changed the IPs. I could not for the life of me figure out why these two systems (I did not find out about the other two until I had found a solution) could not access the outside world. The rest of the network was fine, so why these two…
The solution is just as confusing as the problem itself. In fact, I don’t even know why I tried it. It was one of those “for the hell of it” moments where there is absolutely no reason for doing something other than that you’ve tried everything else. As our server hosts our current website, I decided, “for the hell of it”, to type the IP of the server into a browser. It loaded the website. Why? I have no idea; I couldn’t ping it at all before. After it loaded, I tried to ping again and, for whatever reason, it worked. I was now able to reach the outside world. When I did this on the first machine, I didn’t realize it was the viewing in the browser that fixed it (that’s how unbelievable this solution was). So with one system back up and running, and not knowing why it had suddenly decided to work, I went to the second machine: no access, no ping… Now I was even more confused. Why was one of the systems working and the other not?
After about another half hour, that “for the hell of it” moment returned. I had repeated everything else I did on the first system, so why the hell not try the browser trick? Sure enough, I got access back to the net. That stunned me, because although I now had a solution (well, realized I had a solution), I still couldn’t figure out the reason for the problem in the first place. How is it that these systems just stopped working and the connection to the gateway just “fell asleep”? Why is it that I could not ping the gateway, but putting its address in a web browser magically “woke up” the connection? It makes no sense at all.
When I was told about the third machine, I didn’t even bother doing more than a ping test. The second I saw the failed replies, I went straight to the web browser and presto, the internet started to work again and I could ping everywhere.
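The by-hand check above (ping the address, then try it over HTTP) could be roughly scripted so it is quick to repeat on each machine. This is only a sketch under a few assumptions: that the website answers on TCP port 80 (not stated above), that the system `ping` command is available, and that its count flag is `-n` on Windows and `-c` elsewhere. The function names are made up for illustration. It reports what it sees and does not explain why the browser request woke the connection up.

```python
import platform
import socket
import subprocess


def ping_ok(host: str, timeout_s: int = 5) -> bool:
    """Send one ICMP echo via the system ping command; True if it exits with 0."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    # The exit code is only a rough signal; some ping versions return 0
    # even when the reply is an "unreachable" message.
    return result.returncode == 0


def tcp_ok(host: str, port: int = 80, timeout_s: float = 3.0) -> bool:
    """Try to open a TCP connection, roughly what a browser does first."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def describe(ping_result: bool, tcp_result: bool) -> str:
    """Turn the two results into a short human-readable note."""
    if ping_result and tcp_result:
        return "ping and TCP both work"
    if tcp_result:
        return "TCP works but ping fails"
    if ping_result:
        return "ping works but TCP fails"
    return "neither ping nor TCP works"
```

Calling `describe(ping_ok("192.168.3.1"), tcp_ok("192.168.3.1"))` on an affected machine, once before and once after the browser trick, would at least record whether anything changed between the two.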
So what was the problem? A ghost in the machine, perhaps? I don’t think I will ever know what caused the systems to go offline, or why nothing would work until I put the IP into a web browser. All I know is that I spent the better part of the day trying to find a solution to a very strange problem, and the solution I found made just as much sense (or lack thereof) as the problem.
I am leaving the comments open for anyone willing to weigh in on what the problem could have been. I am a tech and I am at a loss as to how this could have happened. If you have any idea, or have had something similar happen, please share.