However, today another virtual server has started doing the exact same thing! It responds to about half of the pings we send to it and RDP only connects sometimes, yet on the server itself you can see it is doing absolutely nothing. The really odd thing is that it can ping out to other devices on the network without a problem, so it's only inbound network connections that are intermittently having problems.
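To put a number on "about half of the pings", a small logger along these lines could be left running against the affected VM. This is only a sketch: `ping_once` and `summarize` are hypothetical helper names, the ping flags assume the stock Windows or Linux `ping` command, and on Windows an exit code of 0 can also be returned for a "Destination host unreachable" reply, so treat the result as approximate.

```python
import platform
import subprocess


def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Send one ICMP echo via the OS ping command; True if it exited with 0."""
    if platform.system() == "Windows":
        # -n = count, -w = timeout in milliseconds
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    else:
        # Linux ping: -c = count, -W = timeout in seconds
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def summarize(results: list) -> dict:
    """Summarize a list of True/False ping outcomes."""
    total = len(results)
    lost = results.count(False)
    longest = run = 0
    for ok in results:
        run = 0 if ok else run + 1      # length of the current run of losses
        longest = max(longest, run)
    loss_pct = 100.0 * lost / total if total else 0.0
    return {"sent": total, "lost": lost, "loss_pct": loss_pct, "longest_gap": longest}
```

For example, `summarize([ping_once("10.0.0.20") for _ in range(50)])` (the address is a placeholder) gives the loss percentage and the longest run of consecutive drops, which is handy for comparing results from different points on the network.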
One thing I should point out is that we upgraded our ESX servers to a new version of ESXi a couple of weeks ago, but it seems odd that this server would have been running fine for a couple of weeks and then suddenly have this issue if that was the cause.
We have upgraded VMware Tools on the virtual server with the problem but this made no difference. Not sure what you mean by updating to ESXi though, as that is what we have already done, as I mentioned in this paragraph:
Haha thanks, we just tried that (shut down, remove network card, add network card, boot back up) and it didn't help. Also tried resetting the TCP/IP config on the virtual machine with netsh but again this didn't help.
Well, this has sorted itself out again just like the other server did that had this issue. All pings are now responding fine and I can RDP to the server without a problem. It's just annoying because we have no idea what caused it or what fixed it, so it is probably going to happen again with another machine.
OK, well this issue got a hell of a lot worse yesterday as it struck our main file server (which is also our DHCP server), and it was not just intermittent incoming network connections that were being rejected, it was pretty much all incoming network connections (we got about 1 in 50 pings back). Again the same symptoms though: it could send data out without a problem and other VMs on the same ESX server could send data to it fine, just nothing else on the network could send to it.
We called VMware tech support and they connected to my PC to take a look and, after much troubleshooting, basically said that it is not a VMware problem, it is a network problem. I can see why they said that, but found it hard to believe it could be a network issue when we had found that if we put a different VM on the same ESX server and gave it the same IP as the file server (turning the file server off first, obviously) we did not have any problems at all accessing that VM from anywhere.
The file server VM started playing up again yesterday, so rebuilding it had not actually fixed the problem. What we did find though was that one of the network switches on the other side of the site had started using 100% of its CPU power and when we went and rebooted that switch the file server started working properly again. So we have now replaced the switch and so far have had no more issues!
I just wanted to add here that we had this same issue after a conversion from Hyper-V to VMware. It turns out that Hyper-V will continue to ARP for the machine even when it has been turned off or even removed from its inventory. So far the only quick workaround we have found is rebooting the Hyper-V host to get it to stop.
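If a stale host is still answering ARP for the VM's address, the MAC that other machines hold for that IP should flip between two values over time. A rough way to spot that is to save the output of `arp -a` from a client every so often and compare the snapshots. The sketch below assumes the Windows `arp -a` layout (IP, MAC, type per line); `flapping_ips` is a made-up helper name and the addresses in the usage note are sample data only.

```python
import re
from collections import defaultdict

# Matches lines such as "  10.0.0.20    00-15-5d-01-02-03    dynamic"
ARP_LINE = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+([0-9a-fA-F]{2}(?:[-:][0-9a-fA-F]{2}){5})\b",
    re.MULTILINE,
)


def parse_arp(text: str) -> dict:
    """Return {ip: mac} from one 'arp -a' snapshot, MACs normalized to aa:bb:... form."""
    return {ip: mac.lower().replace("-", ":") for ip, mac in ARP_LINE.findall(text)}


def flapping_ips(snapshots: list) -> dict:
    """Return {ip: set_of_macs} for every IP seen with more than one MAC across snapshots."""
    seen = defaultdict(set)
    for text in snapshots:
        for ip, mac in parse_arp(text).items():
            seen[ip].add(mac)
    return {ip: macs for ip, macs in seen.items() if len(macs) > 1}
```

Feeding it two snapshots in which `10.0.0.20` first shows `00-15-5d-01-02-03` and later `00-50-56-aa-bb-cc` would report that IP with both MACs, which points at two different hosts claiming the same address.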
I have 3 servers: a DC, an Exchange server and an apps server. If I ping the DC from any server, or ping any server from the DC, I get an IPv4 response, as you would expect. However, if I ping Exchange from apps, or the other way round, the result I get is an IPv6 address.
All the NICs have IPv6 unticked so I can't understand why they are responding with these addresses. I also think this is causing problems because the BlackBerry Enterprise Server I have just installed on apps cannot communicate with the Exchange server.
The problem with BES resolving Exchange does seem to be an IP or DNS issue. If I use the IPv4 address of the exchange server in the BES MAPI setup it resolves that name to Exchange, and gets a connection, but then when you start BES it tries to resolve the name, and fails!
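One way to see what a server name actually resolves to from the apps box, independent of ping, is to ask the system resolver and split the answers by address family. This is a minimal sketch using Python's standard `socket.getaddrinfo`; `addresses_by_family` and `resolve` are hypothetical helper names, and `"exchange"` in the usage note stands in for the real host name.

```python
import socket


def addresses_by_family(infos) -> dict:
    """Group getaddrinfo() results into IPv4 and IPv6 address lists (duplicates dropped)."""
    out = {"IPv4": [], "IPv6": []}
    for family, _type, _proto, _canon, sockaddr in infos:
        if family == socket.AF_INET:
            label = "IPv4"
        elif family == socket.AF_INET6:
            label = "IPv6"
        else:
            continue                      # ignore any other address families
        address = sockaddr[0]             # first element is the address string
        if address not in out[label]:
            out[label].append(address)
    return out


def resolve(name: str) -> dict:
    """Resolve a host name with the system resolver and group the answers by family."""
    return addresses_by_family(socket.getaddrinfo(name, None))
```

Running `resolve("exchange")` on the apps server would show whether the name comes back with an IPv4 address, an IPv6 address, or both. On Windows, `ping -4 <name>` forces an IPv4 ping and is a quick cross-check.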
We have an issue on a few of our Windows 2016 servers that started sometime in the last few months and seems to be getting progressively worse. Resetting the system seems to right it for a few days, but inevitably it will slide into the same useless state and require another hard reset. So far the system has bounced back, with sometimes nothing more than a chkdsk, but on our SQL servers this can sometimes take a few minutes of recovery.
These systems run well for a few days, but then we notice that we can no longer connect via RDP. If we try to log in on the VM console, it will usually hang on "Waiting for user profile service", but that never resolves and the console is stuck on that login until reset. The SQL or web service on the VM continues to run as if there is no problem for several hours, but eventually we will notice that the IP address vCenter shows for the server disappears and the box is now completely isolated. We have to hard reset to restore service.
I have run SFC on all of these servers and there is no corruption reported. I ran the DISM tools and it does report the component store can be repaired, but looking in the DISM and CBS logs, there are no errors reported, only Info and Warning. We don't seem to have any problem installing Windows updates; we are patched up to the March roll-up. These servers can't reach the MS Update servers, so I'm not sure how to clear these DISM issues. I have injected from a KB CAB before, but if the logs don't identify a KB, then what?
This behavior, where it works OK for a few days and then services start to die off, sounds to me like a memory leak in some component, but I'm sure there could be other things. We recently installed Elastic Metricbeat to see if we can spot the process that might be running amok.
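If per-process memory samples are being collected anyway, a leak suspect can be ranked by how fast its memory grows between the first and last sample. The sketch below is only an illustration of that idea: the input shape (`{process_name: [(epoch_seconds, bytes), ...]}`) and the name `growth_per_hour` are assumptions, not anything Metricbeat exports in that form, and a simple first-to-last difference will be fooled by processes that legitimately ramp up after a restart.

```python
def growth_per_hour(samples: dict) -> list:
    """Rank processes by memory growth in bytes per hour, fastest growing first.

    samples maps a process name to a list of (epoch_seconds, memory_bytes) tuples.
    Processes with fewer than two samples, or with no time elapsed, are skipped.
    """
    rates = []
    for name, points in samples.items():
        if len(points) < 2:
            continue
        ordered = sorted(points)                      # sort by timestamp
        (t0, m0), (t1, m1) = ordered[0], ordered[-1]
        if t1 == t0:
            continue
        rates.append((name, (m1 - m0) * 3600 / (t1 - t0)))
    return sorted(rates, key=lambda item: item[1], reverse=True)
```

A process that keeps climbing at a steady rate over several days, while everything else stays flat, is the one worth looking at first.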
So I am looking for some tips on things to watch that might cause RDP/User profile service to die, or a NIC to suddenly stop working. I assume that the VMware tools installed on this server are getting killed or choked out by this supposed runaway process.
We got nowhere with this. It just stopped happening. So $500 wasted on MS Technical Services. I am going to assume this was some kind of conflict between our antivirus suite and Microsoft Trusted Installer. That seems to be a common thing we saw in the log files when the crash occurred. I guess a Windows update or a McAfee update resolved the issue at some unknown time. I just hope it doesn't come back.
I have the same problem. This started, I think, in late February or the beginning of March.
At first I would reset the VMs and it would last a few weeks; lately it's a few days with luck.
They all stop responding, and if I try to log on with the console it hangs on the profile.
The only "error" I can see in the logs is this:
svchost (1068) SoftwareUsageMetrics-Svc: A request to write to the file "C:\Windows\system32\LogFiles\Sum\Svc.log"
I don't fully know if this is the actual problem or a byproduct of the hang....
I moved the VM to another server and the problem is the same.
I have Malwarebytes Anti-Ransomware on the servers; I think I'm going to disable it to see if that solves it.
It seems the same for us. We found out the IP is not disappearing; it's just the vmtools service being taken down that makes the IP disappear in vCenter. The VM still pings, it's just that all the services have stopped.
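Since the VM keeps answering pings while the services behind it are dead, a TCP connect test per service tells you more than ICMP does. The following is a minimal sketch using the standard `socket` module; `tcp_port_open` and `service_report` are hypothetical names, and the ports in the usage note are the usual defaults (3389 for RDP, 1433 for SQL Server, 80 for HTTP), which may not match a given setup.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False


def service_report(host: str, ports: dict) -> dict:
    """Check several named ports on one host, e.g. {"rdp": 3389, "sql": 1433}."""
    return {name: tcp_port_open(host, port) for name, port in ports.items()}
```

Calling `service_report("10.0.0.30", {"rdp": 3389, "sql": 1433, "http": 80})` on a schedule (the address is a placeholder) would show which service stops accepting connections first, and roughly when.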
I have been running the same settings for over a year with zero problems and suddenly, starting last night, remote connections aren't working and the server is not responding (local connections work fine). This is confirmed by my remote users, a DNS test site which says the server connection isn't working, and browser tests from within the wired network. I have checked all router, DNS service, and Emby settings and they are all 100% correct. I have turned off and rebooted the modem and routers, rebooted Windows, reflashed the router firmware, deleted and re-entered the port forwarding settings, rebooted Emby Server multiple times, and confirmed I am running the latest Emby Server software. And no, I never had any evidence of the recent "hack" issues. I even tried using a new DNS name as a test and it made no difference. I am not running a VPN during these tests. The software firewall still has EmbyServer authorized, so no blocks there.
My statement that the DNS check service was not working is false. I used a different site and it is working. I was using mxtoolbox.com, but the wrong tab/service. I changed to a different tab on that site and it's working. I also just used and I am showing all green.
That's the wonky thing, I didn't change any settings whatsoever when this all started. I only tested a different DNS domain name after everything else wasn't fixing the problem. I have reverted back to the original domain name as that's what everyone else has in their settings.
The connections being used and tested are all http, not https. I have always only used the normal http. I have both types setup, but only share the normal http.
Would that make a difference in this case?
I'm just trying to eliminate DNS, i.e. making sure the DNS name resolves to your ISP public IP address, which I believe it does if you used the DNS name on 'canyouseeme.org' and it responded that the port (8096/80?) was 'open'.
If you wanna ping me your fqdn via PM, I can check for you - but there is not a lot more anyone can do without this info as 'something' is listening - if you wanna ping it to ebr or luke then that's fine - just trying to help..
Yes, when I go to canyouseeme.org and enter the normal http port I am using and the one external users are accessing it says "success" in green denoting it's not being blocked. I specified nonstandard ports in my setup to avoid any conflicts.