Re: Losing Communications with the Mount


Christopher Erickson
 

My experience with TCP comes from 30 years of telecommunications and robotics engineering. My primary concerns are much more with Layer-1 of the OSI model (cables, connectors) and Layer-2 (Ethernet frames), not Layer-3 (IP packets) or Layer-4 (TCP/UDP).

OSI Layers 1 & 2 are VERY opaque to the average user so consequently they are usually ignored when troubleshooting. I think this is typically a mistake. Sort of like looking for your car keys under a nice streetlight instead of next to your car, where you dropped them. 

PingPlotter is a very graphical, visual troubleshooting tool that has a free version. It is PROFOUNDLY better and more intuitive than using the DOS prompt command line Ping command. PingPlotter also incorporates a very nice, visual, graphical, dynamic traceroute. Download it and try it out. You won't go back to the nasty DOS prompt command line ever again, unless forced to on a strange machine. 

I agree Wireshark is a complicated tool. I already stated that. However I believe that the typical AP mount owner is more qualified than the average person to gain benefit from it, given some time. I would add that starting with PingPlotter instead of Wireshark would be good.

It could be bad to have a firewall or router in between the CP4/5 and the observatory PC. If there is, it might have LAN packet filtering capabilities, which I would disable, if I could.

-Christopher Erickson
Observatory engineer
Waikoloa, HI 96738
www.summitkinetics.com
   


On Thu, May 6, 2021, 9:04 AM Seb@stro <sebastiendore1@...> wrote:
Ethernet is less reliable than USB because Ethernet and TCP use connectionless, multi-point protocols that make any device-to-device communications more vulnerable to disruption by a multitude more things
 
Not sure where you got that from... Communication link reliability has little to do with the underlying protocol, provided it is used in the proper context, well managed and used within a properly designed "network" architecture.
 
Also, TCP protocol IS a connection-based protocol. It is the very reason it is considered "more reliable" than UDP, the latter being a "best-effort" protocol with no handshake and fewer error-correction mechanisms but typically lower latency. Both have their use cases where they shine. USB and WiFi are no different either. Some are more complex to manage for a "standard" end-user than others, that's all.
 
The way some manufacturers implement their datalink solution is another key factor. Don't expect Ferrari performance from a Chevy van. And don't drive a Ferrari when you've always driven a Chevy van (at least not without proper training)...
 
That aside, OP seems to have isolated the problem between the mount and his wired computer. And I agree an IP address conflict (connected devices with the same address) could be the culprit here given the symptoms. In that case, I would expect communication to re-establish by itself after a wait of a few seconds/minutes (without power-cycling the mount) and then fail again a few seconds/minutes later and re-establish, and so on. A test to verify that would be to run a "perpetual ping" (add "-t" to the usual ping syntax from the command line, e.g. ping X.X.X.X -t, where X.X.X.X is the mount's IP address) and let it run for several minutes. Hit CTRL+C to end the ping.
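For anyone who would rather log the results than watch the console, the perpetual ping above can be sketched in Python roughly as follows. This is only an illustration: the function names (ping_once, find_transitions, monitor) are made up for this example, and it assumes the system ping accepts "-n 1" on Windows and "-c 1" elsewhere and returns exit code 0 when a reply arrives (some ping implementations may also report success for other responses).

```python
import platform
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=2.0):
    """Send one echo request with the system ping; True if it reports a reply.

    Assumption: exit code 0 means a reply was received.
    """
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def find_transitions(samples):
    """From [(timestamp, ok), ...] keep only the points where the state changes."""
    transitions = []
    previous = None
    for stamp, ok in samples:
        if ok != previous:
            transitions.append((stamp, "UP" if ok else "DOWN"))
            previous = ok
    return transitions


def monitor(host, count=600, interval_s=1.0, probe=ping_once):
    """Probe the host `count` times and return the UP/DOWN transitions seen."""
    samples = []
    for _ in range(count):
        samples.append((datetime.now().strftime("%H:%M:%S"), probe(host)))
        time.sleep(interval_s)
    return find_transitions(samples)


# Example (replace X.X.X.X with the mount's IP address):
#   for stamp, state in monitor("X.X.X.X", count=600):
#       print(stamp, state)
```

A repeating UP/DOWN/UP/DOWN pattern every few seconds or minutes would match the behavior expected from an address conflict; a single DOWN that never recovers points elsewhere.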
 
Wireshark isn't a tool for a "standard" end-user. From the OP's posts, I'd say he probably knows a bit about networking and probably already uses it. If not, I would rather suggest downloading Advanced IP Scanner (free) or similar, which will help discover every device alive (responsive) and dead (not responsive for a small amount of time) on a network. It will also show the MAC address (which is a unique hardware network identifier) of all devices discovered. Try running a scan while the mount is responsive and take note of the MAC address associated with its IP address. Run it again when it becomes non-responsive and if the tool marks it as "alive" and shows a different MAC address, it means you indeed have a duplicate IP address in your network. The "Name" and "Manufacturer" listed will also help you identify which device is using the same IP address as the mount. If the device is marked "dead" when the mount is non-responsive, then the problem is probably elsewhere.
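The comparison of the two scans boils down to a small decision rule, sketched below in Python. It does not talk to any scanner: it assumes you have typed each scan's results into a dict of IP address to MAC address (with None for a device marked "dead"). The function names and the addresses in the example are hypothetical.

```python
def normalize_mac(mac):
    """Make 'AA-BB-CC-00-11-22' and 'aa:bb:cc:00:11:22' compare as equal."""
    return mac.strip().lower().replace("-", ":")


def check_duplicate_ip(ip, scan_while_working, scan_while_failing):
    """Compare the MAC seen for `ip` in two scans and name the likely situation.

    Each scan is a dict {ip: mac_or_None}; None means the device was "dead".
    """
    before = scan_while_working.get(ip)
    after = scan_while_failing.get(ip)
    if before is None:
        return "no baseline: scan again while the mount is responsive"
    if after is None:
        return "address dead during failure: problem is probably elsewhere"
    if normalize_mac(before) != normalize_mac(after):
        return "different MAC answering: duplicate IP address"
    return "same MAC answering: no evidence of a duplicate"


# Hypothetical example values:
working = {"192.168.1.50": "00-11-22-AA-BB-CC"}
failing = {"192.168.1.50": "de:ad:be:ef:00:01"}
print(check_duplicate_ip("192.168.1.50", working, failing))
```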
 

 
Another useful (a bit more advanced) command, if you are familiar with your network IP addressing, is the "tracert" command ("traceroute" in Linux) which will essentially show the path (routing hops) taken by a packet from the computer from which you entered the command to the destination device. Its usage is similar to the ping command, e.g. tracert X.X.X.X, where X.X.X.X is the mount's IP address (wired or wireless). Some routers/firewalls might block this request though and you only get a series of *** + a timeout message, instead of actual routing hop IP addresses, which won't help you much. If you have a "flat" network architecture or only one routing instance, it will only return the destination IP address which won't help you much either (see example below).
 

 
If it goes through, however, it will help identify up to which network component (switch, router, access point, etc.) communication works properly, by returning the series of IP addresses that packets pass through on the way to their destination, as well as the round-trip time to each hop. Over modern wired Ethernet links, expect values below 100ms. Over wireless links, it can go much higher depending on multiple factors, but I'd say below 150-200ms on average would be acceptable (but not particularly good). Above those figures, you possibly have a bottleneck somewhere or a failing network component. BTW that command can even be used over the internet with domain names. Example below is between my computer and google.ca. The last line shows the destination IP address of one of the servers hosting the domain google.ca. Lines 1-8 show the routing instances every data packet has to come across to reach that google server from my computer.
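As a rough illustration of applying those rules of thumb, the Python sketch below takes hop results you have copied out of a tracert run by hand (it does not parse tracert output itself) and flags the hops worth a closer look. The limits are just the figures quoted above, taking 200ms as the wireless ceiling; the function name and the sample addresses are made up.

```python
WIRED_LIMIT_MS = 100      # rule-of-thumb figure from the text above
WIRELESS_LIMIT_MS = 200   # upper end of the 150-200ms range above


def flag_slow_hops(hops, wireless=False):
    """Return (hop number, address, reason) for hops that look suspicious.

    `hops` is a list of (address, rtt_ms) in path order; use rtt_ms=None
    for a hop that timed out (shown as * in the tracert output).
    """
    limit = WIRELESS_LIMIT_MS if wireless else WIRED_LIMIT_MS
    flagged = []
    for number, (address, rtt_ms) in enumerate(hops, start=1):
        if rtt_ms is None:
            flagged.append((number, address, "no reply"))
        elif rtt_ms > limit:
            flagged.append((number, address, f"{rtt_ms} ms exceeds {limit} ms"))
    return flagged


# Hypothetical hop list: local router, a slow second hop, a silent third hop.
sample_hops = [("192.168.1.1", 2), ("10.0.0.1", 250), (None, None)]
for hop in flag_slow_hops(sample_hops):
    print(hop)
```

Keep in mind that a hop showing "no reply" may simply be a router that does not answer these requests, so it is a hint rather than proof of a fault.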
 

 
 
Note that the fact that the mount is not responsive from either interface (Ethernet and WiFi) at the same time is also a clue that the problem comes from a source common to both, hence probably not from the wireless Access Point.
 
Also worth mentioning, even if you haven't said you are using one, are firewalls (sorry, I'll get a bit technical here). While the firewall is often not actually the root cause of the problem, it can be the one ending the communication by dropping data packets. New generations of firewalls (even home routers - WiFi or wired - with firewall functionalities) have dynamic adaptive algorithms that "recognize" the type and "behavior" of data traffic that goes through them. They do that to prevent, amongst other things, DoS (denial of service) attacks which consist of an attacker flooding a computer with a massive amount of requests until it crashes by running out of memory.

Now, I've monitored Ethernet/WiFi communications between APCC/APv2 ASCOM drivers and my mount's CP5 (using Wireshark) and based on the number of connections (not talking about physical hardware connections here, rather software connections at OSI model layer 4) used, it could well be mis-recognized by some firewall's algorithm as a DoS attack. I'm saying this because it creates a new connection for seemingly every data exchange between the computer and mount, which occurs very often - like every second or so. They possibly implemented it that way for heartbeat or synchronization purposes, but that's only a guess. (That left me scratching my head a bit BTW as there are more memory-efficient ways of accomplishing this). Anyway, the thing to note here is that there is nothing you can do about that last part as it's an inherent property of AP's communication between controlling software and mount.
 
But if you are using such a next-gen firewall with that kind of security feature, it could result in similar symptoms to what you are experiencing: communications working for a while and then stopping entirely when packets are dropped. Note here that the firewall is simply doing its job of protecting you. I therefore wouldn't recommend disabling this security feature entirely to solve the problem if that proves to be the case. Rather, I'd try creating a rule to whitelist the mount's IP address in your firewall's configuration in that regard.
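To make the idea concrete, here is a toy Python model of that kind of protection: it counts new connections per address inside a sliding time window and refuses further ones once a limit is reached, unless the address is whitelisted. This is purely a sketch under my own assumptions. Real firewalls use their own (usually undocumented) heuristics, and the class name, the limits and the 192.168.1.50 address are invented for the example.

```python
from collections import defaultdict, deque


class ConnectionRateFilter:
    """Toy model: allow at most `max_new` new connections per window per peer."""

    def __init__(self, max_new, window_s, whitelist=()):
        self.max_new = max_new
        self.window_s = window_s
        self.whitelist = set(whitelist)
        self.recent = defaultdict(deque)  # peer address -> recent connection times

    def allow(self, peer_ip, now_s):
        """Decide whether a new connection involving `peer_ip` at `now_s` passes."""
        if peer_ip in self.whitelist:
            return True
        times = self.recent[peer_ip]
        # Forget connections that have fallen out of the time window.
        while times and now_s - times[0] >= self.window_s:
            times.popleft()
        if len(times) >= self.max_new:
            return False
        times.append(now_s)
        return True


# One new connection per second, as seen between the driver and the mount.
strict = ConnectionRateFilter(max_new=3, window_s=10)
print([strict.allow("192.168.1.50", t) for t in range(5)])

# Same traffic with the mount's (hypothetical) address whitelisted.
relaxed = ConnectionRateFilter(max_new=3, window_s=10, whitelist=["192.168.1.50"])
print([relaxed.allow("192.168.1.50", t) for t in range(5)])
```

In the first case the traffic passes for a short while and is then dropped, which resembles the symptom described; with the whitelist entry the same traffic always passes while the protection stays active for every other address.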

Hope this helps as well,
 
Sébastien
