Losing Communications with the Mount


alex
 

OK, I finally have some time to start playing with my new 1100, and I’m having an issue with losing the connection to the mount.  Everything will be going along fine, and then the driver starts complaining that it can’t communicate with the mount.  Today I started playing with APCC and began defining the horizon, and it happened again.  APCC displays NO RESPONSE FROM MOUNT in the telescope position section.  I also get no response if I try to connect to the mount via a browser, and pinging the mount’s IP address fails too.  The only way I can restore communications is to power cycle the mount, at which point APCC re-establishes the connection.
 
I’m not using a USB connection to the mount.  I’ve connected it directly to my network via Ethernet, as well as to my WiFi.  I have APCC set up to use the Ethernet connection, so WiFi issues aren’t the problem.

The last time this happened, the mount was simply tracking during an exposure, and communications failed about halfway through.  The stars start streaking partway through that exposure, so it’s clear that tracking stopped when the failure occurred.  I got the mount and downloaded all the software last week, so I assume everything is up to date.  The PC runs Windows 10 Pro with the latest updates.

I’m at a loss as to what might be going on.  Any suggestions on how to troubleshoot this?

Alex

 


Ray Gralak
 

Hi Alex,

> Today I started playing with APCC and started to define the horizon, and it
> happened again. APCC displays NO RESPONSE FROM MOUNT in the telescope position section. I also get
> no response if I try to connect to the mount via a browser. Pinging the mount’s IP address also fails. I can only
> get communications with the mount if I power cycle it, at which point APCC re-establishes communications.

If the mount won't ping, there could be a network or power issue.

Are you directly connecting the computer to the mount? Or, is it going through a network router/switch/hub?

Try replacing the network cable(s) and any hub/switch.

Also, try raising the timeout value slightly in APCC. You should not need to go over 300 msecs for most requests. The mount will respond to most requests very quickly, but depending on the performance of your computer, the responses might not make it back through the network protocol stack to APCC before a timeout occurs.
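
One way to put numbers on this before changing the APCC timeout is to time a few connections to the mount yourself. The sketch below is only an illustration: it times a plain TCP connect, which is not the same thing as an APCC command round trip, and the address `192.168.1.50` and port 80 (the mount's web page) in the usage comment are placeholder assumptions to replace with your own values.

```python
import socket
import time
from typing import List, Optional


def tcp_connect_ms(host: str, port: int, timeout_s: float = 3.0) -> Optional[float]:
    """Time one TCP connect in milliseconds; return None if it fails or times out."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return (time.perf_counter() - start) * 1000.0
    except OSError:  # refused, unreachable, or timed out
        return None


def sample_connects(host: str, port: int, count: int = 20,
                    pause_s: float = 0.5) -> List[Optional[float]]:
    """Collect `count` connect timings, pausing between attempts."""
    results = []
    for _ in range(count):
        results.append(tcp_connect_ms(host, port))
        time.sleep(pause_s)
    return results


# Example (placeholder address and port, substitute your own):
# timings = sample_connects("192.168.1.50", 80)
# print(max(t for t in timings if t is not None), "ms worst case")
```

A `None` in the results means that attempt failed outright, which is a different symptom from a slow but successful reply.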

Lastly, try switching the network protocol type. In APCC, if you are using UDP, try TCP instead, or vice versa.

-Ray

-----Original Message-----
From: main@ap-gto.groups.io [mailto:main@ap-gto.groups.io] On Behalf Of alex
Sent: Tuesday, May 4, 2021 11:27 PM
To: main@ap-gto.groups.io
Subject: [ap-gto] Losing Communications with the Mount





Steve Reilly
 

To add to Ray's post, we had issues with the scope at SRO, which was wired (CAT5) through a switch instead of directly to the computer. When I discovered this, I had the guys at SRO wire the mount directly to the second LAN port on the computer. Of course, the IP address of the CP4 had to be discovered and entered, but since then there hasn't been a single COM error. If your mount isn't wired directly to the computer, it's worth the effort, even if a second LAN port is only available via an add-on card. Our motherboard had two native LAN ports.

-Steve

-----Original Message-----
From: main@ap-gto.groups.io <main@ap-gto.groups.io> On Behalf Of Ray Gralak
Sent: Wednesday, May 5, 2021 6:10 AM
To: main@ap-gto.groups.io
Subject: Re: [ap-gto] Losing Communications with the Mount





Howard Hedlund
 

Hi Alex,
For starters, I would hook up a USB cable as your backup COM port in APCC.  That's just a good idea regardless.  Ethernet connections through a network are usually our preferred method for communicating.  Here are a few things to consider:
  • Do each of these things one at a time, or you'll never know what the actual problem was.
  • Ray mentioned increasing the timeout to 300 mSec.  I would go even higher, especially if you have other users on this same network. If you have 2 or 3 people on the same home network, all streaming different 4K movies, you are bound to need occasional longer timeouts for the mount.  The IEEE protocol (if I understand and remember correctly) was originally set up for timeouts up to 3000 mSec (3 seconds).
  • Try a new cable or cables, and try a different port on your router or switch.  Check the cable ends, and check the receptacle.  Outdoor switches in observatories are great places for spiders to call home.
  • Change from the network (GTOCP4 is a client) to operating peer-to-peer (GTOCP4 is a server).  This setting can be changed on the webpage if you can access it.  It can also be changed using USB or serial and the SerialUtilities.jar program that is on your thumb drive.  You MUST power-cycle the CP4 for any changes to take effect.
  • If none of the above seems to help:
      1. Ground yourself to allow any static to discharge.
      2. Carefully remove the white top cover of the CP4.  Mind the WiFi antenna lead!  The left side is taken up by a digital board that is mounted on the main board below.
      3. Note whether there are screws or snap fittings holding the board in place.  The board may have been knocked partially loose by rough handling from shippers.  This is unlikely if the CP4 is new enough to have the screws, but either way, check that the board is properly seated.  The location of the connector between the boards is shown on the photo.  (This board is a prototype and will look a bit different from yours.)  This is the only place that may need reseating.
  • If you continue to have issues, call me at AP.
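
If you have measured some round-trip times on your own network, the timeout advice above can be turned into a rough rule of thumb. This is a minimal sketch, not anything APCC does internally: the 300 ms starting point and 3000 ms ceiling simply echo the figures mentioned in this thread, and the 2x headroom factor is an arbitrary assumption.

```python
from typing import Sequence


def suggest_timeout_ms(round_trips_ms: Sequence[float],
                       headroom: float = 2.0,
                       start_ms: float = 300.0,
                       ceiling_ms: float = 3000.0) -> float:
    """Suggest a timeout: worst observed round trip times `headroom`,
    never below `start_ms` and never above `ceiling_ms`."""
    if not round_trips_ms:
        raise ValueError("need at least one round-trip measurement")
    candidate = max(round_trips_ms) * headroom
    return min(max(candidate, start_ms), ceiling_ms)
```

For example, with a worst observed round trip of 450 ms the function suggests 900 ms, while a quiet network with 40 ms round trips stays at the 300 ms starting point.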


mjb87@...
 

I had similar problems.  My Mach2 was connected via a wired Ethernet connection but ultimately needed to pass through a series of managed LAN switches. My hypothesis is that, when the switches are heavily loaded with traffic, the time it takes to get the signal through all of the switches exceeds the response time the CP4 or CP5 is looking for. My setup was CP5->Switch1->Switch2->Switch3->MainSwitch->Router. Most of the switch connections are fiber; only one is Cat6.

I simplified the LAN setup
- Removed one switch from the daisy chain
- Reduced the security-camera traffic going through those switches

At some point I will investigate setting switch port priorities that favor the CP4/5 links but I need to read up on that first since I am currently using jumbo frames and my switches (TrendNet TPE series) won't let you do that with jumbo frames.

Since then I haven't seen the issue nearly as much, if at all. I also added a USB backup connection which seems to work well.

Marty


 

Marty, are you using UDP or TCP?


On Wed, May 5, 2021 at 9:03 AM mjb87 via groups.io <mjb87=verizon.net@groups.io> wrote:



--
Brian 



Brian Valente


alex
 

I had already upped the timeout to 200ms, so I’ll try 400ms.  The mount is directly connected to a switch in my observatory.  The only other thing plugged into that switch is my UniFi WiFi access point, which is mounted in the observatory.  My computer (a piggybacked Eagle 2) is the only thing using that access point, so the communication path is Eagle2 -> AP -> Switch -> Mount.  That switch is backhauled to my house’s main switch, and the only traffic between the house and the observatory is my Mac connecting to the Eagle 2 using Microsoft Remote Desktop.  The Remote Desktop connection to the Eagle 2 has been rock solid.
 
I had pings repeating from my wired Mac in the house, and when the problem happens, the pings start failing and stay failing until the mount is power cycled, at which point the pings start working again. APCC re-establishes communications once the mount is power cycled with no other intervention on my part.
 
Also, the mount stops tracking when communications start failing.  If it were just a communications failure between the computer and the mount, wouldn’t the mount continue to track?  When the mount is in this state, the GTOCP4’s power light is on and the Ethernet activity lights continue to flicker.
 
Communications failed again as I was writing this response.  The mount was parked at the time.  I had bumped the timeout to 400ms and switched to UDP beforehand.  Pings to the mount’s hard-wired Ethernet IP address are failing, but curiously I can ping the mount’s WiFi IP address.  However, if I disconnect from the mount in APCC and try connecting via that WiFi address, APCC still gets no response from the mount.  Again, a few seconds after power cycling the GTOCP4, everything is working again.
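
A repeating ping like the one described here can be scripted so that each dropout and recovery is timestamped for both of the mount's addresses. The following is a rough sketch under stated assumptions: the two IP addresses in the usage comment are placeholders, the script relies on the system `ping` command, and ping exit codes are not fully consistent across platforms, so treat the result as an indicator rather than proof.

```python
import platform
import subprocess
import time
from datetime import datetime
from typing import Dict, Iterable, List, Tuple


def ping_command(host: str) -> List[str]:
    """Build a one-echo ping command (`-n` on Windows, `-c` elsewhere)."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    return ["ping", count_flag, "1", host]


def is_reachable(host: str, timeout_s: float = 2.0) -> bool:
    """Return True if a single ping exits successfully within `timeout_s` seconds."""
    try:
        result = subprocess.run(ping_command(host), stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL, timeout=timeout_s)
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def state_changes(samples: Iterable[Tuple[str, bool]]) -> List[Tuple[str, bool]]:
    """Keep only the samples where reachability differs from the previous one."""
    changes = []
    previous = None
    for stamp, ok in samples:
        if ok != previous:
            changes.append((stamp, ok))
            previous = ok
    return changes


def monitor(hosts: Dict[str, str], interval_s: float = 5.0, rounds: int = 720) -> None:
    """Print a timestamped line whenever a host's reachability changes."""
    last: Dict[str, bool] = {}
    for _ in range(rounds):
        for label, host in hosts.items():
            ok = is_reachable(host)
            if last.get(label) != ok:
                print(datetime.now().isoformat(timespec="seconds"), label,
                      "reachable" if ok else "NOT reachable")
                last[label] = ok
        time.sleep(interval_s)


# Example (placeholder addresses, substitute the mount's real ones):
# monitor({"CP4 Ethernet": "192.168.1.50", "CP4 WiFi": "192.168.1.51"})
```

Comparing the timestamps for the two addresses against the APCC log should show whether the Ethernet side drops on its own or whether both interfaces go quiet together.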
 
I’ll try snaking a USB cable down from the Eagle 2 to the mount and try that as backup or perhaps the primary.  If that also fails, then I’ll pop open the GTOCP4 and check the daughter board seating.
 
Alex
 


 

Hi Alex

In my experience, TCP is a more reliable protocol than UDP.

I had similar timeout issues; I switched to TCP and that resolved them for me.

On Wed, May 5, 2021 at 3:40 PM alex <groups@...> wrote:
 



--
Brian 



Brian Valente


Ray Gralak
 

> Also, the mount stops tracking when communications starts failing. If it was just a communications failure
> between the computer and the mount, wouldn’t the mount continue to track?

When APCC is in use, its Safety Park feature will cause the mount to stop tracking if the mount does not receive regular messages from APCC.

So, that the mount stopped tracking indicates that communication failed in some way.
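
The general idea of stopping when regular messages fail to arrive is a watchdog timer. The toy sketch below illustrates only that concept; it is not APCC's or the CP4's actual logic, and the class name, the 5-second gap, and the injectable clock are all hypothetical choices made for the example.

```python
import time
from typing import Callable


class CommsWatchdog:
    """Toy watchdog: tracking stops once no message arrives within `max_gap_s`."""

    def __init__(self, max_gap_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_gap_s = max_gap_s
        self._clock = clock
        self._last_message = clock()
        self.tracking = True

    def message_received(self) -> None:
        """Record that a message from the controlling software just arrived."""
        self._last_message = self._clock()

    def poll(self) -> bool:
        """Stop tracking if the gap was exceeded; return the current tracking state."""
        if self.tracking and self._clock() - self._last_message > self.max_gap_s:
            self.tracking = False  # stays stopped until restarted explicitly
        return self.tracking
```

With a 5-second limit, a message at t = 3 s keeps `poll()` returning `True` at t = 7 s, but by t = 8.5 s the gap has been exceeded and `poll()` returns `False`, which mirrors the streaked stars in the interrupted exposure.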

-Ray




Christopher Erickson
 

TCP is more reliable than UDP if there are a bunch of lost packets on your network for some reason: a bad cable someplace, congestion, etc. In other words, TCP can hide a network problem that UDP does not. If UDP doesn't work, it is worthwhile to find out why and fix it.
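
To see why UDP exposes loss, note that a UDP sender gets nothing back unless the other end replies, and nothing is retransmitted for it. The sketch below counts how many replies come back from a UDP echo service. It is an illustration only: the function name is made up, and the thread does not say which UDP port or message format the CP4 uses, so point it at an echo service you control on the same network path rather than at the mount.

```python
import socket


def count_udp_replies(host: str, port: int, payload: bytes = b"ping",
                      attempts: int = 10, timeout_s: float = 0.3) -> int:
    """Send `attempts` datagrams and count the replies that arrive in time."""
    replies = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_s)
        for _ in range(attempts):
            try:
                sock.sendto(payload, (host, port))
                sock.recvfrom(2048)
                replies += 1
            except OSError:  # includes timeouts: UDP itself never retries
                pass
    return replies
```

If fewer replies than attempts come back on a wired link, that points to the kind of loss that TCP would quietly paper over with retransmissions.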

Also check your Ethernet cable lengths. Any cable over 100m can cause timeouts, retransmission congestion and packet loss.

Mixing different brands and vintages of Ethernet switches can sometimes cause problems. Different vintage Ethernet transceiver chips, different protocol capabilities, etc.

Get rid of any old Ethernet hubs.

Home made cables can have various issues due to bad crimps, crossed pairs, etc.

Check all cable connectors & sockets for oxidation, corrosion, bent pins, etc.

Make sure there is only one DHCP server on your network.

Always use Ethernet instead of WiFi, when you can.

The most robust and reliable mount communications option of all is RS-232 to a real serial port on your observatory computer. Second-best option is USB. Third is Ethernet and last place goes to WiFi. Ethernet is less reliable than USB because Ethernet and IP are connectionless, multi-point protocols, which makes device-to-device communications vulnerable to disruption by many more things. Also, most USB connectors are total, unreliable cr*p.

Wireshark is a free, open-source network diagnostic tool that can give you insights into your network. It is very powerful and does have a bit of a learning curve.

PingPlotter is a great, simple diagnostic tool that can be used to track down network congestion and packet loss in your network and your Internet connection.

Make sure you aren't suffering from duplicate IP addresses on your network. The WiFi and Ethernet ports MUST have different IP addresses from each other. Same goes for every single device port on your network.
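
If you keep a list of the addresses assigned on your network, a duplicate is easy to spot mechanically. This is a small sketch with made-up device names and addresses; it only checks an inventory you type in yourself and does not scan the network.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, List


def find_duplicate_ips(assignments: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each IP address used by more than one port to the ports using it."""
    by_ip = defaultdict(list)
    for port_name, address in assignments.items():
        # ip_address() raises ValueError on malformed input.
        by_ip[str(ipaddress.ip_address(address.strip()))].append(port_name)
    return {ip: names for ip, names in by_ip.items() if len(names) > 1}


# Hypothetical inventory: both CP4 ports were given the same address by mistake.
inventory = {
    "CP4 Ethernet": "192.168.1.50",
    "CP4 WiFi": "192.168.1.50",
    "Eagle 2": "192.168.1.20",
}
print(find_duplicate_ips(inventory))
```

An empty result means no address appears twice in the inventory you supplied.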

If you have smart Ethernet switches that let you lock Ethernet ports to specific, lower speeds, try lowering them all to 10 or 100 Mbps and see if your network problems go away. It could be an auto-negotiation incompatibility between the switch and your CP4. Also, if you have a smart switch, check its port statistics for clues.

I hope this helps.

-Christopher Erickson
Observatory engineer
Waikoloa, HI 96738
www.summitkinetics.com
   


On Wed, May 5, 2021, 2:35 PM Ray Gralak <iogroups@...> wrote:







 

All good advice from Christopher. Adding on to this,

 

Try increasing your timeout even more. 400ms is high for serial and is fine for Ethernet on its own, but it can be on the low side for WiFi connections. Additionally, can you send us a picture of the setup? Sometimes a problem isn't obvious from the description alone. After that, checking all of your cables, even the power for the Ethernet switch, is a good thing to do.

 

Liam

 

From: main@ap-gto.groups.io <main@ap-gto.groups.io> On Behalf Of Christopher Erickson
Sent: Thursday, May 6, 2021 1:27 AM
To: main@ap-gto.groups.io
Subject: Re: [ap-gto] Losing Communications with the Mount

 







Donald Gaines
 

Hi Christopher,
My computer has no RS-232 (serial, I think) ports, just USB.  Would the 15’ serial cable that comes with the 1100GTO mount, along with the Astro-Physics serial-to-USB converter, provide the same performance and reliability as the RS-232 connection you mention?
BTW, Observatory Engineer.....in Hawaii.....you must really enjoy going to work!
Thanks,
Don Gaines


> power cycling the GTOCP4, everything is working again.
>
> I’ll try snaking a USB cable down from the Eagle 2 to the mount and try that as backup or perhaps the primary.
> If that also fails, then I’ll pop open the GTOCP4 and check the daughter board seating.
>
> Alex
>
>







Howard Hedlund
 

Using the supplied serial cable with a USB to serial converter, even the FTDI unit that we sell, will still be limited in its reliability to that of USB.  Chris is referring to a native serial port on a dedicated board inside the computer itself that avoids the Universal Serial Bus.  Our on-board USB port uses the same FTDI chipset, but it has an added advantage over an external USB to serial device.  An external device is powered solely from the bus in the computer.  Our port, however, is also powered so a power drop from the computer won't kill the USB connection because the CP4/5 will keep it alive.


Seb@stro
 

> Ethernet is less reliable than USB because Ethernet and TCP use connectionless, multi-point protocols that make any device-to-device communications more vulnerable to disruption by a multitude more things
 
Not sure where you got that from... Communication link reliability has little to do with the underlying protocol, provided it is used in the proper context, is well managed, and sits within a properly designed "network" architecture. 
 
Also, TCP IS a connection-based protocol. That is the very reason it is considered "more reliable" than UDP, the latter being a "best-effort" protocol with no handshake and fewer error-correction mechanisms but typically lower latency. Both have use cases where they shine. USB and WiFi are no different. Some are more complex to manage for a "standard" end-user than others, that's all. 
 
The way some manufacturers implement their datalink solution is another key factor. Don't expect Ferrari performance from a Chevy van. And don't drive a Ferrari when you've always driven a Chevy van (at least not without a proper training)...
 
That aside, OP seems to have isolated the problem between the mount and his wired computer. And I agree an IP address conflict (connected devices with the same address) could be the culprit here given the symptoms. In that case, I would expect communication to re-establish by itself after a few seconds/minutes (without power cycling the mount), then fail again a few seconds/minutes later, re-establish, and so on. A test to verify that would be to run a "perpetual ping" (add "-t" to the usual ping syntax from the Windows command line, e.g. ping X.X.X.X -t, where X.X.X.X is the mount's IP address) and let it run for several minutes. Hit CTRL+C to end the ping.
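
If you would rather have timestamps than a scrolling window, the same test can be scripted. The following is only a sketch under a few assumptions: it shells out to the operating system's own ping command, it treats exit code 0 as "answered" (on Windows a "destination host unreachable" reply can also return 0, so check the raw output if the results look odd), and the function names are my own. It logs only the moments when the mount changes between responding and not responding.

```python
import platform
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=2):
    """Send one echo request with the system ping command; True if it was answered."""
    if platform.system() == "Windows":
        cmd = ["ping", "-n", "1", "-w", str(int(timeout_s * 1000)), host]
    else:
        cmd = ["ping", "-c", "1", host]
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 1,  # hard stop in case ping itself hangs
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def watch(host, probe=ping_once, interval_s=1.0, count=None, log=print):
    """Probe repeatedly and log only up/down transitions; returns the transitions."""
    transitions = []
    previous = None
    done = 0
    while count is None or done < count:
        ok = probe(host)
        if ok != previous:
            stamp = datetime.now().isoformat(timespec="seconds")
            transitions.append((stamp, ok))
            log(f"{stamp} {host} {'responding' if ok else 'NOT responding'}")
            previous = ok
        done += 1
        if count is None or done < count:
            time.sleep(interval_s)
    return transitions


# Usage (runs until interrupted with CTRL+C); the address is a placeholder:
# watch("192.168.1.50")
```

A log that flips back and forth every few minutes would fit the address-conflict theory. A single "NOT responding" entry that never recovers until the mount is power cycled points somewhere else.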
 
Wireshark isn't a tool for a "standard" end-user. From the OP's posts, I'd say he probably knows a bit about networking and probably already uses it. If not, I would rather suggest downloading Advanced IP Scanner (free) or similar, which will help discover every device alive (responsive) and dead (not responsive for a small amount of time) on a network. It will also show the MAC address (a unique hardware network identifier) of every device discovered. Try running a scan while the mount is responsive and take note of the MAC address associated with its IP address. Run it again when the mount becomes non-responsive: if the tool marks the address as "alive" and shows a different MAC address, you do have a duplicate IP address in your network. The "Name" and "Manufacturer" listed will also help you identify which device is using the same IP address as the mount. If the device is marked "dead" when the mount is non-responsive, then the problem is probably elsewhere.
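
The same before/after MAC comparison can be done without extra software by saving the output of "arp -a" twice (once while the mount responds, once after it stops) and comparing the entries. The sketch below assumes the usual Windows and Linux/macOS output layouts, and the helper names are mine. Keep in mind that the ARP table only covers devices on the same subnet that the computer has talked to recently.

```python
import re

IP_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
MAC_RE = re.compile(r"\b[0-9A-Fa-f]{1,2}(?:[:-][0-9A-Fa-f]{1,2}){5}\b")


def parse_arp(text):
    """Map IP address -> MAC address (lowercase, colon separated) from 'arp -a' text."""
    table = {}
    for line in text.splitlines():
        ip, mac = IP_RE.search(line), MAC_RE.search(line)
        if ip and mac:  # header and "incomplete" lines have no MAC and are skipped
            octets = re.split(r"[:-]", mac.group(0))
            table[ip.group(0)] = ":".join(o.zfill(2).lower() for o in octets)
    return table


def mac_before_after(ip, before_text, after_text):
    """Return the MAC seen for `ip` in each capture (None if the entry is missing)."""
    return parse_arp(before_text).get(ip), parse_arp(after_text).get(ip)


# Usage with two saved captures (file names are placeholders):
# before = open("arp_working.txt").read()
# after = open("arp_failed.txt").read()
# print(mac_before_after("192.168.1.50", before, after))
```

Two different MAC addresses for the mount's IP would mean another device has taken over that address. The same MAC, or no entry at all, suggests the mount itself has gone quiet.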
 

 
Another useful (a bit more advanced) command, if you are familiar with your network IP addressing, is the "tracert" command ("traceroute" in Linux), which will essentially show the path (routing hops) taken by a packet from the computer where you entered the command to the destination device. Its usage is similar to the ping command, e.g. tracert X.X.X.X, where X.X.X.X is the mount's IP address (wired or wireless). Some routers/firewalls might block this request though, and you only get a series of *** plus a timeout message instead of actual routing hop IP addresses, which won't help you much. If you have a "flat" network architecture or only one routing instance, it will only return the destination IP address, which won't help you much either (see example below).
 

 
If it goes through, however, it will help identify up to which routing network component (switch, router, access point, etc.) communication is achieved properly by returning the series of IP addresses that packets pass through on the way to their destination, as well as the round-trip time between hops. Over modern wired Ethernet links, expect values below 100ms. Over wireless links, it can go much higher depending on multiple factors, but I'd say below 150-200ms on average would be acceptable (but not particularly good). Above those figures, you possibly have a bottleneck somewhere or a failing network component. BTW that command can even be used over the internet with domain names. The example below is between my computer and google.ca. The last line shows the destination IP address of one of the servers hosting the domain google.ca. Lines 1-8 show the routing instances every data packet has to cross to reach that Google server from my computer.
 

 
 
Note that the fact that the mount is unresponsive on both interfaces (Ethernet and WiFi) at the same time is also a clue that the problem comes from a source common to both, hence probably not from the wireless Access Point.
 
Also worth mentioning, even if you haven't said you are using one, are firewalls (sorry, I'll get a bit technical here). While the firewall is often not the root cause of the problem, it can be the one ending the communication by dropping data packets. New generations of firewalls (even home routers, WiFi or wired, with firewall functionality) have dynamic adaptive algorithms that "recognize" the type and "behavior" of data traffic that goes through them. They do that to prevent, amongst other things, DoS (denial of service) attacks, which consist of an attacker flooding a computer with a massive number of requests until it crashes by running out of memory.

Now, I've monitored Ethernet/WiFi communications between the APCC/APv2 ASCOM drivers and my mount's CP5 (using Wireshark), and based on the number of connections used (not talking about physical hardware connections here, rather software connections at OSI model layer 4), the traffic could well be mis-recognized by some firewall's algorithm as a DoS attack. I'm saying this because it creates a new connection for seemingly every data exchange between the computer and mount, which occurs very often, like every second or so. They possibly implemented it that way for heartbeat or synchronization purposes, but that's only a guess. (That left me scratching my head a bit BTW, as there are more memory-efficient ways of accomplishing this.) Anyway, the thing to note here is that there is nothing you can do about that last part, as it's an inherent property of AP's communication between controlling software and mount.
 
But if you are using such a next-gen firewall with that kind of security feature, it could result in similar symptoms to what you are experiencing: communications working for a while and then stopping entirely when packets are dropped. Note here that the firewall is simply doing its job of protecting you. I therefore wouldn't recommend disabling this security feature entirely to solve the problem if that proves to be the case. Rather, I'd try creating a rule to whitelist the mount's IP address in your firewall's configuration in that regard.

Hope this helps as well,
 
Sébastien


Christopher Erickson
 

If your PC doesn't have a native RS-232 serial port and you need to go 15' (or even 50'), it would be much better to combine a long serial cable with a short USB-to-Serial adapter at the PC. Serial can go the distance, while USB can get rather flaky over about 2 meters.

And yep, I love my work!

-Christopher Erickson
Observatory engineer
Waikoloa, HI 96738
www.summitkinetics.com
   

On Thu, May 6, 2021, 6:16 AM Donald Gaines <onegaines@...> wrote:
Hi Christopher,
My computer has no RS-232 (Serial I think) ports, just USB.  Would the 15’ serial cable that comes with the 1100GTO mount along with the Astro-Physics Serial to USB converter, provide the same performance and reliability as the RS-232 you mention below?
BTW, Observatory Engineer.....in Hawaii.....you must really enjoy going to work!
Thanks,
Don Gaines

Christopher Erickson
 

If a person is going over 2m or so between the mount and the PC, I would avoid using the USB port on the CP4.

-Christopher Erickson
Observatory engineer
Waikoloa, HI 96738
www.summitkinetics.com
   



Christopher Erickson
 

My experience with TCP comes from 30 years of telecommunications and robotics engineering. My primary concerns are much more with Layer-1 of the OSI model (cables, connectors) and Layer-2 (Ethernet frames), not Layer-3 (IP packets) or Layer-4 (TCP/UDP.) 

OSI Layers 1 & 2 are VERY opaque to the average user so consequently they are usually ignored when troubleshooting. I think this is typically a mistake. Sort of like looking for your car keys under a nice streetlight instead of next to your car, where you dropped them. 

PingPlotter is a very graphical, visual troubleshooting tool that has a free version. It is PROFOUNDLY better and more intuitive than using the DOS prompt command line Ping command. PingPlotter also incorporates a very nice, visual, graphical, dynamic traceroute. Download it and try it out. You won't go back to the nasty DOS prompt command line ever again, unless forced to on a strange machine. 

I agree Wireshark is a complicated tool. I already stated that. However I believe that the typical AP mount owner is more qualified than the average person to gain benefit from it, given some time. I would add that starting with PingPlotter instead of Wireshark would be good.

It could be bad to have a firewall or router in between the CP4/5 and the observatory PC. If there is, it might have LAN packet filtering capabilities, which I would disable, if I could.

-Christopher Erickson
Observatory engineer
Waikoloa, HI 96738
www.summitkinetics.com
   




Donald Gaines
 

Hi Howard,
Thanks for the info. I thought I might have to add a card with serial ports. I’ll go that route. Thanks for your advice. 
Regards,
Don Gaines




Steve Reilly
 

I’ve used a ton of the Startech Serial cards and they work great. Never had a failure in any of the systems I’ve built. The one I use now has 8 serial cables on the rear but you can get fewer. The card is similar to this version

 

-Steve

 

 




alex
 

Ok, I opened up the GTOCP4 yesterday and the daughter board seems to be seated fine.  I put it back and switched the ethernet cable to a brand new professionally made 15’ cable (ie, I didn’t put the connectors on), and changed the ports on the switch it was plugged into.  I rebooted the switch as well in case it was in some weird state.  I also hooked up the GTOCP4 to the eagle 2 directly via USB and configured it as the backup port.  The primary connection was configured to be the ethernet connection using TCP and a 500ms timeout.
 
The switch and AP (and all my networking infrastructure) are Ubiquiti UniFi gear (a prosumer/SOHO brand), so there is no mixed-brand incompatibility in my network infrastructure.  After the initial failures, I configured the router’s DHCP server to assign a fixed IP address to the mount instead of a dynamic one.  I’m pretty obsessive about managing my IP address space and am fairly certain there aren’t other devices colliding with it.  The switch is fairly recent, a UniFi US-8-60W, and is a fully managed smart switch.  I suppose I could configure a separate VLAN for the observatory and put the mount and the eagle 2 on it as the only hosts to make sure there isn’t interference from other devices on the network, though that seems like overkill.
 
Last night the ethernet connection failed again, but I didn’t notice right away this time because APCC successfully failed over to the USB connection, so that worked great.  I had a perpetual ping repeating once a second the whole night, and it showed response times typically between 2 and 9 milliseconds, with occasional 30-50ms ones and a few 1-2 second ones here and there.  Right before the connection failed, the last few pings had 2-7ms ping times, then all the ping requests started timing out. These timeouts have been non-stop for the last 12 hours or so.
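
For anyone wanting to pull the same numbers out of an overnight ping log, here is a minimal sketch. It assumes the log has already been reduced to a list of round-trip times in milliseconds, with None standing in for each timed-out request. The function name and the sample data are made up for illustration.

```python
from statistics import median


def summarize_pings(samples):
    """Summarize a ping log given as RTTs in ms, with None for each timeout."""
    replies = [s for s in samples if s is not None]
    # Length of the unbroken run of timeouts at the end of the log.
    trailing = 0
    for s in reversed(samples):
        if s is not None:
            break
        trailing += 1
    outage_start = len(samples) - trailing if trailing else None
    before = []
    if outage_start is not None:
        before = samples[max(0, outage_start - 5):outage_start]
    return {
        "replies": len(replies),
        "median_ms": median(replies) if replies else None,
        "max_ms": max(replies) if replies else None,
        "isolated_timeouts": samples.count(None) - trailing,
        "outage_start_index": outage_start,
        "before_outage": before,  # up to five samples just ahead of the final outage
    }


# Made-up log: normal replies, one stray timeout, then a permanent outage.
log = [3, 5, None, 40, 2, 7, None, None, None]
print(summarize_pings(log))
```

Healthy round-trip times right up to the start of the final outage, as described above, argue against a link that was slowly degrading.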
 
While communications were failing, the UniFi controller software didn’t show any abnormal packet loss on the port, and I sshed directly into the switch and poked around the internal logs without seeing anything fishy.  I tried changing the port the ethernet was plugged into, and power cycling the switch to see if the problem was some bad state the switch was in.  Neither woke up the TCP connection. The only thing that fixes it is power cycling the GTOCP4, so to me this seems to be a problem with the state the GTOCP4 is in.  If there was some persistent ongoing problem with the network infrastructure, then a reboot of the mount wouldn’t fix the problem.
 
I’m still mystified as to what’s going on with that ethernet connection.  I’m a software engineer and have been programming IP networks professionally for over 30 years, and I’ve never seen behavior like this. While I haven’t formally done IT/Ops work (I program back end web services nowadays), I’ve set up plenty of IP networking equipment over the years.
 
I could see a transient communications problem starting things off, but that wouldn’t explain the inability to ping until the GTOCP4 is power cycled, which magically fixes everything. It appears that some problem occurs and the GTOCP4 goes into a mode where the network is down and only a power cycle resets it.  I’ve never encountered any device with this behavior.  
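
One way to narrow down what "the network is down" means on the mount is to test a TCP service on each interface separately from ping. The sketch below rests on assumptions: the addresses are placeholders, port 80 is a guess based on the mount normally answering a web browser, and `tcp_port_open` is just a thin wrapper around Python's standard socket module.

```python
import socket


def tcp_port_open(host, port, timeout_s=2.0):
    """True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


def probe_interfaces(targets, timeout_s=2.0):
    """Check several (host, port) pairs; returns {(host, port): reachable}."""
    return {(host, port): tcp_port_open(host, port, timeout_s) for host, port in targets}


# Usage with placeholder addresses for the mount's wired and WiFi interfaces:
# print(probe_interfaces([("192.168.1.50", 80), ("192.168.1.51", 80)]))
```

If the WiFi address still answers ping but refuses TCP connections while the Ethernet address answers nothing, that would be consistent with the controller's network stack being wedged rather than with a cabling or switch fault.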
 
What’s the OS on this thing?  It has a reasonably capable ARM processor.  Is it running Linux or some embedded OS?  Is it possible to SSH into this thing and poke around, check some logs, do something like an ifconfig, netstat, etc ?  While the USB connection seems to be working well, I still want to track down what’s going on with the ethernet connection.  It’s been my experience that wired ethernet connections are pretty rock solid assuming you avoid problems like long distances or interference with electrical wiring.  My cable isn’t near anything like that.
 
I may try plugging my Mac directly into the same switch and see what Wireshark shows, if anything.  It’s been a few years since I’ve messed with it.
 
Alex