AP1100AE needs exorcism (DEC turned all the way around like Linda Blair)


ap@CaptivePhotons.com
 

Thank you for pursuing and fixing what is obviously a rare error.  I'll grab the new version.

Though I shall be careful in that area regardless, and move on to finding even more obscure ways to break things.  :)

Linwood


Ray Gralak
 

Hi Linwood,

The problem did have to do with the refraction option, but it was just the lack of initialization of the site's latitude, which is contained in the missing PNT file. If you unchecked the refraction option in APCC then the problem would go away.

In case you are interested, the actual problem was not the refraction calculation itself, but that APCC would pre-calculate the sine and cosine of latitude before using them so that it would not have to recalculate the same value all of the time. However, since the PNT file was missing, the latitude's sine and cosine values were never pre-calculated. The problem was that the cosine of latitude=0 is 1, but since it was never initialized, it remained at the default value of 0, causing large Alt/Az to HA/Dec computation errors.

-Ray

-----Original Message-----
From: main@ap-gto.groups.io [mailto:main@ap-gto.groups.io] On Behalf Of ap@...
Sent: Thursday, January 20, 2022 6:49 AM
To: main@ap-gto.groups.io
Subject: Re: [ap-gto] AP1100AE needs exorcism (DEC turned all the way around like Linda Blair)

On Thu, Jan 20, 2022 at 09:28 AM, Tom Blahovici wrote:


...exactly why self driving cars will never work...

When users complained that the new GUI on Teslas was inconvenient, Musk famously responded "User Input
Equals Error".

So his theory is if the user just does nothing and lets the software do it all, everything works; all failures
originate from the human.

Which of course would have been true for my mount. :)

Linwood


ap@CaptivePhotons.com
 

On Thu, Jan 20, 2022 at 09:28 AM, Tom Blahovici wrote:
...exactly why self driving cars will never work...
When users complained that the new GUI on Teslas was inconvenient, Musk famously responded "User Input Equals Error". 

So his theory is if the user just does nothing and lets the software do it all, everything works; all failures originate from the human.

Which of course would have been true for my mount.   :) 

Linwood


Tom Blahovici
 

. I believe you are the first to run across this in all the years since APCC has been available, 

....exactly why self driving cars will never work...


ap@CaptivePhotons.com
 

What can I say, breaking software is a gift.  Or a curse. 

Yes, now that I realize the cause, the solution is obvious.

I renamed them because I have two OTA's, and planned to reuse the models, and the names given by APPM are not meaningful.  I put the OTA type in the name, so I could know which was which.   It's sure easier than remember the date when I last created the refractor model, or the SCT model.

Linwood


Ray Gralak
 

I was just guessing on the refraction issue. I haven't yet tried reproducing this, but I will as soon as I have some free time. I believe you are the first to run across this in all the years since APCC has been available, and there are a couple of obvious solutions, including:

1. Making sure the PNT file exists.
2. Disable tracking and pointing corrections.

And may I ask why you renamed the PNT file in the first place?

-Ray

-----Original Message-----
From: main@ap-gto.groups.io [mailto:main@ap-gto.groups.io] On Behalf Of ap@...
Sent: Wednesday, January 19, 2022 7:36 AM
To: main@ap-gto.groups.io
Subject: Re: [ap-gto] AP1100AE needs exorcism (DEC turned all the way around like Linda Blair)

No, still running the released version of 6.5SP1, I tend to have enough stuff break around me without asking
for trouble with the latest version of ascom. But if you think it will help I can.

So it does not do this if you try it?

That would be a lot of refraction. :)

Linwood


ap@CaptivePhotons.com
 

No, still running the released version of 6.5SP1, I tend to have enough stuff break around me without asking for trouble with the latest version of ascom.    But if you think it will help I can. 

So it does not do this if you try it? 

That would be a lot of refraction. :) 

Linwood


Ray Gralak
 

That’s odd, all the terms are zero so I am not sure why a Dec value of 511 degrees would have been calculated, unless refraction has something to do with this.

 

By any chance have you updated the new ASCOM platform to 6.6? The release notes said something about fixing refraction.

 

-Ray

 

From: main@ap-gto.groups.io [mailto:main@ap-gto.groups.io] On Behalf Of ap@...
Sent: Wednesday, January 19, 2022 7:09 AM
To: main@ap-gto.groups.io
Subject: Re: [ap-gto] AP1100AE needs exorcism (DEC turned all the way around like Linda Blair)

 

Sorry, Ray, when I say it was because of the model name, I meant it was because I renamed the model and the model name APCC was expecting was not present.

It's easily re-creatable for me.  Last night I was imaging with the model, and shut down normally.  To recreate this morning, before starting APCC I renamed the model to a new name.

For simplicity I started just APCC, no ASCOM, no NINA.  It auto-connected and and the model "loaded" all zeros (first two screen shots) 

I hand-keyed in the goto pane the uncorrected coordinates from the log you have (2d1m7.10s, 85d12m39.2s) and hit Goto.

It took off counterclockwise again (I think I said clockwise somewhere, sorry).   I stopped it after about 100 degrees or so, you can see in the 3D where I stopped it.  Notice also the custom rate; at this point if I go to the tracking correction status it shows -18000.00 sec/hour.  The log shows the corrected slew of +511*15:55.2 again.

At that point (having stopped the slew, but nothing else, no restart) I went to Model, loaded the (now) proper name, and immediately the numbers all changed to something sensible.  Without changing it I hit the goto button again and it properly slewed back to where it should be.

It really seems like it is just as simple as removing a model from where APCC expects it to be when it starts up causes this, no special situation that I can see. I must be the only one that likes meaningful names.  :) 

Obviously I won't be doing that any more.  

Thanks for looking at it. 

Linwood







ap@CaptivePhotons.com
 

Sorry, Ray, when I say it was because of the model name, I meant it was because I renamed the model and the model name APCC was expecting was not present.

It's easily re-creatable for me.  Last night I was imaging with the model, and shut down normally.  To recreate this morning, before starting APCC I renamed the model to a new name.

For simplicity I started just APCC, no ASCOM, no NINA.  It auto-connected and and the model "loaded" all zeros (first two screen shots) 

I hand-keyed in the goto pane the uncorrected coordinates from the log you have (2d1m7.10s, 85d12m39.2s) and hit Goto.

It took off counterclockwise again (I think I said clockwise somewhere, sorry).   I stopped it after about 100 degrees or so, you can see in the 3D where I stopped it.  Notice also the custom rate; at this point if I go to the tracking correction status it shows -18000.00 sec/hour.  The log shows the corrected slew of +511*15:55.2 again.

At that point (having stopped the slew, but nothing else, no restart) I went to Model, loaded the (now) proper name, and immediately the numbers all changed to something sensible.  Without changing it I hit the goto button again and it properly slewed back to where it should be.

It really seems like it is just as simple as removing a model from where APCC expects it to be when it starts up causes this, no special situation that I can see. I must be the only one that likes meaningful names.  :) 

Obviously I won't be doing that any more.  

Thanks for looking at it. 

Linwood







Ray Gralak
 

Hi Linwood,

I am a bit puzzled by those delays, as the amount of time referenced (about 7-8 seconds) does not seem
represented by the time stamps on either those lines, or even relative to the 'giving up line'.
Messages and responses are all queue-driven within APCC. The time values are measured from the time a response was returned from the mount until they are dequeued and processed by another thread in APCC. If another application is using significant CPU time, then that thread may not get a chance to finish processing the responses expediently.

But regardless... I think this shows APCC did the run-away due to the model name, and not due to CPU
competition?
Not really the model name, but APCC should have disabled the model when it could not load the PNT file. I'm still looking into the reason for the crazy coordinates that were used when this situation occurred. Can you post a screenshot of the pointing terms when APCC is in this state?

-Ray

-----Original Message-----
From: main@ap-gto.groups.io [mailto:main@ap-gto.groups.io] On Behalf Of ap@...
Sent: Tuesday, January 18, 2022 1:33 PM
To: main@ap-gto.groups.io
Subject: Re: [ap-gto] AP1100AE needs exorcism (DEC turned all the way around like Linda Blair)

Thanks, Ray. Several things to digest here. And I have a different slant on the theories below after more
research.

First, I was indeed doing a test of a new star analysis program on NINA. It is quite compute intensive. I am
not quite sure why it would be running though, since it needed to slew before it could take the image to
analyze. But it is quite a coincidence this was the first time I used it live. So it seems the most likely culprit,
but maybe not...

The model:

On 1/14 I built a new 200 (+/-) point model. I assumed it was being loaded each night since then by default,
however I think I screwed up.

I renamed the model (since "ApPointData-2022-01-14-183708.pnt" is not exactly meaningful and I have two
OTA's). I did the rename that evening while APCC was still running (assuming it was already loaded, Indeed I
reviewed a bunch of target/corrected slews from that night, all good).

The 17th is the next night I imaged, when the runaway happened. APCC tried to load the file name by which it
knew the model and failed (because I had renamed it). I should have manually loaded and activated it, but it
had been 3 days and my attention span is about 3 minutes so I did not.

My further guess is that somehow having failed to load it, it proceeded to build a garbage model as opposed
to turning off the model altogether. I reproduced it today in daylight, and there is no indication of failure in
APCC itself, the model comes up with no data (all zeros). HOWEVER, pointing and tracking corrections are
turned on, and when I did the same slew from NINA, DEC ran away again, and in the APCC log I see this:



0042829 2022-01-18 16:24:29.881: Debug, Pointing Corrector, Slew - Target Slew: RA = 02:01:07.10, Dec
= +85*12:39.2, East Model = True

0042830 2022-01-18 16:24:29.886: Debug, Pointing Corrector, Slew - Corrected Slew: RA = 18:56:30.47,
Dec = +511*15:55.4

So it is pretty easy to reproduce just from renaming the model file, and forgetting to load the model file the
next time you start APCC. APCC does something internally that results in wacky calculations in that situation.

Without closing APCC, I loaded the right model, had NINA re-issue the slow, and it worked perfectly.

So I am going to speculate that whatever happened with the delayed poll response was unrelated to the wacky
slew (since in my test there were no competing CPU processes now in daylight).

I am a bit puzzled by those delays, as the amount of time referenced (about 7-8 seconds) does not seem
represented by the time stamps on either those lines, or even relative to the 'giving up line'.

But regardless... I think this shows APCC did the run-away due to the model name, and not due to CPU
competition?

Linwood




ap@CaptivePhotons.com
 

Thanks, Ray.  Several things to digest here.   And I have a different slant on the theories below after more research.  

First, I was indeed doing a test of a new star analysis program on NINA.  It is quite compute intensive. I am not quite sure why it would be running though, since it needed to slew before it could take the image to analyze.  But it is quite a coincidence this was the first time I used it live.  So it seems the most likely culprit, but maybe not... 

The model: 

On 1/14 I built a new 200 (+/-) point model.  I assumed it was being loaded each night since then by default, however I think I screwed up. 

I renamed the model (since "ApPointData-2022-01-14-183708.pnt" is not exactly meaningful and I have two OTA's).  I did the rename that evening while APCC was still running (assuming it was already loaded, Indeed I reviewed a bunch of target/corrected slews from that night, all good). 

The 17th is the next night I imaged, when the runaway happened.  APCC tried to load the file name by which it knew the model and failed (because I had renamed it).  I should have manually loaded and activated it, but it had been 3 days and my attention span is about 3 minutes so I did not.

My further guess is that somehow having failed to load it, it proceeded to build a garbage model as opposed to turning off the model altogether.   I reproduced it today in daylight, and there is no indication of failure in APCC itself, the model comes up with no data (all zeros). HOWEVER, pointing and tracking corrections are turned on, and when I did the same slew from NINA, DEC ran away again, and in the APCC log I see this: 

0042829 2022-01-18 16:24:29.881:      Debug, Pointing Corrector, Slew - Target Slew:     RA = 02:01:07.10, Dec = +85*12:39.2, East Model = True

0042830 2022-01-18 16:24:29.886:      Debug, Pointing Corrector, Slew - Corrected Slew:  RA = 18:56:30.47, Dec = +511*15:55.4

So it is pretty easy to reproduce just from renaming the model file, and forgetting to load the model file the next time you start APCC.  APCC does something internally that results in wacky calculations in that situation.

Without closing APCC, I loaded the right model, had NINA re-issue the slow, and it worked perfectly.

So I am going to speculate that whatever happened with the delayed poll response was unrelated to the wacky slew (since in my test there were no competing CPU processes now in daylight). 

I am a bit puzzled by those delays, as the amount of time referenced (about 7-8 seconds) does not seem represented by the time stamps on either those lines, or even relative to the 'giving up line'. 

But regardless... I think this shows APCC did the run-away due to the model name, and not due to CPU competition?  

Linwood


Ray Gralak
 

Hi Linwood,

035328 2022-01-17 18:19:58.285: ASCOM: Info : SET TargetRightAscension = 3.88285751697392
035329 2022-01-17 18:19:58.285: ASCOM: Info : SET TargetDeclination = 85.2108975166604
035330 2022-01-17 18:19:58.285: Telescope: Info : CommandString
TX=':*SLEW,3.88285751697392,85.2108975166604#'
035331 2022-01-17 18:19:59.300: Telescope: Error : CommandString: NO Response or bad APCC
response: ':APCC,4102,*SLEW,3.88285751697392,85.2108975166604#',
Response='*SLEW,3.88285751697392,85.2108975166604' -> Giving up!
The command didn't actually fail, but the driver timed out waiting for a response.

In APCC's log there is evidence that something else on your computer was consuming lots of CPU so the callback method for mount poll responses was delayed:

0058667 2022-01-17 18:20:00.621: Debug, Command Thread, TX='$GOS#'|RX='11100000#'
0058668 2022-01-17 18:20:00.664: Debug, RA/Dec Slew, New SafeSlew State=BeginSafeSlew
0058669 2022-01-17 18:20:00.668: Debug, PollMount, Elapsed time to Process: 2203 msecs
0058670 2022-01-17 18:20:00.671: Debug, PollMount, Elapsed time to Process: 1844 msecs
0058671 2022-01-17 18:20:00.675: Debug, PollMount, Elapsed time to Process: 1484 msecs
0058672 2022-01-17 18:20:00.678: Debug, PollMount, Elapsed time to Process: 1125 msecs
0058673 2022-01-17 18:20:00.680: Debug, PollMount, Elapsed time to Process: 781 msecs
0058674 2022-01-17 18:20:00.682: Debug, PollMount, Elapsed time to Process: 422 msecs
0058675 2022-01-17 18:20:00.684: Debug, PollMount, Elapsed time to Process: 63 msecs

As to why the mount slewed like that, it looks like there was some sort of error in the model as there is no filename for the model in the log (first line below), and the corrected Dec value is a crazy +511 degrees. I'll take a look at that to make sure that can't happen in the future but I think there was some sort of system event (maybe a disk read error) that kicked this off:

0058723 2022-01-17 18:20:01.028: Debug, Pointing Corrector, Slew - Model:
0058724 2022-01-17 18:20:01.028: Debug, Pointing Corrector, Slew - Current LST = 01:42:22.75
0058725 2022-01-17 18:20:01.028: Debug, Pointing Corrector, Slew - Target Slew: RA = 03:52:58.29, Dec = +85*12:39.2, East Model = True
0058726 2022-01-17 18:20:01.034: Debug, Pointing Corrector, Slew - Corrected Slew: RA = 20:48:07.95, Dec = +511*15:55.4
0058727 2022-01-17 18:20:01.036: Debug, Tracking Corrector, Slew - Side of Pier = EAST
0058728 2022-01-17 18:20:01.036: Debug, Tracking Corrector, Slew - RA Tracking Rate = 6.00000
0058729 2022-01-17 18:20:01.036: Debug, Tracking Corrector, Slew - Dec Tracking Rate = 0.00000
0058730 2022-01-17 18:20:01.036: Debug, SafeSlewStateMachine, New SafeSlew State=WaitPointingCorrectionSlew

-Ray

-----Original Message-----
From: main@ap-gto.groups.io [mailto:main@ap-gto.groups.io] On Behalf Of ap@...
Sent: Monday, January 17, 2022 4:48 PM
To: main@ap-gto.groups.io
Subject: [ap-gto] AP1100AE needs exorcism (DEC turned all the way around like Linda Blair)

I was doing a three point polar alignment in NINA, as I do every night, and when it started slewing (usually a
small movement to be just off the pole), the DEC axis spun clockwise (as viewed from behind/above facing
north) all the way around. I finally hit stop, but not sure if I stopped it, or if it actually got to the right place by
going too far.

Thank goodness for thru-the-mount cables.

Interestingly when I then did a park 3 it unwound. Find home went to the right place.

I started all over and it worked fine.

I tracked this down in the NINA logs and it was just a slew, like any other slew.

I tracked it down in the ASCOM logs and got this:


035328 2022-01-17 18:19:58.285: ASCOM: Info : SET TargetRightAscension = 3.88285751697392
035329 2022-01-17 18:19:58.285: ASCOM: Info : SET TargetDeclination = 85.2108975166604
035330 2022-01-17 18:19:58.285: Telescope: Info : CommandString
TX=':*SLEW,3.88285751697392,85.2108975166604#'
035331 2022-01-17 18:19:59.300: Telescope: Error : CommandString: NO Response or bad APCC
response: ':APCC,4102,*SLEW,3.88285751697392,85.2108975166604#',
Response='*SLEW,3.88285751697392,85.2108975166604' -> Giving up!

There are no comm events shown on the APCC screen. Nothing I can make sense of in the APCC log.

Ray, attempting to anticipate, logs zipped and are here:

https://drive.google.com/file/d/1dpOb6Hw8JwcC0MoHzXXiOV-oX1MEO2eR/view?usp=sharing

That's taken from the mount after 2 restarts of NINA but no restarted of APCC nor did I re-initialize the mount;
the 18:19:58 event and subsequent LONG slew is the one of interest.

My first thought was I had confused something with a bad unpark or such, but Park 3 immediately afterwards
worked perfectly, as did Find Home, another Park 3, then I started NINA and did (+/- movement due to time)
exactly the same thing and it worked fine.

It sort of looks like Ascom and APCC had a disagreement.

Did I do something wrong?

Linwood



ap@CaptivePhotons.com
 

I was doing a three point polar alignment in NINA, as I do every night, and when it started slewing (usually a small movement to be just off the pole), the DEC axis spun clockwise (as viewed from behind/above facing north) all the way around.  I finally hit stop, but not sure if I stopped it, or if it actually got to the right place by going too far.

Thank goodness for thru-the-mount cables.

Interestingly when I then did a park 3 it unwound.  Find home went to the right place.  

I started all over and it worked fine.

I tracked this down in the NINA logs and it was just a slew, like any other slew. 

I tracked it down in the ASCOM logs and got this: 

035328 2022-01-17 18:19:58.285:            ASCOM: Info      : SET TargetRightAscension = 3.88285751697392
035329 2022-01-17 18:19:58.285:            ASCOM: Info      : SET TargetDeclination = 85.2108975166604
035330 2022-01-17 18:19:58.285:        Telescope: Info      : CommandString TX=':*SLEW,3.88285751697392,85.2108975166604#'
035331 2022-01-17 18:19:59.300:        Telescope: Error     : CommandString: NO Response or bad APCC response: ':APCC,4102,*SLEW,3.88285751697392,85.2108975166604#', Response='*SLEW,3.88285751697392,85.2108975166604' -> Giving up!

There are no comm events shown on the APCC screen.  Nothing I can make sense of in the APCC log. 

Ray, attempting to anticipate, logs zipped and are here: 

https://drive.google.com/file/d/1dpOb6Hw8JwcC0MoHzXXiOV-oX1MEO2eR/view?usp=sharing

That's taken from the mount after 2 restarts of NINA but no restarted of APCC nor did I re-initialize the mount; the 18:19:58 event and subsequent LONG slew is the one of interest.

My first thought was I had confused something with a bad unpark or such, but Park 3 immediately afterwards worked perfectly, as did Find Home, another Park 3, then I started NINA and did (+/- movement due to time) exactly the same thing and it worked fine.

It sort of looks like Ascom and APCC had a disagreement.

Did I do something wrong? 

Linwood