I can't pin down what's causing this crash. I'm on the 1540 beta. I originally thought it was a temperature-related crash, so I added a cooling fan to the RPi, and it never broke 41 degrees Celsius. Then I figured it might be a WiFi-related issue, so I ran an Ethernet hard line.
I still can't figure out what might be causing it. The debug zip is attached here.
How often does it happen? Have you seen the resource indicators before the restart?
All the fricken' time, really. Two, maybe three times per print. It's aggravating as all hell. I didn't catch any indicators other than the temp before the last crash.
Very odd, I use the same build with a similar config without any issue.
Do you remember which version was reliable before?
Did you start experiencing this issue just after an upgrade?
Do you have a spare SD card, RPi, and power supply around to try?
Have you noticed any pattern? For example, it fails every 300 layers or so.
Next time, if it crashes, check the process number indicator before restarting; it could give us some clue.
How do I check the process number indicator?
Beside the temperature indicator. (Proc)
New data. It seems NanoDLP does not handle network disconnection gracefully, though I can't tell for sure. Essentially, it looks like the printer freezes whenever the network connection drops.
What's the intended behavior?
It is the other way around: it probably freezes first, and then the network connection drops.
Is there a way to verify? I've got AT&T ADSL in a rural area. It's the literal worst. Plus, the router itself will glitch out all the time.
The network shouldn't have any serious effect on NanoDLP. We need more data to troubleshoot this, but considering you are the only one reporting this serious issue, there is a high chance of a hardware problem.
Fair enough. What are the most likely suspects?
My guess is SD card.
Tracked it down. Shitty power supply. Replaced it with a beefier one. Problem disappeared.
Shahin wrote: My guess is SD card.
I spoke too soon. It looks like the NanoDLP process is crashing now, not the actual RPi. It happened about 350 layers into a print last night.
I was able to SSH into the printer, though, and found there was no printer process running.
Is there a way to enable verbose logging in NanoDLP?
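For anyone hitting the same thing, a quick way to make that check over SSH (assuming the stock /home/pi/printer/printer install path mentioned in this thread; adjust if yours differs):

```shell
# Quick check over SSH: is the NanoDLP process still alive?
# The "[r]" trick stops pgrep from ever matching this command itself.
if pgrep -f '/home/pi/printer/printe[r]' > /dev/null; then
    echo "nanodlp is running"
else
    echo "nanodlp is NOT running"
fi
```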
During a crash, all stack information is written to a file called /var/log/printer.log.
Is there any difference between the previous and current crashes?
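For reference, a quick way to pull that file over SSH after a crash; the fallback echo is only there so the command still says something on a machine where the log doesn't exist yet:

```shell
# Show the last lines of the NanoDLP crash log, if present.
tail -n 50 /var/log/printer.log 2>/dev/null \
    || echo "no /var/log/printer.log yet"
```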
Not sure yet. Nothing appears to be written to printer.log during operation, though. Is there a way to make NanoDLP log everything to a file in real time? My gut tells me the fastest way to track this down is to log everything in real time and pick over that log when it crashes, rather than waiting for a crash and trusting it to log post-crash.
Some loops are so tight that doing anything inside them would cause a serious slowdown. Since you are a direct-control user, I suspect it is a movement loop issue; that is the tightest loop we have in NanoDLP.
The crash issue has persisted across several updates, though. And I'm not using dynamic lift. I do use dynamic cure, but it never fails on the layers that are actually dynamic (my dynamic cure formula plateaus after 20 layers or so and cures every layer past 20 for 25 seconds).
I suppose I can reboot, kill the NanoDLP process that is spawned on boot, and then start NanoDLP from the command line with "sudo /home/pi/printer/printer > ~/nanodlp-live-log.txt".
What is your suggestion for debugging this before I try that?
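One caveat on that redirect, for whoever tries it: Go programs write panic stack traces to stderr, not stdout, so ">" alone would lose exactly the crash output you're after. A minimal demonstration of the difference (the fake "panic" process below is just an illustration, not NanoDLP itself):

```shell
# A stand-in for a crashing Go process: logs to stdout, panics to stderr.
crash='echo "layer 606"; echo "panic: runtime error" >&2'

sh -c "$crash" > stdout-only.log 2>/dev/null   # panic line is lost
sh -c "$crash" > full.log 2>&1                 # panic line is captured

grep -c panic full.log stdout-only.log         # full.log:1, stdout-only.log:0
```

So on the printer the foreground run probably wants to be "sudo /home/pi/printer/printer > ~/nanodlp-live-log.txt 2>&1" (or piped through tee if you want to watch it live as well).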
Sure, for the terminated process you can run NanoDLP that way.
Could you share another debug file after a crash? If it is not a hardware issue, the main suspect is the pulse generator logic.
Of course. I just pulled it down and haven't even opened it yet myself. It's here.
I am not sure if it is a software issue; I will stress test the direct control modules to see whether I can crash it or not.
Another crash log here, further along in the same print as the previous one.
Looks like printer.log is empty. Here's the relevant crash info -
2017/07/01 06:41:34.374173 {"Layer":"606","module":"Image","level":"Warning","msg":"Display layer public/plates/53/606.png"}
2017/07/01 06:42:09.844458 {"Layer":"607","module":"Image","level":"Warning","msg":"Display layer public/plates/53/607.png"}
2017/07/01 06:42:45.287105 {"Layer":"608","module":"Image","level":"Warning","msg":"Display layer public/plates/53/608.png"}
2017/07/01 06:43:20.623799 {"Layer":"609","module":"Image","level":"Warning","msg":"Display layer public/plates/53/609.png"}
2017/07/01 06:43:55.993994 {"Layer":"610","module":"Image","level":"Warning","msg":"Display layer public/plates/53/610.png"}
2017/07/01 06:44:31.503874 {"Layer":"611","module":"Image","level":"Warning","msg":"Display layer public/plates/53/611.png"}
2017/07/01 06:45:07.002906 {"Layer":"612","module":"Image","level":"Warning","msg":"Display layer public/plates/53/612.png"}
net.runtime_pollWait(0x740c0fa8, 0x72, 0x0)
net.(*conn).Read(0x10a0e050, 0x10ce2000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/usr/local/go/src/net/net.go:181 +0x58
net/http.(*connReader).Read(0x10de0120, 0x10ce2000, 0x1000, 0x1000, 0x248ff4, 0x10b7a640, 0xd0e9445e)
bufio.(*Reader).fill(0x10de0150)
/usr/local/go/src/bufio/bufio.go:97 +0xf4
bufio.(*Reader).Peek(0x10de0150, 0x4, 0xe, 0x29618357, 0x6c6f60, 0x0, 0x0)
/usr/local/go/src/bufio/bufio.go:129 +0x58
net/http.(*conn).serve(0x10a98780, 0x6a1c70, 0x10c40aa0)
/usr/local/go/src/net/http/server.go:1850 +0x7a0
created by net/http.(*Server).Serve
/usr/local/go/src/net/http/server.go:2668 +0x234
goroutine 4232 [IO wait]:
net.runtime_pollWait(0x740c1098, 0x72, 0x11431000)
/usr/local/go/src/runtime/netpoll.go:164 +0x44
net.(*pollDesc).wait(0x10ae7f3c, 0x72, 0x69f9f0, 0x69d798)
/usr/local/go/src/net/fd_poll_runtime.go:75 +0x28
net.(*pollDesc).waitRead(0x10ae7f3c, 0x11431000, 0x1000)
/usr/local/go/src/net/fd_poll_runtime.go:80 +0x24
net.(*netFD).Read(0x10ae7f00, 0x11431000, 0x1000, 0x1000, 0x0, 0x69f9f0, 0x69d798)
/usr/local/go/src/net/fd_unix.go:250 +0x148
net.(*conn).Read(0x10a0e948, 0x11431000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/usr/local/go/src/net/net.go:181 +0x58
net/http.(*connReader).Read(0x1141e360, 0x11431000, 0x1000, 0x1000, 0x248ff4, 0x10ae7f00, 0xd0e94472)
/usr/local/go/src/net/http/server.go:754 +0x168
bufio.(*Reader).fill(0x1141e390)
/usr/local/go/src/bufio/bufio.go:97 +0xf4
bufio.(*Reader).Peek(0x1141e390, 0x4, 0xe, 0x1cba5670, 0x6c6f60, 0x0, 0x0)
/usr/local/go/src/bufio/bufio.go:129 +0x58
net/http.(*conn).serve(0x10a988a0, 0x6a1c70, 0x10b87000)
/usr/local/go/src/net/http/server.go:1850 +0x7a0
created by net/http.(*Server).Serve
/usr/local/go/src/net/http/server.go:2668 +0x234
trap 0x0
error 0x0
oldmask 0x0
r0 0x0
r1 0x1544
r2 0x6
r3 0x0
r4 0x76df7094
r5 0x358ff460
r6 0x0
r7 0x10c
r8 0x1
r9 0x34
r10 0x11020960
fp 0x358fec04
ip 0x358ff920
sp 0x358fead0
lr 0x76ce6f44
pc 0x76ce6f70
cpsr 0x20000010
fault 0x0
Seems to be something similar to this issue.
Unfortunately, a wide range of issues, from the OS to software and hardware, could cause this kind of error message. So it is not helpful until I can reproduce the issue locally.