Ticket #56 (accepted bug)

Opened 8 years ago

Last modified 7 years ago

"nagare-admin -serve" silently exiting

Reported by: Hemml
Owned by: apoirier
Priority: critical
Component: nagare-admin
Version: 0.3.0
Keywords:
Cc:

Description

My application contains a hidden form which is submitted every second (for testing purposes) using JavaScript (emulating a .click() event on the "submit" button). After 20-40 minutes of such activity, nagare-admin silently exits without reporting any error.

My configuration file:

[application]
path = app BDB3
name = ""
debug = on

[database]
activated = on
uri = postgresql+psycopg2://bdb3@localhost/bdb3
encoding = utf-8
metadata = BDB3.models:metadata
populate = BDB3.models:populate

Application was started using this command:

nagare-admin serve --reload BDB3

There were no changes in files at runtime (no reloads).
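
Since the process exits without any message, one way to learn how it ended is to start it from a small wrapper that prints the exit status. The following is only a sketch for a current Python 3 on a POSIX system; describe_exit and run_and_report are hypothetical helper names, not part of Nagare, and the wrapper reports only on the process it launches directly.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable message."""
    if returncode < 0:
        # On POSIX, a negative return code means "killed by signal -returncode".
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return "killed by signal %d (%s)" % (-returncode, name)
    return "exited with status %d" % returncode


def run_and_report(command):
    """Run the command, wait for it to finish, and report how it ended."""
    returncode = subprocess.call(command)
    sys.stderr.write("%s: %s\n" % (" ".join(command), describe_exit(returncode)))
    return returncode
```

With the command from this ticket, the call would be run_and_report(["nagare-admin", "serve", "--reload", "BDB3"]). A "killed by signal" line would point to a crash in the interpreter rather than a normal shutdown.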

Change History

comment:1 Changed 8 years ago by Hemml

The form in the application was rendered using AsyncRenderer.

comment:2 follow-up: ↓ 3 Changed 8 years ago by apoirier

  • Status changed from new to accepted

What is:

  • your OS ?
  • your Python version ?
  • the Nagare version ?

Is this problem always reproducible (i.e. always an exit after 20-40 min)?
Is your automatic asynchronous form submit the only activity? Or do you have other sessions in parallel?

comment:3 in reply to: ↑ 2 Changed 8 years ago by Hemml

Replying to apoirier:

What is:

  • your OS ?

MacOS 10.6 and FreeBSD 8.0; the problem was reproduced on both systems.

  • your Python version ?

2.6.4

  • the Nagare version ?

0.3.0

Is this problem always reproducible (i.e. always an exit after 20-40 min)?

It was reproduced 3-4 times on MacOS and once on FreeBSD (I did not try more).

Is your automatic form asynchronous submit the only activity?

Yes.

Or do you have other sessions in parallel?

No other session was active.

comment:5 follow-up: ↓ 6 Changed 8 years ago by apoirier

When the form is submitted, are some comp.call() / comp.answer() invoked (directly or not) by its associated callbacks ?

If yes, please check these call() / answer() calls are only made by callbacks registered on input with type="submit", not by callbacks for other form elements or for the form itself (pre_action() / post_action() callbacks).

comment:6 in reply to: ↑ 5 Changed 8 years ago by Hemml

Replying to apoirier:

When the form is submitted, are some comp.call() / comp.answer() invoked (directly or not) by its associated callbacks ?

You can view the source (very dirty at this time, I know :) of the refreshing component here: http://pastebin.org/257855

This component is invoked by this chain:

app: h<<self.admin_cat.render(xhtml.AsyncRenderer(h))
admin_cat (via h.a().action(lambda .....)): comp.call(component.Component(FileParser(f)))
FileParser (via h.input(type=submit).action(lambda .....)): comp.call(component.Component(Parser2(.....)))
Parser2 (via h.input(type=submit).action(lambda .....)): comp.call(component.Component(AdminInfo()))

If yes, please check these call() / answer() calls are only made by callbacks registered on input with type="submit", not by callbacks for other form elements or for the form itself (pre_action() / post_action() callbacks).

comment:7 follow-up: ↓ 8 Changed 8 years ago by apoirier

From your description, I have extracted a code sample that keeps only the navigation (http://pastebin.org/261548).

With it, I observed the same exit from Python. After compiling Python in debug mode, it seems to me that the problem is a bug discovered and fixed just 2 weeks ago: http://www.stackless.com/pipermail/stackless/2010-May/004701.html

With this fix, I don't have the problem anymore.

Can you checkout and try the dev version of Stackless 2.6 too ?:

comment:8 in reply to: ↑ 7 ; follow-up: ↓ 9 Changed 8 years ago by Hemml

Replying to apoirier:

Can you checkout and try the dev version of Stackless 2.6 too ?:

Sorry for the long delay. I have tried Stackless 2.6-dev on FreeBSD (checked out on May 26). After many reloads, Python exited with this error:

Segmentation fault: 11 (core dumped)

Examination of python.core via GDB gives me this output:

#0 0x080da249 in impl_tasklet_kill (task=0x953c79c) at Stackless/module/taskletobject.c:941
941 {
[New Thread 0x970ae00 (LWP 100144)]
[New Thread 0x95e6200 (LWP 100132)]
[New Thread 0x95e2200 (LWP 100120)]
[New Thread 0x95dd200 (LWP 100117)]
[New Thread 0x95d8200 (LWP 100114)]
[New Thread 0x95d4000 (LWP 100100)]
[New Thread 0x95d0000 (LWP 100098)]
[New Thread 0x95cc000 (LWP 100096)]
[New Thread 0x9587000 (LWP 100086)]
[New Thread 0x9582c00 (LWP 100085)]
[New Thread 0x957ac00 (LWP 100081)]
[New Thread 0x957aa00 (LWP 100077)]
[New Thread 0x81bd000 (runnable)]

Sending Ctrl-C to nagare-admin after some activity produces exactly the same error.

comment:9 in reply to: ↑ 8 Changed 8 years ago by Hemml

(gdb) bt
#0 0x080da249 in impl_tasklet_kill (task=0x956dc34) at Stackless/module/taskletobject.c:941
#1 0x080d937d in tasklet_clear (t=0x956dc34) at Stackless/module/taskletobject.c:935
#2 0x080da362 in impl_tasklet_kill (task=0x956dc34) at Stackless/module/taskletobject.c:953
#3 0x080d937d in tasklet_clear (t=0x956dc34) at Stackless/module/taskletobject.c:935
#4 0x080da362 in impl_tasklet_kill (task=0x956dc34) at Stackless/module/taskletobject.c:953
#5 0x080d937d in tasklet_clear (t=0x956dc34) at Stackless/module/taskletobject.c:935

...and many thousands of lines like that
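
The frames alternate between impl_tasklet_kill and tasklet_clear with the same task pointer, which looks like unbounded mutual recursion. As a rough illustration only (a toy Python 3 model that borrows the two function names; it is not the Stackless C source, and recursion_never_ends is a hypothetical helper), two functions that call each other without ever recording that the task is already being killed run until the stack is exhausted:

```python
def tasklet_clear(task):
    # Toy stand-in: clearing the task tries to kill it again.
    impl_tasklet_kill(task)


def impl_tasklet_kill(task):
    # Toy stand-in: killing the task clears it, with no "already dying" check.
    tasklet_clear(task)


def recursion_never_ends(task):
    """Return True if killing the task runs into the interpreter's stack limit."""
    try:
        impl_tasklet_kill(task)
    except RecursionError:
        return True
    return False
```

Python stops such a loop with RecursionError. C code has no such guard, so the same pattern would end in a stack overflow, which would be consistent with the segmentation fault above.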

comment:10 Changed 8 years ago by Hemml

The command "gdb -c python.core -x cmd -batch ~/nagare/bin/python | tail" finished just now :), the last lines were:

#157892 0x080da362 in impl_tasklet_kill (task=0x956dc34) at Stackless/module/taskletobject.c:953
#157893 0x080d937d in tasklet_clear (t=0x956dc34) at Stackless/module/taskletobject.c:935
#157894 0x080da362 in impl_tasklet_kill (task=0x956dc34) at Stackless/module/taskletobject.c:953
#157895 0x080d92ef in PyTasklet_Kill (task=0x8201c0c) at Stackless/module/taskletobject.c:937
#157896 0x080d490f in slp_kill_tasks_with_stacks (ts=0x95c0800) at Stackless/core/stacklesseval.c:360
#157897 0x080fb2ac in PyThreadState_Clear (tstate=0x95c0800) at Python/pystate.c:230
#157898 0x0810bcf4 in t_bootstrap (boot_raw=0x95c8270) at ./Modules/threadmodule.c:449
#157899 0x281d0ec8 in pthread_mutexattr_init () from /lib/libpthread.so.2
#157900 0x2843d850 in ?? ()
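
With more than 150,000 frames, a per-function count may be easier to read than the tail of the output. The following is a sketch only, written for Python 3; summarize_backtrace and FRAME_RE are hypothetical names, and the pattern assumes frame lines shaped like the ones in this ticket ("#N 0xADDR in function (args)").

```python
import re
from collections import Counter

# Matches gdb frame lines such as
# "#2 0x080da362 in impl_tasklet_kill (task=0x956dc34)".
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-fA-F]+ in (\S+) \(")


def summarize_backtrace(lines):
    """Count how many frames each function contributes to a gdb backtrace."""
    counts = Counter()
    for line in lines:
        match = FRAME_RE.match(line.strip())
        if match:
            counts[match.group(2)] += 1
    return counts
```

Fed the gdb output line by line (for example summarize_backtrace(sys.stdin) in place of the "| tail" step), counts.most_common() lists the functions that dominate the stack first.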

comment:11 follow-up: ↓ 12 Changed 7 years ago by apoirier

(back after some great debugging sessions ;)

This new version of nagare/sessions/common.py solves the segfault here:

 http://nagare.pastebin.com/00YWNNg2

Can you try it?

comment:12 in reply to: ↑ 11 ; follow-up: ↓ 13 Changed 7 years ago by Hemml

Replying to apoirier:

(back after some great debugging sessions ;)

This new version of nagare/sessions/common.py solves the segfault here:

 http://nagare.pastebin.com/00YWNNg2

Can you try it?

I have tried it, but the problem still exists :( After 2-3 hours of 5-second refreshes, a core file was dumped at the same point, as shown by gdb.
I can send you my sources and a PostgreSQL database dump (or even an archive of the whole directories with nagare and stackless, ~170 MB) for testing, if you tell me how to do this securely.

comment:13 in reply to: ↑ 12 Changed 7 years ago by apoirier

I have tried it, but the problem still exists :( After 2-3 hours of 5-second refreshes, a core file was dumped at the same point, as shown by gdb.

Is it the same test that was previously crashing in 20-40 min?

I can send you my sources and a PostgreSQL database dump (or even an archive of the whole directories with nagare and stackless, ~170 MB) for testing, if you tell me how to do this securely.

Mail sent
