I am having exactly the same problem. There's a "ghost" session that won't
die.
When I try to log on, I get the following error:
NX> 203 NXSSH running with pid: 47638
NX> 285 Enabling check on switch command
NX> 285 Enabling skip of SSH config files
NX> 285 Setting the preferred NX options
NX> 200 Connected to address: 128.84.95.130 on port: 22
NX> 202 Authenticating user: nx
NX> 208 Using auth method: publickey
HELLO NXSERVER - Version 3.3.0 - GPL
NX> 105 Hello nxclient - version 3.3.0
NX> 134 Accepted protocol: 3.3.0
NX> 105 Set SHELL_MODE: SHELL
NX> 105 Set AUTH_MODE: PASSWORD
NX> 105 Login
NX> 101 User:
NX> 102 Password: **********
/tmp/launch-TbX6Xj/: unknown host. (nodename nor servname provided, or not known)
NX> 103 Welcome to: megalodon.redrover.cornell.edu user: pgill
NX> 105 Listsession --user="pgill" --status="suspended,running" --geometry="1440x900x32+render" --type="unix-gnome"
NX> 127 Session list of user 'pgill':
Display Type             Session ID                       Options  Depth Screen         Status      Session Name
------- ---------------- -------------------------------- -------- ----- -------------- ----------- ------------------------------
52      unix-gnome       25C5A2FF54B21EAB5603365C963074E9 -RD--PSA 24    1440x852       Suspended   Megalodon
NX> 148 Server capacity: not reached for user: pgill
NX> 105 Restoresession --link="adsl" --backingstore="1" --encryption="1" --cache="32m" --images="64m" --shmem="1" --shpix="1" --strict="0" --composite="1" --media="0" --session="megalodon" --type="unix-gnome" --geometry="1440x852" --client="macosx" --keyboard="query" --id="25c5a2ff54b21eab5603365c963074e9"
NX> 500 Internal error
NX> 999 Bye.
NX> 280 Exiting on signal: 15
The directory
/usr/local/var/lib/neatx/sessions/25C5A2FF54B21EAB5603365C963074E9
exists on the server, but when I try to delete this session, I get the
following:
root@megalodon:/usr/local/var/lib/neatx/sessions# nxserver --force-terminate 25C5A2FF54B21EAB5603365C963074E9
NX> 100 NXSERVER - Version 3.2.0-74-SVN OS (GPL, using backend: 3.3.0)
NX> 500 Error: Session 25C5A2FF54B21EAB5603365C963074E9 could not be found.
NX> 999 Bye
It seems like I'm in a fix where this session exists enough to mess things
up, but it doesn't exist enough to be deleted cleanly.
Incidentally, just moving this directory doesn't solve the problem. On the
server, I tried:
root@megalodon:/usr/local/var/lib/neatx/sessions# mv ./25C5A2FF54B21EAB5603365C963074E9 25C5A2FF54B21EAB5603365C963074E9_old_and_broken
root@megalodon:/usr/local/var/lib/neatx/sessions# nxserver --cleanup
NX> 100 NXSERVER - Version 3.2.0-74-SVN OS (GPL, using backend: 3.3.0)
NX> 500 Error: No running sessions found.
NX> 999 Bye
root@megalodon:/usr/local/var/lib/neatx/sessions# nxserver --restart
NX> 100 NXSERVER - Version 3.2.0-74-SVN OS (GPL, using backend: 3.3.0)
NX> 123 Service stopped
NX> 122 Service started
NX> 999 Bye
root@megalodon:/usr/local/var/lib/neatx/sessions#
Then the client gets a slightly different error message:
NX> 203 NXSSH running with pid: 47684
NX> 285 Enabling check on switch command
NX> 285 Enabling skip of SSH config files
NX> 285 Setting the preferred NX options
NX> 200 Connected to address: 128.84.95.130 on port: 22
NX> 202 Authenticating user: nx
NX> 208 Using auth method: publickey
HELLO NXSERVER - Version 3.3.0 - GPL
NX> 105 Hello nxclient - version 3.3.0
NX> 134 Accepted protocol: 3.3.0
NX> 105 Set SHELL_MODE: SHELL
NX> 105 Set AUTH_MODE: PASSWORD
NX> 105 Login
NX> 101 User:
NX> 102 Password: **********
/tmp/launch-TbX6Xj/: unknown host. (nodename nor servname provided, or not known)
NX> 103 Welcome to: megalodon.redrover.cornell.edu user: pgill
NX> 105 Listsession --user="pgill" --status="suspended,running" --geometry="1440x900x32+render" --type="unix-gnome"
NX> 127 Session list of user 'pgill':
Display Type             Session ID                       Options  Depth Screen         Status      Session Name
------- ---------------- -------------------------------- -------- ----- -------------- ----------- ------------------------------
52      unix-gnome       25C5A2FF54B21EAB5603365C963074E9 -RD--PSA 24    1440x852       Suspended   Megalodon
NX> 148 Server capacity: not reached for user: pgill
NX> 105 Restoresession --link="adsl" --backingstore="1" --encryption="1" --cache="32m" --images="64m" --shmem="1" --shpix="1" --strict="0" --composite="1" --media="0" --session="megalodon" --type="unix-gnome" --geometry="1440x852" --client="macosx" --keyboard="query" --id="25c5a2ff54b21eab5603365c963074e9"
NX> 500 Failed to load session
NX> 105 NX> 280 Exiting on signal: 15
If only there were some way to cleanly remove a session. *sigh*
I've discovered a workaround. Every time you can't log on via neatx, you
can change the name of a saved session on the client. This then gives the
client the chance to create a new session.
You still can't remove the old sessions, but at least you can get a remote
desktop again.
Hi,
I thought I could circumvent the logout issue by having my own script
executed instead of the window manager logout. I wanted to execute nxdialog
like this:
nxdialog --dialog yesnosuspend --parent $PID --message "bla" --caption "bla"
Terminate works nicely but Disconnect does not. This is because in
HandleSessionAction(agentpid, action), DISCONNECT and TERMINATE act on
different PIDs.
Would it be possible to use agentpid for DISCONNECT as well? I have attached
a possible solution.
Greetings,
erazortt
Attachments:
nxdialog.diff 449 bytes
@erazortt: Good idea. I took a look at the pid handling, and how nxagent
calls nxdialog, and did a bit of cleanup. Patch out for review. Thanks for
spotting that.
@erazortt: your issue should be fixed in
http://code.google.com/p/neatx/source/detail?r=56
There is no dialog anymore, but there is the following error message in
/var/log/messages:
Feb 19 22:22:33 pcikf26 nxdialog[5468]: ERROR cli:68 Caught exception
Feb 19 22:22:33 pcikf26 nxdialog[5468]: Traceback (most recent call last):
Feb 19 22:22:33 pcikf26 nxdialog[5468]: File "/usr/lib/python2.4/site-packages/neatx/cli.py", line 62, in Main
Feb 19 22:22:33 pcikf26 nxdialog[5468]: self.Run()
Feb 19 22:22:33 pcikf26 nxdialog[5468]: File "/usr/lib/python2.4/site-packages/neatx/app/nxdialog.py", line 224, in Run
Feb 19 22:22:33 pcikf26 nxdialog[5468]: if dlgtype in (constants.DLG_TYPE_PULLDOWN,
Feb 19 22:22:33 pcikf26 nxdialog[5468]: AttributeError: 'NxDialogProgram' object has no attribute 'option'
I guess in file nxdialog.py line 225:
constants.DLG_TYPE_YESNOSUSPEND) and not self.option.agentpid:
should be:
constants.DLG_TYPE_YESNOSUSPEND) and not self.options.agentpid:
Greets,
erazortt
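For anyone following along, the traceback above is an ordinary attribute-name typo. Here is a minimal sketch that reproduces it outside neatx. `DialogProgram` and its two methods are hypothetical stand-ins, not the real NxDialogProgram, and mapping `--parent` to `agentpid` is my assumption based on the command line quoted earlier:

```python
import optparse


class DialogProgram:
    """Hypothetical stand-in for NxDialogProgram; only the attribute name matters."""

    def __init__(self, argv):
        parser = optparse.OptionParser()
        # Assumption: --parent carries the agent PID.
        parser.add_option("--parent", dest="agentpid", type="int")
        # The parsed values are stored as self.options (plural).
        self.options, self.args = parser.parse_args(argv)

    def missing_agentpid_broken(self):
        # Same typo as in the traceback: 'option' was never assigned,
        # so this raises AttributeError.
        return not self.option.agentpid

    def missing_agentpid_fixed(self):
        # Proposed spelling: read from self.options.
        return not self.options.agentpid
```

With `DialogProgram(["--parent", "5468"])`, the broken method raises AttributeError and the fixed one returns False.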
Hi, I just found another bug related to this:
Feb 21 01:06:41 pcikf26 nxdialog[18569]: ERROR cli:68 Caught exception
Feb 21 01:06:41 pcikf26 nxdialog[18569]: Traceback (most recent call last):
Feb 21 01:06:41 pcikf26 nxdialog[18569]: File "/usr/lib/python2.4/site-packages/neatx/cli.py", line 62, in Main
Feb 21 01:06:41 pcikf26 nxdialog[18569]: self.Run()
Feb 21 01:06:41 pcikf26 nxdialog[18569]: File "/usr/lib/python2.4/site-packages/neatx/app/nxdialog.py", line 254, in Run
Feb 21 01:06:41 pcikf26 nxdialog[18569]: ShowYesNoSuspendBox(message_caption, message_text))
Feb 21 01:06:41 pcikf26 nxdialog[18569]: File "/usr/lib/python2.4/site-packages/neatx/app/nxdialog.py", line 166, in HandleSessionAction
Feb 21 01:06:41 pcikf26 nxdialog[18569]: logging.info("Disconnecting from session, sending SIGHUP to %s", ppid)
Feb 21 01:06:41 pcikf26 nxdialog[18569]: NameError: global name 'ppid' is not defined
I have attached a patch that solves both problems.
Attachments:
nxdialog.diff 889 bytes
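The attachment is not reproduced in this thread, so the following is not the patch itself. It is only a rough sketch of the idea discussed above: both actions signal the same agentpid, and the log line uses a name that is actually defined. The constants, the lowercase function name and the injectable `send_signal` parameter are hypothetical. SIGHUP for disconnect comes from the log message in the traceback; SIGTERM for terminate is my assumption.

```python
import logging
import os
import signal

# Hypothetical action names for this sketch.
DISCONNECT = "disconnect"
TERMINATE = "terminate"


def handle_session_action(agentpid, action, send_signal=os.kill):
    """Signal the agent process for either action, always using agentpid."""
    if action == DISCONNECT:
        # Log with agentpid; the traceback came from logging an undefined 'ppid'.
        logging.info("Disconnecting from session, sending SIGHUP to %s", agentpid)
        send_signal(agentpid, signal.SIGHUP)
    elif action == TERMINATE:
        # Assumption: terminate is delivered as SIGTERM.
        logging.info("Terminating session, sending SIGTERM to %s", agentpid)
        send_signal(agentpid, signal.SIGTERM)
    else:
        raise ValueError("unknown session action: %r" % (action,))
```

Passing `send_signal` in makes it possible to exercise the function without signalling a real process.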
Hey, nxdialog was fixed in
http://code.google.com/p/neatx/source/detail?r=59 , thanks.
I've just started working with NeatX, and spent considerable time on this
bug. My rough testing, with Ubuntu 10.04 as both client and server OS,
suggests that the NoMachine client is the likely culprit:
nxclient_3.4.0-7_i386.deb
Connect to the server for the first time, launch a program (terminal), and
disconnect/suspend. Sometimes the server side shows "suspended"; other times
it shows "suspending" and waits forever. The client app *appears* to exit
normally, but ps and netstat on the client side show that nxssh stays
connected to the ssh port of the server.
Connect again from the same client computer, and if the server didn't get
the session switched to "suspended", the connection will time out since it
can't resume.
Now kill the running nxssh processes on the client side, and you'll notice
that on the server side the session goes to "suspended". Try connecting with
the client again and everything connects fine and resumes the session where
it left off.
You can repeat this over and over. Even sessions that terminate or suspend
properly still leave nxssh hanging, connected to port 22 on the server side.
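To make the "look at client side ps" step repeatable, here is a small sketch that lists leftover nxssh processes after the client window has closed. The function names are my own, and it assumes a `ps` that accepts `-eo pid,comm`, as on Linux and Mac OS X. It only reports PIDs and leaves the killing to you.

```python
import os
import subprocess


def find_processes(ps_output, name):
    """Return the PIDs whose command basename equals `name`.

    `ps_output` is the text printed by `ps -eo pid,comm`.
    """
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        # Skips the header row and blank lines, where the first field is not a PID.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, command = parts
        # Some systems print the full path of the executable.
        if os.path.basename(command.strip()) == name:
            pids.append(int(pid))
    return pids


def lingering_nxssh():
    """List nxssh PIDs on this machine."""
    result = subprocess.run(
        ["ps", "-eo", "pid,comm"], capture_output=True, text=True, check=True
    )
    return find_processes(result.stdout, "nxssh")
```

If `lingering_nxssh()` still returns PIDs a minute or so after disconnecting, you are probably seeing the behaviour described above.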
heya,
Just thought I'd point out, I experienced this bug on the Windows NX Client
as well. Basically, after clicking on Suspend, it wouldn't reliably Resume
the next time you tried to (I had to delete sessions from
/var/lib/neatx/sessions).
Cheers,
Victor
I wanted to post a followup to my comment above, post 22. (I want to thank
Victor, as it was his post that pointed me at the client side connection
hanging open.)
I'm not sure this is actually a client side issue after all, but rather a
problem with the ssh session disconnect. It's likely that neatx isn't
sending the proper signals for disconnect to the nxssh client. Normally I'd
diagnose with Wireshark, but with ssh that's not going to work.
I'm testing neatx under a VirtualBox Ubuntu 10.04 desktop installation.
Currently one virtual machine runs NoMachine NXServer Free edition and the
other runs neatx. Both servers have ssh set to debug mode so I can see
what's happening with the ssh connections. Both servers are clones of the
same base installation, so the only difference is which flavor of NX server
is running.
The server side logs show nothing unusual, and nothing very different
between the two. I've been looking specifically at the data just before and
after I use Ubuntu's logout.
NXServer Free Edition:
I can connect and disconnect from the client side cleanly every time.
There's never a hanging nxssh process on the client side when disconnecting
from the NXServer free edition.
neatx:
On the server side I see that once I request a disconnect from the service,
there is rarely a disconnect from the ssh service. The client side nxssh
process just sits there forever, and I've given it up to 1 hour to die. If I
kill the process on the client side, the server side logs show the
disconnect, the connection is cleaned up, and the session close seems to be
processed correctly.
Unfortunately nxclient doesn't seem to have a debug option, so it's hard to
see what is going on on that side. The limited logs available for the client
sessions seem to be nearly identical, other than the notice of a terminated
nxssh after waiting 2 minutes post screen logout. In fact, those client side
logs generally don't log anything about the connection closing.
If there are other troubleshooting methods for watching what's going on, I'd
be willing to give them a try.
To sniff an nxssh session you need to become a man-in-the-middle. The last
time I did it was 10 years ago, so modern tools should make it easier. I
could not find a Wireshark solution, but Ettercap seems to be just what you
need: http://ettercap.sourceforge.net/
dsniff also claims to be capable of this:
http://www.monkey.org/~dugsong/dsniff/
You can record the SSH conversation by replacing the nx user's shell with a
program that logs all input/output and executes the login program behind
it. Socat (invoked in a shell script) can be used for this.
I've got the same issue here.
I'm not sure it's a neatx problem as I've been unable to restore any
session with both freeNX and neatX since upgrading to Ubuntu 10.04.
Using neatX I have the same error as reported in this thread. Cleaning the
/var/lib/neatx/sessions directory solves the problem about creating a new
session though.
My client is the nomachine windows client 3.4.0-7.
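For the cleanup mentioned above, here is a cautious sketch that only touches directories whose names look like the 32-hex-digit session IDs seen in this thread. It defaults to a dry run. The function name and the pattern check are my own choices, and the path is simply the one quoted here, so adjust it to your install. Note that this throws the sessions away for good.

```python
import os
import re
import shutil

# Session IDs in this thread are 32 hexadecimal digits.
SESSION_ID = re.compile(r"[0-9A-Fa-f]{32}")


def clean_sessions(root, dry_run=True):
    """Remove session directories under `root`; return the paths affected.

    With dry_run=True nothing is deleted and the list shows what would go.
    """
    affected = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        # Skip anything that is not a plain directory named like a session ID.
        if not SESSION_ID.fullmatch(name):
            continue
        if os.path.islink(path) or not os.path.isdir(path):
            continue
        if not dry_run:
            shutil.rmtree(path)
        affected.append(path)
    return affected
```

Run `clean_sessions("/var/lib/neatx/sessions")` first to see what would be removed, then again with `dry_run=False` once the list looks right.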
Even if it's the client that is leaving around old stale sessions,
shouldn't neatx automatically clean them up rather than bailing? There
must be a way to test if sessions are still live or not. Does anybody know
what is actually wrong with the stale sessions?
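On the "test if sessions are still live" question: one generic POSIX way to check whether a process still exists is to send it signal 0. Where (or whether) neatx records the agent's PID for a session is not stated in this thread, so this sketch covers only the liveness test itself, and `pid_is_alive` is a name I made up.

```python
import os


def pid_is_alive(pid):
    """Return True if a process with this PID currently exists (POSIX only)."""
    if pid <= 0:
        # 0 and negative values address process groups, not a single process.
        return False
    try:
        # Signal 0 performs the existence and permission checks only;
        # nothing is delivered to the process.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

A server could in principle use a check like this to decide whether a saved session still has an agent behind it. As later comments show, though, a session can be reachable from one client and not another, so a live PID alone may not settle the matter.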
heya,
Just experienced it again when one of my users couldn't log in.
Yeah, I have to agree with Luke, surely there's an alternative to just
aborting like this?
Cheers,
Victor
Right, if it tries to start up an old session and the session is stale,
shouldn't it just start a new session? Does anybody know exactly where the
failure is occurring?
I have the same problem on a clean install of Ubuntu 10.04 64bit.
I have found a workaround for those using windows clients connecting to
ubuntu 10.04 machines. If you "disconnect" the session instead
of "terminating" it, it will not create a new session the next time you
connect to the machine thereby mitigating the failure mode experienced here
that is only remedied by deleting the contents of the session folder.
Hi guys, I have found a workaround.
In my neatx.conf:
-----------------
xsession-path = /path/to/my/startgnomesession
startgnomesession:
--------------------
#!/bin/sh
# Execute the GNOME session. This command exits only when the user
# terminates the session.
/usr/bin/dbus-launch --exit-with-session /usr/bin/gnome-session
# Wait for GNOME to finish exiting.
sleep 1
# NX_ROOT is the path to the user session file.
# Delete it in the background after 10 seconds.
if [ -n "${NX_ROOT}" ]; then
sleep 10 && rm -rf "${NX_ROOT}" >/dev/null 2>&1 &
fi
Neatx is a great project!
I'm a big fan of Neatx on Ubuntu 10.04 x86. Normally I can reconnect to my
sessions if they are cleanly disconnected. However, every once in a while
the client times out before the connection to a session can be established.
Are there debug logs I could look at on the server? I have to do the server
session cleanup to get things moving again.
Had a weird breakthrough with this: if sessions are "dead", sometimes they
can be opened from one machine but not another. For example, on client A,
I open a connection to server S, then later disconnect, go home, and
re-connect to server S from client B (reconnecting to the same disconnected
session). I leave this connection running all night long. For some reason
the NXClient software crashes and when I wake up in the morning the window
is gone. I try to reconnect to server S from client B, and it times out.
I thought it was time to do the usual "ssh to server S then rm -rf
/var/lib/neatx/sessions/*". However, instead I went back to work and tried
reconnecting from client A, and it reconnected fine. It is the *same*
session in all cases, but somehow the combination of some staleness in
/var/lib/neatx/sessions on the server and some brokenness on crashed client
B seems to disallow reconnecting to the session from there, whereas client A
can still reconnect.
The attached script seemed to solve this issue for me. I simply run it at
the close of every session. Works great!
Attachments:
clean-sessions.sh 236 bytes
mmdj4u: yes, but as I mentioned in Comment 37, the stale sessions are not
always dead (sometimes they can be reached from one machine but not
another). Removing the session directories kills them. Sometimes you
really need to re-connect to an existing non-dead-but-somehow-stale
session, not just kill it...
Also it's a shame that this is even a problem. Is anyone at Google still
working on neatx?
I second the last comment. I need to be able to reliably connect back to a
session that was previously started, not just blow it away.
I wish this were fixed, because I find that neatx is much more CPU friendly
than nomachine's NX. (When I'm disconnected from NX, I have 15% cpu usage
by NX_agent; when I use neatx, I don't have any CPU usage that registers
more than 0 when I'm disconnected, though the processes in the disconnected
session are using CPU.) I'm running a CPU intensive task on neatx and NX
doesn't cut it for me.
Rob
I have found that sometimes you can reattach to the same "defunct" session
from the same computer. I'm trying to discover what the trick is. At this
point what I think I did that might have been relevant is to disconnect my
VPN and use another one or maybe the same one. I also tried using QTNX
instead of the nomachine client (unsuccessfully).
Rob
Further testing indicates if I just disconnect the VPN and then reconnect
it (same VPN), I can then reconnect to the "defunct" session.
I don't know if this is related, but sometimes when I am not using a VPN,
and cannot reconnect, it claims it has timed out. When I suspend the
machine I was trying to use to connect to it, upon wake up, I get a message
saying that my connection was severed, and now I am disconnected, which is
strange because I wasn't able to reconnect.