I am having trouble with the Indy 9 Telnet Client component freezing in BDS
2006.
It will run just fine for quite a while (I connect and disconnect a lot)
and then suddenly it will stop receiving data (I know this because I have
another test app acting as the telnet server and it sends the data), and
when I try to disconnect, the entire app just stops: timers, the works.
The TIdTelnet component is set to use threads.
I even placed a TIdAntiFreeze component on the form just in case, but that
made no difference either.
Any help would be greatly appreciated.
Regards,
Andre'
> It will run just fine for quite a while (I connect and disconnect a lot)
> and then suddenly it will stop receiving data (I know this because I
> have another test app acting as the telnet server and it sends the data)
> and when I try to disconnect the entire app just stops.
You have likely deadlocked the Telnet component. It is a multi-threaded
component, so you have to be careful with it. Please show your actual code.
Gambit
It's a fairly simple application.
I am connecting to a buffer-box that gathers data from a PABX.
It uses a simple Telnet server and I connect to this on port 23 and request
data from it.
I need to disconnect when I am done getting the data so that I can connect
to the next buffer-box.
Basically I instantiate a TIdTelnet in code.
I use a timer to make the connections and request the data.
I then use the OnDataAvailable method to grab the incoming data and insert
into a TThreadList called Incoming_Queue.
Inside the timer I check to see if there is data in my queue which I then
process.
Both the connect and the disconnect happen inside the timer.
Now the thing is that in testing this app, I have been connecting to and
disconnecting from the (simulated) buffer boxes every 20 seconds in order to
simulate a long usage period. In real life this would happen once an hour.
Sometimes the app will run for 5 hours before it hangs, other times for less
than an hour.
The reason I recreate the TIdTelnet component every time is because I was
hoping to sidestep the problem this way.
It *seems* to have helped a little, previously I could not run the app for
much more than 2 hours.
But it has not removed the problem.
I have included the most important parts of the code while having trimmed
the timer code so as not to confuse. (--snip-- indicates a cut in the code
to simplify).
See code below.
I would greatly appreciate any advice.
Regards,
Andre'
void __fastcall Tcdrtelnet::Timer2Timer(TObject *Sender)
{
  // This timer is used to send requests to the buffer box(es) for data and to
  // control connects and disconnects
  AnsiString branch_string;
  AnsiString branch_ip;
  int pos;
  Timer2->Enabled = false;
  try
  {
    try
    {
      Timer2Counter++;
      EventLogger.logEvent(ET_Info, "Timer2Counter = " +
        IntToStr(Timer2Counter));
      // first we check some stuff once a second
      --snip--
      if (do_count)
      {
        counter++;
        if (state == 10)
        {
          if (counter >= 10)
          {
            try
            {
              EventLogger.logEvent(ET_Info,
                "No data recieved for 10 counts, setting state from 10 to 0");
              Sleep(500); // wait to make sure network comms are done
              MyDisconnect(); // <----- this is where everything hangs when things go wrong
              EventLogger.logEvent(ET_Info, "Closing and renaming file 7: "
                + filename);
              if (OutStream)
              {
                CloseAndRenameFile(filename);
              }
              state = 0;
              record_state = 0;
            }
            catch(Exception &E)
            {
              EventLogger.logEvent(ET_Error,
                "Exception occurred on no more data: " + E.Message);
            }
          }
        }
      }
      // now we can carry on with the normal timed events
      // check if it's time to connect and request data
      EventLogger.logEvent(ET_Info, "Cycle_Interval = " +
        FloatToStr(Cycle_Interval));
      if (Timer2Counter < Cycle_Interval)
      { // next timer...
        return;
      }
      Timer2Counter = 0;
      NoTimerCounter = 0;
      EventLogger.logEvent(ET_System, "Timer2Timer");
      bool Send_Request = false;
      // here I would retrieve any data from the incoming queue and process it
      --snip--
      // here we go and connect to each of the buffer boxes in turn and poll each one
      // set up the buffer box connection info
      --snip--
      CreateTelnet(branch_ip); // create the new instance of the telnet component
      --snip--
      try
      {
        IdTelnet2->Connect();
        AnsiString asMessage = "Connected to: " + branch_name + " " +
          branch_ip;
        timer_state = 2;
      }
      catch(...)
      {
        AnsiString asMessage = "Failed to connect to: " + branch_name +
          " " + branch_ip;
        EventLogger.logEvent(ET_Error, asMessage);
      }
      --snip--
      IdTelnet2->WriteLn("send 99"); // send the request, this asks for up to 99 lines of data
      state = 10; // sent a record request
      --snip--
    }
    catch(Exception &E)
    {
      EventLogger.logEvent(ET_Info, "Exception occurred (Timer2Timer): " +
        E.Message);
    }
  }
  __finally
  {
    Timer2->Enabled = true;
    EventLogger.logEvent(ET_Info,"Timer2 Finally");
  }
}//Timer2Timer
//---------------------------------------------------------------------------
void __fastcall Tcdrtelnet::IdTelnet1DataAvailable(TIdTelnet *Sender,
  const AnsiString Buffer)
{
  try
  {
    counter = 0;
    AnsiString * pTTBuffer;
    pTTBuffer = new AnsiString;
    // copy the content over into a new AnsiString for insertion into the queue
    *pTTBuffer = Buffer;
    Incoming_Queue->Add((void *)pTTBuffer);
    // give me a second and then fire the timer
    Timer2Counter = Cycle_Interval - 1;
  }
  catch(Exception &E)
  {
    EventLogger.logEvent(ET_Error,
      "Exception occurred (IDTelnet1DataAvailable): " + E.Message);
  }
  return;
}//IdTelnet1DataAvailable
//---------------------------------------------------------------------------
void __fastcall Tcdrtelnet::MyDisconnect()
{
  try
  {
    if (IdTelnet2->Connected())
    {
      IdTelnet2->Disconnect(); // <-this is where it all hangs
    }
  }
  catch(Exception &E)
  {
    EventLogger.logEvent(ET_Error, "Exception occurred (MyDisconnect): " +
      E.Message);
  }
}//MyDisconnect
//---------------------------------------------------------------------------
void __fastcall Tcdrtelnet::CreateTelnet(AnsiString branch_ip)
{
  try
  {
    EventLogger.logEvent(ET_System, "Creating new IdTelnet2");
    if (IdTelnet2)
    {
      if (IdTelnet2->Connected())
      {
        IdTelnet2->Disconnect();
      }
      delete IdTelnet2;
    }
    IdTelnet2 = new TIdTelnet(NULL);
    IdTelnet2->Host = branch_ip;
    IdTelnet2->OnDataAvailable = IdTelnet1DataAvailable;
    IdTelnet2->ThreadedEvent = true;
  }
  catch(Exception &E)
  {
    EventLogger.logEvent(ET_Error, "Exception occurred (CreateTelnet): " +
      E.Message);
  }
}//CreateTelnet
//---------------------------------------------------------------------------
"Remy Lebeau (TeamB)" <no....@no.spam.com> wrote in message
news:44f6917c$1...@newsgroups.borland.com...
>I would greatly appreciate any advice.
2 suggestions:
1) Don't rely on __finally catching a return;
There is a "bug" with __finally and return.
In this case, if it hits, your timer stays off.
2) Restructure your code
If "state" is a state machine, it might be clearer to use a
switch(state) with the cases in state-order.
Added benefit: breaking out of the switch allows you to drop to the
__finally and timer-enable, no internal returns needed.
At the very least, restart the timer before every return statement,
and see if things don't improve.
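As an illustration of the restructuring suggested above, here is a minimal sketch in standard C++ (not C++Builder/VCL code). `TimerSketch`, `ReenableOnExit`, `PollState` and `step_state_machine` are hypothetical names invented for this example, and the two states are a simplification of the original `state` values. A scope guard re-enables the timer on every exit path, so the result does not depend on how `__finally` treats an early `return`, and the states are handled in one `switch`.

```cpp
#include <cassert>

// Hypothetical stand-in for a TTimer: only the Enabled flag matters here.
struct TimerSketch {
    bool enabled = true;
};

// Disables the timer on entry and re-enables it when the scope is left,
// whether by return, break, or exception.
class ReenableOnExit {
public:
    explicit ReenableOnExit(TimerSketch& t) : timer_(t) { timer_.enabled = false; }
    ~ReenableOnExit() { timer_.enabled = true; }
    ReenableOnExit(const ReenableOnExit&) = delete;
    ReenableOnExit& operator=(const ReenableOnExit&) = delete;
private:
    TimerSketch& timer_;
};

enum class PollState { Idle, AwaitingData };

// One timer tick. Returns the next state; idle_ticks counts ticks without data.
PollState step_state_machine(TimerSketch& timer, PollState state, int& idle_ticks) {
    assert(idle_ticks >= 0);
    ReenableOnExit guard(timer);
    switch (state) {
    case PollState::Idle:
        idle_ticks = 0;                  // a request would be sent here
        return PollState::AwaitingData;  // early return: the guard still runs
    case PollState::AwaitingData:
        if (++idle_ticks >= 10) {
            return PollState::Idle;      // timed out: disconnect would go here
        }
        return PollState::AwaitingData;
    }
    return state;
}
```

In the real handler the guard would wrap `Timer2->Enabled`; the point is only that no exit path can leave the timer switched off.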
I follow what you are saying, but in this case I don't think that's the
problem.
I write events to a log file (most of which I stripped out of the example
code to simplify things).
The log file shows that I send the request for data, and 10-12 seconds later
the program times out with the message:
"No data recieved for 10 counts"
I can then see it going into the MyDisconnect function and calling the
disconnect.
I write a log message right before and directly after the actual disconnect
call, and the log file shows the one before it and nothing after that.
In fact I have added another timer to the form that simply fires once a
second and writes the state of the original timer to the log file, eg
whether it is enabled or not.
Even this extra timer stops at this point.
I originally had a similar problem when I tried to disconnect from inside
the OnDataAvailable event.
Looking in the Delphi source for the Telnet component, I figured out that
this was causing a deadlock, as the main thread was waiting for the
connection thread to terminate, which was stuck inside the OnDataAvailable
method also waiting for itself to terminate.
This seems to be something similar, except that the disconnect is no
longer being called in the OnDataAvailable method but rather in the timer.
It seems as if the Telnet component receives the incoming data, but for some
reason gets stuck and doesn't call the OnDataAvailable method. When I then
try to disconnect, the Disconnect method waits for the connection thread to
terminate (which is stuck), so it never returns, thus locking up the main
thread (from which the disconnect was called).
A workaround for this *might* be to spawn my own thread in which to call
the disconnect.
But then I will have the problem of killing this new thread if it should get
locked up with the telnet component.
And of course it could introduce extra complexity in my own code with shared
data, critical sections, etc.
Does this make any sense? Or am I on the wrong track here completely?
Andre'
"Bob Gonder" <no...@notmindspring.invalid> wrote in message
news:adsdf2dk8k4s0tefv...@4ax.com...
> I can then see it going into the MyDisconnect function and calling
> the disconnect. I write a log message right before and directly after
> the actual disconnect call and the log file shows the one before it
> and nothing after that.
In order for that to happen, the TIdTelnet's internal reading thread would
have to be deadlocked. Under normal situations, disconnecting the socket
would terminate the thread immediately, but if the thread is blocked then it
would not be able to detect the socket being closed, which would in turn
block Disconnect() because it waits for the thread to terminate before then
exiting.
> In fact I have added another timer to the form that simply fires once
> a second and writes the state of the original timer to the log file, eg
> whether it is enabled or not.
> Even this extra timer stops at this point.
Sounds like you are deadlocking the main thread as well, since TTimer is a
message-based timer. If you are calling Disconnect() in the context of the
main thread, as the internal reading thread is blocked so that it cannot
terminate itself, then Disconnect() is going to be blocked, and thus prevent
the main thread from processing pending messages.
> I originally had a similar problem when I tried to disconnect from
> inside the OnDataAvailable event.
You cannot do that. That is a guaranteed deadlock, since Disconnect() would
be waiting on the thread to terminate, but the thread would be waiting for
Disconnect() to exit.
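The self-wait described above can be sketched in standard C++. This is not Indy's implementation: `ReaderSketch` and its members are hypothetical, and the "reading thread" here does nothing but fire one callback and spin until asked to stop. The sketch shows one possible guard: a disconnect that is called from the reading thread itself only sets the stop flag and refuses to wait, because waiting there would mean the thread waiting for its own termination.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical stand-in for a component that owns a reading thread.
class ReaderSketch {
public:
    explicit ReaderSketch(std::function<void(ReaderSketch&)> on_data)
        : on_data_(std::move(on_data)) {}
    ~ReaderSketch() { disconnect(); }

    void start() {
        assert(!worker_.joinable());  // start() is meant to be called once
        worker_ = std::thread([this] {
            reader_id_.store(std::this_thread::get_id());
            on_data_(*this);          // event fired in reading-thread context
            while (!stop_.load()) std::this_thread::yield();
        });
    }

    // Returns true if the reader was stopped and waited for.
    // Returns false when called from the reading thread: waiting there
    // would never finish, so only the stop flag is set.
    bool disconnect() {
        stop_.store(true);
        if (reader_id_.load() == std::this_thread::get_id()) return false;
        if (worker_.joinable()) worker_.join();
        return true;
    }

private:
    std::function<void(ReaderSketch&)> on_data_;
    std::atomic<bool> stop_{false};
    std::atomic<std::thread::id> reader_id_{};
    std::thread worker_;
};
```

With a component that has no such guard, the equivalent approach is to set a flag inside the event handler and perform the actual disconnect later from another thread.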
> This seems to be something similar, except that the disconnect
> is now no longer being called in the OnDataAvailable method
> but rather in the timer.
That does not negate the possibility of the reading thread still becoming
deadlocked.
> It seems as if the Telnet component receives the incoming data, but
> for some reason gets stuck and doesn't call the OnDataAvailable
> method.
That cannot happen. What is more likely the case is that the
OnDataAvailable event handler is becoming deadlocked.
Gambit
I agree with everything you have said.
What I can't figure out is why the OnDataAvailable would deadlock, or for
that matter the thread that is calling it.
The very first thing I do on entering the OnDataAvailable method is to log a
message to the log file.
When this happens, I don't even get this message in the log file, so it's not
even getting into the code of the OnDataAvailable handler.
So it seems it's not something in my code in the OnDataAvailable handler.
The logging code has a single critical section which is only locked while the
actual writing to the file takes place, so as to
prevent threading issues when writing to the log.
This same code is used in a number of other projects and has never caused
any issues, so I don't think it's that either.
This one has me totally floored.
Andre'
"Remy Lebeau (TeamB)" <no....@no.spam.com> wrote in message
news:44f7e8a0$1...@newsgroups.borland.com...
> What I can't figure out is why the OnDataAvailable would
> deadlock, or for that matter the thread that is calling it.
I see that you have the TIdTelnet::ThreadedEvent property set to true. That
makes the OnDataAvailable event handler code run in the context of the
reading thread, so you have to make sure that its code is thread-safe. Is
your logger thread-safe? What kind of container are you using for your
Incoming_Queue, and is it thread-safe? I do notice that you are accessing a
couple of counter-related variables in the event handler without protection.
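For the unprotected counters mentioned above, one option is atomic variables. The following is a minimal sketch in standard C++ under assumptions of my own: `SharedCounters`, `on_data_received` and `poll_due_on_tick` are hypothetical names, and the two fields only loosely mirror `counter` and `Timer2Counter` from the posted code.

```cpp
#include <atomic>
#include <cassert>

// Counters touched by both the reading thread and the timer (main thread).
struct SharedCounters {
    std::atomic<int> idle_ticks{0};   // loosely mirrors `counter`
    std::atomic<int> timer_ticks{0};  // loosely mirrors `Timer2Counter`
};

// Called in the reading thread when data arrives.
void on_data_received(SharedCounters& c, int cycle_interval) {
    assert(cycle_interval > 0);
    c.idle_ticks.store(0);
    c.timer_ticks.store(cycle_interval - 1);  // make the next tick do a cycle
}

// Called once per timer tick. Returns true when a poll cycle is due.
// Each access is atomic, but the check-then-reset sequence as a whole is
// not; a mutex would be needed if that compound step had to be indivisible.
bool poll_due_on_tick(SharedCounters& c, int cycle_interval) {
    c.idle_ticks.fetch_add(1);
    int ticks = c.timer_ticks.fetch_add(1) + 1;
    if (ticks < cycle_interval) return false;
    c.timer_ticks.store(0);
    return true;
}
```

Atomics remove the data race on each variable; they are unlikely to be the cause of a hard lockup on their own, which matches the later discussion in this thread.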
> When this happens, I don't even get this message in the log file, so
> it's not even getting into the code of the OnDataAvailable handler.
Since you have the ThreadedEvent property set to true, then that is only
possible if the reading thread is not receiving any data, meaning that it is
blocked in an idle reading operation. If actual data is received, then
there is no way the reading thread can become blocked before triggering the
OnDataAvailable event.
With that said, there is another possibility to consider - since you are
using a Telnet component, what does the actual transmitted data look like?
Although it is probably not the case in this situation, it is technically
possible that the data itself could be interpreted entirely as Telnet
control sequences with no application data available. If that were to
happen, then there would be no reason for the OnDataAvailable to be
triggered.
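To illustrate the point about control sequences, here is a simplified sketch in standard C++ of how a Telnet client separates commands from application data. It is not Indy's parser. The command byte values come from RFC 854; `strip_telnet_commands` is a hypothetical helper, and it assumes the buffer holds complete sequences.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Telnet command bytes as defined in RFC 854.
constexpr unsigned char IAC = 255;   // "interpret as command"
constexpr unsigned char WILL = 251;  // 251..254 are WILL, WONT, DO, DONT
constexpr unsigned char SB = 250;    // subnegotiation begin
constexpr unsigned char SE = 240;    // subnegotiation end

// Returns only the application bytes in `in`, dropping Telnet commands.
std::string strip_telnet_commands(const std::string& in) {
    auto byte_at = [&in](std::size_t k) { return static_cast<unsigned char>(in[k]); };
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (byte_at(i) != IAC) { out.push_back(in[i]); ++i; continue; }
        if (i + 1 >= in.size()) break;        // truncated command: drop it
        unsigned char cmd = byte_at(i + 1);
        if (cmd == IAC) {                     // IAC IAC is a literal 0xFF data byte
            out.push_back(in[i]);
            i += 2;
        } else if (cmd >= WILL) {             // WILL/WONT/DO/DONT plus option byte
            i += 3;
        } else if (cmd == SB) {               // skip up to and including IAC SE
            i += 2;
            while (i + 1 < in.size() && !(byte_at(i) == IAC && byte_at(i + 1) == SE)) ++i;
            i += 2;
        } else {
            i += 2;                           // other two-byte commands
        }
    }
    return out;
}
```

A buffer made up only of negotiation commands yields an empty string, so there would be nothing to hand to a data event. Plain ASCII records like the ones in this thread pass through unchanged.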
Gambit
The logger should be thread-safe; it has a single critical section which is
only locked while actually writing to the file.
It is used in a number of other multi-threaded apps without a problem.
Incoming_Queue is a TThreadList. All I do in the OnDataAvailable is Add,
which is supposed to be completely thread-safe.
In my data processing code I have changed it around so that I only lock the
incoming queue inside the loop and then unlock it again immediately, so I
don't have it locked for any long periods of time.
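The short-lock approach described above can be sketched in standard C++ as follows. `IncomingQueue` is a hypothetical stand-in for the TThreadList, not VCL code. The consumer swaps the whole backlog out under the lock and processes it afterwards, so the producer is never blocked while records are being parsed.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the TThreadList used as Incoming_Queue.
class IncomingQueue {
public:
    // Called from the reading thread for each received chunk.
    void add(std::string chunk) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(std::move(chunk));
    }

    // Called from the timer. The lock is held only for the swap; the
    // caller then processes the returned chunks without holding it.
    std::vector<std::string> drain() {
        std::vector<std::string> taken;
        std::lock_guard<std::mutex> lock(mutex_);
        taken.swap(items_);
        return taken;
    }

private:
    std::mutex mutex_;
    std::vector<std::string> items_;
};
```

Storing the strings by value also removes the need to `new` and later `delete` a separate string object per chunk.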
counter gets set to zero in the OnDataAvailable; this is simply used to see
when I last received data from the server.
Timer2Counter gets set as well; this is to force the timer code to do some
extra processing.
Neither of these *should* cause a lockup (but you never know).
>> When this happens, I don't even get this message in the log file, so
>> it's not even getting into the code of the OnDataAvailable handler.
>
> Since you have the ThreadedEvent property set to true, then that is only
> possible if the reading thread is not receiving any data, meaning that it
> is blocked in an idle reading operation. If actual data is received, then
> there is no way the reading thread can become blocked before triggering
> the OnDataAvailable event.
>
> With that said, there is another possibility to consider - since you are
> using a Telnet component, what does the actual transmitted data look like?
> Although it is probably not the case in this situation, it is technically
> possible that the data itself could be interpreted entirely as Telnet
> control sequences with no application data available. If that were to
> happen, then there would be no reason for the OnDataAvailable to be
> triggered.
Well, from the test server that I have, I am actually sending the same data
over and over again.
Data looks like this:
E 022 00 T010267 1165567 02/20 14:44:31 00:01:02.0\r\n
&0124607806XXXXXX 3905XXX 000 000\r\n
& 000 01 0\r\n
\r\n
N 021 00 1165545 A012022 02/20 14:39:39 00:04:52.0 A00833014904\r\n
& 000 000\r\n
& 000\r\n
\r\n
S 023 00 T010170 4830 02/20 14:43:45 00:00:48.0\r\n
&0184410151XXXXXX 4830XXX 000 000\r\n
& N 000 01 0\r\n
These 3 records get sent over and over again, in different order at random.
The server will send the following string when it's finished sending (decided
randomly):
TotalRecords\r\n
And also randomly decided it will sometimes send the following instead of
the above records:
NO DATA\r\n
The server is written in BCB6, with an Indy Telnet Server component.
When the lockup occurs I can see that the server has sent the data to the
client, but the client doesn't seem to register it.
I can't trace into it either, because sometimes it runs for hours before this
happens.
:-(
>I see that you have the TIdTelnet::ThreadedEvent property set to true.
Is TIdTelnet able to have multiple copies running?
I'd be tempted to create a thread in the Timer that would handle the
entire session in blocking mode.
That would keep Main from blocking, and probably make the code
simpler.
Also, if one node hangs, the app could continue polling the other
boxes.
> Is TIdTelnet able to have multiple copies running?
Yes.
> I'd be tempted to create a thread in the Timer that would
> handle the entire session in blocking mode.
>
> That would keep Main from blocking, and probably make the
> code simpler. Also, if one node hangs, the app could continue
> polling the other boxes.
TIdTelnet already does most of that for you. Like I said, incoming data is
received in a separate thread that TIdTelnet runs internally.
Gambit
From looking at the IdTelnet.pas file, it would seem that if I were to call
the Disconnect method from inside my main thread while the OnDataAvailable
is busy within the connection's thread, that the main thread would block
until the connection thread was terminated, since the Disconnect method
calls Terminate for the thread and then does a waitfor on it.
This should then not cause an error or a locking condition so that in theory
I can call Disconnect safely at any time.
Do you agree with this?
Andre'
> From looking at the IdTelnet.pas file, it would seem that if I were to
> call the Disconnect method from inside my main thread while the
> OnDataAvailable is busy within the connection's thread, that the main
> thread would block until the connection thread was terminated, since the
> Disconnect method calls Terminate for the thread and then does a waitfor
> on it.
When the ThreadedEvent property is False, OnDataAvailable is triggered via a
call to TThread::Synchronize(). Synchronize() requires the main thread to
reply to the request, so the main thread can't be fully blocked, or else the
thread will also be blocked. Fortunately, TThread::WaitFor() does allow
Synchronize() calls to finish their work. So even if Disconnect() (and thus
TThread::WaitFor()) were called in the context of the main thread,
TIdTelnet's reading thread should still be able to terminate cleanly and
then release WaitFor() from its waiting. The only time that would not
happen is if your OnDataAvailable event handler is performing operations
that require the main message queue (which is blocked), in which case you
will have to call Application->ProcessMessages() in your code.
Gambit
I discovered (I don't know how I missed this one, too much in the log files
maybe) that there was a slight issue with the server app that caused a
failed connection once in a long while (socket error 10061, Connection
Refused).
This happened every time that this strange behaviour of my service occurred.
It seems that for a moment the server was not accepting connections, and at
that moment my service tried to connect. I then "timed out" after 10
seconds and tried to disconnect because I wasn't getting any data.
Attempting to disconnect at this point for some reason caused everything to
stop dead.
First I simply removed the call to MyDisconnect so that even though I timed
out, I did nothing about it.
This allowed the TIdTelnet component to "recover" from the exception on its
own and processing continued as normal.
Next I went and cleaned up the code in the server just a little, which now
seems to prevent the 'connection refused' error from happening at all
(the server was not releasing the old connection correctly under all
circumstances and so could not pick up a new one).
It's now been running for 15 hours straight with not a problem in sight.
And that's with connecting and disconnecting twice every 20 seconds.
So far it looks good.
Does this make any sense?
Andre'
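The sequence described above (refused connection, then a timeout, then a disconnect attempt) suggests tracking whether the connect actually succeeded. Below is a minimal sketch in standard C++ with hypothetical names (`PollSession`, `begin_poll`, `tick_needs_disconnect`); it does not explain why Disconnect hung in Indy 9 after the refused connection, it only keeps the timeout path from asking for a disconnect when no connection was ever established.

```cpp
#include <cassert>

enum class SessionState { Idle, AwaitingData };

struct PollSession {
    SessionState state = SessionState::Idle;
    bool connected = false;  // set only after a connect attempt succeeds
    int idle_ticks = 0;
};

// connect_ok is the outcome of the connection attempt, for example
// false when the connect call threw "connection refused".
void begin_poll(PollSession& s, bool connect_ok) {
    s.connected = connect_ok;
    s.idle_ticks = 0;
    // Only wait for data if there is a connection to receive it on.
    s.state = connect_ok ? SessionState::AwaitingData : SessionState::Idle;
}

// Called once per timer tick. Returns true when the caller should
// disconnect because no data arrived within timeout_ticks.
bool tick_needs_disconnect(PollSession& s, int timeout_ticks) {
    assert(timeout_ticks > 0);
    if (s.state != SessionState::AwaitingData) return false;
    if (++s.idle_ticks < timeout_ticks) return false;
    s.state = SessionState::Idle;
    bool should_disconnect = s.connected;
    s.connected = false;
    return should_disconnect;
}
```

In the posted timer code this would correspond to sending "send 99" and setting `state = 10` only inside the branch where `Connect()` did not throw.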