mTCP NetDrive hangs

Tim Ur

Mar 23, 2024, 1:20:13 PM
to mTCP
Hello, Mr. Brutman! :-)

    I have a strange problem with NetDrive: it hangs about 5 seconds after it starts. My configuration is an HP Vectra VLi8 (Pentium III, 500 MHz, 192 MB RAM, Matrox graphics, SB16 AWE32) with an integrated 3Com PCI Ethernet adapter. The packet driver is 3C90XPD.COM, v5.2.6. The system is MS-DOS 7.10 (Windows 98, ver. 4.10.2222), with no drivers loaded except netdrive.sys. I tried different servers, including your public one.
    How can I catch this error? I can't imagine how to debug this... :-\

P.S. The other utilities work correctly: ping, dnstest, sntp, dhcp, and htget all run with no problems.

WBR, Tim.

Michael Brutman

Mar 26, 2024, 10:35:35 PM
to mTCP
These things are really hard to debug; I wish that I could see the machine in person.  I've had a few scattered reports that mTCP works fine but NetDrive doesn't.  NetDrive is a device driver, so I don't have the same level of tracing in it that I do elsewhere, and the only way to debug it is to use tcpdump on a Linux machine.

Some ideas and things to try:
  • Are you using the latest version of the device driver?  That's the 2024-02-18 one.
  • Please start the server with debug logging turned on.  That will tell me if packets are getting out from the device driver to the server.  The command line will look like "netdrive -log_file log.txt -log_level debug serve".
  • I'd like to see an mTCP packet trace from a ping command.  To do that use "set logfile=ping.log", "set debugging=255", and then "ping brutman.com".  That will let me see UDP packets from the DNS lookup and other things.
  • I can give you some debug code that turns off things like UDP checksumming to see if that makes it work.  That code was tricky in assembler and I've had one bug so far.  I think it's good now but you never know.  (Computing the checksum incorrectly would make it look like a connectivity problem.)
  • I have found one packet driver bug so far: the EtherSlip driver was stuffing the wrong MAC address into received packets.  The packet trace from Ping will allow me to ensure the 3Com driver isn't doing that.
  • Try another Ethernet card.  I don't know what's wrong with 3Com, but their packet drivers are generally garbage.  I've had a few experiences lately helping people get set up, and the later packet drivers are particularly buggy.  Intel, Western Digital, SMC, D-Link, AMD, etc. all make good cards.  3Com makes good cards too, but they don't seem to have tested their packet drivers, probably figuring nobody was using them.
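
To illustrate the UDP checksum point above: the sketch below is not NetDrive's code (which is written in assembler and is not shown in this thread). It is a generic Python rendering of the standard Internet checksum as applied to UDP over IPv4, with hypothetical function names. It shows why a miscalculated checksum looks like a connectivity problem: the receiver's verification fails and the datagram is dropped silently.

```python
import struct

def ones_complement_sum(data: bytes) -> int:
    """Sum 16-bit big-endian words with end-around carry (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return total

def _pseudo_header(src_ip: bytes, dst_ip: bytes, udp_length: int) -> bytes:
    # IPv4 pseudo-header: source, destination, zero byte, protocol 17 (UDP), UDP length
    return src_ip + dst_ip + struct.pack("!BBH", 0, 17, udp_length)

def udp_checksum(src_ip: bytes, dst_ip: bytes, udp_segment: bytes) -> int:
    """Checksum to place in the UDP header.

    udp_segment is the UDP header plus payload, with the checksum field set to zero.
    """
    data = _pseudo_header(src_ip, dst_ip, len(udp_segment)) + udp_segment
    checksum = ~ones_complement_sum(data) & 0xFFFF
    return checksum or 0xFFFF                # 0 means "no checksum", so send 0xFFFF instead

def udp_checksum_ok(src_ip: bytes, dst_ip: bytes, udp_segment: bytes) -> bool:
    """A received segment verifies if the sum, checksum included, is 0xFFFF."""
    data = _pseudo_header(src_ip, dst_ip, len(udp_segment)) + udp_segment
    return ones_complement_sum(data) == 0xFFFF
```

A sender that gets any step wrong (for example, forgetting the pseudo-header or the end-around carry) produces datagrams that fail `udp_checksum_ok` on the other side, so nothing appears to arrive even though the packets are on the wire.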

-Mike

Tim Ur

Apr 2, 2024, 2:39:33 PM
to Michael Brutman, mTCP
Hello.

    tcpdump? Good idea. Or Wireshark. I can try this later.
Answers, in order:
1) The device driver version is "Feb 18 2024" (screenshot attached).
2) Two log files are attached. NETDRVE1.LOG: I was able to quickly switch to the drive and read the directory. NETDRVE2.LOG: a clean start, then a freeze without any action on my part.
3) PING.LOG is attached.
4) Fine. I am a software engineer with experience in assembly language development, including x86, so I can take a look.
5) OK.
6) I'll look in my garage. I had a lot of ISA Ethernet cards, but that was a long time ago :)

WBR, Tim.

On Wed, Mar 27, 2024 at 05:35, Michael Brutman <mbbr...@gmail.com> wrote:
NETDRVE2.LOG
PING.LOG
NETDRVE1.LOG
VERSION.PNG

Michael Brutman

Apr 22, 2024, 1:41:10 AM
to mTCP
Just following up ...

I sent Tim some debug versions of the code to run and we found out that his packet driver (3Com for a PCI card) doesn't tolerate it when NetDrive sends a packet under the BIOS timer tick interrupt.  That happens when NetDrive has to reply to ARP packets.  This was really surprising to me as I don't know of any other packet drivers that have that problem, or what 3Com might be doing that is so sensitive.  Then again, I've heard lots of horror stories about the later 3Com packet drivers.
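
For readers unfamiliar with what "replying to ARP" involves, here is a minimal Python sketch of building an ARP reply from a received request for IPv4 over Ethernet. It is an illustration only, not NetDrive's code; `build_arp_reply` and its parameters are hypothetical names. In NetDrive the equivalent frame is what gets handed to the packet driver from inside the timer tick handler.

```python
import struct

ETHERTYPE_ARP = 0x0806
ARP_REQUEST, ARP_REPLY = 1, 2
ARP_BODY = "!HHBBH6s4s6s4s"   # htype, ptype, hlen, plen, oper, sha, spa, tha, tpa

def build_arp_reply(request_frame: bytes, my_mac: bytes, my_ip: bytes):
    """Return a 42-byte Ethernet frame answering request_frame, or None.

    None is returned when the frame is not an ARP request for my_ip.
    The result is unpadded; padding to the Ethernet minimum is not handled here.
    """
    if len(request_frame) < 42:
        return None
    _dst, _src, ethertype = struct.unpack("!6s6sH", request_frame[:14])
    if ethertype != ETHERTYPE_ARP:
        return None
    htype, ptype, hlen, plen, oper, sha, spa, _tha, tpa = struct.unpack(
        ARP_BODY, request_frame[14:42])
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        return None                      # not Ethernet/IPv4 ARP
    if oper != ARP_REQUEST or tpa != my_ip:
        return None                      # not a request, or not asking about us
    # Reply goes straight back to the requester: swap sender and target fields.
    ethernet = struct.pack("!6s6sH", sha, my_mac, ETHERTYPE_ARP)
    arp = struct.pack(ARP_BODY, 1, 0x0800, 6, 4, ARP_REPLY,
                      my_mac, my_ip, sha, spa)
    return ethernet + arp
```

If replies like this are never sent, the server's ARP cache entry for the DOS machine eventually expires and traffic to it stops, which is why the reply cannot simply be skipped.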

I am able to work around that specific problem so even that packet driver will work.  I'll post that version in the next few days after I get some more testing time on it.


-Mike

Michael Brutman

May 24, 2024, 1:25:43 AM
to mTCP
A few days turned into a few weeks because I really wanted to understand what was going wrong.  While I have not found the exact problem, I suspect it is uninitialized storage in the packet driver.

The previous workaround was to send ARP responses only while doing reads or writes through the device driver.  That was adequate, but if you were not actively doing disk I/O through the device driver then ARP responses would not be sent.  I found a different workaround that allows me to continue sending ARP responses under the timer interrupt, allowing the code to respond to ARP requests within 55 ms even if you are not actively using disk I/O through the driver.
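
The difference between the two approaches can be modeled with a toy Python sketch. This is not how NetDrive is implemented, and the thread does not say how the new workaround avoids the packet driver problem; the class and callback names are hypothetical. The sketch only shows the behavioral difference: replies flushed only during disk I/O versus replies flushed on every timer tick. The 55 ms figure matches the default PC timer tick period (the 1,193,182 Hz timer clock divided by 65536).

```python
from collections import deque

PIT_HZ = 1_193_182                    # PC timer chip input clock, in Hz
TICK_MS = 65536 / PIT_HZ * 1000       # default tick period, roughly 55 ms

class DeferredArpResponder:
    """Toy model: ARP requests are recorded on receipt and answered later."""

    def __init__(self, send, reply_on_tick: bool):
        self.send = send                    # hypothetical "transmit frame" callback
        self.reply_on_tick = reply_on_tick  # True models replying from the timer tick
        self.pending = deque()

    def on_arp_request(self, request):
        # Receive path: only note the work to do; nothing is transmitted here.
        self.pending.append(request)

    def _flush(self):
        while self.pending:
            self.send(("arp-reply", self.pending.popleft()))

    def on_timer_tick(self):
        # Called every TICK_MS whether or not the drive is in use.
        if self.reply_on_tick:
            self._flush()

    def on_disk_io(self):
        # Called only when DOS reads from or writes to the network drive.
        self._flush()
```

With `reply_on_tick=False`, a request that arrives while the drive is idle stays in `pending` until the next read or write; with `reply_on_tick=True`, it is answered on the next tick, at most about 55 ms later.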

The new code is posted on the web page.  If you have a 3Com 3C90x variant (the PCI cards that use the 3C90XPD.COM packet driver discussed above) then you want this new code.  Otherwise, there is no need to upgrade at this time.


-Mike

