Hyrax BES HD5 Error

Calloway, Chris

Mar 6, 2017, 2:15:24 PM
to sup...@opendap.org, Hong Yi, renci_rayi.con
I’m following up on an unanswered request for information from last week regarding BES.Memory.GlobalArea.MaximumHeapSize.

For a little over a year, I have been running:

bes-3.16.0-1.static.el6.x86_64.rpm
libdap-3.16.0-1.el6.x86_64.rpm
olfs-1.14.1-webapp.tgz
apache-tomcat-7.0.67.tar.gz
jre-1.7.0-openjdk.x86_64

I am attempting to retrieve an nc4 subset of a 3 GB netCDF file, consisting of the dimensions and the first (3-D) variable, called “mean.”

http://hyrax.hydroshare.org/opendap/hyrax/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.nc4?latitude[0:1:194],longitude[0:1:461],time[0:1:365],crs,mean[0:1:365][0:1:194][0:1:461]

The browser spins for several seconds and then responds:

Internal Server Error

The server encountered an internal error or misconfiguration and was unable to complete your request.

Please contact the server administrator, c...@renci.org and inform them of the time the error occurred, and anything you might have done that may have caused the error.

More information about this error may be available in the server error log.

In /usr/tomcat7/logs/catalina.out, there is this error pointed at the BES:

2017-03-06T11:24:40.261 -0500 [thread:ajp-bio-8009-exec-8] [2916219][20] [HTTP-GET] ERROR - opendap.bes.dap2Responders.Netcdf4 - respondToHttpGetRequest()
encountered a BESError: Error {
code = 1001;
message = "fileout.netcdf - Failed to create array of floats for mean: NetCDF: HDF error";
};

2017-03-06T11:24:40.262 -0500 [thread:ajp-bio-8009-exec-8] [2916220][20] [HTTP-GET] ERROR - opendap.coreServlet.OPeNDAPException - anyExceptionHandler():
org.apache.catalina.connector.ClientAbortException: java.net.SocketException: Broken pipe

In /var/log/bes/bes.log, the error is confirmed with the same dearth of information:

[EST Mon Mar 6 11:23:16 2017 id: 11716] 11716 from ip 127.0.0.1, port 39278 request received
[EST Mon Mar 6 11:23:16 2017 id: 11716] 11716 from ip 127.0.0.1, port 39278 [set context errors to xml;] received
[EST Mon Mar 6 11:23:16 2017 id: 11716] 11716 from ip 127.0.0.1, port 39278 [show catalog for
/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc;] received
[EST Mon Mar 6 11:23:18 2017 id: 11624] 11624 from ip 127.0.0.1, port 35390 request received
[EST Mon Mar 6 11:23:18 2017 id: 11624] 11624 from ip 127.0.0.1, port 35390 [set context xdap_accept to 3.2;] received
[EST Mon Mar 6 11:23:18 2017 id: 11624] 11624 from ip 127.0.0.1, port 35390 [set context dap_explicit_containers to no;] received
[EST Mon Mar 6 11:23:18 2017 id: 11624] 11624 from ip 127.0.0.1, port 35390 [set context errors to dap2;] received
[EST Mon Mar 6 11:23:18 2017 id: 11624] 11624 from ip 127.0.0.1, port 35390 [set context max_response_size to 0;] received
[EST Mon Mar 6 11:23:18 2017 id: 11624] 11624 from ip 127.0.0.1, port 35390 [set container in catalog values
catalogContainer,/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc;] received
[EST Mon Mar 6 11:23:18 2017 id: 11624] 11624 from ip 127.0.0.1, port 35390 [define d1 in default as catalogContainer with
catalogContainer.constraint="latitude[0:1:194],longitude[0:1:461],time[0:1:365],crs,mean[0:1:365][0:1:194][0:1:461]";] received
[EST Mon Mar 6 11:23:18 2017 id: 11624] 11624 from ip 127.0.0.1, port 35390 [get dods for d1 return as netcdf-4;] received
[EST Mon Mar 6 11:24:40 2017 id: 11624] Error {
code = 1001;
message = "fileout.netcdf - Failed to create array of floats for mean: NetCDF: HDF error";
};

My /usr/tomcat7/bin/setenv.sh contains:

JAVA_HOME=/usr/lib/jvm/jre-1.7.0-openjdk.x86_64
JRE_HOME=/usr/lib/jvm/jre-1.7.0-openjdk.x86_64
CATALINA_PID="$CATALINA_BASE/tomcat.pid"
export CATALINA_OPTS="$CATALINA_OPTS -Xms512m"
export CATALINA_OPTS="$CATALINA_OPTS -Xmx8192m"

My /etc/bes/bes.conf contains (among other things):

BES.UncompressCache.size=4000

BES.Memory.GlobalArea.EmergencyPoolSize=1
BES.Memory.GlobalArea.MaximumHeapSize=4000
BES.Memory.GlobalArea.Verbose=no
BES.Memory.GlobalArea.ControlHeap=no

Memory profiling during the request shows only 1.5G out of 8G of memory being used, with 6.5G free and no swap in use.

There is no error when requesting ddx, dds, das, info, rdf or the html request form for the dataset. I have no error requesting a dods (DAP 2) object of the dimensions and the aforementioned variable. The resulting dods object is 131M in size. The same Internal Server error occurs if also requesting either an ASCII or nc3 object. If I simply request the dimensions in an nc4 object, there is no error.
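
As a rough cross-check on the 131M figure, the size of the requested “mean” subset can be estimated from the index ranges in the constraint expression. This is only a sketch: it assumes the variable holds 4-byte floats (the BES error does mention an “array of floats”), and the helper name `hyperslab_count` is made up for illustration.

```python
# Estimate the size of the requested "mean" hyperslab.
# Assumption: each value is a 4-byte float.

def hyperslab_count(start, stride, stop):
    """Number of indices selected by a DAP range [start:stride:stop] (stop is inclusive)."""
    return (stop - start) // stride + 1

# mean[0:1:365][0:1:194][0:1:461], one tuple per dimension
ranges = [(0, 1, 365), (0, 1, 194), (0, 1, 461)]

value_count = 1
for start, stride, stop in ranges:
    value_count *= hyperslab_count(start, stride, stop)

BYTES_PER_FLOAT = 4  # assumed
subset_bytes = value_count * BYTES_PER_FLOAT

print(value_count)   # 366 * 195 * 462 values
print(subset_bytes)  # payload size in bytes, before headers and coordinate variables
```

Under that assumption the payload is a little under 132 million bytes, which is in line with the size of the dods response and far below the memory that appears to be free.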

I would appreciate your insight into what I should do next or what remedies I may have available.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Nathan Potter

Mar 6, 2017, 2:19:00 PM
to Calloway, Chris, Nathan Potter, sup...@opendap.org, Hong Yi, renci_rayi.con
Hi Chris,

Sorry we didn’t get back to you sooner. I poked around and was unable to find an answer to your question about the heap size.

I’ll give it some more time today.

Thanks for your patience.


Sincerely,

Nathan
= = =
Nathan Potter ndp at opendap.org
OPeNDAP, Inc. +1.541.231.3317

Nathan Potter

Mar 7, 2017, 12:59:42 PM
to Calloway, Chris, Nathan Potter, sup...@opendap.org, Hong Yi, renci_rayi.con
Hi Chris,

Can you point me to where I might download this file?

http://hyrax.hydroshare.org/opendap/hyrax/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc



Thanks,

N





Calloway, Chris

Mar 7, 2017, 1:30:23 PM
to Hong Yi, sup...@opendap.org, renci_rayi.con, Nathan Potter
Hong,

Would you please help Nathan get a copy of the offending nc file? I’m sure Nathan wants to validate it just as I would if I knew how to get a copy. When I attempt to access it through the Fuse mount as the bes user, I get:

Couldn't get handle: Failure

I just need a copy to put on our FTP server.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Ray Idaszak

Mar 7, 2017, 1:51:02 PM
to Calloway, Chris, Hong Yi, sup...@opendap.org, Nathan Potter
Chris, Nathan,

It's here as a HydroShare resource:

https://www.hydroshare.org/resource/ff2a6f87817544a08c82ebcf119bae80/

Download the NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc file in the above by right-clicking it.

Thanks,
-Ray

Ray Idaszak

Mar 7, 2017, 1:58:15 PM
to Calloway, Chris, Hong Yi, sup...@opendap.org, Nathan Potter
For others cc:'ed on this email, Hong pointed out to me that the file is too large to download through the web interface (i.e. 3GB) so she will work with Chris to stage the data in an accessible place.

Nathan Potter

Mar 7, 2017, 1:58:28 PM
to Ray Idaszak, Nathan Potter, Calloway, Chris, Hong Yi, sup...@opendap.org


Hi Ray,

Thanks for the link:

https://www.hydroshare.org/django_irods/download/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc

Unfortunately when I tried it I got this message:

"File larger than 1GB cannot be downloaded directly via HTTP. Please download the large file via iRODS clients."

Is there another way?

Nathan Potter

Mar 7, 2017, 1:58:50 PM
to Ray Idaszak, Nathan Potter, Calloway, Chris, Hong Yi, sup...@opendap.org
Thanks!!

Nathan

Ray Idaszak

Mar 7, 2017, 2:10:55 PM
to Nathan Potter, Calloway, Chris, Hong Yi, sup...@opendap.org
Hi Nathan,

There is a formal way to do this described here:

https://pages.hydroshare.org/creating-and-managing-resources/uploading-large-files-into-hydroshare/

But you don't have to do the above as I believe Chris is already working with Hong to stage the data behind the scenes in an area you can access.

Calloway, Chris

Mar 7, 2017, 3:15:16 PM
to Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org
Nathan,

Thanks to Hong Yi, it is available here:

http://people.renci.org/~hongyi/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc

I verified that it can be downloaded from there and the ‘mean’ variable accessed through the Python netCDF4 package.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Nathan Potter

Mar 9, 2017, 6:03:31 PM
to Calloway, Chris, Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org
Chris,

Thanks for making that available. I have been poking at it, and before I say much else I have a question:

Did you install from RPMs or did you build from source?

Thanks,

Nathan

Nathan Potter

Mar 9, 2017, 6:14:11 PM
to Calloway, Chris, Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org
Chris,

Also what’s the OS name and version on the system where you have installed Hyrax, I think it’s: http://hyrax.hydroshare.org

Thanks,

N

Nathan Potter

Mar 9, 2017, 7:55:26 PM
to Calloway, Chris, Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org
Chris,

I have another question, which makes three, so to summarize:


3) In the /etc/olfs/olfs.xml file what is the value of the BES timeOut element: <timeOut>???</timeOut>

2) Also what’s the OS name and version on the system where you have installed Hyrax?

1) Did you install from RPMs or did you build from source?



Thanks,

Nathan

Calloway, Chris

Mar 10, 2017, 8:16:43 AM
to Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org
Nathan,

Thank you so much for your reply. I’m going to bundle up the answers to all of your questions from the three replies you sent yesterday into this one reply.

> Did you install from RPMs or did you build from source?

From RPMs. In a previous email:

bes-3.16.0-1.static.el6.x86_64.rpm
libdap-3.16.0-1.el6.x86_64.rpm
olfs-1.14.1-webapp.tgz
apache-tomcat-7.0.67.tar.gz
jre-1.7.0-openjdk.x86_64

We installed from RPM on your recommendation after spending weeks trying to get the thing to build from source with your help.

> Also what’s the OS name and version on the system where you have installed Hyrax, I think it’s: http://hyrax.hydroshare.org

It’s Centos 7, I’m pretty sure. I’m going to have to get back to you on the exact minor and revision number, as the authentication system seems to be down this morning.

Hyrax.hydroshare.org is also simply an alias for hyrax01.renci.org.

> In the /etc/olfs/olfs.xml file what is the value of the BES timeOut element: <timeOut>???</timeOut>

I’ll have to get back to you on that one as well. It appears I am locked out.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Calloway, Chris

Mar 10, 2017, 8:50:08 AM
to Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org
OK, now I have all the answers. The /tmp directory was full of files owned by BES, which was preventing my login. Quick drive resize by our VMWare group and fixed.

> 3) In the /etc/olfs/olfs.xml file what is the value of the BES timeOut element: <timeOut>???</timeOut>

Doesn’t seem to exist:

[cbc@hyrax01 ~]$ ls /etc/olfs/olfs.xml
ls: cannot access /etc/olfs/olfs.xml: No such file or directory
[cbc@hyrax01 ~]$ ls /etc/olfs
ls: cannot access /etc/olfs: No such file or directory
[cbc@hyrax01 ~]$

As we installed from RPM, I don’t recall any instructions to configure OLFS.

> 2) Also what’s the OS name and version on the system where you have installed Hyrax?

I was wrong, it is Centos 6. I seem to recall that being a requirement back from when we were trying to build from source:

[cbc@hyrax01 ~]$ cat /etc/redhat-release
CentOS release 6.7 (Final)
[cbc@hyrax01 ~]$

Hyrax.hydroshare.org is also simply an alias for hyrax01.renci.org. The idea was to eventually have testbed, failover, and load balancing. But there is no hyrax02, etc., yet.

> 1) Did you install from RPMs or did you build from source?
From RPMs. In a previous email:

bes-3.16.0-1.static.el6.x86_64.rpm
libdap-3.16.0-1.el6.x86_64.rpm
olfs-1.14.1-webapp.tgz
apache-tomcat-7.0.67.tar.gz
jre-1.7.0-openjdk.x86_64

We installed from RPM on your recommendation after spending weeks trying to get the thing to build from source with your help.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Nathan Potter

Mar 10, 2017, 8:57:26 AM
to Calloway, Chris, Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org


Chris,

Thanks for the answers; they are just what I needed. Sorry I missed the RPMs in the previous post.

Since the olfs.xml file is not in the expected location, the system is either getting the location from the Tomcat user’s shell environment (OLFS_CONFIG_DIR) or using the default configuration, /usr/share/tomcat/webapps/opendap/WEB-INF/conf/olfs.xml.

My guess is the latter.
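
For anyone following along, the lookup being described might be sketched roughly as below. The order of checks (environment variable first, then /etc/olfs, then the copy bundled in the webapp) is an assumption rather than something confirmed in this thread, and `resolve_olfs_config_dir` is a hypothetical helper, not part of the OLFS.

```python
import os

def resolve_olfs_config_dir(environ, webapp_conf_dir, is_dir=os.path.isdir):
    """Return the directory the OLFS would presumably read olfs.xml from.

    Hypothetical helper; the order of the checks is an assumption.
    """
    # 1. A location supplied through the Tomcat user's environment.
    env_dir = environ.get("OLFS_CONFIG_DIR")
    if env_dir and is_dir(env_dir):
        return env_dir
    # 2. The localized configuration directory.
    if is_dir("/etc/olfs"):
        return "/etc/olfs"
    # 3. Fall back to the default configuration shipped inside the webapp.
    return webapp_conf_dir

print(resolve_olfs_config_dir(
    os.environ, "/usr/share/tomcat/webapps/opendap/WEB-INF/conf"))
```

On a host where neither OLFS_CONFIG_DIR nor /etc/olfs exists, this falls through to the webapp default, which is the case guessed at above.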

I’m thinking the problem is fixed in the current release, Hyrax-1.13.3, but before I can say that with confidence I want to do more testing against that file.

Thanks for your patience,

Nathan

Calloway, Chris

Mar 10, 2017, 9:52:59 AM
to Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org
Nathan,

Thanks for your persistence. This may help shed light:

[cbc@hyrax01 ~]$ echo $OLFS_CONFIG_DIR

[cbc@hyrax01 ~]$ ls /usr/share/tomcat
ls: cannot access /usr/share/tomcat: No such file or directory
[cbc@hyrax01 ~]$ ls /usr/share/tomcat7
ls: cannot access /usr/share/tomcat7: No such file or directory
[cbc@hyrax01 ~]$ ls /usr/tomcat7/webapps/opendap/WEB-INF/
classes/ lib/ logback-test.xml logback.xml urlrewrite.xml web.xml
[cbc@hyrax01 ~]$ sudo find /usr/tomcat7 -iname conf
[sudo] password for cbc:
/usr/tomcat7/conf
[cbc@hyrax01 ~]$ ls /usr/tomcat7/conf
Catalina catalina.properties context.xml.bak logging.properties tomcat-users.xml
catalina.policy context.xml context.xml.bak2 server.xml web.xml
[cbc@hyrax01 ~]$

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Nathan Potter

Mar 10, 2017, 10:18:15 AM
to Calloway, Chris, Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org, James Gallagher
Chris,

I think this is a bug that has been fixed in the current release, Hyrax-1.13.3.

I put the file on our test server here:

http://test.opendap.org/opendap/conus/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.nc4

And I have hit it with the poison request:

curl -s "http://test.opendap.org/opendap/conus/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.nc4?latitude\[0:1:194\],longitude\[0:1:461\],time\[0:1:365\],crs,mean\[0:1:365\]\[0:1:194\]\[0:1:461\]"

Thousands of times, using tens of simultaneous requests.

And so far the server appears to be returning reasonable stuff.
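
The backslashes in the curl request are presumably there to keep curl from treating the square brackets as URL globbing ranges. An alternative is to percent-encode the constraint expression, which is the form browsers typically send. A minimal sketch using Python's standard library; `build_subset_url` is an illustrative name only.

```python
from urllib.parse import quote

def build_subset_url(dataset_url, constraint):
    """Append a DAP2 constraint expression, percent-encoding '[' and ']'."""
    # ':' and ',' are left readable; the brackets become %5B and %5D.
    return dataset_url + "?" + quote(constraint, safe=":,")

dataset = "http://test.opendap.org/opendap/conus/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.nc4"
constraint = ("latitude[0:1:194],longitude[0:1:461],time[0:1:365],"
              "crs,mean[0:1:365][0:1:194][0:1:461]")

print(build_subset_url(dataset, constraint))
```

The resulting URL contains no literal brackets, so it can be passed to a client without shell or globbing escapes.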

I’m thinking you might see if our test server works for you.
If it does would you consider upgrading your Hyrax instance to the current release?
I’m not going to promise that it’s perfect, but I think it will fix this problem.



Sincerely,

Nathan




Your system has Hyrax-1.12.2 and is 4 releases back.

Calloway, Chris

Mar 10, 2017, 10:33:04 AM
to Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org, James Gallagher
Nathan,

> I’m thinking you might see if our test server works for you.

It does indeed! It even renders the entire dataset as nc4 when requested with no constraints (and much faster than a subset).

> If it does would you consider upgrading your Hyrax instance to the current release?

Considering that the instance is currently installed from RPM, I would imagine your suggestion is to install from RPM again? Is it possible to do that upgrade to the current RPM installation in place? Or would I need to start a new server VM afresh? This was my concern with installing the RPM originally, that I wouldn’t be able to simply git pull to receive updates and bug fixes.

James Gallagher

Mar 10, 2017, 10:37:41 AM
to Calloway, Chris, Nathan Potter, renci_rayi.con, sup...@opendap.org, Hong Yi


On March 10, 2017 at 10:33:03 AM, Calloway, Chris (c...@unc.edu) wrote:

> Nathan,
>
> > I’m thinking you might see if our test server works for you.
>
> It does indeed! It even renders the entire dataset as nc4 when requested with no constraints (and much faster than a subset).
>
> > If it does would you consider upgrading your Hyrax instance to the current release?
>
> Considering that the instance is currently installed from RPM, I would imagine your suggestion is to install from RPM again? Is it possible to do that upgrade to the current RPM installation in place? Or would I need to start a new server VM afresh? This was my concern with installing the RPM originally, that I wouldn’t be able to simply git pull to receive updates and bug fixes.

You should be able to ‘yum upgrade libdap*.rpm bes*.rpm’ (where libdap*.rpm is the name of the Hyrax 1.13.x rpm for libdap, …) just fine.

You will almost certainly need to put both RPMs on the same command line.

My apologies for chiming in…

James






-- 
James Gallagher
jgall...@opendap.org

Calloway, Chris

Mar 10, 2017, 11:22:26 AM
to James Gallagher, Nathan Potter, renci_rayi.con, sup...@opendap.org, Hong Yi

Thanks, James.

 

This is what I did:

[cbc@hyrax01 ~]$ wget https://www.opendap.org/pub/binary/hyrax-1.13.3/centos6.6/libdap-3.18.3-1.el6.x86_64.rpm
[cbc@hyrax01 ~]$ wget https://www.opendap.org/pub/binary/hyrax-1.13.3/centos6.6/bes-3.17.4-1.static.el6.x86_64.rpm
[cbc@hyrax01 ~]$ sudo besctl stop
Shutting down the BES daemon
There are several different BES processes running: 27154
Successfully shut down the BES
[cbc@hyrax01 ~]$ sudo besctl kill
[cbc@hyrax01 ~]$ sudo yum upgrade libdap-3.18.3-1.el6.x86_64.rpm bes-3.17.4-1.static.el6.x86_64.rpm
Complete!
[cbc@hyrax01 ~]$ sudo besctl start
Starting the BES
OK: Successfully started the BES
PID: 28503 UID: 0
[cbc@hyrax01 ~]$

 

In a browser I went to:

And back to square one with Internal Server Error.

 

In /var/log/bes/bes.log:

[EST Fri Mar 10 11:12:23 2017 id: 28828] 28828 from ip 127.0.0.1, port 46056 request received
[EST Fri Mar 10 11:12:23 2017 id: 28828] 28828 from ip 127.0.0.1, port 46056 [set context errors to xml;] received
[EST Fri Mar 10 11:12:23 2017 id: 28828] 28828 from ip 127.0.0.1, port 46056 [show catalog for /ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc;] received
[EST Fri Mar 10 11:12:30 2017 id: 28828] 28828 from ip 127.0.0.1, port 46056 request received
[EST Fri Mar 10 11:12:30 2017 id: 28828] 28828 from ip 127.0.0.1, port 46056 [set context xdap_accept to 3.2;] received
[EST Fri Mar 10 11:12:30 2017 id: 28828] 28828 from ip 127.0.0.1, port 46056 [set context dap_explicit_containers to no;] received
[EST Fri Mar 10 11:12:30 2017 id: 28828] 28828 from ip 127.0.0.1, port 46056 [set context errors to dap2;] received
[EST Fri Mar 10 11:12:30 2017 id: 28828] 28828 from ip 127.0.0.1, port 46056 [set context max_response_size to 0;] received
[EST Fri Mar 10 11:12:30 2017 id: 28828] 28828 from ip 127.0.0.1, port 46056 [set container in catalog values catalogContainer,/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc;] received
[EST Fri Mar 10 11:12:30 2017 id: 28828] 28828 from ip 127.0.0.1, port 46056 [define d1 in default as catalogContainer with catalogContainer.constraint="latitude[0:1:194],longitude[0:1:461],time[0:1:365],crs,mean[0:1:365][0:1:194][0:1:461]";] received
[EST Fri Mar 10 11:12:30 2017 id: 28828] 28828 from ip 127.0.0.1, port 46056 [get dods for d1 return as netcdf-4;] received
[EST Fri Mar 10 11:13:52 2017 id: 28828] Child listener caught SIGPIPE (master listener PID: 28505). Child listener Exiting.

Calloway, Chris

Mar 10, 2017, 11:37:22 AM
to James Gallagher, Nathan Potter, renci_rayi.con, sup...@opendap.org, Hong Yi

However, after upgrading:

/usr/tomcat7/content/opendap/olfs.xml

is now there.

Looking in that file, the BES timeout is commented out:

<!-- Timeout (in seconds) for this BES, defaults to 300 seconds-->
<!-- <timeOut>300</timeOut> -->

Does this mean there is no timeout, or should I uncomment this, or should I change it to something longer? The lag between the request producing the error and the actual error was 89 seconds in the example below.
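
The 89-second figure comes from the first and last bes.log timestamps for the request. A small sketch of that arithmetic follows; `bes_log_time` is a hypothetical helper, and the time zone label in the log line is simply ignored.

```python
import re
from datetime import datetime

# Matches the leading "[EST Fri Mar 10 11:12:23 2017 id: 28828]" stamp.
STAMP = re.compile(r"^\[\w+ (\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}) id: \d+\]")

def bes_log_time(line):
    """Parse the leading timestamp of a bes.log line (time zone label ignored)."""
    match = STAMP.match(line)
    if match is None:
        raise ValueError("no BES timestamp found: " + line)
    return datetime.strptime(match.group(1), "%a %b %d %H:%M:%S %Y")

first = "[EST Fri Mar 10 11:12:23 2017 id: 28828] 28828 from ip 127.0.0.1, port 46056 request received"
last = ("[EST Fri Mar 10 11:13:52 2017 id: 28828] Child listener caught SIGPIPE "
        "(master listener PID: 28505). Child listener Exiting.")

elapsed = (bes_log_time(last) - bes_log_time(first)).total_seconds()
print(elapsed)  # seconds between the first request line and the SIGPIPE line
```

If the 300-second default mentioned in the comment is in effect, a failure this early would not obviously be a timeout.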

Nathan Potter

Mar 10, 2017, 11:47:16 AM
to Calloway, Chris, Nathan Potter, James Gallagher, renci_rayi.con, sup...@opendap.org, Hong Yi

Chris,

1) OLFS Configuration

This directory, /usr/tomcat7/content/opendap, is probably not in use and is left over from your previous installation. It should be moved out of the way (and, when you’re convinced it’s not needed, deleted). (Did you upgrade the OLFS too?)

The OLFS configuration location is determined as described here: http://docs.opendap.org/index.php/Hyrax_-_OLFS_Configuration#OLFS_Configuration_Location

If you need or want a localized configuration, create the directory /etc/olfs and make it owned by the Tomcat user. Restart Tomcat and the OLFS will move the default configuration into that spot. Changes made there will not be overwritten when new versions are installed.

2) Your hydroshare.org upgrade

I went here: http://hyrax.hydroshare.org/opendap/version

And it looks to me like that system is still running Hyrax-1.12.2

Maybe a restart?



Thanks,

Nathan

Nathan Potter

Mar 10, 2017, 12:15:21 PM
to Calloway, Chris, Nathan Potter, James Gallagher, renci_rayi.con, sup...@opendap.org, Hong Yi

> On Mar 10, 2017, at 8:47 AM, Nathan Potter <n...@opendap.org> wrote:
>
>
> Chris,
>
> 1) OLFS Configuration
>
> This directory: /usr/tomcat7/content/opendap is probably not in-use and left over from your previous installation and should be moved out of the way (and when you’re convinced it’s not needed , deleted) (Did you upgrade the OLFS too?)


By which I meant this:

https://www.opendap.org/pub/olfs/olfs-1.16.2-webapp.tgz
https://www.opendap.org/pub/olfs/olfs-1.16.2-webapp.tgz.sig

Calloway, Chris

Mar 10, 2017, 2:53:17 PM
to Nathan Potter, James Gallagher, renci_rayi.con, sup...@opendap.org, Hong Yi
> (Did you upgrade the OLFS too?)

Not until just now. After that:

http://hyrax.hydroshare.org/opendap/version

correctly reports:

<Hyrax version="1.13.3"/>
<OLFS version="1.16.2"/>

I had to move the old webapps/opendap out of the way, even after deploying the new war file, in order to get it to take effect. I think that’s something that can also be done through the Tomcat admin interface, although I keep that turned off.

However, still an Internal Server Error on the offending request.

> This directory: /usr/tomcat7/content/opendap is probably not in-use and left over from your previous installation and should be moved out of the way (and when you’re convinced it’s not needed , deleted)

OK, saved.

> If you want to need/want to have a localized configuration create the directory /etc/olfs and make it owned by the Tomcat user. Restart Tomcat and the OLFS will move the default configuration into that spot. Changes made there will not be overwritten when new versions are installed.

Done. And it copied (not moved) all of /usr/tomcat7/webapps/opendap/WEB-INF/conf/ to /etc/olfs including olfs.xml in which the BES timeout is commented out.
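
One way to double-check what an olfs.xml actually sets is to parse it: XML parsers skip commented-out elements, so a commented `<timeOut>` reads as absent. The sketch below assumes the 300-second default stated in the file's own comment, and it uses stripped-down stand-in fragments rather than the real olfs.xml layout.

```python
import xml.etree.ElementTree as ET

DEFAULT_TIMEOUT = 300  # seconds, taken from the comment in olfs.xml

def effective_timeout(xml_text):
    """Return the first <timeOut> value found, or the assumed default."""
    root = ET.fromstring(xml_text)
    for element in root.iter("timeOut"):
        return int(element.text.strip())
    return DEFAULT_TIMEOUT

# Stand-in fragments; the real olfs.xml has more structure around this.
commented = """<BES>
    <!-- Timeout (in seconds) for this BES, defaults to 300 seconds-->
    <!-- <timeOut>300</timeOut> -->
</BES>"""
explicit = "<BES><timeOut>600</timeOut></BES>"

print(effective_timeout(commented))
print(effective_timeout(explicit))
```

A default of 300 would be consistent with the `set context bes_timeout to 300` entries in the bes.log excerpt from this same test.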

Again, internal server error on the offending request.

In /var/log/bes/bes.log:

[EST Fri Mar 10 14:28:42 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 request received
[EST Fri Mar 10 14:28:42 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 [set context bes_timeout to 300;] received
[EST Fri Mar 10 14:28:42 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 [set context errors to xml;] received
[EST Fri Mar 10 14:28:42 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 [show catalog for /ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc;] received
[EST Fri Mar 10 14:28:44 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 request received
[EST Fri Mar 10 14:28:44 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 [set context bes_timeout to 300;] received
[EST Fri Mar 10 14:28:44 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 [set context errors to xml;] received
[EST Fri Mar 10 14:28:44 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 [show version;] received
[EST Fri Mar 10 14:28:44 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 request received
[EST Fri Mar 10 14:28:44 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 [set context bes_timeout to 300;] received
[EST Fri Mar 10 14:28:44 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 [set context cf_history_entry to 2017-03-10 19:28:44 GMT Hyrax-1.13.3 http://hyrax.hydroshare.org/opendap/hyrax/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.nc4?latitude%5B0:1:194%5D,longitude%5B0:1:461%5D,time%5B0:1:365%5D,crs,mean%5B0:1:365%5D%5B0:1:194%5D%5B0:1:461%5D;] received
[EST Fri Mar 10 14:28:44 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 [set context xdap_accept to 3.2;] received
[EST Fri Mar 10 14:28:44 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 [set context dap_explicit_containers to no;] received
[EST Fri Mar 10 14:28:44 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 [set context errors to xml;] received
[EST Fri Mar 10 14:28:44 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 [set context max_response_size to 0;] received
[EST Fri Mar 10 14:28:44 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 [set container in catalog values catalogContainer,/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc;] received
[EST Fri Mar 10 14:28:44 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 [define d1 in default as catalogContainer with catalogContainer.constraint="latitude%5B0:1:194%5D,longitude%5B0:1:461%5D,time%5B0:1:365%5D,crs,mean%5B0:1:365%5D%5B0:1:194%5D%5B0:1:461%5D";] received
[EST Fri Mar 10 14:28:44 2017 id: 31709] 31709 from ip 127.0.0.1, port 50072 [get dods for d1 return as netcdf-4;] received
[EST Fri Mar 10 14:29:52 2017 id: 31709] Child listener caught SIGPIPE (master listener PID: 31616). Child listener Exiting.

In /etc/olfs/logs/HyraxErrors.log:

2017-03-10T14:29:52.768 -0500 [152.54.8.87] [Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:51.0) Gecko/20100101 Firefox/51.0] [thread:ajp-bio-8009-exec-1] [173235][2] [HTTP-GET] [/opendap/hyrax/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.nc4] [latitude%5B0:1:194%5D,longitude%5B0:1:461%5D,time%5B0:1:365%5D,crs,mean%5B0:1:365%5D%5B0:1:194%5D%5B0:1:461%5D] ERROR - opendap.ppt.NewPPTClient - closeConnection(): Unable to close socket, continuing. Base message: 'Socket is closed'
2017-03-10T14:29:52.769 -0500 [152.54.8.87] [Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:51.0) Gecko/20100101 Firefox/51.0] [thread:ajp-bio-8009-exec-1] [173236][2] [HTTP-GET] [/opendap/hyrax/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.nc4] [latitude%5B0:1:194%5D,longitude%5B0:1:461%5D,time%5B0:1:365%5D,crs,mean%5B0:1:365%5D%5B0:1:194%5D%5B0:1:461%5D] ERROR - opendap.bes.BES - besGetTransaction() - Problem encountered with BES connection. Message: 'java.net.SocketException: Broken pipe' OPeNDAPClient executed 2 prior commands.
2017-03-10T14:29:52.769 -0500 [152.54.8.87] [Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:51.0) Gecko/20100101 Firefox/51.0] [thread:ajp-bio-8009-exec-1] [173236][2] [HTTP-GET] [/opendap/hyrax/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.nc4] [latitude%5B0:1:194%5D,longitude%5B0:1:461%5D,time%5B0:1:365%5D,crs,mean%5B0:1:365%5D%5B0:1:194%5D%5B0:1:461%5D] ERROR - opendap.coreServlet.OPeNDAPException - anyExceptionHandler(): opendap.ppt.PPTException: Problem encountered with BES connection. Message: 'java.net.SocketException: Broken pipe' OPeNDAPClient executed 2 prior commands.
2017-03-10T14:29:52.771 -0500 [152.54.8.87] [Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:51.0) Gecko/20100101 Firefox/51.0] [thread:ajp-bio-8009-exec-1] [173238][2] [HTTP-GET] [/opendap/hyrax/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.nc4] [latitude%5B0:1:194%5D,longitude%5B0:1:461%5D,time%5B0:1:365%5D,crs,mean%5B0:1:365%5D%5B0:1:194%5D%5B0:1:461%5D] ERROR - opendap.coreServlet.OPeNDAPException - Bad things happened! Cannot process incoming exception! New Exception thrown: org.apache.catalina.connector.ClientAbortException: java.net.SocketException: Broken pipe

In /etc/olfs/logs/HyraxAccess.log:

[152.54.8.87] [Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:51.0) Gecko/20100101 Firefox/51.0] [2AB16B55A6E0F3271774D71EF27C8B1C] [-] [2017-03-10T14:28:44.243 -0500] [ 1307 ms] [200] [ 1] [LAST-MOD] [/opendap/hyrax/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.nc4] [latitude%5B0:1:194%5D,longitude%5B0:1:461%5D,time%5B0:1:365%5D,crs,mean%5B0:1:365%5D%5B0:1:194%5D%5B0:1:461%5D]
[152.54.8.87] [Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:51.0) Gecko/20100101 Firefox/51.0] [2AB16B55A6E0F3271774D71EF27C8B1C] [-] [2017-03-10T14:29:52.771 -0500] [68526 ms] [-1] [ 2] [HTTP-GET] [/opendap/hyrax/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.nc4] [latitude%5B0:1:194%5D,longitude%5B0:1:461%5D,time%5B0:1:365%5D,crs,mean%5B0:1:365%5D%5B0:1:194%5D%5B0:1:461%5D]
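
The two access log entries can be compared by pulling out the elapsed time and the status field. This is a sketch under an assumed layout, in which the bracketed `ms` value is the request duration and the following bracketed number is the response status; `elapsed_and_status` is an illustrative name.

```python
import re

# Assumed layout: "[<elapsed> ms] [<status>]" follows the timestamp field.
FIELDS = re.compile(r"\[\s*(\d+) ms\]\s*\[\s*(-?\d+)\]")

def elapsed_and_status(line):
    """Return (elapsed_ms, status) from a HyraxAccess.log line."""
    match = FIELDS.search(line)
    if match is None:
        raise ValueError("unrecognized access log line")
    return int(match.group(1)), int(match.group(2))

# Excerpts of the two entries above (client, session, and URL fields omitted).
ok_line = "[2017-03-10T14:28:44.243 -0500] [ 1307 ms] [200] [ 1] [LAST-MOD]"
bad_line = "[2017-03-10T14:29:52.771 -0500] [68526 ms] [-1] [ 2] [HTTP-GET]"

print(elapsed_and_status(ok_line))
print(elapsed_and_status(bad_line))
```

Read this way, the failing GET ran for about 68.5 seconds before ending with a status of -1.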

> Maybe a restart?

Yep. All day my cycle has been:

sudo besctl stop
sudo besctl kill
sudo service tomcat7 stop
[… make some changes …]
sudo besctl start
sudo service tomcat7 start

I did notice that every time there is a bad response now, a file such as the following is created in /tmp:

-rw------- 1 bes bes 131914424 Mar 10 14:29 ncSDwlgz

That file size is exactly the file size of the nc4 returned by your test server if the offending subset is requested. That seems to tell me maybe there is a problem in the pipe between OLFS and BES, as seems to be indicated in the error logs above?
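
The size comparison can be repeated for any leftover files with a short script. This is a sketch only: `files_with_size` is a hypothetical helper, and reading /tmp on a real system may need sufficient privileges, since the files are owned by the bes user.

```python
import os

def files_with_size(directory, size_in_bytes):
    """List regular files directly under `directory` whose size matches exactly."""
    matches = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_size == size_in_bytes:
                matches.append(entry.path)
    return sorted(matches)

# 131914424 is the size of the leftover file shown above.
# Example call (commented out because it depends on the local /tmp):
# print(files_with_size("/tmp", 131914424))
```

Every path returned is a candidate for a complete response that was written out but never delivered.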
--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Nathan Potter

Mar 10, 2017, 3:29:26 PM
to Calloway, Chris, Nathan Potter, James Gallagher, renci_rayi.con, sup...@opendap.org, Hong Yi
Hi Chris,

There is definitely an issue, and it’s in the BES. These errors:

> ERROR - opendap.bes.BES - besGetTransaction() - Problem encountered with BES connection. Message: 'java.net.SocketException: Broken pipe' OPeNDAPClient executed 2 prior commands.

Usually indicate that the beslistener process to which the OLFS is connected has crashed. And the BES is not very good about logging its own issues.

I need to touch base with James. In my mind the next step, if you are up for it, is to turn on debug output in the BES. But I think doing so means you’ll have to install more RPMs in order to get all the debuggity symbols and the like. If you are willing to help me probe this, let me know. In the meantime I’ll find out some things.

Thanks,

Nathan

Calloway, Chris

Mar 10, 2017, 3:52:46 PM
to Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
> If you are willing to help me probe this let me know. In the mean time I’ll find out some things.

Oh yes, willing. I have to fix this somehow. I simply appreciate that you are willing.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Nathan Potter

Mar 10, 2017, 4:24:23 PM
to Calloway, Chris, Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
Chris,

OK, here we go. Look here:

https://www.opendap.org/pub/binary/hyrax-1.13.3/centos6.6/

Install all of those RPMs that you have not yet installed. The debuginfo packages might be enough, so if you wish to do the absolute minimum install just those.

- Stop Tomcat.
- Restart the BES like this: besctl restart -d "/etc/olfs/logs/bes_debug.log,bes,fonc"
- Start Tomcat.
- Make the request that makes the trouble.
- Send me the file /etc/olfs/logs/bes_debug.log

And let’s see what we find out.

If we don’t find much you can run it again but turn up the debuggity all the way:

besctl restart -d "/etc/olfs/logs/bes_debug.log,all"
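
The restart sequence above could be scripted roughly as follows. This is only a sketch: the `run` wrapper and the `bes_debug_restart` function are hypothetical names, the `service tomcat7 stop` call is assumed by analogy with the `service tomcat7 start` command used earlier in this thread, and by default nothing is executed (commands are only printed).

```shell
#!/bin/sh
# Dry-run wrapper: print each command instead of executing it unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper that follows the listed steps in order.
# $1: debug log path, $2: comma-separated debug contexts (bes,fonc or all)
bes_debug_restart() {
    log=$1
    contexts=$2
    run sudo service tomcat7 stop      # assumed counterpart of "service tomcat7 start"
    run sudo besctl restart -d "$log,$contexts"
    run sudo service tomcat7 start
}

# Print the commands for the minimal debug setting.
bes_debug_restart /etc/olfs/logs/bes_debug.log bes,fonc
```

Setting `DRY_RUN=0` would actually run the commands. Passing the log path and contexts as one shell variable also sidesteps any quoting problems from copying the command out of an email.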



Thanks,

Nathan

Nathan Potter

Mar 10, 2017, 4:34:19 PM3/10/17
to Calloway, Chris, Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org

> On Mar 10, 2017, at 1:24 PM, Nathan Potter <n...@opendap.org> wrote:
>
> If we don’t find much you can run it again but turn up the debuggity all the way:
>
> besctl restart -d “/etc/olfs/logs/bes_debug.log,all”

And if we do turn this on you’ll want to run the test(s) and then turn the debugging back down when you are finished:

besctl restart

With the debug “all” switch turned on, the size of the debug log files could become a system resource issue.

Calloway, Chris

Mar 10, 2017, 5:15:21 PM3/10/17
to Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org
Adding Ray, Hong, and support back in the cc list.

Same thing when I put the BES user in the Tomcat group. I also realized I had copied and pasted your besctl restart for debug mode out of your email. The paste had turned the quote marks into smart quotes. So I fixed that as well and got even more errors (both unable to open debug file and beslistener not running).

[cbc@hyrax01 ~]$ sudo besctl restart -d "/etc/olfs/logs/bes_debug.log,bes,fonc"
Shutting down the BES daemon
The BES daemon is not currently running
Starting the BES
Caught unhandled exception:
Unable to open the debug file: /etc/olfs/logs/bes_debug.log
The beslistener status is not 'BESLISTENER_RUNNING' (it is '0') the master pid was not changed.
besdaemon: server cannot mount at first try (core dump). Please correct problems on the process manager /usr/bin/beslistener
BES PID file exists but process not running, cleaning up
FAILED: The BES daemon did not appear to start
[cbc@hyrax01 ~]$ [cbc@hyrax01 ~]$ cat /etc/group | grep tomcat7
tomcat7:x:500:bes
[cbc@hyrax01 ~]$ ls -l /etc/olfs
total 60
-rw-rw-r-- 1 tomcat7 tomcat7 4701 Mar 10 14:19 catalog.xml
-rw-rw-r-- 1 tomcat7 tomcat7 1570 Mar 10 14:19 idFilter.xml
drwxrwxr-x 2 tomcat7 tomcat7 4096 Mar 10 14:19 logs
-rw-rw-r-- 1 tomcat7 tomcat7 1761 Mar 10 14:19 memberships.xml
-rw-rw-r-- 1 tomcat7 tomcat7 8351 Mar 10 14:19 olfs.xml
-rw-rw-r-- 1 tomcat7 tomcat7 3706 Mar 10 14:19 PEPFilter.xml
drwxrwxr-x 2 tomcat7 tomcat7 4096 Mar 10 14:19 testDocs
-rw-rw-r-- 1 tomcat7 tomcat7 3610 Mar 10 14:19 TomcatSecurityExample.xml
-rw-rw-r-- 1 tomcat7 tomcat7 1228 Mar 10 14:19 TomcatSecurity.xml
-rw-rw-r-- 1 tomcat7 tomcat7 2857 Mar 10 14:19 viewers.xml
-rw-rw-r-- 1 tomcat7 tomcat7 2723 Mar 10 14:19 wcs.xml
-rw-rw-r-- 1 tomcat7 tomcat7 1654 Mar 10 14:19 webstart.xml
[cbc@hyrax01 ~]$
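
A pasted argument can be checked for smart quotes before it is handed to besctl. The following is a minimal sketch, not part of Hyrax: `has_smart_quotes` and `straighten_quotes` are hypothetical helper names, and the script assumes it is saved and run as UTF-8.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the argument contains typographic quotes.
has_smart_quotes() {
    case "$1" in
        *“*|*”*|*‘*|*’*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print the argument with typographic quotes
# replaced by their plain ASCII equivalents.
straighten_quotes() {
    printf '%s\n' "$1" | sed -e 's/“/"/g' -e 's/”/"/g' -e "s/‘/'/g" -e "s/’/'/g"
}

# The argument as it appears after pasting from the email.
arg='“/etc/olfs/logs/bes_debug.log,bes,fonc”'
if has_smart_quotes "$arg"; then
    echo "smart quotes found; retype as: $(straighten_quotes "$arg")"
fi
```

The shell treats typographic quotes as ordinary characters, so they become part of the file name. That is why the error message in the earlier attempt shows the path with a leading “ character.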

Hey, it’s quitting time here and I need to go to the gym and take care of myself. Shall we resume next week? I have restarted the BES in normal mode for the weekend. However, adding the debug and devel rpms seems to have left Hyrax in a weird state. That is, I can see the html form:

http://hyrax.hydroshare.org/opendap/hyrax/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.html

But all other resources respond either with a 404:

http://hyrax.hydroshare.org/opendap/hyrax/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.nc4?latitude[0:1:194],longitude[0:1:461],time[0:1:365],crs,mean[0:1:365][0:1:194][0:1:461];

or as an empty response:

http://hyrax.hydroshare.org/opendap/

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


On 3/10/17, 4:42 PM, "Calloway, Chris" <c...@unc.edu> wrote:

[cbc@hyrax01 ~]$ sudo besctl restart -d “/etc/olfs/logs/bes_debug.log,bes,fonc”
Shutting down the BES daemon
The BES daemon is not currently running
Starting the BES
Caught BES Error while processing the daemon's options: Unable to open the debug file: “/etc/olfs/logs/bes_debug.log
FAILED: The BES daemon did not appear to start
[cbc@hyrax01 ~]$

/etc/olfs is owned by the Tomcat user as requested in your earlier email today.

I will try putting the BES user in the Tomcat group. Sound OK?

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Nathan Potter

Mar 10, 2017, 6:17:44 PM3/10/17
to Calloway, Chris, Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org
Hi Chris,

The start up issue is about ownership/permissions on the log file. Probably it’s there now and is owned by root. Regardless, "touch" the file and set its ownership to the BES user, and it should work.

(Sorry about the bumps here)

Yes I see the server is not working correctly. Did you restart tomcat after restarting the BES? That might help…

More Monday.

Thanks,


Nathan

Calloway, Chris

Mar 13, 2017, 11:55:58 AM3/13/17
to Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org
Nathan,

Yes, I restart Tomcat every time.

bes_debug.log was in /etc/olfs/logs owned by root. I chowned it to bes:tomcat7 and chmoded it to 664. I made sure again that bes is in the tomcat7 group, and that the mode on both /etc/olfs and /etc/olfs/logs (both owned by tomcat:tomcat) is 774.

Bes now restarts in debug mode. But the result is the same (this result started occurring after installing the debug and devel rpms):

http://hyrax.hydroshare.org/opendap is blank.

http://hyrax.hydroshare.org/opendap/hyrax/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.nc4?latitude%5B0:1:194%5D,longitude%5B0:1:461%5D,time%5B0:1:365%5D,crs,mean%5B0:1:365%5D%5B0:1:194%5D%5B0:1:461%5D shows a Hyrax 404 page.

However, you can clearly see that Tomcat is running at http://hyrax.hydroshare.org/opendap/version

The bes_debug.log file after the problem request for the large resource is at http://people.renci.org/~cbc/bes_debug.log

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Calloway, Chris

Mar 13, 2017, 12:47:38 PM3/13/17
to Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org
Hold off on looking at this, Nathan. I just found out there is a problem with the BES.Catalog.catalog.RootDirectory mount.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Nathan Potter

Mar 13, 2017, 12:50:42 PM3/13/17
to Calloway, Chris, Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org
Chris,

That makes sense given this:

http://hyrax.hydroshare.org/opendap/contents.html

Calloway, Chris

Mar 13, 2017, 1:59:25 PM3/13/17
to Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org
OK, all recovered now. We had a problem after a reboot late in the day on Friday. The BES.Catalog.catalog.RootDirectory mount is now functional and the BES is in debug mode.

I made the offending request and reuploaded the log file:

It appears in the log as though BES is writing the file for the response without a problem. And there is a file in /tmp created at the time of the request which is exactly the correct length of what the requested nc4 should be:

-rw------- 1 bes bes 131914424 Mar 13 13:49 ncd9vWfK

That was observed as well on Friday even before debug was installed. So I’m still thinking an OLFS to BES communication issue. I’ll rerun the test with all debugging turned on and let you know when that log is available.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Calloway, Chris

Mar 13, 2017, 2:26:27 PM3/13/17
to Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org
Now I just encountered something super weird. I can start besctl in debug mode with just bes and fonc logged. But if I try to debug all, it cannot open the log file:

[cbc@hyrax01 ~]$ sudo besctl restart -d “/etc/olfs/logs/bes_debug.log,all”
Shutting down the BES daemon
The BES daemon is not currently running
Starting the BES
Caught BES Error while processing the daemon's options: Unable to open the debug file: “/etc/olfs/logs/bes_debug.log
FAILED: The BES daemon did not appear to start
[cbc@hyrax01 ~]$ ls -l /etc/olfs/logs/
total 1780
-rw-rw-r-- 1 tomcat7 tomcat7 0 Mar 10 14:19 AnonymousAccess.log
-rw-rw-r-- 1 tomcat7 tomcat7 0 Mar 10 14:19 BESCommands.log
-rw-rw-r-- 1 bes tomcat7 0 Mar 13 14:12 bes_debug.log
-rw-rw-r-- 1 tomcat7 tomcat7 107830 Mar 10 23:28 HyraxAccess.2017-03-10.log
-rw-rw-r-- 1 tomcat7 tomcat7 351749 Mar 11 23:28 HyraxAccess.2017-03-11.log
-rw-rw-r-- 1 tomcat7 tomcat7 382364 Mar 12 23:12 HyraxAccess.2017-03-12.log
-rw-rw-r-- 1 tomcat7 tomcat7 125119 Mar 13 13:49 HyraxAccess.log
-rw-rw-r-- 1 tomcat7 tomcat7 826692 Mar 13 13:49 HyraxErrors.log
[cbc@hyrax01 ~]$ sudo besctl restart -d "/etc/olfs/logs/bes_debug.log,bes,fonc"
Shutting down the BES daemon
The BES daemon is not currently running
Starting the BES
OK: Successfully started the BES
PID: 32036 UID: 0
[cbc@hyrax01 ~]$

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Nathan Potter

Mar 13, 2017, 3:53:29 PM3/13/17
to Calloway, Chris, James Gallagher, Nathan Potter, renci_rayi.con, Hong Yi, sup...@opendap.org
Hi Chris,

That is mighty odd. The process is only ever the root or the bes user and thus should be able to write to the file. The only thing I can think of would be to try creating the file in /tmp, making it world writeable (I know, but it’s just for a few minutes) and then

sudo besctl start -d "/tmp/bes_debug.log,all"

And see if that works, and honestly I doubt it does. I need to think about it…
Does the server need to be up? Will it still work without the debugging? What’s the urgency level here?

Thanks,

N

Calloway, Chris

Mar 13, 2017, 4:01:25 PM3/13/17
to Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
Will do on the suggestions in a few minutes.

The server needs to be up and the problem is fairly urgent. The service is a component of the production environment used by many people for hydrological research. The bug report was passed to us by a researcher who needs to be able to subset this particular NetCDF. I’ve dropped a lot of what I’ve been doing, as I’m sure you have as well, to try to put this issue to rest as quickly as possible so I can get back to what I would otherwise be doing.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Nathan Potter

Mar 13, 2017, 4:28:16 PM3/13/17
to Calloway, Chris, Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
Chris,

Did you see anything from using the “bes,fonc” switch for the debug logs?

N

Calloway, Chris

Mar 13, 2017, 4:35:51 PM3/13/17
to Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
Yes, I posted it here and sent the link in an earlier email today:

http://people.renci.org/~cbc/bes_debug.log

This is just for the bes,fonc debug. I just sent an email showing the horrible thing that all debugging is doing.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Nathan Potter

Mar 13, 2017, 5:08:36 PM3/13/17
to Calloway, Chris, Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
Chris,

The log appears to end without protest. Most unfortunate. I just talked to James and he thinks that the problems running the debug mode “all” are probably endemic to the RPMs. They were, for obvious reasons, not built in “developer” mode.

So what can we do? I will try again to replicate the problem.

You are using CentOS-6.? How much system memory?

Does that system have any of the netcdf-3, netcdf-4, or hdf5 libraries installed?

Thanks,


Nathan

Nathan Potter

Mar 13, 2017, 5:21:41 PM3/13/17
to Calloway, Chris, Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
Chris,

I guess at this point I am thinking it’s a system specific issue;

My system: http://test.opendap.org:8080/opendap/version

And Hydroshare: http://hyrax.hydroshare.org/opendap/version

Are now running the same software.

Here’s a thought - How much disk is available on hydroshare?
In particular in /tmp partition?

The file-out responses for netcdf-3 and netcdf-4 are saddled with writing the entire netcdf response to local disk before it can be returned (NetCDF is not a “streamable” format). By default the BES is set to use /tmp for this (b.t.w. this is controlled in /etc/bes/modules/fonc.conf).

If /tmp is filling up that might be the culprit.
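
One way to test the "is /tmp filling up" theory is to compare the free space on the staging filesystem with the size of one staged response. The sketch below uses only POSIX `df` and `awk`; the helper names are hypothetical, and 131914424 is the byte count of the nc* file reported earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the space available (in KB) on the
# filesystem that holds directory $1.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical helper: round a byte count up to whole KB.
bytes_to_kb() {
    echo $(( ($1 + 1023) / 1024 ))
}

# Succeed if directory $1 has room for a response of $2 bytes.
has_room_for() {
    [ "$(avail_kb "$1")" -ge "$(bytes_to_kb "$2")" ]
}

# 131914424 bytes is the size of the staged nc* file seen in /tmp.
if has_room_for /tmp 131914424; then
    echo "/tmp can hold one staged response"
else
    echo "/tmp is too small for one staged response"
fi
```

Since every failed request in this thread left its staged file behind, the space needed grows with each retry. A check like this only says whether the next response would fit.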

Calloway, Chris

Mar 13, 2017, 8:17:53 PM3/13/17
to Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
Ouch:

[cbc@hyrax01 ~]$ ls -l /tmp/bes_debug.log
-rwxrwxrwx 1 bes tomcat7 0 Mar 13 16:30 /tmp/bes_debug.log
[cbc@hyrax01 ~]$ sudo besctl start -d “/tmp/bes_debug.log,all”
Starting the BES
Caught BES Error while processing the daemon's options: Unable to open the debug file: “/tmp/bes_debug.log
FAILED: The BES daemon did not appear to start
[cbc@hyrax01 ~]$

I don’t think it’s file permissions.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Calloway, Chris

Mar 14, 2017, 10:18:09 AM3/14/17
to Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
Nathan,

Thank you for your persistence. Persistence is how problems get solved.

I’m going to answer both your replies here in this one email. Suggestions at end of this email.

➢ You are using CentOS-6.?

[cbc@hyrax01 ~]$ cat /etc/redhat-release
CentOS release 6.7 (Final)
[cbc@hyrax01 ~]$

➢ How much system memory?

[cbc@hyrax01 ~]$ cat /proc/meminfo
MemTotal: 8061376 kB
MemFree: 6223400 kB
Buffers: 272280 kB
Cached: 854572 kB
SwapCached: 0 kB
Active: 942864 kB
Inactive: 672804 kB
Active(anon): 488992 kB
Inactive(anon): 84 kB
Active(file): 453872 kB
Inactive(file): 672720 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 1047548 kB
SwapFree: 1047548 kB
Dirty: 16 kB
Writeback: 0 kB
AnonPages: 488800 kB
Mapped: 54576 kB
Shmem: 276 kB
Slab: 122804 kB
SReclaimable: 91908 kB
SUnreclaim: 30896 kB
KernelStack: 3648 kB
PageTables: 11656 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 5078236 kB
Committed_AS: 1584212 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 170760 kB
VmallocChunk: 34359554276 kB
HardwareCorrupted: 0 kB
AnonHugePages: 432128 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 8192 kB
DirectMap2M: 2088960 kB
DirectMap1G: 6291456 kB
[cbc@hyrax01 ~]$

➢ Here’s a thought - How much disk is available on hydroshare?
➢ In particular in /tmp partition?


[cbc@hyrax01 ~]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VGos-LVslash
8.8G 3.9G 4.5G 47% /
tmpfs 3.9G 0 3.9G 0% /dev/shm
/dev/sda1 239M 82M 145M 36% /boot
/dev/mapper/VGos-LVhome
3.9G 143M 3.5G 4% /home
/dev/mapper/VGos-LVopt
3.9G 8.1M 3.7G 1% /opt
/dev/mapper/VGos-LVvar
134G 388M 127G 1% /var
[cbc@hyrax01 ~]$

> Does that system have any of the netcdf-3, netcdf-4, or hdf5 libraries installed?

No. I can find no trace of them in /lib or /usr/lib or /usr/local/lib. And…

[cbc@hyrax01 lib]$ sudo yum info netcdf*
Loaded plugins: fastestmirror, security
Loading mirror speeds from cached hostfile
Available Packages
[.. only available packages, none installed …]

[cbc@hyrax01 lib]$ sudo yum info hdf*
Loaded plugins: fastestmirror, security
Loading mirror speeds from cached hostfile
Available Packages
[.. only available packages, none installed …]

> So what can we do?

You’ve already done a lot. If the cooks are asking me, though, I’m at a loss. But…

I would like to return to my original question: what should the conf settings be for bes and olfs? We’ve never really addressed that and our first thought was there’s something that may not be right with conf settings if there’s a problem specifically with requests for large responses. This is even more on my mind since we put /etc/olfs in place, which seems to have copied all the conf out of /usr/tomcat7 so that I don’t even know which conf is in effect. That is, /etc/olfs captures *all* the conf that was in /usr/tomcat, not just olfs.xml, except for the conf in /etc/bes.

In order to facilitate that, I would suggest that I give you temporary sudoer access to the box for you to have a look around. I think we could play games of try this try that tag for more time than either of us have. Would that be OK with you?

I would also ask if you see any problems arising from running Hyrax on VMWare VMs? That is all I have available to me. Anything that I can put into production at RENCI will be running on VMWare.

Finally, there is something I will do. I had already looked at the /tmp disk space and I think it is OK. But to be sure, I will ask for an increase. We have a group at RENCI who maintain the VMWare and storage infrastructure for me. They routinely insist on partitioning the storage such that the lion’s share of what I request goes to /var, often to my chagrin. But as a matter of due diligence, I will ask that group to please increase the space of the partition which mounts /tmp. However, first…[see next paragraph].

That is, unless you could tell me if there is a way and what it might be where we can have bes and olfs communicate through a subdirectory in /var which already has a vastly unused 134G space with currently 127 free. Is there a way to have bes and olfs do that given an RPM install? Because my next step would be rather painful: reinstalling from source on a new VM. We’ve tried that before with no success even with your help. Using RPMs was your solution out of that. And it would involve even more pain because of a storage mount we must use for the netCDF files that involves three other systems.

Nathan Potter

Mar 14, 2017, 11:54:52 AM3/14/17
to Calloway, Chris, Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
Chris,

I’ll try to keep it short(ish): 

0) The OLFS announces where it is reading its configuration from at startup. This should be easily located in the $CATALINA_HOME/logs/catalina.out file:

08:57:10.992 [localhost-startStop-1] DEBUG opendap.coreServlet.ServletUtil - The environment variable OLFS_CONFIG_DIR was not set. Trying default config location: /etc/olfs/
08:57:10.993 [localhost-startStop-1] INFO  opendap.coreServlet.ServletUtil - Using config location: /etc/olfs/

Also, /etc/olfs _should_ have all of those files - it’s a good thing! The OLFS only uses $CATALINA_HOME/webapps/opendap/WEB-INF/conf for its configuration if it cannot find it in any of the other expected locations.
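
To see which location a running OLFS actually picked, the startup message quoted above can be pulled out of the Tomcat log. A minimal sketch follows; `olfs_config_dir` is a hypothetical helper name, and it assumes the log line has the same wording as the sample shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the most recent config location that the
# OLFS reported in the given Tomcat log file.
# $1: path to catalina.out
olfs_config_dir() {
    grep 'Using config location:' "$1" | tail -n 1 |
        sed 's/.*Using config location: *//'
}

# Usage, with the log path taken from the message above:
#   olfs_config_dir "$CATALINA_HOME/logs/catalina.out"
```

Taking the last match matters because catalina.out accumulates output across restarts, so older entries may name a location that is no longer in use.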

1) Just forget the debug “all” switch - it’s hosed in the RPM and we can’t fix it. :(

2) I think the /tmp space may be an issue based on your df:

/dev/mapper/VGos-LVslash   8.8G  3.9G  4.5G  47% /

But the good news is that this is JUST a default setting. The BES can be configured to cache stuff anywhere in the mounted file system. The bad news is that a bunch of the BES’ components utilize temporary storage or disk cache and they all have independent configurations. Rather than doing the whole list, let’s just focus on the ones that we suspect are causing trouble, if that works for you.

Assuming it does, I suggest this:

- Make a dir in /var (For example: /var/hyrax)
- Make it so both the Tomcat and BES users can write to it.
- In /etc/bes/bes.conf set:
  
    BES.UncompressCache.dir=/var/hyrax

- In /etc/bes/modules/fonc.conf and /etc/bes/modules/fong.conf, respectively:

    FONc.Tempdir=/var/hyrax
    FONg.Tempdir=/var/hyrax


- Restart Hyrax (shutdown tomcat, bes, start tomcat, olfs).

- Try it yet again :)
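
The configuration edits in the steps above could be applied with a small helper like the one below. This is a sketch under assumptions: `set_conf_key` is a hypothetical function, it only handles the plain `Key=value` form shown in this thread, and the value must not contain `|` or `&`.

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in a conf file, replacing an existing
# assignment or appending one if the key is absent.
# $1: conf file, $2: key, $3: value
set_conf_key() {
    file=$1
    key=$2
    value=$3
    # Escape dots so that a key such as FONc.Tempdir matches literally.
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    tmp=$(mktemp) || return 1
    if grep -q "^${key_re}=" "$file"; then
        sed "s|^${key_re}=.*|${key}=${value}|" "$file" > "$tmp"
    else
        cat "$file" > "$tmp"
        printf '%s=%s\n' "$key" "$value" >> "$tmp"
    fi
    # Write back in place so the file keeps its owner and mode.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Usage for the steps above (run as root; not executed here):
#   mkdir -p /var/hyrax
#   set_conf_key /etc/bes/bes.conf          BES.UncompressCache.dir /var/hyrax
#   set_conf_key /etc/bes/modules/fonc.conf FONc.Tempdir            /var/hyrax
#   set_conf_key /etc/bes/modules/fong.conf FONg.Tempdir            /var/hyrax
```

The new values only take effect after the BES is restarted.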



Thanks,

Nathan




PS - 
Here is a list of all of the /tmp references in /etc/bes/modules

fojson.conf:FoJson.Tempdir=/tmp
fonc.conf:FONc.Tempdir=/tmp
fong.conf:FONg.Tempdir=/tmp
gateway.conf:Gateway.Cache.dir=/tmp
h4.conf:HDF4.CacheDir=/tmp
h4.conf:HDF4.Cache.latlon.path=/tmp/latlon
h4.conf:H4.Cache.metadata.path=/tmp/md
ncml.conf:NCML.DimensionCache.directory=/tmp
w10n.conf:w10n.Tempdir=/tmp
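
A list in this `file:setting` form can be regenerated after any configuration change with a plain grep over the module directory. A minimal sketch follows; `list_tmp_refs` is a hypothetical helper name, and the match is on any line containing `/tmp`, so commented-out lines would be listed too.

```shell
#!/bin/sh
# Hypothetical helper: list every line mentioning /tmp in the *.conf
# files of directory $1, prefixed with the file name.
list_tmp_refs() {
    # /dev/null forces grep to print file names even for a single match file.
    ( cd "$1" && grep '/tmp' /dev/null ./*.conf | sed 's|^\./||' )
}

# Usage (directory taken from the message above):
#   list_tmp_refs /etc/bes/modules
```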



Nathan Potter

Mar 14, 2017, 2:02:55 PM3/14/17
to Calloway, Chris, Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org

> On Mar 14, 2017, at 8:54 AM, Nathan Potter <n...@opendap.org> wrote:
>
> - Restart Hyrax (shutdown tomcat, bes, start tomcat, olfs).
>

Meant to say:

- Restart Hyrax (shutdown tomcat, bes, start tomcat, bes).

Calloway, Chris

Mar 14, 2017, 5:02:27 PM3/14/17
to Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org

0)       14:19:50.923 [localhost-startStop-1] INFO  opendap.coreServlet.ServletUtil - Using config location: /etc/olfs/

1)       Thought it might be because of the sticky bit on /tmp. But when I created bes_debug in /var/opt/hyrax with no sticky bit, debug all still would not start. I might try to run each of the debug services individually later

2)       Well, this is odd. BES.UncompressCache.dir was already set to /var/cache/bes, not /tmp. And it doesn’t appear writable by tomcat7:

[cbc@hyrax01 tomcat7]$ ls -l /var/cache
total 56
drwxrwxr-x  2 bes       bes       4096 Jan  9 18:44 bes
[...]

 

And it doesn’t appear to have been written to in awhile:

 

[cbc@hyrax01 tomcat7]$ ls -l /var/cache/bes
total 452
-rw-r--r-- 1 bes bes      8 Dec 14  2015 uncompress_cache.cache_control
-rw-r--r-- 1 bes bes 204960 Dec 14  2015 uncompress_cache#usr#share#hyrax#data#gdal#Atlantic.wind.grb
-rw-r--r-- 1 bes bes  95160 Dec 14  2015 uncompress_cache#usr#share#hyrax#data#gdal#Caribbean.wind.grb
-rw-r--r-- 1 bes bes 150060 Dec 12  2015 uncompress_cache#usr#share#hyrax#data#gdal#CentralAtlantic.wind.grb
[cbc@hyrax01 tomcat7]$

Whereas FONc.Tempdir and FONg.Tempdir were both already /tmp. So maybe a misconfiguration between bes, fonc, and fong?


I set BES.UncompressCache.dir, FONc.Tempdir, FONg.Tempdir to /var/opt/hyrax:

 

drwxrwxrwx 2 tomcat7 tomcat7 4096 Mar 14 16:32 hyrax

[cbc@hyrax01 ~]$

 

Restarted bes and tomcat7, sent offending request and got internal server error. However there were two cache files in /var/opt/hyrax, which seems odd:

 

-rw------- 1 bes bes     131914424 Mar 14 16:24 nccj60Va
-rw------- 1 bes bes     131914424 Mar 14 16:24 ncXXBrCF

 

So I turned on debug after a restart:

 

[cbc@hyrax01 ~]$ ls -l /var/opt/hyrax

-rwxrwxrwx 1 bes tomcat7         0 Mar 14 16:38 bes_debug.log

 

And then I got the internal server error again from the offending request. Except this time only one new cache file was created:

 

[cbc@hyrax01 ~]$ ls -l /var/opt/hyrax/
total 386500
-rwxrwxrwx 1 bes tomcat7     20526 Mar 14 16:50 bes_debug.log
-rw------- 1 bes bes     131914424 Mar 14 16:50 nc2oIPEH
-rw------- 1 bes bes     131914424 Mar 14 16:24 nccj60Va
-rw------- 1 bes bes     131914424 Mar 14 16:24 ncXXBrCF
-rw-r--r-- 1 bes bes             8 Mar 14 16:22 uncompress_cache.cache_control
[cbc@hyrax01 ~]$

 

Here’s the bes_debug.log which simply shows that the cache was written:

 

http://people.renci.org/~cbc/bes_debug.log

 

More tomorrow. Are there other debug options I could run that would be helpful?

Nathan Potter

Mar 14, 2017, 5:57:37 PM3/14/17
to Calloway, Chris, Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
Hi Chris,

So those cache files must surely be the FoNC output being staged. When I make the request of our test server and toss it in a file I get:

-rw-r--r--  1 ndp  staff  131914424 Mar 14 14:16 foo.nc4

Same byte count.
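
The byte-count comparison can be done without reading `ls` output by eye. A minimal sketch follows, with hypothetical helper names; note that equal sizes are only a quick sanity check, and `cmp` would be needed to confirm the contents are identical.

```shell
#!/bin/sh
# Hypothetical helper: print the size of file $1 in bytes.
size_of() {
    wc -c < "$1" | tr -d ' '
}

# Hypothetical helper: succeed if files $1 and $2 have the same byte count.
same_size() {
    [ "$(size_of "$1")" -eq "$(size_of "$2")" ]
}

# Usage, with file names taken from this thread:
#   same_size /var/opt/hyrax/nc2oIPEH foo.nc4 && echo "same byte count"
```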

> I set BES.UncompressCache.dir, FONc.Tempdir, FONg.Tempdir to /var/opt/hyrax:

Good.

Maybe we should talk in the AM….

N

Calloway, Chris

Mar 15, 2017, 8:40:28 AM3/15/17
to Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org

At the ready. My number below. Or send me your number and an EST time.

Calloway, Chris

Mar 16, 2017, 4:18:38 PM3/16/17
to Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org

OK, Nathan. I went through the steps we discussed yesterday:

 

1)       The cached files are indeed valid nc4 with the correct data in them. I uploaded a copy of one here:

http://people.renci.org/~cbc/ncLK24tq

2)       Changed olfs.xml to comment out <CatalogCache>. Internal server error on offending request. Debug log here looks the same as always:

http://people.renci.org/~cbc/bes_debug.catalog_cache_commented_out.log

3)       Changed olfs.xml to uncomment <BesManager><timeout> and set it to 0 for no timeout. Debug log here looks the same as always but slightly larger:

http://people.renci.org/~cbc/bes_debug.bes_manager_timeout_uncommented_and_set_to_0.log

4)       Changed olfs.xml to ramp down <ClientPool> attributes to maximum=4 and maxcmds=200. Debug log here looks the same as always:

http://people.renci.org/~cbc/bes_debug.client_pool_max_4_maxcmd_200.log

5)       Changed olfs.xml to ramp up <ClientPool> attributes to maximum=2000 and maxcmds=20000. Debug log here looks the same as always:

http://people.renci.org/~cbc/bes_debug.client_pool_max_2000_maxcmd_20000.log

Afterwards, I restarted BES and OLFS with debug turned off and <ClientPool> returned to the default settings of maximum=200 and maxcmds=2000 (but left <CatalogCache> commented out and <BesManager><timeout> uncommented and set to 0).

 

So, as you predicted, nothing came of those changes. But hopefully some causes were eliminated. If we do gdb, that will be compiling from source in develop mode on another VM, correct?

Nathan Potter

Mar 17, 2017, 5:47:06 PM3/17/17
to Calloway, Chris, Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
Chris,

Sorry I disappeared for a few days. The headache finally passed, and today we turned in the large final report that has been consuming me.

So - can we pick this up Monday? Do you have time then?

Comments inline below…


N

> On Mar 16, 2017, at 1:18 PM, Calloway, Chris <c...@unc.edu> wrote:
>
> OK, Nathan. I went through the steps we discussed yesterday:
>
> 1) The cached files are indeed valid nc4 with the correct data in them. I uploaded a copy of one here:
>
> http://people.renci.org/~cbc/ncLK24tq
>
> 2) Changed olfs.xml to comment out <CatalogCache>. Internal server error on offending request. Debug log here looks the same as always:
>
> http://people.renci.org/~cbc/bes_debug.catalog_cache_commented_out.log
>
> 3) Changed olfs.xml to uncomment <BesManager><timeout> and set it to 0 for no timeout. Debug log here looks the same as always but slightly larger:
>
> http://people.renci.org/~cbc/bes_debug.bes_manager_timeout_uncommented_and_set_to_0.log
>
> 4) Changed olfs.xml to ramp down <ClientPool> attributes to maximum=4 and maxcmds=200. Debug log here looks the same as always:
>
> http://people.renci.org/~cbc/bes_debug.client_pool_max_4_maxcmd_200.log
>
> 5) Changed olfs.xml to ramp up <ClientPool> attributes to maximum=2000 and maxcmds=20000. Debug log here looks the same as always:
>
> http://people.renci.org/~cbc/bes_debug.client_pool_max_2000_maxcmd_20000.log

By default Tomcat only supports 200 concurrent connections, so it’s unlikely that making the maximum larger than 200
will have much perceivable effect. Just saying.


>
> Afterwards, I restarted BES and OLFS with debug turned off and <ClientPool> returned to the default settings of maximum=200 and maxcmds=2000 (but left <CatalogCache> commented out and <BesManager><timeout> uncommented and set to 0).

Good.

>
> So, as you predicted, nothing came of those changes. But hopefully some causes were eliminated. If we do gdb, that will be compiling from source in develop mode on another VM, correct?

Maybe not. The gdb symbols should be in the “debug” RPMs you installed.

Calloway, Chris

Mar 20, 2017, 10:19:55 AM3/20/17
to Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
At the ready when you are.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Calloway, Chris

Mar 20, 2017, 2:56:07 PM3/20/17
to Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
Nathan,

I checked my Hyrax and we were just too impatient for a response.

In addition to seeing the processes running, I checked the Tomcat server and applications status pages and they show opendap running. Screenshots attached.

If I click on the opendap link on the application status page, it takes me to a functioning root of this Hyrax:

http://hyrax01.renci.org:8080/opendap/

which simply took a really long time to load. Maybe we should turn caching back on?

Anyway, I was able to slowly navigate to this resource:

http://hyrax01.renci.org:8080/opendap/fef58369046c4a64a2d7564c4e7e1fd0/data/contents/contents.html

which is only 103M (compared to the offending 3.1G). I made an nc4 subset request for the first five variables, which were the same ones as on the offending dataset, and after many, many seconds, got a valid 4.3M response. So it is working with smaller datasets. I think we have tried this before early in this investigation.

When you could not connect, was your browser still waiting for a response, or had it already timed out?

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


hyrax-application-status.pdf
hyrax-server-status.pdf

Nathan Potter

Mar 20, 2017, 3:24:22 PM3/20/17
to Calloway, Chris, Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
Hi Chris,

I am now getting a slow response. Too slow. The BES and OLFS are running on the same system? Is the prices stack full of beslistener processes?

N
> <hyrax-application-status.pdf><hyrax-server-status.pdf>

Nathan Potter

Mar 20, 2017, 3:56:37 PM3/20/17
to Calloway, Chris, Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org

> On Mar 20, 2017, at 12:24 PM, Nathan Potter <n...@opendap.org> wrote:
>
> Hi Chris,
>
> I am now getting a slow response. Too slow. The BES and OLFS are running on the same system? Is the prices stack full of beslistener processes?

Is the process stack full of beslistener processes?

Calloway, Chris

Mar 20, 2017, 4:03:27 PM3/20/17
to Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
Yes, right many. Twelve. It’s usually only four:

[cbc@hyrax01 conf]$ ps -Af | grep beslist
bes 10474 10472 0 03:32 ? 00:00:00 /usr/bin/beslistener -c /etc/bes/bes.conf -d /var/log/bes/bes.log,-ascii,-besdaemon,-csv,-dap,-ff,-fits,-fojson,-fonc,-fong,-gateway,-gdal,-h4,-h5,-nc,-ncml,-ppt,-reader,-server,-usage,-w10n,-www,-xd -i /usr -r /var/run/bes
bes 10730 10474 0 03:32 ? 00:00:00 /usr/bin/beslistener -c /etc/bes/bes.conf -d /var/log/bes/bes.log,-ascii,-besdaemon,-csv,-dap,-ff,-fits,-fojson,-fonc,-fong,-gateway,-gdal,-h4,-h5,-nc,-ncml,-ppt,-reader,-server,-usage,-w10n,-www,-xd -i /usr -r /var/run/bes
bes 17921 10474 0 13:43 ? 00:00:00 /usr/bin/beslistener -c /etc/bes/bes.conf -d /var/log/bes/bes.log,-ascii,-besdaemon,-csv,-dap,-ff,-fits,-fojson,-fonc,-fong,-gateway,-gdal,-h4,-h5,-nc,-ncml,-ppt,-reader,-server,-usage,-w10n,-www,-xd -i /usr -r /var/run/bes
bes 17922 10474 0 13:44 ? 00:00:00 /usr/bin/beslistener -c /etc/bes/bes.conf -d /var/log/bes/bes.log,-ascii,-besdaemon,-csv,-dap,-ff,-fits,-fojson,-fonc,-fong,-gateway,-gdal,-h4,-h5,-nc,-ncml,-ppt,-reader,-server,-usage,-w10n,-www,-xd -i /usr -r /var/run/bes
bes 17923 10474 0 13:44 ? 00:00:00 /usr/bin/beslistener -c /etc/bes/bes.conf -d /var/log/bes/bes.log,-ascii,-besdaemon,-csv,-dap,-ff,-fits,-fojson,-fonc,-fong,-gateway,-gdal,-h4,-h5,-nc,-ncml,-ppt,-reader,-server,-usage,-w10n,-www,-xd -i /usr -r /var/run/bes
bes 17931 10474 0 13:44 ? 00:00:00 /usr/bin/beslistener -c /etc/bes/bes.conf -d /var/log/bes/bes.log,-ascii,-besdaemon,-csv,-dap,-ff,-fits,-fojson,-fonc,-fong,-gateway,-gdal,-h4,-h5,-nc,-ncml,-ppt,-reader,-server,-usage,-w10n,-www,-xd -i /usr -r /var/run/bes
bes 17936 10474 0 13:44 ? 00:00:00 /usr/bin/beslistener -c /etc/bes/bes.conf -d /var/log/bes/bes.log,-ascii,-besdaemon,-csv,-dap,-ff,-fits,-fojson,-fonc,-fong,-gateway,-gdal,-h4,-h5,-nc,-ncml,-ppt,-reader,-server,-usage,-w10n,-www,-xd -i /usr -r /var/run/bes
bes 17937 10474 0 13:44 ? 00:00:00 /usr/bin/beslistener -c /etc/bes/bes.conf -d /var/log/bes/bes.log,-ascii,-besdaemon,-csv,-dap,-ff,-fits,-fojson,-fonc,-fong,-gateway,-gdal,-h4,-h5,-nc,-ncml,-ppt,-reader,-server,-usage,-w10n,-www,-xd -i /usr -r /var/run/bes
bes 17941 10474 0 13:45 ? 00:00:00 /usr/bin/beslistener -c /etc/bes/bes.conf -d /var/log/bes/bes.log,-ascii,-besdaemon,-csv,-dap,-ff,-fits,-fojson,-fonc,-fong,-gateway,-gdal,-h4,-h5,-nc,-ncml,-ppt,-reader,-server,-usage,-w10n,-www,-xd -i /usr -r /var/run/bes
bes 17951 10474 0 13:45 ? 00:00:00 /usr/bin/beslistener -c /etc/bes/bes.conf -d /var/log/bes/bes.log,-ascii,-besdaemon,-csv,-dap,-ff,-fits,-fojson,-fonc,-fong,-gateway,-gdal,-h4,-h5,-nc,-ncml,-ppt,-reader,-server,-usage,-w10n,-www,-xd -i /usr -r /var/run/bes
bes 17953 10474 0 13:45 ? 00:00:00 /usr/bin/beslistener -c /etc/bes/bes.conf -d /var/log/bes/bes.log,-ascii,-besdaemon,-csv,-dap,-ff,-fits,-fojson,-fonc,-fong,-gateway,-gdal,-h4,-h5,-nc,-ncml,-ppt,-reader,-server,-usage,-w10n,-www,-xd -i /usr -r /var/run/bes
bes 17959 10474 0 13:46 ? 00:00:00 /usr/bin/beslistener -c /etc/bes/bes.conf -d /var/log/bes/bes.log,-ascii,-besdaemon,-csv,-dap,-ff,-fits,-fojson,-fonc,-fong,-gateway,-gdal,-h4,-h5,-nc,-ncml,-ppt,-reader,-server,-usage,-w10n,-www,-xd -i /usr -r /var/run/bes
cbc 19437 2511 0 16:00 pts/0 00:00:00 grep beslist
[cbc@hyrax01 conf]$

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Calloway, Chris

Mar 20, 2017, 4:13:33 PM
to Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
I restarted bes and tomcat (stop tomcat, stop besctl, start besctl, start tomcat) and it’s a wee bit faster. Started with one listener. Made a request and there were three listeners afterwards.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Nathan Potter

Mar 20, 2017, 4:23:42 PM
to Calloway, Chris, Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
Chris,

And by now are there a bunch more beslistener jobs? It’s taking me 30 to 60 seconds to get a page from the server. Can you see if there is a process using a lot (all?) of the cpu?



Nathan

Calloway, Chris

Mar 21, 2017, 8:45:17 AM
to Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
There are only three beslisteners this morning. And the CPU is practically idle.

I’m ready today when you are. I’m also working on other things in parallel, so a phone call is going to alert me more quickly than an email.

I’m going to look at the logs for a traffic analysis as we discussed at the end of yesterday.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Nathan Potter

Mar 21, 2017, 9:15:35 AM
to Calloway, Chris, Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
In my recent analysis here are the “key” words I searched for in the user-agent field to locate the robots:

Slurp
Googlebot
Exabot
Yandex
bingbot
DotBot
Scrapy
MJ12bot
Qwantify
ltx71
BLEXBot
AhrefsBot
Baiduspider
yandex.ru
Cliqzbot
SemrushBot
msnbot-media
MojeekBot
Slackbot
ZoomBot
Sogou
DAPBOT
ia_archiver

Though DAPBOT is actually a science thing.
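
A minimal sketch of this kind of keyword search, in Python. It assumes the user-agent strings have already been pulled out of the access log (the log format is not shown in this thread), and that matching is a case-insensitive substring test, which may differ from how the original analysis was done. The function name `count_bot_hits` is made up for illustration.

```python
from collections import Counter

# Keywords taken from the list above. Matching is a case-insensitive
# substring test, which is an assumption about how the search was done.
BOT_KEYWORDS = [
    "Slurp", "Googlebot", "Exabot", "Yandex", "bingbot", "DotBot", "Scrapy",
    "MJ12bot", "Qwantify", "ltx71", "BLEXBot", "AhrefsBot", "Baiduspider",
    "yandex.ru", "Cliqzbot", "SemrushBot", "msnbot-media", "MojeekBot",
    "Slackbot", "ZoomBot", "Sogou", "DAPBOT", "ia_archiver",
]


def count_bot_hits(user_agents):
    """Count requests per keyword; each user-agent is counted at most once."""
    hits = Counter()
    for agent in user_agents:
        lowered = agent.lower()
        for keyword in BOT_KEYWORDS:
            if keyword.lower() in lowered:
                hits[keyword] += 1
                break  # attribute the request to the first matching keyword
    return hits
```

Because "Yandex" appears before "yandex.ru" in the list, any Yandex agent is counted under "Yandex". Requests that match no keyword (for example a plain curl client) are simply not counted.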

Nathan Potter

Mar 21, 2017, 9:31:59 AM
to Calloway, Chris, Nathan Potter, James Gallagher, renci_rayi.con, Hong Yi, sup...@opendap.org
Chris,

If you should decide to send me credentials on that system (or a clone thereof) I have attached my public key (do you pgp/gpg?).


Thanks,

Nathan

ndp.asc

Nathan Potter

Mar 29, 2017, 6:24:24 PM
to Calloway, Chris, Nathan Potter, renci_rayi.con, James Gallagher, support@opendap.org support

Hi Chris,

I have found some interesting things afoot on hyrax.hydroshare.org

1) It appears that the “problem” subset is working. I logged into hyrax.hydroshare.org and used valgrind to check on the BES while it processed this previously difficult request and valgrind found nothing amiss. I was able to get the NetCDF-4 response from the server using this URL:

http://hyrax.hydroshare.org/opendap/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.nc4?latitude[0:1:194],longitude[0:1:461],time[0:1:365],crs,mean[0:1:365][0:1:194][0:1:461]

Also, while logged into hyrax.hydroshare.org I was able to get the NC4 response using the besstandalone and bescmdln applications:

besstandlone -c /etc/bes/bes.conf -i /var/log/bes/crusher_bescmd.xml -f /tmp/foo.nc
bescmdln -i /var/log/bes/crusher_bescmd.xml -f /tmp/foo.nc


2) There is definitely some issue with the filesystem that I don’t understand. When I use “ls” to look at the top level data directory the response is quite quick:

[root@hyrax01 bes]# time ls -l /opt/inetcdf_public_hydroshare > /dev/null

real 0m0.004s
user 0m0.000s
sys 0m0.004s


But when Hyrax uses the “stat” function to get the information it takes the better part of 4 seconds, and that’s every time:

[root@hyrax01 bes]# time bescmdln -i bescmd.xml -f /dev/null

real 0m3.736s
user 0m0.007s
sys 0m0.003s

Which is really killing Hyrax performance.
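
One way to narrow down where the time goes is to time the two kinds of stat call separately. The sketch below is not Hyrax code. It assumes only that the directory may contain symbolic links: `os.lstat()` inspects the link itself, while `os.stat()` follows it to the target, so a large gap between the two would point at the filesystem behind the links rather than the directory. The function name is hypothetical.

```python
import os
import time


def time_directory_stats(path):
    """Return (entry_count, lstat_seconds, stat_seconds) for one directory."""
    names = os.listdir(path)

    # lstat() looks at each entry itself and never follows a symlink.
    start = time.perf_counter()
    for name in names:
        os.lstat(os.path.join(path, name))
    lstat_seconds = time.perf_counter() - start

    # stat() follows symlinks, so it touches whatever filesystem the
    # link target lives on.
    start = time.perf_counter()
    for name in names:
        try:
            os.stat(os.path.join(path, name))
        except OSError:
            pass  # broken link or unreachable target; still counts as time spent
    stat_seconds = time.perf_counter() - start

    return len(names), lstat_seconds, stat_seconds
```

Run as the bes user, something like `print(time_directory_stats("/opt/inetcdf_public_hydroshare"))` would show whether following the links accounts for the seconds seen above.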

3) There is some mysterious configuration issue with the web server and I don’t understand it. In Hyrax we check for users hitting collections like this:

http://hyrax.hydroshare.org:8080/opendap

and we issue a 302 redirect to

http://hyrax.hydroshare.org:8080/opendap/

But that is not happening on this system. As far as I can tell, something seems to be rewriting the URL before Hyrax gets it. Because this doesn’t generate a 302 redirect, the client gets a web page back where all the links are broken.

This server (AWS EC2 instance CentOS-6.7, Hyrax from RPM) does the redirect correctly:

http://test.opendap.org/opendap

Any thoughts on what could be happening here?

Thanks,

Nathan



> On Mar 28, 2017, at 1:07 PM, Nathan Potter <n...@opendap.org> wrote:
>
> Thanks guys, I’m in now.
>
> N
>
>> On Mar 28, 2017, at 11:32 AM, Calloway, Chris <c...@unc.edu> wrote:
>>
>> Thanks, Ray, I already had ndp in the wheel group. What I needed to uncomment in /etc/sudoers was:
>>
>> %wheel ALL=(ALL) NOPASSWD: ALL
>>
>> Nathan, this seems to me to work when I su to ndp. See if it works with a key login and let me know. Thanks for getting back to this.
>>
>> [root@hyrax01 ~]# su - ndp
>> [ndp@hyrax01 ~]$ ls /root
>> ls: cannot open directory /root: Permission denied
>> [ndp@hyrax01 ~]$ sudo ls /root
>> anaconda-ks.cfg install.log install.log.syslog prep_template.sh
>> [ndp@hyrax01 ~]$ exit
>> logout
>> [root@hyrax01 ~]#
>>
>> --
>> Sincerely,
>>
>> Chris Calloway
>> Applications Analyst
>> University of North Carolina
>> Renaissance Computing Institute
>> (919) 599-3530
>>
>>
>> On 3/28/17, 2:00 PM, "Ray Idaszak" <ra...@renci.org> wrote:
>>
>> Chris,
>>
>> I added ndp to sudo just now on hyrax01.renci.org but I haven't told him if you want to let him know since you are his point-of-communication. I'm okay with it since RENCI squashes root on partitions outside of a given VM.
>>
>> Best Regards,
>> -Ray
>>
>> -----Original Message-----
>> From: Nathan Potter [mailto:n...@opendap.org]
>> Sent: Tuesday, March 28, 2017 12:23 PM
>> To: Calloway, Chris <c...@unc.edu>
>> Cc: Nathan Potter <n...@opendap.org>; Ray Idaszak <ra...@renci.org>
>> Subject: Re: [support] Hyrax BES HD5 Error
>>
>> Chris,
>>
>> I need to restart the BES on hyrax.hydroshare.org but I find that “sudo” wants my password which I don’t know because of id_rsa.pub
>>
>> Can you fix my sudo’r config so no password is needed, or is that verboten?
>>
>> Thanks,
>>
>> N
>>
>>> On Mar 23, 2017, at 6:05 AM, Calloway, Chris <c...@unc.edu> wrote:
>>>
>>> Copying Ray because we took him out of the loop for awhile.
>>>
>>> It would not surprise me in the least if the problem was in the NetCDF library. Good luck with your investigation and let me know what I can do on my end.
>>>
>>> --
>>> Sincerely,
>>>
>>> Chris Calloway
>>> Applications Analyst
>>> University of North Carolina
>>> Renaissance Computing Institute
>>> (919) 599-3530
>>>
>>>
>>> On 3/22/17, 6:20 PM, "Nathan Potter" <n...@opendap.org> wrote:
>>>
>>> Chris,
>>>
>>> No, I didn’t. And I got pulled off onto other stuff, and in the course of said stuff I think I found that you are not alone in this issue. I will get back on it in the morning - hopefully I’ll get a chance to talk to James and Fan about other examples that we can try. Fan thinks it’s a problem in the NetCDF-C library, which might explain why it’s proving so difficult to locate.
>>>
>>> More soon…
>>>
>>> N
>>>
>>>> On Mar 22, 2017, at 5:46 AM, Calloway, Chris <c...@unc.edu> wrote:
>>>>
>>>> Didja see anything?
>>>>
>>>> --
>>>> Sincerely,
>>>>
>>>> Chris Calloway
>>>> Applications Analyst
>>>> University of North Carolina
>>>> Renaissance Computing Institute
>>>> (919) 599-3530
>>>>
>>>>
>>>> On 3/21/17, 3:05 PM, "Nathan Potter" <n...@opendap.org> wrote:
>>>>
>>>> I got in.
>>>>
>>>>
>>>>> On Mar 21, 2017, at 10:21 AM, Calloway, Chris <c...@unc.edu> wrote:
>>>>>
>>>>> New public key installed. Give it a whirl. RSA public keys are supposed to have user/hostname embedded. But this one has crabby.local for a hostname. Don’t know if that will work. Let’s see. If it fails, I’ll just give you a password.
>>>>>
>>>>> I think the previous public key file had a bad line ending.
>>>>>
>>>>> --
>>>>> Sincerely,
>>>>>
>>>>> Chris Calloway
>>>>> Applications Analyst
>>>>> University of North Carolina
>>>>> Renaissance Computing Institute
>>>>> (919) 599-3530
>>>>>
>>>>>
>>>>> On 3/21/17, 12:45 PM, "Nathan Potter" <n...@opendap.org> wrote:
>>>>>
>>>>> Chris,
>>>>>
>>>>> That was a bad id_rsa.pub, I made a new one, but I am wondering if there will be issues based on the user/hostname embedded within. (see attached)
>>>>>
>>>>>
>>>>
>>>> = = =
>>>> Nathan Potter ndp at opendap.org
>>>> OPeNDAP, Inc. +1.541.231.3317
>>>>
>>>>
>>>>
>>>
>>> = = =
>>> Nathan Potter ndp at opendap.org
>>> OPeNDAP, Inc. +1.541.231.3317
>>>
>>>
>>>
>>
>> = = =
>> Nathan Potter ndp at opendap.org
>> OPeNDAP, Inc. +1.541.231.3317
>>
>>
>>
>
> = = =
> Nathan Potter ndp at opendap.org
> OPeNDAP, Inc. +1.541.231.3317
>

Calloway, Chris

Mar 30, 2017, 1:20:51 PM
to Nathan Potter, renci_rayi.con, James Gallagher, support@opendap.org support, Hong Yi
Nathan, paragraph numbers here correspond to the numbers on your previous reply.

Hong, there is a question for you on bullet 2b below.

1a) I don’t understand why the link works for you. I just clicked on it and got the same Internal Server Error we’ve been seeing. What am I doing wrong?

1b) Also, when I run besstandalone, as either root or bes, I get command not found:

[root@hyrax01 ~]# besstandlone -c /etc/bes/bes.conf -i /var/log/bes/crusher_bescmd.xml -f /tmp/foo.nc
-bash: besstandlone: command not found
[root@hyrax01 ~]# su - bes
-bash-4.1$ besstandlone -c /etc/bes/bes.conf -i /var/log/bes/crusher_bescmd.xml -f /tmp/foo.nc
-bash: besstandlone: command not found
-bash-4.1$ exit
logout
[root@hyrax01 ~]#

1c) If besstandalone is working and writing the nc file for you, I can understand that. We’ve already established that bes itself is writing a correct result with an auto-generated name to /tmp. The problem seems to be that olfs is not picking it up and making a response out of it, correct?

2a) time ls -l /opt/inetcdf_public_hydroshare appears to run fast only because the output is being redirected to /dev/null. If the ls output is directed anywhere else, it also takes almost 4 seconds. I might assume this is because of the fuse mount. However, /opt/inetcdf_public_hydroshare contains only links, not files.

-bash-4.1$ time ls -l /opt/inetcdf_public_hydroshare > /dev/null

real 0m0.002s
user 0m0.000s
sys 0m0.001s
-bash-4.1$ time ls -l /opt/inetcdf_public_hydroshare
total 0
lrwxrwxrwx 1 bes bes 53 Mar 13 12:48 0e62b639e17c40259fc969437f33f59b -> /opt/hydrosharevault/0e62b639e17c40259fc969437f33f59b
lrwxrwxrwx 1 bes bes 53 Mar 13 12:48 13eb6f4034fd44db87b0a1eef3ee3826 -> /opt/hydrosharevault/13eb6f4034fd44db87b0a1eef3ee3826
lrwxrwxrwx 1 bes bes 53 Mar 13 12:49 31acb0c0a66f4de08f541077bdf3e1d2 -> /opt/hydrosharevault/31acb0c0a66f4de08f541077bdf3e1d2
lrwxrwxrwx 1 bes bes 53 Mar 18 21:59 392fc3a92ea54ad69f67728d13c9d919 -> /opt/hydrosharevault/392fc3a92ea54ad69f67728d13c9d919
lrwxrwxrwx 1 bes bes 57 Mar 13 12:53 3f354dd111f24998b37099ebdf478441 -> /opt/hydroshareuservault/3f354dd111f24998b37099ebdf478441
lrwxrwxrwx 1 bes bes 53 Mar 13 12:50 416870ea7f634d35abddb2bff218ab7d -> /opt/hydrosharevault/416870ea7f634d35abddb2bff218ab7d
lrwxrwxrwx 1 bes bes 53 Mar 13 12:50 4f704a3bcd1a482693137264abc44b99 -> /opt/hydrosharevault/4f704a3bcd1a482693137264abc44b99
lrwxrwxrwx 1 bes bes 53 Mar 13 12:50 50fa9c5b3b5b4263bf4f1752a8b6723c -> /opt/hydrosharevault/50fa9c5b3b5b4263bf4f1752a8b6723c
lrwxrwxrwx 1 bes bes 53 Mar 13 12:50 57db338f82224c44b492efac75265367 -> /opt/hydrosharevault/57db338f82224c44b492efac75265367
lrwxrwxrwx 1 bes bes 53 Mar 13 12:51 644b6d9c65e8471a9c99c1fed33a0906 -> /opt/hydrosharevault/644b6d9c65e8471a9c99c1fed33a0906
lrwxrwxrwx 1 bes bes 57 Mar 13 12:53 7f0392828f01467386102ae4b52c3b5a -> /opt/hydroshareuservault/7f0392828f01467386102ae4b52c3b5a
lrwxrwxrwx 1 bes bes 57 Mar 13 12:53 ba64d962eb6c460abc9a8628946df116 -> /opt/hydroshareuservault/ba64d962eb6c460abc9a8628946df116
lrwxrwxrwx 1 bes bes 53 Mar 13 12:52 c1f2679252c54d92bf54fc71eb1e03f4 -> /opt/hydrosharevault/c1f2679252c54d92bf54fc71eb1e03f4
lrwxrwxrwx 1 bes bes 57 Mar 13 12:53 c9fb977bae21432b8b202f13b62285b1 -> /opt/hydroshareuservault/c9fb977bae21432b8b202f13b62285b1
lrwxrwxrwx 1 bes bes 53 Mar 13 12:53 cea10ad2d9534d0cae21f5950eb7649b -> /opt/hydrosharevault/cea10ad2d9534d0cae21f5950eb7649b
lrwxrwxrwx 1 bes bes 53 Mar 13 12:53 d8ff66ceab6b4d70a2a6c7251dab56a4 -> /opt/hydrosharevault/d8ff66ceab6b4d70a2a6c7251dab56a4
lrwxrwxrwx 1 bes bes 53 Mar 13 12:53 e66ccfb09b634e0b9ab28f3225c93ce0 -> /opt/hydrosharevault/e66ccfb09b634e0b9ab28f3225c93ce0
lrwxrwxrwx 1 bes bes 53 Mar 13 12:53 f3f947be65ca4b258e88b600141b85f3 -> /opt/hydrosharevault/f3f947be65ca4b258e88b600141b85f3
lrwxrwxrwx 1 bes bes 57 Mar 13 12:53 f42f1387d7d54d7a9228888381d7c30e -> /opt/hydroshareuservault/f42f1387d7d54d7a9228888381d7c30e
lrwxrwxrwx 1 bes bes 57 Mar 13 12:53 fbc7af608a324a7a9cbbdd415d0a9499 -> /opt/hydroshareuservault/fbc7af608a324a7a9cbbdd415d0a9499
lrwxrwxrwx 1 bes bes 57 Mar 13 12:53 fc00c8eaa0944a4a98ea2ddbfe54320e -> /opt/hydroshareuservault/fc00c8eaa0944a4a98ea2ddbfe54320e
lrwxrwxrwx 1 bes bes 57 Mar 13 12:53 fef58369046c4a64a2d7564c4e7e1fd0 -> /opt/hydroshareuservault/fef58369046c4a64a2d7564c4e7e1fd0
lrwxrwxrwx 1 bes bes 57 Mar 13 12:53 ff2a6f87817544a08c82ebcf119bae80 -> /opt/hydroshareuservault/ff2a6f87817544a08c82ebcf119bae80
lrwxrwxrwx 1 bes bes 57 Mar 13 12:53 ff2e648104254ee4bcf8db925170ea91 -> /opt/hydroshareuservault/ff2e648104254ee4bcf8db925170ea91

real 0m3.787s
user 0m0.002s
sys 0m0.002s
-bash-4.1$

2b) I only set up Hyrax and I believe the fuse mount at /opt/hydrosharevault, not /opt/inetcdf_public_hydroshare. The links in /opt/inetcdf_public_hydroshare are set up by a script cron’d by my co-worker Hong Yi thusly:

-bash-4.1$ crontab -l
0 0 * * * /usr/local/bin/python2.7 /var/log/bes/scripts/expose_pub_netcdf_res.py
-bash-4.1$

My understanding is that the fuse mount is at /opt/hydrosharevault. It contains all the holdings in the irods zone for hydroshare. That is a terrifyingly huge number of files. So much so that ls /opt/hydrosharevault takes a really really long time. Many of the files on that mount are not desirable for access through a public hyrax server due to the nature of the data in them, which may need to remain private for any number of reasons. The script, I believe, creates an irods proxy user home directory in /opt/hydroshareuservault for only the NetCDF resources hydroshare wishes to make public, which is a small subset of all the files in /opt/hydrosharevault. This is updated once per day. Then symbolic links are created for /opt/hydroshareuservault in /opt/inetcdf_public_hydroshare. I’m not sure of the reasons for that. Hong?

2c) I agree about time bescmdln -i bescmd.xml -f /dev/null . What more would bescmdln be doing than ls -l ? The unredirected output seems like it is building a catalog after stating the directory contents?

-bash-4.1$ time bescmdln -i bescmd.xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<response reqID="[ajp-bio-8009-exec-6:34:bes_request]" xmlns="http://xml.opendap.org/ns/bes/1.0#">
<showCatalog>
<dataset catalog="catalog" count="24" lastModified="2017-03-19T01:59:49" name="/" node="true" size="4096">
<dataset catalog="catalog" count="1" lastModified="2016-09-23T20:56:19" name="0e62b639e17c40259fc969437f33f59b" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2016-09-23T20:56:19" name="13eb6f4034fd44db87b0a1eef3ee3826" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2016-09-23T20:56:19" name="31acb0c0a66f4de08f541077bdf3e1d2" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2017-03-01T21:03:17" name="392fc3a92ea54ad69f67728d13c9d919" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2017-01-24T12:18:49" name="3f354dd111f24998b37099ebdf478441" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2016-11-04T17:46:10" name="416870ea7f634d35abddb2bff218ab7d" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2016-12-02T17:17:34" name="4f704a3bcd1a482693137264abc44b99" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2016-09-23T20:56:19" name="50fa9c5b3b5b4263bf4f1752a8b6723c" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2016-12-02T20:50:30" name="57db338f82224c44b492efac75265367" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2017-02-02T05:11:39" name="644b6d9c65e8471a9c99c1fed33a0906" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2017-01-24T11:34:03" name="7f0392828f01467386102ae4b52c3b5a" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2017-01-24T11:55:24" name="ba64d962eb6c460abc9a8628946df116" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2016-09-23T20:56:19" name="c1f2679252c54d92bf54fc71eb1e03f4" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2017-01-24T14:53:24" name="c9fb977bae21432b8b202f13b62285b1" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2016-09-23T20:56:19" name="cea10ad2d9534d0cae21f5950eb7649b" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2016-09-23T20:56:19" name="d8ff66ceab6b4d70a2a6c7251dab56a4" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2017-03-06T16:33:01" name="e66ccfb09b634e0b9ab28f3225c93ce0" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2016-09-23T20:56:19" name="f3f947be65ca4b258e88b600141b85f3" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2017-01-24T15:04:12" name="f42f1387d7d54d7a9228888381d7c30e" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2017-01-24T20:34:03" name="fbc7af608a324a7a9cbbdd415d0a9499" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2017-01-24T11:48:47" name="fc00c8eaa0944a4a98ea2ddbfe54320e" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2017-01-24T11:42:19" name="fef58369046c4a64a2d7564c4e7e1fd0" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2017-01-24T14:57:12" name="ff2a6f87817544a08c82ebcf119bae80" node="true" size="4096"/>
<dataset catalog="catalog" count="1" lastModified="2017-01-24T15:07:22" name="ff2e648104254ee4bcf8db925170ea91" node="true" size="4096"/>
</dataset>
</showCatalog>
</response>

real 0m3.897s
user 0m0.004s
sys 0m0.003s
-bash-4.1$

3a) I brought up this issue with you a year and a half ago. For direct access to tomcat on port 8080, I had installed the recommended /etc/httpd/conf.d/proxy_ajp.conf

<Proxy *>
AddDefaultCharset Off
Order deny,allow
Allow from all
</Proxy>

ProxyPass /opendap ajp://127.0.0.1:8009/opendap
ProxyPassReverse /opendap ajp://127.0.0.1:8009/opendap

3b) There is no other configuration for hydroshare or opendap in /etc/httpd/conf/httpd.conf. I’m checking with Ray to see where the alias hyrax.hydroshare.org is configured. I didn’t set that up.

3c) The redirect of http://hyrax.hydroshare.org:8080/opendap to http://hyrax.hydroshare.org:8080/opendap/ is working for me, although it takes several seconds. But then http://hyrax.hydroshare.org:8080/opendap/ also takes as long.

Summary: the problem subset works for you but not for me, the redirect works for me but not for you, and I’m hoping Ray can tell me about where the subdomain hyrax.hydroshare.org is configured.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Nathan Potter

Mar 30, 2017, 2:53:55 PM
to Calloway, Chris, Nathan Potter, renci_rayi.con, James Gallagher, support@opendap.org support, Hong Yi
Hi Chris,

So much inline below…

N

> On Mar 30, 2017, at 10:20 AM, Calloway, Chris <c...@unc.edu> wrote:
>
> Nathan, paragraph numbers here correspond to the numbers on your previous reply.
>
> Hong, there is a question for you on bullet 2b below.
>
> 1a) I don’t understand why the link works for you. I just clicked on it and got the same Internal Server Error we’ve been seeing. What am I doing wrong?

Browser cache? Maybe try it with curl:

curl -o foo.nc "http://hyrax.hydroshare.org/opendap/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.nc4?latitude\[0:1:194\],longitude\[0:1:461\],time\[0:1:365\],crs,mean\[0:1:365\]\[0:1:194\]\[0:1:461\]"

And if it works you can check it with ncdump:

ncdump -h foo.nc


> 1b) Also, when I run besstandalone, as either root or bes, I get command not found:

I had to be the bes user to make it go.

And you had a typo, you said: “besstandlone" and you meant: “besstandalone"


>
> [root@hyrax01 ~]# besstandlone -c /etc/bes/bes.conf -i /var/log/bes/crusher_bescmd.xml -f /tmp/foo.nc
> -bash: besstandlone: command not found
> [root@hyrax01 ~]# su - bes
> -bash-4.1$ besstandlone -c /etc/bes/bes.conf -i /var/log/bes/crusher_bescmd.xml -f /tmp/foo.nc
> -bash: besstandlone: command not found
> -bash-4.1$ exit
> logout
> [root@hyrax01 ~]#
>
> 1c) If besstandalone is working and writing the nc file for you, I can understand that. We’ve already established that bes itself is writing a correct result with an auto-generated name to /tmp. The problem seems to be that olfs is not picking it up and making a response out of it, correct?

I don’t think that’s the issue because of the error message that we originally were chasing:

message = "fileout.netcdf - Failed to create array of floats for mean: NetCDF: HDF error";

James and I know exactly where this came from and it’s not associated with the OLFS. It seems to be the BES failing to create (via the NetCDF-C library) the tmp file. Once the tmp file is created the NetCDF part is over and the transmitting to the OLFS is just moving bytes.

The besstandalone app runs the beslistener code as a standalone application and writes the result to stdout or to a file, and it uses all of the machinery (including building the cache file and then reading it back as the response) that the daemon uses.

bescmdln connects via a socket to the running (master) beslistener which is what the OLFS does, and sends the command the OLFS would, and gets back the response on the socket as the OLFS would. It then dumps that response to a file (-f) or stdout.
It pretty much just writes to a stream the stuff it gets from stat(). Typically this is a pretty fast response to build. For example on our test server generating this catalog:

http://test.opendap.org:8080/opendap/

The way I tried on hyrax.hydroshare.org goes pretty quickly:

[centos@ip-172-31-44-215 ~]$ time -p bescmdln -i bescmd.xml -f /dev/null
real 0.08
user 0.00
sys 0.00
[root@ip-172-31-44-215 centos]$ time -p besstandalone -c /etc/bes/bes.conf -i bescmd.xml -f /dev/null
real 0.08
user 0.06
sys 0.01

And a lot slower on hyrax.hydroshare.org (as the bes user operating from /var/log/bes )

bash-4.1$ time bescmdln -i bescmd.xml -f /dev/null
real 0m3.670s
user 0m0.006s
sys 0m0.003s
bash-4.1$ time besstandalone -c /etc/bes/bes.conf -i bescmd.xml -f /dev/null
real 0m3.908s
user 0m0.132s
sys 0m0.015s

The Hyrax code uses the stat() function to gather information from the filesystem. Is there some reason that these irods mounts would be slow to respond to that API?
Yes, I remember and we never sorted it out.

>
> <Proxy *>
> AddDefaultCharset Off
> Order deny,allow
> Allow from all
> </Proxy>
>
> ProxyPass /opendap ajp://127.0.0.1:8009/opendap
> ProxyPassReverse /opendap ajp://127.0.0.1:8009/opendap
>
> 3b) There is no other configuration for hydroshare or opendap in /etc/httpd/conf/httpd.conf. I’m checking with Ray to see where the alias hyrax.hydroshare.org is configured. I didn’t set that up.
>
> 3c) The redirect of http://hyrax.hydroshare.org:8080/opendap to http://hyrax.hydroshare.org:8080/opendap/ is working for me, although it takes several seconds. But then http://hyrax.hydroshare.org:8080/opendap/ also takes as long.

So when you say “it works” does that mean your client actually receives a redirect and then goes to a different URL? Because I get the page, but I don’t get the redirect. In my browser the address bar remains http://hyrax.hydroshare.org:8080/opendap and I see the page. When I use curl I do not get a 302 response from hyrax.hydroshare.org:

[-bash: ~] curl -I http://hyrax.hydroshare.org/opendap
HTTP/1.1 200 OK
Date: Thu, 30 Mar 2017 18:17:43 GMT
X-FRAME-OPTIONS: DENY
Last-Modified: Sun, 19 Mar 2017 05:59:49 GMT
Set-Cookie: JSESSIONID=90233D4174EB6961CB93556ECE5C1C17; Path=/opendap/; HttpOnly
Content-Description: dap_directory
Cache-Control: max-age=0, no-cache, no-store
Content-Type: text/html;charset=ISO-8859-1
Content-Length: 21538

Where I do from my test server:

[-bash: ~] curl -I http://test.opendap.org/opendap
HTTP/1.1 302 Found
Date: Thu, 30 Mar 2017 18:18:31 GMT
Location: /opendap/
Connection: close
Content-Type: text/plain

And from looking at the debug logs I think it’s the case that the URL path that is being passed to the OLFS has an appended “/”. Why? Beats me. The httpd configurations for the hyrax AJP and the tomcat configurations appear to be identical on the two systems.

So I am wondering if there is another agent at work here…
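
The same check as the two `curl -I` calls can be scripted so it is easy to repeat against each host and port (80 through httpd, 8080 direct to Tomcat). This is a small sketch using Python's standard `http.client`, which does not follow redirects on its own. The function name `head_status` is made up for illustration.

```python
import http.client


def head_status(host, port, path):
    """Send one HEAD request, do not follow redirects, return (status, Location)."""
    conn = http.client.HTTPConnection(host, port, timeout=30)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        # Location is None when the server did not send that header.
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Calling `head_status("test.opendap.org", 80, "/opendap")` and the same for hyrax.hydroshare.org on ports 80 and 8080 would show which front end, if any, answers with a 302 and a `Location: /opendap/` header.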

Calloway, Chris

Mar 30, 2017, 4:46:17 PM
to Nathan Potter, renci_rayi.con, James Gallagher, support@opendap.org support, Hong Yi
Tried this on two different machines:

pylantic:~ cbc$ curl -o foo.nc "http://hyrax.hydroshare.org/opendap/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.nc4?latitude\[0:1:194\],longitude\[0:1:461\],time\[0:1:365\],crs,mean\[0:1:365\]\[0:1:194\]\[0:1:461\]"
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 533 100 533 0 0 8 0 0:01:06 0:01:00 0:00:06 144
pylantic:~ cbc$

and both produced the following output in foo.nc:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>500 Internal Server Error</title>
</head><body>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error or
misconfiguration and was unable to complete
your request.</p>
<p>Please contact the server administrator,
c...@renci.org and inform them of the time the error occurred,
and anything you might have done that may have
caused the error.</p>
<p>More information about this error may be available
in the server error log.</p>
</body></html>
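
Since curl saves whatever body the server sends, a 533-byte foo.nc like this one is easy to mistake for data. A quick sketch for telling the cases apart by the first bytes of the file: it assumes the HDF5 signature sits at offset 0 (the usual case for NetCDF-4 files written without a user block) and that classic NetCDF files start with "CDF". The function name and the returned labels are hypothetical.

```python
# The 8-byte HDF5 format signature; NetCDF-4 files are HDF5 files.
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"


def sniff_response(first_bytes):
    """Classify a downloaded file from its leading bytes."""
    if first_bytes.startswith(HDF5_MAGIC):
        return "netcdf4/hdf5"
    if first_bytes[:3] == b"CDF":
        return "netcdf-classic"
    text = first_bytes.lstrip().lower()
    if text.startswith((b"<!doctype html", b"<html")):
        return "html"  # most likely an error page, not data
    return "unknown"
```

For example, `sniff_response(open("foo.nc", "rb").read(64))` would report "html" for the file above without needing ncdump on the client machine.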

I don’t get it, though. If you are able to get the desired response from that URL, which is the “problem” URL, isn’t that the desired fix to the problem (although only for you)?

Sorry about the besstandalone typo. I was just cutting and pasting from your email. Plus, I don’t see so good.

Anyway, now when I run it correctly spelled:

-bash-4.1$ besstandalone -c /etc/bes/bes.conf -i /var/log/bes/crusher_bescmd.xml -f /tmp/foo.nc
-bash-4.1$ ls -l /tmp/foo.nc
-rw-rw-r-- 1 bes bes 131914424 Mar 30 15:32 /tmp/foo.nc
-bash-4.1$

I downloaded it to a machine with NetCDF installed and confirmed foo.nc is the correct output with ncdump. All except for this additional global attribute:

:history = "2017-03-28 21:17:22 GMT Hyrax-1.13.3 http://hyrax.hydroshare.org:8080/opendap/ff2a6f87817544a08c82ebcf119bae80/data/contents/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.nc4?latitude[0:1:194],longitude[0:1:461],time[0:1:365],crs,mean[0:1:365][0:1:194][0:1:461]" ;

Which is the wrong date. I checked the date on the hyrax server and it is not two days behind. I checked the original NetCDF being subsetted and there is no history global attribute in it. Also the timestamp on the original netCDF in /opt/inetcdf_public_hydroshare is Jan 24 09:57, which is not close to a match, so I don’t believe the date is being taken from the file. I would have to conclude then that it is being added by bes and added incorrectly.
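
A small sketch for putting a number on how stale that stamp is. It assumes the history attribute always begins with a `YYYY-MM-DD HH:MM:SS GMT` timestamp, as in the value above; the function name is made up for illustration.

```python
from datetime import datetime, timezone


def history_age_seconds(history, now):
    """Return how many seconds before `now` the history stamp was written."""
    # The first 19 characters are assumed to be 'YYYY-MM-DD HH:MM:SS' in GMT.
    stamp = datetime.strptime(history[:19], "%Y-%m-%d %H:%M:%S")
    stamp = stamp.replace(tzinfo=timezone.utc)
    return (now - stamp).total_seconds()
```

With `now=datetime.now(timezone.utc)`, a freshly generated response should come back with an age of a few seconds. An age of a day or more suggests the bytes were produced by an earlier request.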

Could it be bes doing the caching?

bescmdln -i /var/log/bes/crusher_bescmd.xml -f /tmp/foo.nc

as bes user (after besctl start to get a beslistener) produces the same foo.nc as besstandalone right down to the very same added history attribute.

I’m still stymied that things working for you are not working for me as far as getting the subset through the hyrax web interface.

But the way http://test.opendap.org:8080/opendap/ gives a correct response as well as running both the bescmdln and besstandalone tests in 0.08 seconds leads me to agree with you that there’s something fishy in the fuse mount. You have put the original netCDF on the local filesystem of http://test.opendap.org:8080/opendap/, correct?

What I can do tomorrow is temporarily change the catalog root to a local filesystem directory and put a copy of the offending netCDF in it. That should be definitive about fuse.

As for the opendap redirect, you are correct in stating that no 302 takes place. I just observed opendap without trailing slash work and thought that’s what you meant.

Your wondering about another agent is justified because I don’t get the redirect when looking at http://hyrax01.renci.org:8080/opendap either, with no alias of hydroshare.org or port 80 to 8080 routing occurring.

So I have no idea what is appending a / on the URL being passed to OLFS. Does the bes look at anything other than localhost?

But I will attempt to eliminate (or not) fuse as a problem tomorrow.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Hong Yi

Mar 30, 2017, 7:51:44 PM
to Calloway, Chris, Nathan Potter, Ray Idaszak, James Gallagher, support@opendap.org support
Regarding 2b:

/opt/hydrosharevault and /opt/hydroshareuservault are iRODS fuse-mounted directories that contain all HydroShare resource data hosted in two different iRODS zones. My script looks at these two fuse-mounted directories, picks out only the public netCDF resources, and creates symbolic links in /opt/inetcdf_public_hydroshare that point to those resources in either of the two directories. The reason for using the iRODS fuse mount is simplicity and avoiding file transfer. But if the iRODS fuse mount is the issue here (we'll know for sure tomorrow after Chris tests it), I think we'll probably have to copy files over without depending on the fuse mount.
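
For readers who want a picture of the linking step described above, here is a minimal Python sketch. It is not the actual HydroShare script: the function name and the is_public predicate are hypothetical, since the thread does not say how a resource is determined to be public, and keeping each file's path relative to its vault is an assumption.

```python
import os

def link_public_netcdf(vault_dirs, link_dir, is_public):
    """Symlink every public .nc file found under vault_dirs into link_dir.

    is_public is a caller-supplied (hypothetical) predicate; how the real
    script decides that a resource is public is not stated in the thread.
    """
    created = []
    for vault in vault_dirs:
        for dirpath, _dirnames, filenames in os.walk(vault):
            for name in filenames:
                if not name.endswith(".nc"):
                    continue
                target = os.path.join(dirpath, name)
                if not is_public(target):
                    continue
                # Keep the path relative to the vault (an assumption).
                link = os.path.join(link_dir, os.path.relpath(target, vault))
                os.makedirs(os.path.dirname(link), exist_ok=True)
                if os.path.lexists(link):
                    os.remove(link)  # refresh a stale link of the same name
                os.symlink(target, link)
                created.append(link)
    return sorted(created)
```

Because the links only point into the fuse mounts, every read through the catalog still goes through iRODS fuse, which is why a slow or misbehaving mount would show up in Hyrax responses.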

Thanks,
Hong


Nathan Potter

Mar 31, 2017, 9:21:03 AM
to Hong Yi, Nathan Potter, Calloway, Chris, Ray Idaszak, James Gallagher, support@opendap.org support
Greetings,

James and I had a poke around hyrax.hydroshare.org (h.h.o) and we discovered some very confusing things. As Chris observed, the returned NetCDF file had an incorrect date in the history attribute. James and I observed that this date seems to be fixed: Subsequent repeats of the request to the server returned a file with exactly the same history attribute. This must mean that somehow Hyrax is reading a previously generated response. The cache directory, /var/opt/hyrax, is now on one of these fuse mounts:

[root@hyrax01 hyrax]# df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VGos-LVvar
134G 518M 127G 1% /var

and in that directory we found a number of expired cache files. This in itself is a bad sign, as these files get “unlinked” moments after their creation so that only the running process with the open file handle can access them, and they won’t normally show up in a directory listing ( https://www.gnu.org/software/libc/manual/html_node/Deleting-Files.html ). The fact that some of these files remained in the cache dir is odd. We purged the cache dir of all these files, which included one with a history attribute matching the one we have been getting. We restarted the server, made the request and again got the previously generated response. So where is that coming from? Is it possible that the server is using a different configuration than the one in /etc/bes?
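
The unlink-while-open behavior mentioned above can be demonstrated with a short Python sketch. It assumes a POSIX local filesystem and is only an illustration of the general technique, not the BES cache code; the function name is made up for the example.

```python
import os
import tempfile

def demo_unlinked_file(directory):
    """Create a file, unlink it while it is open, and read it back."""
    fd, path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w+") as handle:
        handle.write("cached response")
        handle.flush()
        os.unlink(path)  # remove the directory entry; the data stays alive
        listed = os.path.basename(path) in os.listdir(directory)
        handle.seek(0)
        still_readable = handle.read()  # the open handle still works
    return listed, still_readable
```

On a local POSIX filesystem the file should no longer appear in the listing while the handle still returns its contents, so leftover files in a cache directory suggest the unlink did not happen or the process died before reaching it.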

I think there might be some real value in testing this with both the cache and data directories on something resembling a traditional filesystem just so we can rule out the fuse mounts, but I also realize that your infrastructure may not make that easy, or even possible.

Sincerely,

Nathan

Calloway, Chris

Mar 31, 2017, 9:38:22 AM
to Nathan Potter, Hong Yi, renci_rayi.con, James Gallagher, support@opendap.org support
/var/opt/hyrax is not a fuse mount. Only /opt/hydrosharevault and /opt/hydroshareuservault are fuse mounts.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Ray Idaszak

Mar 31, 2017, 9:47:23 AM
to Calloway, Chris, Nathan Potter, Hong Yi, James Gallagher, support@opendap.org support
If it is helpful, all iRODS fusemounts on that VM are listed in /etc/mtab as irodsFs entries.
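
A small Python sketch of how those entries could be pulled out of /etc/mtab follows. The sample text is hypothetical (the thread does not show the real entries), and matching "irodsFs" in either the device or the type field is an assumption about where the name appears.

```python
def irods_mounts(mtab_text):
    """Return mount points of mtab entries that mention irodsFs."""
    mounts = []
    for line in mtab_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type = fields[:3]
        if "irodsfs" in device.lower() or "irodsfs" in fs_type.lower():
            mounts.append(mount_point)
    return mounts

# Hypothetical sample; real entries on the VM may look different.
SAMPLE_MTAB = """\
/dev/mapper/VGos-LVvar /var ext4 rw 0 0
irodsFs /opt/hydrosharevault fuse rw,nosuid,nodev 0 0
irodsFs /opt/hydroshareuservault fuse rw,nosuid,nodev 0 0
"""

print(irods_mounts(SAMPLE_MTAB))
```

On the server itself the same function could be fed the real file with irods_mounts(open("/etc/mtab").read()).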

Nathan Potter

Mar 31, 2017, 9:55:33 AM
to Calloway, Chris, Nathan Potter, Hong Yi, renci_rayi.con, James Gallagher, support@opendap.org support
Chris,

Ok, that’s cool. I think it confused me because df didn't report the fuse mounts until I became the user “bes”.

N

Calloway, Chris

Mar 31, 2017, 10:35:05 AM
to Nathan Potter, Hong Yi, renci_rayi.con, James Gallagher, support@opendap.org support
It is confusing because the fuse mounts are not even visible to root. Only the bes user, which is also an irods user, can access them.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Calloway, Chris

Mar 31, 2017, 10:40:38 AM
to Hong Yi, Nathan Potter, renci_rayi.con, James Gallagher, support@opendap.org support
We created the fuse mount because fuse was the only filesystem interface iRODS supported at the time. Now it supports NFS, so we may not have to give up so easily on reading files directly from iRODS.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Hong Yi

Mar 31, 2017, 10:47:48 AM
to Calloway, Chris, Nathan Potter, Ray Idaszak, James Gallagher, support@opendap.org support
I am not sure of the current status of NFS support in iRODS, which is why I did not mention it in my earlier email, but if we can go with NFS, that'd be ideal. Copying files from iRODS is not a good option, since a file could be updated in iRODS, which would leave the copy used by hyrax stale. So hopefully we don't have to go with the file copy option.

Nathan Potter

Mar 31, 2017, 10:49:34 AM
to Hong Yi, Nathan Potter, Calloway, Chris, Ray Idaszak, James Gallagher, support@opendap.org support


One Mystery Solved:

I have a reasonable explanation for the stuck “history” attribute in the files returned by the besstandalone and bescmdln tests using the BES command file /var/log/bes/crusher_bescmd.xml

The value of the history entry is in the command file, and thus is frozen.


> On Mar 31, 2017, at 6:21 AM, Nathan Potter <n...@opendap.org> wrote:
>
> We restarted the server, made the request and again got the previously generated response. So where is that coming from? Is it possible that the server is using a different configuration than the one in /etc/bes?

Calloway, Chris

Mar 31, 2017, 11:04:40 AM
to Nathan Potter, Hong Yi, renci_rayi.con, James Gallagher, support@opendap.org support
I created /opt/tempcatalogroot owned by bes and mode 755.

Into it I copied the offending netCDF file.

I set BES.Catalog.catalog.RootDirectory to /opt/tempcatalogroot in /etc/bes/bes.conf.

I restarted tomcat7 and bes.
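
As a rough sketch of the configuration change in the steps above, the following Python function rewrites one key in bes.conf-style text. It assumes simple one-line key=value entries and is only an illustration; the function name is made up, and editing /etc/bes/bes.conf by hand works just as well.

```python
import re

def set_bes_key(conf_text, key, value):
    """Return conf_text with key=value set.

    Replaces the first uncommented line that assigns the key, or appends
    a new line when the key is not present.
    """
    pattern = re.compile(
        r"^[ \t]*" + re.escape(key) + r"[ \t]*=.*$", re.MULTILINE
    )
    new_line = f"{key}={value}"
    if pattern.search(conf_text):
        return pattern.sub(lambda _match: new_line, conf_text, count=1)
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + new_line + "\n"
```

For example, set_bes_key(text, "BES.Catalog.catalog.RootDirectory", "/opt/tempcatalogroot") would produce the edited text, after which bes (and here tomcat7) still needs a restart to pick up the change.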

The offending netCDF shows up as the sole entry at:

http://hyrax.hydroshare.org/opendap

When I click on the resource (linked to http://hyrax.hydroshare.org/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.html) to get the subsetting form, the response is:

Not Found
The requested URL /NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.html was not found on this server.

OK, maybe an issue with port redirection. So I made it explicit:

http://hyrax.hydroshare.org:8080/opendap

I got the same single entry catalog. When I click on the resource this time, I get the same error, but as a page from OLFS:

HTTP Status 404 - /NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.html
type Status report
message /NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.html
description The requested resource is not available.
Apache Tomcat/7.0.67

OK, maybe an issue with the alias. So I went straight for the host I know:

http://hyrax01.renci.org:8080/opendap

Same catalog. Same 404 from OLFS.

Surely I’m doing something very wrong.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Nathan Potter

Mar 31, 2017, 12:10:49 PM
to Calloway, Chris, Nathan Potter, Hong Yi, renci_rayi.con, James Gallagher, support@opendap.org support

Chris,

You have simply run into the mysterious problem with the redirects that I was going on about earlier (where the URL path appears inside tomcat with a trailing “/” while the URL path sent by the browser does not).


Starting here:

http://hyrax.hydroshare.org:8080/opendap

Breaks all the links.


If you start here:

http://hyrax.hydroshare.org:8080/opendap/

Everything works.



And it is noticeably faster.
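
Why the missing trailing slash breaks the links can be seen with standard URL resolution. The sketch below assumes the catalog page emits relative links (consistent with the 404 URL Chris reported, but not confirmed from the page source) and uses Python's urljoin to resolve one against both starting URLs.

```python
from urllib.parse import urljoin

# Assumed relative link, as the catalog page might emit it.
link = "NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.html"

without_slash = urljoin("http://hyrax.hydroshare.org:8080/opendap", link)
with_slash = urljoin("http://hyrax.hydroshare.org:8080/opendap/", link)

print(without_slash)
# http://hyrax.hydroshare.org:8080/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.html
print(with_slash)
# http://hyrax.hydroshare.org:8080/opendap/NLDAS_NOAH0125_D_002_EVPsfc_CONUS.nc.html
```

Without the slash, "opendap" is treated as the last path segment and replaced by the link, which yields the URL outside the /opendap context that returned 404.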


N

Calloway, Chris

Mar 31, 2017, 12:51:06 PM
to Nathan Potter, Hong Yi, renci_rayi.con, James Gallagher, support@opendap.org support
SMH.

Yes, it is lightning fast (I sit on the same GigE network as the server).

And the history attribute is correct.

Thanks!

Other than the crazy lack of redirect for opendap, I think RENCI should take it from here.

I did an ls /inetcdf_public_hydroshare this morning and it was slower than slow. File names were creeping by on the terminal.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Nathan Potter

Mar 31, 2017, 3:24:57 PM
to Calloway, Chris, Nathan Potter, Hong Yi, renci_rayi.con, James Gallagher, support@opendap.org support
Chris,

UPDATE (Just an FYI)

Redirect code is being utilized for subdir access:


curl -I "http://hyrax.hydroshare.org/opendap/dirBoo"
HTTP/1.1 302 Found
Date: Fri, 31 Mar 2017 19:21:12 GMT
X-FRAME-OPTIONS: DENY
Last-Modified: Fri, 31 Mar 2017 23:20:19 GMT
Set-Cookie: JSESSIONID=EFB1CE3A36AD3C0281B18DCB223CF170; Path=/opendap/; HttpOnly
Location: /opendap/hyrax/dirBoo/
Content-Type: text/plain

But NOT for top dir:

[-bash: ~/OPeNDAP/hyrax] curl -I "http://hyrax.hydroshare.org/opendap"
HTTP/1.1 200 OK
Date: Fri, 31 Mar 2017 19:23:14 GMT
X-FRAME-OPTIONS: DENY
Last-Modified: Fri, 31 Mar 2017 23:18:51 GMT
Set-Cookie: JSESSIONID=6313F24CB4552376ABD2DB667CE7D3BD; Path=/opendap/; HttpOnly
Content-Description: dap_directory
Cache-Control: max-age=0, no-cache, no-store
Content-Type: text/html;charset=ISO-8859-1
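
The same check can be scripted. Below is a minimal Python sketch that sends a HEAD request without following redirects, roughly what curl -I does, and reports whether the server answered with a redirect to a path ending in "/". The helper names are made up for the example, and the sketch ignores URLs that carry credentials.

```python
import http.client
from urllib.parse import urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)

def is_slash_redirect(status, location):
    """True if the response redirects to a Location ending in '/'."""
    return status in REDIRECT_CODES and bool(location) and location.endswith("/")

def head(url, timeout=10):
    """Send a HEAD request, do not follow redirects; return (status, Location)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

With the responses shown above, is_slash_redirect(*head("http://hyrax.hydroshare.org/opendap/dirBoo")) would be true for the subdirectory and false for the top-level /opendap, which answered 200 directly.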


Calloway, Chris

Mar 31, 2017, 3:39:04 PM
to Nathan Potter, Hong Yi, renci_rayi.con, James Gallagher, support@opendap.org support
Yes, now I believe I remember this being the case when we looked at it a year ago.

--
Sincerely,

Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530


Nathan Potter

Mar 31, 2017, 4:56:36 PM
to Calloway, Chris, Nathan Potter, Hong Yi, renci_rayi.con, James Gallagher, support@opendap.org support


Chris,

And now I know the redirect issue has something to do with Tomcat. I was running 7.0.57 and the redirect worked. I went to 7.0.67 (same as h.h.o) and *boom* broken root page redirect:

[-bash: ~/OPeNDAP/hyrax] curl -I http://localhost:8080/opendap
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
X-FRAME-OPTIONS: DENY
Last-Modified: Sat, 25 Feb 2017 03:07:53 GMT
Set-Cookie: JSESSIONID=5D7701CB49A2E80AC999E4EB432690FB; Path=/opendap/; HttpOnly
Content-Description: dap_directory
Cache-Control: max-age=0, no-cache, no-store
Content-Type: text/html;charset=ISO-8859-1
Content-Length: 14692
Date: Fri, 31 Mar 2017 20:55:09 GMT

So now that it’s broken on my dev system, maybe I will be able to sort it out.


Thanks,

Nathan

Nathan Potter

Apr 1, 2017, 10:46:28 AM
to Calloway, Chris, Nathan Potter, Hong Yi, renci_rayi.con, James Gallagher, support@opendap.org support

Woot! I got the redirect sorted.

http://hyrax.hydroshare.org/opendap

Now correctly redirects to:

http://hyrax.hydroshare.org/opendap/

I just started crawling around in the Tomcat changelog: http://tomcat.apache.org/tomcat-7.0-doc/changelog.html

And in release 7.0.66 I found this:

"Move the functionality that provides redirects for context roots and directories where a trailing / is added from the Mapper to the DefaultServlet. This enables such requests to be processed by any configured Valves and Filters before the redirect is made. This behaviour is configurable via the mapperContextRootRedirectEnabled and mapperDirectoryRedirectEnabled attributes of the Context which may be used to restore the previous behaviour. (marks)"


And in release 7.0.67 I found

https://bz.apache.org/bugzilla/show_bug.cgi?id=58660 - "Correct a regression in 7.0.66 caused by the change that moved the redirection for context roots from the Mapper to the Default Servlet. (marks)"


The workaround patch is simple. In the file /usr/tomcat7/conf/context.xml I changed the line:

<Context>

to:

<Context mapperContextRootRedirectEnabled="true" mapperDirectoryRedirectEnabled="true" >

And restarted Tomcat. I will have to look for a better patch that doesn’t require a Tomcat configuration mod, but in the meantime this works great.
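
For anyone who wants to apply the same workaround from a script, here is a minimal Python sketch that sets the two attributes on the root <Context> element. The function name is made up for the example, and note one caveat: the standard library parser drops XML comments, so hand-editing may be preferable if the comments in context.xml should be kept.

```python
import xml.etree.ElementTree as ET

REDIRECT_ATTRS = {
    "mapperContextRootRedirectEnabled": "true",
    "mapperDirectoryRedirectEnabled": "true",
}

def enable_mapper_redirects(context_xml):
    """Return context.xml text with both redirect attributes set on <Context>."""
    root = ET.fromstring(context_xml)
    if root.tag != "Context":
        raise ValueError("expected <Context> as the root element")
    for name, value in REDIRECT_ATTRS.items():
        root.set(name, value)
    return ET.tostring(root, encoding="unicode")
```

The function works on text, so the caller would read /usr/tomcat7/conf/context.xml, write the result back, and restart Tomcat as described above.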

Thank you all for all your patience!

Sincerely,


Nathan

Calloway, Chris

Apr 1, 2017, 12:05:35 PM
to Nathan Potter, Hong Yi, renci_rayi.con, James Gallagher, support@opendap.org support
Wow Nathan, you are awesome! Even working on a Saturday.

I think once we get NFS wired up, we should be good to go. And that's on me.

Have a great rest of the weekend.

Super cheers, Chris

Sent from my iPhone