generating metadata from text file


aape...@ncsu.edu

Apr 12, 2016, 11:23:09 AM
to iRODS-Chat
I have an iRODS metadata question.
I have a text file, /var/lib/irods/scratch/outputFileA.
It has two lines:
line 1:   Header stuff
line 2:    384    9

How can I write a rule that adds metadata with the attribute-value pairs "row2col1, 384" and "row2col2, 9"? This metadata is to be applied to /var/lib/irods/iRODS/Vault/ncsuResc/home/irodsUser/inputFileB.

Reagan Moore

Apr 12, 2016, 12:33:12 PM
to irod...@googlegroups.com
Write a rule that uses the file's data grid (logical) name, creates key-value pairs for the metadata, and adds the metadata to the file:

testrule {
  *Path = "/$rodsZoneClient/home/$userNameClient/inputFileB";

  msiAddKeyVal(*Keyval, "row2col1", "384");

  msiAddKeyVal(*Keyval, "row2col2", "9");

  msiAssociateKeyValuePairsToObj(*Keyval,*Path,"-d");

}
INPUT null
OUTPUT ruleExecOut

Or use the imeta command to add the metadata interactively.
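
For the imeta route, a small script can read the values out of the text file and print the commands to run. This is only a sketch, not something from the thread: it assumes the file layout from the question (a header line, then whitespace-separated values on line 2), and the zone name tempZone and the helper names are placeholders.

```python
import shlex


def read_row2_values(text):
    """Return the whitespace-separated fields on the second line of the text."""
    lines = text.splitlines()
    if len(lines) < 2:
        raise ValueError("expected a header line followed by a data line")
    return lines[1].split()


def imeta_commands(logical_path, values):
    """Build one 'imeta add -d' command per column of row 2."""
    commands = []
    for index, value in enumerate(values, start=1):
        attribute = "row2col%d" % index
        commands.append(
            "imeta add -d {} {} {}".format(
                shlex.quote(logical_path),
                shlex.quote(attribute),
                shlex.quote(value),
            )
        )
    return commands


if __name__ == "__main__":
    # Sample content matching the question; tempZone is a placeholder zone name.
    sample = "Header stuff\n   384    9\n"
    values = read_row2_values(sample)
    for command in imeta_commands("/tempZone/home/irodsUser/inputFileB", values):
        print(command)
```

The target is given as a logical path in the zone, not as a path inside the Vault directory.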

Reagan Moore

--
--
"iRODS: the Integrated Rule-Oriented Data-management System; A community driven, open source, data grid software solution" https://www.irods.org
 
iROD-Chat: http://groups.google.com/group/iROD-Chat

---
You received this message because you are subscribed to the Google Groups "iRODS-Chat" group.
To unsubscribe from this group and stop receiving emails from it, send an email to irod-chat+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

aape...@ncsu.edu

Apr 12, 2016, 1:02:04 PM
to iRODS-Chat
Thanks. What I meant was that 384 and 9 are read from outputFileA. I have since found that msiLoadMetadataFromDataObj is the microservice I need, and that rulemsiLoadMetadataFromDataObj.r is the rule I need to master. However, I cannot run this rule: I get an error saying that the msi... service is not found. Further reading tells me I have to load the ERA module to enable this service. However, the documentation says that I have to edit info.txt to enable ERA, and my tree has no such file. How do I enable ERA with 3.0?

Terrell Russell

Apr 12, 2016, 1:30:41 PM
to irod...@googlegroups.com
There are no modules in iRODS 4.x.   They were compile time code snippets that would add functionality to iRODS 3.x.

In 4.x, everything is moving to runtime configuration, rather than compile time.

If you need to add a specific microservice plugin, you'll need to compile it yourself against the development libraries and drop it into /var/lib/irods/plugins/microservices/.  It will be made available immediately to the running server.

Terrell

Reagan Moore

Apr 12, 2016, 1:30:45 PM
to irod...@googlegroups.com
Antoine de Torcy created micro-service plug-ins for msiLoadMetadataFromDataObj.r.  They are available at 

Justin James

Apr 12, 2016, 1:42:26 PM
to iRODS-Chat
Is outputFileA always the file name, or are you intending to monitor every file in the scratch directory? If the name of the file you are reading can change, then what is the rule to map this to your file within your vault?

The way to do this is to write a microservice to monitor the scratch directory. Once you detect a new file, read that file in another microservice. The second microservice can return the KVPs you want to write. Then you can write them in the same way that Reagan did above.

The rule that drives this would need to be a delayed rule that executes periodically. Once a file is read, you can copy it to a processed directory, remove it, etc. Or, if you want to keep the file, you need a way to determine whether the file has been updated. One way I've done this is to store the file checksum (or update time) somewhere in metadata so that when the rule executes it can determine whether the file contents have changed.

Here is an example rule that you could use.  This assumes you are monitoring a single file (inputFileB).  In this example the checksum of inputFileB is saved under the collection corresponding to the zone name.  You would need to write the calculate_checksum() and read_kvp_from_file() microservices.  

monitor_scratch_dir {
   delay( "<PLUSET>30s</PLUSET><EF>3m REPEAT FOR EVER</EF>" ) {

           *path = "/$rodsZoneClient/home/$userNameClient/inputFileB";

           # default to an empty checksum in case none has been stored yet
           *old_checksum = "";
           foreach (*row in select META_COLL_ATTR_VALUE where COLL_NAME = '/$rodsZoneClient' and META_COLL_ATTR_NAME = 'outputFileAChecksum') {
               *old_checksum = *row.META_COLL_ATTR_VALUE;
           }

           # calculate_checksum() and read_kvp_from_file() are microservices you would need to write
           calculate_checksum(*path, *current_checksum);

           if (*current_checksum != *old_checksum) {
               read_kvp_from_file('/var/lib/irods/scratch/outputFileA', *kvp);

               # might want to delete the old KVPs first?
               msiAssociateKeyValuePairsToObj(*kvp, *path, "-d");

               # remove the old checksum (if one was stored) and save the new one
               if (*old_checksum != "") {
                   msiString2KeyValPair("outputFileAChecksum=*old_checksum", *kvp);
                   msiRemoveKeyValuePairsFromObj(*kvp, '/$rodsZoneClient', "-C");
               }

               msiString2KeyValPair("outputFileAChecksum=*current_checksum", *kvp);
               msiAssociateKeyValuePairsToObj(*kvp, '/$rodsZoneClient', "-C");
           }
    } # delay
}

INPUT null
OUTPUT ruleExecOut
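
The change check itself (compare a freshly computed checksum with the stored one) can be tried outside iRODS first. The sketch below only illustrates that comparison: the SHA-256 choice and the function names are assumptions, and it is not the calculate_checksum() microservice mentioned above.

```python
import hashlib
import os
import tempfile


def file_checksum(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path, stored_checksum):
    """Return (changed, current_checksum) for the file at path."""
    current = file_checksum(path)
    return current != stored_checksum, current


if __name__ == "__main__":
    # Demonstrate with a temporary stand-in for outputFileA.
    with tempfile.NamedTemporaryFile("w", delete=False) as handle:
        handle.write("Header stuff\n   384    9\n")
        path = handle.name
    changed, checksum = has_changed(path, "")        # nothing stored yet
    print(changed)
    changed_again, _ = has_changed(path, checksum)   # same content as before
    print(changed_again)
    os.remove(path)
```

An empty stored checksum is treated as "changed", which matches a first run where no checksum has been saved yet.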

aape...@ncsu.edu

Apr 12, 2016, 6:37:40 PM
to iRODS-Chat
I went to the site you listed (thank you), went one directory down, and ran git clone https://github.com/DICE-UNC/microservices.git
This gives me ../microservices/microservices/
      core,  msiCopyAVUMetadata,  msiLoadMetadataFromDataObj, msiLoadMetadataFromXml,  msiSetDataType
I eventually got core to make, after doing "find" in the tree for all of the dependencies and including them in the compile line.
 g++  -I/home/aapeters/downloadGithup/irods/iRODS/lib/api/include -I/home/aapeters/downloadGithup/irods/iRODS/lib/api/include -I/home/aapeters/downloadGithup/irods/iRODS/lib/core/include -I/home/aapeters/downloadGithup/irods/iRODS/server/core/include -I/home/aapeters/downloadGithup/irods/iRODS/server/icat/include -I/home/aapeters/downloadGithup/irods/external/boost_1_58_0z -I/home/aapeters/downloadGithup/irods/iRODS/server/re/include -I/home/aapeters/downloadGithup/irods/iRODS/server/drivers/include -I./include -c -DRODS_SERVER -fPIC -g -Wno-deprecated -Wno-write-strings -o ./obj/era.o ./src/eraUtil.c &> results.txt

Then I go to the directory called msiLoadMetadataFromDataObj (I had to compile core first because it needs core/obj/era.o)
I try to make.  I get the error

g++: libirods_client.a: No such file or directory

I can not "find" this library anywhere.  Please help.  How do I find or make this file, libirods_client.a?

Andrew

Terrell Russell

Apr 12, 2016, 7:02:14 PM
to irod...@googlegroups.com
You need to get the 'irods-dev' package from here for your OS:

And then just run ./packaging/build.sh

You should not have to run/construct any compiler lines yourself.

Terrell




aape...@ncsu.edu

Apr 12, 2016, 9:56:52 PM
to iRODS-Chat
Not quite there yet:

[aapeters@irods01 ~]$ irule -v -F rulemsiLoadMetadataFromDataObj.r
rcExecMyRule: myTestRule {
#Input parameter is:
#  Path name of file containing metadata
#    Format of file is
#    C-collection-name |Attribute |Value |Units
#    Path-name-for-file |Attribute |Value
#    /tempZone/home/rods/test/metadata-target.txt |Chicago |106 |Miles
#Output parameter is:
#  Status
  msiLoadMetadataFromDataObj(*Path,*Status);
  msiGetDataObjAVUs(*Filepath,*Buf);
  writeBytesBuf("stdout",*Buf);
}

outParamDesc: ruleExecOut
ERROR: rcExecMyRule error.  status = -808000 CAT_NO_ROWS_FOUND
Level 0: DEBUG: execMicroService3: error when executing microservice
line 9, col 2
  msiLoadMetadataFromDataObj(*Path,*Status);
  ^

The contents of the *.r file are:
[aapeters@irods01 ~]$ more rulemsiLoadMetadataFromDataObj.r
myTestRule {
#Input parameter is:
#  Path name of file containing metadata
#    Format of file is
#    C-collection-name |Attribute |Value |Units
#    Path-name-for-file |Attribute |Value
#    /tempZone/home/rods/test/metadata-target.txt |Chicago |106 |Miles
#Output parameter is:
#  Status
  msiLoadMetadataFromDataObj(*Path,*Status);
  msiGetDataObjAVUs(*Filepath,*Buf);
  writeBytesBuf("stdout",*Buf);
}
INPUT *Path="/var/lib/irods/iRODS/Vault/ncsuResc/home/aapeters/load-metadata.txt", *Filepath="/var/lib/irods/iRODS/Vault/ncsuResc/home/aapeters/metadata-target.txt"

The content of the load-metadata.txt file is:
[aapeters@irods01 ~]$ sudo more /var/lib/irods/iRODS/Vault/ncsuResc/home/aapeters/load-metadata.txt
/var/lib/irods/iRODS/Vault/ncsuResc/home/aapeters/metadata-target.txt |Chicago |106 |miles

The content of metadata-target.txt is:
[aapeters@irods01 ~]$ sudo more /var/lib/irods/iRODS/Vault/ncsuResc/home/aapeters/metadata-target.txt
#Test to see if attributes are applied

The end of the log file shows:
[aapeters@irods01 ~]$  sudo tail -n 250 /var/lib/irods/iRODS/server/log/rodsLog.2016.04.11|more
#    C-collection-name |Attribute |Value |Units
#    Path-name-for-file |Attribute |Value
#    /tempZone/home/rods/test/metadata-target.txt |Chicago |106 |Miles
#Output parameter is:
#  Status
  msiLoadMetadataFromDataObj(*Path,*Status);
  msiGetDataObjAVUs(*Filepath,*Buf);
  writeBytesBuf("stdout",*Buf);
}
, status = -808000
Apr 12 21:46:13 pid:8631 NOTICE: readAndProcClientMsg: received disconnect msg from client
Apr 12 21:46:13 pid:8631 NOTICE: Agent exiting with status = 0
*** glibc detected *** irodsAgent: corrupted double-linked list: 0x00000000015c9170 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x75e66)[0x7f589471ce66]
/lib64/libc.so.6(+0x76270)[0x7f589471d270]
/lib64/libc.so.6(+0x78958)[0x7f589471f958]
irodsAgent(_ZN5irods12lookup_tableIPNS_14ms_table_entryESsNS_17irods_string_hashEED2Ev+0x5f)[0x7b2fef]
/lib64/libc.so.6(exit+0xe2)[0x7f58946dcb22]
/lib64/libc.so.6(__libc_start_main+0x104)[0x7f58946c5d64]
irodsAgent[0x49af89]
======= Memory map: ========
00400000-009ce000 r-xp 00000000 fd:00 131957                             /var/lib/irods/iRODS/server/bin/irodsAgent
00bcd000-00be2000 rw-p 005cd000 fd:00 131957                             /var/lib/irods/iRODS/server/bin/irodsAgent
00be2000-00e3d000 rw-p 00000000 00:00 0
011a2000-016f2000 rw-p 00000000 00:00 0                                  [heap]

Any ideas???

de Torcy, Antoine

Apr 12, 2016, 11:20:07 PM
to iRODS-Chat

Your paths (both inside your .r file and in load-metadata.txt) should be iRODS logical object paths, not the physical paths of files on disk. Once something is in iRODS you will almost always work with its logical path.
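
To illustrate the difference between the two kinds of path, here is a rough sketch that turns a physical vault path into a logical path. It assumes the default layout, in which the part of the physical path below the resource's vault root mirrors the logical path below the zone; the vault root and the zone name tempZone are placeholders, and the result should be confirmed with ils rather than trusted.

```python
import posixpath


def vault_to_logical(physical_path, vault_root, zone):
    """Map a physical vault path to a logical path, assuming the default layout."""
    physical = posixpath.normpath(physical_path)
    root = posixpath.normpath(vault_root)
    if not physical.startswith(root + "/"):
        raise ValueError("path is not inside the vault root")
    relative = physical[len(root) + 1:]
    return "/" + zone + "/" + relative


if __name__ == "__main__":
    # Placeholder vault root and zone name; substitute your own.
    print(vault_to_logical(
        "/var/lib/irods/iRODS/Vault/ncsuResc/home/aapeters/load-metadata.txt",
        "/var/lib/irods/iRODS/Vault/ncsuResc",
        "tempZone",
    ))
```

Files copied directly into the vault directory without iput are not registered in the catalog, so they have no logical path at all.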


A




aape...@ncsu.edu

Apr 13, 2016, 7:14:06 AM
to iRODS-Chat
Still an issue:

Apr 13 07:09:08 pid:13244 ERROR: [-]    iRODS/server/api/src/rsDataObjOpen.cpp:78:rsDataObjOpen :  status [CAT_NO_ROWS_FOUND]  errno [] -- message [failed in irods::resolve_resource_hierarchy for [/dasi_ncsu/home/aapeters/load-metadata.txt]]
        [-]     iRODS/server/core/src/irods_resource_redirect.cpp:523:resolve_resource_hierarchy :  status [CAT_NO_ROWS_FOUND]  errno [] -- message [resolve_resource_hierarchy :: failed in file_object_factory]
                [-]     iRODS/server/core/src/irods_file_object.cpp:276:file_object_factory :  status [CAT_NO_ROWS_FOUND]  errno [] -- message [failed in call to getDataObjInfoIncSpecColl for [/dasi_ncsu/home/aapeters/load-metadata.txt] CAT_NO_ROWS_FOUND ]

[aapeters@irods01 ~]$ more load-metadata.txt
# Header
/dsai_ncsu/home/aapeters/metadata-target.txt |Chicago |106
[aapeters@irods01 ~]$ iput load-metadata.txt -f

Any idea?  I tried load-metadata.txt without the Header line also.

Reagan Moore

Apr 13, 2016, 9:59:04 AM
to irod...@googlegroups.com
Here is an example.

I have a rule file called test-metaloadpipe.r, which contains:

myTestRule {
# test-metaloadpipe.r
#Input parameter is:
#  Path name of file containing metadata
#    Format of file is
#    C-collection-name |Attribute |Value |Units
#    Path-name-for-file |Attribute |Value
#Example
#    /lifelibZone/home/rwmoore/foo1 |Test |34
#Output parameter is:
#  Status
  *Path= "/$rodsZoneClient/home/$userNameClient/" ++ *Coll;
  writeLine ("stdout", "*Path");
  msiLoadMetadataFromDataObj(*Path,*Status);
  writeLine("stdout", "*Status Loaded metadata from file *Path");
}
INPUT *Coll="Class-INLS624/rules/metapipe1"
OUTPUT ruleExecOut


Note that the micro-service msiLoadMetadataFromDataObj needs the path name to a file in the data grid.  I constructed the path name using session variables for the zone name and my account name.  The variable *Coll is the relative collection path to the file metapipe1 in my home data grid account.  The file metapipe1 contains the metadata that will be loaded.


The file metapipe1 contents are:


/lifelibZone/home/rwmoore/Class-INLS624/rules/test |Last2 |1/20/2016

/lifelibZone/home/rwmoore/Class-INLS624/rules/test |Cr2 |blue


This adds the attribute "Last2" to the file /lifelibZone/home/rwmoore/Class-INLS624/rules/test in the data grid, with the value 1/20/2016, and a second attribute named "Cr2" with the value blue.
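
A metadata file in this pipe-delimited format can be generated rather than typed by hand. The following is a sketch under stated assumptions: the function name is made up, the zone name tempZone is a placeholder, and the only format rule it relies on is the "Path-name-for-file |Attribute |Value" layout shown in the rule comments.

```python
def metadata_lines(logical_path, attributes):
    """Format (attribute, value) pairs as 'path |attribute |value' lines."""
    lines = []
    for name, value in attributes:
        for field in (logical_path, name, value):
            # The format is delimiter-based, so reject fields that would break it.
            if "|" in field or "\n" in field:
                raise ValueError("field contains a delimiter: %r" % field)
        lines.append("{} |{} |{}".format(logical_path, name, value))
    return lines


if __name__ == "__main__":
    # Values from the original question; tempZone is a placeholder zone name.
    pairs = [("row2col1", "384"), ("row2col2", "9")]
    for line in metadata_lines("/tempZone/home/irodsUser/inputFileB", pairs):
        print(line)
```

The generated file still has to be put into the data grid (for example with iput) before msiLoadMetadataFromDataObj can read it.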


Reagan Moore

aape...@ncsu.edu

Apr 14, 2016, 2:12:48 PM
to iRODS-Chat
Thank you.  My error was mistakenly reversing two letters.  Using your example, I was able to work backwards till I found it. 

aape...@ncsu.edu

Jan 4, 2018, 3:39:40 PM
to iRODS-Chat
Hello. In 2016 I got "irule -v -F rulemsiLoadMetadataFromDataObj.r" to work just fine. 

Now, however, I get a "NO_RULE_OR_MSI_FUNCTION_FOUND_ERR" error, and I think the upgrade in 2017 from 4.1.8 to 4.2.0 might have something to do with it. The .so file is present at /var/lib/irods/plugins/microservices/libmsiLoadMetadataFromDataObj.so. Is this failing because it was built for the old version? If so, how do I get and compile a version for 4.2.0? I went to where Terrell told me to go in 2016:
But there is no 4.2.0 directory there. Please give me directions on how I can obtain and compile libmsiLoadMetadataFromDataObj.so for 4.2.0 and "CentOS release 6.7 (Final)".

Regards
Andrew


Jason Coposky

Jan 4, 2018, 4:45:26 PM
to irod...@googlegroups.com

Andrew,

 

The plugin framework went through some changes between 4.1.x and 4.2.x, so the plugin will need to be ported.  Can you point us at the source code?

 

Thanks,

 

------

Jason Coposky
Executive Director, iRODS Consortium
RENCI at the University of North Carolina at Chapel Hill

Terrell Russell

Jan 4, 2018, 6:14:37 PM
to irod...@googlegroups.com
Hi Andrew,

The source is still here -- https://github.com/DICE-UNC/microservices/tree/master/microservices

It has not been ported to use the 4.2 development libraries or CMake or Clang.

Did you use the https://packages.irods.org repository to install 4.2.x itself?
If so, you will only need a 'yum install irods-devel' to get the development libraries necessary for building against/for 4.2+.

Terrell




Andrew Petersen

Jan 4, 2018, 8:41:04 PM
to irod...@googlegroups.com
yes I used the  https://packages.irods.org repository to install 4.2.0. 
Specifically, I did: 

sudo rpm --import https://packages.irods.org/irods-signing-key.asc

wget -qO - https://packages.irods.org/renci-irods.yum.repo | sudo tee /etc/yum.repos.d/renci-irods.yum.repo


"sudo yum install irods-devel" worked. What is the next step?


Regards

Andrew






--
Andrew Petersen, PhD
Data Science Research Specialist
Advanced Computing, Office of Information Technology
2620 Hillsborough Street

Terrell Russell

Jan 4, 2018, 8:51:21 PM
to irod...@googlegroups.com
So you have the development environment ready.

You need to port, and then build the https://github.com/DICE-UNC/microservices/tree/master/microservices code against the new irods-devel library.

An example of a 4.2-ready microservice is here - https://github.com/irods-contrib/irods_microservice_free_microservice_out

Terrell



Andrew Petersen

Jan 4, 2018, 10:05:07 PM
to irod...@googlegroups.com
ok, 
svn co https://github.com/DICE-UNC/microservices.git /var/lib/irods/packages
works, but in /var/lib/irods/packages/trunk/packaging
./build.sh
gives
ERROR :: "irods-dev" package required to build this plugin
If I look in build.sh, I see that it is calling for /usr/lib/libirods_client.a -
# require irods-dev package
if [ ! -f /usr/lib/libirods_client.a ] ; then
echo ""
    echo "ERROR :: \"irods-dev\" package required to build this plugin" 1>&2
    exit 1
fi

Yes I did "sudo yum install irods-devel" before, as you said, and that installed without any issues.

I did a "find" for it and I found it at
/var/lib/irods/iRODS/Vault/ncsuResc/home/aapeters/downloadGithup4.1.3/irods/iRODS/lib/development_libraries/libirods_client.a

Should I modify build.sh so that it finds libirods_client.a at this path?


aape...@ncsu.edu

Jan 5, 2018, 11:47:23 AM
to iRODS-Chat
build.sh is looking for /usr/lib/libirods_client.a. This is not present, however there are several *irods*.so files in /usr/lib, like libirods_client.so and libirods_client.so.4.2.0 
Please advise

Terrell Russell

Jan 5, 2018, 2:37:00 PM
to irod...@googlegroups.com
No - this approach won’t work.  The repo needs to be ported to 4.2-compatibility before it will compile.

We’ll take a look at getting those DICE microservices ported.

Terrell

aape...@ncsu.edu

Jan 8, 2018, 1:36:06 PM
to iRODS-Chat
Ok, I understand. msiSetAVU and msiLoadMetadataFromDataObj are no longer supported, and are replaced by msiString2KeyValPair and msiAssociateKeyValuePairsToObj.

And the list of all currently supported microservices is at:

Thanks
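
For anyone taking the msiString2KeyValPair route, the input string can be assembled from the parsed values. This is a sketch that assumes the microservice accepts key=value pairs separated by "%", which is how I understand its documented input; the helper name is made up.

```python
def keyval_string(pairs):
    """Join (key, value) pairs as 'k1=v1%k2=v2' for msiString2KeyValPair."""
    parts = []
    for key, value in pairs:
        # '%' and '=' are the separators, so they cannot appear in the data.
        if any(ch in key or ch in value for ch in "%="):
            raise ValueError("key or value contains '%' or '='")
        parts.append("{}={}".format(key, value))
    return "%".join(parts)


if __name__ == "__main__":
    print(keyval_string([("row2col1", "384"), ("row2col2", "9")]))
```

The resulting string would be passed as the first argument of msiString2KeyValPair, and the key-value structure it returns would go to msiAssociateKeyValuePairsToObj with the logical path and "-d".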