[slurm-users] seff in slurm-23.02


David Gauchard

May 25, 2023, 12:01:25 PM
to slurm...@schedmd.com
Hello,

slurm-23.02 on ubuntu-20.04,

seff is not working anymore:

```
# ./seff 4911385
Use of uninitialized value $FindBin::Bin in concatenation (.) or string at ./seff line 11.
Name "FindBin::Bin" used only once: possible typo at ./seff line 11, <DATA> line 602.
perl: error: slurm_persist_conn_open: Something happened with the receiving/processing of the persistent connection init message to localhost:6819: Failed to unpack SLURM_PERSIST_INIT message
perl: error: Sending PersistInit msg: Message receive failure
Use of uninitialized value in subroutine entry at ./seff line 58, <DATA> line 602.
perl: error: [...]
```


while using https://github.com/SchedMD/slurm/blob/ce7d569807c495516ebfa6fcef25ad36ccc76827/contribs/seff/seff#LL19C3-L19C124 :

```
# sacct -P -n -a --format JobID,User,Group,State,Cluster,AllocCPUS,REQMEM,TotalCPU,Elapsed,MaxRSS,ExitCode,NNodes,NTasks -j 4911385
4911385|user|part|FAILED|hpc|1|2000M|00:23.041|00:00:31||0:9|1|
4911385.batch|||CANCELLED by 0|hpc|1||00:23.041|00:00:31|5936692K|0:9|1|1
```
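
While seff is unavailable, the pipe-separated sacct line above already carries what is needed for a rough CPU efficiency figure. The following is a minimal sketch, assuming the field order of the `--format` list above and the usual definition TotalCPU / (Elapsed × AllocCPUS); the function names are hypothetical and this is not seff's own code.

```python
def parse_duration(text: str) -> float:
    """Convert an sacct time such as '1-02:03:04', '00:00:31' or '00:23.041' to seconds."""
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    seconds = 0.0
    # Fields are read right to left: seconds, then minutes, then hours.
    for power, field in enumerate(reversed(text.split(":"))):
        seconds += float(field) * 60 ** power
    return days * 86400 + seconds


def cpu_efficiency(sacct_line: str) -> float:
    """Return TotalCPU / (Elapsed * AllocCPUS) as a percentage for one `sacct -P` line."""
    fields = sacct_line.split("|")
    alloc_cpus = int(fields[5])              # AllocCPUS
    total_cpu = parse_duration(fields[7])    # TotalCPU
    elapsed = parse_duration(fields[8])      # Elapsed
    return 100.0 * total_cpu / (elapsed * alloc_cpus)


# Job-level line copied from the sacct output above.
line = "4911385|user|part|FAILED|hpc|1|2000M|00:23.041|00:00:31||0:9|1|"
print(f"CPU efficiency: {cpu_efficiency(line):.2f}%")
```

The sketch only reads the job-level line; the `.batch` step line would need the same treatment if per-step figures are wanted.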

I wonder whether this is an installation error, and whether contribs/seff is
working for other 23.02 users.

Thanks

Mike Robbert

May 25, 2023, 12:34:28 PM
to Slurm User Community List, slurm...@schedmd.com

How did you install seff? I don't know exactly where this happens, but it looks like line 11 in the source file for seff is supposed to be transformed to include an actual path. I am running CentOS and install Slurm by building the RPMs from the included spec files. Here is a diff between the file in the source tree and the file that was installed to /usr/bin/seff:

 

$ diff contribs/seff/seff /usr/bin/seff
11c11
< use lib "${FindBin::Bin}/../lib/perl";
---
> use lib qw(/usr/lib64/perl5);
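
A quick way to tell whether a given copy of seff is the raw source-tree script or an installed one is to look for the unsubstituted `${FindBin::Bin}` path shown in the diff. This is a minimal sketch under that assumption; `uses_placeholder_lib` is a hypothetical helper, and the two sample strings are the lines from the diff above.

```python
import re

# Matches a `use lib` line that still refers to ${FindBin::Bin},
# as on line 11 of the source-tree script in the diff.
PLACEHOLDER = re.compile(r'use\s+lib\s+"\$\{FindBin::Bin\}')


def uses_placeholder_lib(script_text: str) -> bool:
    """Return True if any line of the script still has the unsubstituted library path."""
    return any(PLACEHOLDER.search(line) for line in script_text.splitlines())


source_tree = 'use lib "${FindBin::Bin}/../lib/perl";'
installed = "use lib qw(/usr/lib64/perl5);"

print(uses_placeholder_lib(source_tree))  # True
print(uses_placeholder_lib(installed))    # False
```

To check a real file, the same function can be given the contents of the script, for example `uses_placeholder_lib(open("/usr/bin/seff").read())`.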

 

Mike Robbert

Cyberinfrastructure Specialist, Cyberinfrastructure and Advanced Research Computing

Information and Technology Solutions (ITS)

303-273-3786 | mrob...@mines.edu


Our values: Trust | Integrity | Respect | Responsibility

 

 



David Gauchard

May 25, 2023, 1:04:31 PM
to slurm...@lists.schedmd.com
Well, sorry, I did indeed run the raw script for this mail.
When I run the one installed by `make install`, which sets the path on
line 11 correctly:
use lib qw(/usr/local/slurm-23.02.2/lib/x86_64-linux-gnu/perl/5.30.0);

I get:

perl: error: slurm_persist_conn_open: Something happened with the receiving/processing of the persistent connection init message to localhost:6819: Failed to unpack SLURM_PERSIST_INIT message
perl: error: Sending PersistInit msg: Message receive failure
Use of uninitialized value in subroutine entry at /usr/local/slurm/bin/seff line 57, <DATA> line 564.
perl: error: g_slurm_auth_pack: protocol_version 6500 not supported
perl: error: slurm_send_node_msg: g_slurm_auth_pack: REQUEST_PERSIST_INIT has authentication error: No error
perl: error: slurm_persist_conn_open: failed to send persistent connection init message to localhost:6819
perl: error: Sending PersistInit msg: Protocol authentication error
perl: error: DBD_GET_JOBS_COND failure: Unspecified error
Job not found.

Slurm is otherwise running well after an update from 20.11 -> 21.08 ->
23.02.

# sinfo -V
slurm 23.02.2
# sinfo -O nodehost,Version
HOSTNAMES VERSION
x 23.02.2
x 23.02.2
x 23.02.2
x 23.02.2
x 23.02.2
x 23.02.2
x 23.02.2
x 23.02.2
x 23.02.2
x 23.02.2
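
On a larger cluster, the same `sinfo -O nodehost,Version` output can be summarised rather than read by eye. The sketch below is only an illustration with a hypothetical helper name, and it assumes the two-column layout shown above. It covers the node daemons only: the errors above come from the Perl process talking to port 6819 (the default slurmdbd port), so a consistent result here does not settle that question by itself.

```python
from collections import Counter


def version_counts(sinfo_output: str) -> Counter:
    """Count nodes per reported version, skipping the header line."""
    counts = Counter()
    for line in sinfo_output.strip().splitlines()[1:]:
        parts = line.split()
        if len(parts) >= 2:
            counts[parts[-1]] += 1  # last column is VERSION
    return counts


# Shortened copy of the output above (host names are masked as "x" in the thread).
sample = """HOSTNAMES VERSION
x 23.02.2
x 23.02.2
x 23.02.2
"""

counts = version_counts(sample)
print(counts)
print("consistent" if len(counts) == 1 else "mixed versions")
```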



Angel de Vicente

May 26, 2023, 2:30:36 AM
to David Gauchard, slurm...@schedmd.com, Slurm User Community List
Hello,

David Gauchard <gauc...@laas.fr> writes:

> slurm-23.02 on ubuntu-20.04,
>
> seff is not working anymore:

Perhaps it is something specific to 20.04? I'm on Ubuntu 22.04 with
slurm-23.02.1 here and have no problems with seff, except that the memory
efficiency part seems broken (I always seem to get 0.00% efficiency):

,----
| State: COMPLETED (exit code 0)
| Nodes: 1
| Cores per node: 20
| CPU Utilized: 05:50:07
| CPU Efficiency: 88.41% of 06:36:00 core-walltime
| Job Wall-clock time: 00:19:48
| Memory Utilized: 5.43 GB
| Memory Efficiency: 0.00% of 16.00 B
`----
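
The "16.00 B" above suggests that the requested memory was read without its unit, although nothing in this thread confirms the cause. As a cross-check, memory efficiency can be computed directly from sacct's `MaxRSS` and `ReqMem` strings. This is a minimal sketch with hypothetical helper names, assuming 1024-based K/M/G/T suffixes; it uses the values from David's sacct output earlier in the thread (`5936692K` and `2000M`) and does not handle the per-node or per-CPU suffixes that some sacct versions append.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(value: str) -> float:
    """Convert an sacct memory string such as '2000M' or '5936692K' to bytes."""
    value = value.strip()
    suffix = value[-1].upper() if value else ""
    if suffix in UNITS:
        return float(value[:-1]) * UNITS[suffix]
    return float(value)  # no suffix: treated as plain bytes (an assumption)


def memory_efficiency(max_rss: str, req_mem: str) -> float:
    """Return MaxRSS / ReqMem as a percentage."""
    return 100.0 * to_bytes(max_rss) / to_bytes(req_mem)


print(f"Memory efficiency: {memory_efficiency('5936692K', '2000M'):.1f}%")
```

A value above 100% simply means the step used more memory than was requested.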

--
Ángel de Vicente
Research Software Engineer (Supercomputing and BigData)
Tel.: +34 922-605-747
Web.: http://research.iac.es/proyecto/polmag/

GPG: 0x8BDC390B69033F52