[Condor-users] credd issues: heterogenous system MAC-central; WIN-execute + EC2 (win) when this works


Jason Herman

Aug 18, 2011, 7:19:07 PM
to condor...@cs.wisc.edu
hi-

Here are the machines i'm setting up:

1) Mac (Intel OS X) - as Condor central server
2) Parallels VM running Windows within the Mac as execute machine
3) separate Windows desktop
4) after everything else works: EC2 Windows machines - i suppose running as a cluster that attaches as a flock. (perhaps with cyclecomputing)

I have tried (for days):
* playing with various configurations of condor_config & condor_config.local on both machines.
* taking down firewalls on both sides.
* reading manuals, googling, etc.
* running condor_store_cred with various settings on both sides

STATUS:
So far I have Condor up and running on the MAC as an execute, submit, manage installation. I successfully ran a test job. The windows execute node is up but i can't test it until i get credd security working properly (i think that's the problem). I can see the windows and mac slots from both sides (see below).

When i submit a job from MAC that has windows requirements it doesn't run. Presently, condor_q -analyze says "not yet been considered by the matchmaker" and "match but reject the job for unknown reasons." Under a previously attempted configuration it was "reject your job because of their own requirements"; the Windows slot would go to 'Matched', but the job would stay Idle and the logs would suggest a security issue.

I can't even condor_rm the Idle jobs on the MAC side. I'm guessing that their being matched to Windows ceded control of them:
------
jimi:~ root# condor_q


-- Submitter: jimi.westell.com : <169.254.177.117:49371> : jimi.westell.com
ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
 11.0   Jason           8/17 22:10   0+01:46:05 I  0   0.0  sample-job 60     
 13.0   Jason           8/18 01:12   0+01:24:43 I  0   0.0  sample-job 60     
 14.0   Jason           8/18 01:24   0+00:02:49 I  0   0.0  sample-job 60     
 15.0   Jason           8/18 01:53   0+00:00:00 I  0   0.0  sample-job 60     

4 jobs; 4 idle, 0 running, 0 held

jimi:~ root# condor_rm 11.0
AUTHENTICATE:1003:Failed to authenticate with any method
No result found for job 11.0
------


CONFIGURATIONS:


-------- condor_config.local on MAC:
--------
  CREDD_HOST = 10.211.55.10
  STARTER_ALLOW_RUNAS_OWNER = True
  CREDD_CACHE_LOCALLY = True
  ALLOW_CONFIG = root@$(CONDOR_HOST), *
  SEC_CONFIG_NEGOTIATION = REQUIRED
  SEC_CONFIG_AUTHENTICATION = REQUIRED
  SEC_CONFIG_ENCRYPTION = REQUIRED
  SEC_CONFIG_INTEGRITY = REQUIRED
  SEC_PASSWORD_FILE = /usr/local/condor/etc/pool_password

-------- condor_config.local on Windows:
--------
CREDD_HOST = xx.xxx.55.10
  STARTER_ALLOW_RUNAS_OWNER = True
  CREDD_CACHE_LOCALLY = True
  SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD
  ALLOW_CONFIG = *
  SEC_CONFIG_NEGOTIATION = REQUIRED
  SEC_CONFIG_AUTHENTICATION = REQUIRED
  SEC_CONFIG_ENCRYPTION = REQUIRED
  SEC_CONFIG_INTEGRITY = REQUIRED

------- condor_config on Windows
------- i made this low security just try to get it working:
-------
ALLOW_WRITE = *
ALLOW_READ = *
#... not sure what else you need to see


LOG FILES:

--------- CredLog - on windows
--------- this is after turning MAC & WIN firewalls off - not a perm solution, but not working anyway:
---------
08/18/11 14:42:18 Failed to start non-blocking update to <xxx.xxx.1.21:9618>.
08/18/11 14:42:18 Return from Handler <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> 0.0000s
08/18/11 14:47:18 Calling Handler <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> (2)
08/18/11 14:47:18 Return from Handler <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> 0.0000s
08/18/11 14:47:18 Calling Handler <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> (2)
08/18/11 14:47:18 SECMAN: required authentication with <xxx.xxx.1.21:9618> failed, so aborting command UPDATE_AD_GENERIC.
08/18/11 14:47:18 ERROR: SECMAN:2004:Failed to create security session to <xxx.xxx.1.21:9618> with TCP.
|AUTHENTICATE:1003:Failed to authenticate with any method
08/18/11 14:47:18 Failed to start non-blocking update to <xxx.xxx.1.21:9618>.
08/18/11 14:47:18 Return from Handler <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> 0.0000s
08/18/11 14:52:39 attempt to connect to <xxx.xxx.1.21:9618> failed: timed out after 20 seconds.
08/18/11 14:52:39 Calling Handler <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> (2)
08/18/11 14:52:39 ERROR: SECMAN:2004:Failed to create security session to <xxx.xxx.1.21:9618> with TCP.
|SECMAN:2003:TCP connection to <xxx.xxx.1.21:9618> failed.
08/18/11 14:52:39 Failed to start non-blocking update to <xxx.xxx.1.21:9618>.
08/18/11 14:52:39 Return from Handler <SecManStartCommand::WaitForSocketCallback UPDATE_AD_GENERIC> 0.0000s

--------- MasterLog - on windows
---------
---------
08/18/11 14:51:50 condor_read(): timeout reading 21 bytes from <10.211.55.10:53043>.
08/18/11 14:51:50 IO: Failed to read packet header
08/18/11 14:51:50 store_pool_cred: failed to receive all parameters


COMMAND LINE OUTPUT:

---------- condor_status - on windows
---------- Manual says to run this when you are done, doesn't mention the command 
---------- only works on the windows side:
C:\Users\Administrator>condor_status -f "%s\t" Name -f "%s\n" ifThenElse(isUndefined(LocalCredd),\"UNDEF"\",LocalCredd)
slot1@JASONHERMANB752   UNDEF
sl...@jimi.westell.com  UNDEF
slot2@JASONHERMANB752   UNDEF
sl...@jimi.westell.com  UNDEF
sl...@jimi.westell.com  UNDEF
sl...@jimi.westell.com  UNDEF
sl...@jimi.westell.com  UNDEF
sl...@jimi.westell.com  UNDEF
sl...@jimi.westell.com  UNDEF
sl...@jimi.westell.com  UNDEF
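
The check above can also be scripted. This is a minimal sketch, assuming the two-column output of that condor_status command has been saved as text; the function name `slots_without_credd` and the third sample line (a slot with a value set) are hypothetical and only there for illustration.

```python
def slots_without_credd(status_output):
    """Return the slot names whose second column is the literal UNDEF."""
    missing = []
    for line in status_output.splitlines():
        fields = line.split()
        # Expect exactly two whitespace-separated columns: slot name, LocalCredd.
        if len(fields) == 2 and fields[1] == "UNDEF":
            missing.append(fields[0])
    return missing


# The first two lines are copied from the output above; the last is hypothetical.
sample = """\
slot1@JASONHERMANB752   UNDEF
slot2@JASONHERMANB752   UNDEF
slot1@examplehost       credd.example.org
"""

print(slots_without_credd(sample))
```

An empty list would mean every slot reports some value for LocalCredd.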


------- condor_status - MAC (identical on windows)
-------
-------
jimi:log root# condor_status

Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

sl...@jimi.westell OSX        X86_64 Unclaimed Idle     0.210  1024  0+19:09:01
sl...@jimi.westell OSX        X86_64 Unclaimed Idle     0.000  1024  1+11:24:12
sl...@jimi.westell OSX        X86_64 Unclaimed Idle     0.000  1024  1+03:18:37
sl...@jimi.westell OSX        X86_64 Unclaimed Idle     0.000  1024  0+23:14:03
sl...@jimi.westell OSX        X86_64 Unclaimed Idle     0.000  1024  0+15:05:52
sl...@jimi.westell OSX        X86_64 Unclaimed Idle     0.000  1024  0+11:04:54
sl...@jimi.westell OSX        X86_64 Unclaimed Idle     0.000  1024  0+06:59:54
sl...@jimi.westell OSX        X86_64 Unclaimed Idle     0.000  1024  1+15:27:42
slot1@JASONHERMANB WINNT60    INTEL  Unclaimed Idle     0.120  1023  0+00:00:04
slot2@JASONHERMANB WINNT60    INTEL  Unclaimed Idle     0.100  1023  0+00:00:02
                    Total Owner Claimed Unclaimed Matched Preempting Backfill

      INTEL/WINNT60     2     0       0         2       0          0        0
         X86_64/OSX     8     0       0         8       0          0        0

              Total    10     0       0        10       0          0        0


-------- condor_store_cred on Windows:
--------
--------
C:\Users\Administrator>condor_store_cred -c add
Account: condor_pool@JASONHERMANB752

Enter password:

Operation failed.
   Make sure you have CONFIG access to the target Master.


thanks kindly for any assistance, jason


Tomas Lidén

Aug 19, 2011, 2:36:00 AM
to Condor-Users Mail List

Hi Jason,

 

I had similar problems – but was running Windows machines only. In my case it was important to include the port number in CREDD_HOST. We use the following settings:

 

LOCAL_CREDD = $(CONDOR_HOST)

CREDD_HOST = $(LOCAL_CREDD):$(CREDD_PORT)

 

After that it was important to add the pool passwords correctly on all machines and, of course, the user passwords. Finally, execute “condor_reconfig -all” and make sure that the LocalCredd flag is set; see the manual: http://www.cs.wisc.edu/condor/manual/v7.6/6_2Microsoft_Windows.html.
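
To see what the two CREDD settings above resolve to, here is a rough sketch that expands `$(NAME)` references the way the configuration does in simple cases. It is a simplification, not Condor's own macro expander. The host name and the CREDD_PORT value are assumptions for illustration only, and `expand` is a hypothetical helper.

```python
import re

# Matches $(NAME) references in a configuration value.
MACRO = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def expand(value, settings, max_passes=10):
    """Substitute $(NAME) references repeatedly until nothing changes."""
    for _ in range(max_passes):
        # Unknown names are left untouched so the gap stays visible.
        expanded = MACRO.sub(lambda m: settings.get(m.group(1), m.group(0)), value)
        if expanded == value:
            break
        value = expanded
    return value


settings = {
    "CONDOR_HOST": "credd-host.example.org",  # hypothetical host name
    "CREDD_PORT": "9620",                     # assumed port; use your pool's value
    "LOCAL_CREDD": "$(CONDOR_HOST)",
    "CREDD_HOST": "$(LOCAL_CREDD):$(CREDD_PORT)",
}

resolved = expand(settings["CREDD_HOST"], settings)
print(resolved)
```

If the printed value has no `:port` suffix, or still contains an unexpanded `$(...)`, the setting is not what the text above describes.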

 

For one of my machines I had trouble that looked like credd issues, but it turned out that the host was not correctly registered in the domain (in Active Directory). Removing it and re-adding it solved that problem.

 

Hope that can be of some help.

/Tomas

Timothy St. Clair

Aug 23, 2011, 9:46:44 AM
to Condor-Users Mail List
If your VM session exists only to run jobs, have you tried setting your
START expression to TRUE?

You should not need a credd unless you are running as owner, which is
not the default.

Also your CREDD_HOST *must be* a windows machine. It may be too early in
the a.m., but I can't discern from the logs below if that is the case.

Cheers,
Tim

> _______________________________________________
> Condor-users mailing list
> To unsubscribe, send a message to condor-use...@cs.wisc.edu with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/condor-users/


Jason Herman

Aug 24, 2011, 11:43:44 PM
to Condor-Users Mail List
Please, anybody!!!! help! i've been at it for days to little avail. 

Condor is running smoothly on my mac (central manager, submit, & execute).
Condor is also running on my Windows box (submit & execute).

However i need to configure CREDD on the windows box and can't set the pool password from the MAC side.
IS THIS A FEASIBLE CONFIGURATION? anyone have it working?

fyi, the windows machine is running in Parallels on the MAC. IP addresses and hostnames seem to be resolving fine.

I have followed the manuals on installing CREDD, password authentication, and tried endless configurations.


**************
I can set the pool password from the Windows side:

C:\Users\Administrator>condor_store_cred add -c
Account: condor_pool@JASONHERMANB752

Enter password:

Operation succeeded.

But I can't set the pool password from the MAC side:

jimi:condor root# condor_store_cred add -c
Account: condo...@jimi.westell.com


Enter password: 

Operation failed.
   Make sure you have CONFIG access to the target Master.

*****************

I am really beyond wits' end with the obscurity of this! How could i not have CONFIG 
access to the target when i included "*" on both ends??!!



**********************
WINDOWS CONFIG:

CREDD_DEBUG = D_ALL

LOCAL_CREDD = windows_hostname
CREDD_HOST = windows_hostname:$(CREDD_PORT)

CREDD_CACHE_LOCALLY = True
#
STARTER_ALLOW_RUNAS_OWNER = True
#
ALLOW_CONFIG = Administrator@*, root@*, windows_IP, mac_IP, *@mymac_hostname, *
#SEC_CLIENT_AUTHENTICATION_METHODS = FS, NTSSPI, PASSWORD, ANONYMOUS
#SEC_CONFIG_NEGOTIATION = REQUIRED
#SEC_CONFIG_AUTHENTICATION = REQUIRED
#SEC_CONFIG_ENCRYPTION = REQUIRED
#SEC_CONFIG_INTEGRITY = REQUIRED
##

***********************
MAC CONFIG:

##
LOCAL_CREDD = windows_hostname
CREDD_HOST = windows_hostname:$(CREDD_PORT)


STARTER_ALLOW_RUNAS_OWNER = True
CREDD_CACHE_LOCALLY = True
##


## You'll also need to ensure that clients are configured to use
## PASSWORD authentication on any machine that can run jobs as the
## submitting user. For example,
###### duplicate line with below:
  SEC_CLIENT_AUTHENTICATION_METHODS = FS, NTSSPI, PASSWORD, ANONYMOUS
##
## And finally, you'll need to enable CONFIG-level access for all
## machines in the pool so that the pool password can be stored:
##


##
  ALLOW_CONFIG = mac_ip, windows_ip, Administrator@*, root@*, *
#   SEC_CONFIG_NEGOTIATION = REQUIRED
#   SEC_CONFIG_AUTHENTICATION = REQUIRED
#   SEC_CONFIG_ENCRYPTION = REQUIRED
#   SEC_CONFIG_INTEGRITY = REQUIRED
##
***************************


thanks, J

Timothy St. Clair

Aug 25, 2011, 9:46:04 AM
to Condor-Users Mail List
try adding -debug to the command line for the tool

+ check the CONFIG_LOG output on windows.

I'm not certain, but I don't believe it's allowed to try to set the pool
passwd from a non-windows machine. I will have to run a test to
verify.

Cheers,
Tim

Michael O'Donnell

Aug 25, 2011, 10:13:03 AM
to Condor-Users Mail List
I am probably missing something from previous emails but for clarification I am assuming the CREDD is running on the windows machine and not the central manager in your case. Second, why is the pool domain different when you set the pool password on windows and the MAC? We only use windows in our pool, but maybe this will help.

Windows: condor_pool@JASONHERMANB752
MAC: condo...@jimi.westell.com

Should these not be the same?

mike






Jason Herman

Aug 25, 2011, 1:33:35 PM
to tstc...@redhat.com, Condor-Users Mail List
i'll try -debug and check CONFIG_LOG.

the docs say the pool password needs to be set on all machines. The main reason i'm setting up CREDD at all is so that i can submit jobs to the WIN box from the MAC side. my understanding is that all computers need to share the pool password so their daemons can communicate. Then further i need to have the same logon/password accounts on both mac & win so from mac i can run_as_owner on the WIN box. am i understanding that correctly?

regards, jason


Jason Herman

Aug 25, 2011, 1:36:32 PM
to Condor-Users Mail List
yes, CREDD IS running on the WIN box, and not the central manager. MAC is my central manager. JASONHERMANB752 and jimi.westell.com are hostnames. are they also 'pool domain's and must they be the same?

thanks, jason







Timothy St. Clair

Aug 25, 2011, 3:12:54 PM
to Condor-Users Mail List
inline.

On Thu, 2011-08-25 at 13:33 -0400, Jason Herman wrote:
> i'll try -debug and check CONFIG_LOG.
>
>
> the docs say the pool password needs to be set on all machines. The
> main reason i'm setting up CREDD at all is so that i can submit jobs to
> the WIN box from the MAC side. my understanding is that all computers
> need to share the pool password so their daemons can communicate. Then
> further i need to have the same logon/password accounts on both mac &
> win so from mac i can run_as_owner on the WIN box. am i understanding
> that correctly?

The only way that would be true is if your entire single sign on was
validating against Active Directory (usually not the case unless you are
primarily a windows shop), and you wish to run-as-owner. (The 2
combined features are extremely rare).

You can most certainly submit jobs from the mac side and have them run
on windows, you only need to set your requirements correctly. You will
need to verify your ALLOW_* is correct.

e.g. Linux submit -> windows run
--------------------------------
universe = vanilla
executable = your_bat_file.bat
arguments = 2
requirements = ( Arch=="X86_64") && ( OpSys=="WINNT51" || OpSys=="WINNT52" || OpSys=="WINNT61" )

# need the line below
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
iwd = /tmp
queue 20
--------------------------------
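
As a quick sanity check, the sketch below mirrors the example requirements expression in plain Python and applies it to the Arch/OpSys pairs from the condor_status listing earlier in this thread. It is not a ClassAd evaluator: it assumes exact string casing, and `example_requirements` is a hypothetical name.

```python
def example_requirements(arch, opsys):
    """Mirror of: Arch=="X86_64" && (OpSys is WINNT51, WINNT52 or WINNT61)."""
    return arch == "X86_64" and opsys in {"WINNT51", "WINNT52", "WINNT61"}


# Arch/OpSys pairs as reported by condor_status earlier in this thread.
pool_slots = {
    "jimi.westell (Mac)": ("X86_64", "OSX"),
    "JASONHERMANB (Windows)": ("INTEL", "WINNT60"),
}

for name, (arch, opsys) in pool_slots.items():
    print(name, example_requirements(arch, opsys))
```

With the values listed in this thread neither slot type satisfies the example expression, so the Arch and OpSys strings would need to be changed to whatever condor_status reports for the target machines.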

Cheers,
Tim
