On Oct 17, 7:20 pm, Nathan Blythe <nbly...@gmail.com> wrote:
> Thanks for the info, good to know how to approach the problem. I
> didn't realize you had root access to everybody's machines in Archer;
> I'm actually kind of surprised about that, but I guess it does make
> debugging a lot easier as you don't have to walk every user through
> the details of the VM and condor, when most people are probably just
> interested in getting their jobs running.
If I recall correctly, we explain in the terms-of-use box that Archer
management has admin access to the appliances for troubleshooting; we
should double-check that this is still there. We should also update
the FAQ.
(Users can disable this by removing the Archer admin ssh key from the
appliance, but it's an important tool for us to debug problems).
--rf
>
> Thanks again,
> Nathan
>
> On 10/17/09, David Isaac Wolinsky <davi...@ufl.edu> wrote:
>
>
>
> > Nathan Blythe wrote:
> >> Wow, yes, that fixed it, thanks!
>
> >> I'm curious how you divined that particular solution from my rather
> >> vague description. The linked thread doesn't mention the error I
> >> received; the only connection seems to be the "corruption" message
> >> (which, upon inspection, I did have).
>
> > Hidden away in antiquity:
> > http://www.grid-appliance.org/wiki/index.php/Archer:FAQs#What_privile...
>
> > We can ssh into your machine to assist in debugging. It probably would
> > have been more proper to work through the problem with you rather than
> > log into your machine, but here is what I did:
>
> > - ps uax | grep condor
> >   There was no condor_schedd.
> > - Checked /opt/condor/var/log/
> >   Something was stopping the condor_schedd from starting.
> > - Tried to manually restart the entire condor stack; no luck.
> > - Googled and found a solution, but I didn't want to interfere with
> >   what you were doing, so I left the resolution up to you.
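The first two checks in the list above can be sketched as a small shell script. This is only an illustration, not what was actually run on the appliance: the function names (has_schedd, show_recent_logs, check_condor), the CONDOR_LOG_DIR variable, and the 20-line tail are assumptions made for the example. Only the condor_schedd process name and the /opt/condor/var/log/ path come from the thread.

```shell
#!/bin/sh
# Sketch of the first two diagnostic steps; all names here are hypothetical.

# Log directory mentioned in the thread; override if your install differs.
CONDOR_LOG_DIR="${CONDOR_LOG_DIR:-/opt/condor/var/log}"

# Succeeds if the process listing on stdin contains a condor_schedd entry.
# The [c] bracket keeps grep from matching its own command line.
has_schedd() {
    grep '[c]ondor_schedd' >/dev/null
}

# Print the last 20 lines (an arbitrary choice) of each file in a directory.
show_recent_logs() {
    dir="$1"
    [ -d "$dir" ] || { echo "no log directory: $dir"; return 1; }
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        echo "== $f"
        tail -n 20 "$f"
    done
}

# Combine the two checks: look for the schedd, fall back to the logs.
check_condor() {
    if ps aux | has_schedd; then
        echo "condor_schedd is running"
    else
        echo "condor_schedd is NOT running; recent logs follow"
        show_recent_logs "$CONDOR_LOG_DIR"
    fi
}
```

Running check_condor on the appliance would print either a confirmation or the tail of each log file, which is where a message such as the "corruption" one would be expected to show up.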
>
> > Cheers,
> > David
>
> >> Thanks again.
>
> >> On 10/16/09, David Isaac Wolinsky <davi...@ufl.edu> wrote:
>
> >>> I think this is your problem:
> >>> https://lists.cs.wisc.edu/archive/condor-users/pre-2004-June/msg01095...