I thought setting -allow_tix and -allow_tgs_req would do it, but I can
still get new valid tickets for services from an account with those
flags set.
The krb5kdc.log knows who's asking for the ticket, and it prints out:
Jul 24 02:45:55 blah.com krb5kdc[17432](info): TGS_REQ (4 etypes {18 17
16 23}) 1.1.1.1: ISSUE: authtime 1311493077, etypes {rep=18 tkt=18
ses=18}, a...@BLAH.COM for b...@BLAH.COM
even though a...@BLAH.COM has:
Attributes: DISALLOW_TGT_BASED DISALLOW_ALL_TIX REQUIRES_PRE_AUTH
There must be some way to do this? I totally get the aspect of not
being able to revoke live tickets and sessions, and those having to
expire, but getting new tickets seems like something that should be
disable-able?
The -allow_tgs_req entry on man kadmin seems like it would be what I
want, since the log above says it's a TGS_REQ, but the entry says, "This
option is useless for most things." so I'm obviously misunderstanding
what it does. Yet -allow_tix only seems to prevent tickets from being
issued _FOR_ the princ with it set, so b...@BLAH.COM above, which I don't
want to disable, since it's a service others will be using. I just want
a...@BLAH.COM to stop working.
As a bonus, I'd like services to be able to check if a...@BLAH.COM has an
enabled account, and -allow_tix seems to work for that, since if the
service tries to get a ticket for a...@BLAH.COM it fails.
What am I missing?
Thanks,
Chris
I must be missing something, though, since it seems like this would be
something that's already supported...
Chris
While I'm in the KDC code, I notice this related check in
validate_tgs_request:
/* Server must be allowed to be a service */
if (isflagset(server.attributes, KRB5_KDB_DISALLOW_SVR)) {
*status = "SERVER NOT ALLOWED";
return(KDC_ERR_MUST_USE_USER2USER);
}
Do I want to set -allow_svr on all my clients, since I know they'll only
ever be clients in a client<->server relationship, or u2u with another
client? Is there any reason to or not to set the flag?
Hmm, wait, if I set -allow_svr on b...@BLAH.COM, then it fails even on a
KRB5_GC_USER_USER krb5_get_credentials where b is the creds->server...
Hmm^2, this code is slightly different between 1.9.1 and 1.6.1, or at
least the error return is different, so maybe this was fixed to work
like I think it should after 1.6.1. I need to build my own kdc on CentOS...
Chris
For performance reasons and because of cross-realm authentication, we
don't look up the client principal for TGS requests. That does mean
it's impossible to deny TGS requests based on updated database state for
the client.
You could modify the KDC code locally to do this if you need it. I
don't have any other clever ideas for doing what you want.
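
To make the asymmetry concrete, here is a minimal C sketch of the two request paths as described above. It is a model, not the KDC source: the struct and function names are hypothetical, and the flag value is only a stand-in for KRB5_KDB_DISALLOW_ALL_TIX from kdb.h.

```c
/* Stand-in for KRB5_KDB_DISALLOW_ALL_TIX; the real definition is in kdb.h. */
#define DISALLOW_ALL_TIX 0x00000040

/* Hypothetical, simplified database entry. */
struct entry {
    const char *name;
    unsigned int attributes;
};

/* AS request: both the client and the server entry are looked up, so a
 * flag on either one blocks the request. Returns 1 if allowed. */
int as_req_allowed(const struct entry *client, const struct entry *server)
{
    if (client->attributes & DISALLOW_ALL_TIX)
        return 0;
    if (server->attributes & DISALLOW_ALL_TIX)
        return 0;
    return 1;
}

/* TGS request: only the server entry is consulted. The client identity
 * comes from the already-issued TGT and is never looked up, so a flag set
 * on the client after the TGT was issued has no effect here. */
int tgs_req_allowed(const struct entry *server)
{
    return (server->attributes & DISALLOW_ALL_TIX) == 0;
}
```

In this model, setting the flag on a stops a from getting a new TGT, but a TGS request for b made with a's existing TGT still succeeds, which matches the log line at the top of the thread.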
As for -allow_svr, I think you would want to set that on your user
principals (earlier I suggested -allow_tgs_req, but that's the wrong
flag, since it wouldn't prevent someone from making an AS req to another
user principal and performing an offline dictionary attack). However, I
think you're right that it would conflict with user-to-user
authentication to that principal. In 1.7 we changed the error return
for that case to KDC_ERR_MUST_USE_USER2USER, but I haven't been able to
find code that allows user-to-user requests to such principals.
Nico
--
Would you guys be interested in a patch? It seems very strange and
surprising--as someone just starting to learn and use Kerberos--that you
basically can't disable an account for any new requests to the KDC since
the ban, and the fix seems very simple. I could put it on a profile
bool and default it to off if that would make it easier to accept into core.
I don't understand the cross-realm authentication issue very well,
though, since I haven't really learned that stuff yet. I was going to
use similar code to what's in the as_req path to generate the client and
pass it to validate_tgs_request, hopefully that would "just work"? Hmm,
trying to understand this a bit more, could I just check
is_local_principal and only do the check in that case? I'm trying to be
conservative here...the patch wouldn't ever allow anything the current
code doesn't, it would just disallow more things, which seems safe?
Although, I guess its behavior for cross-realm stuff would have to be
documented well so nobody was misled into thinking it would ban somebody
across realms...? Maybe I name the profile bool something like
check_allow_tix_for_local_client_princs to be clear...
> authentication to that principal. In 1.7 we changed the error return
> for that case to KDC_ERR_MUST_USE_USER2USER, but I haven't been able
> to find code that allows user-to-user requests to such principals.
Yeah, I looked into it a bit farther in 1.9.1 and it definitely looks
like it would still fail even on u2u requests, which seems like a bug, at
least given the new error code for the svr case implying it should work?
Either that or it should be documented.
I don't think I'm qualified to fix that one unless there's some easy way
to check if it's a u2u request in validate_tgs_request, so I'll just
have to leave them +allow_svr for now. I'll look at it more when I'm
fixing the allow_tix one. Hmm, wait, poking around more, it looks like
the if should just be something like this:
/* Server must be allowed to be a service */
if (isflagset(server.attributes, KRB5_KDB_DISALLOW_SVR) &&
!isflagset(request->kdc_options, KDC_OPT_ENC_TKT_IN_SKEY)) {
*status = "SERVER NOT ALLOWED";
return(KDC_ERR_MUST_USE_USER2USER);
}
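
As a standalone illustration of that condition (a sketch with stand-in constants, not the KDC source), the proposed check can be exercised in isolation. The function name check_server_allowed is hypothetical; the numeric values are meant to mirror KRB5_KDB_DISALLOW_SVR, KDC_OPT_ENC_TKT_IN_SKEY and the RFC 4120 error code, but treat them as assumptions here.

```c
/* Stand-ins for the real definitions in kdb.h and krb5.h. */
#define DISALLOW_SVR           0x00001000  /* KRB5_KDB_DISALLOW_SVR */
#define OPT_ENC_TKT_IN_SKEY    0x00000008  /* KDC_OPT_ENC_TKT_IN_SKEY */
#define ERR_MUST_USE_USER2USER 27          /* KDC_ERR_MUST_USE_USER2USER */

/* Returns 0 if the request may proceed, or an error code otherwise.
 * A principal flagged -allow_svr is refused as an ordinary server but
 * accepted when the request is user-to-user (ENC-TKT-IN-SKEY set). */
int check_server_allowed(unsigned int server_attributes,
                         unsigned int kdc_options)
{
    int is_u2u = (kdc_options & OPT_ENC_TKT_IN_SKEY) != 0;

    if ((server_attributes & DISALLOW_SVR) && !is_u2u)
        return ERR_MUST_USE_USER2USER;
    return 0;
}
```

With these definitions, an ordinary request to a -allow_svr principal returns 27, while the same request with the ENC-TKT-IN-SKEY option set returns 0.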
Thanks for putting up with all my mails!
Chris
Shorter-lived x-realm TGTs, thus shorter-lived svc tix :)
Nico
--
What do you do for principals of trusted realms?
Simo.
--
Simo Sorce * Red Hat, Inc * New York
I think performance is still an issue. We definitely still get feedback
about the number of LDAP queries per KDC operation, and TGS requests are
more frequent than AS requests. (At least, they should be. It depends
on how often the KDC is used purely as a password verifier.)
We could add a configuration knob, but I'm still trying to justify the
increased complexity to myself. Preventing a disabled account from
making new TGS requests with a valid TGT seems like closing the barn
door after the horse has escaped, as you have no control over the
service tickets the client already obtained before it was disabled.
It's a reasonable desire to want to flip a switch and make a
particular user instantly unable to affect your environment. That's
the kind of thing you should get from a centrally managed
authentication system, right? Unfortunately, there are three holes
big enough to drive a truck through:
1. The user can keep requesting service tickets until the user's TGT
expires.
2. The user can keep using service tickets until they expire.
3. The user can keep using active sessions until the session is
invalidated somehow (interrupted connection, restarted client or
server).
You're proposing to fix (1), which is the only one of the three which
can be addressed on the KDC. This comes at either a performance cost
(looking up the client every time, if it's a local principal) or a
complexity cost (adding a configuration variable, and making the TGS
validation code paths conditional on whether or not we've looked up
the client). Certainly, addressing (1) would limit the scope of the
things a bad actor could do after account closure, but not completely,
and only if the bad actor didn't anticipate the account closure by
requesting a bunch of service tickets.
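
A rough way to quantify what addressing (1) would buy is to compare the latest time a service ticket could still be valid in each case. The sketch below is only arithmetic under stated assumptions: all names are hypothetical, times are seconds since the epoch, and it assumes a service ticket's end time is capped at the end time of the TGT used to request it.

```c
static long min_long(long a, long b) { return a < b ? a : b; }

/* No client check on TGS requests: the client can keep asking until the
 * TGT expires, so tickets can remain valid right up to tgt_end. */
long exposure_end_without_check(long tgt_end)
{
    return tgt_end;
}

/* Client check on TGS requests: the last ticket was issued no later than
 * disable_time, so it expires at most svc_max_life seconds after that,
 * and never later than the TGT itself. */
long exposure_end_with_check(long disable_time, long svc_max_life,
                             long tgt_end)
{
    return min_long(disable_time + svc_max_life, tgt_end);
}
```

For example, with a TGT ending at t=36000, an account disabled at t=3600 and a 600-second maximum service ticket lifetime, the bound moves from 36000 to 4200. If service tickets are allowed to live as long as the TGT, the two bounds coincide and the check gains nothing in the worst case, which is the point made in (2).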
From an implementor's perspective, it's sometimes better to have a
simpler system with three weaknesses than a more complicated system
with two. An attacker doesn't care about the number of weaknesses as
long as it's positive. I can understand the appeal of doing whatever
you can because not all bad actors are perfect automatons with
unlimited foresight, but it's not compelling to me in this case.
To actually get rapid-reaction account closure, you need to implement
a temporary blacklist in the servers--and in your case, possibly also
in the clients. I know, that's a huge headache for an application
developer to have to worry about. But I don't think there's a
comprehensive substitute at the Kerberos level.
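
As a sketch of what such a server-side temporary blacklist might look like (all names here are hypothetical, and how entries get distributed to the servers is left open), an application could keep a small table of blocked principals with an expiry time and consult it after each successful authentication:

```c
#include <string.h>
#include <time.h>

#define MAX_BLOCKED 64

/* One temporarily blocked principal. The entry can be dropped once every
 * ticket issued before the account was disabled must have expired. */
struct blocked {
    char principal[256];
    time_t until;
};

struct blacklist {
    struct blocked entries[MAX_BLOCKED];
    size_t count;
};

/* Adds a principal; returns 0 on success, or -1 if the list is full or
 * the name does not fit. */
int blacklist_add(struct blacklist *bl, const char *principal, time_t until)
{
    if (bl->count >= MAX_BLOCKED ||
        strlen(principal) >= sizeof(bl->entries[0].principal))
        return -1;
    strcpy(bl->entries[bl->count].principal, principal);
    bl->entries[bl->count].until = until;
    bl->count++;
    return 0;
}

/* Returns 1 if the authenticated principal should be refused at time now. */
int blacklist_blocks(const struct blacklist *bl, const char *principal,
                     time_t now)
{
    size_t i;

    for (i = 0; i < bl->count; i++) {
        if (now < bl->entries[i].until &&
            strcmp(bl->entries[i].principal, principal) == 0)
            return 1;
    }
    return 0;
}
```

The expiry would typically be set to the disable time plus the maximum ticket lifetime, so the list stays short. Checking it on every request rather than only at connection setup is what would also cover hole (3).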
(Trivia: we used to sort of address (3) by refusing to unwrap messages
in the GSSAPI krb5 mech after a user's tickets expired. That at least
puts a time bound on how long the user can affect services, for
applications using GSSAPI wrap and unwrap. We had to turn this off
because it was too disruptive to existing applications, which were
generally not written by security people. Also, there are a lot of
applications like ssh which don't use GSSAPI wrap and unwrap, so it
wasn't really solving the problem.)
I do understand where you're coming from on the system complexity front,
I really do. However, the current behavior really does totally violate
the principle of least surprise in a big way for new Kerberos users.
I read a lot about how kerberos works and the philosophy and theory
behind it before choosing it, and I was comfortable with the idea that
already-issued tickets are trusted until they expire; that's a
fundamental part of the model and why it scales. That makes sense, and
the engineering reasons behind that are sound, and I'm on board.
However, once somebody does have to go back to talk to the KDC, at that
point it seems completely clear that -allow_tix should actually not
allow any more tix. I mean, really, come on. :)
I spent a long time trying to figure out what I was doing wrong while I
was testing this, since it was so clearly not doing the Right Thing.
It seems like there are three hesitations here:
a. Whether it matters for security. I would say it does matter, even
though I understand your point about the bad guys and >0 weaknesses
below. However, I think it's pretty clear a little more security is
better, as long as it doesn't cost too much complexity and performance,
and it's well documented. In fact, the very idea that a ticket has a
lifetime is an argument for incremental security. You can do damage,
but not forever. In this case, you can do damage, but not to any new
principals. They seem very similar in philosophy to me.
Also, as far as I can tell from reading the web, Windows AD shuts down
all tickets for non-cross-realm principals when an account is
disabled[1]. And finally, it's 100% obviously the more intuitive
behavior to have the KDC stop issuing tickets, and I think making things
intuitive where possible is a big win in lots of ways, including security.
b. Complexity. I'm about to make the change, and it doesn't look like
it's going to be more than a few lines of code. I'm also fixing the
krb5_db_entry pass-by-value "bug" I mentioned in the private mail
(because passing a 68 byte structure containing pointers by value gives
me hives :), but the code for this client check and the aprof bool fetch
are going to be short and easy to code-review. Incidentally, I
implemented and tested the one-line if-statement change to fix the
-allow_svr u2u thing, and it works. It fails to issue the ticket if you
kvno princ, but it will let princ be the u2u server, so that's fixed
locally and I'll send a patch (I will keep all three patches separate
for your sanity).
c. Performance. I haven't tested this yet, but I am planning on
writing a load tester for this stuff anyway, and so I'll be able to say
quantitatively how much impact this has. However, even if it's a
significant performance hit, I would say the aprof bool is the right way
to handle that, letting admins decide to trade off the performance for
the additional security. Admins already have some control over your #2
below, they can trade scalability and convenience and have shorter lived
tickets.
Obviously all I can do is request the feature, make the argument, and
provide the patch. Hopefully the patch will be simple and clear enough,
and the performance impact low enough, that you'll be convinced! If
not, hey, that's the beauty of open-source, I can fix it for me, put the
patch on my website, and it's just a bit more friction for admins if
they want the feature. But, hopefully it'll go into mainline, because
patch management sucks. :)
> (Trivia: we used to sort of address (3) by refusing to unwrap
> messages in the GSSAPI krb5 mech after a user's tickets expired.
Hah, in my next batch of spam to the list I was going to ask about
exactly how that is handled, but I figured I should look at the source
for mk_safe and mk_priv first to see if they check expirations. It
sounds like you're saying they don't. I was thinking about this and it
does seem complicated to handle robustly for clients if things just
start failing and you need to reauth in the middle of communications.
Some sample code might help people do this right. I'll ask about the
best way to handle that in a different thread next week maybe.
Thanks,
Chris
[1] "When the administrator disables the user account in emea, he or
she won't be able to get any more tickets for resources in emea. The
user will however still be able to get new tickets for resources in the
asiapac domain--this will be possible as long as the user's TGT for
asiapac remains valid. The reason for this is that the DCs in the
asiapac domain don't check the user's account status when they issue
tickets."
http://books.google.com/books?id=05xyiZqC8ToC&lpg=PA178&ots=Bx9VV3TWvk&pg=PA178#v=onepage&q&f=false
+ if (validate_tgs_req_local_client &&
+ is_local_principal(header_enc_tkt->client)) {
+
+ /*
+ * If validate_tgs_req_local_client is set in kdc.conf, we
+ * will check KRB5_KDB_DISALLOW_ALL_TIX on any local clients.
+ *
+ * This client db_entry get code is basically copied from
+ * process_as_req. We free the client below after passing it
+ * to validate_tgs_request, before the s4u2self request sets
+ * it, and we use a local flags variable for getting the entry.
+ *
+ * Note: I have to admit to not understanding all the
+ * subtleties of this code, so somebody with more of a clue
+ * should review it! Some things I don't understand yet:
+ *
+ * - We really only need to be able to check if the client is
+ * local, and if the tix flag is set, but the
+ * KRB5_KDB_FLAG_CLIENT_REFERRALS_ONLY flag is only valid
+ * for AS_REQ, so is the db doing more work than it needs
+ * to? Is there a fast path here I could use?
+ *
+ * - The s4u code below also is confusing, and reusing the
+ * client local variable is a bit cheesy. Do they refer to
+ * the same client? From looking at the code below, I don't
+ * think so, but I'm not sure. If they are, would we save
+ * any work by reusing it below if we get it here?
+ *
+ */
+
+ unsigned int local_c_flags = 0;
+
+ /*
+ * Note that according to the referrals draft we should
+ * always canonicalize enterprise principal names.
+ */
+ if (isflagset(request->kdc_options, KDC_OPT_CANONICALIZE) ||
+ krb5_princ_type(kdc_context,
+ request->client) == KRB5_NT_ENTERPRISE_PRINCIPAL) {
+ setflag(local_c_flags, KRB5_KDB_FLAG_CANONICALIZE);
+ setflag(local_c_flags, KRB5_KDB_FLAG_ALIAS_OK);
+ }
+ errcode = krb5_db_get_principal(kdc_context, header_enc_tkt->client,
+ local_c_flags, &client);
+ if (errcode == KRB5_KDB_NOENTRY) {
+ status = "CLIENT_NOT_FOUND";
+ if (vague_errors)
+ errcode = KRB5KRB_ERR_GENERIC;
+ else
+ errcode = KRB5KDC_ERR_C_PRINCIPAL_UNKNOWN;
+ goto cleanup;
+ } else if (errcode) {
+ status = "LOOKING_UP_CLIENT";
+ goto cleanup;
+ }
+ }
+
/* XXX make sure server here has the proper realm...taken from AP_REQ
header? */
@@ -276,7 +333,7 @@
goto cleanup;
}
- if ((retval = validate_tgs_request(request, 0, server, header_ticket,
+ if ((retval = validate_tgs_request(request, client, server, header_ticket,
kdc_time, &status, &e_data)))
{
if (!status)
status = "UNKNOWN_REASON";
@@ -284,6 +341,14 @@
goto cleanup;
}
+ /* free the client if we created one above, so the below s4u code
+ works exactly the same as before the validate flag was added. */
+ if (client) {
+ krb5_db_free_principal(kdc_context, client);
+ client = 0;
+ }
+
+
if (!is_local_principal(header_enc_tkt->client))
setflag(c_flags, KRB5_KDB_FLAG_CROSS_REALM);
Right, this one can be fixed in the KDC (TGS), and should be. The
other two holes are not an excuse to not fix this one because if one
wishes to go fix all three holes one might want to start with the easy
one first.
> 2. The user can keep using service tickets until they expire.
Right. Nothing much can be done about this other than: a) set max
service ticket lifetimes short enough that this hole can be tolerated,
or b) implement a revocation protocol. (b) would be nice, but hard.
> 3. The user can keep using active sessions until the session is
> invalidated somehow (interrupted connection, restarted client or
> server).
This is the hardest problem of all. Short ticket lifetimes don't help
because expiring sessions with their tickets means re-keying or
re-connecting often and that's a pain (particularly in protocols that
don't have re-keying), and then there's local access where tickets are
completely irrelevant.
Plugging this hole requires a revocation protocol. I don't mean
OCSP-like -- the TGS effectively is Kerberos' OCSP equivalent (or,
more correctly, OCSP is PKIX's Kerberos equivalent :). I mean a
protocol by which services participating in a realm can get notified
of principal revocation so they can act accordingly (whatever that
might be). Cross-realm relationships make a revocation protocol...
interesting.
It'd be nice to have a standard revocation protocol for Kerberos...
Nico
--
+ /* Client must not be locked out */
+ if (client && isflagset(client->attributes, KRB5_KDB_DISALLOW_ALL_TIX)) {
+ *status = "CLIENT LOCKED OUT";
+ return(KDC_ERR_CLIENT_REVOKED);
+ }
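
Taken together, the two hunks amount to the decision sketched below. This is a self-contained model rather than the patch itself: lookup() is a hypothetical stand-in for krb5_db_get_principal(), the realm comparison stands in for is_local_principal(), validate_local_client plays the role of the validate_tgs_req_local_client setting from the patch, and the numeric values are assumed to mirror kdb.h and RFC 4120.

```c
#include <stddef.h>
#include <string.h>

#define DISALLOW_ALL_TIX        0x00000040 /* KRB5_KDB_DISALLOW_ALL_TIX */
#define ERR_C_PRINCIPAL_UNKNOWN 6          /* KDC_ERR_C_PRINCIPAL_UNKNOWN */
#define ERR_CLIENT_REVOKED      18         /* KDC_ERR_CLIENT_REVOKED */

struct entry {
    const char *name;   /* principal name without the realm */
    const char *realm;
    unsigned int attributes;
};

/* Hypothetical lookup standing in for krb5_db_get_principal(). */
const struct entry *lookup(const struct entry *db, size_t n,
                           const char *name, const char *realm)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(db[i].name, name) == 0 &&
            strcmp(db[i].realm, realm) == 0)
            return &db[i];
    }
    return NULL;
}

/* Returns 0 to allow the TGS request, or an error code to refuse it. */
int tgs_client_check(const struct entry *db, size_t n,
                     const char *local_realm, int validate_local_client,
                     const char *client_name, const char *client_realm)
{
    const struct entry *client;

    /* Setting off, or cross-realm client: behave exactly as before. */
    if (!validate_local_client || strcmp(client_realm, local_realm) != 0)
        return 0;

    client = lookup(db, n, client_name, client_realm);
    if (client == NULL)
        return ERR_C_PRINCIPAL_UNKNOWN;
    if (client->attributes & DISALLOW_ALL_TIX)
        return ERR_CLIENT_REVOKED;
    return 0;
}
```

The check only ever refuses requests that would otherwise have been allowed, and it never applies to clients from other realms, so the setting would need to be documented as local-only.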
Chris
A better analogy: the current thing is like you identified the horse
thief at noon, but you decided to leave the barn open and unlocked until
sunset, even though he's sitting outside idling in a truck that already
has a couple of your horses in it, but has room for more.
I just want to lock the barn now, and I'm willing to walk out there to
do that.
Or something like that. :) Uh, that last sentence was to address the
performance implications. I need to figure out the metaphorical
expression of the profile bool. Maybe you ask the wife if it's okay to
stop doing dishes and walk out and lock the barn... Then, clearly, the
metaphor is lacking the cross-realm issue...maybe there's a dude taking
your horses but he was referred to you by your friend from the farm down
the road, and you keep trusting him based on that recommendation until
sunset when you have drinks at the bar with your friend.
Okay, stopping now,
Chris
+ request->client) == KRB5_NT_ENTERPRISE_PRINCIPAL) {
should be
+ header_enc_tkt->client) == KRB5_NT_ENTERPRISE_PRINCIPAL) {
The former line will dereference a null pointer and crash (request->client
is 0 on tgs_req).
Chris
I guess if/when I get hit with performance problems, I will look into
those too, so maybe I will be hoist by my own petard, or maybe the
ldap backend will get optimized as a secondary effect of me adding this
feature!
Chris
On 2011/07/25 13:08, Nico Williams wrote:
> On Jul 25, 2011 11:37 AM, "Greg Hudson" <ghu...@mit.edu> wrote:
> > On Sun, 2011-07-24 at 17:30 -0400, Nico Williams wrote:
> > > For performance reasons? It's like this forever, so there may not be
> > > a performance reason anymore. IMO this should be fixed.
> >
> > I think performance is still an issue. We definitely still get feedback
> > about the number of LDAP queries per KDC operation, and TGS requests are
> > more frequent than AS requests. (At least, they should be. It depends
> > on how often the KDC is used purely as a password verifier.)
>
> For LDAP the kdc ought to be async and/or multi-processed/threaded.
> Yeah, I know, it's not, but that's not my problem or that of anyone not
> using the LDAP backend.
>
> Also, IIRC LDAP has a method by which to request cache entry
> invalidation updates. Maybe the LDAP backend ought to cache, which
> would be no worse than not doing the client principal lookup in the TGS
> case, and if you can quickly invalidate cached entries, that's a win.
>
> IMO making this change would be a win.
>
> Nico
> --
>
Hi, hope the week is going well for everyone.
We have one, it's called authorization.... :-)
Unfortunately as an industry/community we were never able to agree on
a 'standardized' way of doing this with respect to Kerberos. Actually
more correctly stated we ceded the entire industry segment to
Microsoft... :-)(
If anyone wants to Google around a bit for the slides of a paper I
presented at the AFS/KRB5 meeting at UMich a few years ago you can see
what our strategy was with respect to this. The service ticket ended
up carrying a 256-bit number in the authorization payload which
represented the 'identity' of the user's service. The role of the
service ticket was to transport and authenticate that service identity
to the application.
At that point the application has a token to use against an LDAP
directory server to determine if the service should continue to be
vended to the user. So in addition to the time limitation imposed by
the service ticket there is the opportunity to implement much higher
granularity revocation based on the 'service identity'.
The whole strategy implemented inheritance rather nicely since the
user's specific service identity was constructed from a compression of
a bitstream consisting of the user's identity and the identity of the
service. So by 'disabling' the master service identity all instances
of the service would be turned off with the alternative of only
disabling a particular user instance of a service.
We actually treated Kerberos authentication as a service as well which
was surprising with respect to how it confused people. It seemed
difficult for a lot of people to understand that authentication is a
service and as such should be subject to authorization constraints.
Which as can be seen in the context of this thread is something which
Chris as an application developer was definitely interested in
accomplishing.
> Nico
Best wishes for a productive week.
Greg
}-- End of excerpt from Nico Williams
As always,
Dr. G.W. Wettstein, Ph.D. Enjellic Systems Development, LLC.
4206 N. 19th Ave. Specializing in information infra-structure
Fargo, ND 58102 development.
PH: 701-281-1686
FAX: 701-281-3949 EMAIL: gr...@enjellic.com
------------------------------------------------------------------------------
"If you care, you just get disappointed all the time. If you don't care
nothing matters so you are never upset."
-- Calvin
Not if we insist on delivering authz-data via Kerberos tickets (see Simo's
PAD proposal).
Also, we don't re-authorize long-lived sessions constantly -- not at all
actually. So, yes IMO we need a low latency revocation protocol.
Nico
--
We may add a feature like this at some point, in order to provide fast
revocation for high-value services. In order to get any solid security
guarantees, the service would need to set a short maximum lifetime, and
would need to force reauthentications upon ticket expiration.
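
On the service side, forcing reauthentication at ticket expiration could look roughly like the sketch below. Every name in it is hypothetical, and it assumes the service records the end time of the ticket the client authenticated with and checks it before handling each message.

```c
#include <time.h>

enum session_status { SESSION_OK = 0, SESSION_REAUTH_REQUIRED = 1 };

/* Hypothetical per-session state kept by the service. */
struct session {
    time_t ticket_endtime;  /* end time of the ticket used to authenticate */
};

/* Called before handling each message: once the ticket has expired, the
 * service stops serving the session until the client reauthenticates. */
enum session_status session_check(const struct session *s, time_t now)
{
    return now >= s->ticket_endtime ? SESSION_REAUTH_REQUIRED : SESSION_OK;
}

/* Called after a fresh authentication exchange on the same session. */
void session_reauthenticated(struct session *s, time_t new_ticket_endtime)
{
    s->ticket_endtime = new_ticket_endtime;
}
```

Combined with a short maximum ticket lifetime for the service, this bounds how long a disabled client can keep using an existing session, at the cost of the disruption to applications mentioned earlier in the thread.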
I can't provide any timeline, though. Relative to your patch, we would
likely need to address:
* Precisely how the client lookup should be done (what flags,
basically). Canonicalization of the client principal should not
generally be needed since it will have been done during the AS request.
* Consideration of edge cases, such as when the client principal entry
has been deleted, renamed, or deleted and recreated since the AS
request.
* Consideration of whether to extend the DAL interface's TGS
verification function to take the client DB entry as input when
available.
* A long-overdue refactoring of the TGS code path before additional
complexity is added to it.
* Documentation.
* Automated test cases.
What I'll do is put up a page with the various patches I've made to the
KDC, and send out a link, and you guys can take a look. I think there
are 5 in total (pass-by-value fix, u2u allow_svr fix, this
check-allow_tix-on tgs_req fix/feature, make vague_errors a profile bool
instead of compile time const (which turned out to be pretty useless for
various reasons I'll write up on the page), and fix the dupe
krb5_realm_params issue).
I'm not sure of the best way to write an automated test for this. Is
there an example of a complex test like this in the source tree? You'd
have to simultaneously be talking to the kadm5 interface to change the
flags and be talking to the kdc. I could do it in perl with my patched
Authen::Krb5(::Admin) modules, but it'd be a fairly big test in C.
Chris
We have a test framework in util/k5test.py which takes care of the heavy
lifting. You can find t_xxxx.py scripts in tests and some other parts
of the source tree as examples. tests/gssapi/t_gssapi.py is perhaps an
interesting example as a pattern for running a test under two different
configurations to get different results.