Lost in (Kerberos) service translation?

A year ago Brian J. Atkisson from Red Hat IT filed a bug against FreeIPA asking to remove a default [domain_realm] mapping section from the krb5.conf configuration file generated during installation of a FreeIPA client. The bug is still open and I’d like to use this opportunity to discuss some less known aspects of a Kerberos service principal resolution.

When an application uses Kerberos to authenticate to a remote service, it needs to talk to a Kerberos key distribution center (KDC) to obtain a service ticket to that remote service. There are multiple ways how an application could construct a name of a service but in a simplistic view it boils down to getting a remote service host name and attaching it to a remote service type name. Type names are customary and really depend on an established tradition for a protocol in use. For example, browsers universally assume that a component HTTP/ is used in the service name; to authenticate to www.example.com server they would ask a KDC a service ticket to HTTP/www.example.com principal. When an LDAP client talks to an LDAP server ldap.example.com and uses SASL GSSAPI authentication, it will ask KDC for a service ticket to ldap/ldap.example.com. Sometimes these assumptions are written down in a corresponding RFC document, sometimes not, but they assume both client and server know what they are doing.

There are, however, few more moving parts at play. A host name part of a service principal might come from an interaction with a user. For browser, this would be a server name from a URL entered by a user and browser would need to construct the target service principal from it. The host name part might be incomplete in some cases: if you only have a single DNS domain in use, server names would be unique in that domain and your users might find it handy to only use the first label of a DNS name of the server to address it. Such approach was certainly very popular among system administrators who relied on capabilities of a Kerberos library to expand the short name into a fully qualified one.

Let’s look into that. Kerberos configuration file, krb5.conf, allows to say for any application that a hostname passed down to the library would need to be canonicalized. This option, dns_canonicalize_hostname, allows us to say “I want to connect to a server bastion” and let libkrb5 to expand that one to a bastion.example.com host name. While this behavior is handy, it relies on DNS. A downside of disabling canonicalization of the hostnames is that short hostnames will not be canonicalized and upon requests to KDC might be not recognized. Finally, there is a possibility of DNS hijacking. For Kerberos, use cases when DNS responses are spoofed aren’t too problematic since the fake KDC or the fake service wouldn’t gain much knowledge, but even in a normal situation a latency of DNS responses might be a considerable problem.

Another part of the equation is to find out which Kerberos realm a specified target service principal belongs to. If you have a single Kerberos realm, it might not be an issue; by setting default_realm option in the krb5.conf we can make sure a client always assumes the only realm we have. However, if there are multiple Kerberos realms, it is important to map the target service principal to the target realm at a client side, before a request is issued to a KDC.

There might be multiple Kerberos realms in existence at any site. For example, FreeIPA deployment provides one. If FreeIPA has established a trust to an Active Directory forest, then that forest would represent another Kerberos realm. Potentially, even more than one as each Active Directory domain in an Active Directory forest is a separate Kerberos realm in itself.

Kerberos protocol defines that a realm in which the application server is located must be determined by the client (RFC 4120 section 3.3.1). The specification also defines several strategies how a client may map the hostname of the application server to the realm it believes the server belongs to.

Domain to realm mapping

Let us stop and think a bit at this point. A Kerberos client has full control over the decision process of to which realm a particular application server belongs to. If it decides that the application server is from a different realm than the client is itself, then it needs to ask for a cross-realm ticket granting ticket from its own KDC. Then, with the cross-realm TGT in possession, the client can ask a KDC of the application server’s realm for the actual service ticket.

As a client, we want to be sure we would be talking to the correct KDC. As mentioned earlier, overly relying on DNS is not always a particulary secure action. As a result, krb5 library provides a way to consider how a particular hostname is mapped to a realm. The search mechanism for a realm mapping is pluggable and by default includes:

registry-based search on WIN32 (does nothing for Linux)
profile-based search: uses [domain_realm] section in krb5.conf to do actual mapping
dns-based search that can be disabled with dns_lookup_realm = false
domain-based search: it is disabled by default and can be enabled with realm_try_domains = ... option in krb5.conf

The order of search is important. It is hard-coded in krb5 library and depends on what operation is performed. For realm selection it is hard-coded that profile-based search is done before DNS-based search and domain-based search is done as the last one.

Profile-based search

When [domain_realm] section exists in krb5.conf, it will be used to map a hostname of the application server to a realm. The mapping table in this section is typically build up based on the host and domain maps:

[domain_realm]
   www.example.com = EXAMPLE.COM
   .dev.example.com = DEVEXAMPLE.COM
   .example.com = EXAMPLE.COM

The mapping above says that www.example.com would be explicitly mapped to EXAMPLE.COM realm, all machines in DNS zone dev.example.com would be mapped to DEVEXAMPLE.COM realm and the rest of hosts in DNS zone example.com would be mapped to EXAMPLE.COM. This mapping only applies to hostnames, so a hostname foo.bar.example.com would not be mapped by this schema to any realm.

Profile-based search is visible in the Kerberos trace output as a selection of the realm right at the beginning of a request for a service ticket to a host-based service principal:

[root@client ~]# kinit -k
[root@client ~]# KRB5_TRACE=/dev/stderr kvno -S cifs client.example.com
[30798] 1552847822.721561: Getting credentials host/client.example.com@EXAMPLE.COM -> cifs/client.example.com@EXAMPLE.COM using ccache KEYRING:persistent:0:0
...

The difference here is that for a service principal not mapped with profile-based search there will be no assumed realm and the target principal would be constructed without a realm:

[root@client ~]# kinit -k
[root@client ~]# KRB5_TRACE=/dev/stderr kvno -S ldap dc.ad.example.com
[30684] 1552841274.602324: Getting credentials host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@ using ccache KEYRING:persistent:0:0

DNS-based search

DNS-based search is activated when dns_lookup_realm option is set to true in krb5.conf and profile-based search did not return any results. Kerberos library will do a number of DNS queries for a TXT record starting with _kerberos. It will help it to discover which Kerberos realm is responsible for the DNS host of the application server. Kerberos library will perform these searches for the hostname itself first and then for each domain component in the hostname until it finds an answer or processes all domain components.

If we have www.example.com as a hostname, then Kerberos library would issue a DNS query for TXT record _kerberos.www.example.com to find a name of the Kerberos realm of www.example.com. If that fails, next try will be for a TXT record _kerberos.example.com and so on, until DNS components are all processed.

It should be noted that this algorithm is only implemented in MIT and Heimdal Kerberos libraries. Active Directory implementation from Microsoft does not allow to query _kerberos.$hostname DNS TXT record to find out which realm a target application server belongs to. Instead, Windows environments delegate the discovery process to their domain controllers.

DNS canonicalization feature (or lack of) also affects DNS search since without it we wouldn’t know what realm to map to a non-fully qualified hostname. When dns_canonicalize_hostname option is set to false, Kerberos client would send the request to KDC with a default realm associated with the non-fully qualified hostname. Most likely such service principal wouldn’t be understood by the KDC and reported as not found.

To help in this situations, FreeIPA KDC supports Kerberos principal aliases. One can use the following ipa command line tool’s command to add aliases to hosts. Remember that a host principal is really a host/<hostname>:

$ ipa help host-add-principal
Usage: ipa [global-options] host-add-principal HOSTNAME KRBPRINCIPALNAME... [options]

Add new principal alias to host entry
Options:
  -h, --help    show this help message and exit
  --all         Retrieve and print all attributes from the server. Affects
                command output.
  --raw         Print entries as stored on the server. Only affects output
                format.
  --no-members  Suppress processing of membership attributes.

$ ipa host-add-principal bastion.example.com host/bastion
-------------------------------------------
Added new aliases to host "bastion.example.com"
-------------------------------------------
  Host name: bastion.example.com
  Principal alias: host/bastion.example.com@EXAMPLE.COM, host/bastion@EXAMPLE.COM

and for other Kerberos service principals the corresponding command is ipa service-add-principal:

$ ipa help service-add-principal
Usage: ipa [global-options] service-add-principal CANONICAL-PRINCIPAL PRINCIPAL... [options]

Add new principal alias to a service
Options:
  -h, --help    show this help message and exit
  --all         Retrieve and print all attributes from the server. Affects
                command output.
  --raw         Print entries as stored on the server. Only affects output
                format.
  --no-members  Suppress processing of membership attributes.

$ ipa service-show HTTP/bastion.example.com
  Principal name: HTTP/bastion.example.com@EXAMPLE.COM
  Principal alias: HTTP/bastion.example.com@EXAMPLE.COM
  Keytab: False
  Managed by: bastion.example.com
  Groups allowed to create keytab: admins
[root@nyx ~]# ipa service-add-principal HTTP/bastion.example.com HTTP/bastion
---------------------------------------------------------------------------------
Added new aliases to the service principal "HTTP/bastion.example.com@EXAMPLE.COM"
---------------------------------------------------------------------------------
  Principal name: HTTP/bastion.example.com@EXAMPLE.COM
  Principal alias: HTTP/bastion.example.com@EXAMPLE.COM, HTTP/bastion@EXAMPLE.COM

Domain-based search

Finally, domain-based search is activated when realm_try_domains = ... is specified. In this case Kerberos library will try heuristics based on the hostname of the target application server and a specific number of domain components of the application server hostname depending on how many components realm_try_domains option is allowing to cut off. More about that later.

KDC-based hostname search

However, there is another option employed by MIT Kerberos library. In case when MIT Kerberos client is unable to find out a realm on its own, starting with MIT krb5 1.6 version, a client will issue a request for without a known realm to own KDC. A KDC (must be MIT krb5 1.7 or later) can opt to recognize the hostname against own [domain_realm] mapping table and choose to issue a referral to the appropriate service realm.

The latter approach would only work if the KDC has been configured to allow such referrals to be issued and if client is asking for a host-based service. FreeIPA KDC, by default, allows this behavior. For trusted Active Directory realms there is also a support from SSSD on IPA masters: SSSD generates automatically [domain_realm] and [capaths] sections for all known trusted realms so that KDC is able to respond with the referrals.

However, a care should be taken by an application itself on the client side when constructing such Kerberos principal. For example, if we would use kvno utility, then a request kvno -S service hostname would ask for a referral while kvno service/hostname would not. The former is constructing a host-based principal while the latter is not.

When looking at the Kerberos trace, we can see the difference. Below host/client.example.com is asking for a service ticket to ldap/dc.ad.example.com as a host-based principal, without knowing which realm the application server’s principal belongs to:

[root@client ~]# kinit -k
[root@client ~]# KRB5_TRACE=/dev/stderr kvno -S ldap dc.ad.example.com
[30684] 1552841274.602324: Getting credentials host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@ using ccache KEYRING:persistent:0:0
[30684] 1552841274.602325: Retrieving host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@ from KEYRING:persistent:0:0 with result: -1765328243/Matching credential not found
[30684] 1552841274.602326: Retrying host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@EXAMPLE.COM with result: -1765328243/Matching credential not found
[30684] 1552841274.602327: Server has referral realm; starting with ldap/dc.ad.example.com@EXAMPLE.COM
[30684] 1552841274.602328: Retrieving host/client.example.com@EXAMPLE.COM -> krbtgt/EXAMPLE.COM@EXAMPLE.COM from KEYRING:persistent:0:0 with result: 0/Success
[30684] 1552841274.602329: Starting with TGT for client realm: host/client.example.com@EXAMPLE.COM -> krbtgt/EXAMPLE.COM@EXAMPLE.COM
[30684] 1552841274.602330: Requesting tickets for ldap/dc.ad.example.com@EXAMPLE.COM, referrals on
[30684] 1552841274.602331: Generated subkey for TGS request: aes256-cts/A93C
[30684] 1552841274.602332: etypes requested in TGS request: aes256-cts, aes128-cts, aes256-sha2, aes128-sha2, des3-cbc-sha1, rc4-hmac, camellia128-cts, camellia256-cts
[30684] 1552841274.602334: Encoding request body and padata into FAST request
[30684] 1552841274.602335: Sending request (965 bytes) to EXAMPLE.COM
[30684] 1552841274.602336: Initiating TCP connection to stream ip.ad.dr.ess:88
[30684] 1552841274.602337: Sending TCP request to stream ip.ad.dr.ess:88
[30684] 1552841274.602338: Received answer (856 bytes) from stream ip.ad.dr.ess:88
[30684] 1552841274.602339: Terminating TCP connection to stream ip.ad.dr.ess:88
[30684] 1552841274.602340: Response was from master KDC
[30684] 1552841274.602341: Decoding FAST response
[30684] 1552841274.602342: FAST reply key: aes256-cts/D1E2
[30684] 1552841274.602343: Reply server krbtgt/AD.EXAMPLE.COM@EXAMPLE.COM differs from requested ldap/dc.ad.example.com@EXAMPLE.COM
[30684] 1552841274.602344: TGS reply is for host/client.example.com@EXAMPLE.COM -> krbtgt/AD.EXAMPLE.COM@EXAMPLE.COM with session key aes256-cts/470F
[30684] 1552841274.602345: TGS request result: 0/Success
[30684] 1552841274.602346: Following referral TGT krbtgt/AD.EXAMPLE.COM@EXAMPLE.COM
[30684] 1552841274.602347: Requesting tickets for ldap/dc.ad.example.com@AD.EXAMPLE.COM, referrals on
[30684] 1552841274.602348: Generated subkey for TGS request: aes256-cts/F0C6
[30684] 1552841274.602349: etypes requested in TGS request: aes256-cts, aes128-cts, aes256-sha2, aes128-sha2, des3-cbc-sha1, rc4-hmac, camellia128-cts, camellia256-cts
[30684] 1552841274.602351: Encoding request body and padata into FAST request
[30684] 1552841274.602352: Sending request (921 bytes) to AD.EXAMPLE.COM
[30684] 1552841274.602353: Sending DNS URI query for _kerberos.AD.EXAMPLE.COM.
[30684] 1552841274.602354: No URI records found
[30684] 1552841274.602355: Sending DNS SRV query for _kerberos._udp.AD.EXAMPLE.COM.
[30684] 1552841274.602356: SRV answer: 0 0 88 "dc.ad.example.com."
[30684] 1552841274.602357: Sending DNS SRV query for _kerberos._tcp.AD.EXAMPLE.COM.
[30684] 1552841274.602358: SRV answer: 0 0 88 "dc.ad.example.com."
[30684] 1552841274.602359: Resolving hostname dc.ad.example.com.
[30684] 1552841274.602360: Resolving hostname dc.ad.example.com.
[30684] 1552841274.602361: Initiating TCP connection to stream ano.ther.add.ress:88
[30684] 1552841274.602362: Sending TCP request to stream ano.ther.add.ress:88
[30684] 1552841274.602363: Received answer (888 bytes) from stream ano.ther.add.ress:88
[30684] 1552841274.602364: Terminating TCP connection to stream ano.ther.add.ress:88
[30684] 1552841274.602365: Sending DNS URI query for _kerberos.AD.EXAMPLE.COM.
[30684] 1552841274.602366: No URI records found
[30684] 1552841274.602367: Sending DNS SRV query for _kerberos-master._tcp.AD.EXAMPLE.COM.
[30684] 1552841274.602368: No SRV records found
[30684] 1552841274.602369: Response was not from master KDC
[30684] 1552841274.602370: Decoding FAST response
[30684] 1552841274.602371: FAST reply key: aes256-cts/10DE
[30684] 1552841274.602372: TGS reply is for host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@AD.EXAMPLE.COM with session key aes256-cts/24D1
[30684] 1552841274.602373: TGS request result: 0/Success
[30684] 1552841274.602374: Received creds for desired service ldap/dc.ad.example.com@AD.EXAMPLE.COM
[30684] 1552841274.602375: Storing host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@ in KEYRING:persistent:0:0
[30684] 1552841274.602376: Also storing host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@AD.EXAMPLE.COM based on ticket
[30684] 1552841274.602377: Removing host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@AD.EXAMPLE.COM from KEYRING:persistent:0:0
ldap/dc.ad.example.com@: kvno = 28

However, when not using a host-based principal in the request we’ll fail.

[root@client ~]# kinit -k
[root@client ~]# KRB5_TRACE=/dev/stderr kvno ldap/dc.ad.example.com
[30695] 1552841932.100975: Getting credentials host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@EXAMPLE.COM using ccache KEYRING:persistent:0:0
[30695] 1552841932.100976: Retrieving host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@EXAMPLE.COM from KEYRING:persistent:0:0 with result: -1765328243/Matching credential not found
[30695] 1552841932.100977: Retrieving host/client.example.com@EXAMPLE.COM -> krbtgt/EXAMPLE.COM@EXAMPLE.COM from KEYRING:persistent:0:0 with result: 0/Success
[30695] 1552841932.100978: Starting with TGT for client realm: host/client.example.com@EXAMPLE.COM -> krbtgt/EXAMPLE.COM@EXAMPLE.COM
[30695] 1552841932.100979: Requesting tickets for ldap/dc.ad.example.com@EXAMPLE.COM, referrals on
[30695] 1552841932.100980: Generated subkey for TGS request: aes256-cts/27DA
[30695] 1552841932.100981: etypes requested in TGS request: aes256-cts, aes128-cts, aes256-sha2, aes128-sha2, des3-cbc-sha1, rc4-hmac, camellia128-cts, camellia256-cts
[30695] 1552841932.100983: Encoding request body and padata into FAST request
[30695] 1552841932.100984: Sending request (965 bytes) to EXAMPLE.COM
[30695] 1552841932.100985: Initiating TCP connection to stream ip.ad.dr.ess:88
[30695] 1552841932.100986: Sending TCP request to stream ip.ad.dr.ess:88
[30695] 1552841932.100987: Received answer (461 bytes) from stream ip.ad.dr.ess:88
[30695] 1552841932.100988: Terminating TCP connection to stream ip.ad.dr.ess:88
[30695] 1552841932.100989: Response was from master KDC
[30695] 1552841932.100990: Decoding FAST response
[30695] 1552841932.100991: TGS request result: -1765328377/Server ldap/dc.ad.example.com@EXAMPLE.COM not found in Kerberos database
[30695] 1552841932.100992: Requesting tickets for ldap/dc.ad.example.com@EXAMPLE.COM, referrals off
[30695] 1552841932.100993: Generated subkey for TGS request: aes256-cts/C1BF
[30695] 1552841932.100994: etypes requested in TGS request: aes256-cts, aes128-cts, aes256-sha2, aes128-sha2, des3-cbc-sha1, rc4-hmac, camellia128-cts, camellia256-cts
[30695] 1552841932.100996: Encoding request body and padata into FAST request
[30695] 1552841932.100997: Sending request (965 bytes) to EXAMPLE.COM
[30695] 1552841932.100998: Initiating TCP connection to stream ip.ad.dr.ess:88
[30695] 1552841932.100999: Sending TCP request to stream ip.ad.dr.ess:88
[30695] 1552841932.101000: Received answer (461 bytes) from stream ip.ad.dr.ess:88
[30695] 1552841932.101001: Terminating TCP connection to stream ip.ad.dr.ess:88
[30695] 1552841932.101002: Response was from master KDC
[30695] 1552841932.101003: Decoding FAST response
[30695] 1552841932.101004: TGS request result: -1765328377/Server ldap/dc.ad.example.com@EXAMPLE.COM not found in Kerberos database
kvno: Server ldap/dc.ad.example.com@EXAMPLE.COM not found in Kerberos database while getting credentials for ldap/dc.ad.example.com@EXAMPLE.COM

As you can see, our client tried to ask for a service ticket to a non-host-based service principal from outside our realm and this was not accepted by the KDC, thus resolution failing.

Mixed realm deployments

The behavior above is predictable. However, a client-side processing of the target realm behaves wrongly in case a client needs to request a service ticket to a service principal located in a trusted realm but situated in a DNS zone belonging to our own realm. This might sound like a complication but it is a typical situation for deployments with FreeIPA trusting Active Directory forests. In such cases customers often want to place Linux machines right in the DNS zones associated with Active Directory domains.

Since Microsoft Active Directory implementation does not support per-host Kerberos realm hint, unlike MIT Kerberos or Heimdal, such request from Windows client will always fail. It will be not possible to obtain a service ticket in such situation from Windows machines.

However, when both realms trusting each other are MIT Kerberos, their KDCs and clients can be configured for a selective realm discovery.

As explained at FOSDEM 2018 and devconf.cz 2019, Red Hat IT moved from an old plain Kerberos realm to the FreeIPA deployment. This is a situation where we have EXAMPLE.COM and IPA.EXAMPLE.COM both trusting each other and migrating systems to IPA.EXAMPLE.COM over long period of time. We want to continue providing services in example.com DNS zone but use IPA.EXAMPLE.COM realm. Our clients are in both Kerberos realms but over time they will all eventually migrate to IPA.EXAMPLE.COM.

Working with such situation can be tricky. Let’s start with a simple example.

Suppose our client’s krb5.conf has [domain_realm] section that looks like this:

[domain_realm]
   client.example.com = EXAMPLE.COM
   .example.com = EXAMPLE.COM

If we need to ask for a HTTP/app.example.com service ticket to the application server hosted on app.example.com, the Kerberos library on the client will map HTTP/app.example.com to the EXAMPLE.COM and will not attempt to request a referral from a KDC. If our application server is enrolled into IPA.EXAMPLE.COM realm, it means the client with such configuration will never try to discover HTTP/app.example.com@IPA.EXAMPLE.COM and will never be able to authenticate to app.example.com with Kerberos.

There are two possible solutions here. We can either add an explicit mapping for app.example.com host to IPA.EXAMPLE.COM in the client’s [domain_realm] section in krb5.conf or remove .example.com mapping entry from the [domain_realm] on the client side completely and rely on KDC or DNS-based search.

First solution does not scale and is a management issue. Updating all clients when a new application server is migrated to the new realm sounds like a nightmare if majority of your clients are laptops. You’d really want to force them to delegate to the KDC or do DNS-based search instead.

Of course, there is a simple solution: add _kerberos.app.example.com TXT record pointing out to IPA.EXAMPLE.COM in the DNS and let clients to use it. This would assume that all clients will not have .example.com = EXAMPLE.COM mapping rule.

Unfortunately, it is more complicated. As Robbie Harwood, Fedora and RHEL maintainer of MIT Kerberos, explained to me, the problem is what happens if there’s inadequate DNS information, e.g. DNS-based search failed. A client would fall back to heuristics (domain-based search) and these would differ depending which MIT Kerberos version is in use. Since MIT Kerberos 1.16 heuristics would be trying to prefer mapping HTTP/app.ipa.example.com into IPA.EXAMPLE.COM over EXAMPLE.COM, and prefer EXAMPLE.COM to failure. However, there is no a way to map HTTP/app.example.com to IPA.EXAMPLE.COM with these heuristics.

Domain-based search gives us another heuristics based on the realm. It is tunable via realm_try_domains option but it also would affect the way how MIT Kerberos library would choose which credentials cache from a credentials cache collection (KEYRING:, DIR:, KCM: ccache types). This logic is present since MIT Kerberos 1.12 but it also wouldn’t help us to map HTTP/app.example.com to IPA.EXAMPLE.COM.

After some discussions, Robbie and I came to a conclusion that perhaps changing the order how these methods are applied by the MIT Kerberos library could help. As I mentioned in “Domain to realm mapping” section, the current order is hard-coded: for realm selection the profile-based search is done before DNS-based search and domain-based search is done as the last one. Ideally, choosing which search is done after which could be given to administrators. However, there aren’t many reasonable orders out there. Perhaps, allowing just two options would be enough:

prioritizing DNS search over a profile search
prioritizing a profile search over DNS search

Until it is done, we are left with the following recommendation for mixed-domain Kerberos principals from multiple realms:

make sure you don’t use [domain_realm] mapping for mixed realm domains
make sure you have _kerberos.$hostname TXT record set per host/domain for the right realm name. Remember that Kerberos realm is case-sensitive and almost everywhere it is uppercase, so be sure the value of the TXT record is correct.