If you are using Kubernetes, chances are you are also using cert-manager to automate certificate creation and renewal, whether to secure intra-cluster communications or to obtain certificates from a fully-fledged Certificate Authority (CA) for public-facing websites. While cert-manager is very convenient, it needs a lot of credentials to do its magic, and in a shared cluster this can present a security risk. In this post, I will briefly review some of the risks that you may encounter and present a way to set up cert-manager properly to minimize potential security issues.

How cert-manager works

First, let’s review what happens when you generate a certificate through cert-manager. You start by creating an Issuer or ClusterIssuer that holds the required information to tell cert-manager how to communicate with a CA supporting the ACME protocol, and in some cases some extra identifying information that cert-manager can provide to the CA on your behalf to identify you. When using the ACME DNS01 mechanism to prove domain ownership, credentials to the DNS provider are also needed for cert-manager to create the TXT record as part of the DNS challenge.

apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: <email>
    privateKeySecretRef:
      name: letsencrypt
    solvers:
    - dns01:
        cloudflare:
          email: <email>
          apiTokenSecretRef:
            name: cloudflare-api-token-secret
            key: api-token

Once an Issuer or ClusterIssuer is installed, you can request certificates by creating a Certificate resource that references the issuer. cert-manager will then:

  • create the Certificate Signing Request (CSR)
  • send it to the CA using the information provided in the Issuer resource
  • receive a DNS challenge from the CA
  • use the DNS provider credentials to create the required TXT record
  • periodically check the challenge status and retrieve the signed certificate from the CA
  • delete the TXT record used for the DNS challenge
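
As a minimal sketch of such a request, a Certificate referencing the letsencrypt Issuer shown earlier might look like the following. The resource name, Secret name and domain are hypothetical placeholders rather than values from a real setup.

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: some-example-com              # hypothetical resource name
spec:
  secretName: some-example-com-tls    # hypothetical Secret that will hold the issued key pair
  dnsNames:
  - some.example.com                  # hypothetical domain to certify
  issuerRef:
    name: letsencrypt                 # the Issuer defined earlier
    kind: Issuer
```

Because an Issuer is namespaced, a Certificate like this one has to be created in the same namespace as the Issuer it references.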

What could go wrong

There are a lot of moving parts and actors involved when using cert-manager, and therefore multiple possible attack vectors. A malicious actor with access to the cluster can nicely ask cert-manager to create arbitrary certificates if the issuer is a ClusterIssuer, and cert-manager will be happy to comply.

Worse, if cert-manager’s read/write credentials to the DNS provider somehow find their way into the wrong hands, the attacker can freely modify DNS records. Depending on your DNS provider, the API token used for DNS updates may also be valid for other API endpoints such as account management, resulting in an account takeover. In cases where the DNS provider also acts as a registrar, the attacker could also transfer the domain out or change the NS records to point to a DNS provider entirely under their control. They would then be able to redirect your visitors to servers of their choosing, with completely valid TLS certificates created at their discretion.

Properly setting up cert-manager

Now that we’ve seen some of the possible risks involved with an unnecessarily permissive use of cert-manager, we’ll explore some ways to protect ourselves. We’ll be using Amazon Route53 as our DNS provider, as its IAM settings allow for more granular permissions and suit our purposes well, although any other DNS provider could do.

CAA records

First, we’ll edit the example.com zone to add some CAA records authorizing the CA we are using, if this has not been done already. CAA records allow us to specify which CAs are allowed to issue certificates for the zone. This is not perfect: an attacker can simply use the same CA, and although it would be a clear violation of CA requirements, an unauthorized CA could still issue certificates, whether mistakenly or maliciously. It’s still better than nothing and adding a record is easy, so there’s no reason not to use CAA records.

Create a DNS zone for delegation purposes

Next, we’ll create a DNS zone for the sole purpose of solving ACME challenges. In this example, we’ll use acme.example.com, a subdomain of example.com, although any other domain would serve that purpose just fine. Ultimately, cert-manager only needs to create a temporary TXT record, so giving it free write access to the entire DNS zone is overkill. DNS01 works as follows: when you request a certificate for some.example.com, you are asked to create a TXT record at _acme-challenge.some.example.com with a randomized text value, the challenge. The CA then makes a DNS query and expects to get the challenge value back.

There is actually no requirement for the TXT record to live exactly at _acme-challenge.some.example.com, as long as a TXT query against _acme-challenge.some.example.com ultimately leads to the challenge value. It is therefore completely valid to instead create a CNAME record at _acme-challenge.some.example.com pointing to a TXT record at _acme-challenge.some.other.example.com. This is known as CNAME delegation of ACME challenge TXT records, and the cert-manager documentation has a modest mention of it.

Delegate control for the new hosted zone

After creating the acme.example.com zone, we are given a set of NS records with instructions to add them at the registrar. As this is a subdomain of example.com, we won’t be doing that however. Instead, we’ll now go back to our example.com DNS provider, Cloudflare here for illustration purposes, and add those NS records under acme.example.com. Our zones should now look somewhat like so:

# example.com
example.com.         IN    NS    ivan.ns.cloudflare.com.
example.com.         IN    NS    cheryl.ns.cloudflare.com.
acme.example.com.    IN    NS    ns-162.awsdns-10.com.
acme.example.com.    IN    NS    ns-2004.awsdns-12.co.uk.
acme.example.com.    IN    NS    ns-2233.awsdns-14.net.
acme.example.com.    IN    NS    ns-1111.awsdns-11.org.

# acme.example.com
acme.example.com.    IN    NS    ns-162.awsdns-10.com.
acme.example.com.    IN    NS    ns-2004.awsdns-12.co.uk.
acme.example.com.    IN    NS    ns-2233.awsdns-14.net.
acme.example.com.    IN    NS    ns-1111.awsdns-11.org.

What we have just done is delegate control of the acme.example.com subdomain of example.com to another DNS provider, Amazon Route53 in this instance. Since acme.example.com will only be used to complete ACME DNS01 challenges, a takeover of that zone would do less damage than a takeover of the root domain example.com. When creating a service account to give to cert-manager, we’re also able to limit its write scope to the acme.example.com zone, which is convenient if you happen to manage a lot of zones in the same account. Fun fact: the .com registry does pretty much the same thing when you lease example.com, publishing NS records that point to your DNS provider. Here we, example.com, are “leasing out” acme.example.com to another account, albeit one also under our control, hosted at Amazon Route53.
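
To illustrate what limiting the write scope could look like, here is a sketch of an IAM policy restricted to the delegated zone, written as a CloudFormation fragment so that it stays in YAML. Using CloudFormation is an assumption made for this example (the same statements can be written as a plain JSON policy), and the resource name and the zone ID placeholder are hypothetical.

```yaml
# Sketch only: the logical name and <acme-zone-id> are hypothetical placeholders.
Resources:
  CertManagerAcmeZonePolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      Description: Allow cert-manager to solve DNS01 challenges in acme.example.com only
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          # Lets cert-manager poll the status of a submitted record change.
          - Effect: Allow
            Action: route53:GetChange
            Resource: "arn:aws:route53:::change/*"
          # Record writes are limited to the delegated acme.example.com hosted zone.
          - Effect: Allow
            Action:
              - route53:ChangeResourceRecordSets
              - route53:ListResourceRecordSets
            Resource: "arn:aws:route53:::hostedzone/<acme-zone-id>"
          # Used to look up the hosted zone by name.
          - Effect: Allow
            Action: route53:ListHostedZonesByName
            Resource: "*"
```

With a policy along these lines, leaked credentials would only allow changes inside the acme.example.com zone, not in any other zone hosted in the same AWS account.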

Set up CNAME delegation

Next, we need to indicate that acme.example.com will take care of DNS01 challenges for example.com. This is done via CNAME records. If for instance you plan to issue a certificate for some.example.com, you’ll need to create the following CNAME record in the example.com zone:

_acme-challenge.some.example.com.    CNAME    _acme-challenge.some.acme.example.com.

You’ll need to create a similar CNAME record for every domain for which you plan to request certificates. This means that only those with write access to the example.com DNS zone can authorize certificate creation, and write access to the acme.example.com zone alone does not grant the ability to create certificates for arbitrary subdomains.

Lastly, we’ll need to modify our Issuer and set cnameStrategy: Follow on the DNS01 solver settings to indicate to cert-manager that it should follow CNAMEs since it does not do so by default. Our earlier Issuer definition will look like so:

apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt
spec:
  acme:
    ...
    solvers:
    - dns01:
        cnameStrategy: Follow
        route53:
          ...

Limit the DNS zones the Issuer can act upon

This is not a strict requirement, but it is good practice to tell our Issuer which zones it can issue certificates for. This prevents an actor inside our cluster who owns another domain from setting up CNAME delegation to acme.example.com and tricking cert-manager into issuing certificates through our Issuer, although on its own this is not a very meaningful attack. Where it matters is if you use a CA like ZeroSSL, which ties issuance to your dashboard account through an External Account Binding, or if your CA has rate limits like Let’s Encrypt does. In any case, specifying which zones our Issuer can issue certificates for is straightforward. You only need to populate the selector field of the solver under .spec.acme.solvers and specify some dnsZones.

apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt
spec:
  acme:
    ...
    solvers:
    - selector:
        dnsZones:
        - 'example.com'
      dns01:
        cnameStrategy: Follow
        route53:
          ...

Limit who can request certificates inside the cluster

cert-manager offers two types of Issuer resources: ClusterIssuer and Issuer. The former is available cluster-wide whereas the latter is only available inside the namespace it is created in. We’ll be using the Issuer resource as we do not want to give anyone in our cluster access to certificate issuances. Especially in a cluster shared by multiple teams, even within the same organization, it is best to reduce access to the Issuer to prevent accidents. Otherwise another team could, without ill intent, request a wildcard certificate for *.example.com and be off to the races, and this could later create headaches for the Platform team.

That being said, other teams (namespaces) may need their own certificates for their public-facing services such as sales.example.com or marketing.example.com. There are several approaches we can take to reconcile the need to provide certificates to other namespaces and security.

One approach, if you are using Contour as your ingress controller, is to use its TLSCertificateDelegation custom resource to delegate permission to Contour to read the Secret containing certificate data from another namespace. This gives you fine-grained control over which certificate can be used by which namespace:

apiVersion: projectcontour.io/v1
kind: TLSCertificateDelegation
metadata:
  name: sales-example-com-delegation
  namespace: cert-manager
spec:
  delegations:
    - secretName: sales-example-com-tls
      targetNamespaces:
      - sales-team

In this example, we are delegating the certificate stored in the sales-example-com-tls Secret to the sales-team namespace. Tenants in the sales-team namespace can then reference this certificate in and only in an HTTPProxy resource managed by Contour, like so:

apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: website
  namespace: sales-team
spec:
  virtualhost:
    tls:
      secretName: cert-manager/sales-example-com-tls
    ...
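
For completeness, the sales-example-com-tls Secret referenced above has to exist in the cert-manager namespace. One way it could get there is a Certificate like the sketch below. It assumes the letsencrypt Issuer also lives in the cert-manager namespace, which the earlier Issuer manifests do not state, and the Certificate name is hypothetical.

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: sales-example-com             # hypothetical resource name
  namespace: cert-manager             # same namespace as the TLSCertificateDelegation
spec:
  secretName: sales-example-com-tls   # the Secret delegated to the sales-team namespace
  dnsNames:
  - sales.example.com
  issuerRef:
    name: letsencrypt                 # assumed to be an Issuer in the cert-manager namespace
    kind: Issuer
```

Tenants in sales-team never create Certificate resources themselves in this arrangement. Issuance stays with whoever controls the cert-manager namespace.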

The nice thing about this approach is that only Contour has permission to read the certificate. If however you are using a different ingress controller, you can alternatively sync the secret to another namespace using Kubernetes Config Syncer as recommended by cert-manager in their documentation. What this does is sync the Secret resource containing the certificate to namespaces of your choosing. The downside is that the certificate gets synced into the target namespaces, making its contents entirely visible to anyone with read access to the namespace.

Lastly, depending on your ingress controller, there may be support for default certificates. This would involve issuing a wildcard certificate and setting the ingress controller to use it as a default. The downside of this approach is that now any namespace can create Ingress resources and expose a public website with a valid certificate, so this is not much different to having a ClusterIssuer.

Conclusion

In this post we’ve explored some of the attack surfaces that you may expose yourself to when using cert-manager and gone through some opinionated steps to use cert-manager in a more secure fashion, namely using CNAME delegation and limiting who can issue and reference certificates. Granted, in normal scenarios teams inside the same organization do not have ill intent. However, just as leaving passwords on a post-it note next to your computer is a bad idea, leaving important resources exposed inside a shared cluster environment should be avoided to minimize the attack surface in the event something unfortunate happens.