SSH certificates: the better SSH experience

thomashabets221h34m

Every couple of months someone re-discovers SSH certificates, and blogs about them.

I'm guilty of it too. My blog post from 15 years ago is nowhere near as good as OP's post, but if I thought the me of 15 years ago lived up to my standards of today, I'd be really disappointed: https://blog.habets.se/2011/07/OpenSSH-certificates.html

kaoD21h29m

I've known SSH certs for a while but never went through the effort of migrating away from keys. I'm very frustrated about manually managing my SSH keys across my different servers and devices though.

I assume you gathered a lot of thoughts over these 15 years.

Should I invest in making the switch?

papyDoctor20h14m

Another useful feature of SSH certificates is that you can sign a user’s public key to grant them access to a remote machine for a limited time and as a specific remote user.
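
As a rough illustration of the time-limited, single-user signing described above, here is a minimal Python sketch that only assembles the ssh-keygen invocation. The file names, identity string, and principal are placeholders and do not come from the article.

```python
from typing import List


def user_cert_command(ca_key: str, user_pubkey: str, identity: str,
                      principal: str, validity: str = "+1h") -> List[str]:
    """Build the ssh-keygen argv that signs user_pubkey with ca_key.

    -s  CA private key used for signing
    -I  certificate identity (recorded in server logs)
    -n  principal, here the remote user the certificate is valid for
    -V  validity interval, e.g. "+1h" for one hour from now
    """
    return ["ssh-keygen", "-s", ca_key, "-I", identity,
            "-n", principal, "-V", validity, user_pubkey]


if __name__ == "__main__":
    # Placeholder paths and names; hand the list to subprocess.run() to execute it.
    print(" ".join(user_cert_command("ca_key", "id_ed25519.pub",
                                     "alice@laptop", "deploy")))
```

The server side would additionally need to trust the CA's public key (the TrustedUserCAKeys option in sshd_config), which this sketch leaves out.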

Stefan-H18h39m

I think the scary reality is most people conflate "keys" and "certificates". I have worked with security engineers that I need to remind that we do not use SSH certs, but rather key auth, and they have to think it through to make it click.

V-eHGsd_18h10m

oh man, I referred back to your blog post when I wrote the ssh certificate authority for $job ... ~10 years ago.

Thanks for writing it!

Thom200021h15m

Sadly, services such as GitHub don't support these, so it's mostly good for internal infrastructure.

lights012320h33m

They do, for Enterprise customers only: https://docs.github.com/en/enterprise-cloud@latest/organizat...

They've rolled their host key one time, so there's little reason for them to use it on the host side.

linsomniac20h50m

In our dev/stg environment we reinstall half our machines every morning (largely to test our machine setup automation), and SSH host certificates make that so much nicer than having to persist host keys or remove/replace them in known_hosts. Highly recommended.
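
A minimal sketch of what that workflow might look like, assuming a CA key that the provisioning automation can reach: one helper builds the ssh-keygen call for a host certificate, the other formats the single known_hosts line clients need. The hostnames, paths, and the one-year default validity are illustrative assumptions, not the commenter's setup.

```python
from typing import List


def host_cert_command(ca_key: str, host_pubkey: str, hostname: str,
                      validity: str = "+52w") -> List[str]:
    """ssh-keygen argv for a host certificate; -h marks it as a host cert."""
    return ["ssh-keygen", "-s", ca_key, "-h", "-I", hostname,
            "-n", hostname, "-V", validity, host_pubkey]


def known_hosts_ca_line(pattern: str, ca_pubkey: str) -> str:
    """known_hosts entry trusting any host certificate signed by ca_pubkey."""
    return f"@cert-authority {pattern} {ca_pubkey.strip()}"


if __name__ == "__main__":
    # Placeholder CA public key text; a real one is the contents of ca_key.pub.
    print(known_hosts_ca_line("*.dev.example.com", "ssh-ed25519 AAAA... host-ca"))
```

With one such line in known_hosts, a reinstalled machine only needs a freshly signed certificate (referenced by HostCertificate in its sshd_config); clients have nothing to remove or replace.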

grave8819h35m

[dead]

Tepix20h38m

The author lists all the advantages of CA certificates, yet doesn't list the disadvantages. OTOH, all the many steps required to set it up make the disadvantages rather obvious.

Also, I've never had a security issue due to TOFU, have you?

adrian_b20h15m

TOFU is convenient, but not necessary.

Choosing to use TOFU is a distinct choice from the choice of using the keys generated by SSH, instead of using certificates.

If you do not want to use TOFU, for extra security, you just have to pair the computers by copying between them the corresponding public keys through a secure channel, e.g. by using a USB memory.

Using certificates does not add any simplification or any extra security.

For real security, you still must pair the communicating computers by copying between them the corresponding certificates, through a secure channel, e.g. a USB memory.

When you use for HTTPS the certificates that have come with your Internet browser, you trust that the installer package for the browser has come to that computer through a secure channel from the authority that has created the certificates. This is usually an assumption much more far fetched than the assumption that you can trust TOFU between computers under your control.

Certificates may be useful in big organizations, if other functionality is needed beyond just establishing secure communication channels, e.g. if you want to use certificate revocation.

In the list of "advantages" enumerated in the parent article, more than half of them are false, because if certificates are implemented correctly, completely equivalent actions must be executed when SSH keys without TOFU are used and when certificates are used.

Perhaps the author meant by writing some of the "advantages" that the actions that supposedly are no longer needed with certificates are done by an administrator, not by the user. However that is also applicable with SSH. An administrator could install the certificates, so that no action is required from the user, but an administrator can also install the SSH public keys, so that no TOFU is ever needed from the user.

Using certificates requires exactly the same steps as using keys generated by SSH (i.e. generating certificates and copying them between computers through secure channels, to pair the servers and the authorized users), but it may need additional steps, caused by the fact that certificates provide additional functionality.

akerl_20h7m

> Also, I've never had a security issue due to TOFU, have you?

This is a bit like suggesting you've never been in a car crash, so seat belts must not be worth considering.

Do you feel that beyond the obvious and documented work in setting them up, there are disadvantages to using SSH certificates?

zamadatix17h51m

If you have some form of access to set up the CA config on the box before connecting then you can use the same access channel to avoid needing to rely on TOFU for setting up the key access all the same.

This can be anything from being part of the install script to customized deployment image to physical access to access via a host in virtualized scenarios.

TOFU only really comes into play when the box is already set up and you have no other way to load things onto the box other than connecting via SSH to do so. But, again, that would be the same story if you were intending to go the certificate approach too.

jcalvinowens19h58m

You can also address TOFU to some extent using SSHFP DNS records.

OpenSSH supports checking the DNSSEC signature in the client, in theory, but it's a configure option and I'm not sure if distros build with it.
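
For readers who have not seen an SSHFP record, here is a small Python sketch of how one is derived from a host public key: the fingerprint is a hash of the base64-decoded key blob. The hostname and key in the test are made up; in practice `ssh-keygen -r hostname` prints these records for you.

```python
import base64
import hashlib

# SSHFP algorithm numbers (RFC 4255, RFC 6594, RFC 7479).
SSHFP_ALGORITHMS = {
    "ssh-rsa": 1,
    "ssh-dss": 2,
    "ecdsa-sha2-nistp256": 3,
    "ecdsa-sha2-nistp384": 3,
    "ecdsa-sha2-nistp521": 3,
    "ssh-ed25519": 4,
}


def sshfp_record(hostname: str, pubkey_line: str) -> str:
    """Return a SHA-256 SSHFP record for one line of a host key .pub file."""
    key_type, blob_b64 = pubkey_line.split()[:2]
    blob = base64.b64decode(blob_b64)
    digest = hashlib.sha256(blob).hexdigest()
    # Fingerprint type 2 means SHA-256 (type 1 is SHA-1).
    return f"{hostname} IN SSHFP {SSHFP_ALGORITHMS[key_type]} 2 {digest}"
```

On the client, the VerifyHostKeyDNS option controls whether ssh consults these records.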

jsiepkes19h52m

On top of that you would need something to secure DNS. Like DNSSEC or at the very least use DNS over TLS or DNS over HTTPS. None of these are typically enabled by default.

fc417fc80218h3m

Any idea if there's a standardized location, something like /.well-known/ssh?

bobo5653919h41m

With the recent wave of npm hacks stealing private keys, I wanted to limit key lifetimes.

I've set up a couple of yubikeys as SSH CAs on hosts I manage. I use them to create short lived certs (say 24h) at the start of the day. This way I only have to enter the yubikey pin once a day.

I could not find an easy way to limit maximum certificate lifetime in openssh, except for using the AuthorizedPrincipalsCommand, which feels very fragile.

Does anyone else have any experience with a similar setup? How do you limit cert max lifetime?
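
One possible shape for such a check, offered as a sketch rather than a recommendation: a helper that reads the "Valid:" line from `ssh-keygen -L` output and rejects certificates whose validity window exceeds a maximum. It assumes the "Valid: from <start> to <end>" form with ISO-style timestamps; how the certificate text reaches the helper (for example from an AuthorizedPrincipalsCommand script) is left open.

```python
import re
from datetime import datetime, timedelta

# Assumed shape of the validity line printed by `ssh-keygen -L`.
VALID_RE = re.compile(r"Valid: from (\S+) to (\S+)")


def lifetime_ok(keygen_output: str, max_lifetime: timedelta) -> bool:
    """True only if the certificate's validity window is at most max_lifetime."""
    match = VALID_RE.search(keygen_output)
    if match is None:
        # Covers "Valid: forever" and anything else we cannot parse: reject.
        return False
    try:
        start, end = (datetime.fromisoformat(t) for t in match.groups())
    except ValueError:
        return False
    return timedelta(0) < end - start <= max_lifetime
```

Failing closed on anything unparseable keeps an unexpected output format from silently admitting a long-lived certificate.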

moviuro19h4m

All those articles about SSH certificates fall short of explaining how the revocation list can/should be published.

Is that yet another problem that I need to solve with syncthing?

https://man.openbsd.org/ssh-keygen.1#KEY_REVOCATION_LISTS

blipvert18h49m

If you generate short lived certificates via an automated process/service then you don’t really need to manage a revocation list as they will have expired in short order.

gunapologist9918h50m

Anyone tried out Userify? It creates/removes ssh pubkeys locally so (like a CA) no authn server needs to be online. But unlike certs, active sessions and processes are terminated when the user access is revoked.

jamiesonbecker18h31m

We're in the process of updating the experience to this century! ;)

We've always taken the stance that crusty is better than vulnerable, but it turns out that not having a modern experience after 15 years is starting to feel like maybe we need to step up the features and shininess :)

sqbic18h22m

I've had very good experiences with PrivX from SSH Communications Security (the guys who invented SSH) for managing secure remote access, including SSH certificates and also cert-based Windows authentication. It supports other kinds of remote targets too, via web UI or with native clients. Great product.

jamiesonbecker18h18m

SSH certs quietly hurt in prod. Short-lived creds + centralized CA just moves complexity upward without solving the core problem: user management.

The system shifts from many small local states to one highly coupled control point. That control point has to be correct and reachable all the time. When it isn’t, failures go wide instead of narrow.

Example: a few boxes get popped and start hammering the CA. Now what? Access is broken everywhere at once.

Common friction points:

     1. your signer that has to be up and correct all the time
     2. trust roots everywhere (and drifting)
     3. TTL tuning nonsense (too short = random lockouts, too long = what was the point)
     4. limited on-box state makes debugging harder than it should be
     5. failures tend to fan out instead of staying contained

Revocation is also kind of a lie. Just waiting for expiry and hoping that’s good enough.

What actually happens is people reintroduce state anyway: sidecars, caches, agents… because you need it.

We went the opposite direction:

     1. nodes pull over outbound HTTPS
     2. local authorized_keys is the source of truth locally
     3. users/roles are visible on the box
     4. drift fixes itself quickly
     5. no inbound ports, no CA signatures (WELL, not strictly true*!)
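
To make the pull-and-reconcile idea in the list above concrete, here is a hypothetical Python sketch of the comparison step only. It is not Userify's implementation; fetching the desired keys over HTTPS and writing the file are deliberately left out.

```python
from typing import Iterable, List, Tuple


def reconcile(current: Iterable[str],
              desired: Iterable[str]) -> Tuple[List[str], List[str], List[str]]:
    """Compare local authorized_keys lines with the centrally desired set.

    Returns (new_lines, added, removed). The desired set always wins, so a
    key added by hand on the box is removed again on the next pull.
    """
    have = [line.strip() for line in current if line.strip()]
    want = [line.strip() for line in desired if line.strip()]
    added = [key for key in want if key not in have]
    removed = [key for key in have if key not in want]
    return want, added, removed
```

Running something like this on a timer is what would make drift correct itself: whatever the box currently holds, the next pull converges it back to the desired list.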

You still get central control, but operation and failure modes are local instead of "everyone is locked out right now."

That’s basically what we do at Userify (https://userify.com). Less elegant than certs, more survivable at 2am. Also actually handles authz, not just part of authn.

And the part that usually gets hand-waved with SSH CAs:

     1. creating the user account
     2. managing sudo roles
     3. deciding what happens to home directories on removal
     4. cleanup vs retention for compliance/forensics

Those don’t go away - they're just not part of the certificate solution.

* (TLS still exists here, just at the transport layer using the system trust store. That channel delivers users, keys, and roles. The rest is handled explicitly instead of implied.)

ngrilly17h52m

How do you solve TOFU?

viraptor22h25m

For the "Automate host key certificate distribution?" section, author skips over the part where the client is getting validated.

For EC2, I've got an automated system where the instances request signed keys from a lambda which validates the uptime (no new certificates for a 10 day old instance) and tags (don't grant the cert to just about any host). https://codeberg.org/viraptor/auto-ec2-host-key
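
The policy half of such a signer could be sketched as below. This is a hypothetical illustration, not the code from the linked repository: the 15-minute cutoff, the tag name "ssh-host-cert", and its value "allowed" are all invented for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Mapping


def may_sign_host_key(launch_time: datetime, tags: Mapping[str, str],
                      now: datetime,
                      max_age: timedelta = timedelta(minutes=15),
                      required_tag: str = "ssh-host-cert") -> bool:
    """Decide whether an instance may receive a signed host certificate."""
    age = now - launch_time
    # Only freshly launched instances qualify; an old box asking is suspicious.
    if age < timedelta(0) or age > max_age:
        return False
    # The instance must be explicitly opted in through a tag (hypothetical name).
    return tags.get(required_tag) == "allowed"


if __name__ == "__main__":
    now = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
    fresh = now - timedelta(minutes=2)
    print(may_sign_host_key(fresh, {"ssh-host-cert": "allowed"}, now))
```

The launch time and tags would come from the cloud provider's API rather than from the instance itself, since anything the requester supplies directly cannot be trusted.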

symgryph22h12m

Would this work with SK keys?

poptart17h27m

It does! When I wrote about SSH certs the FIDO key support came out like a week later and it works out of the box. In fact you can do fun things like restrict commands to specific keys, so when you swap in a key it will behave differently on the same command.

antonmedv21h4m

Code boxes are not scrollable on mobile

erock20h18m

Nice article! We recently introduced ssh cert support for pico.sh (https://pico.sh/access-control) and we agree the UX is better. It gives the account admin full control over the keypairs that are allowed to authn, and by leveraging principals we have a mechanism for authz. Revocation is something we have to implement ourselves, but it's pretty simple: reject this pubkey from authn.

Golang's crypto/ssh made ssh certs ~100 loc to implement

tonyg17h59m

You might be interested in https://codeberg.org/forgejo/forgejo/pulls/11746 .

ahelwer13h23m

[Disclaimer, this company once paid me for a contract, which is how I found out about them] SSH certs are also managed pretty well by Teleport. It works by having an agent live on all your nodes to set up a short-lived certificate for just-in-time access whenever you want to SSH in. The author links to SmallStep SSH, which from a cursory glance seems similar.