Web Single Sign-On, the SAML 2.0 perspective

blog.theodo.com

108 points by guillaumeklaus 6 years ago

sk5t 6 years ago

Having worked with SAML, ADFS, etc., in substantial depth, I will opine that this article doesn't contribute above and beyond the same introductions to the technology from 10-15 years ago. It's just enough info to get a naive reader into trouble.

newusertoday 6 years ago

Do you have any suggestions for the best practices for implementing SAML? or any other blogs that you recommend which would cover this area in depth?
- sk5t 6 years ago
  
  Use the most battle-tested, well-supported SAML service available, even if it's more pain up-front; something that documents what XML canonicalization and signature algorithms and crypto suites it supports, so you're not dead in the water upon encountering a federation partner that requires encryption, or claims filtering and transformation, or whatever.
  The federation service should live outside your app(s); app(s) trusts your fed server only, and the fed server manages trust relationships with third parties, handles stuff like claims mapping; apps should be architected to understand that user information has been somehow added to the request context, but isolated from the mechanism, and ideally not expect to have an external source of user state.
  Get comfortable with the protocol over HTTP; capture the exchanges, decode the payloads.
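To make "decode the payloads" concrete, here is a minimal sketch using only the Python standard library. It assumes the two common bindings: HTTP-Redirect (raw DEFLATE, then base64, then URL-encoding) and HTTP-POST (base64 only). The function names and the sample URL are hypothetical.

```python
import base64
import zlib
from urllib.parse import parse_qs, urlencode, urlparse


def decode_redirect_param(value: str) -> str:
    """Decode a SAMLRequest/SAMLResponse value captured from an HTTP-Redirect URL.

    Assumes the value is already URL-decoded (parse_qs does that).
    """
    compressed = base64.b64decode(value)
    # wbits=-15 selects a raw DEFLATE stream with no zlib header.
    return zlib.decompress(compressed, -15).decode("utf-8")


def decode_post_param(value: str) -> str:
    """Decode a SAMLResponse form field from the HTTP-POST binding (base64 only)."""
    return base64.b64decode(value).decode("utf-8")


def encode_redirect_param(xml: str) -> str:
    """Inverse of decode_redirect_param, useful for round-trip experiments."""
    compressor = zlib.compressobj(wbits=-15)
    raw = compressor.compress(xml.encode("utf-8")) + compressor.flush()
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    # Hypothetical request and IdP URL, for illustration only.
    sample = '<samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" ID="_example"/>'
    url = "https://idp.example.org/sso?" + urlencode(
        {"SAMLRequest": encode_redirect_param(sample)}
    )
    captured = parse_qs(urlparse(url).query)["SAMLRequest"][0]
    print(decode_redirect_param(captured))
```

Pasting a value copied from the browser's network tab into one of the two decode functions should give readable XML, unless the partner encrypts assertions.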
- jarvuschris 6 years ago
  
  Just do it and see what happens ¯\_(ツ)_/¯
  Most certainly use a library, at least for building and parsing messages. All the complexity is in figuring out how to use your particular library and working out the quirks of the systems you're connecting to: which fields and formats they expect. It's not really something you can read up on and then know before you start.
  
  zrail 6 years ago
  
  This is 100% my experience as well, as a SAML SP. After the first few customers, onboarding largely consisted of certificate exchanges and figuring out where the IdP put the fields we needed. Before that, there was a lot of development to add those configuration knobs.
  
  mcguire 6 years ago
  
  XML encryption is weird. Some IdPs can handle the official "Encryption inside an XML doccy" thing, but others only like XML, encrypted.
- LoSboccacc 6 years ago
  
  you don't really go and implement saml, you deploy saml. you pick one service depending on your preferred language and go with it.
  I've worked with both Apereo CAS (Java; it was called Jasig or something like that back then) and SimpleSAMLphp, and they coped with multiple proprietary dialects with minimal effort, so those are good starting points.
commandlinefan 6 years ago

Sadly, a large percentage of printed books are just as devoid of meaningful content. At least we only paid for this one with our time.
- Leace 6 years ago
  
  Maybe it's my selection bias but I found books that I read [0] to have higher information density than what I find online.
  [0]: I usually pick books with "Advanced" in the title (or similar) even when I'm a beginner in the subject at hand. It's easier for me to work backwards from the advanced stuff.

CaliforniaKarl 6 years ago

Here's a tip for anyone planning on working with customers who are US research or education institutions: Have one of your Higher Ed customers sponsor you as an InCommon sponsored partner, and use that to join the InCommon Federation (https://www.incommon.org/)

InCommon is, among other things, a US federation of SAML SPs and IdPs. And this is one of the interesting things about SAML. Let's say that you are a Service Provider. Normally, for each customer, you have to do at least one exchange of metadata (as u/victorNicollet mentioned elsewhere in the comments). If you have multiple SPs (say, a dev and a prod), or the customer has multiple IdPs (again, a dev and a prod), then you have to do this exchange multiple times.

An alternative is to join a federation. With a federation, the federation receives all of the metadata (all of the information about SPs and IdPs), and you trust the combined metadata the federation provides. And for Service Providers, there is a separate metadata feed of just IdPs.

What does this get you? It means, if your customer is already part of the federation, you don't have to exchange metadata with them. At most, they (the customer) might have to release additional attributes to you, that are not released by default (phone number, for example).

For InCommon, you can see all of the federation participants here: https://www.incommon.org/federation/incommon-federation-part...

Details on the technical side of the process are available here: https://spaces.at.internet2.edu/display/InCFederation/How+to...

tingletech 6 years ago

I literally have nightmares about Shibboleth and InCommon. My university system has at least 11 IdPs (I think some of the medical centers have their own, but my apps aren't generally used by medical centers). Even with InCommon, I had to negotiate attribute release with every IdP to get email and eppn. Even after filling out the forms that went through a systemwide identity committee, I had to cold call half the campus identity management groups to get the attributes. To even get my metadata into InCommon, I had to practically camp out in front of some dude's office because I could not get emails or phone calls returned. But I was getting it set up before the Research and Scholarship category came out. If you can get your app/SP approved, and your target IdPs support it, the attributes are pre-negotiated.
Plus the software was a PITA too.
I'm trying to get a "bridge" set up so I can have one SP in R&S that supports OIDC, that way when I do a new app I can just have the app use open id connect and not have to deal with shibboleth.
Research and Scholarship Category https://spaces.at.internet2.edu/display/InCFederation/Resear...
rb12345 6 years ago

To expand on this from a non-US perspective, the advice to join a federation also applies to working with universities elsewhere. Having joined one federation (e.g. InCommon or the UK Access Management Federation (https://www.ukfederation.org.uk/)), you can then get your metadata published in eduGAIN (https://edugain.org/) for institutions elsewhere in the world to use.
Other things to note from a HE perspective:
- Most potential users will likely be using Shibboleth as an IdP, as opposed to ADFS, Ping Federate, Okta and the like. This means that federation-supplied metadata is far easier to work with than setting up per-SP metadata manually. (In ADFS at least, all the relying party setup is fairly manual, although PowerShell can make this slightly less painful.)
- The admins running the IdP will most likely not be the people buying or using the services provided by an SP. This can complicate setting up SAML integrations.
- SPs should try to support standard LDAP attribute schemas like the eduPerson (https://wiki.refeds.org/display/STAN/eduPerson) schema. These are likely to appear as "urn:oid:..." attributes in the SAML assertion. My personal suggestion for mixed HE/enterprise SPs would be to map both the eduPerson URNs and the equivalent ADFS claim identifiers to one internal identifier.
- As noted elsewhere, email addresses as identifiers are problematic in the general case: people change names, people change departments, and people may have multiple email addresses. The eduPersonPrincipalName attribute is generally more stable and is still non-opaque.
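
The suggestion above to map both naming schemes onto one internal identifier could be sketched roughly as follows. The internal key names are made up for illustration, and pairing the ADFS UPN claim with eduPersonPrincipalName is an assumption rather than a formal equivalence; check what each partner actually releases.

```python
from typing import Dict, List

# Attribute names as they tend to arrive in assertions -> one internal key.
# The internal keys on the right are hypothetical.
ATTRIBUTE_ALIASES: Dict[str, str] = {
    # eduPersonPrincipalName, and (by assumption) the ADFS UPN claim
    "urn:oid:1.3.6.1.4.1.5923.1.1.1.6": "principal_name",
    "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/upn": "principal_name",
    # mail
    "urn:oid:0.9.2342.19200300.100.1.3": "email",
    "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress": "email",
    # givenName
    "urn:oid:2.5.4.42": "given_name",
    "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/givenname": "given_name",
    # sn (surname)
    "urn:oid:2.5.4.4": "family_name",
    "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/surname": "family_name",
}


def normalize_attributes(raw: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Translate assertion attribute names to internal keys.

    Unknown attributes are dropped; values arriving under two aliases of the
    same internal key are merged without duplicates.
    """
    normalized: Dict[str, List[str]] = {}
    for name, values in raw.items():
        key = ATTRIBUTE_ALIASES.get(name)
        if key is None:
            continue
        bucket = normalized.setdefault(key, [])
        for value in values:
            if value not in bucket:
                bucket.append(value)
    return normalized
```

The rest of the application then only ever sees `principal_name`, `email` and so on, whichever kind of IdP the user came from.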

perlgeek 6 years ago

While this article is very clear, I had hoped (based on the title, I guess) for some more information:

How well does it actually work in practice?

How does the authorization part work?

How easy/hard is it for a developer? How about setting up an identity provider in dev?

imtringued 6 years ago

It depends on whether there are libraries available and how simple they are to use. The spring security SAML plugin is an absolute nightmare to integrate. A basic configuration requires 600 mostly redundant lines of which you probably copy paste 400 from the example project.
I've had a completely different experience with Grails, which is a web framework for Groovy that is based on Spring Boot. Its plugins are usually just a thin wrapper over the actual Spring libraries, but they take care of those 600 lines, and the end result is that you need to write slightly less than 50 lines of configuration in a YAML file.
Basically it can be hell or heaven depending on your language's ecosystem.
- vorg 6 years ago
  
  > Grails, which is a web framework for Groovy that is based on Spring Boot. Its plugins
  I thought Grails uses Spring Boot only from 3 onwards, and only versions up to Grails 2.x have a plugin ecosystem. Virtually no-one has upgraded their apps and plugins since Grails 3 came out in early 2015, and no-one's started new projects in Grails since then. To mention Spring Boot and Grails Plugins in the same statement is misleading.
  
  dmux 6 years ago
  
  >Virtually no-one has upgraded their apps and plugins...
  I'll let the gripe about plugins pass because I think it's a bad metric for measuring how popular a framework is. How many of the plugins from 1.x and 2.x are even relevant anymore?
  >...and no-one's started new projects
  This is a ridiculous claim. It's just spreading FUD.
  
  zmmmmm 6 years ago
  
  254 Grails 3 plugins [1] is not enough for you?
  [1] http://plugins.grails.org/
newscracker 6 years ago

> How easy/hard is it for a developer?
If one is targeting any SAML compliant IdP, then this would be very hard and painful unless the developer chooses to use a mature library that preferably provides high level APIs to handle the request generation and response handling.
> How about setting up an identity provider in dev?
The easiest way is not to set one up in dev yourself, but to use the free instances (potentially unstable at times) provided by the likes of Okta (okta.com) and other cloud-based services.
- rkeene2 6 years ago
  
  I've written a simple SAML IdP[0] which could be used for development purposes -- it's overall not too complex to deal with.
  Maybe I should release this whole thing so people could have a simple SAML server for testing... though I'd probably switch to a better XML parser ! :-)
  [0] http://www.rkeene.org/viewer/tmp/tcl-saml.tcl.htm (there's a web front-end that goes in front of this package to take the HTTP request, turn it into the appropriate username, and call this library).
  
  rkeene2 6 years ago
  
  Here it is as a service that will just send you a signed SAML assertion for any username given.
  IdP URL: https://rkeene.dev/saml-idp-1/
  Metadata: https://rkeene.dev/saml-idp-1/metadata.xml
- neilv 6 years ago
  
  Agreed, coding "from scratch" to use a SAML IdP would be painful for a typical developer, who thus far mainly has experience using Web/app frameworks. And, as with much software development, it's easy to introduce vulnerabilities. Use off-the-shelf, if you can.
  That said, I've had to implement several SSO protocols "from scratch" (using off-the-shelf libraries for HTTPS, XML, and crypto) for a large Web system, including at least a couple different SAML IdP variants. There was a lot of figuring things out, and being very careful, and it was difficult but useful experience.
  (The SAML IdP uses were reasonable for authentication, and simply worked. Where I saw trouble was when cleanroom implementing client side of some of the non-standard SSO protocols used by some large companies. Every time I discovered a security problem with a customer's protocol design/implementation, I had to go to the director I was working with (fortunately, a very smart engineering PhD, who took it seriously), then he had to explain to the customer. Presumably there were historical and/or requirements reasons for doing a non-standard protocol, but security is hard, and every programmer changing the code is potentially a weak link.)
mcguire 6 years ago

"How well does it actually work in practice?"
Quite well, once you get past the teething pains of finding out how the various components you're using understand the standard. In practice, it's very similar to whatever Google, GitHub, and Facebook use for their federated authentication. (Very similar.)
We didn't use the authorization, and I wasn't involved in picking or setting up the IdP(s). But once you have one working, it's relatively simple to slap in front of all of your apps.
Kalium 6 years ago

> How well does it actually work in practice?
This is down to quality of implementation. Is your IdP any good? Do you have good management practices around access control? Do your applications check authorization properly?
> How does the authorization part work?
Role information is passed to applications. It is the responsibility of applications to check authorization, in whichever way the developers of each application see fit.
I hope this helps! Please don't hesitate to ask if you have further questions.
victorNicollet 6 years ago

I'm saying all of this as the CTO of a B2B service that accepts SAML for authentication.
> How well does it actually work in practice?
For authentication, pretty well. No downtimes observed on Google, Microsoft or customer-owned IdPs over the last 4 years of using SAML in production.
The customer's IT can easily provide you all the necessary data about their IdP as a standard format metadata.xml file. In return, you give them two URLs (your entityId and your ACS endpoint).
In case of an IdP certificate change (once every few years unless something happens), they just send you, ahead of time, a new metadata.xml containing both the old and new certificates, so you don't have to be there with your finger on the button to change the certificate on your side.
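As a rough sketch of what "both old and new certificates" looks like on the SP side, the snippet below pulls every signing certificate out of an IdP metadata document with the standard library. The function name is hypothetical, and the actual signature check should still be left to a SAML library; the point is only that the SP keeps a list of trusted certificates rather than a single one.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespaces defined by the SAML 2.0 metadata and XML Signature specifications.
NS = {
    "md": "urn:oasis:names:tc:SAML:2.0:metadata",
    "ds": "http://www.w3.org/2000/09/xmldsig#",
}


def signing_certificates(metadata_xml: str) -> List[str]:
    """Return the base64 bodies of all IdP signing certificates in the metadata."""
    root = ET.fromstring(metadata_xml)
    certs: List[str] = []
    for key in root.iterfind(".//md:IDPSSODescriptor/md:KeyDescriptor", NS):
        # A KeyDescriptor without a "use" attribute serves for both signing
        # and encryption, so it is treated as a signing key here.
        if key.get("use", "signing") != "signing":
            continue
        for cert in key.iterfind(".//ds:X509Certificate", NS):
            # Strip the line breaks and indentation metadata files often contain.
            certs.append("".join((cert.text or "").split()))
    return certs
```

During a rollover the list has two entries, and an assertion is accepted if its signature verifies against either one.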
The only complex thing is deciding whether any user that comes your way should have an associated user profile created in your application (i.e. any employee in your customer's directory is one of your users by default) or whether you deny access to users that have not been manually registered in your application by the customer's admins.
> How does the authorization part work?
Not as elegantly as authentication. You need the IdP to provide you with additional information, such as roles. Not all IdPs can do this properly. Then, you need to manage your own mapping from that information to actual authorization info. Also, since you only get data from the IdP once (as part of the identity assertion), you won't have the granularity to provide per-resource access control.
In the end, we've found it far easier to use SAML only for authentication, and to implement an RBAC on our side, based on the identity of the users.
> How easy/hard is it for a developer? How about setting up an identity provider in dev?
Mature libraries exist for all major languages to deal with the crypto part, and the SAML protocol itself is trivial in terms of HTTP (redirect user to appropriate page on the IdP, receive a crypto-signed identity assertion on your ACS endpoint). It took me 3 days to go from zero to fully implemented and tested, in C#, and with extensive logging to detect sign-in issues.
Whenever possible, use the email as the user identifier. You can manage with any other kind of identifier, but debugging is significantly easier, since you don't have access to the customer's IdP to know the correspondence between an actual user and its opaque identifier.
Another important tip: even if each customer subscription is strictly owned by one customer, some of your customer's employees may need to access several customer subscriptions (take Wrike project management: you can use it internally, and also participate as an external contributor in projects from other companies). So you need to either allow a user account to access several customer subscriptions (likely hard) or allow a single SAML login to connect to several user accounts in several customer subscriptions. With email + password this could be worked around with mail aliases (e.g. victor+companyA@example.com, victor+companyB@example.com), but with SAML you are limited to only one identity. If you don't allow multi-user SAML, you end up in a situation like Wrike (again), where we actively discourage our customers from using Wrike internally because we want their employees to connect to our Wrike subscription...
Setting up an IdP... depends on how much you want to dogfood. If you have Office 365 or Google Suite, you already have an IdP that you can use easily to connect to your own system. I used Salesforce, which has a "free" level.
- CaliforniaKarl 6 years ago
  
  >Whenever possible, use the email as the user identifier. You can manage with any other kind of identifier, but debugging is significantly easier, since you don't have access to the customer's IdP to know the correspondence between an actual user and its opaque identifier.
  This is an assumption that a number of service providers run in to when setting up a SAML relationship with Stanford’s IdP: Not every Stanford person has a published email address. In particular, people who have been sponsored for accounts (in order to log in to Stanford systems) don’t have their email address published (we don’t have their permission to do so).
  Email addresses can also change.
  As was mentioned, there are opaque identifiers that uniquely identify someone and that do not change. We also provide the person’s SUNetID (their Stanford username), even if there is no email address.
  
  victorNicollet 6 years ago
  
  I agree that the service provider shouldn't expect the SAML identifier to be an email. I suspect that the SUNetId is something readable (based on the human's name) as opposed to a fully opaque alphanumeric string ?
  On the other hand, having emails (or namespaced identifiers that look like emails but aren't) does provide a nice way to start the login process from the service provider's login page, as opposed to having to start from the IdP side, since the service provider can infer which IdP to use based on the identifier's namespace.
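
A minimal sketch of that "infer the IdP from the identifier's namespace" idea, assuming the service provider keeps its own registry of customer domains. The domains and URLs below are placeholders.

```python
from typing import Dict, Optional

# Hypothetical registry: identifier namespace (the part after "@") -> IdP SSO URL.
IDP_BY_DOMAIN: Dict[str, str] = {
    "customer-a.example": "https://idp.customer-a.example/sso",
    "customer-b.example": "https://login.customer-b.example/saml",
}


def idp_for_identifier(identifier: str) -> Optional[str]:
    """Pick the IdP to redirect to, based on the namespace of an email-like identifier.

    Returns None when the identifier has no "@" or the namespace is unknown,
    in which case the login page would fall back to another method.
    """
    _, separator, domain = identifier.rpartition("@")
    if not separator:
        return None
    return IDP_BY_DOMAIN.get(domain.strip().lower())
```

The login page asks only for the identifier, calls something like `idp_for_identifier`, and starts the SP-initiated flow against the returned URL.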
  
  CaliforniaKarl 6 years ago
  
  >I agree that the service provider shouldn't expect the SAML identifier to be an email. I suspect that the SUNetId is something readable (based on the human's name) as opposed to a fully opaque alphanumeric string ?
  Yup. For those curious, you can find Stanford's list of released-by-default attributes here:
  https://uit.stanford.edu/service/saml/arp
  >On the other hand, having emails (or namespaced identifiers that look like emails but aren't) does provide a nice way to start the login process from the service provider's login page, as opposed to having to start from the IdP side, since the service provider can infer which IdP to use based on the identifier's namespace.
  Yup, totally agree, and that's one of the annoying things about SAML. In Stanford's case, that's the eduPersonPrincipalName attribute.
- davewritescode 6 years ago
  
  > Whenever possible, use the email as the user identifier. You can manage with any other kind of identifier, but debugging is significantly easier, since you don't have access to the customer's IdP to know the correspondence between an actual user and its opaque identifier.
  This isn't a great idea: things like email addresses get recycled fairly frequently. Some systems also let people change their email. For example, it's not uncommon for an email to change when someone gets married, or for an email to get recycled when an employee leaves a company.
  > In the end, we've found it far easier to use SAML only for authentication, and to implement an RBAC on our side, based on the identity of the users.
  This I 100% agree with, standardizing basic attributes is hard enough.
- rb12345 6 years ago
  
  The main things to remember about certificate rollover from experience are:
  - Publish the signing certificates ahead of time in the metadata before using them.
  - Install any new encryption certificates on your service (supporting old and new certificates) before publishing the certificates in the service metadata.
  In terms of SAML testing, https://samltest.id/ is useful for short-term testing of both IdPs and SPs.

youdontknowtho 6 years ago

New apps should NOT implement SAML.

All platforms support OpenID Connect. It's better in every way. SAML is effectively dead from a development standpoint. All the energy from the standards bodies is focused on OIDC.

zrail 6 years ago

Disagree. Enterprise customers have SAML SSO deployed and want to use it. If you want to serve those customers at any sort of scale you need to implement SAML.
- youdontknowtho 6 years ago
  
  Literally all of those enterprise SSO systems support OIDC. ADFS, PING, OKTA, Azure AD, ETC...
  
  user5994461 6 years ago
  
  SAML has two way authentication and some advanced security stuff. The IDP and the SP must be registered to one another and can actively sign and verify their transactions. It is noteworthy for some use cases like AWS or bank authentication.
  Nonetheless, we can all agree that SAML is an order of magnitude more difficult for no significant benefit. OpenID connect should be largely preferred.
  
  mosdl 6 years ago
  
  You can verify transactions in openid as well
  
  user5994461 6 years ago
  
  openid is a dead protocol that stopped being supported by major websites and authentication solutions years ago. Not to be confused with openid connect.

motohagiography 6 years ago

I've also worked with SAML and OIDC as an architect, and implemented OAuth (relatively trivially) for websites, and SAML is not a solution that can be considered naively. The problem SAML solves is if you would like to stand up a federated IdP, or integrate to an existing federation that uses SAML. If you don't know what these mean, I'd argue you are blessed, and it's not a problem you have.

I will say that many enterprise/institutional applications are simple CRUD web pages with gargantuan iceberged identity management problems that developers and project managers walk right into. Identity is a basic unanswered business model question that shouldn't be left to last minute implementation decisions.

If you are building a consumer service today, I don't see why you would bother having your own password database or authentication scheme when you can just adopt any of the major platforms as an IdP using OAuth.

Storing passwords today is dumb. I'd argue that if you do not want to adopt an IdP and use their OAuth or OIDC service, you should stand up a Vault instance and use it for TOTP secrets for users' 2FA apps. They are literally better user experiences than trying to remember a never-used password.
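
For readers who have not looked inside TOTP, the code a 2FA app shows is just an HMAC over the current 30-second window. Below is a minimal sketch of the RFC 6238 computation with the standard library, assuming the default SHA-1 variant; it is not how Vault implements it, and in practice the shared secret would stay inside Vault or a similar store.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1 variant) for the given Unix time."""
    now = time.time() if at is None else at
    counter = int(now // step)
    # HOTP (RFC 4226): HMAC over the big-endian 64-bit counter.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte selects a 4-byte window.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

A server verifying a code would compute `totp(secret)` for the current window (and usually one window either side, to tolerate clock drift) and compare with `hmac.compare_digest`.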

If you need to do enterprise 2FA, Authy and Google Authenticator and now Apple's Sign-In and Relay services are great options. Keycloak is another cool package for handling OAuth and OIDC services as well.

unethical_ban 6 years ago

Warning - this site is categorized as malicious by at least one known web filter.

newscracker 6 years ago

This is very short. Almost any other site on SAML would provide the same information, describing the flows.

For a better foundation, it could be expanded with explanations of signing, encryption, SLO (single logout), and other aspects.

matthewaveryusa 6 years ago

The hardest part of SAML is the bonehead way in which they do signatures -- It normalizes the data before signing it, and then embeds the signature within the XML document.

So if you want to verify the authenticity of the identity assertion, you need to parse the XML, find the node that's being signed, canonicalize the node, and then finally verify the signature. It's absolutely gross to do, and you need to use (lib)xmlsec1 unless you want to lose your sanity.
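
To see why the canonicalization step exists at all, here is a small standard-library sketch. It uses `xml.etree.ElementTree.canonicalize` (Python 3.8+), which implements C14N 2.0; SAML signatures normally use Exclusive C14N 1.0, so this illustrates the idea only and is not a way to verify real assertions. The sample documents are made up.

```python
import hashlib
from xml.etree.ElementTree import canonicalize  # C14N 2.0, Python 3.8+


def raw_digest(xml_text: str) -> str:
    """Hash the bytes exactly as they arrived."""
    return hashlib.sha256(xml_text.encode("utf-8")).hexdigest()


def canonical_digest(xml_text: str) -> str:
    """Hash the canonical form, so equivalent serializations hash the same."""
    return hashlib.sha256(canonicalize(xml_text).encode("utf-8")).hexdigest()


# Two serializations of the same element: attribute order, quoting and
# in-tag whitespace differ, but the XML content is identical.
doc_a = '<Assertion ID="_1" Version="2.0"><Subject>alice</Subject></Assertion>'
doc_b = "<Assertion   Version='2.0' ID='_1'><Subject>alice</Subject></Assertion>"

if __name__ == "__main__":
    print("raw digests equal:      ", raw_digest(doc_a) == raw_digest(doc_b))
    print("canonical digests equal:", canonical_digest(doc_a) == canonical_digest(doc_b))
```

Any intermediary that re-serializes the XML would break a signature over the raw bytes, which is the problem canonicalization is meant to solve. Leaving the real thing to xmlsec1 or a library built on it remains the sane choice.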

captn3m0 6 years ago

I implemented XML Signatures at our org for Payments Interface (MPI) with Card Networks and there is a part of my brain that will forever curse me for reading the XMLDSig spec.
How anyone ever thought XML Canonicalization was the right answer is beyond me.
- marcosdumay 6 years ago
  
  XML being the way it is, I don't think there was any alternative to canonicalization.
  
  tomjen3 6 years ago
  
  Off the top of my head: embed the XML as a pchar, sign only that, and then insert the signature as another XML value.
  Then you only parse the outside XML, extract the inner value and validate the signature, then you validate the inner value as-is, so no XML changes will undo the signature.
  
  marcosdumay 6 years ago
  
  OK, you do not need to keep the signature in the same document you are signing.
  But you'll have to canonicalize your XML anyway. There are plenty of security issues with signing non-canonical XML.

mcguire 6 years ago

"9. The user can access to its desired resource."

This one is fun. In step 1, the "service provider" auth/auth system freezes the original request and satisfies it in 9. I wrote a Java Servlet filter that performed this dance (https://maniagnosis.crsr.net/p/web-authentication.html) many years ago, which may be one of my few pieces of software still in use (although the guy who took it over may have rewritten it; he doesn't like other people's software).
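
The "freeze in step 1, satisfy in step 9" dance can be sketched as below. This is not the servlet filter linked above, just a minimal in-memory illustration in Python; it assumes the token travels through the IdP round trip in SAML's RelayState parameter, and the class and method names are hypothetical.

```python
import secrets
from typing import Dict, Optional


class PendingRequests:
    """Remember where an unauthenticated user was going while they visit the IdP."""

    def __init__(self) -> None:
        self._by_token: Dict[str, str] = {}

    def freeze(self, original_url: str) -> str:
        """Step 1: store the original request and return an opaque token.

        The token, not the URL itself, is what gets sent along as RelayState.
        """
        token = secrets.token_urlsafe(16)
        self._by_token[token] = original_url
        return token

    def thaw(self, token: str) -> Optional[str]:
        """Step 9: after the assertion is validated, recover the original request.

        Tokens are single-use; unknown or reused tokens yield None.
        """
        return self._by_token.pop(token, None)
```

A real deployment would keep this in the user's session or a shared store with an expiry, and would redirect to a default page when `thaw` returns None.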

JadeNB 6 years ago

I guess anyone reading such a technical article already knows this, but I didn't have even the vaguest clue what SAML was—my first thought was some dialect of ML—and skimming the article only slightly clarified for me. According to Wikipedia (https://en.wikipedia.org/wiki/Security_Assertion_Markup_Lang...), it's the "Security Assertion Markup Language", and the discussion is about implementations of Single Sign-On.

jteppinette 6 years ago

One of the most painful parts of my career was writing a multi-tenant SAML authentication service in Golang (~3 years ago). *experiencing painful flashbacks*

vgetr 6 years ago

It doesn’t.