User:Kjklingenstein/Internet identity in research and education

Much of the development in Internet identity has occurred within the research and education (R&E) community, with significant influence on other sectors. By Internet identity, we mean the set of mechanisms that allow the use of local authentication and attributes outside the perimeter of the organization that provides the local authentication and attribute services. “Authenticate locally, act globally” is a shibboleth of this vision. This article describes the history and current activities of federated identity within R&E.

Background
The same factors that led to the pivotal role of R&E in Internet development also motivated the R&E community to take a leadership role in the creation of Internet identity. Those factors include:
 * The need for researchers to collaborate between institutions
 * Distinctive privacy requirements that shape well-engineered approaches
 * Close interactions with the government sector and agencies that drive R&D investments
 * Strong relationships with standards organizations and corporate research groups
 * An orientation to open-source solutions

During much of the 1990s, identity was contained within each application: authentication and most attributes were implemented within the app. By the late 1990s, however, the idea of a central authentication service for the enterprise had taken root. The major precursor was the development of Kerberos at MIT. Its creation for Project Athena and subsequent adoption by Microsoft created a foundation on which organizational identity and Internet identity could be built. Another major precursor was PKI. As a technology, PKI offered global trust, but at the expense of ready local deployability, for both technical and policy reasons. A third major motivator was the plethora of applications that emerged as the web developed and organizations began running multiple applications to provide services to their users. The resulting need for single sign-on, implemented via a web browser, drove the creation of systems such as CAS, a popular open-source software system, and PubCookie.

At the same time, activities hosted by Educause engaged the R&E community with the federal government on a variety of PKI efforts. A continuing theme was establishing an R&E PKI infrastructure via a sector certificate authority. The approach would have been similar to what the federal government was building with its PIV cards and agency-based certificates, and the federal PKI bridge, which was already providing trust mappings between federal agency infrastructures, could potentially bridge to the R&E PKI root. The discussions never led to significant deployments, but they provided valuable lessons about where the flexibility and pivot points in trust needed to be for Internet-scale identity. This was the soil in which federated identity would grow.

The Beginnings
Much of the leadership for the R&E community work in identity came from RL “Bob” Morgan, an architect first at Stanford University and then at the University of Washington. Through his vision, his interlocking personal relationships with members of key standards organizations such as IETF and Kantara Initiative, and his ability to articulate complex thoughts with simple mumbles, he was pivotal in shaping much of the work until his untimely passing in 2012, and even thereafter. Capturing a list of other key individuals would only do injustice to those many accidentally overlooked in the process.

There were several organizational vehicles for much of the work. Chief among them was a set of middleware activities within Internet2. The organization was focused on advanced applications and therefore constructed networks to support those applications. It became apparent that while the networks and the applications were advancing, a layer of middleware needed to enable authentication and authorization for those applications was missing. That need led to the Internet2 Middleware initiative, which became a developmental hub for the work in Internet identity. Much of the work was done in close partnership with Educause, a broad higher education IT community, drawing on its expertise in outreach and training. To set direction and serve as a root anchor for name spaces, an architectural committee called MACE was established, populated by the "usual suspects" of identity leaders and chaired by RL "Bob" Morgan. Federal agencies, in particular the National Science Foundation and the National Institutes of Health, provided vital grant activity that supported the work, and those same agencies became adopters of the work product. Another important group was the Common Solutions Group, a set of leading universities that supplied much of the intellectual capital and early proving grounds for middleware work. A set of ad hoc activities connected these US efforts to expertise in Europe, where R&E networks in Sweden, the UK, France and Spain, among others, had the same drivers for federation as in the US. That locus eventually expanded to include Japan and Australia.

Internet identity was also being worked on in some pockets of industry. The IETF had long been the home for PKI standards, and bar BoFs there had explored many of the issues in building Internet-scale identity. The Burton Group hosted a set of industry-focused activities around NAC, the Network Applications Consortium, that highlighted leading efforts in both enterprise directory development and inter-organizational identity. OASIS became the standards organization for SAML. Companies such as IBM and Microsoft contributed expertise to the Internet2 middleware efforts.

First work: directories and schema
The initial efforts in the R&E community, beginning in 1999, were around directories and schema. It was evident that directories were critical to a scalable middleware infrastructure, and that a shared schema among the R&E community would enable meaningful sharing of attributes. Important principles such as differentiating values in the directories from values on the wire, and using privacy-preserving attributes and identifiers, helped set the foundations for federated identity. Reference documents such as the LDAP Recipe and the eduPerson schema were developed, along with community consensus processes, heavily influenced by the IETF, to create community standards.

The placemat
The major driver for much of the Internet2 Middleware work was to create a sustainable and scalable community mechanism for privacy-preserving inter-realm authentication and authorization. Use cases in science, in scholarly activities, and in community service drove a set of requirements that were the basis for investigations into using PKI, extending the Kerberos PAC concept to inter-realm use, and other architectures. Those evaluations concluded that each approach had significant problems, ranging from deployability to scalability. The decision was made to pursue strategies that extended institutional web single sign-on systems, pushing the concept of single sign-on into a general inter-institutional infrastructure.

That decision had several key consequences. It introduced a distinctive set of privacy considerations. It charted a multilateral approach and, with that, a critical reliance on metadata. It led to trust frameworks that would be deployable and scalable.

In early 2001, an architecture for interinstitutional web SSO was established. Echoing an early IETF design principle that a good protocol should fit on a cocktail napkin, the architecture was sketched one night on a restaurant placemat. Because of the breadth of use cases it had to address, the approach was more complicated than first envisioned, and so there was a six-month reduction effort that attempted to simplify the flows and reduce the number of components. The architecture that emerged from that review, around the end of 2001, was the original one, but the insights and understandings developed in the review led to a more rapid and solid development effort.

Around the same time, OASIS was beginning to convene a working group around a security markup language to address use cases of outsourced applications working with enterprise identity infrastructure. Key R&E identity leaders met with the OASIS business community to determine how to partition development activities. The business use cases were all bilateral relationships, while the R&E model was multilateral. It was agreed that the basic bilateral exchange protocol, SAML, would be developed within OASIS, with that architecture designed to accommodate multilateral requirements, such as supporting scoped attributes. The multilateral dimensions, from scoped attributes to shared metadata and schema, would be addressed within the R&E community development. That distinction and the resulting collaboration (many of the writers and editors of the OASIS SAML draft were also developing the R&E multilateral aspects) were perhaps the most critical factors in the success that followed.
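As an illustration of why scoped attributes belong to the multilateral side of that split, the following minimal Python sketch shows a service checking a scoped value of the form value@scope against the scopes that shared federation metadata registers for the issuing identity provider. The entity IDs, scopes, and function names are hypothetical and chosen only for illustration; real deployments take this information from signed federation metadata rather than a hard-coded table.

```python
from typing import Dict, Set, Tuple

# Hypothetical stand-in for federation metadata: the scopes each identity
# provider is registered to assert. All entity IDs and scopes are made up.
REGISTERED_SCOPES: Dict[str, Set[str]] = {
    "https://idp.example.edu/idp": {"example.edu"},
    "https://idp.example.org/idp": {"example.org", "lab.example.org"},
}


def split_scoped(value: str) -> Tuple[str, str]:
    """Split 'value@scope' at the last '@' into (value, scope)."""
    local, sep, scope = value.rpartition("@")
    if not sep or not local or not scope:
        raise ValueError(f"not a scoped value: {value!r}")
    return local, scope


def accept_scoped(issuer: str, value: str) -> bool:
    """Accept a scoped value only if the issuer is registered for its scope."""
    try:
        _, scope = split_scoped(value)
    except ValueError:
        return False
    return scope in REGISTERED_SCOPES.get(issuer, set())


if __name__ == "__main__":
    # An IdP asserting a value inside its own registered scope is accepted.
    print(accept_scoped("https://idp.example.edu/idp", "member@example.edu"))
    # The same IdP asserting a value in another organization's scope is not.
    print(accept_scoped("https://idp.example.edu/idp", "faculty@example.org"))
```

In a purely bilateral arrangement, each pair of partners could agree on such limits privately; in a multilateral federation, the shared metadata is what lets every service apply the same check to every identity provider.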

From the R&E development came several enduring elements of Internet identity infrastructure. The first was Shibboleth, a widely used and definitive open-source federation software system, in use across a variety of sectors in the US and internationally. The second was a sequence of federations, such as InQueue, that led to InCommon and pioneered trust models that are both deployable and scalable. The federation work in turn drove yet another major component of Internet identity, end-entity metadata. Much of this was brought into focus at a workshop called Federation Soup in 2008.

Sequencing the building of institutional trust was delicate. The lessons of failed PKI deployments indicated that rigid policy requirements needed to be replaced with flexible and “roughly consistent” community agreements. Over several years, the community joined InQueue and then moved to InCommon.

The rise of R&E federations worldwide
Other countries had the same R&E drivers for interinstitutional identity and, starting in the early 2000s, a few places in Europe, most notably the UK, the Netherlands, Switzerland and Scandinavia, began to experiment with federated identity. The relatively small size and high expertise of these communities allowed rapid development and simplified architectures. In particular, hub-and-spoke models became effective national-level approaches, with the central hub taking on a variety of roles, including being the federation operator, acting as an attribute release and consent point, and being the IdP for its members.

While the growth was in national-level federations, the R&E community is global, and inter-federation was always the desired end state. In 2004, a first international gathering (informally called “Leading Trust to Slaughter”) was held in Slaughter, England, both to develop strategies to encourage federations in other countries and to build enough consistency in approaches to facilitate eventual inter-federation. One output of that meeting was the eventual formation of REFEDS, an international organization of R&E federations.

Other R&E Internet identity activities

The federation work was not the only activity harnessing the R&E community. There was a steady stream of ongoing work in PKI. This included establishing a Higher Ed Bridge Certificate Authority (HEBCA), with innovative technology approaches at Dartmouth College and a policy authority with broad R&E representation. Another activity was to establish a US Higher Education Root Authority (USHER) that would be included in campus versions of web browsers to provide inter-institutional capabilities. In addition, for ten years, NIST, NIH and Internet2 hosted a yearly workshop in Gaithersburg on cutting-edge PKI and other trust issues. The sessions, and the hallways around them, were quite valuable in facilitating academic and business cross-pollination.

Attributes
Though the early emphasis in Internet identity was on the strength of authentication and LOA (levels of assurance), the R&E community, with its privacy and access control use cases, had a particular focus on attributes and identifiers. This was evident in starting the eduPerson attribute schema well before starting software development to exchange those attributes. The consensus process took quite some time, balancing a rich set of requests against the understanding that a broader schema made deployments harder and that wide controlled vocabularies for attributes made interoperability less likely. In addition, early work on identifiers helped frame the relevant characteristics by which to view them.

There were some particularly useful engagements in this area. A “Tao of Attributes” workshop at NIH in early 2007 was one of the first investigations of the relevant metadata about attributes that was both viable to generate and responsive to needs. Early conversations about the scoping of attributes, effective ways to extend values within communities of interest, and building semantic communities of practice were put to use as InCommon grew into a schema proving ground.

Current activities
The R&E community continues today as a leader in Internet identity, building practical and deployable approaches that are motivated by distinctive use cases in scale, in privacy and security, and in trust. The success of R&E interfederation has resulted in very large and unwieldy metadata sets, and so the community is moving to dynamic metadata delivery, much as /etc/hosts evolved into DNS. In privacy, the particular needs of the academic community continue to drive work on code-of-conduct and user-consent tools. New incident-handling mechanisms are being established to deal with federated security events. And while federations have developed in other verticals, the largest multilateral federations continue to exist within the R&E community.
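To make the /etc/hosts-to-DNS analogy concrete, the following Python sketch contrasts downloading one large metadata aggregate with asking a metadata service for a single entity on demand. It only builds a lookup URL and performs no network access. The URL layout is an assumption loosely modeled on the community's Metadata Query (MDQ) work, and the base URL, entity ID, and function name are hypothetical.

```python
import hashlib
from urllib.parse import quote


def per_entity_url(base_url: str, entity_id: str, use_hash: bool = False) -> str:
    """Build a per-entity metadata lookup URL (assumed MDQ-style layout).

    The entity ID is percent-encoded into a single path segment. With
    use_hash=True, a '{sha1}'-prefixed hex digest of the entity ID is used
    as the identifier instead of the entity ID itself.
    """
    if use_hash:
        digest = hashlib.sha1(entity_id.encode("utf-8")).hexdigest()
        identifier = "{sha1}" + digest
    else:
        identifier = entity_id
    # safe="" forces ':' and '/' to be encoded so the ID stays one segment.
    return base_url.rstrip("/") + "/entities/" + quote(identifier, safe="")


if __name__ == "__main__":
    # Hypothetical metadata service and identity provider.
    base = "https://mdq.example.org/"
    idp = "https://idp.example.edu/idp"
    print(per_entity_url(base, idp))
    print(per_entity_url(base, idp, use_hash=True))
```

A service using this style of lookup fetches and caches only the metadata for partners it actually encounters, rather than periodically loading every entity in the interfederation.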