Commit 111fa16e authored by Dave Hart

Switch to using Lookup API instead of Ibis client. (#35)

Add settings for Lookup API and API Gateway credentials.

Add Lookup API operations to replicate the LDAP operations for gathering
information about users, groups and institutions.

Remove LDAP sync code and related configuration code.

Remove LDAP configuration options from the example file.

Remove LDAP library as a requirement, as it is no longer needed.

Remove LDAP references from documentation.

Add information to the README about how to provide API Gateway credentials.

Add YAML versions of API Gateway credentials file to `.gitignore`.
parent 24a42cac
Pipeline #230617 passed with warnings
@@ -110,7 +110,12 @@ venv.bak/
# Service account credentials (if instructions in README are followed)
credentials.json
# API Gateway credentials (if instructions in README are followed)
api_gateway_credentials.json
api_gateway_credentials.yaml
api_gateway_credentials.yml
# Local configuration
gsuitesync.yaml
.vscode/
\ No newline at end of file
.vscode/
# Google GSuite Synchronisation Tool
This repository contains a custom synchronisation tool for synchronising
information from the [Lookup service](https://www.lookup.cam.ac.uk/)'s LDAP
personality to a Google hosted domain (aka "GSuite").
information from the [Lookup service](https://www.lookup.cam.ac.uk/) to a
Google hosted domain (aka "GSuite").
Configuration is performed via a configuration file. Take a look at the [example
configuration file](configuration-example.yaml) for more information.
@@ -141,3 +141,14 @@ service account is marked as being willing to "su" to another Google user. By
adding the generated Client ID to the GSuite security settings you are, as
domain administrator, giving that service account the ability to act as any user
in the domain **subject to the listed scopes**.
## Preparing API Gateway credentials
The `api_gateway` section in the configuration file specifies where the
credentials for accessing API Gateway can be found. These credentials must be
supplied as a file in YAML/JSON format with values for the keys `base_url`,
`client_id` and `client_secret`. The key and secret can be found on the
[Apigee apps page](https://apigee.google.com/organizations/api-prod-a3dc87f7/apps)
by searching for the "Directory Synchronisation Tool" app (there are staging
and production versions). The Lookup API base URL can be found on the
[Lookup API page on API Gateway](https://developer.api.apps.cam.ac.uk/docs/lookup/1/overview).
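
As a rough illustration of the credentials file described above, the sketch below writes a file with the three keys and checks that none is missing. It uses the standard-library `json` module to stay self-contained, whereas `_get_lookup_client` in this commit reads the file with `yaml.safe_load`. The helper name `load_gateway_credentials`, the placeholder values and the temporary path are assumptions made for this example only.

```python
import json
import os
import tempfile

REQUIRED_KEYS = ("base_url", "client_id", "client_secret")


def load_gateway_credentials(path):
    """Load a JSON credentials file and check the required keys are present."""
    with open(path, "r") as stream:
        settings = json.load(stream)
    # Treat absent or empty values as missing.
    missing = [key for key in REQUIRED_KEYS if not settings.get(key)]
    if missing:
        raise ValueError("credentials file is missing: " + ", ".join(missing))
    return settings


# Placeholder values only; real values come from the Apigee apps page.
example = {
    "base_url": "https://api.example.invalid/lookup",
    "client_id": "example-client-id",
    "client_secret": "example-client-secret",
}

with tempfile.TemporaryDirectory() as tmp_dir:
    creds_path = os.path.join(tmp_dir, "api_gateway_credentials.json")
    with open(creds_path, "w") as stream:
        json.dump(example, stream)
    loaded = load_gateway_credentials(creds_path)
```

Checking the keys up front gives a clearer error than a `KeyError` raised later when the client is constructed.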
@@ -165,46 +165,27 @@ google_api:
# reading and writing. Default: null.
read_only_credentials: null
# Details about the LDAP server
ldap:
# Scheme and hostname of the LDAP server.
host: 'ldaps://ldap.example.com'
# Credentials to be used when accessing the LDAP server from outside of the
# CUDN.
#
# When both username and password strings are specified, the sync tool will
# use SSL when connecting to the LDAP server, and will attempt to
# authenticate with these credentials.
#
# Username needs to be the full DN of the group, e.g.
# groupid=123456,ou=groups,o=example-corps,dc=example,dc=com
#
# The username and password properties should _not_ be specified when running
# the sync tool inside the CUDN (which includes running in the CI pipeline).
username: null
password: null
# LDAP search base for users. Person filters are always relative to this.
user_search_base: 'ou=people,o=example-corps,dc=example,dc=com'
# LDAP search base for groups. Group filters are always relative to this.
group_search_base: 'ou=groups,o=example-corps,dc=example,dc=com'
# LDAP search base for institutions. Institution filters are always relative to this.
inst_search_base: 'ou=insts,o=example-corps,dc=example,dc=com'
# API Gateway configuration
api_gateway:
# Authentication
auth:
# Path to on-disk JSON credentials used when accessing the API.
credentials: "./api_gateway_credentials.json"
# Details about the Lookup API queries to perform as part of the sync process.
# These queries replace the LDAP queries that were previously used.
lookup:
# Filter to use to determine the "eligible" list of users. If a non-admin user
# is found on Google who isn't in this list, their account will be suspended.
eligible_user_filter: '(uid=*)'
eligible_user_filter: "person: crsid != ''"
# Filter to use to determine the "eligible" list of groups. If a group is
# found on Google that isn't in this list, it will be deleted.
eligible_group_filter: '(groupID=*)'
eligible_group_filter: "group: groupid != ''"
# Filter to use to determine the "eligible" list of institutions. If an
# institution is found on Google that isn't in this list, it will be deleted.
eligible_inst_filter: '(instID=*)'
eligible_inst_filter: "inst: instid != ''"
# Filter to use to determine the "managed" list of users. If a user appears in
# this list who isn't in Google their account is created. If the user metadata
......
"""
API Gateway authentication.
"""
import dataclasses
import logging
from .mixin import ConfigurationDataclassMixin
LOG = logging.getLogger(__name__)
@dataclasses.dataclass
class Configuration(ConfigurationDataclassMixin):
"""
Configuration of API Gateway access credentials.
"""
# Path to on-disk JSON credentials used when accessing APIs through API Gateway.
credentials: str
"""
Retrieving user information from an LDAP directory.
Retrieving user information from Lookup API.
"""
import dataclasses
import typing
from typing import Optional
from .mixin import ConfigurationDataclassMixin
@@ -11,29 +11,18 @@ from .mixin import ConfigurationDataclassMixin
@dataclasses.dataclass
class Configuration(ConfigurationDataclassMixin):
"""
Configuration for accessing the LDAP directory.
Configuration for filtering eligibility and managed users, groups and
institutions in Lookup API.
"""
host: str
user_search_base: str
group_search_base: str
inst_search_base: str
eligible_user_filter: str
eligible_group_filter: str
eligible_inst_filter: str
username: str = None
password: str = None
managed_user_filter: typing.Union[str, None] = None
managed_user_filter: Optional[str] = None
managed_group_filter: typing.Union[str, None] = None
managed_group_filter: Optional[str] = None
managed_inst_filter: typing.Union[str, None] = None
managed_inst_filter: Optional[str] = None
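
The `managed_*` filters are optional, and the retrieval code in this commit falls back to the matching `eligible_*` filter when one is `None`. A minimal sketch of that fallback follows; `FilterSettings`, `effective_managed_filter` and the managed filter string are hypothetical names and values used only for illustration.

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class FilterSettings:
    """Hypothetical, cut-down stand-in for the Lookup filter configuration."""
    eligible_user_filter: str
    managed_user_filter: Optional[str] = None


def effective_managed_filter(settings):
    """Use the managed filter when set, otherwise fall back to the eligible one."""
    if settings.managed_user_filter is not None:
        return settings.managed_user_filter
    return settings.eligible_user_filter


default_settings = FilterSettings(eligible_user_filter="person: crsid != ''")
custom_settings = FilterSettings(
    eligible_user_filter="person: crsid != ''",
    managed_user_filter="hypothetical-managed-filter",  # placeholder, not real query syntax
)
```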
@@ -5,7 +5,7 @@ import yaml
from .exceptions import ConfigurationNotFound
# Configuration declarations
from . import gapiauth, gapidomain, ldap, limits, sync, licensing
from . import api_gateway, gapiauth, gapidomain, limits, lookup, sync, licensing
LOG = logging.getLogger(__name__)
@@ -50,10 +50,12 @@ def parse_configuration(configuration):
"""
return {
'api_gateway_auth': api_gateway.Configuration.from_dict(
configuration.get('api_gateway', {}).get('auth', {})),
'sync': sync.Configuration.from_dict(configuration.get('sync', {})),
'gapi_domain': gapidomain.Configuration.from_dict(configuration.get('google_domain', {})),
'ldap': ldap.Configuration.from_dict(configuration.get('ldap', {})),
'limits': limits.Configuration.from_dict(configuration.get('limits', {})),
'lookup': lookup.Configuration.from_dict(configuration.get('lookup', {})),
'gapi_auth': gapiauth.Configuration.from_dict(
configuration.get('google_api', {}).get('auth', {})),
'licensing': licensing.Configuration.from_dict(configuration.get('licensing', {})),
......
@@ -191,7 +191,7 @@ class Comparator(ConfigurationStateConsumer):
#
# Note that the source of each of these groups may be either a Lookup group or a Lookup
# institution, which are handled the same here. Technically Lookup institutions do not
# have descriptions, but the code in ldap.py sets the description from the name for
# have descriptions, but the code in lookup.py sets the description from the name for
# Lookup institutions, which is useful since some institution names do not fit in the
# Google name field.
expected_google_group = {
......
@@ -2,37 +2,67 @@
Load current user, group and institution data from Lookup.
"""
import logging
import collections
import ldap3
import logging
import yaml
from identitylib.lookup_client_configuration import LookupClientConfiguration
from identitylib.lookup_client import ApiClient as LookupApiClient
from identitylib.lookup_client.api.person_api import PersonApi
from identitylib.lookup_client.api.group_api import GroupApi
from identitylib.lookup_client.api.institution_api import InstitutionApi
from .base import ConfigurationStateConsumer
LOG = logging.getLogger(__name__)
# Number of entries to request with each API request to list entities
LIST_FETCH_LIMIT = 1000
# The scheme used to identify users
UID_SCHEME = 'crsid'
# Extra attributes to fetch when checking managed entities
MANAGED_USER_FETCH = 'firstName'
USER_INST_FETCH = 'all_insts'
MANAGED_GROUP_FETCH = 'all_members'
# Properties containing user/group/institution information in search query results
USER_RESULT_PROPERTY = 'people'
GROUP_RESULT_PROPERTY = 'groups'
INST_RESULT_PROPERTY = 'institutions'
# User and group information we need to populate the Google user directory.
UserEntry = collections.namedtuple('UserEntry', 'uid cn sn displayName givenName licensed')
GroupEntry = collections.namedtuple('GroupEntry', 'groupID groupName description uids')
class LDAPRetriever(ConfigurationStateConsumer):
required_config = ('ldap', )
class LookupRetriever(ConfigurationStateConsumer):
required_config = ('lookup', 'api_gateway_auth', )
default_api_params = dict(_request_timeout=120)
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.lookup_api_client = self._get_lookup_client()
self.person_api_client = PersonApi(self.lookup_api_client)
self.group_api_client = GroupApi(self.lookup_api_client)
self.inst_api_client = InstitutionApi(self.lookup_api_client)
def retrieve_users(self):
# Get a set containing all CRSids. These are all the people who are eligible to be in our
# GSuite instance. If a user is in GSuite and is *not* present in this list then they are
# suspended.
LOG.info('Reading eligible user entries from LDAP')
LOG.info('Reading eligible user entries from Lookup')
eligible_uids = self.get_eligible_uids()
LOG.info('Total LDAP user entries: %s', len(eligible_uids))
LOG.info('Total Lookup user entries: %s', len(eligible_uids))
# Sanity check: there are some eligible users (else LDAP lookup failure?)
# Sanity check: there are some eligible users (else Lookup failure?)
if len(eligible_uids) == 0:
raise RuntimeError('Sanity check failed: no users in eligible set')
# Get a list of managed users. These are all the people who match the "managed_user_filter"
# in the LDAP settings.
LOG.info('Reading managed user entries from LDAP')
# in the Lookup settings.
LOG.info('Reading managed user entries from Lookup')
managed_user_entries = self.get_managed_user_entries()
# Form a mapping from uid to managed user.
@@ -63,25 +93,25 @@ class LDAPRetriever(ConfigurationStateConsumer):
# Get a set containing all groupIDs. These are all the groups that are eligible to be in
# our GSuite instance. If a group is in GSuite and is *not* present in this list then it
# is deleted.
LOG.info('Reading eligible group entries from LDAP')
LOG.info('Reading eligible group entries from Lookup')
eligible_groupIDs = self.get_eligible_groupIDs()
LOG.info('Total LDAP group entries: %s', len(eligible_groupIDs))
LOG.info('Total Lookup group entries: %s', len(eligible_groupIDs))
# Get a set containing all instIDs. These are all the institutions that are eligible to be
# in our GSuite instance. If an institution is in GSuite and is *not* present in this list
# then the corresponding group is deleted.
LOG.info('Reading eligible institution entries from LDAP')
LOG.info('Reading eligible institution entries from Lookup')
eligible_instIDs = self.get_eligible_instIDs()
LOG.info('Total LDAP institution entries: %s', len(eligible_instIDs))
LOG.info('Total Lookup institution entries: %s', len(eligible_instIDs))
# Add these sets together to form the set of all gids (the IDs of all eligible groups and
# institutions).
eligible_gids = eligible_groupIDs | eligible_instIDs
LOG.info('Total combined LDAP group and institution entries: %s', len(eligible_gids))
LOG.info('Total combined Lookup group and institution entries: %s', len(eligible_gids))
# Get a list of managed groups. These are all the groups that match the
# "managed_group_filter" in the LDAP settings.
LOG.info('Reading managed group entries from LDAP')
# "managed_group_filter" in the Lookup settings.
LOG.info('Reading managed group entries from Lookup')
managed_group_entries = self.get_managed_group_entries()
# Form a mapping from groupID to managed group.
@@ -96,8 +126,8 @@ class LDAPRetriever(ConfigurationStateConsumer):
)
# Get a list of managed institutions. These are all the institutions that match the
# "managed_inst_filter" in the LDAP settings.
LOG.info('Reading managed institution entries from LDAP')
# "managed_inst_filter" in the Lookup settings.
LOG.info('Reading managed institution entries from Lookup')
managed_inst_entries = self.get_managed_inst_entries()
# Form a mapping from instID to managed institution.
@@ -137,20 +167,22 @@ class LDAPRetriever(ConfigurationStateConsumer):
})
###
# Functions to perform LDAP calls
# Functions to perform Lookup API calls
###
def get_eligible_uids(self):
"""
Return a set containing all uids who are eligible to have a Google account.
Return a set containing all CRSids who are eligible to have a Google account.
"""
return {
e['attributes']['uid'][0]
for e in self._search(
search_base=self.ldap_config.user_search_base,
search_filter=self.ldap_config.eligible_user_filter,
attributes=['uid']
)
uid for uid in [
_extract_uid(person)
for person in self._fetch_all_list_results(
self.person_api_client.person_search,
USER_RESULT_PROPERTY,
self.lookup_config.eligible_user_filter
)
] if len(uid) > 0
}
def get_eligible_groupIDs(self):
@@ -159,12 +191,14 @@ class LDAPRetriever(ConfigurationStateConsumer):
"""
return {
e['attributes']['groupID'][0]
for e in self._search(
search_base=self.ldap_config.group_search_base,
search_filter=self.ldap_config.eligible_group_filter,
attributes=['groupID']
)
group_id for group_id in [
group.get('groupid', '')
for group in self._fetch_all_list_results(
self.group_api_client.group_search,
GROUP_RESULT_PROPERTY,
self.lookup_config.eligible_group_filter
)
] if len(group_id) > 0
}
def get_eligible_instIDs(self):
@@ -173,12 +207,14 @@ class LDAPRetriever(ConfigurationStateConsumer):
"""
return {
e['attributes']['instID'][0]
for e in self._search(
search_base=self.ldap_config.inst_search_base,
search_filter=self.ldap_config.eligible_inst_filter,
attributes=['instID']
)
inst_id for inst_id in [
inst.get('instid', '')
for inst in self._fetch_all_list_results(
self.inst_api_client.institution_search,
INST_RESULT_PROPERTY,
self.lookup_config.eligible_inst_filter
)
] if len(inst_id) > 0
}
def get_managed_user_entries(self):
@@ -187,19 +223,24 @@ class LDAPRetriever(ConfigurationStateConsumer):
"""
search_filter = (
self.ldap_config.managed_user_filter
if self.ldap_config.managed_user_filter is not None
else self.ldap_config.eligible_user_filter
self.lookup_config.managed_user_filter
if self.lookup_config.managed_user_filter is not None
else self.lookup_config.eligible_user_filter
)
return [
UserEntry(
uid=_extract(e, 'uid'), cn=_extract(e, 'cn'), sn=_extract(e, 'sn'),
displayName=_extract(e, 'displayName'), givenName=_extract(e, 'givenName'),
licensed=_extract_non_empty(e, 'misAffiliation')
uid=_extract_uid(person),
cn=person.get('registered_name', ''),
sn=person.get('surname', ''),
displayName=person.get('display_name', ''),
givenName=_extract_attribute(person, 'firstName'),
licensed=len(_extract_attribute(person, 'misAffiliation')) > 0,
)
for e in self._search(
search_base=self.ldap_config.user_search_base, search_filter=search_filter,
attributes=['uid', 'cn', 'sn', 'displayName', 'givenName', 'misAffiliation']
for person in self._fetch_all_list_results(
self.person_api_client.person_search,
USER_RESULT_PROPERTY,
search_filter,
extra_props=dict(fetch=MANAGED_USER_FETCH)
)
]
@@ -209,18 +250,25 @@ class LDAPRetriever(ConfigurationStateConsumer):
"""
search_filter = (
self.ldap_config.managed_group_filter
if self.ldap_config.managed_group_filter is not None
else self.ldap_config.eligible_group_filter
self.lookup_config.managed_group_filter
if self.lookup_config.managed_group_filter is not None
else self.lookup_config.eligible_group_filter
)
return [
GroupEntry(
groupID=_extract(e, 'groupID'), groupName=_extract(e, 'groupName'),
description=_extract(e, 'description'), uids=set(e['attributes'].get('uid', []))
groupID=group.get('groupid', ''),
groupName=group.get('name', ''),
description=group.get('description', ''),
uids=set([
member.identifier.value for member in group.get('members', [])
if member.identifier.scheme == UID_SCHEME
])
)
for e in self._search(
search_base=self.ldap_config.group_search_base, search_filter=search_filter,
attributes=['groupID', 'groupName', 'description', 'uid']
for group in self._fetch_all_list_results(
self.group_api_client.group_search,
GROUP_RESULT_PROPERTY,
search_filter,
extra_props=dict(fetch=MANAGED_GROUP_FETCH)
)
]
@@ -236,73 +284,90 @@ class LDAPRetriever(ConfigurationStateConsumer):
allows longer strings, and so will not truncate the name).
"""
# This requires 2 LDAP queries. First find the managed institutions.
# This requires 2 Lookup queries. First find the managed institutions.
search_filter = (
self.ldap_config.managed_inst_filter
if self.ldap_config.managed_inst_filter is not None
else self.ldap_config.eligible_inst_filter
self.lookup_config.managed_inst_filter
if self.lookup_config.managed_inst_filter is not None
else self.lookup_config.eligible_inst_filter
)
managed_insts = [
GroupEntry(
groupID=_extract(e, 'instID'), groupName=_extract(e, 'ou'),
description=_extract(e, 'ou'), uids=set(),
groupID=group.get('instid', ''),
groupName=group.get('name', ''),
description=group.get('name', ''),
uids=set()
)
for e in self._search(
search_base=self.ldap_config.inst_search_base, search_filter=search_filter,
attributes=['instID', 'ou']
for group in self._fetch_all_list_results(
self.inst_api_client.institution_search,
INST_RESULT_PROPERTY,
search_filter
)
]
managed_insts_by_instID = {g.groupID: g for g in managed_insts}
# Then get each eligible user's list of institutions and use that data to populate each
# institution's uid list.
eligible_users = self._search(
search_base=self.ldap_config.user_search_base,
search_filter=self.ldap_config.eligible_user_filter,
attributes=['uid', 'instID']
eligible_users = self._fetch_all_list_results(
self.person_api_client.person_search,
USER_RESULT_PROPERTY,
self.lookup_config.eligible_user_filter,
extra_props=dict(fetch=USER_INST_FETCH)
)
for e in eligible_users:
uid = e['attributes']['uid'][0]
for instID in e['attributes']['instID']:
if instID in managed_insts_by_instID:
managed_insts_by_instID[instID].uids.add(uid)
for user in eligible_users:
uid = user.identifier.value if user.identifier.scheme == UID_SCHEME else ''
if len(uid) > 0:
for inst in user.institutions:
if inst.instid in managed_insts_by_instID:
managed_insts_by_instID[inst.instid].uids.add(uid)
return managed_insts
def _search(self, *, search_base, search_filter, attributes):
# Use SSL to access the LDAP server when authentication credentials
# have been configured
use_ssl = bool(self.ldap_config.username and self.ldap_config.password)
ldap_server = ldap3.Server(self.ldap_config.host, use_ssl=use_ssl)
# Add authentication credentials if configured
username = self.ldap_config.username if self.ldap_config.username else None
password = self.ldap_config.password if self.ldap_config.password else None
# Connect to the LDAP server and perform the query
with ldap3.Connection(ldap_server, username, password, auto_bind=True) as conn:
return conn.extend.standard.paged_search(
search_base, search_filter, paged_size=1000, attributes=attributes)
def _extract(entry, attr, *, default=''):
"""
Extract an attribute from an ldap entry, returning the attribute itself if single-valued
otherwise the first value of a multivalued attribute.
If the entry doesn't have the attribute then `default` is returned.
"""
vs = entry['attributes'].get(attr, [])
if len(vs) == 0:
return default
if isinstance(vs, str):
return vs
return vs[0]
def _extract_non_empty(entry, attr):
"""
Extract an attribute from an ldap entry, returning whether the attribute exists and is
not empty.
"""
return len(entry['attributes'].get(attr, [])) > 0
def _fetch_all_list_results(self, api_func, result_prop, query, extra_props={}):
"""
Repeatedly make an API call with incrementing offset until no more results are returned,
at which point return all the retrieved results as a single list.
"""
offset = 0
result = []
all_results = []
while offset <= 0 or len(result) >= LIST_FETCH_LIMIT:
LOG.info(f'Fetching from Lookup API {result_prop}, offset {offset}')
response = api_func(
query=query, **self.default_api_params, **extra_props, limit=LIST_FETCH_LIMIT,
offset=offset
)
result = response.get('result', {}).get(result_prop, [])
all_results.extend(result)
offset += LIST_FETCH_LIMIT
return all_results
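
The offset-based paging performed by `_fetch_all_list_results` can be tried in isolation with a stand-in for the API. The sketch below is not the commit's code: `fake_search` and `fetch_all` are hypothetical, the page size is shrunk to 3, and `extra_props` defaults to `None` rather than a shared `{}`, which is a common precaution even though the original never mutates it.

```python
LIST_FETCH_LIMIT = 3  # small page size so the example is easy to follow


def fake_search(query, limit, offset):
    """Hypothetical stand-in for a Lookup API search call."""
    data = [f"user{i}" for i in range(7)]
    return {"result": {"people": data[offset:offset + limit]}}


def fetch_all(api_func, result_prop, query, extra_props=None):
    """Request pages at increasing offsets until a short page is returned."""
    extra_props = extra_props or {}
    all_results = []
    offset = 0
    while True:
        response = api_func(
            query=query, limit=LIST_FETCH_LIMIT, offset=offset, **extra_props
        )
        page = response.get("result", {}).get(result_prop, [])
        all_results.extend(page)
        # A page shorter than the limit means there is nothing left to fetch.
        if len(page) < LIST_FETCH_LIMIT:
            break
        offset += LIST_FETCH_LIMIT
    return all_results


people = fetch_all(fake_search, "people", "person: crsid != ''")
```

When the total is an exact multiple of the page size, one extra request returning an empty page is made before the loop stops, which matches the behaviour of the loop condition in the commit.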
def _get_lookup_client(self):
"""
Return a Lookup API client instance that can be used to access the Lookup API.
"""
creds_file = self.api_gateway_auth_config.credentials
LOG.info('Loading API Gateway app credentials from "%s"', creds_file)
with open(creds_file, "r") as stream:
settings = yaml.safe_load(stream)
config = LookupClientConfiguration(
settings['client_id'],
settings['client_secret'],
base_url=settings['base_url']
)
return LookupApiClient(config, pool_threads=10)
def _extract_uid(person):
identifier = person.identifier
return identifier.value if identifier.scheme == UID_SCHEME else ''
def _extract_attribute(entity, attr):
return next(
(x.value for x in entity.attributes if x.scheme == attr), ''
)
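
The `next(..., '')` idiom in `_extract_attribute` returns the first matching attribute value, or an empty string when no attribute has the requested scheme. A small sketch using `SimpleNamespace` objects as assumed stand-ins for Lookup API person objects (the real client types are not shown in this diff):

```python
from types import SimpleNamespace


def extract_attribute(entity, scheme):
    """Return the value of the first attribute with the given scheme, or ''."""
    return next((a.value for a in entity.attributes if a.scheme == scheme), "")


# Hypothetical person record with two attributes.
person = SimpleNamespace(attributes=[
    SimpleNamespace(scheme="firstName", value="Ada"),
    SimpleNamespace(scheme="misAffiliation", value="staff"),
])

given_name = extract_attribute(person, "firstName")
licensed = len(extract_attribute(person, "misAffiliation")) > 0
missing = extract_attribute(person, "noSuchScheme")
```

Because the default is an empty string, callers such as the `licensed` check can test the length without first guarding against `None`.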
"""
Synchronise Google Directory with a local LDAP directory.
Synchronise Google Directory with Lookup using API Gateway.
"""
import logging
from .. import config
from .state import SyncState
from .ldap import LDAPRetriever
from .lookup import LookupRetriever
from .gapi import GAPIRetriever
from .compare import Comparator
from .update import GAPIUpdater
@@ -30,10 +30,10 @@ def sync(configuration, *, read_only=True, timeout=300, group_settings=False, ju
state = SyncState()
# Get users and optionally groups from Lookup
ldap = LDAPRetriever(configuration, state)
ldap.retrieve_users()
lookup = LookupRetriever(configuration, state)
lookup.retrieve_users()
if not just_users:
ldap.retrieve_groups()
lookup.retrieve_groups()
# Get users and optionally groups from Google
gapi = GAPIRetriever(configuration, state)
......
@@ -2,4 +2,4 @@ PyYAML
docopt
google-api-python-client
google-auth
ldap3
ucam-identitylib==1.0.10