
Commit 20f94be6 authored by Dr Rich Wareham

initial implementation

Provide an initial implementation, a stub README and a basic GitLab CI
configuration to run tests.
parent 5d11b63b
Pipeline #32477 passed with stages
in 34 minutes and 20 seconds
[run]
omit =
    .tox/*
    setup.py
    tests/*
.git
venv/
*.pyc
__pycache__/
.pytest_cache/
.coverage
htmlcov/
dist/
build/
*.egg-info/
root=true
[*.py]
max_line_length=99
[flake8]
max-line-length=99
exclude = venv,.tox
venv/
*.pyc
__pycache__/
.pytest_cache/
.coverage
htmlcov/
dist/
build/
*.egg-info/
# This file pulls in the GitLab AutoDevOps configuration via an include
# directive and then overrides bits. The rationale for this is we'd like this
# file to eventually have zero local overrides so that we can use the AutoDevOps
# pipeline as-is.
include:
  # Bring in the AutoDevOps template from GitLab.
  # It can be viewed at:
  # https://gitlab.com/gitlab-org/gitlab-ee/blob/master/lib/gitlab/ci/templates/Auto-DevOps.gitlab-ci.yml
  - template: Auto-DevOps.gitlab-ci.yml
  # Overrides to AutoDevOps for testing
  - project: 'uis/devops/continuous-delivery/ci-templates'
    file: '/auto-devops/tox-tests.yml'
variables:
  DOCUMENTATION_DISABLED: "1"
  DAST_DISABLED: "1"
# This dockerfile is used only to run tests in. It should not be used to package
# the library.
FROM python:3.6-alpine
WORKDIR /usr/src/app
ADD ./ ./
RUN \
    apk --no-cache add gcc g++ musl-dev libstdc++ libffi-dev libxml2-dev libxslt-dev && \
    pip install --no-cache-dir --upgrade -r ./requirements.txt && \
    pip install --no-cache-dir --upgrade tox
MIT License
Copyright (c) 2020 University of Cambridge Information Services
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
# Geddit
Simple zero-configuration retrieval of resources by URL.
This module provides a single `geddit` function which takes a single parameter
specifying the URL to fetch. It will return a `bytes` object with the contents
of the resource at that URL or will raise an exception specific to the URL
scheme.
This library is intended to be used in situations where scheme-specific
configuration can be inferred from the environment. (For example, in Google
Cloud-hosted environments there is usually a default identity which the services
run as. This identity will be used to fetch resources specified via the `gs` or
`sm` schemes.)
This library is *not* intended to replace general use libraries such as
`requests`.
## Examples
```python
from geddit import geddit
# The default scheme is file://
geddit('file:///etc/issue') # == b'Debian GNU/Linux 10 \\n \\l\n\n'
geddit('/etc/issue') # == b'Debian GNU/Linux 10 \\n \\l\n\n'
geddit('./README.md') # Raises: ValueError
# Fetching using HTTP over TLS
geddit('https://www.gov.uk/bank-holidays.json')[:20] # == b'{"england-and-wales"'
# HTTP errors are reported
geddit('https://www.example.com/not-found') # raises requests.exceptions.HTTPError
# Google Storage objects. Uses default application credentials.
geddit('gs://my-bucket/some-object')
# Google Secret Manager secrets. Uses default application credentials.
geddit('sm://my-project/some-secret') # fetches latest version
geddit('sm://my-project/some-secret#3') # fetches version 3
```
## Requirements
* Python >= 3.6.
## Installation
The library can be installed directly from the git repository:
```bash
$ pip3 install git+https://gitlab.developers.cam.ac.uk/uis/devops/lib/geddit.git
```
For developers, the tool can be installed from a cloned repo using pip:
```bash
$ cd /path/to/this/repo
$ pip3 install -e .
```
## License
This software is licensed under the MIT License. See the [LICENSE
file](LICENSE) for the full text of the license.
## Scheme-specific notes
### file
The `file` scheme is the default scheme. Only absolute paths can be specified.
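
The rule above can be illustrated with a small standard-library sketch. The helper names `split_with_file_default` and `is_acceptable_file_path` are hypothetical and are not part of this library; the sketch also assumes a POSIX-style file system.

```python
import os
import urllib.parse


def split_with_file_default(url):
    """Split a URL, treating input without a scheme as a file URL."""
    return urllib.parse.urlsplit(url, scheme='file')


def is_acceptable_file_path(url):
    """Return True if url resolves to the file scheme with an absolute path."""
    components = split_with_file_default(url)
    return components.scheme == 'file' and os.path.isabs(components.path)


# A bare absolute path and an explicit file:// URL are treated alike.
print(is_acceptable_file_path('/etc/issue'))
print(is_acceptable_file_path('file:///etc/issue'))
# A relative path still gets the file scheme but fails the absolute-path check.
print(is_acceptable_file_path('./README.md'))
```
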
### https
There is no support for HTTP basic authentication as that involves putting the
cleartext password into the URL.
Non-TLS ("http") URLs are *not* supported.
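
Callers who want to enforce these constraints before fetching could use a check along the following lines. This is only a sketch: `precheck_https_url` is a hypothetical helper, not something this library provides, and the choice of `ValueError` is an assumption made for illustration.

```python
import urllib.parse


def precheck_https_url(url):
    """Hypothetical caller-side check; not part of geddit itself."""
    components = urllib.parse.urlsplit(url)
    # Only HTTP over TLS is accepted; plain "http" is refused.
    if components.scheme != 'https':
        raise ValueError(f'Expected an https URL, got scheme "{components.scheme}"')
    # Refuse URLs which embed a user name or password.
    if components.username is not None or components.password is not None:
        raise ValueError('URL must not embed credentials')
    return url


precheck_https_url('https://example.com:1234/foo/bar')  # accepted, returned unchanged
```
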
### gs and sm
Only default application credentials are supported. To use specific credentials,
set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the absolute
path to some JSON-formatted credentials.
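
As a rough guide to how an `sm` URL breaks down into project, secret and version, the following standard-library sketch may help. `parse_sm_url` is a hypothetical name used here for illustration; it performs no network access and does not touch any credentials.

```python
import urllib.parse


def parse_sm_url(url):
    """Split an sm:// URL into (project, secret, version). Illustrative only."""
    components = urllib.parse.urlsplit(url)
    project_id = components.netloc
    secret_name = components.path.lstrip('/')
    # An absent fragment means the "latest" version.
    version = components.fragment or 'latest'
    if '/' in secret_name or '/' in version:
        raise ValueError('Secret Manager URL must have form sm://PROJECT_ID/SECRET#VERSION')
    return project_id, secret_name, version


print(parse_sm_url('sm://my-project/some-secret'))
print(parse_sm_url('sm://my-project/some-secret#3'))
```
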
import logging
import os
import urllib.parse

from google.cloud import secretmanager, storage
import requests

LOG = logging.getLogger(__name__)


def geddit(url):
    """
    Fetch the content of a resource at a given URL and return the fetched content as a bytes
    object.

    The following schemes are supported:

    * file: fetch from a file on the local file system. (Default if no scheme is provided.)
    * https: fetch using HTTP over TLS. HTTP basic authentication is not supported since it
      involves having the cleartext password in the URL.
    * gs: fetch from a Google Cloud Storage object. The URL should have the form
      "gs://bucket/path/to/object".
    * sm: fetch from a Google Secret Manager secret. The URL should have the form
      "sm://project/secret[#version]". If no version is provided the "latest" version is used.

    For "gs" and "sm" URLs, application default credentials are used.

    Raises ValueError if the URL has an unknown scheme. Fetch errors of other kinds are raised
    using appropriate exceptions for the backend specific to the scheme.

    """
    components = urllib.parse.urlsplit(url, scheme='file')
    fetch_cb = _SCHEME_MAP.get(components.scheme)
    if fetch_cb is None:
        raise ValueError(f'Unknown URL scheme "{components.scheme}" for URL "{url}"')
    return fetch_cb(components)


def _fetch_file_url(components):
    """
    Fetch the contents of a local file given the split file:// URL components.

    """
    if not os.path.isabs(components.path):
        raise ValueError('file:// URL path must be absolute')
    with open(components.path, 'rb') as fobj:
        return fobj.read()


def _fetch_https_url(components):
    """
    Fetch a HTTP over TLS URL which has been parsed into components.

    """
    response = requests.get(urllib.parse.urlunsplit(components[:5]))
    response.raise_for_status()
    return response.content


def _fetch_secret_manager_url(components):
    """
    Fetch a secret manager URL which has been parsed into components.

    """
    project_id = components.netloc
    secret_name = components.path.lstrip('/')
    version = components.fragment if components.fragment != '' else 'latest'

    # Sanity check that there aren't any path components in the secret name or version.
    if '/' in secret_name or '/' in version:
        raise ValueError('Secret Manager URL must have form sm://PROJECT_ID/SECRET#VERSION')

    client = secretmanager.SecretManagerServiceClient()
    secret_path = client.secret_version_path(project_id, secret_name, version)
    return client.access_secret_version(secret_path).payload.data


def _fetch_storage_url(components):
    """
    Fetch a Cloud storage URL which has been parsed into components.

    """
    bucket = components.netloc
    blob_path = components.path.lstrip('/')

    client = storage.Client()
    bucket = client.get_bucket(bucket)
    blob = bucket.get_blob(blob_path)

    # Despite the function name, this actually downloads the blob as a bytes object.
    return blob.download_as_string()


# A table mapping URL schemes into the corresponding callable for that scheme.
_SCHEME_MAP = {
    'file': _fetch_file_url,
    'gs': _fetch_storage_url,
    'https': _fetch_https_url,
    'sm': _fetch_secret_manager_url,
}
[pytest]
junit_family=xunit2
# Requirements for the library itself.
requests
google-cloud-secret-manager~=0.1
google-cloud-storage~=1.26
import os

from setuptools import setup, find_packages


def load_requirements():
    """
    Load requirements file and return non-empty, non-comment lines with leading and trailing
    whitespace stripped.

    """
    with open(os.path.join(os.path.dirname(__file__), 'requirements.txt')) as f:
        return [
            line.strip() for line in f
            if line.strip() != '' and not line.strip().startswith('#')
        ]


setup(
    name='geddit',
    version='0.1.0',
    packages=find_packages(),
    install_requires=load_requirements(),
)
import contextlib
import os
import tempfile
import unittest
import urllib.parse

from geddit import geddit


class FileTestCase(unittest.TestCase):
    def setUp(self):
        # Create a temporary directory and a file within it with known content
        with contextlib.ExitStack() as stack:
            self.temp_dir = stack.enter_context(tempfile.TemporaryDirectory(prefix='testing-'))
            self.addCleanup(stack.pop_all().close)

        self.file_path = os.path.join(self.temp_dir, 'testing')
        self.content = b'hello'
        with open(self.file_path, 'wb') as fobj:
            fobj.write(self.content)

    def test_default_scheme(self):
        """The default scheme is file."""
        self.assertEqual(geddit(self.file_path), self.content)

    def test_url(self):
        """Using a file:// URL also works."""
        url = urllib.parse.urlunsplit(('file', '', self.file_path, '', ''))
        self.assertTrue(url.startswith('file://'))
        self.assertEqual(geddit(url), self.content)

    def test_exception_propagation(self):
        """The underlying IOError is propagated if the file does not exist."""
        with self.assertRaises(IOError):
            geddit(self.file_path + '-which-does-not-exist')

    def test_relative_path(self):
        """Using a relative path raises ValueError."""
        with self.assertRaises(ValueError):
            geddit('some/relative/path')
import unittest

from geddit import geddit


class GeneralTestCase(unittest.TestCase):
    def test_unknown_scheme(self):
        """An unknown scheme raises ValueError."""
        with self.assertRaises(ValueError):
            geddit('gopher://gopher.example.com/archie-index')
import unittest
import unittest.mock as mock

from geddit import geddit


class GoogleStorageTestCase(unittest.TestCase):
    def setUp(self):
        client_patcher = mock.patch('google.cloud.storage.Client')
        self.mock_client_class = client_patcher.start()
        self.addCleanup(client_patcher.stop)
        self.mock_client = self.mock_client_class.return_value

        self.mock_content = b'some request content'
        self.mock_bucket = self.mock_client.get_bucket.return_value
        self.mock_blob = self.mock_bucket.get_blob.return_value
        self.mock_blob.download_as_string.return_value = self.mock_content

    def test_basic_fetch(self):
        """Can fetch a basic gs://... URL."""
        content = geddit('gs://my-bucket/path/to/object')
        self.mock_client.get_bucket.assert_called_with('my-bucket')
        self.mock_bucket.get_blob.assert_called_with('path/to/object')
        self.assertEqual(content, self.mock_content)
import unittest
import unittest.mock as mock

from geddit import geddit


class HTTPSTestCase(unittest.TestCase):
    def setUp(self):
        get_patcher = mock.patch('requests.get')
        self.mock_get = get_patcher.start()
        self.addCleanup(get_patcher.stop)

        self.mock_content = b'some request content'
        self.mock_get.return_value.content = self.mock_content

    def test_basic_case(self):
        """A HTTPS URL ends up calling requests.get()."""
        url = 'https://example.com/foo/bar'
        content = geddit(url)
        self.mock_get.assert_called_with(url)
        self.assertEqual(content, self.mock_content)

    def test_http_error(self):
        """raise_for_status() is called on the response"""
        url = 'https://example.com/foo/bar'
        geddit(url)
        self.mock_get.return_value.raise_for_status.assert_called()

    def test_port(self):
        """A HTTPS URL with a port is supported."""
        url = 'https://example.com:1234/foo/bar'
        content = geddit(url)
        self.mock_get.assert_called_with(url)
        self.assertEqual(content, self.mock_content)

    def test_authenticated(self):
        """A HTTPS URL with basic auth does not pass an auth param to requests.get()."""
        url = 'https://user:pass@example.com/foo/bar'
        content = geddit(url)
        self.mock_get.assert_called_with(url)
        self.assertEqual(content, self.mock_content)
import unittest
import unittest.mock as mock

from geddit import geddit


class SecretManagerTestCase(unittest.TestCase):
    def setUp(self):
        client_patcher = mock.patch('google.cloud.secretmanager.SecretManagerServiceClient')
        self.mock_client_class = client_patcher.start()
        self.addCleanup(client_patcher.stop)
        self.mock_client = self.mock_client_class.return_value

        self.mock_content = b'some request content'
        self.mock_client.access_secret_version.return_value.payload.data = self.mock_content

    def test_default_version(self):
        """Can fetch the default version."""
        content = geddit('sm://my-project/my-secret')
        self.mock_client.secret_version_path.assert_called_with(
            'my-project', 'my-secret', 'latest')
        self.mock_client.access_secret_version.assert_called_with(
            self.mock_client.secret_version_path.return_value)
        self.assertEqual(content, self.mock_content)

    def test_explicit_version(self):
        """Can fetch an explicit version."""
        content = geddit('sm://my-project/my-secret#123')
        self.mock_client.secret_version_path.assert_called_with(
            'my-project', 'my-secret', '123')
        self.mock_client.access_secret_version.assert_called_with(
            self.mock_client.secret_version_path.return_value)
        self.assertEqual(content, self.mock_content)

    def test_bad_name(self):
        """Secret names cannot contain "/"."""
        with self.assertRaises(ValueError):
            geddit('sm://my-project/secrets/my-secret')

    def test_bad_version(self):
        """Secret versions cannot contain "/"."""
        with self.assertRaises(ValueError):
            geddit('sm://my-project/my-secret#version/1')
# Tox runner configuration
#
# The following optional environment variables can change behaviour. See the
# comments where they are used for more information.
#
# - TOXINI_ARTEFACT_DIR
# - TOXINI_FLAKE8_VERSION
# - TOXINI_SITEPACKAGES
# - TOXINI_WORK_DIR
#
[tox]
# Envs which should be run by default.
envlist=flake8,py3
# Allow overriding toxworkdir via environment variable
toxworkdir={env:TOXINI_WORK_DIR:{toxinidir}/.tox}
# Avoid .egg-info directories
skipsdist=True

# The "_vars" section is ignored by tox but we place some useful shared
# variables in it to avoid needless repetition.
[_vars]
# Where to write build artefacts. We default to the "build" directory in the
# tox.ini file's directory. Override with the TOXINI_ARTEFACT_DIR environment
# variable.
build_root={env:TOXINI_ARTEFACT_DIR:{toxinidir}/build}

[testenv]
# Additional dependencies
deps=
    .
    coverage
    pytest
    pytest-cov
# Which environment variables should be passed into the environment.
passenv=
#   Allow people to override the coverage report location should they so wish.
    COVERAGE_FILE
#   Location of the coverage.xml file
    COVERAGE_XML_FILE
# How to run the test suite. Note that arguments passed to tox are passed on to
# the test command.
commands=
    pytest --doctest-modules --cov={toxinidir} --junitxml={[_vars]build_root}/{envname}/junit.xml
    coverage html --directory {[_vars]build_root}/{envname}/htmlcov/
    coverage xml -o {env:COVERAGE_XML_FILE:{[_vars]build_root}/{envname}/coverage.xml}
# Allow sitepackages setting to be overridden via TOXINI_SITEPACKAGES environment
# variable. The tox container uses this to avoid re-installing the same packages
# over and over again.
sitepackages={env:TOXINI_SITEPACKAGES:False}

[testenv:py3]
basepython=python3

# Check for PEP8 violations
[testenv:flake8]
basepython=python3
deps=
#   We specify a specific version of flake8 to avoid introducing "false"
#   regressions when new checks are introduced. The version of flake8 used may
#   be overridden via the TOXINI_FLAKE8_VERSION environment variable.
    flake8~={env:TOXINI_FLAKE8_VERSION:3.8.0}
commands=
    flake8 --version
    flake8 .