
IaaS setup

This isn't included in the main docs; it is only reachable via a direct URL.

Create a new Ubuntu instance using the IaaS broker service:

https://v-vra01.srv.uis.private.cam.ac.uk/catalog/#

The latest instance (BOT-cags1) is reachable with:

ssh lb584@10.136.11.3

mount storage

Mount our RDS and RFS storage.

Mounting RDS and RFS is a workaround until we can purchase block storage on the IaaS. It works, but it does not preserve per-user permissions and is slow.

sudo apt-get install cifs-utils
sudo apt install smbclient

create dirs to mount:

sudo mkdir /mnt/RDS
sudo mkdir /mnt/RFS

add to /etc/fstab

//hpc-isi-w.hpc.private.cam.ac.uk/rfs-x2jwbL2P9T4 /mnt/RFS cifs credentials=/root/.rfs_credentials,workgroup=BLUE.CAM.AC.UK,file_mode=0775,dir_mode=0775,uid=1006,gid=1008 0 0
lb584@rds.uis.cam.ac.uk:rds-ews_production-rjVtFkyQ1T0/ /mnt/RDS/ fuse.sshfs defaults,noauto,_netdev,allow_other,uid=1006,gid=1008 0 0

notes: the uid and gid values are for the ewsmanager user, which has not been created yet at this point and will have a different id on a fresh install.

The sshfs mount will not be mounted automatically (it has the "noauto" option) and must be mounted manually:

sudo mount /mnt/RDS

This will prompt for a password. Note that the share is being mounted as lb584; a different owner will need to supply different credentials.

The credentials stored for the RFS in /root/.rfs_credentials will also need to be changed to the appropriate user if lb584 is no longer around. It doesn't matter who mounts the shares (provided they have access to the storage, as set by the storage's data manager), because both mounts set the uid to the shared ewsmanager user.
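For reference, the credentials file that the cifs entry points at uses the standard mount.cifs key=value format. A sketch of creating one (the values shown are placeholders, not the real credentials, and a scratch path is used instead of /root/.rfs_credentials):

```shell
# create a mount.cifs-style credentials file (placeholder values)
cred=/tmp/rfs_credentials_example   # the real file lives at /root/.rfs_credentials
cat > "$cred" <<'EOF'
username=ewsmanager
password=replace_me
domain=BLUE.CAM.AC.UK
EOF
# restrict to the owner only, since it contains a password
chmod 600 "$cred"
```

mount.cifs reads username, password and domain from this file via the credentials= option in the fstab line above.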

Run sudo dpkg-reconfigure tzdata to make sure the clock is adjusted automatically for BST/GMT.

Basic user and groups setup

1: install needed packages

The initial login must be as root.

apt-get update
apt-get install postfix
apt-get install emacs

Open emacs and set the theme: M-x customize-themes

2: create users

(adduser takes one username at a time)

sudo adduser lb584
sudo adduser jws52
sudo adduser tm689
sudo adduser ewsmanager

3: add ews group

sudo groupadd ews

4: add users to the ews group

(usermod takes one username at a time)

sudo usermod -aG ews ewsmanager
sudo usermod -aG ews jws52
sudo usermod -aG ews tm689

5: allow members of ews group to su into ewsmanager user:

sudo emacs -nw /etc/pam.d/su

# This allows root to su without passwords (normal operation)
auth       sufficient pam_rootok.so
# The next two lines allow members of the ews group to su to ewsmanager:
# the first skips the following rule for any target user other than ewsmanager,
# the second succeeds if the calling user is in the ews group
auth       [success=ignore default=1] pam_succeed_if.so user = ewsmanager
auth       sufficient pam_succeed_if.so use_uid user ingroup ews

6: grant sudo privileges to admin users

usermod -aG sudo lb584

7: Install docker

download .deb from https://docs.docker.com/desktop/install/ubuntu/

or

wget -O docker-desktop-4.14.1-amd64.deb "https://desktop.docker.com/linux/main/amd64/docker-desktop-4.14.1-amd64.deb?utm_source=docker&utm_medium=webreferral&utm_campaign=docs-driven-download-linux-amd64"

(the URL must be quoted, as it contains & characters)

install docker engine: (instructions at https://docs.docker.com/engine/install/ubuntu/#set-up-the-repository)

sudo apt-get install ca-certificates curl gnupg lsb-release
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg

echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt-get update

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-compose-plugin

add docker users:

sudo groupadd docker
sudo usermod -aG docker $USER

(reconnect / reopen the terminal for the group change to take effect)

check docker works:

docker run hello-world

(sudo won't be needed if the user has been added to the docker group; otherwise prefix the command with sudo)

Add sftp access for metoffice

groupadd sftponly
${EDITOR:-nano} /etc/ssh/sshd_config

Match Group sftponly
    PermitTunnel no
    AllowAgentForwarding no
    AllowTcpForwarding no
    X11Forwarding no
    ChrootDirectory /storage/sftp/%u
    ForceCommand internal-sftp

service ssh restart
mkdir -p /storage/sftp/
chown -R root:root /storage/sftp/
chmod -R 755 /storage/sftp/

### REPEAT THIS PROCESS FOR THE PLANTVILLAGE USER ###

# create the user and an associated group with the same name
adduser --no-create-home --shell /usr/sbin/nologin metofficeupload
# add to the existing sftponly group so that (only) the sftp subsystem can be used under the chroot
adduser metofficeupload sftponly
mkdir -p /storage/sftp/metofficeupload/upload

chown root:root /storage/sftp/metofficeupload
chown metofficeupload:metofficeupload /storage/sftp/metofficeupload/upload
chmod -R 755 /storage/sftp/metofficeupload
mkdir -p /storage/sftp/metofficeupload/upload/Ethiopia/fromMO/daily_name
mkdir -p /storage/sftp/metofficeupload/upload/Ethiopia/toMO/
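sshd is strict about ChrootDirectory: every component of the path must be owned by root and not writable by group or others, which is why the directories above are chowned to root:root with mode 755. A small helper to sanity-check a path before restarting sshd (the helper is our own convenience, not part of the original setup; assumes GNU stat):

```shell
# succeed only if the directory is root-owned and not group/other-writable,
# mirroring sshd's ChrootDirectory requirements
chroot_ok() {
    perms=$(stat -c %a "$1") || return 1
    owner=$(stat -c %u "$1") || return 1
    case "$perms" in
        *[2367]?|*[2367]) return 1 ;;  # group- or other-writable
    esac
    [ "$owner" -eq 0 ]
}

chroot_ok /tmp || echo "/tmp would be rejected (1777, world-writable)"
```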

sftp metofficeupload@10.136.11.3

Add the ewsmanager user to the metofficeupload group so it can manage files written by this user:

sudo usermod -aG metofficeupload ewsmanager

### END REPEAT THIS PROCESS FOR THE PLANTVILLAGE USER ###

Deploy and test the EWS code

1: add the ssh key of the server to gitlab for access.

Generate an ssh key (if you don't have one already):

ssh-keygen -t rsa -b 4096 -C <key_id>

then add the public key to your account at https://gitlab.developers.cam.ac.uk/-/profile/keys
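A sketch of generating a dedicated key for the server (the path and comment below are illustrative, not from the original setup); the contents of the .pub file are what you paste into the gitlab keys page:

```shell
# generate a passphrase-less rsa key pair (example path and comment)
keyfile=/tmp/ews_deploy_key
rm -f "$keyfile" "$keyfile.pub"
ssh-keygen -t rsa -b 4096 -C "ews-server-deploy" -N "" -f "$keyfile"
# the public half is what gets registered in gitlab
cat "$keyfile.pub"
```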

2: make dirs for the EWS app

mkdir /storage/app/EWS_prod
sudo chown -R ewsmanager:ews EWS_prod/
sudo chmod -R g+s EWS_prod/
sudo chmod -R g+w EWS_prod/
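The g+s above sets the setgid bit, so files created under EWS_prod inherit the ews group and stay manageable by every group member. A quick illustration on a throwaway directory (assumes GNU coreutils):

```shell
# demonstrate the setgid bit on a scratch directory
d=$(mktemp -d)
chmod 2775 "$d"      # leading 2 = setgid, i.e. what g+s adds to a 775 dir
stat -c %a "$d"      # prints: 2775
# files created inside now inherit the directory's group rather than the creator's
touch "$d/example"
stat -c %G "$d" "$d/example"
rm -rf "$d"
```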

3: follow the deployment instructions.

Install and setup the apache server (file downloads and ews_browser)

setup dirs for apache:

sudo mkdir /storage/webdata/Ethiopia
sudo mkdir /storage/webdata/SouthAsia
sudo chown ewsmanager:ews SouthAsia/ Ethiopia/
sudo chmod g+ws Ethiopia/
sudo chmod g+ws SouthAsia/
sudo mkdir /storage/app/ews_browser
sudo chown ewsmanager:ews ews_browser
sudo chmod g+ws ews_browser
ln -s /storage/webdata/Ethiopia /var/www/html/Ethiopia
ln -s /storage/webdata/SouthAsia /var/www/html/SouthAsia

install apache and other libs:

apt-get -qq install --assume-yes apache2 apache2-dev apache2-utils ssl-cert libapache2-mod-wsgi openssh-server

(If you get prompted about a newer version of the sshd_config file being available, keep the current one, as we modified it already.)

create a venv for the ews_browser, activate it and install the requirements:

mkdir -p /storage/app/ews_browser/env
python3 -m venv /storage/app/ews_browser/env/browser_env
source /storage/app/ews_browser/env/browser_env/bin/activate
pip install flask mod-wsgi

(faff) modify the wsgi.load script in /etc/apache2/mods-available to point to the same python as used by the ews_browser

Enable the python env for the project:

You are probably still in the active environment from the previous step; you need to be in the new browser_env as a user with sudo privileges. If necessary, deactivate the existing env and reactivate as a sudo user:

source /storage/app/ews_browser/env/browser_env/bin/activate
which mod_wsgi-express

gives you:

/storage/app/ews_browser/env/browser_env/bin/mod_wsgi-express

run the install-module command using the exe from the previous step:

sudo /storage/app/ews_browser/env/browser_env/bin/mod_wsgi-express install-module

gives you:

LoadModule wsgi_module "/usr/lib/apache2/modules/mod_wsgi-py38.cpython-38-x86_64-linux-gnu.so"
WSGIPythonHome "/storage/app/ews_browser/env/browser_env"

put the above lines as the text in:

/etc/apache2/mods-available/wsgi.load (replacing what is currently there)

This means that apache wsgi will use the version of python that is used by the app to which it is binding.

Note

There are config files in the ews-browser project (in gitlab) that point to this python environment. Make sure ews_browser_client_africa.conf and ews_browser_client_asia.conf have the correct python path

prepare certificates and passwords for apache:

Copy the default-ssl.conf file into /etc/apache2/sites-available. See the notes below on installing genuine certificates (once the server is up and running).

<IfModule mod_ssl.c>
    <VirtualHost _default_:443>
        ServerAdmin webmaster@localhost
        DocumentRoot /var/www/html

        ErrorLog ${APACHE_LOG_DIR}/error.log
        CustomLog ${APACHE_LOG_DIR}/access.log combined

        SSLEngine on
        SSLCertificateFile /etc/apache2/ssl/server.crt
        SSLCertificateKeyFile /etc/apache2/ssl/server.key

        <FilesMatch "\.(cgi|shtml|phtml|php)$">
            SSLOptions +StdEnvVars
        </FilesMatch>
        <Directory /usr/lib/cgi-bin>
            SSLOptions +StdEnvVars
        </Directory>

        <Directory "/var/www/html/Ethiopia">
            AuthType Basic
            AuthName "Restricted Content"
            AuthUserFile /etc/apache2/.htpasswd
            Require user ethiopia
            Options +Indexes +FollowSymLinks +Includes +MultiViews
        </Directory>
        <Directory "/var/www/html/SouthAsia">
            AuthType Basic
            AuthName "Restricted Content"
            AuthUserFile /etc/apache2/.htpasswd
            Require user southasia
            Options +Indexes +FollowSymLinks +Includes +MultiViews
        </Directory>

        TypesConfig /etc/mime.types
    </VirtualHost>
</IfModule>

enable url rewriting in apache:

This allows a maintenance page to be displayed if services are down.

a2enmod rewrite;

edit the apache conf at /etc/apache2/apache2.conf

<Directory /var/www/>
    Options Indexes FollowSymLinks
    AllowOverride All
    Require all granted
</Directory>

add a maintenance.html file to /var/www/html e.g.

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <STYLE type="text/css">
        body {font-family:sans-serif; color: black; background: white;}
        div {text-align: center}
    </STYLE>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
    <title>Willow server homepage</title>
</head>
<body>
    <center>
        <br><br>
        <div><i>This service is now running at <a href="https://epi.plantsci.cam.ac.uk/">https://epi.plantsci.cam.ac.uk</a> - please edit your url and bookmarks accordingly.</i></div>
    </center>
</body>
</html>

add an .htaccess file in /var/www/html/.htaccess

ErrorDocument 503 /maintenance.html
RewriteEngine On
RewriteCond %{REMOTE_ADDR} !^000.000.000.000
RewriteCond %{REQUEST_URI} !/maintenance.html$ [NC]
RewriteRule .* - [L,R=503]

The RewriteEngine is set to “Off” when not redirecting
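Flipping that RewriteEngine line by hand each time is error-prone; a sed one-liner can toggle it (our own convenience, shown here against a scratch copy rather than the live /var/www/html/.htaccess):

```shell
# toggle maintenance mode by rewriting the RewriteEngine directive in place
htaccess=/tmp/htaccess_example   # substitute /var/www/html/.htaccess for real use
printf '%s\n' 'ErrorDocument 503 /maintenance.html' 'RewriteEngine On' > "$htaccess"
sed -i 's/^RewriteEngine On$/RewriteEngine Off/' "$htaccess"
grep '^RewriteEngine' "$htaccess"   # prints: RewriteEngine Off
```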

modify the ews browser wsgi conf files to point to the html directory, rather than the code:

/etc/apache2/sites-available/ews_browser_client_africa.conf
/etc/apache2/sites-available/ews_browser_client_asia.conf

comment in the directory mapping in place of the WSGI mapping. e.g.

#IAAS
WSGIDaemonProcess ews_browser_africa user=ewsmanager group=ewsmanager threads=5 python-home=/storage/app/EWS_prod/envs/browser_env
WSGIScriptAlias /ews_browser_africa /storage/app/ews_browser/code/src/main/python/ews_browser_africa.wsgi

<Location "/ews_browser_africa">
    AuthType Basic
    AuthName "Restricted Content"
    AuthUserFile /etc/apache2/.htpasswd
    Require user ethiopia admin
</Location>

ErrorLog /storage/app/ews_browser/outputs/ews_browser_client_error.log

# IF THE SITE IS DOWN FOR MAINTENANCE, COMMENT THIS BLOCK IN AND THE ONE ABOVE OUT - IT WILL USE THE REDIRECT IN .HTACCESS TO DISPLAY THE MAINTENANCE MESSAGE
#<Directory /var/www/>
#    Options Indexes FollowSymLinks
#    AllowOverride All
#    Require all granted
#</Directory>

restart the server to implement the changes:

sudo service apache2 restart

This will make all traffic to the /var/www/html dir redirect to the maintenance page.

enable ssl in apache:

ln -s /etc/apache2/sites-available/default-ssl.conf /etc/apache2/sites-enabled/000-default-ssl.conf

a2enmod ssl (possibly already enabled)

add passwords for Ethiopia and SouthAsia users:

touch /etc/apache2/.htpasswd

(get the password from someone or make a new one)

htpasswd /etc/apache2/.htpasswd southasia

htpasswd /etc/apache2/.htpasswd ethiopia

htpasswd /etc/apache2/.htpasswd admin

the admin account can be used by developers, to save them from having to re-enter the domain-specific passwords as they swap browsers
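If htpasswd is not available for some reason, openssl can mint a hash in the same apr1 (Apache MD5) format that htpasswd uses by default; the password and salt below are made up for illustration:

```shell
# generate an Apache-compatible apr1 password hash (example password and salt)
hash=$(openssl passwd -apr1 -salt examples secret_password)
# a line of the form user:hash can be appended to /etc/apache2/.htpasswd
echo "ethiopia:$hash"
```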

create server certificates - once you have a domain name registered and pointing at this instance:

sudo snap install --classic certbot (one off, to install software)

sudo ln -s /snap/bin/certbot /usr/bin/certbot

then generate the certificates using:

sudo certbot certonly --apache

(enter the domain name you want to register, which needs to be wired through to this instance)

note that this only retrieves the certificates; it does not install them

This created cert files in:

/etc/letsencrypt/live/epi.plantsci.cam.ac.uk (where epi.plantsci.cam.ac.uk is the registered domain in this example)

I then edited the certificate path lines in this file:

/etc/apache2/sites-enabled/ssl.conf

to:

SSLCertificateFile /etc/letsencrypt/live/epi.plantsci.cam.ac.uk/fullchain.pem

SSLCertificateKeyFile /etc/letsencrypt/live/epi.plantsci.cam.ac.uk/privkey.pem

These certificates should be auto-renewed as certbot sets up a timer task to check if renewal is needed. See the timer by running:

systemctl list-timers

make the output dirs for the pipeline:

mkdir -p /storage/app/ews_browser/outputs/temp_unzip_dir
chmod -R 775 /storage/app/ews_browser/outputs/
chmod g+sw /storage/app/ews_browser/outputs/

deploy the ews_browser code

copy the ews_browser code into /storage/app/ews_browser

create a symlink to the wsgi conf files

ln -s /storage/app/ews_browser/code/src/main/python/ews_browser_client_asia.conf ews_browser_client_asia.conf
ln -s /storage/app/ews_browser/code/src/main/python/ews_browser_client_africa.conf ews_browser_client_africa.conf

activate the sites:

a2ensite ews_browser_client_africa.conf
a2ensite ews_browser_client_asia.conf

start server:

service apache2 start

setup gitlab runner for CI

curl -L "https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh" | sudo bash

note: on the IaaS I had to mkdir /var/lib/gitlab-runner, as the logs were complaining and the service wasn't staying up. I'm hoping this was because I upgraded the runner version and something got out of whack.

sudo gitlab-runner register

Then go into /etc/gitlab-runner as root and edit the newly registered runner in config.toml:

concurrent = 1
check_interval = 0

[[runners]]
  name = "bot-cags1-production"
  url = "https://gitlab.developers.cam.ac.uk/"
  token = "sjgyL15rgdC1veHs6yPT"
  executor = "docker"
  [runners.docker]
    tls_verify = false
    image = "python"
    privileged = false
    disable_cache = false
    volumes = ["/storage/webdata:/storage/moved","/storage/sftp:/storage/sftp","/storage/app/EWS_prod/envs/credentials:/storage/app/EWS_prod/envs/credentials","/storage/app/EWS_prod/regions:/storage/app/EWS_prod/regions","/storage/app/EWS_prod/code:/storage/app/EWS_prod/code","/cache"]
    shm_size = 0
  [runners.cache]

Airflow installation

Install mysql:

https://www.digitalocean.com/community/tutorials/how-to-install-mysql-on-ubuntu-20-04

sudo apt install mysql-server

sudo systemctl start mysql.service

hack to allow root access via sudo on ubuntu (in the digital ocean docs)

sudo mysql
ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY 'password_here';
exit

sudo mysql_secure_installation (no strong opinion on which options are selected, except disabling root access from remote hosts: yes)

finish the hack for ubuntu root access:

mysql -u root -p
ALTER USER 'root'@'localhost' IDENTIFIED WITH auth_socket;
exit

(end of ubuntu hack)

Install Airflow

https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html

mkdir <path>/airflow
cd airflow

set the airflow home location:

export AIRFLOW_HOME=<path>/airflow

python3 -m venv airflow-env; source airflow-env/bin/activate;

pip install wheel
pip install apache-airflow[celery]==2.7.1 --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.7.1/constraints-3.8.txt"

note that there are two dashes before constraint above, not a single long dash

pip check

install the docker operator

pip install apache-airflow-providers-docker

install mysql (need root privileges)

https://www.digitalocean.com/community/tutorials/how-to-install-mysql-on-ubuntu-20-04

sudo apt update
sudo apt install mysql-server
sudo systemctl start mysql.service

follow the additional instructions to work around the mysql_secure_installation issue

sudo mysql
ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY '<put_root_password_here>';
exit

The password above has been set to the same as the ewsmanager password. You can now run the security manager script:

sudo mysql_secure_installation;

<choose sensible security options at your discretion>

Setup mysql database:

mysql -u root -p

CREATE DATABASE airflow_db CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

CREATE USER 'airflow_user'@'localhost' IDENTIFIED BY 'generate_a_non_secret_password_for_here';

GRANT ALL PRIVILEGES ON airflow_db.* TO 'airflow_user';

exit;

set database defaults (initially sqlite)

airflow db migrate

edit the resulting ${AIRFLOW_HOME}/airflow.cfg file so that the sql_alchemy_conn line points at the mysql database:

sql_alchemy_conn = mysql+mysqldb://airflow_user:<password_from_above>@localhost:3306/airflow_db
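If you prefer to script the edit, a sed substitution over the connection line works; sketched here against a scratch copy (the real file is ${AIRFLOW_HOME}/airflow.cfg):

```shell
# swap the default sqlite connection string for the mysql one (scratch copy for illustration)
cfg=/tmp/airflow_cfg_example
echo 'sql_alchemy_conn = sqlite:////storage/airflow/airflow.db' > "$cfg"
sed -i 's|^sql_alchemy_conn = .*|sql_alchemy_conn = mysql+mysqldb://airflow_user:<password_from_above>@localhost:3306/airflow_db|' "$cfg"
grep '^sql_alchemy_conn' "$cfg"
```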

While you are editing the config file, set the timezone to your local timezone rather than UTC. Because this system works to a regular local schedule, daylight saving needs to be taken into account so jobs run at the same local time each day.

default_timezone = Europe/London

(note that the UI has a dropdown in the top right to set the displayed timezone, but the scheduler will run on the server timezone)
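You can confirm that Europe/London tracks BST/GMT (and hence that scheduled jobs will keep the same wall-clock time across the clock change) with date, assuming GNU date and an installed tzdata:

```shell
# Europe/London observes daylight saving: BST in summer, GMT in winter
TZ=Europe/London date -d '2024-07-01 12:00' +%Z   # prints: BST
TZ=Europe/London date -d '2024-01-15 12:00' +%Z   # prints: GMT
```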

while you are in the config file, set up the smtp to use our email address:

smtp_host = smtp.gmail.com
smtp_starttls = True
smtp_ssl = False
smtp_user = clusternotifications@gmail.com
smtp_password = <password_here>
smtp_port = 587
smtp_mail_from = clusternotifications@gmail.com

install the mysql provider

sudo apt install gcc
sudo apt-get install python3-dev
pip install apache-airflow-providers-mysql

the above may fail if you don't specify the mysql headers first, as in this post:

https://stackoverflow.com/questions/76875507/can-not-install-apache-airflow-providers-mysql-pkg-config-error

sudo apt install libmysqlclient-dev
export MYSQLCLIENT_CFLAGS="$(mysql_config --cflags)"
export MYSQLCLIENT_LDFLAGS="$(mysql_config --libs)"

migrate the db again to use mysql this time

airflow db migrate

possible sql mode error:

https://stackoverflow.com/questions/36882149/error-1067-42000-invalid-default-value-for-created-at

create an admin user for the webserver

airflow users create --role Admin --username admin --email lb584@cam.ac.uk --firstname admin --lastname admin --password admin

(you can set the password to something more secure when logged in)

run the scheduler and the webserver

airflow scheduler
airflow webserver

Both can be run as daemons with the -D flag.

connect to the webserver at http://<server_ip>:8090

set up airflow as a service

Once you have got airflow running from the command line, you will need to set it up as a service (which will run at startup and in the background)

1: create a service file for airflow scheduler and webserver:

sudo touch /etc/systemd/system/airflow-scheduler.service

sudo touch /etc/systemd/system/airflow-webserver.service

edit the two files so that airflow-webserver.service contains:

[Unit]
Description=Airflow webserver daemon
After=network.target mysql.service
Wants=mysql.service

[Service]
EnvironmentFile=/storage/airflow/airflow.cfg
User=ewsmanager
Group=ewsmanager
Type=simple
ExecStart=/usr/bin/bash -c 'export AIRFLOW_HOME=/storage/airflow ; source /storage/airflow/airflow-env/bin/activate ; airflow webserver'
Restart=on-failure
RestartSec=5s
StartLimitIntervalSec=10s
StartLimitBurst=3
PrivateTmp=true

[Install]
WantedBy=multi-user.target

and airflow-scheduler.service contains the same, with the Description and ExecStart changed to run the scheduler:

[Unit]
Description=Airflow scheduler daemon
After=network.target mysql.service
Wants=mysql.service

[Service]
EnvironmentFile=/storage/airflow/airflow.cfg
User=ewsmanager
Group=ewsmanager
Type=simple
ExecStart=/usr/bin/bash -c 'export AIRFLOW_HOME=/storage/airflow ; source /storage/airflow/airflow-env/bin/activate ; airflow scheduler'
Restart=on-failure
RestartSec=5s
StartLimitIntervalSec=10s
StartLimitBurst=3
PrivateTmp=true

[Install]
WantedBy=multi-user.target

2: reload the service daemon, then enable and start the services

sudo systemctl daemon-reload

sudo systemctl enable airflow-scheduler.service
sudo systemctl enable airflow-webserver.service

sudo systemctl start airflow-scheduler.service
sudo systemctl start airflow-webserver.service

To restart or stop the services later:

sudo systemctl restart airflow-scheduler.service
sudo systemctl restart airflow-webserver.service

sudo systemctl stop airflow-scheduler.service
sudo systemctl stop airflow-webserver.service

3: check the status of the services

sudo systemctl status airflow-scheduler.service

sudo systemctl status airflow-webserver.service

4: check the logs

sudo journalctl -r -u airflow-scheduler.service

sudo journalctl -r -u airflow-webserver.service

General apache logs:

/var/log/apache2/error.log