# zback

zback is a ZFS backup and replication utility. It draws heavy inspiration from
ZnapZend, zfs-auto-snapshot and zrep.

zback is designed to be run as a cron job. Each run is called with a tag, which
dictates snapshot names and whether sending is attempted, along with how many
local snapshots are kept. This is to allow for flexibility in sending schedules.
For example, the defaults assume that hourly, daily, weekly and monthly
snapshots will be taken, and that some of these will be sent. However, it is
trivial to define your own tags, an example being a frequent tag run every 10
minutes - the details are up to you.

An example cron job is included in the repository to give an idea of how to run
the script.
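
As a very rough sketch (the `--tag` option and the install path here are guesses for illustration, not taken from the actual help output - the cron file shipped in this repository is the authoritative reference), a root crontab driving the default tags might look something like:

> # purely illustrative schedule - check the bundled cron example for the real invocation
> 0 * * * *  /usr/local/sbin/zback process --tag hourly
> 30 3 * * * /usr/local/sbin/zback process --tag daily
> 45 3 * * 0 /usr/local/sbin/zback process --tag weekly
> 0 4 1 * *  /usr/local/sbin/zback process --tag monthly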

Also included are settings for the Zabbix monitoring platform; how to set these
up is not covered here.

This software is in an early testing phase and has only been run on CentOS 7;
I accept literally no responsibility for it right now.

The syntax for commands has been modelled on zfs itself, with each subcommand
having its own help:

> usage: zback.py [-h] [-v] {process,configure,clear,zabbix} ...
> 
> optional arguments:
>   -h, --help            show this help message and exit
>   -v, --verbose         Logs to console and increases logging output
> 
> subcommands:
>   {process,configure,clear,zabbix}
>     process             Performs a run on datasets, spawns a new thread for
>                         each one
>     configure           Adds or edits zback related properties for a specified
>                         dataset
>     clear               Clears zback related settings and snapshots from a
>                         dataset
>     zabbix              Returns JSON formatted string of zback managed
>                         datasets for Zabbix

## Getting started

zback has been designed with minimalism in mind, but it does have one
dependency: paramiko. To install it, make sure pip3 is present and run:

> pip3 install paramiko
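
If pip3 itself is missing on CentOS 7 (the only platform this has been tested on), it can usually be pulled in from the distribution repositories; the exact package name may vary with the point release:

> yum install python3-pip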

To flag a dataset as being managed by zback:

> zfs set zback:manage=yes $ZPOOL/$DATASET

If no other options are specified, zback will take a snapshot with the tag
supplied, and prune snapshots but not attempt a send.
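
To confirm what has been set on a dataset, plain ZFS tooling is enough - for example, to show the manage flag:

> zfs get zback:manage $ZPOOL/$DATASET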

To send a snapshot to another location, another ZFS option has to be set, but
before that a few things need to be considered. If the destination is a remote
server, then a passwordless SSH key should be installed and the host name you
want to connect to should be resolvable; there is no checking done to ensure
this is in place. Additionally, zback assumes that the user you are running as
has access to the server on the other end, and will crap out dramatically if not.
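
None of this is handled by zback itself, so as a reminder, the plain OpenSSH steps look roughly like this ($REMOTE_USER and remote-server1 are placeholders):

> ssh-keygen -t ed25519 -N "" -f ~/.ssh/id_ed25519
> ssh-copy-id $REMOTE_USER@remote-server1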

An example send job would be:

> zfs set zback:send=hourly:ssh:remote-server1:REMOTEPOOL/REMOTEDATASET $ZPOOL/$DATASET

This will attempt a send whenever the hourly run is called, to the server
remote-server1, sending the snapshot over an SSH channel to the pool and
dataset named in the property. The general form of the value is
`run_tag:protocol:remote-hostname:zpool/dataset`.

Multiple send jobs are permitted, so long as they are comma-separated.

Supported remote protocols are tcp and ssh - in the case of ssh, the send is
channeled over a secure SSH connection. For hosts on a private network, I have
included support for a simple TCP connection to the remote server, with all the
security and performance implications that raises.
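
For example, assuming each comma-separated entry follows the same `run_tag:protocol:remote-hostname:zpool/dataset` form described above (host and pool names here are placeholders), a dataset could be sent hourly over SSH and daily over plain TCP with:

> zfs set zback:send=hourly:ssh:remote-server1:REMOTEPOOL/REMOTEDATASET,daily:tcp:remote-server2:REMOTEPOOL/REMOTEDATASET $ZPOOL/$DATASET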

If the destination is a local dataset, then the format is:

> zfs set zback:send=local:DESTPOOL/DESTSET:hourly $ZPOOL/$DATASET

These options can be more easily set with:

> zback configure $DATASET

Once you have set the options on a dataset correctly, you can then begin with an
initial seed transfer. To do this, call the `zback process` subcommand:

> zback process --init $DATASET

This will attempt to create the remote dataset if none already exists - if there
is already a dataset but it has no snapshots associated with it then this is
also fine (for example, if you need to create the remote dataset with specific
options set on it). However, if the remote dataset exists and has snapshots
associated with it then zback will refuse to initialise.
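
If an initialisation fails and you want more detail, the global `-v` flag documented in the usage output above can be combined with any subcommand to increase logging output:

> zback -v process --init $DATASET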

To call a normal run on a dataset, simply omit the `--init`:

> zback process $DATASET

Called with no datasets, zback will process all managed datasets, each one in a
separate thread.

By default, if no tags are specified on a dataset then zback will assume the
following `tag:count` pairs are going to be used, where the count is the number
of local snapshots kept for that tag:

> hourly:48
> daily:7
> weekly:6
> monthly:6

There are two ways that you can change this behaviour. One is to edit `zback.py`
and change the variable `DEFTAGS`, which will permanently alter the defaults
across all datasets.

The other is to use zfs to set the `zback:tags` property on the dataset, for example:

> zfs set zback:tags=frequent:5,hourly:24 $ZPOOL/$DATASET

This will override the defaults for this dataset only. The same option can
be set more easily by running the `zback configure` subcommand.

One thing to note: if you are running zback on both sides of a sync, then on the
receiving end you will need to specify a recv flag to prevent zback from taking
unnecessary snapshots there. This can be done via:

> zfs set zback:recv=hourly $ZPOOL/$DATASET

If you have initialised the dataset using zback, then this should not be
necessary, as zback will have set this flag, as well as the manage flag, on the
remote dataset during the initialisation.
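
If you want to double check what ended up on the receiving side, plain `zfs get` on the remote host will show both properties:

> zfs get zback:recv,zback:manage REMOTEPOOL/REMOTEDATASET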

If you want to remove zback's custom options, hold tags, snapshots, or all of
the above from a dataset, then please use the `zback clear` subcommand.
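
Assuming it follows the same dataset-argument pattern as the other subcommands (check `zback clear -h` for the exact options), a typical invocation would look like:

> zback clear $DATASET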