Saturday, September 3, 2016

How to Set Up an Ultimate Backup System, Using Duplicity, Systemd and Thunar on Arch Linux Part Three



How to Set Up Automated Duplicity Backups for the User:

Now that we have set up the automated backups for the system, it is time to do the same for the user. The process is similar, with some minor modifications. First things first, we need an operation folder where we will place all of the duplicity-related files. Create the following folder:
mkdir /home/my_user/.config/duplicity_backup
In order to simplify our life when we are implementing the front end, we will not place all of the duplicity parameters into one file. Create the following file:
vim /home/my_user/.config/duplicity_backup/exclude_list.txt 
This file will contain the list of files and folders that should be excluded from backup. It should be filled in similar to this:

/home/my_user/Music
/home/my_user/.cache
/home/my_user/Downloads
/home/my_user/.gnupg
/home/my_user/.ssh/id_rsa
/home/my_user/.local
/home/my_user/Pictures
/home/my_user/Templates
/home/my_user/tmp
/home/my_user/Videos
/home/my_user/.config/google-chrome/

Note that Music, Videos and Pictures are excluded in this case to conserve space on the remote server. Feel free to include those folders by removing them from the list. Now we need to specify the remote repository where the files will be stored:
cd /home/my_user/.config/duplicity_backup/ 
echo "scp://my_user@server///home/my_user/repos/me" > repo_location
Finally, we need to create the backup script:
vim back_home_up.sh
with the following content:

Make this script executable and run it:
chmod +x back_home_up.sh
./back_home_up.sh
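Because the exclusion list and the repository location live in separate files, any tool can rebuild a duplicity call from them. The following Python 3 sketch illustrates that idea only; it is not the content of back_home_up.sh. The helper name build_backup_command is hypothetical, and the option set (--no-encryption, --full-if-older-than 15D, --exclude-filelist) is an assumption carried over from the system backup script in Part One.

```python
from pathlib import Path


def build_backup_command(op_folder, source_dir):
    """Assemble a duplicity argument list from the files in op_folder.

    Hypothetical helper: the chosen options are an assumption and do not
    reproduce the author's back_home_up.sh.
    """
    op_folder = Path(op_folder)
    # repo_location holds one line, e.g. scp://my_user@server///home/...
    repo = (op_folder / "repo_location").read_text().strip()
    return [
        "duplicity",
        "--no-encryption",
        "--full-if-older-than", "15D",
        # Let duplicity read the excluded paths from the list file
        "--exclude-filelist", str(op_folder / "exclude_list.txt"),
        str(source_dir),
        repo,
    ]
```

The function only returns the argument list, so it can be printed for inspection before it is handed to something like subprocess.run.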
Now you should have your first full backup of the home folder in the remote repository. It is time to set up systemd user units to automate it. Navigate to the ~/.config/systemd/user folder; if it does not exist, create it. Then create the files duplicity-remote-backup.service and duplicity-remote-backup.timer with the following content:

duplicity-remote-backup.service:

[Unit]
Description=Duplicity remote home backup
[Service]
Type=oneshot
ExecStart=/home/my_user/.config/duplicity_backup/back_home_up.sh

duplicity-remote-backup.timer:

[Unit]
Description=Duplicity remote home backup timer
[Timer]
OnBootSec=5m
OnUnitActiveSec=1h
[Install]
WantedBy=timers.target

Now enable the timer by executing:
systemctl --user enable duplicity-remote-backup.timer
This concludes the back-end of the duplicity backup system for the user. Next we will discuss the front-end.


How to Implement a Front-end to Duplicity for Thunar, Using thunarx-python:

Thunarx-python provides Python bindings for the Thunar Extension Framework. We will use it to implement a Thunar plugin that provides a submenu with restore options. To install it on Arch Linux, either download it from the AUR and install it manually or run the following command:
yaourt thunarx-python
Once the installation is complete, create the following file:
sudo vim /usr/share/thunarx-python/extensions/thunarx-submenu-plugin.py 
with this content: 

import thunarx
import gtk
import sys
import urllib
from os import path

sys.path.append("/usr/local/lib/python2.7")
from dupController import DupController

home_folder = path.expanduser("~") + "/"
dup_folder = home_folder + ".config/duplicity_backup"
__dpc = DupController(dup_folder)


def __extractAddress(obj):
    return urllib.unquote(obj.get_uri()[7:])

"""
Thunarx Submenu Plugin
This plugin shows an example of a MenuProvider plugin that implements
sub-menus. The example used here requires the developer to sub-class
gtk.Action and override the create_menu_item virtual method.
"""


class MyAction(gtk.Action):
    __gtype_name__ = "MyAction"

    def __init__(self, name, label, tooltip=None, stock_id=None, menu_handler=None):
        gtk.Action.__init__(self, name, label, tooltip, stock_id)
        self.menu_handler = menu_handler
        self.obj_type = None
        self.obj_addresses = None

    def create_menu_item(self):
        menuitem = gtk.MenuItem(self.get_label())
        if self.menu_handler is not None:
            menu = gtk.Menu()
            menuitem.set_submenu(menu)
            self.menu_handler(menu, self.obj_type, self.obj_addresses)
        return menuitem

    do_create_menu_item = create_menu_item


# Defined By me!
def ManageDuplicityController(self, objects, option):
    for obj in objects:
        __dpc.restore_file(obj, option)


def PyFileActionMenu(menu, obj_type, obj_addresses):
    create_menu = False
    backed_up_objects = []
    try:
        for obj in obj_addresses:
            obj_name = __extractAddress(obj)
            if __dpc.is_backed_up(obj_name):
                create_menu = True
                backed_up_objects.append(obj_name)
    except TypeError:
        pass
    if create_menu:
        options = __dpc.get_options()
        for option in options:
            action = gtk.Action("TMP:"+option[0], option[0], None, None)
            subitem = action.create_menu_item()
            menu.append(subitem)
            subitem.show()
            action.connect("activate", ManageDuplicityController, backed_up_objects, option[1])


class ThunarxSubMenuProviderPlugin(thunarx.MenuProvider):
    def __init__(self):
        pass

    def get_file_actions(self, window, files):
        res = MyAction("TMP:Restore", "Restore", "Backup Management",
                       gtk.STOCK_FILE, menu_handler=PyFileActionMenu)
        res.obj_type = "File"
        res.obj_addresses = files
        return [res]

    def get_folder_actions(self, window, folder):
        res = MyAction("TMP:Restore", "Restore",
                       "Backup Management", gtk.STOCK_DIRECTORY, menu_handler=PyFileActionMenu)
        res.obj_type = "Folder"
        res.obj_addresses = folder
        return [res]
Don't forget to check the dup_folder assignment near the top of the file and set it to the correct location. This adds the submenu option "Restore" to Thunar when you right-click on a file. For this to work you need to add one more file to /usr/local/lib/python2.7. Execute the following commands in the command line:
cd /usr/local/lib/python2.7 
sudo vim dupController.py
and fill it in with:

from __future__ import print_function
from os import path
from collections import deque
from itertools import islice
from subprocess import Popen, PIPE, STDOUT
from threading import Thread

try:
    import Tkinter as tk
except ImportError:
    import tkinter as tk  # Python 3
try:
    from Queue import Queue, Empty
except ImportError:
    from queue import Queue, Empty  # Python 3

info = print


def iter_except(function, exception):
    """Works like builtin 2-argument `iter()`, but stops on `exception`."""
    try:
        while True:
            yield function()
    except exception:
        return


class DupControllerGUI:
    """
    Disclaimer: The original structure of this class has been borrowed from:
    https://gist.github.com/zed/42324397516310c86288
    it has been modified to fit our purpose.
    Implements a threaded gui and a background call to duplicity
    """
    def __init__(self, root, opt_args):
        self.root = root
        # start duplicity restore process
        self.proc = Popen(opt_args, stdout=PIPE, stderr=STDOUT)
        # launch thread to read the subprocess output
        # (put the subprocess output into the queue in a background thread,
        # get output from the queue in the GUI thread.
        # Output chain: proc.readline -> queue -> stringvar -> label)
        q = Queue()
        Thread(target=self.reader_thread, args=[q]).start()
        # GUI Elements
        self._message_var = tk.StringVar()
        self._message_var.set("Restoring: Please refrain from stopping until the process has finished")
        tk.Label(root, font=("Times", 13, "bold"), textvariable=self._message_var).pack()
        # show subprocess' stdout in GUI
        self._stdout_text = tk.Text(root, font=("Mono", 12, "bold"), bg="gray11", fg="antique white")
        self._stdout_text.insert(tk.END, "Starting restore process: \n")
        self._stdout_text.pack()
        self._stdout_text.config(state="disabled")
        # stop subprocess using a button
        self._button = tk.Button(root, text="Stop", font=("Mono", 11, "bold"), command=self.stop)
        self._button.pack()
        self.update(q)  # start update loop
        # Set to Busy
        self.set_cursor("watch")

    def set_cursor(self, param):
        """Set cursor to param"""
        self.root.config(cursor=param)
        self._stdout_text.config(cursor=param)

    def reader_thread(self, q):
        """Read subprocess output in a background thread and put it into the queue."""
        info('Start Duplicity')
        is_finished = False
        for line in iter(self.proc.stdout.readline, b''):
            q.put([line, is_finished])
        line = "Restoring process has successfully finished \n"
        is_finished = True
        q.put([line, is_finished])
        info('Finish Duplicity')

    def update(self, q):
        """Update GUI with items from the queue."""
        # read no more than 10000 lines, use deque to discard lines except the last one,
        for data in deque(islice(iter_except(q.get_nowait, Empty), 10000), maxlen=1):
            if data is None:
                return  # stop updating
            else:
                line = data[0]
                is_finished = data[1]
                self._stdout_text.config(state="normal")
                self._stdout_text.insert(tk.END, line)  # update GUI
                self._stdout_text.config(state="disabled")
                if is_finished:
                    self._button.config(text="Finish")
                    self.set_cursor("")
        self.root.after(40, self.update, q)  # schedule next update

    def stop(self):
        """Stop subprocess and quit GUI."""
        info('stopping')
        self.proc.terminate()  # tell the subprocess to exit

        # kill subprocess if it hasn't exited after a countdown
        def kill_after(countdown):
            if self.proc.poll() is None:  # subprocess hasn't exited yet
                countdown -= 1
                if countdown < 0:  # do kill
                    info('killing')
                    self.proc.kill()  # more likely to kill on *nix
                else:
                    self.root.after(1000, kill_after, countdown)
                    return  # continue countdown in a second
            # clean up
            self.proc.stdout.close()  # close fd
            self.proc.wait()  # wait for the subprocess' exit
            self.root.destroy()  # exit GUI
        kill_after(countdown=5)


class DupController:
    """
    Back-end control of duplicity restore process. Made for integration with thunarx-python
    """
    def __init__(self, duplicity_folder):
        """
        :param duplicity_folder: Root operation folder of duplicity.
        """
        if type(duplicity_folder) is not type(str()):
            raise TypeError("Type of the Duplicity operation folder must be str")
        if duplicity_folder.endswith("/"):
            self.__op_folder = duplicity_folder
        else:
            self.__op_folder = duplicity_folder + "/"
        if not path.exists(self.__op_folder):
            raise ValueError("The Duplicity operation folder does not exist")
        self.exclude_list = "exclude_list.txt"
        self.root_folder = path.expanduser("~") + "/"
        with open(self.__op_folder+"repo_location", "r") as f:
            # drop the trailing newline written by echo
            self.repo_location = f.readline().strip()

    def is_backed_up(self, obj_address):
        """ Parse the exclude file list, determine if the file is excluded or not
        :param obj_address: the address of the object to be restored
        """
        is_excluded = False
        excl_path = self.__op_folder + self.exclude_list
        if path.exists(excl_path):
            with open(excl_path, 'r') as f:
                for line in f:
                    entry = line.strip()
                    # skip blank lines, an empty string would match every path
                    if entry and entry in obj_address:
                        is_excluded = True
        return not is_excluded

    def __extract_rel_file_path(self, full_path):
        """
        :param full_path: Absolute path to the file located in the user's home directory.
        :return: Relative path to the file, in relation to the home folder.
        """
        root_folder_len = len(self.root_folder)
        curr_root_folder = full_path[0:root_folder_len]
        if curr_root_folder != self.root_folder:
            raise ValueError("Invalid root folder \n " +
                             " is : " + curr_root_folder + "\n" +
                             "should : " + self.root_folder)
        return full_path[root_folder_len:len(full_path)]

    @staticmethod
    def get_options():
        """
        Duplicity restore options. Feel free to modify or append in case a different structure is needed
        :return: list of options in [display name, duplicity command] format.
        """
        return [["Jump to 1h ago", "-t1h"],
                ["Jump to 2h ago", "-t2h"],
                ["Jump to 3h ago", "-t3h"],
                ["Jump to 1D ago", "-t1D"],
                ["Jump to 2D ago", "-t2D"],
                ["Jump to 1W ago", "-t1W"],
                ["Jump to 2W ago", "-t2W"]]

    def restore_file(self, file_path, date_str):
        """
        Restore a file to a particular date in time. The restored file will have the suffix _restored
        and won't overwrite the current file
        :param file_path: Absolute file path
        :param date_str: Date string obtained from self.get_options()
        """
        # Build the argument list directly so that paths containing spaces stay intact
        opt_args = ["duplicity", "--no-encryption", date_str,
                    "--file-to-restore", self.__extract_rel_file_path(file_path),
                    self.repo_location, file_path + "_restored"]
        info(opt_args)
        root = tk.Tk()
        app = DupControllerGUI(root, opt_args)
        root.protocol("WM_DELETE_WINDOW", app.stop)  # exit subprocess if GUI is closed
        root.title("Restoring")
        # Set icon, in case the file is missing feel free to set it to whatever icon you like or comment it out
        icon_img = tk.Image("photo", file='/usr/share/icons/gnome/16x16/devices/gnome-dev-cdrom-audio.png')
        root.tk.call('wm', 'iconphoto', root._w, icon_img)
        root.mainloop()
        info('exited')
Now you should have it all up and running. Test it by restoring an arbitrary file.
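One thing to keep in mind about is_backed_up: it compares paths with a plain substring test, so an entry such as /home/my_user/tmp also matches /home/my_user/tmp_notes. If that becomes a problem, a stricter comparison on whole path components is one possible alternative. The sketch below is written for Python 3 and is not part of dupController.py; the function name is_excluded is hypothetical.

```python
from os import path


def is_excluded(obj_address, exclude_entries):
    """Return True if obj_address equals an excluded path or lies below it."""
    target = path.normpath(obj_address)
    for entry in exclude_entries:
        entry = entry.strip()
        if not entry:
            continue  # ignore blank lines
        excluded = path.normpath(entry)  # also drops a trailing slash
        if target == excluded or target.startswith(excluded + path.sep):
            return True
    return False


entries = ["/home/my_user/tmp", "", "/home/my_user/.config/google-chrome/"]
print(is_excluded("/home/my_user/tmp/a.txt", entries))        # True
print(is_excluded("/home/my_user/tmp_notes/a.txt", entries))  # False
```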





Thursday, September 1, 2016

How to Set Up an Ultimate Backup System, Using Duplicity, Systemd and Thunar on Arch Linux Part Two

 

How to Automate Script Calls Using Systemd:

How to Write Systemd Service Units:

Now that we have a working script for performing a system backup, we want to automate it. To get familiar with systemd, I recommend reading through the Arch Linux wiki pages, which are pretty great [1], [2], [3]. The back_sys_up.sh script backs up the system independently of any user. That is why we want it to run whenever the machine is powered on, regardless of whether a user is logged in. To do so, we need to create a service that calls the backup script in the /etc/systemd/system folder:
cd /etc/systemd/system
sudo vim duplicity-system-remote-backup.service
This file must contain the following content:

[Unit]
Description=Incremental remote system backup
[Service]
Type=oneshot
ExecStart=/root/scripts/system_backup/back_sys_up.sh
It is also recommended to find out which services are responsible for the network connection on your machine and add the requirements in this file at the bottom of the [Unit] section:
Requires=your_network_connection.service 
After=your_network_connection.service
This should ensure that your machine is connected to the network before the backup service runs. The [Unit] section holds the generic parameters of a systemd unit and is common to all unit types. In our case it contains just a short description of what the unit does and its requirements. The [Service] section is specific to service units. We have only two parameters here. Type specifies the type of the service, in our case oneshot, which means that the service stops once the script completes. Since a timer will start the script periodically, this is the desired behavior. The second parameter, ExecStart, specifies the script to be executed.
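Since unit files follow an INI-like layout, the two sections described above can be read back with a few lines of Python, for example as a quick sanity check before copying a file into /etc/systemd/system. This is only a sketch: the helper name read_unit is hypothetical, and Python's configparser understands just the simple subset of the syntax used here (it keeps only the last value of a repeated key, for instance). systemd itself ships systemd-analyze verify for a real check.

```python
import configparser

UNIT_TEXT = """\
[Unit]
Description=Incremental remote system backup
[Service]
Type=oneshot
ExecStart=/root/scripts/system_backup/back_sys_up.sh
"""


def read_unit(text):
    """Parse simple unit-file text; option names keep their case."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # systemd option names are case-sensitive
    parser.read_string(text)
    return parser


unit = read_unit(UNIT_TEXT)
print(unit["Service"]["Type"])       # oneshot
print(unit["Service"]["ExecStart"])  # /root/scripts/system_backup/back_sys_up.sh
```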

How to Write Systemd Timer Units:

Now let's configure the timer for our service. To do so execute the following command:
sudo vim duplicity-system-remote-backup.timer
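It is important that the timer and the service files share the same name and differ only in their extension, because by default a timer activates the service unit of the same name. The content of the timer should be as follows: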

[Unit]
Description=Incremental remote system backup timer
[Timer]
OnBootSec=5m
OnUnitActiveSec=1h
[Install]
WantedBy=timers.target
Here there are two important parameters. OnBootSec specifies how long after boot the service is started for the first time, in this case 5 minutes. OnUnitActiveSec specifies how long after the service was last activated it is started again, in this case 1 hour. Now it is time to enable the timer and do some testing. Run the following command to enable the timer:
sudo systemctl enable duplicity-system-remote-backup.timer
To test it, you need to reboot. After the reboot, open two terminal windows. In one of them type:
journalctl -f
Leave this terminal visible; here you can monitor all systemd activity. In the second terminal, type the following to see which timers are currently present, how much time is left until they execute, when they were last active, and so on:
systemctl list-timers
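To make the interplay of OnBootSec=5m and OnUnitActiveSec=1h concrete, here is a small Python sketch that lists the activation times you would expect to see in systemctl list-timers. It is a simplified model under stated assumptions: the machine stays up, the backup finishes well within the hour, and the scheduling slack systemd allows through AccuracySec is ignored. The function name expected_activations is hypothetical.

```python
from datetime import datetime, timedelta


def expected_activations(boot_time, on_boot, on_unit_active, count):
    """Return the first `count` activation times of the timer (simplified)."""
    if count <= 0:
        return []
    times = [boot_time + on_boot]  # first run: OnBootSec after boot
    while len(times) < count:
        times.append(times[-1] + on_unit_active)  # then OnUnitActiveSec apart
    return times


boot = datetime(2016, 9, 1, 8, 0)
for t in expected_activations(boot, timedelta(minutes=5), timedelta(hours=1), 3):
    print(t.strftime("%H:%M"))
# 08:05
# 09:05
# 10:05
```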

Housekeeping:

Since full backups can take a long time on slow networks, and since there is no guarantee that your machine will stay up for that long, duplicity can resume an interrupted backup from the point of interruption. To do so, duplicity divides the bulk of your files into blocks and uploads them block by block. Every time a block is being uploaded, duplicity creates a lock file. This can cause issues when duplicity is interrupted unexpectedly, for example when the machine is turned off in the middle of a backup. After the reboot the lock file is still in place, and duplicity refuses to perform backups while a lock file from another instance exists. To overcome this, we need a small script that runs after every boot and deletes any leftover lock files before duplicity is started. Create the following file:
sudo vim /root/scripts/system_backup/remove_locks.sh
and fill it in as follows:

#!/bin/bash
# Delete stale duplicity lock files; -delete does nothing when no file matches
find /root/.cache/duplicity -name "*.lock" -delete
find /home/my_user/.cache/duplicity -name "*.lock" -delete
To save time, I took a shortcut and added the cleanup for my_user to this script as well. By default the lock files are stored in the ~/.cache/duplicity folder. If you did not explicitly specify a different folder, this script should do the trick. To automate this process we need to create systemd service and timer units for this script. Create the files duplicity-cleanup-onboot.service and duplicity-cleanup-onboot.timer in the /etc/systemd/system/ folder, with the following content:

duplicity-cleanup-onboot.service:

[Unit]
Description=Clean up residual duplicity locks
[Service]
Type=oneshot
ExecStart=/root/scripts/system_backup/remove_locks.sh

duplicity-cleanup-onboot.timer:

[Unit]
Description=Clean up residual duplicity locks timer
[Timer]
OnBootSec=1m
[Install]
WantedBy=timers.target
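The lock cleanup itself can also be sketched in Python, which makes it easy to try against a scratch directory before pointing anything at a real cache folder. This is an illustration under assumptions rather than a replacement for remove_locks.sh: the function name remove_lock_files is hypothetical, and it is written for Python 3.

```python
from pathlib import Path


def remove_lock_files(cache_dir):
    """Delete every *.lock file below cache_dir and return the removed paths."""
    cache_dir = Path(cache_dir)
    removed = []
    if not cache_dir.is_dir():
        return removed  # nothing to clean up
    # Collect the matches first so the tree is not modified while walking it
    for lock in list(cache_dir.rglob("*.lock")):
        if lock.is_file():
            lock.unlink()
            removed.append(lock)
    return removed
```

Calling remove_lock_files("/root/.cache/duplicity") would then cover the same files as the first find line of the shell script.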
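Enable the cleanup timer in the same way as the backup timer:
sudo systemctl enable duplicity-cleanup-onboot.timer
Now you should have a fully functioning system backup using duplicity. In the next part we will describe the creation of the back-end as well as the front-end for the backup system of the user.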

Wednesday, August 31, 2016

How to Set Up an Ultimate Backup System, Using Duplicity, Systemd and Thunar on Arch Linux Part One

 

Introduction:

In this multi-part tutorial we will discuss how to set up a complete backup system on a freshly installed Arch Linux. We will use duplicity as the backup program, which will perform automatic backups of your system to a remote server. To automate the backup process, we will use systemd units and timers. Once the back-end is set up, we'll implement a front-end GUI that is integrated with the Thunar file browser to make restoring backed-up files more comfortable and user friendly.

List of tools:

Back-end: 

  • Backup program: Duplicity
  • Backup automation:  Systemd units and timers
  • Scripting: Shell

Front-end:

  • File browser: Thunar
  • Plugin API: Thunarx-python
  • Scripting: Python 2.7

Installing Duplicity and Configuring SSH on the Client Side:

Installing duplicity on Arch Linux is pretty straightforward. Execute the following command:
sudo pacman -Syu duplicity
Enter your password, sit back and relax. Once the installation is complete, it is time to configure the remote repository. We will use ssh to transfer the files between the server and the local machine. There are a number of tutorials and how-tos on the internet describing how to properly set up ssh access on a remote server, so I will omit the detailed description and concentrate only on the basic configuration on the client side.

We will split the backup system into two parts. The first part runs at the system level and is responsible for backing up your system files such as /var, /usr, /boot, etc. The second part runs at the user level and is responsible for backing up user files, i.e. /home/USER.

Preparing system level backups:

How to Configure SSH for automated remote login: 

First things first, you need to write an ssh config file for the root user. If you do not have the file /root/.ssh/config already, create it by typing:
sudo vim /root/.ssh/config
Inside of this file you need to add the following content:
Host server
HostName my_domain.me
Port 22
User my_user
Here my_domain.me should be replaced by the domain name or the IP address of your server. Note that it can also be a server in your local network, addressed by its local IP address. The important thing is that the server is accessible from your machine. In this example we are using port 22 since it is the default ssh port. For security reasons it is recommended to set the port to a higher number. This can be done by editing /etc/ssh/sshd_config on your server and changing the Port parameter in .ssh/config correspondingly.

Next you need to generate a pair of public and private keys for secure ssh access to the server. To do so type the following in the command line:
sudo ssh-keygen
Answer all of the questions. When prompted to enter a passphrase, leave it blank, otherwise you will not be able to automate the ssh login to the remote server. Next you need to copy your public key to your server. To do so type:
sudo ssh-copy-id server
Once you type in your sudo password followed by the password of my_user on the remote server, your public key will be copied. To test whether your password-less ssh login works, type the following into the command line:
sudo ssh server
Now you should be logged in to your remote server as my_user. Type exit to return to your local machine. Now that the automated ssh login is configured for the root user, you need to configure it for the regular user as well. Repeat all of the steps in this sub-chapter without sudo in front.
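Because the backups will later run unattended, it is worth confirming that the login really works without any prompt. One way to check this is ssh's BatchMode option, which makes ssh fail instead of asking for a password. The Python 3 sketch below wraps that check; the function names are hypothetical and the host alias server is the one defined in the ssh config above.

```python
import subprocess


def batch_login_command(host):
    """Build an ssh call that fails instead of prompting for a password."""
    return ["ssh", "-o", "BatchMode=yes", host, "true"]


def can_login_unattended(host, timeout=15):
    """Return True if `ssh host true` succeeds without any interaction."""
    try:
        result = subprocess.run(batch_login_command(host), timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return False  # ssh missing or the connection hung
    return result.returncode == 0
```

Run can_login_unattended("server") once as root and once as the regular user, since both need their own key on the server.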


How to set up Duplicity:

Now it is time to set up duplicity to perform backups to the remote server. First of all you will need a folder on the remote server where duplicity will store its files. Since we are planning to split the backup process into two subsystems, we will create two folders: one for system backups and one for user backups. Type the following into the command line:
ssh server
mkdir -p /home/my_user/repos/system
mkdir -p /home/my_user/repos/me
Type exit to return to your local machine. Now let's write the backup script for the system backup first. Create a folder in the home directory of root to store all of the duplicity-related files:
sudo mkdir -p /root/scripts/system_backup
sudo vim /root/scripts/system_backup/back_sys_up.sh
Now fill in the following content:
#!/bin/bash
# Exit quietly if duplicity is not installed
command -v duplicity > /dev/null || exit 0
# Write the start time into the log file
echo "start: $(date +%D-%T)" >> /var/log/duplicity_system.log
# Call duplicity to begin the backup process
duplicity \
    --no-encryption \
    --full-if-older-than 15D \
    --num-retries 3 \
    --log-file /var/log/duplicity_system.log \
    --progress \
    --exclude /proc \
    --exclude /home \
    --exclude /dev \
    --exclude /mnt \
    --exclude /lost+found \
    --exclude /sys \
    --exclude /run \
    --exclude /tmp \
    / scp://my_user@server///home/my_user/repos/system
# Remove all of the full backups and their increments, except the last three
duplicity \
    remove-all-but-n-full 3 \
    --force \
    scp://my_user@server///home/my_user/repos/system
# Write the end time into the log file
echo "end: $(date +%D-%T)" >> /var/log/duplicity_system.log
# Save the current state of the backup collection
duplicity \
    collection-status \
    scp://my_user@server///home/my_user/repos/system > \
    /root/scripts/system_backup/current_status.txt
Duplicity is highly configurable, and there are many options that can be set. We will discuss a few of them here; for more information please refer to [1] and [2]. Due to a minor bug in the current version of duplicity (0.7.09) there are some issues with gpg encryption for the root user. When testing I discovered that duplicity fails to fetch the gpg key if the last backup session has been interrupted, hence the --no-encryption option. If you are using a later version of duplicity or want to tackle the problem on your own, see [3] for more information on how to set up encryption on the remote server.

Duplicity can perform two types of backup: full and incremental. A full backup contains all files and folders that are not excluded, whereas an incremental backup records only the changes since the previous backup. Incremental backups are small and fast and put less strain on your network, but in order to restore a file, the full backup and the incremental backups made after it must be downloaded to reconstruct the file. The longer the chain of incremental backups, the longer the reconstruction takes. Therefore it makes sense to force a full backup once in a while. --full-if-older-than 15D makes duplicity perform a full backup whenever the last full backup is more than 15 days old.
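The cost of a long incremental chain is easy to see in a toy model. The Python sketch below is a simplified illustration and not duplicity's actual bookkeeping: backups are (day, kind) pairs, and the hypothetical function backups_needed returns everything that has to be fetched to restore the state of a given day.

```python
def backups_needed(backups, restore_time):
    """Return the backups required to restore the state at restore_time.

    `backups` is a list of (time, kind) tuples sorted by time, where kind is
    "full" or "inc". The result is the latest full backup taken at or before
    restore_time, followed by every incremental up to restore_time.
    """
    chain = []
    for time, kind in backups:
        if time > restore_time:
            break
        if kind == "full":
            chain = [(time, kind)]  # a full backup starts a new chain
        elif chain:
            chain.append((time, kind))
    return chain


history = [(0, "full"), (1, "inc"), (2, "inc"), (15, "full"), (16, "inc")]
print(len(backups_needed(history, 2)))   # 3
print(len(backups_needed(history, 16)))  # 2
```

Restoring day 2 needs the full backup plus two increments, while restoring day 16 needs only the fresh full backup from day 15 and one increment, which is the effect a periodic full backup is meant to achieve.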

Run the following commands to make the script executable and create your first full backup:
sudo chmod +x /root/scripts/system_backup/back_sys_up.sh 
sudo /root/scripts/system_backup/back_sys_up.sh 
Now you are done with the first part of this tutorial. We will be discussing the automation process in the next part.