September 2014 Archives

Thu Sep 11 14:39:47 ICT 2014

Amanda back-up of VMware ESXi host, with ghettoVCB

Back-up a virtual machine

On Amanda server

The Amanda server runs FreeBSD; the syntax below is FreeBSD-specific.

  1. Patch the Amanda server: in server-src/chunker.c, change CONNECT_TIMEOUT to something sensible. In my experience, a virtual machine with 25 GB of data takes 45 minutes to snapshot; I chose a 3-hour timeout.
  2. Configure NFS server; in /etc/rc.conf:
    # NFS for VMware back-up
    nfs_server_enable="YES"
    nfs_server_flags="-u -t"
    rpcbind_enable="YES"
    rpc_lockd_enable="YES"
    rpc_statd_enable="YES"
    
    in /etc/exports; we will use the directory /virtual for temporary back-up of VMware virtual machines:
    /virtual -maproot=0:0 virtual1000.cs.ait.ac.th virtual2000.cs.ait.ac.th virtual3000.cs.ait.ac.th virtual4000.cs.ait.ac.th virtual5000.cs.ait.ac.th
    
    in /etc/hosts.allow (this may be overkill, IP depends on CSIM network):
    # Rpcbind is used for all RPC services; protect your NFS!
    # (IP addresses rather than hostnames *MUST* be used here)
    rpcbind : 192.41.170.0/255.255.255.0 : allow
    rpcbind : 10.41.170.0/255.255.255.0 : allow
    rpcbind : ALL : deny
    
    # Rquota used by NFS
    rpc.rquotad: 192.41.170.0/255.255.255.0 : allow
    rpc.rquotad: 10.41.170.0/255.255.255.0 : allow
    rpc.rquotad: ALL : deny
    
    # Portmapper is used for all RPC services; protect your NFS!
    # (IP addresses rather than hostnames *MUST* be used here)
    portmap : 192.41.170.0/255.255.255.0 : allow
    portmap : 10.41.170.0/255.255.255.0 : allow
    portmap : ALL : deny
    
  3. Configure sudo(8); in sudoers (the snapshot is created by root on the ESXi server, so the user amanda needs to escalate privileges to remove the snapshot once it has been saved):
    Cmnd_Alias      AMANDA = /bin/rm
    amanda  ALL=(root) NOPASSWD: AMANDA
    
  4. Install the Perl packages Mail::SendEasy (p5-Mail-SendEasy) and Getopt::Long (p5-Getopt-Long).
  5. Install the script vmware in /usr/local/libexec/amanda/application, make sure it is mode 755.

    Edit the script to reflect the list of ESXi servers.

  6. Configure Amanda, in amanda.conf (a specific dumptype calls the script vmware):
    define script vmware {
            plugin "vmware"
            execute-where server
            execute-on pre-dle-backup, post-dle-backup
            }
    
    define dumptype vmware {
            comment "Full dump of VMware virtual machine snapshot"
            auth "bsd"
            index yes
            compress server best
            estimate server
            priority high
            program "GNUTAR"
            allow-split true
            skip-incr
            script "vmware"
    }
    
    Obviously, incremental dumps make no sense here.
    Then in the file disklist (note that the DLE name depends on the name of the virtual machine; spaces in the name need to be properly escaped):
    amanda             /virtual/mybackups/Desktop\ Olivier     vmware  1       disk
    
    ghettoVCB adds the subdirectory mybackups in /virtual; that can be changed in the configuration of ghettoVCB.
  7. Generate an SSH key pair; the private key should be saved in .ssh/id_rsa_virtual in the home directory of the user running Amanda.
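
The escaping rule for the disklist in step 6 can be sketched as a small shell function. This is only an illustration: the function name disklist_entry and its default arguments are hypothetical, and it escapes spaces only, as in the example entry; other special characters in a virtual machine name are not handled.

```shell
#!/bin/sh
# Hypothetical helper (not part of Amanda or ghettoVCB): print one disklist
# line for a virtual machine, escaping the spaces in its name.
disklist_entry() {
    vm_name=$1
    backup_root=${2:-/virtual/mybackups}   # directory ghettoVCB writes into
    amanda_host=${3:-amanda}               # host column of the disklist
    # Put a backslash in front of every space in the VM name.
    escaped=$(printf '%s' "$vm_name" | sed 's/ /\\ /g')
    printf '%s %s/%s vmware 1 disk\n' "$amanda_host" "$backup_root" "$escaped"
}

disklist_entry "Desktop Olivier"
# prints: amanda /virtual/mybackups/Desktop\ Olivier vmware 1 disk
```

The output can be appended to the disklist by hand after checking it; the sketch does not verify that the directory exists.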

On ESXi server

  1. Install the SSH public key in /etc/ssh/keys-root/authorized_keys.
  2. Install ghettoVCB into /vmfs/volumes/datastore1/ghettoVCB.
    When a VMware server reboots, the file systems are regenerated from back-up/master copies. Anything installed outside the datastores is lost. For this script to work, you must have a datastore named datastore1.
    The configuration file ghettoVCB.conf must contain:
    VM_BACKUP_VOLUME="/vmfs/volumes/Oak (NFS) private network/mybackups"
    DISK_BACKUP_FORMAT=thin
    VM_BACKUP_ROTATION_COUNT=3
    POWER_VM_DOWN_BEFORE_BACKUP=0
    ENABLE_HARD_POWER_OFF=0
    ITER_TO_WAIT_SHUTDOWN=3
    POWER_DOWN_TIMEOUT=5
    ENABLE_COMPRESSION=0
    VM_SNAPSHOT_MEMORY=0
    VM_SNAPSHOT_QUIESCE=0
    ALLOW_VMS_WITH_SNAPSHOTS_TO_BE_BACKEDUP=0
    ENABLE_NON_PERSISTENT_NFS=1
    UNMOUNT_NFS=1
    NFS_SERVER=amanda1000.cs.ait.ac.th
    NFS_VERSION=nfs
    NFS_MOUNT=/virtual
    NFS_LOCAL_NAME=nfs_temp_amanda
    NFS_VM_BACKUP_DIR=mybackups
    SNAPSHOT_TIMEOUT=15
    WORKDIR_DEBUG=0
    VM_SHUTDOWN_ORDER=
    VM_STARTUP_ORDER=
    
    Note that all the paths reflect the paths coded in the script.
    ghettoVCB will automatically mount amanda1000:/virtual/mybackups and unmount it after the snapshot has been taken; the compression will be done by Amanda.
    Make sure that the NFS_LOCAL_NAME does not conflict with the name of any datastore on the server.
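
The name conflict mentioned above can be checked before the first run. The sketch below assumes that every datastore shows up as an entry under /vmfs/volumes on the ESXi host; the function name nfs_local_name_is_free is hypothetical and is not part of ghettoVCB.

```shell
#!/bin/sh
# Hypothetical check: succeed (status 0) when no entry with the given name
# exists under the volumes directory, fail (status 1) otherwise.
# Assumption: datastores appear as directories or symlinks in /vmfs/volumes.
nfs_local_name_is_free() {
    name=$1
    volumes_dir=${2:-/vmfs/volumes}
    # -L also catches a dangling symlink left behind by a stale mount.
    if [ -e "$volumes_dir/$name" ] || [ -L "$volumes_dir/$name" ]; then
        return 1
    fi
    return 0
}
```

For example, `nfs_local_name_is_free nfs_temp_amanda || echo "name already in use"`. If a previous run left the temporary NFS mount in place, the check reports a conflict as well, which is probably what you want to know.
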
That's all for the ESXi server; Amanda should be ready to take back-ups.

Restore a machine

  1. Extract the back-up from Amanda back-up into the directory used by Amanda NFS server.
  2. Mount the NFS storage of Amanda onto the ESXi server; use vSphere Client or the command:
    vim-cmd hostsvc/datastore/nas_create mybackup 3 <mounted_directory> 0 amanda1000.cs.ait.ac.th
    
  3. Create a configuration file with: the full pathname of the snapshot directory (the one ending with the date); the full path of the datastore; 1 (for thick provisioning). For example:
    #"<DIRECTORY or .TGZ>;<DATASTORE_TO_RESTORE_TO>;<DISK_FORMAT_TO_RESTORE>"
    # DISK_FORMATS
    # 1 = zeroedthick
    # 2 = 2gbsparse
    # 3 = thin
    # 4 = eagerzeroedthick
    "/vmfs/volumes/oak1000/mybackups/DNS/DNS-2014-09-15_06-34-31;/vmfs/volumes/datas
    
  4. Run the restore:
    /vmfs/volumes/datastore1/ghettoVCB-restore.sh -c <configuration-file> [-d 2]
    
    The option -d 2 is for debug.
  5. Unmount the NFS share, either with vSphere Client or with the command:
    vim-cmd hostsvc/datastore/destroy mybackup
    
  6. Remove the snapshot of the machine from Amanda NFS server space.
  7. There may be strange questions asked when starting the restored machine, use your gut instincts.

Ultimate back-up

This back-up is done before a user account is deleted. The result is stored in the file VMware-snapshot.<timestamp> in the home directory of the user.

It uses the script /home/java/on/Ldap/back-vmmachine. This script accepts one pool name as argument; the pool name must start with the account name of the user that will be deleted. For example, for the user st101876, all pools with a name starting with st101876 will be deleted, and the virtual machines in these pools will be deleted too.
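
The prefix rule can be illustrated with a short filter. This is not the script mentioned above, only a sketch of the matching rule; the function name pools_for_account is hypothetical, and pool names are assumed to arrive one per line on standard input.

```shell
#!/bin/sh
# Hypothetical filter: read pool names from stdin, one per line, and print
# only those whose name starts with the given account name.
pools_for_account() {
    account=$1
    while IFS= read -r pool; do
        case $pool in
            "$account"*) printf '%s\n' "$pool" ;;
        esac
    done
}
```

Note that a plain prefix match also selects pools of any other account whose name begins with the same characters (st101876 matches a pool named st1018765-test), so it is worth reviewing the list before anything is deleted.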

On ufo server

The script is in ~on/Ldap/backup_vmmachine and is called by the job removing user accounts. For the moment, the script is limited to calling virtual3 but can be extended to call the other virtual servers.

On ESXi server

The installation of the public key and ghettoVCB is the same as above; the major difference is in the configuration file for ghettoVCB.

  1. The file name is /vmfs/volumes/datastore1/ghettoVCB/ghettoVCB_delete.conf.
  2. The configuration file uses the shutdown option, so hopefully the disks will be synced before they are snapshotted; in the ghettoVCB configuration:
    POWER_VM_DOWN_BEFORE_BACKUP=1
    ENABLE_HARD_POWER_OFF=1
    
  3. If the ESXi server is not yet mounting any share from oak1000, the NFS share should be:
    ENABLE_NON_PERSISTENT_NFS=1
    UNMOUNT_NFS=1
    NFS_SERVER=oak1000.cs.ait.ac.th
    NFS_VERSION=nfs
    NFS_MOUNT=/home/home/virtual
    NFS_LOCAL_NAME=nfs_temp_delete
    NFS_VM_BACKUP_DIR=mydelete
    
    Make sure that the NFS_LOCAL_NAME does not conflict with the name of any datastore on the server.
  4. If the ESXi server is already mounting a share from oak1000, use that existing mounted share:
    VM_BACKUP_VOLUME="/vmfs/volumes/<Whatever existing datastore name>"
    ...
    ENABLE_NON_PERSISTENT_NFS=0
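
The choice between steps 3 and 4 can be sketched as a function that prints the matching settings. The function name delete_conf_nfs_fragment is hypothetical; the values are the ones given in the two steps, and the settings elided by "..." in step 4 are not filled in.

```shell
#!/bin/sh
# Hypothetical helper: print the NFS-related settings for
# ghettoVCB_delete.conf. With an argument (the name of a datastore already
# mounted from oak1000) it prints the step 4 settings; without one it prints
# the step 3 settings.
delete_conf_nfs_fragment() {
    existing_datastore=${1:-}
    if [ -n "$existing_datastore" ]; then
        printf 'VM_BACKUP_VOLUME="/vmfs/volumes/%s"\n' "$existing_datastore"
        printf 'ENABLE_NON_PERSISTENT_NFS=0\n'
    else
        cat <<'EOF'
ENABLE_NON_PERSISTENT_NFS=1
UNMOUNT_NFS=1
NFS_SERVER=oak1000.cs.ait.ac.th
NFS_VERSION=nfs
NFS_MOUNT=/home/home/virtual
NFS_LOCAL_NAME=nfs_temp_delete
NFS_VM_BACKUP_DIR=mydelete
EOF
    fi
}
```

The output is meant to be pasted into the configuration file by hand; the sketch does not detect by itself whether a share from oak1000 is already mounted.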
    

Posted by Olivier | Permanent link | File under: administration, vmware, backup