2023-11-28 setup a google drive backup

Create a container

On ovh3 I created container 150 and ran the postinstall script.

I set a small size for the main disk (50G), as it is only for the system.

After creating the container, I added a 500G disk through the Proxmox interface.

I also created a user for myself (using the mkuseralias script), as I will need to connect with ssh.

Techbot account

I created a techbot account on the openfoodfacts Google organization.

I gave it read access to the whole Google Drive.

Trying grive2

Building grive2

Following instructions on https://yourcmc.ru/wiki/Grive2#Build_debian_package

Inside the container, I cloned the repository and built the deb package:

cd /opt
git clone https://github.com/vitalif/grive2.git
cd grive2/
git log|head
    Author: Vitaliy Filippov <[email protected]>
    Date:   Sat Dec 10 13:20:39 2022 +0300

        Cache layers during Docker build, take source from the current dir instead of cloning

    commit eb82bfe28b9e796721ffdffc7426684350f71a8a

apt install dpkg-dev
# Note: I got this list from a first run of dpkg-buildpackage.
# A '|' in its output means you must choose between two alternative libraries (here for libcurl4).
apt install git cmake build-essential  debhelper pkg-config zlib1g-dev libcurl4-openssl-dev libboost-filesystem-dev libboost-program-options-dev libboost-test-dev libboost-regex-dev libexpat1-dev libgcrypt-dev libyajl-dev
# I did not use -j4 because the output of parallel compilation is harder to read, and also because of limited memory
dpkg-buildpackage
cd ..
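The package list above was read by hand from the build tool's complaint. A minimal sketch of doing that mechanically follows. It assumes the complaint is the usual "Unmet build dependencies: ..." line printed by dpkg-checkbuilddeps (which dpkg-buildpackage runs first). The function name is made up for illustration, and for "a | b" alternatives it simply keeps the first choice, which may not always be the one you want.

```shell
# Turn an "Unmet build dependencies: ..." line (read on stdin) into a plain
# package list. Version constraints such as "(>= 9)" are dropped, and for
# alternatives written as "a | b" only the first choice is kept.
unmet_deps_to_packages() {
  grep 'Unmet build dependencies: ' \
    | sed -E -e 's/^.*Unmet build dependencies: //' \
             -e 's/ *\([^)]*\)//g' \
             -e 's/ *\| *[^ ]+//g'
}

# Usage (inside the source tree; the tool reports on stderr):
#   dpkg-checkbuilddeps 2>&1 | unmet_deps_to_packages
```

The resulting list still deserves a quick review before being handed to apt install.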

and install:

cd /opt
dpkg -i grive_0.5.3_amd64.deb

First sync - authentication

Following https://yourcmc.ru/wiki/Grive2#Usage

I go to /mnt/gdrive-backup/

Start a screen: screen -S grive

cd /mnt/gdrive-backup/
grive -a
-----------------------
Please open this URL in your browser to authenticate Grive2:
...

You get a URL that you must open in a browser. But this URL has a redirect_uri parameter which redirects to localhost on a specific port. In my case the port was 57125 (but it changes on each run).

For auth to work, I have to forward this port on my machine to the same port in the container, using ssh.

On my machine:

ssh gdrive-backup -L 57125:127.0.0.1:57125
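Since the port changes on every run, the forwarding command has to be rebuilt each time. Here is a small sketch that reads the port out of the redirect_uri parameter and prints the matching ssh command. The function names are hypothetical, and it assumes the redirect_uri points at localhost or 127.0.0.1, possibly percent-encoded. The exact shape of the URL printed by grive is not reproduced here.

```shell
# Extract the localhost port from the redirect_uri parameter of an auth URL.
# Handles both plain and percent-encoded ":" and "/".
port_from_auth_url() {
  printf '%s\n' "$1" \
    | sed -E -e 's/%3[Aa]/:/g' -e 's/%2[Ff]/\//g' \
    | sed -n -E 's#.*redirect_uri=https?://(localhost|127\.0\.0\.1):([0-9]+).*#\2#p'
}

# Print the ssh command that forwards that port to the container.
# "gdrive-backup" is the ssh host alias used in this document.
tunnel_command() {
  port=$(port_from_auth_url "$1")
  [ -n "$port" ] || { echo "no localhost port found in URL" >&2; return 1; }
  printf 'ssh gdrive-backup -L %s:127.0.0.1:%s\n' "$port" "$port"
}

# Usage (run on the local machine, pasting the URL printed in the container):
#   tunnel_command 'PASTE-THE-AUTH-URL-HERE'
```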

I can then copy and paste the URL into a browser tab where I am authenticated as techbot. After going through the Google auth wizard, I am redirected and get the message:

Authenticated successfully. Please close the page

To be able to use Google auth, we need either to use w3m or to have localhost:58043 redirect to the container.

It does not work for us

In the end, testing revealed these problems:

  1. grive does not synchronize Google documents, while syncing Google Docs is exactly what we need.
  2. It does not seem to support synchronizing shared drives, which, again, is what we need.

rclone

Install

We need unzip installed.

sudo apt install unzip

Then following https://rclone.org/install/

sudo -v ; curl https://rclone.org/install.sh | sudo bash

Config

Creating an OAuth profile

I prefer to do that, as backing up many files can lead to many requests.

I followed https://rclone.org/drive/#making-your-own-client-id

  • logged into https://console.developers.google.com/
  • select projects --> create new
    • name: "drive backup"
    • org and zone: openfoodfacts.org
  • "ENABLE APIS AND SERVICES" search for "Drive", and enable the "Google Drive API".
  • Click "Credentials" in the left-side panel (not "Create credentials", which opens the wizard).
  • "CONFIGURE CONSENT SCREEN"
    • first step (creation):
      • user type: external --> no ! use internal (see below)
      • click create
    • second step (app config):
      • app name: rclone
      • assistance email: tech - at - openfoodfacts.org
      • dev email: tech - at - openfoodfacts.org
      • click save
    • third step (scopes) :
      • click on add and remove application scope
      • select: .../auth/docs, .../auth/drive, .../auth/drive.metadata.readonly
      • click update
      • click save and continue
    • 4th step (tests users):

I first used application type "External", but finally decided to go for an internal app (this is fine for our use). So in the OAuth consent screen, I changed the type to "Internal". The scenario above might therefore contain some unnecessary steps.

Add an OAuth account:

  • click on "Credentials" in the left-side panel, to come back to it
  • click on "+ CREATE CREDENTIALS" button at the top of the screen, then select "OAuth client ID".
    • Choose an application type of "Desktop app", name "rclone backup drive on ovh3" and click "Create"

Adding drive

following https://rclone.org/drive/

I will first create the drive corresponding to Open Food Facts.

I get the drive id in the url of the drive (after folders/)
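As a small illustration of "after folders/", here is a sketch that pulls the ID out of a Drive URL. The function name and the example ID are made up. It assumes the ID contains only letters, digits, "-" and "_".

```shell
# Print the ID that follows "/folders/" in a Google Drive URL.
# Prints nothing when the URL has no such segment.
drive_id_from_url() {
  printf '%s\n' "$1" | sed -n -E 's#.*/folders/([A-Za-z0-9_-]+).*#\1#p'
}

# Usage (the ID below is a made-up placeholder):
#   drive_id_from_url 'https://drive.google.com/drive/folders/0ABCdef_123-XYZ'
```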

rclone config
...
No remotes found, make a new one?
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n

Enter name for new remote.
name> off-gdrive

Option Storage.
Type of storage to configure.
Choose a number from below, or type in your own value.
...
18 / Google Drive
   \ (drive)
...
Storage> drive

Option client_id.
Google Application Client Id
Setting your own is recommended.
See https://rclone.org/drive/#making-your-own-client-id for how to create your own.
If you leave this blank, it will use an internal key which is low performance.
Enter a value. Press Enter to leave empty.
client_id> ***********.apps.googleusercontent.com

Option client_secret.
OAuth Client Secret.
Leave blank normally.
Enter a value. Press Enter to leave empty.
client_secret> ********************

Option scope.
Comma separated list of scopes that rclone should use when requesting access from drive.
Choose a number from below, or type in your own value.
Press Enter to leave empty.
...
scope> drive.readonly

Option service_account_file.
Service Account Credentials JSON file path.
...
service_account_file> 

Edit advanced config?
...
y/n> n
... (all the rest is default)

We then come to authorization:

2023/11/28 15:15:21 NOTICE: Make sure your Redirect URL is set to "http://127.0.0.1:53682/" in your custom config.
2023/11/28 15:15:21 NOTICE: If your browser doesn't open automatically go to the following link: http://127.0.0.1:53682/auth?state=******
2023/11/28 15:15:21 NOTICE: Log in and authorize rclone for access

So I first did a port redirection with ssh from my machine:

ssh gdrive-backup -L 53682:127.0.0.1:53682

and opened the url on my machine, but in a tab where I am authenticated as techbot. I followed the auth screen and finally got a:

Success!
All done. Please go back to rclone.

The configuration then proceeds on the server:

Configure this as a Shared Drive (Team Drive)?

y) Yes
n) No (default)
y/n> y

Option config_team_drive.
Shared Drive
Choose a number from below, or type in your own string value.
Press Enter for the default (0AHYW2qKn7jMpUk9PVA).
 1 / OFF Fellowship drive
   \ (*******)
 2 / Open Food Facts
   \ (*******)
config_team_drive> 2

...

Keep this "off-gdrive" remote?
y) Yes this is OK (default)
e) Edit this remote
d) Delete this remote
y/e/d> y

FIXME add second drive.

First sync

see https://rclone.org/commands/rclone_copy/

As simple as:

rclone sync off-gdrive: /mnt/gdrive-backup/Open-Food-Facts/

We use sync and not copy because sync also removes, on the destination, files that were deleted on the remote. We will however keep old files through the ZFS snapshot mechanism.
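Because sync deletes files on the destination, it can be worth previewing a run first. A minimal sketch follows, using rclone's --dry-run flag, which reports what would be copied or deleted without changing anything. The wrapper function name is hypothetical.

```shell
# Preview a sync: --dry-run makes rclone report what it would copy or
# delete on the destination without changing anything.
preview_sync() {
  remote=$1   # e.g. off-gdrive:
  dest=$2     # e.g. /mnt/gdrive-backup/Open-Food-Facts/
  rclone sync --dry-run "$remote" "$dest"
}

# Usage:
#   preview_sync off-gdrive: /mnt/gdrive-backup/Open-Food-Facts/
```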

Problem in first sync

Looking at the process after a while, I saw it had stopped because the disk was full, with 500G of data.

I increased the ZFS dataset size with pct resize 150 mp0 2T, but I was suspicious.

I looked at the properties of the drive in Google Drive, and it reported 105G of data.

I then ran du -sh *|sort -h in /mnt/gdrive-backup/Open-Food-Facts and saw the Community Management folder taking a lot of space! After investigation, it was being downloaded several times because there is a shortcut to the same folder inside the folder.
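A simple guard could have caught this earlier: compare the space used locally with what the drive is expected to hold. The sketch below is only an illustration. The function name and the 200 GB limit are assumptions (the drive itself reported 105G).

```shell
# Succeed (exit 0) when DIR uses more than LIMIT_KB kilobytes on disk.
dir_over_limit() {
  dir=$1
  limit_kb=$2
  used_kb=$(du -sk "$dir" | awk '{print $1}')
  [ "$used_kb" -gt "$limit_kb" ]
}

# Usage, with a hypothetical limit of 200 GB:
#   if dir_over_limit /mnt/gdrive-backup/Open-Food-Facts $((200 * 1024 * 1024)); then
#     echo "backup is larger than expected, check for duplicated folders" >&2
#   fi
```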

I thought I had configured the drive not to download linked content, but that does not seem to be the case!

https://rclone.org/flags/ helped me find the flag I wanted to change: --drive-copy-shortcut-content

So I ran rclone config update:

rclone config update off-gdrive copy_shortcut_content=false config_refresh_token=false
rclone config update  off-gdrive copy_shortcut_content=0 config_refresh_token=false

I also changed some other flags for efficiency:

rclone config update off-gdrive  buffer_size=256M fast_list=true config_refresh_token=false

I searched the drive and found a lot of shortcuts, so instead of cleaning up the mess, I removed all the content and ran the sync again!

But it was not enough!

So I tried the skip shortcuts option:

rclone config update off-gdrive skip_shortcuts=true config_refresh_token=false
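To confirm that an option really landed in the remote's configuration, the section printed by rclone config show can be checked. A minimal sketch follows. It assumes the "key = value" layout of the rclone config file and uses skip_shortcuts, the Drive backend's option name for skipping shortcuts. The helper name is made up.

```shell
# Read an rclone config section on stdin and succeed when KEY is set to VALUE.
config_has() {
  key=$1
  value=$2
  awk -F ' *= *' -v k="$key" -v v="$value" \
    '$1 == k && $2 == v { found = 1 } END { exit found ? 0 : 1 }'
}

# Usage:
#   rclone config show off-gdrive | config_has skip_shortcuts true \
#     && echo "shortcuts will be skipped"
```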

Systemd service

I created rclone_backup@.{service,timer} and linked it.

I renamed /mnt/gdrive-backup/Open-Food-Facts to /mnt/gdrive-backup/off-gdrive so that the folder name matches the remote name.

Then activate:

systemctl daemon-reload
systemctl enable --now rclone_backup@off-gdrive
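The unit files themselves are not reproduced in this note. As a rough idea of what such template units could look like, here is a hypothetical sketch, not the files actually deployed. It assumes rclone is installed in /usr/bin, that the instance name (%i) is both the remote name and the folder name under /mnt/gdrive-backup, and that a daily schedule is wanted.

```shell
# Write hypothetical rclone_backup@.service and rclone_backup@.timer
# templates into the directory given as $1 (for review, not installation).
write_units() {
  dir=$1
  cat > "$dir/rclone_backup@.service" <<'EOF'
[Unit]
Description=rclone backup of %i

[Service]
Type=oneshot
# Assumption: remote name and destination folder are both the instance name
ExecStart=/usr/bin/rclone sync %i: /mnt/gdrive-backup/%i/
EOF
  cat > "$dir/rclone_backup@.timer" <<'EOF'
[Unit]
Description=Daily rclone backup of %i

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
EOF
}

# Usage (writes to a scratch directory):
#   write_units "$(mktemp -d)"
```

With units shaped like these, the [Install] section lives in the timer, so the instance to enable would be rclone_backup@off-gdrive.timer. The timer then starts the service of the same instance name on schedule.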