Firebase Python Backup/Restore Script

I need a small Python script which will:

  1. Fetch data from target Firebase/Wilddog root or node path and save data to json files, using the node name as file name
  2. Restore data from saved local json file to target Firebase/Wilddog root or node path
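The file-handling half of these two requirements can be sketched without touching the network. This is a minimal sketch, not the awarded solution: the helper names (`node_file_name`, `save_node`, `load_node`) and the `root.json` fallback for the root path are assumptions made for illustration.

```python
import json
import os


def node_file_name(node_path):
    """Return the file name for a node path, e.g. '/users' -> 'users.json'."""
    name = node_path.strip("/").split("/")[-1]
    # The root path has no node name, so fall back to a placeholder (assumption).
    return (name or "root") + ".json"


def save_node(data, node_path, output_dir):
    """Write already-fetched node data to <output_dir>/<node name>.json."""
    os.makedirs(output_dir, exist_ok=True)
    file_path = os.path.join(output_dir, node_file_name(node_path))
    with open(file_path, "w", encoding="utf-8") as handle:
        json.dump(data, handle)
    return file_path


def load_node(file_path):
    """Read a saved JSON file back into Python data, ready to be restored."""
    with open(file_path, encoding="utf-8") as handle:
        return json.load(handle)
```

Fetching the data and pushing it back are left out on purpose; those steps depend on the service's size limits, which is what the rest of this bounty is about.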

The usual curl method won't work because I have nodes that exceed the Firebase/Wilddog size limit for a single curl download.

Below are a few useful links of similar projects:
https://gist.github.com/alexklibisz/3247dcba8c8d7936a0ce

https://github.com/handsome-code/firebase-backup-s3-python

http://www.seanmeadows.com/2014/03/firebase-continuous-backup/

The reason I mention Wilddog is because I’m using this service which is a clone of Firebase but they don’t offer any data backup/restore functions like Firebase, so I need a custom one.

Thanks!

awarded to iurisilvio
Tags
python


3 Solutions


Try this (use Python 3)
https://gist.github.com/jorjeb/808c900be48e478d645785d523b828f3

Backup example
python3 script.py -u <Firebase URL> -s <Firebase Secret> -e <Firebase Email> -a backup -p /users -o /output/dir

Restore example
python3 script.py -u <Firebase URL> -s <Firebase Secret> -e <Firebase Email> -a restore -p /newusers -j /output/dir/users.json

Help
python3 script.py --help

Will this be able to backup node data that is several gigabytes in size and then restore back as well?
user856 4 months ago
It will, but it will slow down or block read access for other database requests while the script is running, so run the backup/restore when no one else is accessing the database.
jorjb 4 months ago
In the -a parameter I can insert a specific node path or root path as well right?
user856 4 months ago
No, use -p for the node path. -a is for the action to do {restore, backup}
jorjb 4 months ago
Can you try modifying the script to use the Wilddog import? I replaced all Firebase mentions in the script with Wilddog but I got this error:
Traceback (most recent call last):
  File "testscript.py", line 2, in <module>
    from wilddog import wilddog, jsonutil
  File "/usr/local/lib/python3.4/dist-packages/wilddog/__init__.py", line 3, in <module>
    from .async import processpool
  File "/usr/local/lib/python3.4/dist-packages/wilddog/async.py", line 3, in <module>
    from lazy import LazyLoadProxy
ImportError: No module named 'lazy'
user856 4 months ago
Here you go: https://gist.github.com/jorjeb/da8fa1a55a9b7054868eea715b21ed54 You probably need to install lazy as well: pip3 install lazy
jorjb 4 months ago
After installing lazy and updating the script code, I got this error:
Traceback (most recent call last):
  File "testscript.py", line 2, in <module>
    from wilddog import wilddog, jsonutil
  File "/usr/local/lib/python3.4/dist-packages/wilddog/__init__.py", line 3, in <module>
    from .async import processpool
  File "/usr/local/lib/python3.4/dist-packages/wilddog/async.py", line 3, in <module>
    from lazy import LazyLoadProxy
ImportError: cannot import name 'LazyLoadProxy'
user856 4 months ago
@user856 It's a bug in the Wilddog Python client library. Please uninstall wilddog (pip3 uninstall wilddog-python) and install my fork in the meantime (pip3 install git+https://github.com/jorjeb/wilddog-python.git).
jorjb 4 months ago
I have other scripts that use the original Wilddog Python library, will your fork break the other scripts?
user856 4 months ago
It should not since I just made minor changes.
jorjb 4 months ago
I get this error when running the install command:
Cloning https://github.com/jorjeb/wilddog-python.git to /tmp/pip-a2ylin4w-build
Error [Errno 2] No such file or directory: 'git' while executing command git clone -q https://github.com/jorjeb/wilddog-python.git /tmp/pip-a2ylin4w-build
Cannot find command 'git'
user856 4 months ago
You need to download and install git from https://git-scm.com/downloads
jorjb 4 months ago
Okay so everything is installed but I get this error due to not providing a 'secret' because it's currently test data. Is it possible for the script to skip 'secret' parameter if it isn't provided? ValueError: wilddogtokengenerator.create_token: secret must be a string.
user856 4 months ago
I got this error with the updated script: 'requests.exceptions.HTTPError: 400 Client Error: Bad Request'
user856 4 months ago
can you give me a sample url and node path?
jorjb 4 months ago
Maybe it's because the root includes too many nodes with too large file sizes? Is it possible to add an option to skip certain nodes that are too large and have data that aren't that important?
user856 4 months ago
I will check
jorjb 4 months ago
Yes it's possible. For example { "node1": { "child-node1": { "key": "value" }, "child-node2": { "key": "value" } } }. You can backup child-node1 by setting the path parameter to -p /node1/child-node1.
jorjb 4 months ago
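To make the path parameter concrete: the Firebase REST API addresses a node by appending `.json` to its path, and Wilddog is described above as a clone, so the same convention is assumed here. The following is a small sketch under that assumption; `node_url` is a hypothetical helper and `example.wilddogio.com` is a placeholder host, neither taken from the script.

```python
def node_url(base_url, node_path):
    """Build the REST URL for a node, e.g. '/node1/child-node1'.

    Assumes the Firebase-style convention of appending '.json' to the path.
    """
    base = base_url.rstrip("/")
    path = node_path.strip("/")
    if not path:
        # The root node is addressed as '<base>/.json'.
        return "{}/.json".format(base)
    return "{}/{}.json".format(base, path)


print(node_url("https://example.wilddogio.com", "/node1/child-node1"))
```

Backing up only `child-node1` then means requesting that one URL instead of the root URL.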
But if I want to back up the root but skip several nodes, how can I do that? Also, were you able to back up this node using the script? I wasn't able to: https://site-eumt-exp-test.wilddogio.com/products_id
user856 4 months ago
Were you able to try the link? :)
user856 4 months ago
Yes, but it returns an error saying request is too large.
jorjb 4 months ago
I'm checking their docs to see whether it's possible to skip nodes.
jorjb 4 months ago
But the main objective of the script was so that it is able to progressively backup and restore large node data in chunks or via some other method. Did the links in the bounty description help?
user856 4 months ago
I can increase the bounty and extend it for another week but I really need the script to be able to backup large data nodes
user856 4 months ago
Yes, skipping nodes is not possible, but we can shard one large request into multiple smaller requests. https://docs.wilddog.com/api/sync/web/Query.html#limitToFirst I'm updating the script.
jorjb 4 months ago
Okay that's fine for me, I just need to be able to progressively backup all data from root and then progressively restore data back if necessary
user856 4 months ago
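For the restore direction, one way to work "progressively" is to split the saved data into batches of children and send each batch separately. This is only a sketch: it assumes the service merges children on an update-style request (as Firebase's REST `PATCH` does), and `send` is a hypothetical callable standing in for whatever actually performs the request.

```python
from itertools import islice


def chunked(mapping, size):
    """Yield successive dicts holding at most `size` children of `mapping`."""
    items = iter(mapping.items())
    while True:
        batch = dict(islice(items, size))
        if not batch:
            return
        yield batch


def restore_in_chunks(data, send, size=1000):
    """Call send(batch) once per batch; return the number of batches sent.

    `send` is a placeholder (assumption) for the real upload step.
    """
    count = 0
    for batch in chunked(data, size):
        send(batch)
        count += 1
    return count
```

With the `requests` library, `send` could be something like `lambda batch: requests.patch(url, json=batch).raise_for_status()`, provided the target really does merge rather than overwrite on `PATCH`.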
Were you able to get the sharding working? :)
user856 4 months ago
Not yet. The API is still throwing an error.
jorjb 4 months ago
@user856, please give me an index(.indexOn) that I can use to sort the results.
jorjb 4 months ago
What do you mean by index? I don't have any special settings for data in Wilddog
user856 4 months ago
Hey, I didn't get what you meant by index
user856 4 months ago
The limitToFirst parameter should be used in conjunction with the orderBy parameter. The orderBy parameter is either $value, $key, $priority, or the name of a child node. If it is a child node, that child needs to be indexed using the .indexOn setting.
jorjb 4 months ago
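As a rough illustration of how these parameters combine, here is a sketch that builds the query string for one page. It assumes the Firebase REST convention that string values for `orderBy` and `startAt` are sent as JSON strings (with their double quotes); whether Wilddog's REST endpoint behaves identically is an assumption. `page_query` is a hypothetical helper, not part of the script.

```python
import json
from urllib.parse import urlencode


def page_query(order_by="$key", start_at=None, limit=1000):
    """Return the URL query string for one page of children.

    String values are JSON-encoded first, so "$key" is sent with its quotes.
    """
    params = {"orderBy": json.dumps(order_by), "limitToFirst": limit}
    if start_at is not None:
        params["startAt"] = json.dumps(start_at)
    return urlencode(params)


print(page_query())                 # first page
print(page_query(start_at="0"))     # page starting at key "0"
```

Ordering by `$key` needs no `.indexOn` rule, which is why it is a convenient default when the data has no special settings.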
Umm but I need the whole node's values backed up, not just the first two
user856 4 months ago
But the data is correct :)
user856 4 months ago
See if this works: https://gist.github.com/jorjeb/64cb3ea70dc8e5c2f446c8c562775fa4
python3 script.py -u https://site-eumt-exp-test.wilddogio.com -p /products_id -a backup -O "\"\$key\"" -S "\"0\"" -l 1000 -o /output/dir
jorjb 4 months ago
What are the -O and -S and -l parameters? Do I need to change them?
user856 4 months ago
No need to change for /products_id. You can increase -l though.
jorjb 4 months ago
Still working on the restore function
jorjb 4 months ago
What's the -l limit for? Is it the number of nodes to back up?
user856 4 months ago
I ran the command as you suggested above but I can't see if it's running or not, are you able to add some progression output or something so that I can know the script is running?
user856 4 months ago
Also since there will probably be quite a bit of data, would there be any possibility of concurrent backup/restore to speed up the process?
user856 4 months ago
Also I ran the script on this link: https://site-eumt-exp.wilddogio.com/shipments_individual_products and it seems the downloaded json files aren't correctly formatted, can you try the script with the above link?
user856 4 months ago
It's not possible to do concurrent requests because the startAt option requires the last key of the previous query to determine where the next page of records begins. I hope that makes sense.
jorjb 4 months ago
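The dependency between pages can be seen in a sketch of the paging loop. This is not the code from the gist; `fetch_page(start_at, limit)` is a hypothetical callable that returns up to `limit` children ordered by key, starting at `start_at` inclusive. The sketch also assumes keys are non-numeric strings, so Python's string ordering matches the server's key ordering.

```python
def iter_children(fetch_page, limit=1000):
    """Yield (key, value) for every child, fetching one page at a time.

    fetch_page(start_at, limit) is an assumed interface, not a real API call.
    """
    if limit < 2:
        # startAt is inclusive, so a page of 1 would only repeat the cursor.
        raise ValueError("limit must be at least 2")
    start_at = None
    while True:
        page = fetch_page(start_at, limit)
        keys = sorted(page)
        if start_at is not None:
            # Drop the cursor key: it was already yielded by the previous page.
            keys = [key for key in keys if key != start_at]
        for key in keys:
            yield key, page[key]
        if len(page) < limit or not keys:
            return
        # The next request cannot be built until this key is known.
        start_at = keys[-1]
```

Because each request's `start_at` comes from the page before it, the requests have to run one after another within a single node.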
Okay the concurrency feature I understand, can you fix the incorrectly formatted json file issue?
user856 4 months ago
Hey any progress so far? :)
user856 4 months ago
Were you able to get the restore part working? :)
user856 4 months ago
Hi, sorry I was travelling. I'm not done yet with the restore function, but I have already updated the script to include progress messages and fixes https://gist.github.com/jorjewiz/d6c12288619e3ade440375634d4aac17
This time, you should use it like this (the trailing dot is the output directory):
python3 script.py -u https://site-eumt-exp-test.wilddogio.com -p /products_id -a backup -O \$key -S 0 -l 1000 -o .
Regarding shipments_individual_products, I tried this command and it seems to be working just fine:
python3 script.py -u https://site-eumt-exp.wilddogio.com -p /shipments_individual_products -a backup -O \$key -S 0 -l 1 -o .
If you see text like \u4f9b\u5a74, those are Chinese characters written as Unicode escape sequences. It's not an error.
jorjb 4 months ago
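The escaped form is what Python's standard `json` module writes by default, and it loads back to the same text. A short demonstration, using the two characters quoted above (how the script itself serializes data is not shown in the thread, so this only illustrates the general behaviour):

```python
import json

text = "\u4f9b\u5a74"  # the two characters from the escaped output above

escaped = json.dumps(text)                       # default: ASCII-only output
readable = json.dumps(text, ensure_ascii=False)  # keeps the characters as-is

# Both forms are valid JSON and decode to the same string.
assert json.loads(escaped) == json.loads(readable) == text
print(escaped)
```

If the backup files need to be human-readable, `ensure_ascii=False` together with a UTF-8 encoded file is one option.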
Winning solution

I created a gist with my solution, based on the wilddog service.

I first try to get a node in full. If that fails, I fetch only its keys with shallow=true and then try to get each of those keys in turn. I repeat this down the tree until every node can be fetched without shallow.

https://gist.github.com/iurisilvio/277049a6afa5fb918d5e219bfbf4f163

to backup: python backup_restore.py backup -ws https://site-eumt-exp-test.wilddogio.com your_output_file.txt
to restore: python backup_restore.py restore -ws https://site-eumt-exp-test.wilddogio.com your_input_file.txt
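The approach described above can be sketched as a short recursive function. This is an illustration of the idea, not the code in the gist: `fetch(path, shallow)` and the `RequestTooLarge` exception are hypothetical stand-ins for the real HTTP call and its failure mode, and the sketch assumes a shallow request returns a mapping whose keys are the child names.

```python
class RequestTooLarge(Exception):
    """Raised by the (hypothetical) fetch when a node is too big to download."""


def backup_node(fetch, path=""):
    """Return the data under `path`, descending only where a full fetch fails.

    fetch(path, shallow) is an assumed interface, not a real library call.
    """
    try:
        # Cheapest case: the whole node fits in one request.
        return fetch(path, shallow=False)
    except RequestTooLarge:
        # Otherwise list the child keys and back up each child separately.
        keys = fetch(path, shallow=True)
        return {
            key: backup_node(fetch, "{}/{}".format(path, key))
            for key in keys
        }
```

A single oversized leaf value would still fail in this sketch, since it has no children to descend into.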

I got a small error when I ran the command, I sent you an email with the problem
user856 4 months ago