Firebase Python Backup/Restore Script

I need a small Python script which will:

  1. Fetch data from target Firebase/Wilddog root or node path and save data to json files, using the node name as file name
  2. Restore data from saved local json file to target Firebase/Wilddog root or node path
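
The two requirements above both depend on mapping a node path to a REST URL and to a local file name. Below is a minimal sketch of that mapping. It assumes the Firebase REST convention of appending `.json` to the node path and assumes Wilddog behaves the same way; the helper names `node_url` and `backup_filename` and the `root` fallback name are made up for illustration.

```python
import posixpath


def node_url(base_url, path):
    """Build the REST URL for a node; "/" or "" means the database root.

    Assumes the Firebase-style convention of appending ".json" to the path.
    """
    base = base_url.rstrip("/")
    clean = path.strip("/")
    return f"{base}/{clean}.json" if clean else f"{base}/.json"


def backup_filename(path, root_name="root"):
    """Use the last path segment (the node name) as the file name.

    root_name is a hypothetical fallback for the root, which has no name.
    """
    clean = path.strip("/")
    name = posixpath.basename(clean) if clean else root_name
    return name + ".json"
```

For example, the path `/node1/child-node1` would be saved as `child-node1.json`.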

The usual curl method won’t work because some of my nodes exceed the Firebase/Wilddog download size limit for a single curl request.

Below are a few useful links of similar projects:
https://gist.github.com/alexklibisz/3247dcba8c8d7936a0ce

https://github.com/handsome-code/firebase-backup-s3-python

http://www.seanmeadows.com/2014/03/firebase-continuous-backup/

I mention Wilddog because that is the service I’m using. It is a clone of Firebase, but unlike Firebase it doesn’t offer any data backup/restore functions, so I need a custom one.

Thanks!

awarded to iurisilvio
Tags
python


3 Solutions


Try this (use Python 3)
https://gist.github.com/jorjeb/808c900be48e478d645785d523b828f3

Backup example
python3 script.py -u <Firebase URL> -s <Firebase Secret> -e <Firebase Email> -a backup -p /users -o /output/dir

Restore example
python3 script.py -u <Firebase URL> -s <Firebase Secret> -e <Firebase Email> -a restore -p /newusers -j /output/dir/users.json

Help
python3 script.py --help

Will this be able to back up node data that is several gigabytes in size and then restore it as well?
user0809 1 year ago
It will, but it will slow down or deny read access to other db requests while the script is running. So backup/restore should be done while no one is accessing the db.
jorjb 1 year ago
In the -a parameter I can pass either a specific node path or the root path, right?
user0809 1 year ago
No, use -p for the node path. -a is for the action to do {restore, backup}
jorjb 1 year ago
Can you try modifying the script using Wilddog import? I replaced all Firebase mentions in the script with Wilddog but I got this error:
Traceback (most recent call last):
  File "testscript.py", line 2, in <module>
    from wilddog import wilddog, jsonutil
  File "/usr/local/lib/python3.4/dist-packages/wilddog/__init__.py", line 3, in <module>
    from .async import processpool
  File "/usr/local/lib/python3.4/dist-packages/wilddog/async.py", line 3, in <module>
    from lazy import LazyLoadProxy
ImportError: No module named 'lazy'
user0809 1 year ago
Here you go: https://gist.github.com/jorjeb/da8fa1a55a9b7054868eea715b21ed54
You probably need to install lazy as well: pip3 install lazy
jorjb 1 year ago
After installing lazy and updating the script code, I got this error:
Traceback (most recent call last):
  File "testscript.py", line 2, in <module>
    from wilddog import wilddog, jsonutil
  File "/usr/local/lib/python3.4/dist-packages/wilddog/__init__.py", line 3, in <module>
    from .async import processpool
  File "/usr/local/lib/python3.4/dist-packages/wilddog/async.py", line 3, in <module>
    from lazy import LazyLoadProxy
ImportError: cannot import name 'LazyLoadProxy'
user0809 1 year ago
@user856 It's a bug in the Wilddog Python client library. Please uninstall wilddog (pip3 uninstall wilddog-python) and install my fork in the meantime (pip3 install git+https://github.com/jorjeb/wilddog-python.git).
jorjb 1 year ago
I have other scripts that use the original Wilddog Python library. Will your fork break them?
user0809 1 year ago
It should not since I just made minor changes.
jorjb 1 year ago
I get this error when running the install command:
  Cloning https://github.com/jorjeb/wilddog-python.git to /tmp/pip-a2ylin4w-build
  Error [Errno 2] No such file or directory: 'git' while executing command git clone -q https://github.com/jorjeb/wilddog-python.git /tmp/pip-a2ylin4w-build
  Cannot find command 'git'
user0809 1 year ago
You need to download and install git from https://git-scm.com/downloads
jorjb 1 year ago
Okay, so everything is installed, but I get this error because I'm not providing a 'secret' (it's currently test data): ValueError: wilddogtokengenerator.create_token: secret must be a string. Is it possible for the script to skip the 'secret' parameter if it isn't provided?
user0809 1 year ago
I got this error with the updated script: 'requests.exceptions.HTTPError: 400 Client Error: Bad Request'
user0809 1 year ago
Can you give me a sample URL and node path?
jorjb 1 year ago
Maybe it's because the root includes too many nodes that are too large? Is it possible to add an option to skip certain nodes that are too large and whose data isn't that important?
user0809 1 year ago
I will check
jorjb 1 year ago
Yes, it's possible. For example, given { "node1": { "child-node1": { "key": "value" }, "child-node2": { "key": "value" } } }, you can back up only child-node1 by setting the path parameter to -p /node1/child-node1.
jorjb 1 year ago
But if I want to back up the root but skip several nodes, how can I do that? Also, were you able to back up this node using the script? I wasn't able to: https://site-eumt-exp-test.wilddogio.com/products_id
user0809 1 year ago
Were you able to try the link? :)
user0809 1 year ago
Yes, but it returns an error saying the request is too large.
jorjb 1 year ago
I'm checking their docs to see if it's possible to skip nodes.
jorjb 1 year ago
But the main objective of the script is to progressively back up and restore large node data, in chunks or via some other method. Did the links in the bounty description help?
user0809 1 year ago
I can increase the bounty and extend it for another week, but I really need the script to be able to back up large data nodes.
user0809 1 year ago
Skipping nodes is not possible, but we can shard one large request into multiple smaller requests using limitToFirst: https://docs.wilddog.com/api/sync/web/Query.html#limitToFirst I'm updating the script.
jorjb 1 year ago
Okay, that's fine for me. I just need to be able to progressively back up all data from the root and then progressively restore it if necessary.
user0809 1 year ago
Were you able to get the sharding working? :)
user0809 1 year ago
Not yet. The API is still throwing an error.
jorjb 1 year ago
@user856, please give me an index (.indexOn) that I can use to sort the results.
jorjb 1 year ago
What do you mean by index? I don't have any special settings for data in Wilddog
user0809 1 year ago
Hey, I didn't get what you meant by index
user0809 1 year ago
The limitToFirst parameter has to be used together with the orderBy parameter. The orderBy value is either $value, $key, $priority, or the name of a child node. If it is a child node, that child needs to be an index added using the indexOn setting.
jorjb 1 year ago
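To illustrate the orderBy/limitToFirst combination described in the comment above, here is a minimal sketch that builds the query string for one page. It assumes the Firebase REST convention that string values such as "$key" are sent JSON-quoted (the same quoting the -O and -S options in this thread pass through) and assumes Wilddog accepts the same parameters; `page_query` is a hypothetical helper, not part of the posted script.

```python
import json
from urllib.parse import urlencode


def page_query(order_by="$key", start_at=None, limit=1000):
    """Return the query string for one ordered, size-limited page.

    order_by and start_at are JSON-encoded so strings arrive quoted,
    e.g. orderBy="$key"; urlencode then percent-escapes the quotes.
    """
    params = {"orderBy": json.dumps(order_by), "limitToFirst": limit}
    if start_at is not None:
        params["startAt"] = json.dumps(start_at)
    return urlencode(params)


# First page of up to 1000 children, ordered by key.
first_page = page_query()
# A later page, starting at (and including) the key "0".
later_page = page_query(start_at="0", limit=1000)
```

The resulting string would be appended after `?` to the node's `.json` URL.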
Umm, but I need all of the node's values backed up, not just the first two.
user0809 1 year ago
But the data is correct :)
user0809 1 year ago
See if this works: https://gist.github.com/jorjeb/64cb3ea70dc8e5c2f446c8c562775fa4
python3 script.py -u https://site-eumt-exp-test.wilddogio.com -p /products_id -a backup -O "\"\$key\"" -S "\"0\"" -l 1000 -o /output/dir
jorjb 1 year ago
What are the -O and -S and -l parameters? Do I need to change them?
user0809 1 year ago
No need to change for /products_id. You can increase -l though.
jorjb 1 year ago
Still working on the restore function
jorjb 1 year ago
What's the -l limit for? Is it the number of nodes to back up?
user0809 1 year ago
I ran the command as you suggested above, but I can't tell whether it's running or not. Can you add some progress output so that I know the script is running?
user0809 1 year ago
Also, since there will probably be quite a bit of data, would it be possible to run the backup/restore concurrently to speed up the process?
user0809 1 year ago
Also, I ran the script on this link: https://site-eumt-exp.wilddogio.com/shipments_individual_products and it seems the downloaded JSON files aren't correctly formatted. Can you try the script with the above link?
user0809 1 year ago
It's not possible to do concurrent requests because the startAt option needs the last key of the previous query to determine the offset of the next batch of records. I hope that makes sense.
jorjb 1 year ago
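The sequential dependency described in the comment above can be sketched as a loop in which each request starts at the last key of the previous page. This is only an illustration, not the posted script: `fetch_page` is a hypothetical callable standing in for the HTTP request, the start key is assumed to be inclusive (so consecutive pages overlap by one key), and keys are assumed to sort as plain strings.

```python
def backup_in_pages(fetch_page, limit=1000):
    """Collect a node page by page.

    fetch_page(start_at, limit) is a hypothetical callable returning a dict
    of at most `limit` children whose keys are >= start_at (None = from the
    beginning). Each call needs the last key of the previous page, which is
    why the requests cannot run concurrently.
    """
    if limit < 2:
        # With an inclusive start key, a page of 1 would repeat forever.
        raise ValueError("limit must be at least 2")
    result = {}
    start_at = None
    while True:
        page = fetch_page(start_at, limit)
        new_keys = [key for key in page if key not in result]
        result.update(page)
        if len(page) < limit or not new_keys:
            return result
        start_at = max(page)  # assumes keys compare as plain strings


# In-memory stand-in for the remote node, used only to show the loop working.
remote = {f"k{i:02d}": i for i in range(7)}
requested = []


def fake_fetch(start_at, limit):
    requested.append(start_at)
    keys = sorted(k for k in remote if start_at is None or k >= start_at)
    return {k: remote[k] for k in keys[:limit]}


copied = backup_in_pages(fake_fetch, limit=3)
```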
Okay, I understand about the concurrency. Can you fix the incorrectly formatted JSON file issue?
user0809 1 year ago
Hey any progress so far? :)
user0809 1 year ago
Were you able to get the restore part working? :)
user0809 1 year ago
Hi, sorry, I was travelling. I'm not done with the restore function yet, but I have updated the script to include progress messages and fixes: https://gist.github.com/jorjewiz/d6c12288619e3ade440375634d4aac17
This time, you should use it like this:
python3 script.py -u https://site-eumt-exp-test.wilddogio.com -p /products_id -a backup -O \$key -S 0 -l 1000 -o .
Regarding shipments_individual_products, I tried this command and it seems to be working just fine:
python3 script.py -u https://site-eumt-exp.wilddogio.com -p /shipments_individual_products -a backup -O \$key -S 0 -l 1 -o .
If you see text like \u4f9b\u5a74, those are Chinese characters written as Unicode escape sequences, which is valid JSON. It's not an error.
jorjb 1 year ago
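A small sketch of why the \uXXXX sequences mentioned above are harmless: Python's json module escapes non-ASCII characters by default, and loading the file gives the original characters back. The variable names are illustrative only.

```python
import json

# The escaped form as it would appear inside a backup file.
escaped = '"\\u4f9b\\u5a74"'

# Loading it yields the two original Chinese characters.
text = json.loads(escaped)

# Dumping with the default ensure_ascii=True reproduces the escapes.
default_dump = json.dumps(text)

# ensure_ascii=False writes the characters themselves instead; when saving
# to a file this way, open the file with encoding="utf-8".
readable_dump = json.dumps(text, ensure_ascii=False)
```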
Winning solution

I created a gist with my solution, based on the wilddog service.

I try to get a node. If that fails, I fetch only its keys with shallow=true and try to get each of those keys in turn. I repeat this recursively until every node can be fetched without shallow.
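
The idea in the paragraph above can be sketched roughly as follows. This is not the code from the gist: `fetch` is a hypothetical callable standing in for the HTTP request, `NodeTooLarge` is a made-up exception for the "request too large" failure, and the shallow response is assumed to be a dict of child keys.

```python
class NodeTooLarge(Exception):
    """Hypothetical error raised when the service refuses a full read."""


def backup_node(fetch, path=""):
    """Fetch a node whole if possible, otherwise fetch it child by child.

    fetch(path, shallow) is a hypothetical callable: with shallow=False it
    returns the full node or raises NodeTooLarge; with shallow=True it
    returns a dict whose keys are the node's child names.
    """
    try:
        return fetch(path, False)
    except NodeTooLarge:
        child_keys = fetch(path, True)
        # Recurse so that an oversized child is split up in the same way.
        return {key: backup_node(fetch, f"{path}/{key}") for key in child_keys}
```

Restoring would walk the saved structure the same way, writing smaller subtrees when a single write is rejected.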

https://gist.github.com/iurisilvio/277049a6afa5fb918d5e219bfbf4f163

to backup: python backup_restore.py backup -ws https://site-eumt-exp-test.wilddogio.com your_output_file.txt
to restore: python backup_restore.py restore -ws https://site-eumt-exp-test.wilddogio.com your_input_file.txt

I got a small error when I ran the command, I sent you an email with the problem
user0809 1 year ago