Scratchpad:Remote Sync

Remote Sync

It has been suggested several times that OpenLP should get some kind of remote synchronization feature, so that multiple installations of OpenLP can be kept in sync. The solution suggested so far has been to move the OpenLP data folder to a synchronized folder (Dropbox, Google Drive, OneDrive, etc.). While this can work, there are some problems with this approach:

  • If multiple users use OpenLP at the same time, the behavior is undefined (expect OpenLP to hang and/or crash).
  • If conflicts arise, there is no way to handle them.

This page describes one or more possible solutions, which may be implemented in time.

Implementation Design

Before any implementation can be considered, how the databases are used needs to be agreed. There are different patterns which need to be reviewed and agreed on before any code is written. At present songs can be swapped using either import/export or via services, and song matching is done based on text strings. This is not ideal, but what is?

  • Do we have a master <> slave implementation, with a single master that all updates go through?
  • Do we have a federated design where updates can go in any direction between clients?
  • How do we manage duplicates when it is valid to support duplicate songs with minor variations in text?

Drupal has moved its configuration to a migration approach where each item is tagged with a UUID when it is created. If we did this, the master and all slaves would be consistent and updates would be easy to spot. We could then move to a federated update model, as long as all the clients start out as copies of the master. Updates would then be based on the UUID: when there is a match an update is performed, otherwise an insert. What happens if two people add the same song? The first one should win, but it is unlikely that two people will add the same song at the same time. Organisations need to have processes to push out updates when new songs are added.

To support any migrations, a number of things would be needed:

  • an identifier on the song, such as a UUID
  • when it was last changed
  • a simple way to export changes since the last export
  • a simple way to import and review changes and selectively apply them.
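
The four requirements above can be sketched roughly as follows. This is only an illustration under stated assumptions, not OpenLP code: the `SongRecord` class, its fields, and the helpers `export_changes`, `plan_import` and `apply_selected` are hypothetical names, and songs are held in a plain dict keyed by UUID rather than in the real song database.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def utc_now():
    """Current time in UTC, so timestamps compare across machines."""
    return datetime.now(timezone.utc)


@dataclass
class SongRecord:
    """Hypothetical stand-in for a song row; not the real OpenLP model."""
    uuid: str
    title: str
    lyrics: str
    last_modified: datetime = field(default_factory=utc_now)


def export_changes(songs, since):
    """Return the records changed after the previous export time."""
    return [s for s in songs.values() if s.last_modified > since]


def plan_import(local, incoming):
    """Classify incoming records for review without applying anything.

    A known UUID with a newer timestamp is a candidate update,
    an unknown UUID is a candidate insert.
    """
    plan = {"insert": [], "update": [], "unchanged": []}
    for record in incoming:
        existing = local.get(record.uuid)
        if existing is None:
            plan["insert"].append(record)
        elif record.last_modified > existing.last_modified:
            plan["update"].append(record)
        else:
            plan["unchanged"].append(record)
    return plan


def apply_selected(local, records):
    """Apply only the records the user accepted during review."""
    for record in records:
        local[record.uuid] = record
```

Keeping the planning step separate from the applying step is what makes the "review and selectively apply" requirement possible: the plan can be shown to the user before anything in the local data changes.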

Solutions

So far two solutions have been proposed:

  • Synchronization using remote database accessed using REST webservices.
  • Synchronization using a synchronized folder (Dropbox, Google Drive, OneDrive, etc).

The two solutions are described in more detail below. If both (or more) solutions are implemented, it would be an advantage to implement a general "Remote Sync" plugin with multiple backends.

General sync features

Some general features are needed regardless of which solution is chosen.

Conflict handling

Conflicts could be handled in multiple ways, and we could let the user decide between multiple ways to resolve them:

  • Latest change is kept. (For webservice: If we know the current time of the server and the time of the latest remote change, and the local time and time of latest local change, then it should be easy to calculate which is newer.)
  • Manually resolve using a window similar to the one used when handling duplicated songs.
  • Local/remote always "wins".
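
The "latest change is kept" rule only works if the two clocks are reconciled before timestamps are compared. The sketch below shows one way to do that, by comparing the age of each change on its own machine's clock instead of comparing raw timestamps. The `Policy` enum and the `resolve` function are hypothetical names, network latency is ignored, and leaving a tie to the user is an assumption.

```python
from datetime import datetime
from enum import Enum


class Policy(Enum):
    """User-selectable conflict policies (hypothetical names)."""
    LATEST_WINS = "latest"
    LOCAL_WINS = "local"
    REMOTE_WINS = "remote"
    ASK_USER = "ask"


def change_age(now, changed_at):
    """Age of a change, measured on the clock of the machine that made it."""
    return now - changed_at


def resolve(policy, local_now, local_changed, server_now, remote_changed):
    """Return "local", "remote" or "ask" for a conflicting entry."""
    if policy is Policy.LOCAL_WINS:
        return "local"
    if policy is Policy.REMOTE_WINS:
        return "remote"
    if policy is Policy.ASK_USER:
        return "ask"
    # LATEST_WINS: the change with the smaller age is the newer one,
    # regardless of how far apart the two clocks are set.
    local_age = change_age(local_now, local_changed)
    remote_age = change_age(server_now, remote_changed)
    if local_age == remote_age:
        return "ask"  # Assumption: a tie is left to the user.
    return "local" if local_age < remote_age else "remote"
```

For example, if the local clock runs an hour ahead of the server, a local change stamped 12:30 (local time now 13:00) is 30 minutes old, while a remote change stamped 11:50 (server time now 12:00) is only 10 minutes old, so the remote change is the newer one even though its raw timestamp is earlier.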

Background thread to handle synchronization

When a change is made to a song, author, etc., the change is pushed to a sync queue. The background thread watches this queue and pushes each change to the "sync-backend".

The thread also looks for remote changes and merges them into the local OpenLP DB. If conflicts arise the user is asked to take action, or perhaps a pre-selected way to resolve the conflict is used.

When syncing remote entries we need to make sure that incoming and outgoing syncing handle conflicts. For example, suppose a change to an entry is in the outgoing queue waiting to be synced. We then receive an incoming sync which overwrites the local entry. The entry in the outgoing queue is now pushed to the remote, which means that the old version of the entry (which has been overwritten locally) is distributed to the other installations via the remote.
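
One possible way to avoid this race is to remember which entries have unsent local changes and to treat an incoming change to such an entry as a conflict instead of overwriting it. The sketch below illustrates the idea with an in-memory dict in place of the OpenLP DB. `SyncState`, `sync_worker` and the `push` callback are hypothetical names, and recording the conflict in a list stands in for asking the user or applying a pre-selected policy.

```python
import queue
import threading


class SyncState:
    """Hypothetical in-memory stand-in for the local DB and the sync queue."""

    def __init__(self):
        self.lock = threading.Lock()
        self.entries = {}    # uuid -> (revision, data)
        self.pending = {}    # uuid -> revision not yet pushed to the remote
        self.outgoing = queue.Queue()
        self.conflicts = []  # (uuid, remote data) awaiting resolution

    def _next_revision(self, uuid):
        return self.entries.get(uuid, (0, None))[0] + 1

    def local_change(self, uuid, data):
        """Store a local edit and queue it for the background thread."""
        with self.lock:
            revision = self._next_revision(uuid)
            self.entries[uuid] = (revision, data)
            self.pending[uuid] = revision
        self.outgoing.put(uuid)

    def incoming_change(self, uuid, data):
        """Merge a remote edit unless an unsent local edit exists."""
        with self.lock:
            if uuid in self.pending:
                # Do not overwrite: record the conflict for later handling.
                self.conflicts.append((uuid, data))
                return False
            self.entries[uuid] = (self._next_revision(uuid), data)
            return True


def sync_worker(state, push, stop):
    """Background loop: push queued local changes to the sync backend."""
    while not stop.is_set():
        try:
            uuid = state.outgoing.get(timeout=0.1)
        except queue.Empty:
            continue
        try:
            with state.lock:
                revision = state.pending.get(uuid)
                current = state.entries.get(uuid)
            if revision is None:
                continue  # Already pushed by an earlier queue item.
            push(uuid, current[1])
            with state.lock:
                # Keep the entry pending if it was edited again meanwhile.
                if state.pending.get(uuid) == revision:
                    del state.pending[uuid]
        finally:
            state.outgoing.task_done()
```

The entry stays marked as pending until the push has finished, so an incoming change that arrives while the push is in flight is still reported as a conflict and cannot silently replace the local version.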

Global entry id

Since songs, authors, etc. can be added by multiple OpenLP instances, a global id is needed to identify entries. Assigning a UUID to each entry, saved both locally and remotely, would solve this.
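
As a small illustration, Python's standard `uuid` module can generate such ids without any coordination between instances. The helper names `new_entry_id` and `ensure_id`, and the use of a dict with a "uuid" key to represent an entry, are assumptions made for this sketch.

```python
import uuid


def new_entry_id():
    """Return a random (version 4) UUID as a string for a new entry."""
    return str(uuid.uuid4())


def ensure_id(entry):
    """Give an entry an id only if it does not already have one.

    An id that arrived with a synced entry must be kept, otherwise the
    same song would be treated as a new entry on the next sync.
    """
    if "uuid" not in entry:
        entry["uuid"] = new_entry_id()
    return entry
```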

Remote storage format (songs)

When songs are stored remotely we need to choose how best to store them, so that they can be easily integrated with OpenLP. Proposals for how to do this are described below.

Mimic/mirror DB tables

In this approach the remote storage mimics or mirrors the song db structure (either using files or a real DB), making it a more or less complete copy of the most current song database available. This also means that if a change is made to the DB structure, the remote structure must be updated as well, which can be problematic if not all users update to the same OpenLP version at the same time.

The straightforward way of storing the DB remotely is in a real DB, such as MySQL or SQLite. But it could also be possible to use a file structure, perhaps in the form <path>/<table-name>/<entity-uuid>.json.
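
The file-based variant could look roughly like the sketch below, which writes one JSON file per entity under <path>/<table-name>/. The function names and the idea of representing an entity as a dict with a "uuid" key are assumptions for illustration, not an agreed format.

```python
import json
from pathlib import Path


def entity_path(root, table, entity_uuid):
    """Build <path>/<table-name>/<entity-uuid>.json."""
    return Path(root) / table / f"{entity_uuid}.json"


def write_entity(root, table, entity):
    """Write one entity (a dict with a "uuid" key) to its own file."""
    path = entity_path(root, table, entity["uuid"])
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(entity, sort_keys=True, indent=2),
                    encoding="utf-8")
    return path


def read_table(root, table):
    """Load every entity of a table, keyed by the UUID in the file name."""
    folder = Path(root) / table
    if not folder.is_dir():
        return {}
    return {p.stem: json.loads(p.read_text(encoding="utf-8"))
            for p in sorted(folder.glob("*.json"))}
```

With one file per entity, two installations that change different songs never touch the same file, which keeps conflicts limited to edits of the same entry.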

OpenLyrics files/blobs

In this approach we aim to reuse parts of the existing OpenLyrics export and import functionality to handle some of the sync process. By exporting to OpenLyrics the remote implementation can be kept simpler than when using the "Mimic/mirror DB tables" approach mentioned above. This approach is also more independent of internal song db changes in OpenLP.

Synchronization using REST webservices

The remote (server side) implementation would be accessible through a REST interface which would fairly closely resemble the DB layout of OpenLP, probably using some simple existing framework. The reference implementation of the web service will most likely be done in Python. There would probably also need to be some kind of graphical user interface to handle users (login/password).

At regular intervals the web service must be asked whether any changes have occurred. Perhaps this could be done using websockets?

We should probably have a look at U1DB, which looks to support what we need and is written in Python 2. Unfortunately it seems to be a largely abandoned project, but it is very interesting.

Django application

I wanted to try out Django's REST framework and wrote a simple app that has this basic functionality. My app currently lives here. I have also started work on a plugin for OpenLP here. Both should currently be considered proofs of concept rather than real proposals. All objects would be stored in a database mimicking OpenLP's database. The REST framework supports authentication based on authentication tokens, so different machines can be kept separate using Django's built-in user management system. Another upside is that the Django admin framework already provides a basic management interface.

Synchronization using a synchronized folder (Dropbox, Google Drive, OneDrive, etc)

Synchronization using files

This is not the same approach as placing OpenLP's data folder in a synchronized folder! Instead, files that can be used to synchronize multiple OpenLP databases are placed in a synchronized folder, and OpenLP then reads these files.

On each change to a song, the updated entry is placed in the synced folder. While a file is being written it should be locked in some way to avoid conflicts, perhaps using simple "<filename>.lock" files.
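
A "<filename>.lock" scheme could be sketched as below, using exclusive file creation so that only one writer can hold the lock at a time. The `lock_file` helper, its timeout and its polling interval are hypothetical. Note also that exclusive creation is only atomic on the local file system: the lock file reaches other machines with whatever delay the sync service has, so this protects against concurrent writers on one machine and merely reduces the risk between machines.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def lock_file(target, timeout=5.0, poll=0.05):
    """Hold "<target>.lock" while the body runs (hypothetical helper)."""
    lock_path = f"{target}.lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not lock {target}")
            time.sleep(poll)
    try:
        os.close(fd)
        yield lock_path
    finally:
        # Release the lock even if writing the entry failed.
        os.remove(lock_path)
```

A writer would wrap the actual file write in `with lock_file(path):`. A lock file left behind by a crashed process would block later writers until the timeout, so a real implementation would also need some way to detect and clear stale locks.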

A cross-platform Python library exists for getting notified about file changes: https://pypi.python.org/pypi/watchdog. It is available in the latest Ubuntu and Fedora releases.

Synchronization using a web API

Dropbox, Google Drive, etc. offer web APIs which would make it possible for us to sync OpenLP songs and other data. The APIs can be used to handle files and to easily find updated data.

Dropbox's Datastore allows for database-like storage which could be used to store songs etc. A problem with the Dropbox Datastore is that it cannot be shared among users; it is private to each user. Dropbox's Datastore will be removed by April 2016.

OwnCloud allows OwnCloud apps to get and store data from external connections, which could be used as a datastore, but we would then have to develop an OwnCloud app.

Dropbox API: https://www.dropbox.com/developers

Google Drive API: https://developers.google.com/drive/v3/web/about-sdk

OneDrive REST API: http://msdn.microsoft.com/en-us/library/dn659752.aspx

OwnCloud App documentation: https://doc.owncloud.org/server/8.2/developer_manual/app/index.html