Non-replicating (local) / replication-only transformation functions #37

stuartpb · 2016-12-01T02:01:53Z

It would make a lot of sense, especially for applications that maintain a special in-DB representation of data that should be replicated without transformation (crypto-pouch, say), if there were variants like incomingLocal, outgoingLocal, incomingReplication, and outgoingReplication, where the latter two apply only to replication and the former two apply only to non-replicating actions (get, put, etc.).

I guess this can be emulated right now by doing something like this:

const dbOriginal = new PouchDB('mydb');
const dbOnlyForLocalUse = new PouchDB(dbOriginal)
  .transform({incoming: incomingLocal, outgoing: outgoingLocal});
const dbOnlyForReplication = new PouchDB(dbOriginal)
  .transform({incoming: incomingReplication, outgoing: outgoingReplication});

but that would cause problems for, say, a general function that takes one database as an argument and performs both replication and non-replication operations on it.

Also, does new PouchDB(dbTransformHasAlreadyBeenCalledOn) copy the transformations that are on that DB? My gut says it should, but I'm not certain (and I'm not sure if the docs make any statement on the matter one way or another).


stuartpb · 2016-12-01T03:07:34Z

There's also the matter of how, with these as available functions, they'd interact if specified alongside incoming and outgoing.

One way to do it would be to make them mutually exclusive, with any use of plain incoming alongside incomingLocal and/or incomingReplication being an error (and likewise for outgoing). However, I think it'd be more convenient for models that share functionality in their document transformations if the burden of composition weren't put on the end user. Instead, the plugin could offer two more transformations, so that the complete flow of document transformation functions would look like this:

           Documents coming in
via put(),         ||          via
post(), etc.       \/          replicate.from()
              incomingPre()
               /        \
              V          V
   incomingLocal()    incomingReplication()
               \        /
                V      V
               incoming()
                   \/
            Stored in database
                   ||
               outgoing()
                /      \
               V        V
   outgoingLocal()    outgoingReplication()
                \      /
                 V    V
             outgoingFinal()
via get(),         ||          via
allDocs(),         \/          replicate.to()
etc.       Documents going out

In other words, documents would pass through (at most) three transformations before coming into or going out of the database, depending on the function:

put(), post(), etc: incomingPre, incomingLocal, incoming
replicate.from(): incomingPre, incomingReplication, incoming
get(), allDocs(), etc: outgoing, outgoingLocal, outgoingFinal
replicate.to(): outgoing, outgoingReplication, outgoingFinal
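
To make the ordering above concrete, here is a minimal sketch of how the four chains might be dispatched. Everything in it is hypothetical: `PIPELINES`, `applyPipeline`, and the operation names `localWrite`, `replicateFrom`, `localRead`, and `replicateTo` are invented for illustration, and of the hook names only incoming and outgoing exist in the plugin as discussed in this issue. It also assumes synchronous hooks for simplicity.

```javascript
// Hypothetical sketch of the proposed hook ordering; not the plugin's code.
const identity = (doc) => doc;

// Order in which the proposed hooks would run for each kind of operation.
const PIPELINES = {
  localWrite: ['incomingPre', 'incomingLocal', 'incoming'],
  replicateFrom: ['incomingPre', 'incomingReplication', 'incoming'],
  localRead: ['outgoing', 'outgoingLocal', 'outgoingFinal'],
  replicateTo: ['outgoing', 'outgoingReplication', 'outgoingFinal'],
};

// Run a document through whichever hooks were supplied.
// Hooks that were not specified are skipped (treated as identity).
function applyPipeline(kind, hooks, doc) {
  return PIPELINES[kind].reduce(
    (current, name) => (hooks[name] || identity)(current),
    doc
  );
}

// Usage: only two of the eight hooks are specified here.
const hooks = {
  incomingPre: (doc) => ({ ...doc, seen: true }),
  incomingReplication: (doc) => ({ ...doc, migrated: true }),
};
const viaPut = applyPipeline('localWrite', hooks, { _id: 'a' });
const viaReplication = applyPipeline('replicateFrom', hooks, { _id: 'a' });
```

In this sketch a document written locally only passes through incomingPre, while one arriving by replication passes through incomingPre and then incomingReplication.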

And, of course, these only apply if specified. An application might need to specify only one transformation function (e.g. transforming incoming documents only from replication, for the purposes of migration), or it might require all eight (performing a common transformation on the outside incoming document format, transforming that further when inserting and also for replication from outside, finalizing both of those source-specific changes in a common way, performing a common pre-transformation out of the database, altering that further in specific ways, and then putting one last set of common finishing touches on), or any other combination (most would probably use no more than six, but it could realistically be any six).

stuartpb · 2016-12-01T03:13:13Z

Also, something that just occurred to me: it might make sense for transformation functions to have access to data about the transaction or replication as a second parameter. For example, this would allow an outgoing replication function to strip local-application-specific fields when replicating to remote databases.

Indeed, now that I think about it, this might be the more sensible level to factor this at: a second parameter that functions could use to filter and compose their own outgoing or incoming transformations as applicable, whether that be on a local-versus-replication basis, or whatever.
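
As a rough illustration of the second-parameter idea, an outgoing function might look something like the sketch below. The `context` argument and its `replication` field are assumptions made up for this example (the plugin is described here as passing only the document), and the `local_` key prefix is just a stand-in convention for "local-application-specific fields".

```javascript
// Hypothetical: the `context` parameter does not exist in the plugin today.
function outgoing(doc, context) {
  if (!context || !context.replication) {
    return doc; // local reads (get, allDocs, ...) see the document unchanged
  }
  // When replicating out, drop fields that only matter to the local app.
  // The "local_" prefix is an assumed naming convention for this sketch.
  return Object.fromEntries(
    Object.entries(doc).filter(([key]) => !key.startsWith('local_'))
  );
}

const doc = { _id: 'a', title: 'hi', local_draft: true };
const localView = outgoing(doc, { replication: false });
const replicatedView = outgoing(doc, { replication: true });
```

With a single function like this, the local-versus-replication split becomes a branch inside the transformation rather than a separate set of hook names.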

stuartpb · 2016-12-01T04:00:15Z

#38 makes a good point on how these are fundamentally different operations, and should really be handled completely differently: an incoming or outgoing transformation from the out-of-database world may not correspond to a revision to a document, but a transformation in the process of replication absolutely does.

In that light (assuming incoming and outgoing don't do this already, which seems to be the case), it'd probably make more sense to just make incoming and outgoing keep their current behavior, but make them only apply for non-replication functions, and make it so any transformation to be applied in the context of replication would have to be specified as incomingReplication and outgoingReplication, with these each creating a new revision with _rev to represent the transformation they've performed.

stuartpb · 2016-12-01T04:14:49Z

Well, either that, or there could be a separate revise-pouch plugin that applies this kind of revising transformation whenever an object is inserted or updated locally (i.e. it commits the given version, then commits the transformed version), and creates ephemeral revisions for transformations when replicating (though I believe PouchDB's non-deterministic random-based revision identifiers mean this could introduce meaningless conflicts in the event that the same deterministic transformation is applied two different times). This could also be introduced in the form of a new option to .transform(), like "revise": true or "revise": "replication".

In any case, I think incoming and outgoing here should really not be applied silently in replication, in any scenario. Doing so breaks one of the fundamental assumptions of the CouchDB consistency model (that all changes to the document coming in will be reflected by a differing revision history).
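
On the conflict concern above: one way a revising plugin could sidestep it is to derive the new revision identifier from the parent revision and the transformed content, so that applying the same transformation twice yields the same revision. The sketch below is only an illustration under that assumption. `canonicalize` and `deterministicRev` are hypothetical helpers, and this is not a description of how PouchDB actually computes revisions.

```javascript
const crypto = require('crypto');

// Serialize with sorted keys so equal documents always produce the same string.
function canonicalize(value) {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(',')}]`;
  }
  if (value && typeof value === 'object') {
    const body = Object.keys(value)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalize(value[key])}`)
      .join(',');
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

// Hypothetical helper: builds "<generation>-<hash>" from the parent revision
// and the document body (ignoring any existing _rev field).
function deterministicRev(parentRev, doc) {
  const { _rev, ...body } = doc;
  const generation = parentRev ? parseInt(parentRev, 10) + 1 : 1;
  const hash = crypto
    .createHash('md5')
    .update(`${parentRev || ''}:${canonicalize(body)}`)
    .digest('hex');
  return `${generation}-${hash}`;
}

// The same content under the same parent gives the same revision,
// regardless of key order.
const revA = deterministicRev('1-abc', { _id: 'x', n: 1, m: 2 });
const revB = deterministicRev('1-abc', { m: 2, n: 1, _id: 'x' });
const revC = deterministicRev('1-abc', { _id: 'x', n: 1, m: 3 });
```

Here `revA` and `revB` are identical, while `revC` differs because the content differs.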

nolanlawson · 2016-12-02T10:50:30Z

I'm starting to think that, yes, it would make sense to make a separate plugin. This plugin is designed for very simple use cases, e.g. I have a CouchDB full of too-big documents and I merely want them to be smaller when I replicate them locally.

You are fully encouraged to write a revise-pouch plugin. :)

gr2m · 2016-12-03T10:37:09Z

I second Nolan’s comment, but please keep us posted on revise-pouch :)

stuartpb mentioned this issue Dec 1, 2016

Does transformation affect _rev when replicating? #38

Open

stuartpb mentioned this issue Dec 1, 2016

Make this work between two PouchDBs eHealthAfrica/pouchdb-migrate#24

Open

nolanlawson closed this as completed Dec 2, 2016
