oak-documentMk based discovery.impl / SLING-4603

oak-documentMk based discovery.impl / SLING-4603

Stefan Egli-2
Hi all,

It has come up that with discovery.impl based on oak we could make use of
oak's (mongoMk's) lease mechanism instead of sending higher level
heartbeats.

I've created SLING-4603 to track that and would appreciate some opinions
from this list. I'd be looking at providing a first implementation of this
for review.

Cheers,
Stefan
--
https://issues.apache.org/jira/browse/SLING-4603
linked to this one: https://issues.apache.org/jira/browse/SLING-2939



Re: oak-documentMk based discovery.impl / SLING-4603

Tommaso Teofili-2
well, that's an interesting approach, I'm curious to see how it'll work :)

Tommaso

2015-04-09 17:29 GMT+02:00 Stefan Egli <[hidden email]>:

> Hi all,
>
> It has come up that with discovery.impl based on oak we could make use of
> oak's (mongoMk's) lease mechanism instead of sending higher level
> heartbeats.
>
> I've created SLING-4603 to track that and would appreciate some opinions
> from this list. I'd be looking at providing a first implementation of this
> for review.
>
> Cheers,
> Stefan
> --
> https://issues.apache.org/jira/browse/SLING-4603
> linked to this one: https://issues.apache.org/jira/browse/SLING-2939
>
>
>

Re: oak-documentMk based discovery.impl / SLING-4603

Robert Munteanu-2
In reply to this post by Stefan Egli-2
Hi Stefan,

On Thu, 2015-04-09 at 17:29 +0200, Stefan Egli wrote:
> Hi all,
>
> It has come up that with discovery.impl based on oak we could make use of
> oak's (mongoMk's) lease mechanism instead of sending higher level
> heartbeats.
>
> I've created SLING-4603 to track that and would appreciate some opinions
> from this list. I'd be looking at providing a first implementation of this
> for review.

Sounds interesting, even though it doesn't apply to all supported
configurations.

One question though - you mentioned mongo mk. I assume that this would
work with all DocumentNodeStore-based implementations, e.g. also with
the rdb mk. Is that assumption correct or do you plan a Mongo-only
implementation?

Thanks,

Robert


Re: oak-documentMk based discovery.impl / SLING-4603

Felix Meschberger-3
In reply to this post by Stefan Egli-2
Hi

While this really sounds interesting at first glance, it *is* problematic.

Sure, it reuses existing functionality which has already been done, implemented, tested. That’s all great.

But then it has several nasty drawbacks:

  * It depends on a specific implementation detail of a specific
    Oak MK/NodeStore implementation. This implementation may
    change at any time
  * This feature seems to be accessed through JMX which exposes
    an admin interface which is not guaranteed to be stable
    for regular programmatic use
  * It limits the topology to the Oak cluster members
  * This looks like hacking around a problem in Oak leveraging
    other parts of Oak which seem to have issues in themselves …

All in all, I doubt whether the energy we put into this really is worth it given there are valid other solutions around which are sound, stable, and proven such as etcd, zookeeper.

Maybe we should stick with the current discovery.impl as being good enough and instead concentrate on building a new discovery implementation based on said proven technology. For demo and ease-of-use purposes the current discovery.impl is probably sufficient. For real world uses an etcd or zookeeper or whatever based solution may be more promising IMHO.

Sorry to sound discouraging, but I am not convinced of the approach.

Just my two cents

Regards
Felix
 

> On 09.04.2015 at 17:29, Stefan Egli <[hidden email]> wrote:
>
> Hi all,
>
> It has come up that with discovery.impl based on oak we could make use of
> oak's (mongoMk's) lease mechanism instead of sending higher level
> heartbeats.
>
> I've created SLING-4603 to track that and would appreciate some opinions
> from this list. I'd be looking at providing a first implementation of this
> for review.
>
> Cheers,
> Stefan
> --
> https://issues.apache.org/jira/browse/SLING-4603
> linked to this one: https://issues.apache.org/jira/browse/SLING-2939
>
>


Re: oak-documentMk based discovery.impl / SLING-4603

Stefan Egli-2
In reply to this post by Robert Munteanu-2
On 4/10/15 9:56 AM, "Robert Munteanu" <[hidden email]> wrote:

>...I assume that this would
>work with all DocumentNodeStore-based implementations, e.g. also with
>the rdb mk. Is that assumption correct or do you plan a Mongo-only
>implementation?

Yes that was the idea, to base it on DocumentNodeStore.

Cheers,
Stefan



Re: oak-documentMk based discovery.impl / SLING-4603

Stefan Egli-2
In reply to this post by Felix Meschberger-3
Hi Felix,

On 4/10/15 10:53 AM, "Felix Meschberger" <[hidden email]> wrote:

>  * It depends on a specific implementation detail of a specific
>    Oak MK/NodeStore implementation. This implementation may
>    change at any time
>  * This feature seems to be accessed through JMX which exposes
>    an admin interface which is not guaranteed to be stable
>    for regular programmatic use

Agreed. (except it's a trade-off with the advantage gained, as pointed out
below)

>  * It limits the topology to the Oak cluster members

Not exactly. The idea was to use (embed) most of the functionality of
discovery.impl - ie reuse the topology connectors et al. So cross-cluster
would work exactly the same as with discovery.impl.

>  * This looks like hacking around a problem in Oak leveraging
>    other parts of Oak which seem to have issues in themselves …

Or, stated slightly differently: Oak's document-based clustering comes
with an eventual consistency model. By design this incorporates a certain,
undefined delay before writes from one node become visible to the
others. In such a model it is unclear what the largest delay will be under
all circumstances - and thus what a proper heartbeat timeout should
be set to. By making use of these ActiveClusterNodes, this
'eventual consistency' (i.e. its delay) can be completely avoided and thus
the algorithm becomes much more deterministic.
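
As a rough illustration of why the timeout is hard to choose, here is a minimal sketch. The class name, method name, and all numbers are hypothetical and are not taken from discovery.impl or Oak. It assumes a peer that writes a heartbeat every `interval` and a constant `visibilityDelay` before each write becomes visible to an observer; in reality the delay is not constant, which only makes things worse.

```java
import java.time.Duration;

/** Hypothetical illustration only; not part of discovery.impl or Oak. */
final class HeartbeatCheck {

    /**
     * Returns true if an observer never sees a healthy peer exceed the timeout.
     * Just before the next heartbeat becomes visible, the newest heartbeat the
     * observer can see is (interval + visibilityDelay) old.
     */
    static boolean seenAsAlive(Duration interval, Duration visibilityDelay, Duration timeout) {
        Duration worstCaseAge = interval.plus(visibilityDelay);
        return worstCaseAge.compareTo(timeout) <= 0;
    }

    public static void main(String[] args) {
        // Assumed example values: heartbeat every 15s, timeout after 20s.
        Duration interval = Duration.ofSeconds(15);
        Duration timeout = Duration.ofSeconds(20);
        // With a 2s visibility delay the healthy peer stays within the timeout.
        System.out.println(seenAsAlive(interval, Duration.ofSeconds(2), timeout));  // true
        // With a 10s visibility delay the same healthy peer is wrongly declared dead.
        System.out.println(seenAsAlive(interval, Duration.ofSeconds(10), timeout)); // false
    }
}
```

Since the delay has no guaranteed upper bound, no fixed timeout makes the second case impossible.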

PS: Discussed this offline today with Carsten/MichaelM: we should in any
case finally implement a fix for long-standing SLING-3432 - this should be
a big improvement to discovery.impl - and it would apply to any
discovery.* implementation. I've added a comment to SLING-3432.

PPS: I'll create another follow-up ticket for discovery which will be
about 'proper synchronizing between sending of topology_changed event and
the fact that the underlying repository is eventually consistent'. This
currently is automatically handled in discovery.impl (as it is based on
the repository and thus incorporates this "eventual-ness") - but any other
discovery implementation (eg etcd/zookeeper/documentnodestore-based) that
circumvents the repository must watch out for this.

Cheers,
Stefan

>All in all, I doubt whether the energy we put into this really is worth
>it given there are valid other solutions around which are sound, stable,
>and proven such as etcd, zookeeper.
>
>Maybe we should stick with the current discovery.impl as being good
>enough and instead concentrate on building a new discovery implementation
>based on said proven technology. For demo and ease-of-use purposes the
>current discovery.impl is probably sufficient. For real world uses an etcd
>or zookeeper or whatever based solution may be more promising IMHO.
>
>Sorry to sound discouraging, but I am not convinced of the approach.
>
>Just my two cents
>
>Regards
>Felix
>  
>
>> On 09.04.2015 at 17:29, Stefan Egli <[hidden email]> wrote:
>>
>> Hi all,
>>
>> It has come up that with discovery.impl based on oak we could make use of
>> oak's (mongoMk's) lease mechanism instead of sending higher level
>> heartbeats.
>>
>> I've created SLING-4603 to track that and would appreciate some opinions
>> from this list. I'd be looking at providing a first implementation of this
>> for review.
>>
>> Cheers,
>> Stefan
>> --
>> https://issues.apache.org/jira/browse/SLING-4603
>> linked to this one: https://issues.apache.org/jira/browse/SLING-2939
>>
>>
>



Re: oak-documentMk based discovery.impl / SLING-4603

Stefan Egli-2
On 4/13/15 5:31 PM, "Stefan Egli" <[hidden email]> wrote:

>PPS: I'll create another follow-up ticket for discovery which will be
>about 'proper synchronizing between sending of topology_changed event and
>the fact that the underlying repository is eventually consistent'. This
>currently is automatically handled in discovery.impl (as it is based on
>the repository and thus incorporates this "eventual-ness") - but any other
>discovery implementation (eg etcd/zookeeper/documentnodestore-based) that
>circumvents the repository must watch out for this.

Created SLING-4627 for this

Cheers,
Stefan



Re: oak-documentMk based discovery.impl / SLING-4603

Stefan Egli-2
In reply to this post by Stefan Egli-2
Hi Felix,

On 4/13/15 5:31 PM, "Stefan Egli" <[hidden email]> wrote:

>Hi Felix,
>
>On 4/10/15 10:53 AM, "Felix Meschberger" <[hidden email]> wrote:
>
>>  * It depends on a specific implementation detail of a specific
>>    Oak MK/NodeStore implementation. This implementation may
>>    change at any time
>>  * This feature seems to be accessed through JMX which exposes
>>    an admin interface which is not guaranteed to be stable
>>    for regular programmatic use
>
>Agreed. (except it's a trade-off with the advantage gained, as pointed out
>below)

After some more brainstorming, I think we should go back to the
suggestions that were floated around originally by MichaelM and again by
Chetan (my original proposal is indeed brittle):

Let oak-mongoMk expose a mongo connection (so as to make sure we're
reusing an existing connection and avoid any (authentication)
configuration on the discovery layer). discovery.mongo would then store
heartbeats directly, raw, in a separate mongo collection - bypassing and
independent of any oak code. This collection would hold heartbeats and
establishedViews; properties and announcements can remain in JCR. Thus
this would be completely separated - the only remaining link is the
exposed mongo connection.

This in my view would address the first two concerns.
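
To sketch the shape of this, here is a minimal in-memory stand-in for such a heartbeat collection. All names are hypothetical, and the map only takes the place of the separate mongo collection; no Oak or MongoDB API is shown or implied. The point is that the discovery side needs nothing beyond "write my heartbeat" and "list instances with a recent heartbeat".

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the separate heartbeat collection. */
final class HeartbeatCollection {

    // instanceId -> time of the last heartbeat, in milliseconds
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();

    /** Records (or overwrites) the heartbeat of one instance. */
    void heartbeat(String instanceId, long nowMillis) {
        lastHeartbeat.put(instanceId, nowMillis);
    }

    /** Returns the sorted ids of all instances whose last heartbeat is within the timeout. */
    Set<String> activeInstances(long nowMillis, long timeoutMillis) {
        Set<String> active = new TreeSet<>();
        for (Map.Entry<String, Long> entry : lastHeartbeat.entrySet()) {
            if (nowMillis - entry.getValue() <= timeoutMillis) {
                active.add(entry.getKey());
            }
        }
        return active;
    }

    public static void main(String[] args) {
        HeartbeatCollection collection = new HeartbeatCollection();
        collection.heartbeat("a", 0L);
        collection.heartbeat("b", 10_000L);
        // At t=25s with a 20s timeout, only "b" has a recent enough heartbeat.
        System.out.println(collection.activeInstances(25_000L, 20_000L)); // [b]
    }
}
```

In the real thing the map would be replaced by reads and writes on the dedicated collection through the connection exposed by oak-mongoMk.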

>>  * It limits the topology to the Oak cluster members
>
>Not exactly. The idea was to use (embed) most of the functionality of
>discovery.impl - ie reuse the topology connectors et al. So cross-cluster
>would work exactly the same as with discovery.impl.
>
>>  * This looks like hacking around a problem in Oak leveraging
>>    other parts of Oak which seem to have issues in themselves …
>
>Or, stated slightly differently: Oak's document-based clustering comes
>with an eventual consistency model. This by design incorporates a certain,
>undefined delay between when writes from one node become visible to
>others. In such a model it is unclear what, under any circumstances, the
>largest delay will be - and thus, what a proper heartbeat timeout should
>be configured to. So by making use of these ActiveClusterNodes, this
>'eventual consistency' (ie its delay) can be completely avoided and thus
>the algorithm becomes much more deterministic.

As stated, I believe mongoMk being eventually consistent is the problem
and a maximum delay cannot be guaranteed. Hence in my view it is more than
a hack. Even if current delays are enlarged due to an oak bug, the
eventual consistency will always be there. So the risk of configuring a
heartbeat timeout that is too low will also always be there. (Or in other
words: due to eventual consistency delays you can hardly detect node
crashes in a timely manner.)


>PS: Discussed this offline today with Carsten/MichaelM: we should in any
>case finally implement a fix for long-standing SLING-3432 - this should be
>a big improvement to discovery.impl - and it would apply to any
>discovery.* implementation. I've added a comment to SLING-3432.

This one is under discussion, specifically what to do with the isolated
mode (which in my view is not part of the API).

>PPS: I'll create another follow-up ticket for discovery which will be
>about 'proper synchronizing between sending of topology_changed event and
>the fact that the underlying repository is eventually consistent'. This
>currently is automatically handled in discovery.impl (as it is based on
>the repository and thus incorporates this "eventual-ness") - but any other
>discovery implementation (eg etcd/zookeeper/documentnodestore-based) that
>circumvents the repository must watch out for this.

For this one (SLING-4627) it is also yet to be discussed whether it should
be handled in discovery or in the code of the actual users of discovery.

>>All in all, I doubt whether the energy we put into this really is worth
>>it given there are valid other solutions around which are sound, stable,
>>and proven such as etcd, zookeeper.
>>Maybe we should stick with the current discovery.impl as being good
>>enough and instead concentrate on building a new discovery implementation
>>based on said proven technology. For demo and ease-of-use purposes the
>>current discovery.impl is probably sufficient. For real world uses an etcd
>>or zookeeper or whatever based solution may be more promising IMHO.
>>
>>Sorry to sound discouraging, but I am not convinced of the approach.

What about the new approach?

Cheers,
Stefan