Friday, 2 December 2016

Considering removing Mongo from Keycloak

We are considering removing Mongo support from Keycloak in 3.x. The reason is that there are a fair few issues in the current implementation, especially around consistency: Mongo lacks transaction support, and we often update multiple documents in a single operation. In many cases we rely on transactions rolling back to prevent partial updates, but this obviously doesn't work in Mongo.
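To make the partial-update problem concrete, here is a minimal sketch. It is not Keycloak's actual model or storage code: the role/user layout, the function names and the `SimulatedCrash` exception are all hypothetical, a plain dictionary stands in for a document store without transactions, and Python's built-in `sqlite3` stands in for a relational database.

```python
import sqlite3


class SimulatedCrash(Exception):
    """Hypothetical stand-in for any failure between two writes."""


def make_document_store():
    # A plain dict playing the part of two separate documents (illustrative only).
    return {
        "role": {"name": "admin"},
        "user": {"name": "alice", "roles": ["admin"]},
    }


def rename_role_documents(docs, old, new, crash=False):
    # Two independent writes with nothing tying them together.
    docs["role"]["name"] = new
    if crash:
        raise SimulatedCrash("failed between the two writes")
    docs["user"]["roles"] = [new if r == old else r for r in docs["user"]["roles"]]


def make_sql_store():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE role (name TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE user_role (username TEXT, role TEXT)")
    conn.execute("INSERT INTO role VALUES ('admin')")
    conn.execute("INSERT INTO user_role VALUES ('alice', 'admin')")
    conn.commit()
    return conn


def rename_role_sql(conn, old, new, crash=False):
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute("UPDATE role SET name = ? WHERE name = ?", (new, old))
        if crash:
            raise SimulatedCrash("failed between the two writes")
        conn.execute("UPDATE user_role SET role = ? WHERE role = ?", (new, old))


if __name__ == "__main__":
    docs = make_document_store()
    try:
        rename_role_documents(docs, "admin", "manager", crash=True)
    except SimulatedCrash:
        pass
    print("documents after crash:", docs)

    conn = make_sql_store()
    try:
        rename_role_sql(conn, "admin", "manager", crash=True)
    except SimulatedCrash:
        pass
    print("role table after crash:", conn.execute("SELECT name FROM role").fetchall())
```

In the document version, the role has been renamed but the user still references the old name. In the SQL version, the rollback leaves both tables exactly as they were before the failed update.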

Given that the Mongo implementation is already partially broken and needs constant maintenance, we're considering removing it and focusing purely on the relational database back-end.

Another point to make is that we are not considering supporting Mongo in the supported version of Keycloak (Red Hat Single Sign-On). So we will never be able to provide the same level of care and attention to it as we can for relational databases.

If we do decide to remove it we would make sure we provide a seamless and easy option to migrate from Mongo to a relational database!

I would like to gather some feedback from the community before doing anything. So please vote on the following Doodle:

http://doodle.com/poll/nnimebpkx774ppus

Also, comments on this post are more than welcome!

I'll end with a comment: time spent by core developers on maintaining Mongo could be better spent on awesome new features, testing and bug fixing!

18 comments:

  1. How do you expect a proper HA setup that 'just works' with an RDBMS? That's really a pretty bad move.

    Replies
    1. MySQL, PostgreSQL and Oracle all have full replication support, both active and passive replication. Sure, it may not be quite as simple as Mongo, but at the end of the day we have to support relational databases as that is what enterprises use and have experience with. We simply don't have the capacity to maintain both, especially once you start considering cross-datacenter support as well. The problem with Mongo is that there are so many cases where the lack of transactions can cause serious inconsistencies in the data. Sure, you can design around this if you're extremely careful, but that requires a significant amount of effort that we simply can't spare.

    2. None of these have something like automatic leader election. Therefore, if one of our zones goes into maintenance, we now have to change our setup by hand. This is pretty inconvenient.

      Even just leaving it in there unsupported would be helpful.

      Oh and I definitely would consider us an enterprise and we do use Mongo (for some things) :)

    3. I meant to write "most enterprises" :)

      Leaving it in there unsupported is not an option as it won't take many releases before it's not working as we regularly change and add to the model.

      Is it really that big of a deal to reconfigure if the master node goes away? Surely it can be scripted and automated as well?

    4. Different DBs have different features. There is no single ideal DB. Mongo has its own tradeoffs, one of which is poor consistency compared to relational DBs. And for important configuration data (like your user accounts) consistency is extremely important. Since reads of configuration data occur much more often than writes, it is trivial to build an HA setup with any popular relational DB.

    5. An AWS availability zone going down is not the only possible problem. There are many other situations in which Mongo will fall back to eventual consistency, which by itself may present a serious problem. HA for relational databases is a problem that was solved many years ago, and it is not hard at all. You can use a managed service (run by a cloud provider, where you get HA with a few mouse clicks), you can use mysqlfailover, or you can use other options which are proven and work.

    6. In the case of planned maintenance it's not that big of a deal. But in the case of an outage it is.

      Currently we have to do next to nothing when a whole zone goes down - everything recovers on its own. Keycloak would be the exception.

  2. Are you aware of the work by the Narayana team to bring non-ACID transactional support to MongoDB? They handle transactional updates to multiple documents within the scope of a (compensating) transaction. Most importantly, the protocol used is recoverable, so you get an atomic outcome even in the presence of failure.

    See: http://jbossts.blogspot.co.uk/2014/05/bringing-transactional-guarantees-to.html

  3. I absolutely support the idea to remove Mongo.

    I work as an architect in an outsourcing software company with 4000+ employees and I see a lot of different projects. From what I see, in most cases where Mongo is used it stores and processes very small volumes of data (up to a few gigabytes), and in most cases people don't use any HA with it. Some people believe in its ability to scale and provide HA. However, people who really need HA and scalability do not just believe blog posts but benchmark different DBs, including Mongo. What we most often see is that real performance is very far from what people imagine from reading blog posts, and reliability is very poor. Recently one of our customers found that they lose records simply while loading data from their existing DB into Mongo. This was a simple data load procedure without any complications such as node failures or high concurrency. After loading a few tens of millions of records, they found that the total number of records loaded was smaller than expected. The number of lost records is quite small as a percentage of the total loaded, but this is absolutely not acceptable if these records are your user accounts or any other important data.

    For people who are still attached to Mongo I recommend reading the Google Spanner paper. They are industry leaders and they started the wave of NoSQL popularity. Looking at their lessons learned is very beneficial to companies that are much less experienced in this area.

  4. Have you considered deprecating the postgres side instead?

    Replies
    1. No, not a chance. The supported version of Keycloak only has relational database support and we can't remove that. Nor would we want to, because that actually works properly with regards to transactions and consistency.

  5. So far we have had:
    - infinispan cluster
    - mongo db cluster

    Both app and keycloak use the mongo cluster. After the postgres move:
    - infinispan cluster
    - mongodb cluster
    - postgres cluster

    That alone will cost weeks of work to get going with backup management, monitoring and all the nitty-gritty.

    And here I thought that "the enterprises" were shifting towards NoSQL solutions for their average business level applications.

    But in the end, I do agree that keeping up with both worlds is not feasible for the project. Wish there was a middle ground solution here.

  6. Stefan Schreiber, 9 December 2016 at 01:46

    We don't appreciate this kind of development direction either.
    I'm not sticking with Mongo, but our hope was that Keycloak would go the way of NoSQL. Our whole enterprise architecture is based on NoSQL. I won't judge if this is good or not. At least I could argue that we are using Mongo. Switching to an RDBMS only means for me that I've chosen the wrong product.

    Replies
    1. So you don't use Mongo for other things in your organization, but since it has the label "NoSQL" you'd rather use that over an RDBMS. Seems like very arbitrary reasoning to me.

    2. Frankly speaking,
      >Switching to an RDBMS only means for me that I've chosen the wrong product.
      is not a very serious, thought-through, polite, or respectful comment.

      Sounds like you just think that "NoSQL is cool" and that's it.

  7. Stian, have you looked into possibly using Hibernate OGM as a way to continue supporting MongoDB? It doesn't solve the transaction issues but it might be a middle ground.

    From our perspective it would be painful if MongoDB was removed. Currently everything we build around Keycloak is using MongoDB. So if it was removed we would either need to maintain (and pay for) a second multi-region HA setup or move everything we built around it over to whatever we use for Keycloak.

    It could be done, but it wouldn't be trivial to do either.

  8. To pile on a bit and add onto what Dane suggested (disclaimer: Dane and I work together :)...

    First of all, thanks for floating this proposal so we have a chance to weigh in, but I too vote against dropping Mongo support.

    I realize it would be a bit of code pain to reorient the entire persistence model to use JPA throughout, but it seems like it might be a good investment and might address everyone's concerns. Keycloak would benefit by needing to support only JPA, and then deployers/integrators of Keycloak like us could choose the appropriate database platform for our use case. From my perspective, moving to a SQL-based system with clustering, HA, and cross-region replication would be a significant amount of pain (operationally and financially).

    If you take the path of using JPA everywhere, it seems that would help keep you out of the business of having to deal with multiple database paradigms, and lets us integrators optimize around what works best for our environment.

    Replies
    1. I appreciate that it is extra overhead to maintain a relational database if you are already maintaining Mongo. I also fully appreciate the cost and time involved in doing the switch. However, please appreciate the fact that Mongo is costing us to maintain and we simply don't have the manpower to do everything. There are two reasons we're considering dropping Mongo. Firstly, the current implementation is not made with love and is broken (especially around potentially bad state if an update goes wrong). Secondly, we don't have the time to improve the current implementation to get it to a proper state. We don't even have the time to maintain what we have at the moment, so maintaining Mongo is preventing us from focusing on other important features.

