ZooKeeper - protecting content

Please note that the features described here have not yet been committed. Follow progress in SOLR-4580.

Before reading further, you should read and understand the section on ZooKeeper access control using ACLs in the ZooKeeper Programmer's Guide.

Motivation

A SolrCloud system uses ZooKeeper for shared information and for coordination.

Changing some of the Solr-related content in ZooKeeper might damage the SolrCloud cluster

  • Changing configuration might make the cluster stop working or behave in an unintended way
  • Changing "clusterstate" into something wrong or inconsistent might very well make the SolrCloud cluster behave strangely
  • Adding a delete-collection job to be carried out by the Overseer will cause data to be deleted from the cluster
  • etc

If you are paranoid "enough" you will want to prevent those bad things from happening, especially if you give access to your ZooKeeper ensemble to entities you do not trust. It might be worth a thought anyway, because the bad things might be performed by

  • Malware that found its way into your system
  • Other systems using the same ZooKeeper ensemble (the "bad thing" might be done by accident)
  • etc.

You might even want to limit read-access, if you think there is stuff in ZooKeeper that not everyone should know about. Or you might simply work on a need-to-know basis in general.

Protecting ZooKeeper itself could be about a lot of things. This page is about protecting Solr-content in ZooKeeper. ZooKeeper content is persisted on disk and (partly) held in memory by the ZooKeeper-processes. This page is not about protecting ZooKeeper data at storage or ZooKeeper-process levels - that's for ZooKeeper to deal with - this is a Solr-related page.

But this content is also available to "the outside" via the ZooKeeper API. Outside processes can connect to ZooKeeper and create/update/delete/read content - a Solr-node in a SolrCloud cluster wants to create/update/delete/read, and a SolrJ client to a SolrCloud cluster wants to read. It is up to the outside processes that create/update content to set up ACLs on the content. ACLs describe who is allowed to read, update, delete, create, etc. Each piece of information (znode/content) in ZooKeeper has its own set of ACLs, and inheritance or sharing is not possible. By default Solr adds one ACL to all the content it creates - one ACL that gives anyone permission to do anything (in ZooKeeper terms called "the open-unsafe ACL"). This page is about being able to tell Solr to add more restrictive ACLs to the ZooKeeper content it creates, and being able to tell Solr about the credentials it needs to use in order to access the content in ZooKeeper. You will have to "activate" it - the default Solr behavior is still the open-unsafe ACL everywhere and no credentials used.

How it works

So what we want to do is to

  1. Be able to control the credentials Solr uses for its ZooKeeper connections. The credentials are used to get permissions to perform operations in ZooKeeper
  2. Be able to control which ACLs Solr will add to znodes (ZooKeeper files/folders) it creates in ZooKeeper
  3. Be able to control it "from the outside", so that you do not have to modify and/or recompile Solr code to turn this on

Solr-nodes, Solr-clients and Solr-tools (e.g. ZkCLI) always use a Java class called SolrZkClient to deal with their ZooKeeper stuff. The implementation of the solution is all about changing SolrZkClient. If you "stole" SolrZkClient and use it in your application, the descriptions below will be true for your application too.

Controlling credentials

You control which credentials will be used by making a JVM-parameter named "defaultZkCredentialsProvider" point to a class (on the classpath) implementing the following interface

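package org.apache.solr.common.cloud;

import java.util.Collection;

public interface ZkCredentialsProvider {
  
  // One set of credentials: an authentication scheme (e.g. "digest") and its auth data
  public class ZkCredentials {
    String scheme;
    byte[] auth;
    
    public ZkCredentials(String scheme, byte[] auth) {
      super();
      this.scheme = scheme;
      this.auth = auth;
    }
    
    String getScheme() {
      return scheme;
    }
    
    byte[] getAuth() {
      return auth;
    }
  }
  
  // The credentials SolrZkClient should use for its ZooKeeper connection
  Collection<ZkCredentials> getCredentials();

}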

When SolrZkClient has to decide which credentials to use, it calls the getCredentials method (once per SolrZkClient) of the implementation that the JVM-parameter "defaultZkCredentialsProvider" points to. If the JVM-parameter has not been set, or it does not point to a valid class on the classpath implementing the interface, the default implementation is used.
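The lookup described above can be sketched roughly as follows. This is not the actual SolrZkClient code. The CredentialsProvider interface is a simplified stand-in for ZkCredentialsProvider, and the names NoCredentials, ProviderLookupSketch and resolve are hypothetical and only used for illustration. The sketch assumes that the JVM-parameter is read as a system property and that the provider class has a no-argument constructor.

```java
import java.util.Collection;
import java.util.Collections;

// Stand-in for ZkCredentialsProvider; each credential is reduced to a String here.
interface CredentialsProvider {
  Collection<String> getCredentials();
}

// Stand-in for the default implementation: no credentials at all.
class NoCredentials implements CredentialsProvider {
  public Collection<String> getCredentials() {
    return Collections.emptyList();
  }
}

class ProviderLookupSketch {
  static final String PARAM = "defaultZkCredentialsProvider";

  // Returns the provider named by the JVM-parameter, or the default one if the
  // parameter is unset or does not name a usable implementation.
  static CredentialsProvider resolve() {
    String className = System.getProperty(PARAM);
    if (className != null && !className.isEmpty()) {
      try {
        Class<?> clazz = Class.forName(className);
        if (CredentialsProvider.class.isAssignableFrom(clazz)) {
          return (CredentialsProvider) clazz.getDeclaredConstructor().newInstance();
        }
      } catch (ReflectiveOperationException e) {
        // Class not found or not instantiable: fall through to the default.
      }
    }
    return new NoCredentials();
  }

  public static void main(String[] args) {
    System.out.println(resolve().getClass().getName());
  }
}
```

Falling back silently keeps a misconfigured node running, but it also means that a typo in the class name results in no credentials being used, so it might be worth logging the fallback in a real implementation.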

Out of the box implementations

You can always make your own implementation, but out-of-the-box with Solr comes

org.apache.solr.common.cloud.DefaultZkCredentialsProvider

Its getCredentials returns a list of length zero - no credentials used. This is the default and is used if you do not set JVM-parameter "defaultZkCredentialsProvider" properly

org.apache.solr.common.cloud.VMParamsSingleSetCredentialsDigestZkCredentialsProvider

This one lets you define your credentials using JVM-parameters. It supports (up to) one set of credentials

  • The scheme is "digest". The username and password are defined by JVM-parameters "zkDigestUsername" and "zkDigestPassword" respectively. Unless both username and password are provided, this set of credentials will not be added to the list of credentials returned by getCredentials
  • If the one set of credentials above is not added to the list, it will fall back to default behavior and use the credentials-list of DefaultZkCredentialsProvider
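The behavior described in the two bullets above might look roughly like this. It is a sketch and not the actual provider code. The Credentials class is a stand-in for ZkCredentialsProvider.ZkCredentials, DigestCredentialsSketch is a hypothetical name, and the sketch assumes that the JVM-parameters are read as system properties and that the auth data for the "digest" scheme has the form "username:password".

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collection;

// Stand-in for ZkCredentialsProvider.ZkCredentials.
class Credentials {
  final String scheme;
  final byte[] auth;

  Credentials(String scheme, byte[] auth) {
    this.scheme = scheme;
    this.auth = auth;
  }
}

class DigestCredentialsSketch {
  // Returns at most one set of credentials, built from the two JVM-parameters.
  static Collection<Credentials> getCredentials() {
    Collection<Credentials> result = new ArrayList<>();
    String username = System.getProperty("zkDigestUsername");
    String password = System.getProperty("zkDigestPassword");
    boolean bothGiven = username != null && !username.isEmpty()
        && password != null && !password.isEmpty();
    if (bothGiven) {
      // Assumption: the digest scheme takes "username:password" as auth bytes.
      byte[] auth = (username + ":" + password).getBytes(StandardCharsets.UTF_8);
      result.add(new Credentials("digest", auth));
    }
    // An empty list matches what DefaultZkCredentialsProvider returns.
    return result;
  }

  public static void main(String[] args) {
    System.out.println(getCredentials().size() + " set(s) of credentials");
  }
}
```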

Controlling ACLs

You control which ACLs will be added by making a JVM-parameter named "defaultZkACLProvider" point to a class (on the classpath) implementing the following interface

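package org.apache.solr.common.cloud;

import java.util.List;

import org.apache.zookeeper.data.ACL;

public interface ZkACLProvider {
  
  // The ACLs to put on the znode about to be created at zNodePath
  List<ACL> getACLsToAdd(String zNodePath);

}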

When SolrZkClient wants to create a new znode, it calls the getACLsToAdd method (with the path to the znode as parameter) to find out which ACLs to put on it. It calls the implementation that the JVM-parameter "defaultZkACLProvider" points to. If the JVM-parameter has not been set, or it does not point to a valid class on the classpath implementing the interface, the default implementation is used.
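Because getACLsToAdd receives the path of the znode, an implementation of your own could hand out different ACLs for different parts of the tree. The sketch below illustrates the idea only. The AclEntry class is a stand-in for ZooKeeper's ACL type, and PathAwareAclSketch, the "/restricted" prefix, the user names and the "<digest>" placeholders are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Stand-in for ZooKeeper's ACL type: a permission name plus a scheme and an id.
class AclEntry {
  final String perms;
  final String scheme;
  final String id;

  AclEntry(String perms, String scheme, String id) {
    this.perms = perms;
    this.scheme = scheme;
    this.id = id;
  }
}

class PathAwareAclSketch {
  // Hypothetical policy: one subtree is admin-only, the rest is also readable.
  static final String RESTRICTED_PREFIX = "/restricted";

  static List<AclEntry> getACLsToAdd(String zNodePath) {
    AclEntry admin = new AclEntry("ALL", "digest", "admin-user:<digest>");
    AclEntry reader = new AclEntry("READ", "digest", "readonly-user:<digest>");
    boolean restricted = zNodePath.equals(RESTRICTED_PREFIX)
        || zNodePath.startsWith(RESTRICTED_PREFIX + "/");
    return restricted ? Collections.singletonList(admin) : Arrays.asList(admin, reader);
  }

  public static void main(String[] args) {
    System.out.println(getACLsToAdd("/restricted/keys").size());
    System.out.println(getACLsToAdd("/clusterstate.json").size());
  }
}
```

Since ACLs are not inherited in ZooKeeper, a policy like this has to be applied on every znode creation, which is why the provider is asked once per new znode.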

Out of the box implementations

You can always make your own implementation, but out-of-the-box with Solr comes

org.apache.solr.common.cloud.DefaultZkACLProvider

It returns a list of length one for all zNodePaths. The single ACL-entry in the list is "open-unsafe". This is the default and is used if you do not set JVM-parameter "defaultZkACLProvider" properly

org.apache.solr.common.cloud.VMParamsAllAndReadonlyDigestZkACLProvider

This one lets you define your ACLs using JVM-parameters. Its getACLsToAdd-implementation does not use zNodePath for anything, so all znodes will get the same set of ACLs. It supports adding

  • A user that is allowed to do everything. The permission is "ALL" (corresponding to all of CREATE, READ, WRITE, DELETE and ADMIN) and the scheme is "digest". The username and password are defined by JVM-parameters "zkDigestUsername" and "zkDigestPassword" respectively. Unless both username and password are provided, this ACL will not be added to the list of ACLs
  • and/or a user that is allowed to do reads only. The permission is "READ" and the scheme is "digest". The username and password are defined by JVM-parameters "zkDigestReadonlyUsername" and "zkDigestReadonlyPassword" respectively. Unless both username and password are provided, this ACL will not be added to the list of ACLs
  • If neither of the above ACLs is added to the list, it will fall back to default behavior and use the ACL-list of DefaultZkACLProvider
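The rules in the bullets above can be sketched as follows. This is an illustration and not the actual provider code. ACLs are represented as plain strings instead of ZooKeeper ACL objects, DigestAclSketch and its methods are hypothetical names, and the JVM-parameters are assumed to be read as system properties. The sketch also assumes that the id of a "digest" ACL has the form "username:base64(sha1(username:password))", and that the open-unsafe ACL corresponds to permission ALL for the id "anyone" in the scheme "world".

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

class DigestAclSketch {
  // Builds a digest-scheme id of the form "username:base64(sha1(username:password))".
  static String digestId(String username, String password) {
    try {
      MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
      byte[] hash = sha1.digest((username + ":" + password).getBytes(StandardCharsets.UTF_8));
      return username + ":" + Base64.getEncoder().encodeToString(hash);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-1 not available", e);
    }
  }

  // Returns "PERMISSION scheme:id" strings as stand-ins for ZooKeeper ACL objects.
  static List<String> getACLsToAdd() {
    List<String> acls = new ArrayList<>();
    addIfComplete(acls, "ALL", "zkDigestUsername", "zkDigestPassword");
    addIfComplete(acls, "READ", "zkDigestReadonlyUsername", "zkDigestReadonlyPassword");
    if (acls.isEmpty()) {
      // Fall back to the open-unsafe ACL: anyone may do anything.
      acls.add("ALL world:anyone");
    }
    return acls;
  }

  // Adds an ACL only when both the username and the password parameter are set.
  private static void addIfComplete(List<String> acls, String perm,
                                    String userParam, String passParam) {
    String username = System.getProperty(userParam);
    String password = System.getProperty(passParam);
    if (username != null && !username.isEmpty()
        && password != null && !password.isEmpty()) {
      acls.add(perm + " digest:" + digestId(username, password));
    }
  }

  public static void main(String[] args) {
    getACLsToAdd().forEach(System.out::println);
  }
}
```

Note that only a hash of the password ends up in the ACL, while a connecting client presents the username and password themselves as its credentials.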

Notice the overlap in JVM-parameter names with the credentials provider VMParamsSingleSetCredentialsDigestZkCredentialsProvider (described above). This is to let the two providers collaborate in a nice and perhaps common way: We always protect access to content by limiting it to two users - an admin-user and a readonly-user. And we always connect with credentials corresponding to this same admin-user - basically so that we can do anything to the content/znodes we create ourselves.

You can give the readonly-credentials to "clients" of your SolrCloud cluster - e.g. to be used by SolrJ clients. They will be able to read whatever is necessary to run a functioning SolrJ client, but they will not be able to modify any of the ZooKeeper content.

Example

Let's say that you want all Solr-related content in ZooKeeper protected. You want an "admin"-user that is able to do anything to the content in ZooKeeper - this user will be used for initializing Solr-content in ZooKeeper and for server-side Solr-nodes. You also want a "readonly"-user that is only able to read the content in ZooKeeper - this user will be handed over to "clients".

Let's say

  • "admin"-user's username/password: admin-user/admin-password
  • "readonly"-user's username/password: readonly-user/readonly-password

What to do

If you use ZkCLI

SOLR_ZK_PROVIDERS="-DdefaultZkCredentialsProvider=org.apache.solr.common.cloud.VMParamsSingleSetCredentialsDigestZkCredentialsProvider -DdefaultZkACLProvider=org.apache.solr.common.cloud.VMParamsAllAndReadonlyDigestZkACLProvider"
SOLR_ZK_CREDS_AND_ACLs="-DzkDigestUsername=admin-user -DzkDigestPassword=admin-password -DzkDigestReadonlyUsername=readonly-user -DzkDigestReadonlyPassword=readonly-password"
java ... $SOLR_ZK_PROVIDERS $SOLR_ZK_CREDS_AND_ACLs ... org.apache.solr.cloud.ZkCLI -cmd ...

When you start your Solr-nodes

SOLR_ZK_PROVIDERS="-DdefaultZkCredentialsProvider=org.apache.solr.common.cloud.VMParamsSingleSetCredentialsDigestZkCredentialsProvider -DdefaultZkACLProvider=org.apache.solr.common.cloud.VMParamsAllAndReadonlyDigestZkACLProvider"
SOLR_ZK_CREDS_AND_ACLs="-DzkDigestUsername=admin-user -DzkDigestPassword=admin-password -DzkDigestReadonlyUsername=readonly-user -DzkDigestReadonlyPassword=readonly-password"
java ... $SOLR_ZK_PROVIDERS $SOLR_ZK_CREDS_AND_ACLs ... -jar start.jar ...

When starting your own "clients" (using SolrJ)

SOLR_ZK_PROVIDERS="-DdefaultZkCredentialsProvider=org.apache.solr.common.cloud.VMParamsSingleSetCredentialsDigestZkCredentialsProvider"
SOLR_ZK_CREDS_AND_ACLs="-DzkDigestUsername=readonly-user -DzkDigestPassword=readonly-password"
java ... $SOLR_ZK_PROVIDERS $SOLR_ZK_CREDS_AND_ACLs ... 

Or, since you yourself are writing the code creating the SolrZkClients, you might want to override the provider implementations at code-level instead.
