Currently, the community version of Jumbune runs well on non-secured clusters.
This version needs to be enhanced with security features by adding Kerberos for service-level authentication. Jumbune needs to support both Microsoft Active Directory (AD) and MIT Kerberos.
The following terms are used in the high-level configuration and development tasks:
AD - Microsoft Active Directory
MIT Kerberos - the MIT implementation of the Kerberos protocol
Secured Cluster Configuration:
1) Generate service principals in Microsoft Active Directory for all service superusers, for e.g.
similarly yarn, mapred, hive, oozie ..
Give all of the above Administrator privileges (that is, the power to create other users in the hierarchy) and set their passwords to never expire.
2) The client machine and the AD machine need to have the required certificates to perform the handshake and share information over the network.
3) A superuser <ADMIN_USER> has to be created to run the Jumbune processes, separate from the service superusers. This superuser also has to be created on the AD side, where it is used as the user that impersonates the service superusers.
4) With AD, the KDC sits on the AD side and grants a ticket to the user when the kinit command is issued. One advantage is that requesting a ticket from AD does not require a host name. For example,
hdfs/<hostname>@REALM is the MIT Kerberos format, whereas with AD the same task can be performed as hdfs@XYZ.COM
5) The superuser created to run Jumbune, <ADMIN_USER>, must be added manually to all the service superuser groups.
6) All the necessary changes for Kerberos security should be made in the following XML files, as described in the reference site below:
7) After making the changes in the required XML files and restarting the Hadoop services, any call to the Hadoop services should require a KDC ticket.
8) On HDFS, the /user/ directory is owned by "hdfs" with 755 permissions. A user directory named <ADMIN_USER> can then be created at /user/<ADMIN_USER>. The HDFS service will write temporary data for that user under this directory, which is owned by the HDFS supergroup.
9) On HDFS, the /user/history directory is owned by the "mapred" supergroup with 755 permissions. The history server uses this location to move and write files.
10) By setting the proxy user settings in the configuration XML files, the service superuser principals can be impersonated to allow access to other users.
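The difference between the two principal formats contrasted in step 4 can be illustrated with a short Java sketch. The class PrincipalName and its methods are hypothetical helpers introduced here for illustration only; they are not part of Jumbune or Hadoop. The sketch assumes the usual primary[/instance]@REALM layout and simply reports whether a principal carries a host part. The host name node1.example.com is a placeholder.

```java
import java.util.Optional;

/** Hypothetical helper: splits a Kerberos principal into primary, optional host instance, and realm. */
final class PrincipalName {
    final String primary;
    final Optional<String> instance; // host part; absent for AD-style principals such as hdfs@XYZ.COM
    final String realm;

    private PrincipalName(String primary, Optional<String> instance, String realm) {
        this.primary = primary;
        this.instance = instance;
        this.realm = realm;
    }

    /** Parses "primary@REALM" or "primary/instance@REALM"; rejects anything else. */
    static PrincipalName parse(String principal) {
        int at = principal.lastIndexOf('@');
        if (at <= 0 || at == principal.length() - 1) {
            throw new IllegalArgumentException("expected primary[/instance]@REALM: " + principal);
        }
        String name = principal.substring(0, at);
        String realm = principal.substring(at + 1);
        int slash = name.indexOf('/');
        if (slash < 0) {
            // No host component: the form usable with AD in step 4.
            return new PrincipalName(name, Optional.empty(), realm);
        }
        if (slash == 0 || slash == name.length() - 1) {
            throw new IllegalArgumentException("empty primary or instance: " + principal);
        }
        // Host-qualified form, as in the MIT Kerberos example.
        return new PrincipalName(name.substring(0, slash),
                Optional.of(name.substring(slash + 1)), realm);
    }

    boolean isHostQualified() {
        return instance.isPresent();
    }

    public static void main(String[] args) {
        // Placeholder principals: one host-qualified, one AD-style.
        for (String p : new String[] {"hdfs/node1.example.com@REALM", "hdfs@XYZ.COM"}) {
            PrincipalName n = parse(p);
            System.out.println(p + " -> primary=" + n.primary
                    + ", hostQualified=" + n.isHostQualified() + ", realm=" + n.realm);
        }
    }
}
```

A check like isHostQualified() could be used when building kinit arguments, so that the host part is only added for the MIT Kerberos case.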
Integrating security API tasks:
1) The underlying Java code that calls the services needs to be written using the secured method, as in the block below:
2) The JAAS security API can be used to obtain the Kerberos ticket using only the principal and password instead of keytabs, since keytabs are not readily available to authenticate the user. The method required to impersonate the user is given in the API in the reference site above.
The same can be done for MIT Kerberos, but we need an LDAP server to maintain the user database, which is built into the AD server.
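The principal-and-password login described in step 2 of the security API tasks might look roughly like the following sketch. It uses only standard JDK JAAS classes and the Krb5LoginModule bundled with the JDK. The class name PasswordKerberosLogin, its method names, and the entry name "jumbune-sketch" are hypothetical, and this is not necessarily the method given in the reference site. It also assumes the JVM can locate the KDC, for example through a krb5.conf file.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.Subject;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.auth.callback.UnsupportedCallbackException;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.Configuration;
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

/** Hypothetical sketch: Kerberos login through JAAS with a principal and password, no keytab. */
final class PasswordKerberosLogin {

    /** Answers the name and password callbacks raised by the login module. */
    static CallbackHandler handlerFor(String principal, char[] password) {
        return callbacks -> {
            for (Callback cb : callbacks) {
                if (cb instanceof NameCallback) {
                    ((NameCallback) cb).setName(principal);
                } else if (cb instanceof PasswordCallback) {
                    ((PasswordCallback) cb).setPassword(password);
                } else {
                    throw new UnsupportedCallbackException(cb);
                }
            }
        };
    }

    /** In-memory JAAS configuration that disables keytab and ticket-cache lookups. */
    static Configuration passwordOnlyConfig() {
        Map<String, String> options = new HashMap<>();
        options.put("useKeyTab", "false");      // do not read credentials from a keytab
        options.put("useTicketCache", "false"); // do not reuse an existing kinit cache
        options.put("doNotPrompt", "false");    // let the module ask the callback handler
        AppConfigurationEntry entry = new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                AppConfigurationEntry.LoginModuleControlFlag.REQUIRED,
                options);
        return new Configuration() {
            @Override
            public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
                return new AppConfigurationEntry[] {entry};
            }
        };
    }

    /** Contacts the KDC and returns a Subject holding the Kerberos credentials. */
    static Subject login(String principal, char[] password) throws LoginException {
        LoginContext ctx = new LoginContext("jumbune-sketch", new Subject(),
                handlerFor(principal, password), passwordOnlyConfig());
        ctx.login();
        return ctx.getSubject();
    }

    public static void main(String[] args) throws LoginException {
        if (args.length != 2) {
            System.err.println("usage: PasswordKerberosLogin <principal> <password>");
            return;
        }
        Subject subject = login(args[0], args[1].toCharArray());
        System.out.println("Authenticated principals: " + subject.getPrincipals());
    }
}
```

Passing the password on the command line is done here only to keep the sketch short. A real deployment would read it from a protected source. The returned Subject could then be used to run the service calls under the authenticated identity.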