{"id":749,"date":"2020-02-06T21:45:39","date_gmt":"2020-02-06T19:45:39","guid":{"rendered":"http:\/\/dekarlab.de\/wp\/?p=749"},"modified":"2020-05-23T15:33:03","modified_gmt":"2020-05-23T13:33:03","slug":"authentication-and-authorisation-in-hadoop-cluster","status":"publish","type":"post","link":"https:\/\/dekarlab.de\/wp\/?p=749","title":{"rendered":"Authentication and authorization in a Hadoop cluster"},"content":{"rendered":"\n<p>This post explains the concepts behind enabling security in a Hadoop cluster.<\/p>\n\n\n\n<!--more-->\n\n\n\n<p>The example setup consists of HDFS and Hive. The task is to configure security for these components.<\/p>\n\n\n\n<h2>Authentication<\/h2>\n\n\n\n<p>Everything starts with the user. You enter a user name and password on the login screen and press the submit button, and the system starts to authenticate you. Behind the scenes, the user\/password pair is compared with the credentials stored in an external directory. This can be LDAP, a database, or a simple properties file.<\/p>\n\n\n\n<p>Hive ships with predefined authentication providers. For example, you can use the predefined <a href=\"https:\/\/github.com\/apache\/hive\/blob\/master\/service\/src\/java\/org\/apache\/hive\/service\/auth\/LdapAuthenticationProviderImpl.java\">LDAP<\/a> provider. You need to fill in the required properties, such as the <a href=\"https:\/\/cwiki.apache.org\/confluence\/display\/Hive\/Configuration+Properties#ConfigurationProperties-hive.server2.authentication.ldap.url\">LDAP URL<\/a>, which points to your LDAP server (for example ldap:\/\/yourserver.com:389), and the <a href=\"https:\/\/cwiki.apache.org\/confluence\/display\/Hive\/Configuration+Properties#ConfigurationProperties-hive.server2.authentication.ldap.baseDN\">base path<\/a> to your users, for example ou=Users,o=Org. The default LDAP provider takes your user name and generates the search string: uid=&lt;user name&gt;,ou=Users,o=Org. 
If your organization uses cn=&lt;user name&gt; instead of uid, set this in the <a href=\"https:\/\/cwiki.apache.org\/confluence\/display\/Hive\/Configuration+Properties#ConfigurationProperties-hive.server2.authentication.ldap.guidKey\">guidKey<\/a> property.<\/p>\n\n\n\n<p>Sometimes technical processes do not authenticate against LDAP and take the user\/password from other sources. In this case you should extend the default LDAP provider and override its behavior with your custom code.<\/p>\n\n\n\n<h2>Authorization<\/h2>\n\n\n\n<p>Authorization can be achieved by defining user groups. If a user belongs to a group, then the user can be given the permissions granted to that group. A user can belong to more than one group. The assignment of users to groups can also be stored in the external directory. By default, groups are resolved from the underlying Linux system, for example using <a href=\"https:\/\/hadoop.apache.org\/docs\/current\/hadoop-project-dist\/hadoop-common\/GroupsMapping.html\">ShellBasedUnixGroupsMapping<\/a>. You can list the groups that Hadoop sees for a user with the following command:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>hdfs groups &lt;user name&gt;<\/code><\/pre>\n\n\n\n<p>If you use groups from Linux, then the output of the above command will match the output of:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\ngroups &lt;user name&gt;\n<\/pre><\/div>\n\n\n<p>It is also possible to use another group provider or a mix of them. For example, if you use <a href=\"https:\/\/hadoop.apache.org\/docs\/current\/hadoop-project-dist\/hadoop-common\/GroupsMapping.html\">LdapGroupsMapping<\/a>, then groups will be imported from LDAP, and you will be able to see them in Hadoop with the same command: hdfs groups. However, these groups are not visible in Linux. 
It is also possible to use a mixed group provider that imports groups from several sources, for example from both LDAP and Linux, using <a href=\"https:\/\/hadoop.apache.org\/docs\/current\/hadoop-project-dist\/hadoop-common\/GroupsMapping.html\">CompositeGroupsMapping<\/a>.<\/p>\n\n\n\n<p>As a next step, we define permissions based on the groups that were imported in the previous step. In HDFS you can use:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: powershell; title: ; notranslate\" title=\"\">\nhdfs dfs -chown test:GroupFromLdap \/user\/test\nhdfs dfs -chmod 555 \/user\/test\n<\/pre><\/div>\n\n\n<p>In Hive you can define permissions as follows:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\ncreate role new_role;\n-- the principal type (here: group) must be stated explicitly\ngrant role new_role to group GroupFromLdap;\ngrant all on table new_table to role new_role;\n<\/pre><\/div>\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>This post explains the concepts behind enabling security in a Hadoop 
cluster.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0},"categories":[25],"tags":[57,54,37],"_links":{"self":[{"href":"https:\/\/dekarlab.de\/wp\/index.php?rest_route=\/wp\/v2\/posts\/749"}],"collection":[{"href":"https:\/\/dekarlab.de\/wp\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dekarlab.de\/wp\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dekarlab.de\/wp\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dekarlab.de\/wp\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=749"}],"version-history":[{"count":9,"href":"https:\/\/dekarlab.de\/wp\/index.php?rest_route=\/wp\/v2\/posts\/749\/revisions"}],"predecessor-version":[{"id":773,"href":"https:\/\/dekarlab.de\/wp\/index.php?rest_route=\/wp\/v2\/posts\/749\/revisions\/773"}],"wp:attachment":[{"href":"https:\/\/dekarlab.de\/wp\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=749"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dekarlab.de\/wp\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=749"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dekarlab.de\/wp\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=749"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}