Authentication in Hadoop cluster: MIT Kerberos and Active Directory
There are several ways to enable Kerberos in a Hadoop cluster.
Blog about big data processing and data-driven investments
Here we discuss solutions for data processing and distributed computing, such as HDFS, Spark, and Kubernetes.
The Kerberos authentication protocol is needed to secure a Hadoop cluster; it is the only way to make the cluster secure.
When implementing your components as microservices, you may come up with the idea of using Apache Spark for data retrieval. In this post I describe some ideas on how to do this.
Here is an overview of what is hidden behind the Spark interpreter in Apache Zeppelin.
There is a pattern in microservices architecture called Command and Query Responsibility Segregation (CQRS). This pattern helps to design a multi-purpose data lake.
You can buy this book on amazon.com.
Here we explain the concepts behind enabling security in a Hadoop cluster.
Maintaining data descriptions is a useful feature. Here are some ideas on how to implement it.
We start saving data in HDFS using the Avro format. In the previous post we discussed forward and backward compatibility of Avro schemas. How can this concept be used?
A useful way to implement a CI/CD pipeline is to package code as Docker images and run them in a K8s cluster. One very practical application for data analytics is the notebook-based tool Apache Zeppelin. Every business department requires its own Zeppelin configuration. Hence the idea of creating a Docker container for each department and running them in a K8s cluster.