You may think that there is no need to structure data in HDFS and that you can organize it later. I think this is the wrong approach. We should always keep in mind that there is no free lunch, so it is better to make these decisions at the beginning.
Latest Posts
Thoughts about schema-on-write and schema-on-read
There are two approaches we can choose from when designing data storage: schema-on-read and schema-on-write.
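To make the difference concrete, here is a minimal sketch in Python. The schema, the field names (`id`, `amount`) and the in-memory lists standing in for storage are all assumptions made for illustration; they are not taken from the post. With schema-on-write the record is validated and converted before it is stored, while with schema-on-read the raw text is stored as-is and the schema is applied only when the data is read.

```python
import json

# Hypothetical schema for illustration: field name -> type to cast to.
SCHEMA = {"id": int, "amount": float}


def conform(record, schema):
    """Cast a record to the schema; raises KeyError or ValueError if it does not fit."""
    return {name: cast(record[name]) for name, cast in schema.items()}


def write_with_schema(store, record):
    # Schema-on-write: validate and convert before the data is stored.
    store.append(conform(record, SCHEMA))


def write_raw(store, line):
    # Schema-on-read: store the raw text untouched, no checks at write time.
    store.append(line)


def read_with_schema(store):
    # The schema is applied only at read time, so bad records fail here.
    return [conform(json.loads(line), SCHEMA) for line in store]


# Plain lists stand in for the storage layer in this sketch.
typed_store, raw_store = [], []
write_with_schema(typed_store, {"id": "1", "amount": "9.5"})
write_raw(raw_store, '{"id": "2", "amount": "3"}')

print(typed_store)
print(read_with_schema(raw_store))
```

In this sketch a malformed record is rejected immediately by `write_with_schema`, whereas `write_raw` accepts anything and the error only surfaces later in `read_with_schema`. That is the trade-off between the two approaches: cost paid up front versus cost paid by every reader.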
How to integrate Apache Kylin OLAP In Excel (pivot) [XMLA Connect and Mondrian]
Apache Kylin is a very powerful OLAP engine. It provides an ODBC driver for moving data into Excel, but this driver is not user-friendly: users have to write SQL queries to use it.
Short note about HDFS, or why you need a distributed file system
Why do you need HDFS (Hadoop Distributed File System)? If the amount of data is small and there is enough space for it on your computer, you do not need a distributed file system. But if you want to process an amount of data that is too large to store on one computer, you need to think about a distributed file system.
Documenting your investment decisions in MoneyBuilder
Before you start investing, you need to document two very important points.
Source code for MoneyBuilder is available
First steps with MoneyBuilder
You have downloaded the application and would like to start bookkeeping.
How to start?
First version of MoneyBuilder is available
The first version of MoneyBuilder is available.
Easy Master Data Management with XMDM
You can download the new software for master data management here.
Informatica PowerCenter Fix (Multiple Monitors) without Admin Rights
Informatica PowerCenter has a small issue when working with multiple screens.
The dialogs for expressions and SQL statements are not displayed correctly: