There are two approcahes, which we can select for designing storage of the data. They are schema-on-read and schema-on-write.
Why do you need HDFS (Hadoop Distributed Files System)? If the amount of data is small and place on your computer is enough for this, then you do not need distributed file system. But if you like to process a large amount of data, which is not possible to save on one computer, then you need to think about distributed file system.