Data layout
- C-Store: A Column-oriented DBMS
- Integrating Compression and Execution in Column-Oriented Database Systems
- Dremel: Interactive Analysis of Web-Scale Datasets
- RCFile: A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems
- Major Technical Advancements in Apache Hive
- Table Placement Methods
Further reading
[1] Apache Arrow vs. Parquet and ORC: Do we really need a third Apache project for columnar data representation? by Daniel Abadi, 2017