Paper Notes
1. benchmarks
1.1. ssb
2. bigdata
2.1. mapreduce
2.2. nephele
2.3. dataflow model
2.4. flink
2.5. flink state management
3. compiler
4. databases
4.1. cloudnative
4.1.1. aurora
4.1.2. taurus
4.2. columnstores vs rowstores
4.3. kv
4.3.1. rocksdb cidr17
4.3.2. wisckey
4.4. mmdb
4.4.1. mmdb overview
4.5. oltp
4.5.1. through the looking glass
4.5.2. staring into the abyss
4.6. olap
4.6.1. lakehouse
4.6.2. delta lake
4.6.3. vertica
4.6.4. duckdb
4.7. htap
4.7.1. greenplum
4.8. vector db
4.8.1. hnsw
4.8.2. ivf-hnsw
4.8.3. diskann
4.8.4. product quantization
4.9. graph db
4.9.1. kuzu
4.10. citus
4.11. optimizer
4.12. executor
4.12.1. volcano
4.13. concurrency control
4.13.1. evaluation of in-memory mvcc
4.14. cdc
4.14.1. dblog
4.15. rum conjecture
5. datalayout
5.1. cstore
5.2. cstore compression
5.3. dremel
5.4. rcfile
5.5. orc
5.6. table placement methods
6. data structures
6.1. btree family
6.1.1. bw-tree
6.2. hash table
6.2.1. linear hashing
6.3. trie family
6.3.1. art
6.3.2. hot
6.4. bitmaps
6.4.1. roaring bitmaps
6.5. skip list
6.6. bloom filter
7. distributed system
7.1. consensus
7.1.1. flp
7.1.2. paxos made simple
7.1.3. paxos made live
7.1.4. viewstamped replication
7.1.5. zab
7.1.6. paxos vs. vr vs. zab
7.1.7. raft
7.1.8. paxos vs raft
7.2. scheduler
7.2.1. borg
7.3. primary backup
7.4. chain replication
7.5. bolosky
7.6. holy grail
7.7. chandy lamport
7.8. asynchronous barrier snapshotting
7.9. zookeeper
8. filesystem
8.1. gfs
8.2. polarfs
9. llm
10. storage
10.1. kv store
10.1.1. dynamo
10.2. kudu
10.3. bluestore
Paper Reading Notes
bigdata
MapReduce: Simplified Data Processing on Large Clusters
Nephele: Efficient Parallel Data Processing in the Cloud
The Dataflow Model
Apache Flink: Stream and Batch Processing in a Single Engine
State Management in Apache Flink