Commit acc99b5

Added section of Snapshot Isolation
1 parent 4934dcd commit acc99b5

File tree

1 file changed

+19
-5
lines changed

README.md

Lines changed: 19 additions & 5 deletions
@@ -11,10 +11,11 @@ If you are reading this and take effort to understand these papers, we would lov
1111
3. [Classic System Design](#system-design)
1212
4. [Columnar Databases](#column)
1313
5. [Data-Parallel Computation](#data-parallel)
14-
6. [Consensus and Consistency](#consensus)
15-
7. [Trends (Cloud Computing, Warehouse-scale Computing, New Hardware)](#trends)
16-
8. [Miscellaneous](#misc)
17-
9. [External Reading Lists](#external)
14+
6. [Snapshot Isolation](#si)
15+
7. [Consensus and Consistency](#consensus)
16+
8. [Trends (Cloud Computing, Warehouse-scale Computing, New Hardware)](#trends)
17+
9. [Miscellaneous](#misc)
18+
10. [External Reading Lists](#external)
1819

1920

2021
## <a name='basic-and-algo'> Basics and Algorithms
@@ -53,7 +54,6 @@ If you are reading this and take effort to understand these papers, we would lov
5354
* [Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications](http://www.cs.berkeley.edu/~rxin/db-papers/Chord-DHT.pdf) (2001) and [Dynamo: Amazon’s Highly Available Key-value Store](http://www.cs.berkeley.edu/~rxin/db-papers/Dynamo.pdf) (2007): Chord was born in the days when distributed hash tables were a hot research topic. It does one thing, and does it really well: looking up the location of a key in a completely distributed (peer-to-peer) setting using consistent hashing. The Dynamo paper explains how to build a distributed key-value store on the same consistent-hashing idea. Note that some design decisions change from Chord to Dynamo, e.g. the finger table size, O(logN) vs O(N), because in Dynamo's case Amazon has more control over the nodes in a data center, while Chord assumes peer-to-peer nodes in wide area networks.
5455

5556

56-
5757
## <a name='column'> Columnar Databases
5858

5959
Columnar storage and column-oriented query engines are critical to analytical workloads, e.g. OLAP. It's been almost 15 years since the idea first came out (the MonetDB paper in 1999), and almost every commercial warehouse database has a columnar engine by now.
@@ -78,6 +78,20 @@ Columnar storage and column-oriented query engine are critical to analytical wor
7878
* [Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks](http://cs.brown.edu/~debrabant/cis570-website/papers/dryad.pdf) (2007): Dryad is a programming model developed at Microsoft that enables large scale dataflow programming. "The fundamental difference between the \[MapReduce and Dryad\] is that a Dryad application may specify an arbitrary communication DAG rather than requiring a sequence of map/distribute/sort/reduce operations".
7979

8080

81+
## <a name='si'> Snapshot Isolation
82+
83+
* [A Critique of ANSI SQL Isolation Levels](http://research.microsoft.com/pubs/69541/tr-95-51.pdf) (1995): Examines the ANSI SQL isolation levels, which are defined in terms of phenomena, and shows that these phenomena and the ANSI SQL definitions fail to characterize several popular isolation levels. It also defines an important multiversion isolation type: *Snapshot Isolation (SI)*.
84+
85+
* [A Read-Only Transaction Anomaly Under Snapshot Isolation](http://www.sigmod.org/publications/sigmod-record/0409/2.ROAnomONeil.pdf) (2004): Disproves the assumption that under Snapshot Isolation, read-only transactions always execute serializably provided the concurrent update transactions are serializable.
86+
87+
* [Serializable Isolation for Snapshot Databases (SSI)](https://courses.cs.washington.edu/courses/cse444/08au/544M/READING-LIST/fekete-sigmod2008.pdf) (2008) and its [2009 revision (ESSI)](http://dl.acm.org/citation.cfm?doid=1620585.1620587): Describe a concurrency control algorithm that detects and prevents Snapshot Isolation anomalies at run-time, thus providing serializable isolation. Both papers are included for comparison; the second is more comprehensive, adds protection against additional phenomena, and can be regarded as *Enhanced Serializable Snapshot Isolation (ESSI)*.
88+
89+
* [Precisely Serializable Snapshot Isolation (PSSI)](http://www.cs.umb.edu/~eoneil/PSSI_ICDE11_Numbered.pdf) (2011): Defines an algorithm for precisely detecting Snapshot Isolation anomalies, resulting in fewer false-positive aborts than ESSI. Discusses an implementation of the algorithm in MySQL's InnoDB.
90+
91+
* [Serializable Isolation in PostgreSQL](http://drkp.net/papers/ssi-vldb12.pdf) (2012):
92+
Discusses the trade-offs between SSI, ESSI and PSSI and the approach taken to implement SSI in PostgreSQL.
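
For readers new to the topic, the kind of anomaly these papers deal with can be sketched in a few lines. The toy store below is not taken from any of the papers: `SnapshotStore`, `Txn`, `go_off_call` and the two-doctors scenario are hypothetical names chosen for illustration. It assumes a simple multiversion store with snapshot reads and a first-committer-wins check on write sets, and shows write skew: two transactions read the same snapshot, write disjoint keys, both commit, and together break an invariant that each one checked on its own.

```python
class SnapshotStore:
    """Toy multiversion store (hypothetical): key -> list of (commit_ts, value)."""

    def __init__(self, initial):
        self.clock = 0
        self.versions = {k: [(0, v)] for k, v in initial.items()}

    def begin(self):
        # A transaction reads from the snapshot as of its start timestamp.
        return Txn(self, self.clock)

    def read_at(self, key, ts):
        # Latest version committed at or before ts.
        visible = [v for v in self.versions[key] if v[0] <= ts]
        return max(visible, key=lambda v: v[0])[1]


class Txn:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}  # buffered until commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.store.read_at(key, self.start_ts)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First-committer-wins: abort if a key we wrote was committed
        # by someone else after our snapshot was taken.
        for key in self.writes:
            if self.store.versions[key][-1][0] > self.start_ts:
                return False
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions[key].append((self.store.clock, value))
        return True


def go_off_call(txn, me):
    # Invariant the application wants: at least one doctor stays on call.
    on_call = txn.read("alice") + txn.read("bob")
    if on_call >= 2:
        txn.write(me, 0)


store = SnapshotStore({"alice": 1, "bob": 1})
t1, t2 = store.begin(), store.begin()  # same snapshot
go_off_call(t1, "alice")
go_off_call(t2, "bob")
committed = (t1.commit(), t2.commit())  # disjoint write sets: both succeed

check = store.begin()
on_call_after = check.read("alice") + check.read("bob")

# For contrast: overlapping write sets are caught by first-committer-wins.
t3, t4 = store.begin(), store.begin()
t3.write("alice", 1)
t4.write("alice", 1)
conflict = (t3.commit(), t4.commit())
```

In this sketch both `t1` and `t2` commit and nobody is left on call, even though no serial order of the two transactions could produce that state. The write-write check only catches `t4`, whose write set overlaps with `t3`. Roughly speaking, the SSI family of algorithms listed above also tracks read-write dependencies between concurrent transactions so that cases like `t1`/`t2` can be aborted as well.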
93+
94+
8195
## <a name='consensus'> Consensus and Consistency
8296

8397
* [Paxos Made Simple](http://www.cs.berkeley.edu/~rxin/db-papers/Paxos.pdf) (2001): Paxos is a fault-tolerant distributed consensus protocol. It forms the basis of a wide variety of distributed systems. The idea is simple, but notoriously difficult to understand (perhaps due to the way the original Paxos paper was written).
