Database System Implementation
CS346
at 200-305
A major database system implementation project realizes the principles and techniques covered in earlier courses. Students independently build a complete database management system, from file structures through query processing, with a personally designed feature or extension. Lectures on project details and advanced techniques in database system implementation, focusing on query processing and optimization. Guest speakers from industry on commercial DBMS implementation techniques. Prerequisites: CS145, CS245, programming experience in C++.
Schedule & Handouts
Links to more handouts may be added over time.
Lectures may change, but the project due dates are set.
I use the slides provided to prepare my blackboard lectures.
There is no 1-1 mapping from slides to the crazy things I say in class.
| Week | Date | Event | Handouts |
|---|---|---|---|
| 1 | | Class: Introduction to course, DBMS review, RedBase overview | RedBase Part 0: PF; RedBase Part 1: RM |
| | | Class: File & buffer review, RedBase PF and RM components | Slides: Buffer Management (pdf) |
| 2 | | Class: Buffer Management | Slides: Buffer Manager Extra (pdf) |
| | | Class: Page Layout and File of Records | |
| | | Project: RedBase Part 1: RM Due | |
| 3 | | Class: RedBase IX component, Indexing and B+ tree review (by TA) | Slides: B+/B-Link Trees (pdf and paper); RedBase Part 2: IX |
| | | Class: Concurrency in Indexing, B-Link tree | |
| 4 | | Class: RedBase SM and QL components, Metadata and Query Processing review (by TA) | RedBase Part 3: SM; RedBase Part 4: QL |
| | | Class: Query Processing lecture | Slides: Cost Models |
| | | Project: RedBase Part 2: IX Due | |
| 5 | | Class: Recovery (ARIES) | Slides: ARIES (pdf); ARIES paper; ARIES examples |
| | | Class: Guest lecture: Eric Sedlar, Oracle | |
| 6 | | Class: Guest lecture: Michalis Petropoulos, Pivotal | |
| | | Class: Database Analytics (DeepDive, Hogwild!), RedBase EX component | RedBase Part 5: EX |
| | | Project: RedBase Part 3: SM Due | |
| 7 | | Class: Guest lecture: Michael Armbrust, Databricks | |
| | | Class: Guest lecture: Mike Cafarella, U of Michigan, Co-founder of Hadoop | |
| | | Project: RedBase Part 5: EX Proposal Due | |
| 8 | | Class: Guest lecture: Christian Tinnefeld, SAP | |
| | | Class: Guest lecture: Karthik Ramasamy, Twitter | |
| | | Project: RedBase Part 4: QL Due | |
| 9 | | Class: No class, Memorial Day | |
| | | Class: Guest lecture: TJ Green, LogicBlox | |
| 10 | | Class: No class, work on your projects! | |
| | | Class: No class, work on your projects! | |
| | | Project: RedBase Part 5: EX Final Demo | |
Course Info
Course Staff
Chris Re, Instructor
- Office Hours: Before class and by appointment
- Office: Gates 433
- Email: chrismre at cs
Jaeho Shin, Teaching Assistant
- Office Hours: every week, and before due dates
- Office: Gates 427
- Email: jaeho.shin at stanford
Marianne Siroker, Administrator
- Office: Gates 435
- Phone: 723-0872
- Email: siroker at cs
Communication
Piazza is the main place where all announcements, questions, and discussions are posted.
Important announcements are also sent to the mailing list cs346-spr1415-all, which includes all enrolled students.
Emails concerning individual issues should always be sent to cs346-spr1415-staff@lists.stanford.edu. Also, RedBase Part 5: EX proposals should be submitted to that address.
Course Contents
There will be five aspects to the course:
- The basic RedBase project, implemented by each student individually.
- An extension to RedBase, individually conceived, designed, and implemented by each student.
- Lectures on aspects of the RedBase project.
- Lectures on advanced database system implementation techniques, with an emphasis on query processing and optimization.
- Guest lecturers from industry describing commercial database system implementation techniques, with an emphasis on query processing and optimization.
An overview and details of the project can be found on the RedBase Project page.
Prerequisites
CS145 (Introduction to Databases) and CS245 (Database System Principles) or equivalent knowledge is essential. We will assume that all students already understand basic database system implementation techniques. In this course you will put your basic knowledge into practice while learning about more advanced implementation techniques including those used in commercial products.
We recommend that all students have prior experience with Unix and with at least the C programming language. C++ experience is preferred but not essential. Students with no C++ experience will need to learn quickly; students with no C/Unix experience probably should not take this course.
Units
Students may enroll in CS346 for 3, 4, or 5 units. All students are expected to do the same amount of work regardless of their number of units. CS346 is a 5-unit course in terms of work; it is offered for fewer units as a courtesy to students who have a limit.
Readings and Textbook
A few research papers will be made available on the web as suggested reading. There is no required textbook for the course, but students may wish to own a comprehensive database textbook for reference, for example:
- Database Systems: The Complete Book
H. Garcia-Molina, J.D. Ullman, and J. Widom; Prentice Hall
Other textbooks, such as those by Silberschatz, Korth, & Sudarshan; Ramakrishnan & Gehrke; Elmasri & Navathe; O'Neil; or Date, are also sufficient.
Grading
90% of your final grade will be based on the project and 10% on class participation. The complete breakdown is:
| Component | Weight |
|---|---|
| Project Part 1 | 15% |
| Project Part 2 | 15% |
| Project Part 3 | 15% |
| Project Part 4 | 20% |
| Project Part 5 proposal | 5% |
| Project Part 5 demo | 20% |
| Class participation | 10% |
Your programs will be graded on correctness and efficiency, as well as on descriptions of key design decisions. Details on program grading criteria and mechanisms are provided in the RedBase Logistics document.
Please note that attendance at all guest lectures is required.
CS346 is not graded on a curve. It's a difficult class, and everyone who performs well (defined very roughly as ~90% of project points, good class participation, and a solid RedBase extension) will get an A.
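To estimate where you stand, the breakdown above can be treated as a weighted sum. The following is only a rough sketch: the component names and the helper functions `final_score` and `project_fraction` are hypothetical, and the actual grading mechanics are those described in the RedBase Logistics document, not this snippet.

```python
# Weights from the grading breakdown, in percent of the final grade.
# The dictionary keys are illustrative names, not official identifiers.
WEIGHTS = {
    "part1": 15,
    "part2": 15,
    "part3": 15,
    "part4": 20,
    "part5_proposal": 5,
    "part5_demo": 20,
    "participation": 10,
}


def final_score(fractions):
    """Return the weighted final score in percent.

    `fractions` maps a component name to the fraction of its points
    earned (0.0 to 1.0). Components that are missing count as zero.
    """
    return sum(w * fractions.get(name, 0.0) for name, w in WEIGHTS.items())


def project_fraction(fractions):
    """Return the fraction of project points earned (participation excluded)."""
    project = {k: w for k, w in WEIGHTS.items() if k != "participation"}
    earned = sum(w * fractions.get(k, 0.0) for k, w in project.items())
    return earned / sum(project.values())


if __name__ == "__main__":
    # Example: full credit everywhere except half credit on Part 4.
    example = {name: 1.0 for name in WEIGHTS}
    example["part4"] = 0.5
    print(final_score(example))
    print(project_fraction(example))
```

`project_fraction` corresponds to the "project points" figure mentioned above, while `final_score` folds in class participation.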
Past Offerings
Here are websites of some of the past offerings of the course.
Students with Documented Disabilities
Students who may need an academic accommodation based on the impact of a disability must initiate the request through the Office of Accessible Education (OAE). OAE staff will evaluate the request with required documentation, recommend reasonable accommodations, and prepare an Accommodation Letter for faculty dated in the current quarter in which the request is being made. Students should contact OAE as soon as possible since timely notice is needed to coordinate accommodations:
563 Salvatierra Walk
TTY: (650) 723-1067
Voice: (650) 723-1066
Honor Code
Under the Honor Code at Stanford, each of you is expected to submit your own work in this course: all code submitted must have been written by you. However, on many occasions when working on programs it is useful to talk with others (the instructor, the TA, or other students) about design decisions and programming strategies. Such activity is both acceptable and encouraged, but when you turn in your programs you must indicate any assistance you received. Any assistance received that is not given proper citation will be considered a violation of the Honor Code.
The project extension proposal must represent individual ideas and writing, and we discourage excessive collaboration in developing proposals. Quiz answers must be original.
The course staff will aggressively pursue all suspected cases of Honor Code violations, which will be handled through official University channels. The course staff may employ plagiarism-detection software to ensure that programs turned in are the original work of each student.