Franco Fernando on Substack: What happens if a database crashes in the middle of a transaction?
Every backend engineer should be able to answer this question:
What happens if a database crashes in the middle of a transaction?
Plenty can go wrong: power outages, hardware failures, and so on.
The key to not losing data is to keep it on non-volatile storage such as a disk.
Whenever the user performs a transaction, the database does 2 things:
- it first appends a record of the change to a separate log
- it then applies the update to the data itself
On restart after a failure, the database replays the log to restore a consistent state: committed transactions are reapplied and unfinished ones are discarded.
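To make the replay idea concrete, here is a minimal sketch in Python. It assumes a made-up record shape `(transaction id, operation, key, value)` and a redo-only recovery that keeps just the transactions that reached a `COMMIT` record. Real databases use richer log formats and recovery algorithms, so treat the names and structure as illustrative only.

```python
from typing import Dict, List, Tuple

# Hypothetical log record: (transaction id, operation, key, value).
LogRecord = Tuple[int, str, str, str]


def recover(log: List[LogRecord]) -> Dict[str, str]:
    """Rebuild the state by replaying only transactions that reached COMMIT."""
    committed = {txid for txid, op, _, _ in log if op == "COMMIT"}
    state: Dict[str, str] = {}
    for txid, op, key, value in log:
        # Updates from transactions without a COMMIT record are skipped.
        if op == "SET" and txid in committed:
            state[key] = value
    return state


log: List[LogRecord] = [
    (1, "SET", "alice", "90"),
    (1, "SET", "bob", "110"),
    (1, "COMMIT", "", ""),
    (2, "SET", "alice", "0"),  # the crash happened before transaction 2 committed
]
state = recover(log)
```

In this sketch transaction 2 never committed, so its half-finished update to `alice` does not show up in the recovered state.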
Writing to the log is fast because the log is an append-only binary file.
Data is only ever added to the end of the file, avoiding time-consuming seek operations.
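A rough sketch of such an append-only file, again in Python. The length-prefixed record layout and the helper names `append_record` and `read_records` are assumptions made for this example, not the format of any particular database. The `os.fsync` call asks the operating system to push the bytes to the storage device before the write counts as done.

```python
import os
import struct
import tempfile
from typing import List


def append_record(path: str, payload: bytes) -> None:
    """Append one length-prefixed record and flush it to stable storage."""
    with open(path, "ab") as f:  # "ab" = append, binary: writes go to the end
        f.write(struct.pack(">I", len(payload)) + payload)
        f.flush()                # move Python's buffer to the OS
        os.fsync(f.fileno())     # ask the OS to write it to the device


def read_records(path: str) -> List[bytes]:
    """Read records back in order, stopping at an incomplete trailing record."""
    records: List[bytes] = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if len(header) < 4:
                break  # end of file, or a header cut short by a crash
            (length,) = struct.unpack(">I", header)
            payload = f.read(length)
            if len(payload) < length:
                break  # payload cut short by a crash
            records.append(payload)
    return records


log_path = os.path.join(tempfile.mkdtemp(), "wal.log")
append_record(log_path, b"tx1 SET alice=90")
append_record(log_path, b"tx1 COMMIT")
records = read_records(log_path)
```

Syncing on every record is the simplest choice for a sketch; batching several records per sync is a common way to trade a little latency for throughput.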
What if a database is distributed?
This case is trickier since the database servers must coordinate using a Two-Phase Commit Protocol.
One server acts as the coordinator:
- it asks all the participants to prepare to commit
- it waits for all their acknowledgments (votes)
- it tells them to commit if every participant agreed, or to roll back otherwise
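The coordinator's steps can be sketched in a few lines of Python. The `Participant` class and its `prepare`, `commit`, and `rollback` methods are hypothetical in-memory stand-ins; the first message is modeled here as a prepare request that each participant answers with a yes or no vote. A real implementation would also log every decision durably and handle timeouts and coordinator crashes, which this sketch leaves out.

```python
from typing import List


class Participant:
    """Hypothetical in-memory participant, used only for illustration."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "INIT"

    def prepare(self) -> bool:
        # Vote yes only if this participant is able to commit.
        self.state = "PREPARED" if self.can_commit else "ABORTED"
        return self.can_commit

    def commit(self) -> None:
        self.state = "COMMITTED"

    def rollback(self) -> None:
        self.state = "ABORTED"


def two_phase_commit(participants: List[Participant]) -> str:
    # Phase 1: ask everyone to prepare and collect all the votes.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if every vote was yes, otherwise roll back everyone.
    if all(votes):
        for p in participants:
            p.commit()
        return "COMMITTED"
    for p in participants:
        p.rollback()
    return "ABORTED"


healthy = [Participant("db1"), Participant("db2")]
outcome_ok = two_phase_commit(healthy)

mixed = [Participant("db1"), Participant("db2", can_commit=False)]
outcome_failed = two_phase_commit(mixed)
```

A single no vote is enough to roll back every participant, which is what keeps the servers consistent with each other.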