Franco Fernando on Substack: “Every backend engineer should be able to answer this question: What happens if a database crashes in the middle of a transaction? Many bad things can happen, such as power outages, hardware failures, etc. The main idea for not losing data is to store it in a non-volatile s…”

Every backend engineer should be able to answer this question:

What happens if a database crashes in the middle of a transaction?

Many things can go wrong, such as power outages or hardware failures.

The main idea for not losing data is to store it in non-volatile storage, such as a disk.

Whenever a transaction changes data, the database does two things, in this order:

- it appends a record of the change to a separate log

- it applies the update to the data itself

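After a failure, the database replays the log on restart to restore a consistent state: changes from committed transactions are reapplied, and changes from transactions that never committed are discarded.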
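The log-then-update idea can be sketched in a few lines. This is only an illustration under simplifying assumptions, not how any particular database implements it: `TinyStore`, its JSON-lines log format, and the `crash_before_apply` flag are all hypothetical names invented for the example, and each log record stands for one already-committed transaction.

```python
import json
import os
import tempfile


class TinyStore:
    """Hypothetical key-value store that logs each transaction before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}  # in-memory state, lost on a crash
        self.recover()

    def recover(self):
        """Rebuild the state by replaying every complete record in the log."""
        self.data = {}
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as log:
            for line in log:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break  # torn final record: that transaction never committed
                self.data.update(record["writes"])

    def commit(self, writes, crash_before_apply=False):
        # Step 1: append the record to the log and force it to disk.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"writes": writes}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        if crash_before_apply:
            return  # simulate a crash after logging but before updating
        # Step 2: apply the update.
        self.data.update(writes)


log_path = os.path.join(tempfile.mkdtemp(), "wal.log")
store = TinyStore(log_path)
store.commit({"balance": 100})
store.commit({"balance": 70}, crash_before_apply=True)  # update is lost in memory

restarted = TinyStore(log_path)  # "reboot": the constructor replays the log
print(store.data, restarted.data)
```

The interrupted update is missing from `store.data` but present in `restarted.data`, because the record reached the log before the simulated crash.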

Writing to the log is fast because the log is an append-only binary file.

Data is only ever added to the end of the file, avoiding time-consuming seek operations.
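To make "append-only binary file" concrete, here is a minimal sketch. The record layout (a 4-byte length prefix followed by the payload) and the helper names `append_record` and `read_records` are assumptions made for illustration; real databases use richer record formats.

```python
import os
import struct
import tempfile

HEADER = struct.Struct(">I")  # assumed format: 4-byte big-endian length prefix


def append_record(path, payload):
    """Append one length-prefixed record and return the offset it was written at."""
    with open(path, "ab") as log:
        log.seek(0, os.SEEK_END)
        offset = log.tell()
        log.write(HEADER.pack(len(payload)) + payload)
        log.flush()
        os.fsync(log.fileno())  # make sure the bytes reach the disk
    return offset


def read_records(path):
    """Scan the log front to back, stopping at a truncated tail."""
    records = []
    with open(path, "rb") as log:
        while True:
            header = log.read(HEADER.size)
            if len(header) < HEADER.size:
                break
            (length,) = HEADER.unpack(header)
            payload = log.read(length)
            if len(payload) < length:
                break  # incomplete record left by a crash mid-write
            records.append(payload)
    return records


path = os.path.join(tempfile.mkdtemp(), "append.log")
offsets = [append_record(path, p) for p in (b"set x=1", b"set y=2", b"commit")]
print(offsets)
print(read_records(path))
```

Every write lands at the current end of the file, so the offsets only grow and no existing bytes are rewritten.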

What if a database is distributed?

This case is trickier since the database servers must agree on the outcome of each transaction, typically by using a protocol such as Two-Phase Commit.

One server acts as the coordinator:

- it asks all the participants to prepare the transaction

- it waits for all their acknowledgments (votes)

- it communicates the final decision: commit if every participant voted yes, rollback otherwise
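The coordinator's steps can be sketched as below. This follows the usual formulation of two-phase commit, where the first message is a prepare request that each participant votes on. It is a toy, in-process model: the `Participant` class, the `can_commit` flag, and the `two_phase_commit` function are hypothetical, and it ignores timeouts, network failures, and coordinator crashes.

```python
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


class Participant:
    """Hypothetical participant; can_commit controls how it votes."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        self.state = "prepared" if self.can_commit else "aborted"
        return Vote.YES if self.can_commit else Vote.NO

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: ask every participant to prepare and collect all the votes.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if every vote is yes, otherwise roll back everywhere.
    decision = "commit" if all(v is Vote.YES for v in votes) else "rollback"
    for p in participants:
        if decision == "commit":
            p.commit()
        else:
            p.rollback()
    return decision


healthy = [Participant("a"), Participant("b")]
ok = two_phase_commit(healthy)

mixed = [Participant("a"), Participant("b", can_commit=False)]
failed = two_phase_commit(mixed)
print(ok, failed)
```

A single "no" vote is enough to roll back every participant, which is what keeps the servers consistent with each other.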