PostgreSQL Replication, High Availability HA and Scalability
PostgreSQL is one of the most powerful and user-friendly database management systems. It is actively supported by the community and receives new releases annually. PostgreSQL supports the most advanced features of the SQL standard. It also offers NoSQL capabilities, a rich set of data types and extensions, making PostgreSQL an attractive solution for software systems.
Read more about the course
In this course, we explore the task of creating scalable solutions based on PostgreSQL, using the resources of multiple servers. There is a natural limitation for such systems – you always have to make a compromise between performance, reliability, and consistency. You can improve one aspect, but others may suffer as a result. In this course, we will see how to find the best balance for our tasks, to precisely understand which aspects require scaling and how to avoid the typical trade-offs in distributed systems.
Scaling PostgreSQL is a journey. After this course, you will be better able to evaluate scaling needs, understand how to scale reading and how to scale writing.
Each solution presented in this course will improve a particular aspect of scalability, but each one will also add complexity and possibly some limitations.
To begin with, it's important to ask the right questions to understand the system requirements, which is why we've dedicated an entire session to examining what questions we should ask ourselves before embarking on the path to scaling.
After this course, you will be better prepared and understand how to scale reading.
We have several options for replication, depending on whether performance or flexibility is more important to us.
Replication can be used as a backup or as a failover solution that activates in case of a primary server failure.
Additionally, replication can improve system performance as it allows distributing the load across multiple database servers.
Next, if one form of replication is set up, you might consider having several computers serve the same data.
For this, a query distribution mechanism is needed. We will explore two of the most popular available options here.
Furthermore, if the number of database connections is high, you will likely need a connection pool. We will also consider two options here.
We will also study how to scale writes and make traffic growth more predictable by adding queues to the architecture.
After this, we will look into partitioning for cases where it is necessary to handle large tables.
We will also study sharding for scaling writes and all the complex decisions associated with it.
Finally, we will briefly look into the multi-master solution, which is a relatively new concept but looks promising.
If our goal is simply high availability (HA) or the ability to continue operation even in the event of a failure of one of the servers in the cluster, we can consider only those solutions that are suitable for these tasks.
To ensure high availability, it's necessary to set up a replication strategy.
Then, tools that allow the standby server to quickly take over the load if the primary server fails can be used.
Watch Online PostgreSQL Replication, High Availability HA and Scalability
# | Title | Duration |
---|---|---|
1 | Why Scale PostgreSQL? | 01:27 |
2 | Vertical Scaling | 01:14 |
3 | Horizontal Scaling | 02:42 |
4 | CAP Theorem Explained | 03:49 |
5 | PostgreSQL vs. NoSQL | 02:00 |
6 | Use case: Consistent and Available System | 01:12 |
7 | Use case: Available and Partition-tolerant System | 01:02 |
8 | Read Versus Write Bound Workload | 02:46 |
9 | How statistics will answer to all questions? | 01:16 |
10 | Enable Statistics | 04:41 |
11 | Replication | 01:28 |
12 | Load Balancing | 01:30 |
13 | Connection Pooling | 01:21 |
14 | Queuing | 00:54 |
15 | Partitioning | 00:38 |
16 | Sharding | 00:49 |
17 | Multi-master | 01:46 |
18 | What is Streaming Replication? | 02:13 |
19 | Asynchronous vs. Synchronous Replication | 02:33 |
20 | Hands-on - Initialise Primary Database | 01:29 |
21 | Configuring the Primary for Replication | 03:47 |
22 | Configuring the Replica Instance | 03:01 |
23 | Testing Replication Setup | 02:53 |
24 | What is Logical Replication in Postgres? | 05:54 |
25 | Setting-up Postgres Servers for Logical Replication | 04:45 |
26 | Selective Copy of the Data | 03:16 |
27 | Create the Publication | 00:47 |
28 | Create the Subscription | 03:02 |
29 | Limitations of Logical Replication | 02:23 |
30 | Monitoring Logical Replication | 03:11 |
31 | Best use-cases for Logical Replication | 01:49 |
32 | Introduction | 00:53 |
33 | Fundamental concepts of connection pooling | 02:24 |
34 | Building a PgBouncer Setup | 02:00 |
35 | Installing and Configure PgBouncer | 00:38 |
36 | Creating a basic configuration file for PgBouncer | 03:16 |
37 | Connecting to PgBouncer | 01:14 |
38 | Advanced Settings for Performance | 02:56 |
39 | Pool Modes | 01:52 |
40 | A simple benchmark | 03:58 |
41 | Introduction | 00:27 |
42 | Key Components | 00:38 |
43 | Key Characteristics of the Architecture | 00:58 |
44 | Creating PostgreSQL Instances on Google Cloud | 04:37 |
45 | Creating a GCE for HAProxy | 04:00 |
46 | Configure HAProxy for Load-Balancing | 06:16 |
47 | Testing Load-Balancing | 03:50 |
48 | Introduction | 00:58 |
49 | Which Tables Need Partitioning? | 02:07 |
50 | How should the Tables be Partitioned? | 01:48 |
51 | Declarative vs. Inheritance Partitioning | 00:45 |
52 | Creating a Partitioned Table | 04:23 |
53 | Partitioning Methods | 01:46 |
54 | Introduction | 03:04 |
55 | Pain Points of Sharding | 04:16 |
56 | How to Partition Data in PostgreSQL | 03:24 |
57 | Second Level Sharding | 02:06 |
58 | Querying Across Shards | 01:39 |
59 | Why High Availability? | 00:39 |
60 | Steps to achieve High Availability | 01:57 |
61 | Essential Questions to set-up High Availability | 05:21 |
62 | Log-Shipping Replication | 01:24 |
63 | Streaming Replication and Logical Replication | 02:31 |
64 | Cascading Replication | 01:17 |
65 | Synchronous vs. Asynchronous Replication | 02:05 |
66 | Automatic Failover and Always-on Strategy | 01:21 |
67 | Simple HA Solution Example | 00:52 |
68 | Better HA Solution Example | 01:42 |
69 | Introduction | 00:38 |
70 | Pgpool-II Features | 03:19 |
71 | Configure Pgpool-II with Streaming Replication | 01:08 |
72 | Setting up Streaming Replication | 05:53 |
73 | Configuring Pgpool-II for Load Balancing | 06:42 |
74 | Testing load-balancing & read/write separation | 03:25 |
75 | Configure Pgpool for PostgreSQL High-Availability | 00:58 |
76 | Configuring PostgreSQL Primary Server | 02:34 |
77 | Configuring Pgpool-II Server | 02:25 |
78 | Configuring PostgreSQL Replica Server | 01:24 |
79 | Testing The Failover | 02:12 |
80 | Restoring failed nodes | 01:57 |