PostgreSQL Replication, High Availability HA and Scalability

3h 9m 35s
English
Paid

PostgreSQL is one of the most powerful and user-friendly database management systems. It is actively supported by the community and receives new releases annually. PostgreSQL supports the most advanced features of the SQL standard. It also offers NoSQL capabilities, a rich set of data types and extensions, making PostgreSQL an attractive solution for software systems.

Read more about the course

In this course, we explore the task of creating scalable solutions based on PostgreSQL, using the resources of multiple servers. There is a natural limitation for such systems – you always have to make a compromise between performance, reliability, and consistency. You can improve one aspect, but others may suffer as a result. In this course, we will see how to find the best balance for our tasks, to precisely understand which aspects require scaling and how to avoid the typical trade-offs in distributed systems.

Scaling PostgreSQL is a journey. After this course, you will be better able to evaluate scaling needs, understand how to scale reading and how to scale writing.

Each solution presented in this course will improve a particular aspect of scalability, but each one will also add complexity and possibly some limitations.

To begin with, it's important to ask the right questions to understand the system requirements, which is why we've dedicated an entire session to examining what questions we should ask ourselves before embarking on the path to scaling.

After this course, you will be better prepared and understand how to scale reading.

We have several options for replication, depending on whether performance or flexibility is more important to us.

Replication can be used as a backup or as a failover solution that activates in case of a primary server failure.

Additionally, replication can improve system performance as it allows distributing the load across multiple database servers.

Next, if one form of replication is set up, you might consider having several computers serve the same data.

For this, a query distribution mechanism is needed. We will explore two of the most popular available options here.

Furthermore, if the number of database connections is high, you will likely need a connection pool. We will also consider two options here.

We will also study how to scale writes and make traffic growth more predictable by adding queues to the architecture.

After this, we will look into partitioning for cases where it is necessary to handle large tables.

We will also study sharding for scaling writes and all the complex decisions associated with it.

Finally, we will briefly look into the multi-master solution, which is a relatively new concept but looks promising.

If our goal is simply high availability (HA) or the ability to continue operation even in the event of a failure of one of the servers in the cluster, we can consider only those solutions that are suitable for these tasks.

To ensure high availability, it's necessary to set up a replication strategy.

Then, tools that allow the standby server to quickly take over the load if the primary server fails can be used.

Watch Online PostgreSQL Replication, High Availability HA and Scalability

Join premium to watch
Go to premium
# Title Duration
1 Why Scale PostgreSQL? 01:27
2 Vertical Scaling 01:14
3 Horizontal Scaling 02:42
4 CAP Theorem Explained 03:49
5 PostgreSQL vs. NoSQL 02:00
6 Use case: Consistent and Available System 01:12
7 Use case: Available and Partition-tolerant System 01:02
8 Read Versus Write Bound Workload 02:46
9 How statistics will answer to all questions? 01:16
10 Enable Statistics 04:41
11 Replication 01:28
12 Load Balancing 01:30
13 Connection Pooling 01:21
14 Queuing 00:54
15 Partitioning 00:38
16 Sharding 00:49
17 Multi-master 01:46
18 What is Streaming Replication? 02:13
19 Asynchronous vs. Synchronous Replication 02:33
20 Hands-on - Initialise Primary Database 01:29
21 Configuring the Primary for Replication 03:47
22 Configuring the Replica Instance 03:01
23 Testing Replication Setup 02:53
24 What is Logical Replication in Postgres? 05:54
25 Setting-up Postgres Servers for Logical Replication 04:45
26 Selective Copy of the Data 03:16
27 Create the Publication 00:47
28 Create the Subscription 03:02
29 Limitations of Logical Replication 02:23
30 Monitoring Logical Replication 03:11
31 Best use-cases for Logical Replication 01:49
32 Introduction 00:53
33 Fundamental concepts of connection pooling 02:24
34 Building a PgBouncer Setup 02:00
35 Installing and Configure PgBouncer 00:38
36 Creating a basic configuration file for PgBouncer 03:16
37 Connecting to PgBouncer 01:14
38 Advanced Settings for Performance 02:56
39 Pool Modes 01:52
40 A simple benchmark 03:58
41 Introduction 00:27
42 Key Components 00:38
43 Key Characteristics of the Architecture 00:58
44 Creating PostgreSQL Instances on Google Cloud 04:37
45 Creating a GCE for HAProxy 04:00
46 Configure HAProxy for Load-Balancing 06:16
47 Testing Load-Balancing 03:50
48 Introduction 00:58
49 Which Tables Need Partitioning? 02:07
50 How should the Tables be Partitioned? 01:48
51 Declarative vs. Inheritance Partitioning 00:45
52 Creating a Partitioned Table 04:23
53 Partitioning Methods 01:46
54 Introduction 03:04
55 Pain Points of Sharding 04:16
56 How to Partition Data in PostgreSQL 03:24
57 Second Level Sharding 02:06
58 Querying Across Shards 01:39
59 Why High Availability? 00:39
60 Steps to achieve High Availability 01:57
61 Essential Questions to set-up High Availability 05:21
62 Log-Shipping Replication 01:24
63 Streaming Replication and Logical Replication 02:31
64 Cascading Replication 01:17
65 Synchronous vs. Asynchronous Replication 02:05
66 Automatic Failover and Always-on Strategy 01:21
67 Simple HA Solution Example 00:52
68 Better HA Solution Example 01:42
69 Introduction 00:38
70 Pgpool-II Features 03:19
71 Configure Pgpool-II with Streaming Replication 01:08
72 Setting up Streaming Replication 05:53
73 Configuring Pgpool-II for Load Balancing 06:42
74 Testing load-balancing & read/write separation 03:25
75 Configure Pgpool for PostgreSQL High-Availability 00:58
76 Configuring PostgreSQL Primary Server 02:34
77 Configuring Pgpool-II Server 02:25
78 Configuring PostgreSQL Replica Server 01:24
79 Testing The Failover 02:12
80 Restoring failed nodes 01:57

Similar courses to PostgreSQL Replication, High Availability HA and Scalability

Effective Database Design

Effective Database DesignBen Brumm

Category: Sql, Other (Backend)
Duration 1 hour 3 minutes 56 seconds
The Complete 2023 Web Development Bootcamp

The Complete 2023 Web Development Bootcampudemy

Category: MongoDB, Other (Backend), Other (Frontend)
Duration 62 hours 32 minutes 23 seconds
Web API From Start to Finish

Web API From Start to Finishiamtimcorey.com (Tim Corey)

Category: Other (Backend)
Duration 17 hours 59 minutes 54 seconds
Streaming Databases

Streaming Databasesoreillymedia

Category: Other (Backend)
Duration
Database Design & Implementation

Database Design & Implementationamigoscode (Nelson Djalo)

Category: Sql, Other (Backend)
Duration 2 hours 7 minutes 49 seconds
Fundamentals of Database Engineering

Fundamentals of Database Engineeringudemy

Category: Other (Backend)
Duration 26 hours 2 minutes 59 seconds
Getting Started With ClickHouse (2023 Edition)

Getting Started With ClickHouse (2023 Edition)udemy

Category: Other (Backend)
Duration 4 hours 38 minutes 56 seconds
Fundamentals of Backend Communications and Protocols

Fundamentals of Backend Communications and Protocolsudemy

Category: Other (Backend)
Duration 15 hours 35 minutes 37 seconds
PostgreSQL High Performance Tuning Guide

PostgreSQL High Performance Tuning Guideudemy

Category: Other (Backend)
Duration 2 hours 41 minutes 14 seconds
Fundamentals of Networking Engineering

Fundamentals of Networking Engineeringudemy

Category: Other (Backend)
Duration 13 hours 40 minutes 46 seconds