PostgreSQL tutorial: Get started with PostgreSQL 10

Get up and running—and avoid the setup pitfalls—with this enterprise-class, open-source database available for all major platforms

PostgreSQL tutorial: Get started with PostgreSQL 10
Thinkstock

For years now, PostgreSQL has been one of the best kept secrets of enterprise software and open source alike—a powerful, robust, expertly maintained relational database. MySQL may be better known, but PostgreSQL offers many features MySQL has only started to provide.

In this article I’ll walk through the steps to set up and get running a basic PostgreSQL installation. Most of this discussion will be aimed at folks who are starting entirely anew, but I’ll cover details involving migrating data from a previous PostgreSQL installation or another database solution. Finally, this article focuses on the PostgreSQL 10 family of releases, the most recent and current revision of PostgreSQL as of this writing.

Pick your PostgreSQL platform

PostgreSQL is available in binary form for all of the major operating systems: desktop and server editions of Windows, all the popular flavors of Linux, MacOS, and a few other OSes used in enterprises such as Solaris.

By and large, PostgreSQL’s capabilities are the same across the various platforms it supports. We’ll go into some of the exceptions to that rule in detail below, so bear in mind how those might affect any deployment or application development plans.

Install PostgreSQL on Linux

Linux distributions vary greatly in terms of how software is installed and maintained: different package managers, different package formats, different software repository philosophies. PostgreSQL isn’t exempt from these differences, which can be a barrier to obtaining the latest version of the software on your platform.

Ubuntu, for instance, maintains a version of PostgreSQL in its APT repositories. In the event it’s not the most recent version, you’ll need to add the apt.postgresql.org repository to your repo list and fetch PostgreSQL from there.

With Red Hat and Fedora Linux, an edition of PostgreSQL is included in the repo for each edition. Fedora is updated more frequently than Red Hat Enterprise Linux (about twice a year), and much quicker to embrace new versions of applications, but in either case the version of PostgreSQL made available through the repo may not be the one you want. For instance, in the default repos of Red Hat Enterprise Linux, you won’t find a version of PostgreSQL newer than 9.2.

Thus the most surefire way to install the most recent version of PostgreSQL for Red Hat or Fedora is to download the installation artifact directly from PostgreSQL via an interactive download page or from PostgreSQL’s yum repository.

Docker container images for every major edition of PostgreSQL on Linux are available from Docker Hub. Note that you will need to provide the container with a location for persistent data storage; the default location is /var/lib/postgresql/data. You can point to a different location by editing the PGDATA environment variable.

An additional consideration on Linux is which file system to use to store PostgreSQL data. Past benchmarks have generally favored XFS or Ext4, although as per the linked article they must be configured with specific options (e.g., nobarrier) for best results. Although the PostgreSQL FAQ cites ZFS for its delegated administration and snapshotting functionality, Ubuntu is the only major Linux distribution that bundles ZFS today; in other Linux distros, you will have to set up ZFS on your own.

Install PostgreSQL on MacOS

Mac users often rely on Homebrew, an open-source package manager for MacOS that has become the de facto way to install open source projects on the Mac from the command line. Homebrew allows you to install PostgreSQL and all its dependencies with one command (brew install postgresql).

Another way to install PostgreSQL on MacOS is by way of Postgres.app, a third-party repackaging of PostgreSQL as a conventional Mac app. The website provides installers for the latest versions of PostgreSQL and all the major PostgreSQL GUI clients, and even self-updates if needed. Postgres.app also performs some automatic preconfiguration for you, such as initializing the data directory and user database, and bundles many common tools like pg_upgrade.

Install PostgreSQL on Windows

Those who want to set up PostgreSQL on Windows have a choice of installers packaged by third parties—everything from simple .ZIP archives to interactive GUI-driven installers.

  • BigSQL’s graphical installer is basic and straightforward. It installs only PostgreSQL and pgAdmin, prompting the user during the setup process for basic install information—data directory, master password, etc.
  • The EnterpriseDB installer provides a far greater wealth of install-time options. Among them is the “Stack Builder,” a tool for downloading and installing a plethora of additional tools: database drivers, other editions of PostgreSQL (in the event you need to run them side by side), EnterpriseDB’s own PostgreSQL add-ons, and the Slony-I replication system for PostgreSQL. For experts, EnterpriseDB also offers a simple .ZIP archive of nothing more than PostgreSQL’s binaries for Windows.
  • If you want to run PostgreSQL on Windows without installing it or leaving any data behind, there’s PostgreSQL Portable. Packaged for the PortableApps application launcher, it provides a full PostgreSQL server instance in a single directory with no external dependencies. This is highly useful for testing on a desktop where software installations are restricted, or where you want to confine your PostgreSQL work to a non-system or removable drive.

One big caveat to keep in mind when running PostgreSQL on Windows is a documented limit—around 125—on the number of connections PostgreSQL can hold open at a given time. This limitation applies only when PostgreSQL is run as a service, as opposed to being invoked from the command line.

According to the PostgreSQL wiki, this is because “some of the libraries that PostgreSQL is dependent on are dependent on user32.dll which allocates memory from an area known as the Desktop Heap.” Because the heap has a fixed size, and because each connection depletes a certain amount of memory, failures may occur with more than approximately 125 connections.

That said, this isn’t likely to become an issue in most usage scenarios, since one rarely needs to make that many simultaneous direct connections to PostgreSQL. A well-designed application can use connection pooling to avoid running into a connection limit. And connection pooling is something you should be doing anyway, to keep the number of connections to the database commensurate with the number of hardware threads available on your server.

Configure PostgreSQL on Linux

If you use one of the graphical installers for PostgreSQL described above, particularly for Windows, you’ll be walked through the steps needed to have a running PostgreSQL installation. With Linux, however, in most cases you’ll need to take those steps yourself.

Configure PostgreSQL on Fedora Linux

PostgreSQL 10 is included in Fedora 28’s default repositories, so running sudo dnf install postgresql-server postgresql-contrib will install those bits directly. After that, you will likely want to enable PostgreSQL to start at boot time:

sudo systemctl enable postgresql

You will also need to manually run PostgreSQL’s setup script to create the initial databases:

sudo postgresql-setup --initdb --unit postgresql

Set the postgres system password, i.e. the password for the account the PostgreSQL service runs under:

sudo passwd postgres

Start the PostgreSQL server:

sudo systemctl start postgresql

Then set the password for the postgres internal user account, i.e. the account used to connect directly to PostgreSQL. Run the psql command-line app to connect to PostgreSQL, and issue the command \password postgres (note the leading backslash).

Many additional notes about installing PostgreSQL on Fedora Linux are available in the Fedora wiki, but these steps should suffice for a basic setup.

Configure PostgreSQL on Ubuntu Linux

Ubuntu 18.04 includes PostgreSQL 10 in its default repositories, so the setup process is similar to Fedora’s but more streamlined:

  • Install the base packages:
    sudo apt update
    sudo apt install postgresql postgresql-contrib
  • Set the user account password:
    sudo passwd postgres
  • Set the PostgreSQL service to auto-start:
    sudo systemctl enable postgresql
     
    sudo systemctl start postgresql
  • Change users and connect to the server to test:
    sudo -i -u postgres
    psql

Note that there is no postgresql-setup step required here. The installation process contains the triggers needed to perform that step.

Migrate data between PostgreSQL versions

One of the last steps you’ll take is to move your data from any old instances of PostgreSQL to your new instance. PostgreSQL, as of version 10, has three main ways to migrate data during an upgrade of the database.

  • The pg_upgrade utility is the preferred way to migrate data between versions of PostrgreSQL. It is typically used when you are installing a new version of PostgreSQL side-by-side with an old one; you run pg_upgrade from the new installation, point it at the old one, and let it move the data over. pg_upgrade takes advantage of consistencies between point revisions of the PostgreSQL data format to speed up the migration process, so it is not only the most convenient way to migrate but typically the fastest.
  • Another way to perform an upgrade/migration is to use PostgreSQL replication (e.g., Slony). Here you use the new version of PostgreSQL to create a standby server for a server running the older version. This method results in the least amount of downtime, because the replication happens in the background. Once the replication is finished, all you need to do is redirect all connections from the old server to the new one. However, if you’re not using replication in the first place, this might involve jumping through a few too many hoops.
  • The pg_dumpall utility exports the entire contents of a PostgreSQL server to one or more SQL dump files. The resulting script or archive file can then be executed on or imported to the new server. It’s also possible to set up the old and new servers side-by-side, and use pg_dumpall to “pipe” the dump operation between them. The biggest downside of using pg_dumpall is that it’s slow, but it can be used as an absolute fallback method.

Install pgAdmin 4 on Windows, MacOS, or Linux

A third-party administration tool for PostgreSQL, pgAdmin 4, provides a useful web GUI for managing a PostgreSQL installation. Unfortunately, the tool is somewhat ornery to get up and running, as it’s written in Python and has a great many dependencies from that language.

MacOS and Windows users can download a binary installer for pgAdmin 4 from the pgAdmin website. 

For Ubuntu, there’s an APT package. There is also a post in the Ask Ubuntu forum that runs down the steps needed to install pgAdmin 4, get it running, and create a convenient shortcut to a startup script.

For Fedora and Red Hat, there’s an RPM package. However, as of this writing the RPM package has a minor bug, where a key Python package is omitted. The recipe below, which reproduces many of the steps from the previous link, includes an additional step (line 2) that works around this:

$ sudo dnf install pgadmin4
$ sudo dnf install python3-flask-babelex
$ sudo /usr/pgadmin4/bin/pgadmin4-web-setup.sh
$ pgadmin4

The last command launches the pgAdmin 4 back end and, eventually, a web browser to view it.

If you’re a Docker user, you can avoid most of the headache by pulling and using a Docker container with pgAdmin 4 and all of its dependencies.

Copyright © 2018 IDG Communications, Inc.