This is likely to be a controversial change so I wanted to put some
explanation of our reasoning into the commit message. This gets
kind of complex so I'll start with the problem and then the reasoning.
Problem:
We rely heavily on the ability to uninstall and reinstall postgres
throughout our testing code, testing features like "can I move from the
distribution packages to the upstream packages through the module" and
over time we've learnt that the uninstall code simply doesn't work a lot
of the time. It leaves traces of postgres behind or fails to remove
certain packages on Ubuntu, and generally causes bits to be left on your
system that you didn't expect.
When we then reinstall things fail because it's not a true clean slate,
and this causes us enormous problems during test. We've spent weeks and
months working on these tests and they simply don't hold up well across
the full range of PE platforms.
Reasoning:
Due to all these problems we've decided to take a stance on uninstalling
in general. We feel that in 2014 it's completely reasonable and normal
to have a good provisioning pipeline combined with your configuration
management and the "correct" way to uninstall a fully installed service
like postgresql is to simply reprovision the server without it in the
first place. As a general rule this is how I personally like to work
and I think is a good practice.
WAIT A MINUTE:
We understand that there are environments and situations in which it's
not easy to do that. What if you accidently deployed Postgres on
100,000 nodes? When this work is finished I'm going to take a look at
building some example 'profiles' to be found under examples/ within this
module that can uninstall postgres on popular platforms. These can be
modified and used in your specific case to uninstall postgresql. They
will be much more brute force and reliant on deleting entire directories
and require you to do more work up front in specifying where things are
installed but we think it'll prove to be a much cleaner mechanism for
this kind of thing rather than trying to weave it into the main module
logic itself.
This doesn't fix the root cause of the issue, such as the fact that
dpkg can't do wildcard removals, and the uninstall fails when you're
passing in a version number like this, but THIS test doesn't care, it
just wants to make sure the deprecation warning appears in the first
place.
This does however make the tests pass on 14.04.
The validate_db_connection class takes a user to connect as, but if we're
using the progresql::server::db defined type then the user might not be
created yet, and might not have any permissions granted yet. This patch
users a collector to ensure that the that the user and grants are active
before validating.
On FreeBSD systems the $user variable is not 'postgres' so does not
match the default database correctly. These changes use the existing
default_database parameter to replace instances where $user is passed as
the database to be connected to.
These changes are in server::database, server::role and
server::grant.
At present, the ownership of pg_hba.conf is hardwired to be uid 0. It should have the same ownership as all of the other postgressql configuration files in the same cluster so that they can be managed/edited by the postgres role user (system) account.
The warnings are as follows:
Warning: Scope(Concat::Fragment[pg_hba_rule_deny access to postgresql user]): The $mode parameter to concat::fragment is deprecated and has no effect
Warning: Scope(Concat::Fragment[pg_hba_rule_deny access to postgresql user]): The $owner parameter to concat::fragment is deprecated and has no effect
E.g. pe-postgresql does NOT use postgres as the default database name.
It uses pe-postgres. So if there is no way to specify a default database
name, the postgesql::validate_db_connection resource in
postgresql::server::service will ALWAYS fail. This commit exposes the
parameter in order to avoid that situation.
Since the class is now throwing an error when you use the class directly,
I'm just removing it.
We left this in from the last rewrite as someone reported an issue a long
time ago, but alas we have been unable to prove its a problem.
Signed-off-by: Ken Barber <ken@bob.sh>
This patch is a fix for the race condition that keeps occuring during
postgresql setup. Its very rare on its own, but when you are using this
module in a CI environment it happens quite frequently.
Basically what happens is that sometimes the service will announce the
database has started, but really it is still working in the background.
Sometimes the unix socket may not be listening, and sometimes the
system is still loading and you get a weird client error.
The fix itself is a modification to postgresql::validate_db_connection
so that it is able to connect on the local unix socket, plus retry
until the database is available.
This new and improved validate_db_connection can then be put into the
build pipeline (in the service class in particular) to ensure the
database is started before continuing on with the remaining steps.
This in effect blocks the puppet module from continuing until the
postgresql database is fully started and able to receive connections
which is perfect.
Tests and documentation provided.
Signed-off-by: Ken Barber <ken@bob.sh>