module-puppetdb/README_GETTING_STARTED.md
2012-09-18 17:23:55 -07:00

puppetlabs/puppetdb - PuppetDB Management
-----------------------------------------

* Purpose: Install and manage the PuppetDB server and database, and
  configure the Puppet master to use PuppetDB
* Module: puppetlabs/puppetdb (http://forge.puppetlabs.com/cprice404/puppetdb)
* Puppet Version: 2.7+
* Platforms: RHEL6, Debian6, Ubuntu 10.04

One of the new projects that we at Puppet Labs are excited about right now is
PuppetDB, our new “data warehouse” for managing storage and retrieval of all
platform-generated data. (If you haven't checked it out yet, have a look at
[Nick Lewis' blog
post](http://puppetlabs.com/blog/introducing-puppetdb-put-your-data-to-work/) or
the [PuppetDB documentation](http://docs.puppetlabs.com/puppetdb/).) Currently,
it offers a huge performance improvement for exported and collected resources,
as well as several other great features. We're even more excited about some of
the not-quite-released functionality that is in the pipeline, so stay tuned for
more information!

Installing and configuring PuppetDB isn't *too* difficult, but we knew that it
could and should be even easier than it was. That's where the new
`puppetlabs/puppetdb` module comes in. Whether you just want to throw PuppetDB
onto a test system as quickly as possible so that you can check it out, or you
want finer-grained access to managing the individual settings and configuration,
this module aims to let you dive in at exactly the level of involvement that you
desire.

Here are some of the capabilities of the new 1.0 release of the `puppetdb`
module; almost all of these are optional, so you are free to pick and choose
which ones suit your needs:

* Installs and manages the core PuppetDB server
* Installs and manages the underlying database server (PostgreSQL or a simple
  embedded database)
* Configures your Puppet master to use PuppetDB
* Optional support for opening the PuppetDB port in your firewall on
  RedHat-based distros
* Validates your database connection before applying PuppetDB configuration
  changes, to help make sure that PuppetDB doesn't end up in a broken state
* Validates your PuppetDB connection before applying configuration changes to
  the Puppet master, to help make sure that your master doesn't end up in a
  broken state

Installing the module
---------------------
Installing the PuppetDB module is a breeze using the Puppet module tool
(available in Puppet 2.7.14+ and Puppet Enterprise 2.5+):

    $ puppet module install puppetlabs/puppetdb
    Preparing to install into /etc/puppet/modules ...
    Downloading from http://forge.puppetlabs.com ...
    Installing -- do not interrupt ...
    /etc/puppet/modules
    └─┬ puppetlabs-puppetdb (v0.1.1)
      ├── cprice404-inifile (v0.0.2)
      ├─┬ inkling-postgresql (v0.3.0)
      │ └── puppetlabs-stdlib (v3.0.1)
      └── puppetlabs-firewall (v0.0.4)
    $

Resource Overview
-----------------
Let's take a quick peek at the main classes and types defined by the module.
(We'll take a more in-depth look, with examples, in the following section.)

##### `puppetdb` class
This is a sort of all-in-one class for the PuppetDB server. It'll get you up
and running with everything you need (including database setup and management)
on the server side. The only other thing you'll need to do is to configure your
Puppet master to use PuppetDB... which leads us to:

##### `puppetdb::master::config` class
This class should be used on your Puppet master node. It'll verify that it can
successfully communicate with your PuppetDB server, and then configure your
master to use PuppetDB.

***NOTE***: Using this class involves allowing the module to manipulate your
Puppet configuration files, in particular `puppet.conf` and `routes.yaml`. The
`puppet.conf` changes are supplemental and should not affect any of your existing
settings, but the `routes.yaml` file will be overwritten entirely. If you have an
existing `routes.yaml` file, use the `manage_routes` parameter of this class to
prevent the module from managing that file; you'll then need to manage it
yourself.

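
For instance, if you already maintain your own `routes.yaml`, a declaration
along the following lines should leave that file alone. This is only a minimal
sketch: the `manage_routes` parameter is the one named above, but the value
`false` is an assumption about what it accepts, so check the class
documentation before relying on it.

```puppet
node puppetmaster {
  class { 'puppetdb::master::config':
    # Assumption: a boolean false tells the module not to touch routes.yaml.
    # You are then responsible for managing that file yourself.
    manage_routes => false,
  }
}
```
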
##### `puppetdb::server` class
This is for managing the PuppetDB server independently of the underlying
database that it depends on; so it'll manage the PuppetDB package, service,
config files, etc., but will allow you to manage the database (e.g. postgresql)
however you see fit.

##### `puppetdb::database::postgresql` class
This is a class for managing a postgresql server for use by PuppetDB. It can
manage the postgresql packages and service, as well as creating and managing the
puppetdb database and database user accounts.
##### Low-level classes
There are several lower-level classes in the module (e.g., `puppetdb::master::*`
and `puppetdb::server::*`) which you can use to manage individual configuration
files or other parts of the system. In the interest of brevity, we'll skip over
those for now... but if you need more fine-grained control over your setup, feel
free to dive into the module and have a look!

Example Usage
-------------
Enough with the gory details, let's talk about how to actually use the thing!
When you are first getting started with PuppetDB, there are a few decisions
you'll have to make:

* Which database back-end should I use? (The current choices are PostgreSQL or
  our embedded database; we'll discuss this more a bit later on.)
* Should I run the database on the same node that I run PuppetDB on?
* Should I run PuppetDB on the same node that I run my master on?

The answers to those questions will be largely dependent on your Puppet
environment. How many nodes are you managing? What kind of hardware are you
running on? Is your current load approaching the limits of your hardware?

### The Simple Case
Since I won't be able to answer all of those questions for you, we'll start off
with the absolute simplest case: using our default database (PostgreSQL), and
running everything (PostgreSQL, PuppetDB, Puppet master) all on the same node.
This setup will be great for testing or experimental environments, and may be
sufficient for many real-world deployments depending on the number of nodes
you're managing. So, what would our manifest look like in this case?


    node puppetmaster {
      # Configure puppetdb and its underlying database
      include puppetdb
      # Configure the puppet master to use puppetdb
      include puppetdb::master::config
    }

That's it! Obviously, you can provide some parameters for these classes if
you'd like more control, but that is literally all that it will take to get you
up and running with the default configuration. Here are the steps that this
manifest will trigger:

* Install PostgreSQL on the node if it's not already there
* Create the PuppetDB postgres database instance and user account
* Validate the postgres connection and, if successful, install and configure
  PuppetDB
* Validate the PuppetDB connection and, if successful, modify the Puppet master
  config files to use PuppetDB
* Restart the Puppet master so that it will pick up the config changes

If your logging level is set to INFO or finer, you should start seeing
PuppetDB-related log messages appear in both your Puppet master log and your
PuppetDB log as subsequent agent runs occur.

Note: If you'd prefer to use PuppetDB's embedded database rather than
PostgreSQL, have a look at the `database` parameter on the `puppetdb` class. The
embedded db can be useful for testing and very small production environments,
but is not recommended for larger production environments, because its memory
consumption grows considerably as your number of nodes increases.

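
As a rough sketch, switching the simple single-node setup over to the embedded
database might look like the following. The `database` parameter is the one
mentioned in the note above; the value `'embedded'` is an assumption about what
the class accepts, so confirm it against the `puppetdb` class documentation.

```puppet
node puppetmaster {
  class { 'puppetdb':
    # Assumption: 'embedded' selects the embedded database instead of
    # the default PostgreSQL back-end.
    database => 'embedded',
  }
  # Configure the puppet master to use puppetdb, as in the simple case
  include puppetdb::master::config
}
```
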
### A Distributed Setup
In many cases, you'll prefer not to install PuppetDB on the same node as the
Puppet master. Your environment will be easier to scale if you are able to
dedicate hardware to the individual system components. You may even choose to
run the PuppetDB server on a different node from the PostgreSQL database that it
uses to store its data. So let's have a look at what a manifest for that
scenario might look like:


    # This is an example of a very basic 3-node setup for PuppetDB.

    # This node is our Puppet master.
    node puppet {
      # Here we configure the puppet master to use PuppetDB,
      # and tell it that the hostname is 'puppetdb'
      class { 'puppetdb::master::config':
        puppetdb_server => 'puppetdb',
      }
    }

    # This node is our postgres server
    node puppetdb-postgres {
      # Here we install and configure postgres and the puppetdb
      # database instance, and tell postgres that it should
      # listen for connections to the hostname 'puppetdb-postgres'
      class { 'puppetdb::database::postgresql':
        listen_addresses => 'puppetdb-postgres',
      }
    }

    # This node is our main puppetdb server
    node puppetdb {
      # Here we install and configure PuppetDB, and tell it where to
      # find the postgres database.
      class { 'puppetdb::server':
        database_host => 'puppetdb-postgres',
      }
    }

That's it! This should be all it takes to get a 3-node, distributed
installation of PuppetDB up and running. Note that if you prefer, you could
easily move two of these classes to a single node and end up with a 2-node setup
instead.

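
One possible 2-node variant, sketched here using only the classes already
introduced, keeps the Puppet master on its own node and puts PuppetDB together
with its PostgreSQL database on the second node via the all-in-one `puppetdb`
class. The node names are illustrative, and this assumes the class defaults
allow the master to reach PuppetDB over the network; check the class parameters
if your setup differs.

```puppet
# Node 1: the Puppet master, pointed at the PuppetDB host
node puppet {
  class { 'puppetdb::master::config':
    puppetdb_server => 'puppetdb',
  }
}

# Node 2: PuppetDB and its PostgreSQL database on the same machine,
# managed together by the all-in-one class
node puppetdb {
  include puppetdb
}
```
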
### Cross-node Dependencies
If you're playing along at home, you may have spotted some cross-node
dependencies here and you've probably recognized that the order that these nodes
check in with the puppet master will have serious implications for getting
everything up and running. It would be very bad to configure the master to use
the PuppetDB server before that server was up and running. Likewise, it
wouldn't be great to try to start up the PuppetDB server pointing to a Postgres
server that isn't actually running Postgres yet.

The module handles this problem for you by taking a sort of “eventual
consistency” approach. There's nothing that the module can do to control the
order in which your nodes check in, but the module *can* check to verify that
the services it depends on are up and running before it makes configuration
changes--so that's what it does.

When your Puppet master node checks in, it will validate the connectivity to the
PuppetDB server before it applies its changes to the Puppet master config files.
If it can't connect to PuppetDB, then the puppet run will fail and the previous
config files will be left intact. This prevents your master from getting into a
broken state where all incoming Puppet runs fail because the master is
configured to use a PuppetDB server that doesn't exist yet. The same strategy
is used to handle the dependency between the PuppetDB server and the postgres
server.

What does this all mean to you, as a user? Well, it basically means that the
first time you add this stuff to your manifests, you may see a few failed Puppet
runs on the affected nodes. This should be limited to 1 failed run on the
PuppetDB node, and up to 2 failed runs on the Puppet master node. After that,
all of the dependencies should be satisfied and your puppet runs should start to
succeed again.

If you prefer, you can manually trigger puppet runs on the nodes in the correct
order (Postgres, PuppetDB, Puppet master) and you should avoid any failed runs.

Configuring the module
----------------------
The module supports a large number of configuration options. If you'd like more
control over things like:

* whether or not to open the PuppetDB port on the firewall
* what address the PuppetDB server should listen on
* what version of PuppetDB to use
* what address the PostgreSQL server should listen on
* PostgreSQL database name, username, password, etc.
* custom paths to various configuration files

and more, please take a peek at the individual classes. They expose a large
number of parameters and should hopefully be documented fairly well. (We won't
cover them here since this post has already gotten a bit long-winded, if I do
say so myself, but perhaps we'll do a follow-up blog post in the future that
goes into greater detail.)

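
To give a flavor of what such customization could look like, here is a hedged
sketch that overrides a few settings on the `puppetdb` class. Every parameter
name and value below is an assumption chosen to match the options listed above,
not something confirmed by this post; verify each one against the class
documentation before using it.

```puppet
node puppetdb {
  class { 'puppetdb':
    # All names and values here are illustrative assumptions.
    listen_address    => '0.0.0.0',    # address the PuppetDB server listens on
    puppetdb_version  => 'latest',     # which version of PuppetDB to install
    database_name     => 'puppetdb',   # PostgreSQL database name
    database_username => 'puppetdb',   # PostgreSQL user
    database_password => 'change-me',  # placeholder; supply your own secret
  }
}
```
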
Conclusion
----------
That's about it for now. We hope that this module makes it So Darn Easy to get
up and running with PuppetDB that you simply can't come up with any more excuses
not to go ahead and do it right now! We think you'll be happy you did--not only
because of its current power and features, but also because of all of the great
things we have in store for it in the near future.

If you have any questions, suggestions, or feedback, please send them to Ryan
or Chris! If there's a setting that you'd like to be able to manage that we
haven't exposed yet, let us know, or better yet, file a pull request to the
module project: https://github.com/puppetlabs/puppetlabs-puppetdb