How to interact with Pebble

As mentioned in the introduction, the recommended way to create charms for Kubernetes is using the sidecar pattern with the workload container running Pebble.

Pebble is a lightweight, API-driven process supervisor designed for use with charms. If you specify the containers field in a charm’s metadata.yaml, Juju will deploy the charm code in a sidecar container, with Pebble running as the workload container’s ENTRYPOINT.

When the workload container starts up, Juju fires a PebbleReadyEvent, which can be handled using Framework.observe as shown in Framework Constructs under “Containers”. This gives the charm author access to event.workload, a Container instance.

The Container class has methods to modify the Pebble configuration “plan”, start and stop services, read and write files, and run commands. These methods use the Pebble API, which communicates from the charm container to the workload container using HTTP over a Unix domain socket.

The rest of this document provides details of how a charm interacts with the workload container via Pebble, using the Python Operator Framework Container methods.

Contents:

Configure the Pebble layer

Pebble services are configured by means of layers, with higher layers adding to or overriding lower layers, forming the effective Pebble configuration, or “plan”.

When a workload container is created and Pebble starts up, it looks in /var/lib/pebble/default/layers (if that exists) for configuration layers already present in the container image, such as 001-layer.yaml. If there are existing layers there, that becomes the starting configuration, otherwise Pebble is happy to start with an empty configuration, meaning no services.

In the latter case, Pebble is configured dynamically via the API by adding layers at runtime.

See the layer specification for more details.

Add a configuration layer

To add a configuration layer, call Container.add_layer with a label for the layer, and the layer’s contents as a YAML string, Python dict, or pebble.Layer object.

You can see an example of add_layer under the “Replan” heading. The combine=True argument tells Pebble to combine the named layer into an existing layer of that name (or add a layer if none by that name exists). Using combine=True is common when dynamically adding layers.

Because combine=True combines the layer with an existing layer of the same name, it’s normally used with override: replace in the YAML service configuration. This means replacing the entire service configuration with the fields in the new layer.

If you’re adding a single layer without combine=True on top of an existing base layer, you may want to use override: merge in the service configuration. This will merge the fields specified with the service by that name in the base layer. See an example of overriding a layer.

Fetch the effective plan

Charm authors can also introspect the current plan using Container.get_plan. It returns a pebble.Plan object whose services attribute maps service names to pebble.Service instances.

Below is an example of how you might use get_plan to introspect the current configuration, and only add the layer with its services if they haven’t been added already:

class MyCharm(CharmBase):
    ...

    def _on_config_changed(self, event):
        container = self.unit.get_container("main")
        plan = container.get_plan()
        if not plan.services:
            layer = {"services": ...}
            container.add_layer("layer", layer)
            container.start("svc")
        ...

Control and monitor services

The main purpose of Pebble is to control and monitor services, which are usually long-running processes like web servers and databases.

In the context of Juju sidecar charms, Pebble is run with the --hold argument, which prevents it from automatically starting the services marked with startup: enabled. This is to give the charm full control over when the services in Pebble’s configuration are actually started.

Replan

After adding a configuration layer to the plan (details below), you need to call replan to make any changes to services take effect. When you execute replan, Pebble will automatically restart any services that have changed, respecting dependency order. If the services are already running, it will stop them first using the normal stop sequence.

The reason for replan is so that you as a user have control over when the (potentially high-impact) action of stopping and restarting your services takes place.

Replan also starts the services that are marked as startup: enabled in the configuration plan, if they’re not running already.

Call Container.replan to execute the replan procedure. For example:

class SnappassTestCharm(CharmBase):
    ...

    def _start_snappass(self):
        container = self.unit.containers["snappass"]
        snappass_layer = {
            "services": {
                "snappass": {
                    "override": "replace",
                    "summary": "snappass service",
                    "command": "snappass",
                    "startup": "enabled",
                }
            },
        }
        container.add_layer("snappass", snappass_layer, combine=True)
        container.replan()
        self.unit.status = ActiveStatus()

Check container health

The Charmed Operator Framework provides a way to ensure that your container is healthy. In the Container class, Container.can_connect() can be used in a conditional statement as a guard around your code to ensure that Pebble is operational.

This provides a convenient pattern for ensuring that Pebble is ready, and obviates the need to include try/except statements around Pebble operations in every hook to account for a hook being called when your Juju unit is being started, stopped or removed: cases where the Pebble API is more likely to be still coming up, or being shutdown. It can also be used on startup to check whether Pebble has started or not outside of the pebble_ready hook.

Container.can_connect() will catch and log pebble.ConnectionError, pebble.APIError, and FileNotFoundError (in case the Pebble socket has disappeared as part of Charm removal). Other Pebble errors or exceptions should be handled as normal.

Start and stop

To start (or stop) one or more services by name, use the start and stop methods. Here’s an example of how you might stop and start a database service during a backup action:

class MyCharm(CharmBase):
    ...

    def _on_pebble_ready(self, event):
        container = event.workload
        container.start('mysql')

    def _on_backup_action(self, event):
        container = self.unit.get_container('main')
        if container.can_connect():
            try:
                container.stop('mysql')
                do_mysql_backup()
                container.start('mysql')
            except pebble.ProtocolError, pebble.PathError:
                # handle Pebble errors

It’s not an error to start a service that’s already started, or stop one that’s already stopped. These actions are idempotent, meaning they can safely be performed more than once, and the service will remain in the same state.

When Pebble starts a service, Pebble waits one second to ensure the process doesn’t exit too quickly – if the process exits within one second, the start operation raises an error and the service remains stopped.

To stop a service, Pebble first sends SIGTERM to the service’s process group to try to stop the service gracefully. If the process has not exited after 5 seconds, Pebble sends SIGKILL to the process group. If the process still doesn’t exit after another 5 seconds, the stop operation raises an error. If the process exits any time before the 10 seconds have elapsed, the stop operation succeeds.

Fetch service status

You can use the get_service and get_services methods to fetch the current status of one service or multiple services, respectively. The returned ServiceInfo objects provide a status attribute with various states, or you can use the ServiceInfo.is_running method.

Here is a modification to the start/stop example that checks whether the service is running before stopping it:

class MyCharm(CharmBase):
    ...

    def _on_backup_action(self, event):
        container = self.unit.get_container('main')
        if container.can_connect():
            is_running = container.get_service('mysql').is_running()
            if is_running:
                container.stop('mysql')
            do_mysql_backup()
            if is_running:
                container.start('mysql')

Send signals to services

From Juju version 2.9.22, you can use the Container.send_signal method to send a signal to one or more services. For example, to send SIGHUP to the hypothetical “nginx” and “redis” services:

container.send_signal('SIGHUP', 'nginx', 'redis')

This will raise an APIError if any of the services are not in the plan or are not currently running.

View service logs

Pebble stores service logs (stdout and stderr from services) in a ring buffer accessible via the pebble logs command. Each log line is prefixed with the timestamp and service name, using the format 2021-05-03T03:55:49.654Z [snappass] .... Pebble allocates a ring buffer of 100KB per service (not one ring to rule them all), and overwrites the oldest logs in the buffer when it fills up.

When running under Juju, the Pebble server is started with the --verbose flag, which means it also writes these logs to Pebble’s own stdout. That in turn is accessible via Kubernetes using the kubectl logs command. For example, to view the logs for the “redis” container, you could run:

microk8s kubectl logs -n snappass snappass-test-0 -c redis

In the command line above, “snappass” is the namespace (Juju model name), “snappass-test-0” is the pod, and “redis” the specific container defined by the charm configuration.

The files API

Pebble’s files API allows charm authors to read and write files on the workload container. You can write files (“push”), read files (“pull”), list files in a directory, make directories, and delete files or directories.

Push

Probably the most useful operation is Container.push, which allows you to write a file to the workload, for example, a PostgreSQL configuration file. You can use push as follows (note that this code would be inside a charm event handler):

config = """
port = 7777
max_connections = 1000
"""
container.push('/etc/pg/postgresql.conf', config, make_dirs=True)

The make_dirs=True flag tells push to create the intermediate directories if they don’t already exist (/etc/pg in this case).

There are many additional features, including the ability to send raw bytes (by providing a Python bytes object as the second argument) and write data from a file-like object. You can also specify permissions and the user and group for the file. See the API documentation for details.

Pull

To read a file from the workload, use Container.pull, which returns a file-like object that you can read().

The files API doesn’t currently support update, so to update a file you can use pull to perform a read-modify-write operation, for example:

# Update port to 8888 and restart service
config = container.pull('/etc/pg/postgresql.conf').read()
if 'port =' not in config:
    config += '\nport = 8888\n'
container.push('/etc/pg/postgresql.conf', config)
container.stop('postgresql')
container.start('postgresql')

If you specify the keyword argument encoding=None on the pull() call, reads from the returned file-like object will return bytes. The default is encoding='utf-8', which will decode the file’s bytes from UTF-8 so that reads return a Python str.

List files

To list the contents of a directory or return stat-like information about one or more files, use Container.list_files. It returns a list of pebble.FileInfo objects for each entry (file or directory) in the given path, optionally filtered by a glob pattern. For example:

infos = container.list_files('/etc', pattern='*.conf')
total_size = sum(f.size for f in infos)
logger.info('total size of config files: %d', total_size)
names = set(f.name for f in infos)
if 'host.conf' not in names:
    raise Exception('This charm requires /etc/host.conf!')

If you want information about the directory itself (instead of its contents), call list_files(path, itself=True).

Create directory

To create a directory, use Container.make_dir. It takes an optional make_parents=True argument (like mkdir -p), as well as optional permissions and user/group arguments. Some examples:

container.make_dir('/etc/pg', user='postgres', group='postgres')
container.make_dir('/some/other/nested/dir', make_parents=True)

Remove path

To delete a file or directory, use Container.remove_path. If a directory is specified, it must be empty unless recursive=True is specified, in which case the entire directory tree is deleted, recursively (like rm -r). For example:

# Delete Apache access log
container.remove_path('/var/log/apache/access.log')
# Blow away /tmp/mysubdir and all files under it
container.remove_path('/tmp/mysubdir', recursive=True)

Run commands

From Juju 2.9.17, Pebble includes an API for executing arbitrary commands on the workload container: the Container.exec method. It supports sending stdin to the process and receiving stdout and stderr, as well as more advanced options.

To run simple commands and receive their output, call Container.exec to start the command, and then use the returned Process object’s wait_output method to wait for it to finish and collect its output.

For example, to back up a PostgreSQL database, you might use pg_dump:

process = container.exec(['pg_dump', 'mydb'], timeout=5*60)
sql, warnings = process.wait_output()
if warnings:
    for line in warnings.splitlines():
        logger.warning('pg_dump: %s', line.strip())
# do something with "sql"

Handle errors

The exec method raises a pebble.APIError if basic checks fail and the command can’t be executed at all, for example, if the executable is not found.

The ExecProcess.wait and ExecProcess.wait_output methods raise pebble.ChangeError if there was an error starting or running the process, and pebble.ExecError if the process exits with a non-zero exit code.

In the case where the process exits via a signal (such as SIGTERM or SIGKILL), the exit code will be 128 plus the signal number. SIGTERM’s signal number is 15, so a process terminated via SIGTERM would give exit code 143 (128+15).

It’s okay to let these exceptions bubble up: Juju will mark the hook as failed and re-run it automatically. However, if you want fine-grained control over error handling, you can catch the ExecError and inspect its attributes. For example:

process = container.exec(['cat', '--bad-arg'])
try:
    stdout, _ = process.wait_output()
    print(stdout)
except pebble.ExecError as e:
    logger.error('Exited with code %d. Stderr:', e.exit_code)
    for line in e.stderr.splitlines():
        logger.error('    %s', line)

That will log something like this:

Exited with code 1. Stderr:
    cat: unrecognized option '--bad-arg'
    Try 'cat --help' for more information.

Use command options

The Container.exec method has various options (see full API documentation), including:

  • environment: a dict of environment variables to pass to the process
  • working_dir: working directory to run the command in
  • timeout: command timeout in seconds
  • user_id, user, group_id, group: UID/username and GID/group name to run command as

Here is a (contrived) example showing the use of most of these parameters:

process = container.exec(
    ['/bin/sh', '-c', 'echo HOME=$HOME, PWD=$PWD, FOO=$FOO'],
    environment={'FOO': 'bar'},
    working_dir='/tmp',
    timeout=5.0,
    user='bob',
    group='staff',
)
stdout, _ = process.wait_output()
logger.info('Output: %r', stdout)

This will execute the echo command in a shell and log something like Output: 'HOME=/home/bob, PWD=/tmp, FOO=bar\n'.

Use input/output options

The simplest way of receiving standard output and standard error is by using the ExecProcess.wait_output method as shown above. The simplest way of sending standard input to the program is as a string, using the stdin parameter to exec. For example:

process = container.exec(['tr', 'a-z', 'A-Z'],
                         stdin='This is\na test\n')
stdout, _ = process.wait_output()
logger.info('Output: %r', stdout)

By default, input is sent and output is received as Unicode using the UTF-8 encoding. You can change this with the encoding parameter (which defaults to utf-8). The most common case is to set encoding=None, which means “use raw bytes”, in which case stdin must be a bytes object and wait_output() returns bytes objects.

For example, the following will log Output: b'\x01\x02':

process = container.exec(['cat'], stdin=b'\x01\x02',
                         encoding=None)
stdout, _ = process.wait_output()
logger.info('Output: %r', stdout)

You can also pass file-like objects using the stdin, stdout, and stderr parameters. These can be real files, streams, io.StringIO instances, and so on. When the stdout and stderr parameters are specified, call the ExecProcess.wait method instead of wait_output, as output is being written, not returned.

For example, to pipe standard input from a file to the command, and write the result to a file, you could use the following:

with open('LICENSE.txt') as stdin:
    with open('output.txt', 'w') as stdout:
        process = container.exec(
            ['tr', 'a-z', 'A-Z'],
            stdin=stdin,
            stdout=stdout,
            stderr=sys.stderr,
        )
        process.wait()
# use result in "output.txt"

For advanced uses, you can also perform streaming I/O by reading from and writing to the stdin and stdout attributes of the ExecProcess instance. For example, to stream lines to a process and log the results as they come back, use something like the following:

process = container.exec(['cat'])

# Thread that sends data to process's stdin
def stdin_thread():
    try:
        for line in ['one\n', '2\n', 'THREE\n']:
            process.stdin.write(line)
            process.stdin.flush()
            time.sleep(1)
    finally:
        process.stdin.close()
threading.Thread(target=stdin_thread).start()

# Log from stdout stream as output is received
for line in process.stdout:
    logging.info('Output: %s', line.strip())

# Will return immediately as stdin was closed above
process.wait()

That will produce the following logs:

Output: 'one\n'
Output: '2\n'
Output: 'THREE\n'

Caution: it’s easy to get threading wrong and cause deadlocks, so it’s best to use wait_output or pass file-like objects to exec instead if possible.

Send signals to a running command

To send a signal to the running process, use ExecProcess.send_signal with a signal number or name. For example, the following will terminate the “sleep 10” process after one second:

process = container.exec(['sleep', '10'])
time.sleep(1)
process.send_signal(signal.SIGTERM)
process.wait()

Note that because sleep will exit via a signal, wait() will raise an ExecError with an exit code of 143 (128+SIGTERM):

Traceback (most recent call last):
  ..
ops.pebble.ExecError: non-zero exit code 143 executing ['sleep', '10']

Access the Pebble client directly

Occasionally charm code may want to access the lower-level Pebble API directly: the Container.pebble property returns the pebble.Client instance for the given container.

Below is a (contrived) example of an action that uses the Pebble client directly to call pebble.Client.get_changes:

from ops.pebble import ChangeState

class MyCharm(CharmBase):
    ...

    def show_pebble_changes(self):
        container = self.unit.get_container('main')
        client = container.pebble
        changes = client.get_changes(select=ChangeState.ALL)
        for change in changes:
            logger.info('Pebble change %d: %s', change.id, change.summary)

Last updated 9 days ago.