How to run workloads with your charm

There are several ways your charm might start a workload, depending on the type of charm you’re authoring. In the case of a Kubernetes charm, your workload is likely a container, but that may not be the case for a machine charm. Before writing the code to start your workload, recall the Lifecycle events section, and note that when the start event is emitted, charm authors should ensure their workloads are configured to “persist in a started state without further intervention from Juju or an administrator”.

Machine charms

For a machine charm, it is likely that packages will need to be fetched, installed and started to provide the desired charm functionality. This can be achieved by interacting with the system’s package manager, ensuring that package and service status is maintained by reacting to events accordingly.

Example

It is important to consider which events to respond to in the context of your charm. A simple example might be:

# ...
import logging

from subprocess import check_call, CalledProcessError

from ops.charm import CharmBase, InstallEvent, StartEvent
from ops.model import ActiveStatus, BlockedStatus
# ...

logger = logging.getLogger(__name__)


class MachineCharm(CharmBase):
    # ...

    def __init__(self, *args):
        super().__init__(*args)
        self.framework.observe(self.on.install, self._on_install)
        self.framework.observe(self.on.start, self._on_start)
        # ...

    def _on_install(self, event: InstallEvent) -> None:
        """Handle the install event."""
        try:
            # Install the openssh-server package using apt-get
            check_call(["apt-get", "install", "-y", "openssh-server"])
        except CalledProcessError as e:
            # If the command returns a non-zero return code, put the charm in blocked state
            logger.debug("Package install failed with return code %d", e.returncode)
            self.unit.status = BlockedStatus("Failed to install packages")

    def _on_start(self, event: StartEvent) -> None:
        """Handle the start event."""
        try:
            # Enable and start the ssh systemd unit (provided by openssh-server)
            check_call(["systemctl", "enable", "--now", "ssh"])
        except CalledProcessError as e:
            # If the command returns a non-zero return code, put the charm in blocked state
            logger.debug("Starting systemd unit failed with return code %d", e.returncode)
            self.unit.status = BlockedStatus("Failed to start/enable ssh service")
            return

        # Everything is awesome
        self.unit.status = ActiveStatus()

If the machine is likely to be long-running and endure multiple upgrades throughout its life, it may be prudent to re-check the package’s install status regularly, and handle the case where it needs upgrading or reinstalling. Consider this excerpt from the ubuntu-advantage charm code (with some additional comments):

class UbuntuAdvantageCharm(CharmBase):
    """Charm to handle ubuntu-advantage installation and configuration"""
    _state = StoredState()

    def __init__(self, *args):
        super().__init__(*args)
        self._state.set_default(hashed_token=None, package_needs_installing=True, ppa=None)
        self.framework.observe(self.on.config_changed, self.config_changed)

    def config_changed(self, event):
        """Install and configure ubuntu-advantage tools and attachment"""
        logger.info("Beginning config_changed")
        self.unit.status = MaintenanceStatus("Configuring")
        # Helper method to ensure a custom PPA from charm config is present on the system
        self._handle_ppa_state()
        # Helper method to ensure latest package is installed
        self._handle_package_state()
        # Handle some ubuntu-advantage specific configuration
        self._handle_token_state()
        # Set the unit status using a helper _handle_status_state
        if isinstance(self.unit.status, BlockedStatus):
            return
        self._handle_status_state()
        logger.info("Finished config_changed")

In the example above, the package install status is ensured each time the charm’s config-changed event fires, which should ensure correct state throughout the charm’s deployed lifecycle.
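
For illustration, a helper along those lines might look like the sketch below. The name _handle_package_state comes from the excerpt above, but this body is hypothetical rather than the actual ubuntu-advantage implementation, and it assumes the same subprocess imports as the earlier machine charm example:

    def _handle_package_state(self) -> None:
        """Ensure the package is installed (illustrative sketch, not the real charm code)."""
        if self._state.package_needs_installing:
            try:
                check_call(["apt-get", "update"])
                check_call(["apt-get", "install", "-y", "ubuntu-advantage-tools"])
                self._state.package_needs_installing = False
            except CalledProcessError:
                self.unit.status = BlockedStatus("Failed to install ubuntu-advantage-tools")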

Kubernetes charms

As described in the introduction, the preferred way to run workloads on Kubernetes with charms is to start your workload with Pebble. You do not need to modify upstream container images to make use of Pebble for managing your workload. The Juju controller automatically injects Pebble into workload containers using an Init Container and Volume Mount. The entrypoint of the container is overridden so that Pebble starts first and is able to manage running services. Charms communicate with the Pebble API using a UNIX socket, which is mounted into both the charm and workload containers.

By default, you’ll find the Pebble socket at /charm/container/pebble.socket in the workload container, and at /charm/containers/<container_name>/pebble.socket in the charm container.
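
Charm code rarely needs to open this socket by hand; the ops library does so for you when you interact with a container via self.unit.get_container(). Purely for illustration, a minimal sketch of talking to the Pebble API directly over the socket from the charm container (assuming a workload container named myapp) might look like:

from ops.pebble import Client

# Path of the socket as mounted into the charm container
# ('myapp' is an assumed container name)
client = Client(socket_path="/charm/containers/myapp/pebble.socket")
# Confirm the API is reachable by asking Pebble for system information
print(client.get_system_info().version)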

Most Kubernetes charms will need to define a containers map in their metadata.yaml in order to start a workload with a known OCI image:

# ...
containers:
  myapp:
    resource: myapp-image
  redis:
    resource: redis-image

resources:
  myapp-image:
    type: oci-image
    description: OCI image for my application
  redis-image:
    type: oci-image
    description: OCI image for Redis
# ...

In some cases, you may wish not to specify a containers map, which will result in an “operator-only” charm. These can be useful when writing “integrator charms” (sometimes known as “proxy charms”), which are used to represent some external service in the Juju model.

For each container, a resource of type oci-image must also be specified. The resource is used to inform the Juju controller how to find the correct OCI-compliant container image for your workload on Charmhub.

If multiple containers are specified in metadata.yaml (as above), each Pod will contain an instance of every specified container. Using the example above, each Pod would be created with a total of 3 running containers:

  • a container running the myapp-image
  • a container running the redis-image
  • a container running the charm code

The Juju controller emits a PebbleReadyEvent to a charm when Pebble has initialised its API in a container. These events are named <container_name>_pebble_ready. Using the example above, the charm would receive two Pebble-related events (assuming the Pebble API starts correctly in each workload), which it can observe as shown in the sketch below:

  • myapp_pebble_ready
  • redis_pebble_ready
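
Continuing that example, the charm would typically observe both events in its constructor; a minimal sketch (the handler names are placeholders) might look like:

class MyAppCharm(CharmBase):
    def __init__(self, *args):
        super().__init__(*args)
        # One pebble-ready event is emitted per container in metadata.yaml
        self.framework.observe(self.on.myapp_pebble_ready, self._on_myapp_pebble_ready)
        self.framework.observe(self.on.redis_pebble_ready, self._on_redis_pebble_ready)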

Example

Consider the following example snippet from a metadata.yaml:

# ...
containers:
  pause:
    resource: pause-image

resources:
  pause-image:
    type: oci-image
    description: Docker image for google/pause
# ...

Once the containers are initialised, the charm needs to tell Pebble how to start the workload. Pebble uses a series of “layers” for its configuration. Layers contain a description of the processes to run, along with the path and arguments to the executable, any environment variables to be specified for the running process and any relevant process ordering (more information available in the Pebble README).
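
As an illustrative sketch (the service names, commands and values here are hypothetical), a layer expressed as a Python dict might look like the following, with the after and requires keys controlling ordering between services:

layer = {
    "summary": "example layer",
    "description": "pebble config layer for a hypothetical two-service workload",
    "services": {
        "database": {
            "override": "replace",
            "command": "/usr/bin/database-server",
            "startup": "enabled",
        },
        "myapp": {
            "override": "replace",
            "command": "/usr/bin/myapp --port 8080",
            "startup": "enabled",
            "environment": {"LOG_LEVEL": "info"},
            # Ordering: only start myapp once database has been started
            "after": ["database"],
            "requires": ["database"],
        },
    },
}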

In many cases, using the container’s specified entrypoint may be desired. You can find the original entrypoint of an image locally like so:

$ docker pull <image>
$ docker inspect <image>

When using an OCI image that is not built specifically for use with Pebble, layers are defined at runtime using Pebble’s API. Recall that when Pebble has initialised in a container (and the API is ready), the Juju controller emits a PebbleReadyEvent to the charm. Often it is in the callback bound to this event that layers are defined and services started:

# ...
from ops.charm import CharmBase, PebbleReadyEvent
from ops.model import ActiveStatus
from ops.pebble import Layer
# ...

class PauseCharm(CharmBase):
    # ...
    def __init__(self, *args):
        super().__init__(*args)
        # Set a friendly name for your charm. This can be used with the Operator
        # framework to reference the container, add layers, or interact with
        # providers/consumers easily.
        self.name = "pause"
        # The name of the pebble-ready event is derived from the container
        # name in metadata.yaml, not from the service name in the layer.
        #
        # If you set self.name as above to match the container name, the
        # event will be <self.name>_pebble_ready
        self.framework.observe(self.on.pause_pebble_ready, self._on_pause_pebble_ready)
        # ...

    def _on_pause_pebble_ready(self, event: PebbleReadyEvent) -> None:
        """ Handle the pebble_ready event"""
        # You can get a reference to the container from the PebbleReadyEvent
        # directly with:
        # container = event.workload
        #
        # The preferred method is through get_container()
        container = self.unit.get_container(self.name)
        # Add our initial config layer, combining with any existing layer
        container.add_layer(self.name, self._pause_layer(), combine=True)
        # Start the services that specify 'startup: enabled'
        container.autostart()
        self.unit.status = ActiveStatus()

    def _pause_layer(self) -> Layer:
        """Returns Pebble configuration layer for google/pause"""
        return Layer(
            {
                "summary": "pause layer",
                "description": "pebble config layer for google/pause",
                "services": {
                    self.name: {
                        "override": "replace",
                        "summary": "pause service",
                        "command": "/pause",
                        "startup": "enabled",
                    }
                },
            }
        )
# ...

A common method for configuring container workloads is by manipulating environment variables. The layering in Pebble makes this easy. Consider the following extract from a config-changed callback which combines a new overlay layer (containing some environment configuration) with the current Pebble layer and restarts the workload:

# ...
from ops import pebble
from ops.pebble import Layer
# ...
def _on_config_changed(self, event: ConfigChangedEvent) -> None:
    """Handle the config-changed event."""
    # Get a reference to the container so we can manipulate it
    container = self.unit.get_container(self.name)

    # container.can_connect() provides a mechanism to ensure that
    # the Pebble API is ready before we try to interact with it
    if container.can_connect():
        try:
            # Pebble environment values must be strings
            timeout = str(self.model.config["timeout"])

            # Create a new config layer - specify 'override: merge' in
            # the 'pause' service definition to overlay with the existing layer
            layer = Layer(
                {
                    "services": {
                        "pause": {
                            "override": "merge",
                            "environment": {
                                "TIMEOUT": timeout,
                            },
                        }
                    },
                }
            )

            # Get the current services from the plan in the container
            services = container.get_plan().services
            # Check whether the 'pause' service's environment already matches,
            # so we can avoid unnecessarily restarting the service
            current_env = services["pause"].environment if "pause" in services else {}
            if current_env.get("TIMEOUT") != timeout:
                # Add the layer to Pebble
                container.add_layer(self.name, layer, combine=True)
                logger.debug("Added config layer to Pebble plan")

                # Start/restart the 'pause' service in the container
                container.restart("pause")
                logger.info("Restarted pause service")
            # All is well, set an ActiveStatus
            self.unit.status = ActiveStatus()

        except (pebble.PathError, pebble.ProtocolError) as e:
            # Handle errors raised while communicating with Pebble
            logger.error("Pebble API error: %s", e)
    # ...

In this example, each time a config-changed event is fired, a new overlay layer is created that only includes the environment config, populated using the charm’s config. The application is only restarted if the configuration has changed.
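
When debugging this kind of layering, it can be helpful to log the merged plan that results from combining all layers; for example, inside the handler above (a sketch, reusing the container reference and logger):

# Render the combined plan (all layers merged) as YAML for inspection
logger.debug("Current Pebble plan:\n%s", container.get_plan().to_yaml())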

