Charm development best practices
This document describes the best practices for charm development.
This is likely to change as the ecosystem and practices mature, so check back from time to time to stay up to date.
Contents:
- Conventions
- Code style
- Patterns
- Testing
- Recommended tooling
- Common integrations
- Example repositories
Conventions
Naming
Because it is common to develop multiple charms for the same underlying technology or software, clear and consistent naming is a vital part of creating a new charm. Taking Prometheus as an example, multiple charms exist that cover different scenarios:
- Running on bare metal as a machine charm
- Running in Kubernetes as a k8s charm
To make it obvious to the user which one to use, we’ve put together a separate document detailing the do’s and don’ts of charm naming, which is available here.
When naming configuration items or actions, the convention is to use lowercase alphanumeric names, separated with underscores if required. For example, timeout or enable_feature.
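As a rough illustration of this convention, a name check could look like the sketch below. The exact pattern (for example, rejecting leading, trailing or doubled underscores) is an assumption made for this example, and is_conventional_name is a hypothetical helper, not part of any charm tooling.

```python
import re

# Assumed reading of the convention: groups of lowercase letters and digits,
# joined by single underscores.
_NAME_PATTERN = re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)*")


def is_conventional_name(name: str) -> bool:
    """Return True if `name` looks like a conventional config or action name."""
    return _NAME_PATTERN.fullmatch(name) is not None
```

With this pattern, timeout and enable_feature pass, while enableFeature and enable-feature do not.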
Generality
When building your charm or library, pay attention to wider usability; try not to encapsulate logic that is highly specific to your use-case. There is always a balance between configurability and simplicity, but consider how your implementation would meet the use-cases of others trying to consume your charm or library.
State
Charms should generally aim to be stateless. In the event that a charm needs to track state between invocations, it should create an instance of StoredState in a class attribute called _stored. It is worth noting that any StoredState is specific to a single unit rather than shared across an application, and will be lost if the unit is restarted (for example, if a Kubernetes pod is rescheduled).
For sharing state between units of the same application, we instead recommend using peer relation data bags.
Do not track the emission of events, or elements relating to the charm’s lifecycle, in stored state. Where possible, construct this information by accessing the model, i.e. self.model, and the charm’s relations, whether peer relations or otherwise.
Resources
Resources can either be of the type oci-image or file. When providing binary files as resources, you need to provide binaries for all CPU architectures your binary might end up being run on. An example of this can be found here.
At the time of writing, we try to include amd64 and arm64 versions of such resources, but you should also make sure that you implement the usage of these resources in such a way that the user may build a binary for their architecture of choice and supply it themselves. An example of this can be found here.
Relations
A relation is formed between a “providing” and a “requiring” charm sharing the same interface. We recommend using Charm Libraries to distribute code that simplifies implementing any relation for people who wish to integrate with your application. Generally, we recommend naming a charm library after the relation interface it manages (example).
To keep this neat and easy to reason about, we implement a separate class for each side of the relation in the same library, for instance:
class MetricsEndpointProvider(Object):
    # …

class MetricsEndpointRequirer(Object):
    # …
These classes should do whatever is necessary to handle any relation events specific to the relation interface you are implementing, throughout the lifecycle of the application. By passing the charm object into the constructor of either the Provider or Requirer, you can gain access to the on attribute of the charm and register event handlers on behalf of the charm, as required.
Application and unit statuses
In general, you should aim to only make changes to the charm’s application or unit status directly within an event handler. This makes it easier for other developers to understand which states you intend the charm to be in, and when. If status is set across multiple event handlers, helper methods and other functions, this can be more difficult.
An example:
from ops.charm import CharmBase
from ops.model import ActiveStatus

class MyCharm(CharmBase):
    # This is an event handler, and can therefore set status
    def _on_config_changed(self, event):
        if self._some_helper():
            self.unit.status = ActiveStatus()

    # This is a helper method, not an event handler, so don't set status here
    def _some_helper(self):
        # do stuff
        return True
When authoring libraries, consider that charm authors should always be given the option to decide whether something going wrong in a library function should be considered status-altering for their charm. To facilitate this, libraries should never mutate the status of a unit or application. Instead, raise exceptions and let them bubble back up to the charm for the charm author to handle as they see fit.
In cases where the library has a suggested default status to be raised, use a custom exception with a .status property containing the suggested charm status, as shown here or here. The calling charm can then choose to accept the default by setting self.unit.status to raised_exception.status, or do something else.
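The pattern can be sketched roughly as follows. To keep the example runnable without the ops package, Status is a stand-in for a real charm status class, and the handler returns a status instead of assigning self.unit.status. DatabaseNotReadyError, fetch_endpoint and on_config_changed are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Status:
    """Stand-in for a charm status object (assumption, not the ops class)."""

    name: str
    message: str = ""


class DatabaseNotReadyError(Exception):
    """Hypothetical library error that carries a suggested status."""

    def __init__(self, message: str):
        super().__init__(message)
        self.message = message

    @property
    def status(self) -> Status:
        # The library only suggests a status; it never sets one itself.
        return Status("blocked", self.message)


def fetch_endpoint(relation_data: dict) -> str:
    """Hypothetical library function: raises instead of mutating status."""
    if "endpoint" not in relation_data:
        raise DatabaseNotReadyError("waiting for database endpoint")
    return relation_data["endpoint"]


def on_config_changed(relation_data: dict) -> Status:
    """Simplified charm-side handler: the charm decides how to react."""
    try:
        fetch_endpoint(relation_data)
    except DatabaseNotReadyError as error:
        # Accept the library's suggestion; a charm could equally ignore it.
        return error.status
    return Status("active")
```

In a real charm, the except branch would typically assign the suggested status to self.unit.status inside the event handler.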
The only exception to this guidance is when authoring a lifecycle event handler in a library, where the library function subscribing to the hook is the entrypoint. In this case, it is considered valid to change the status, as long as this behavior is documented.
Logging
Templating
Use the default Python logging module. The default charmcraft init template will set this up for you. Do not build strings for the logger.
Prefer
logger.info("something %s", var)
over
logger.info("something {}".format(var))
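One reason for this preference is that the logging module only formats the message if it will actually be emitted. The following sketch shows the difference using only the standard library. The Expensive class and the logger name are made up for the demonstration.

```python
import logging

logger = logging.getLogger("lazy_logging_demo")
logger.setLevel(logging.WARNING)  # INFO messages are filtered out


class Expensive:
    """Counts how many times it is rendered to a string."""

    def __init__(self):
        self.renders = 0

    def __str__(self):
        self.renders += 1
        return "expensive value"


lazy = Expensive()
logger.info("something %s", lazy)  # never formatted: INFO is disabled

eager = Expensive()
logger.info("something {}".format(eager))  # formatted before logging is consulted
```

After running this, lazy.renders is 0 while eager.renders is 1, because str.format does its work before the logger can discard the message.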
Frequency
Avoid spurious logging. Ensure that log messages are clear and meaningful, and that they provide the information a user would need to rectify any issues.
Avoid excess punctuation or capital letters.
logger.error("SOMETHING WRONG!!!")
is significantly less useful than
logger.error("configuration failed: '8' is not valid for field 'enable_debug'.")
Sensitive information
Never log credentials or other sensitive information. If you really have to log something that could be considered sensitive, use the trace log level.
Documentation
Documentation should be end-user oriented. Document what your charm does and how it’s used. A copy-paste from the application documentation is not helpful; instead, include a link to it.
A good rule of thumb when testing your documentation is to ask yourself whether it provides a means for “guaranteed getting started”. You only get one chance at a first impression, so your quick start should be rock solid.
The front page of your documentation should not carry information about how to build, test or deploy the charm from the local filesystem: put this information in separate docs specific to the development of and contribution to your charm. This information can live as part of your Charm Documentation, or in the version control repository for your charm (example).
Custom events
Charms should never define custom events themselves. They have no need for emitting events (custom or otherwise) for their own consumption, and as they lack consumers, they don’t need to emit any for others to consume either. Instead, this should be done in a library.
Backward compatibility
When authoring your charm, consider the target Python runtime. Kubernetes charms will have access to Python 3.8 by default (the charm container is based on Ubuntu 20.04), as will charms that require Juju 3.0+ (which supports Ubuntu 20.04 or later). So it’s reasonable to use features from Python 3.8 in your charms.
If your charm or charm library must be compatible with older systems or Juju 2.9, you’ll need to make your code compatible with Python 3.5. However, consider upgrading and targeting Python 3.8 and Ubuntu 20.04 for new charms.
Compatibility checks for Python 3.8 can be automated in your CI or using mypy.
Dependency management
External dependencies must be specified in a requirements.txt file. If your charm depends on other charm libraries, you should vendor and version the library you depend on (see the prometheus-k8s-operator). This is the default behaviour when using charmcraft fetch-lib. For more information see the docs on Charm Libraries.
Code style
Clarity
Charm authors should choose clarity over cleverness when writing code. A lot more time is spent reading code than writing it, so when possible, we opt for clear code that is easily maintained by anyone.
User experience / UX
Charms should aim to keep the user experience of the operator as simple and obvious as possible. If it is harder to use your charm than to set up the application from scratch, why should the user even bother with your charm?
Where possible, try to ensure that the application can be deployed without providing any further configuration options, e.g.
juju deploy foo
is preferable over
juju deploy foo --config something=value
This will not always be possible, but will provide a nicer user experience where applicable. Also consider if any of your configuration items could instead be automatically derived from a relation.
A key consideration here is which of your application’s configuration options you should initially expose. If your chosen application has many config options, it may be prudent to provide access to a select few, and add support for more as the need arises.
For very complex applications, consider providing “configuration profiles” which can group values for large configs together.
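One possible shape for such profiles is sketched below: a profile name selects a group of defaults, and explicitly set options still take precedence. The profile names, keys and values are invented for illustration, and effective_config is a hypothetical helper rather than an existing API.

```python
from typing import Any, Dict, Optional

# Hypothetical profiles: each groups several related settings under one name.
PROFILES: Dict[str, Dict[str, Any]] = {
    "testing": {"workers": 1, "cache_size_mb": 64, "enable_debug": True},
    "production": {"workers": 8, "cache_size_mb": 1024, "enable_debug": False},
}


def effective_config(
    profile: str, overrides: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Start from the named profile, then apply any explicit overrides."""
    if profile not in PROFILES:
        raise ValueError(
            f"unknown profile {profile!r}; expected one of {sorted(PROFILES)}"
        )
    config = dict(PROFILES[profile])  # copy so the profile itself is untouched
    config.update(overrides or {})
    return config
```

A user would then only need to set a single profile option, for example production, and override individual values when the defaults do not fit.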
Event handler visibility
Charms should make event handlers private: _on_install, not on_install. There is no need for any other code to directly access the event handlers of a charm.
Linting
To keep down the cognitive overhead and maintenance effort needed, we use linters to make sure the code we write has a consistent style regardless of the author. An example configuration can be found in the pyproject.toml of the charmcraft init template.
This config makes some decisions about code style on your behalf. At the time of writing, it configures code formatting using black, with a line length of 99 characters. It also configures isort to keep imports tidy, and flake8 to watch for common coding errors.
In general, we run these tools inside a tox environment. We generally configure an environment named lint, and one called fmt, alongside any testing environments required. See the Recommended Tooling section for more details.
Docstrings
Charms should have docstrings. At Canonical, we use the Google docstring format when writing docstrings for charms. To enforce this, we then use flake8-docstrings as part of our linter suite.
Class layout
The class layout of a charm should be organized in the following order:
- Constructor (inside which events are subscribed to, roughly in the order they would be activated)
- Factory methods (classmethods), if any
- Event handlers, placed in the order in which they’re subscribed to
- Public methods
- Private methods
Further, we discourage the use of nested functions (i.e. scope-internal functions), and instead suggest using either private methods or free/unscoped functions. Likewise, we discourage the use of static methods.
Patterns
Fetching Network Information
In some charms we need to get the unit’s address to share over relation data with other charms.
Getting unit IPs is error prone, as bind_address may return None depending on the state of the charm, for instance:
@property
def _address(self) -> Optional[str]:
    """Get the unit's ip address.

    Technically, receiving a "joined" event guarantees an IP address is available. If this is
    called beforehand, a None would be returned.
    When operating a single unit, no "joined" events are visible so obtaining an address is a
    matter of timing in that case.

    This function is still needed in Juju 2.9.5 because the "private-address" field in the
    data bag is being populated by the app IP instead of the unit IP.
    Also in Juju 2.9.5, ip address may be None even after RelationJoinedEvent, for which
    "ops.model.RelationDataError: relation data values must be strings" would be emitted.

    Returns:
        None if no IP is available (called before unit "joined"); unit's ip address otherwise
    """
    # if bind_address := check_output(["unit-get", "private-address"]).decode().strip()
    if bind_address := self.model.get_binding(self._peer_relation_name).network.bind_address:
        bind_address = str(bind_address)
    return bind_address
A better approach is to instead rely on the FQDN returned by the socket module:
import socket
...
@property
def address(self) -> str:
    """Unit's hostname."""
    return socket.getfqdn()
External Accessibility
Keep in mind that the FQDN won’t be resolvable outside of the model (or rather, outside of the Kubernetes cluster). To make our charm resolvable by external workloads, we also need to first check whether we have an ingress address and if that is the case, return that address instead.
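A minimal sketch of that fallback logic is shown below. How the ingress address is obtained (for example, from an ingress relation) is deliberately left out, and reachable_address is a hypothetical helper name chosen for this example.

```python
import socket
from typing import Optional


def reachable_address(ingress_address: Optional[str]) -> str:
    """Prefer the ingress address when one exists, else fall back to the FQDN.

    `ingress_address` is assumed to be None or empty when no ingress is set up.
    """
    if ingress_address:
        return ingress_address
    # Only resolvable inside the model (or Kubernetes cluster).
    return socket.getfqdn()
```

Inside a charm, this logic would typically live in the address property, with the ingress address read from wherever the charm receives it.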
Testing
Charms should have tests to verify that they are functioning correctly. These tests should cover the behavior of the charm both in isolation (unit tests) and when used with other charms (integration tests). Charm authors should use tox to run these automated tests.
Unit tests
Unit tests are written using the unittest library shipped with Python. To facilitate unit testing of charms, we use a testing harness specifically designed for charmed operators available in the Charmed Operator SDK. An example of charm unit tests can be found here.
Functional tests
Functional tests in charms often take the form of integration-, performance- and/or end-to-end tests.
For integration and end-to-end tests, we use the pytest library. We also provide a testing library for interacting with Juju and your charm called pytest-operator. Examples of integration tests for a charm can be found in the prometheus-k8s-operator repo.
Recommended tooling
Continuous integration
The quality assurance pipeline of a charm should be automated using a continuous integration (CI) system.
For repositories on GitHub, the easiest way to go about this is to use the actions-operator, which will take care of setting up all dependencies needed to be able to run charms in a CI workflow. You can see an example configuration for linting and testing a charm using GitHub Actions here.
The automation should also allow the maintainers to easily see whether the tests failed or passed for any available commit.
Additionally, it is important to provide enough data for the reader to be able to take action, i.e. dumps from juju status, juju debug-log, kubectl describe and similar. To have this done for you, you may integrate charm-logdump-action into your CI workflow.
Linters
At the time of writing, linting modules commonly used by charm authors include black, flake8, flake8-docstrings, flake8-copyright, flake8-builtins, pyproject-flake8, pep8-naming, isort, and codespell.
Common integrations
Observability
Charms should provide a metrics endpoint for Prometheus, as well as reasonable (read: conservative but useful) alert rules and default dashboards.
Example repositories
There are a number of sample repositories you could use for inspiration and a demonstration of good practice.
Kubernetes charms:
Machine charms: