Persistence

Persistence is a somewhat relevant topic for Node.^1{.footnote-ref}^

When I say persistence I’m talking about keeping the state of files between images. And the two aspects I want to speak about is how files are persisted and what files are persisted.

Node is delivered as an image (now and in future). An update (a new rootfs) is also delivered as an image. Migrating changes between the original image, and the updated image is what I call persistence.

Now

In the current Node persistence is implemented by keeping changed files on a writable partition, and bind mounting them at boot time into the target. The persistence works with a white-list, so a user (or package) needs to name all files which need to be persisted. A drawback of this approach is, that the bind-mounts are not available in the early boto stages. The problem can be illustrated if you think about persisting a modified /etc/fstab or new (3rd party) kernel modules. Additionally someone needs to determine the files which define the state of a component.

Then

For the future we are thinking about an approach where the whole root filesystem is writable^2{.footnote-ref}^.

In addition to it beeing writable, we will be also booting into that filesystem. This makes it possible to

Modify any file
(which implies) Modify files which are relevant for the early boot process

The key difference to the old approach is, that we can now modify the filesystem we use to boot.

How

Persistence then works by migrating a file from the previous to the next image. The migration can be a simple copy or similar to a 3-way merge. Maybe even both. Or more? Independent of what strategy is choosen, he key fact remains: Persisting works by migrating the old state, to the new image.

Now we know how we persist files, but it’s still the question what files we need to be persist.

What

As said above, currently this works by whitelisting files. This has the drawback that the user (or package/consumer) needs to be aware that it#s running on Node, and that is cumbersome to maintain, and not obvious to a user, why files need to be persisted.

I anticipate something like “transparent” persistence for the next-gen Node. It should be transparent to the user/consumer that persistence is necessary, so Node should decide what files need to be persisted which not.

Because Node will keep the original image, and a layer with the changed filesystem tree around, we can easily determine the changed files.

Greedy

A clear option is to migrate all of the changed files to a new (updated) image. This approach is greedy: Each file that has been touched is copied from one image to the next. This will result in many changes which are moved from one image to another. We need something smarter. And there is more than one option.

File

We could use file-level white-listing like we do today.

$ perists /etc/passwd

It’s very selective, but needs a lot of maintenance.

Package

Besides doing it on a file level, we can also do it on the package level. Two approaches: We can either name the package (and all of it’s files) to be persisted, or we can determine the package of a file, when the user tries to persist a file.

$ persist pkg vdsm

$ persist file /etc/passwd
Do you want to perists all changes to files of the package 'core-utils' instead of the single file '/etc/passwd'?
Yes/No/Maybe [Maybe]?

Blacklist

Or we could just come up with a blacklist of paths which should not be persisted. Except when explicitly requested.

These are a couple of options that come to my mind, I’m not yet sure how the final solution will look like. But I see use cases for all this ways. And the good is, the current infrastructure we plan, will support all of these approaches.

::: {.footnotes role=”doc-endnotes”}

::: {#fn:imagepersistence} it’s not only Node that needs persistence. It affects all projects which need to take care to keep the customized state of a system. If the state is not kept, then we are talking about stateless. ↩︎{.footnote-backref} :::
::: {#fn:readonlyroot} Currently the root filesystem is read-only, and only parts are made writable using tmpfs and bind-mounts to writable partitions ↩︎{.footnote-backref} ::: :::

::: {#footer} [ September 11th, 2014 11:33am ]{#timestamp} [node]{.tag} [persistence]{.tag} [fedora]{.tag} [atomic]{.tag} :::