Control by design

Msrc offers just enough ways to control each update to get what you want. There are m4 macros, diversions, and include files, as well as shell and rdist bindings. This document describes them all.

To understand this document

This document explains and indexes some of the subtle points of leverage available in the master source tool-chain. It would help a lot if you had already read the main page. It would help a lot more if you had built and installed the primary package.

Points of leverage in msrc

The master source system is based on many levers. Each lever allows small changes that cascade through the structure to create the required output in the features beyond, which in turn, mayhap, will create more ripples. Since automating the correct levers solves configuration management issues much more quickly than picking the wrong levers, we present an overview of the key levers in this document.

Msrc's program name and $MSRC_LIB

The first leverage point for any of my tools is the name the program is passed in argv[0], and msrc is no exception. Msrc looks under $MSRC_LIB for a file named after the name it was called by: basename.hxmd. This could be used to set global options for msrc to pass to hxmd. But it is almost never actually instanced, so the default installation doesn't even build the empty support directory. It is almost always used by shell wrappers that have their own library directory (which they specify as below).

For example, we set $MSRC_LIB to a directory we can write in (it defaults to /usr/local/lib/msrc), then create a file msrc.hxmd in that directory, which adds options to all the enclosed hxmd instances. In this example I'll force an uptime command after the last host is updated:

$ mkdir /tmp/ksb
$ export MSRC_LIB=/tmp/ksb
$ echo "-K 'uptime'" >$MSRC_LIB/msrc.hxmd
$ cd /usr/msrc/local/bin/oue
$ msrc -Csite.cf -G `hostname` :
 ...
sulaco.example.com: updating of sulaco.example.com finished
10:41AM  up 144 days, 17:50, 55 users, load averages: 0.08, 0.02, 0.01
$ unset MSRC_LIB

Note that the existence of any HXINCLUDE file from the controlling make recipe will suppress the path search for a default file. (Thus this trick doesn't work in the source to msrc itself.)

So that gives you 2 intercept points: the value of $MSRC_LIB and the name you provide to msrc. The contents of the file provide many additional hooks, as any option to hxmd could be included. You may change the program name via a symbolic link, or in some shells you can explicitly set it (viz. zsh).
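
The name-based lookup can be tried without msrc at all. What follows is a minimal sketch, assuming only what is described above: the options file is named for the invoked program and is sought under $MSRC_LIB. The script "tool" is a stand-in, not msrc, and "push-prod" is a hypothetical wrapper name.

```shell
#!/bin/sh
# Stand-in for msrc (not the real program): it only reports which options
# file the invoked name would select under $MSRC_LIB.
dir=$(mktemp -d)
cat >"$dir/tool" <<'EOF'
echo "${MSRC_LIB:-/usr/local/lib/msrc}/$(basename "$0").hxmd"
EOF
# A second name for the same file; "push-prod" is a hypothetical wrapper name.
ln -s tool "$dir/push-prod"
# Run through sh so the sketch does not depend on execute permissions.
by_real_name=$(MSRC_LIB=/tmp/ksb sh "$dir/tool")
by_link_name=$(MSRC_LIB=/tmp/ksb sh "$dir/push-prod")
rm -rf "$dir"
echo "$by_real_name"
echo "$by_link_name"
```

The two lines printed differ only in the basename, which is the point: one program file, two names, two possible sets of injected options.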

Environment variables in the argv[0].hxmd file

The code that installs those hxmd options may read run-time values from the environment. This is sometimes used to pass a command or configuration file down to cache processes.
# Install the Fuze hook
-D
$FuzeMe
Then pass the define you want set via the variable hook you installed:
$ FuzeMe="INIT_CMD=uptime" msrc ...
This tactic allows comments in the file; use them to describe what you actually intend, and where the value is set in the environment. Also note that other hxmd options, like -j and -C, allow other features to be remote controlled.

That lever allows the injection of m4 defines at the hxmd level, additional options and configuration files, as well as xapply and xclate option injection.

On the other hand, this fails when the environment is not set. Therefore, it is best to present these variables from a structure that sets a default value for each one. Also you should include the current (or a default) value of MSRC_LIB on the end of your list to allow nested application of the hook.
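
A wrapper might set those defaults along the following lines. This is a sketch under stated assumptions, not code from the msrc distribution: the function name fuze_env and the directory /opt/fuze/lib are hypothetical, the harmless default INIT_CMD=true is my choice, and I assume MSRC_LIB is searched as a colon-separated list.

```shell
#!/bin/sh
# Sketch of the setup a wrapper might do before it runs msrc.
# $1 is the wrapper's own library directory (hypothetical).
fuze_env() {
	# Default for the hook variable, so the -D line never sees an empty value.
	: "${FuzeMe:=INIT_CMD=true}"
	# Assumption: MSRC_LIB is a colon-separated search list.  Keep the current
	# (or default) value on the end so an outer wrapper's hook still applies.
	MSRC_LIB="$1:${MSRC_LIB:-/usr/local/lib/msrc}"
	export FuzeMe MSRC_LIB
}

# Demonstrate in a subshell with a clean environment.
( unset FuzeMe MSRC_LIB; fuze_env /opt/fuze/lib; echo "$FuzeMe"; echo "$MSRC_LIB" )
```

A real wrapper would call the function and then exec msrc with its own arguments.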

Tokens under -t

This not-often-used feature allows the selection of a local resource which is bound to each update. For example, each update may need a local service bound to a TCP port, or a unix domain socket. By starting enough instances for the parallel factor we assure that one is available for each running update. Then we just need to pick an available one for each update process. That's ptbw's job.

As an example, we'll just use a token list of 4 common zero argument shell functions in /tmp/ksb/tokens: uptime, date, pwd, and uname. Given there are three names for the local host in auto.cf, we can run those commands with:

$ msrc -DINIT_CMD="\${6}" -l -t /tmp/ksb/tokens -AR1 -Cauto.cf :
 3:15PM  up 458 days,  1:33, 63 users, load averages: 0.19, 0.11, 0.09
Mon Apr 22 15:15:15 GMT 2013
/usr/msrc/usr/local/sbin/msrc
Note that uname didn't get picked because we only had 3 target hosts. Also note the markup for $6: we need to keep m4 from expanding it to the empty string, so we put curly braces around the 6.

In a real example, a process above this would bind services to a list of ports (or local domain sockets) then provide the list to us for our updates. In the script we'd forward an ssh request back (under -R or gtfw) to access the services. I've used this a few times, but not often. In the real cases I always provide -P, which permutes the list as processes finish.

The main sticking point here is the quoting needed to get the token string from the script arguments; see the description of the positional parameters in the update script. The other quoting that might work is "\`\$'6", but I would not depend on that one, due to possible m4 double-evals.
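
To make the flow concrete, here is a plain-shell model of one token being handed to each update as its sixth positional parameter. It is not ptbw and does not run msrc. It assumes the token file holds one token per line, and the three host names are hypothetical.

```shell
#!/bin/sh
# Plain-shell model of handing one token to each update; this is not ptbw.
# Assumption: the token file holds one token per line.
tokens=$(mktemp)
printf '%s\n' uptime date pwd uname >"$tokens"

picked=''
exec 3<"$tokens"
for host in host-a host-b host-c; do	# three hypothetical target hosts
	read -r token <&3 || break	# next unused token, if any remain
	# Only the sixth positional parameter matters here; the rest are placeholders.
	set -- "$host" x x x x "$token"
	picked="$picked${picked:+ }${6}"
done
exec 3<&-
rm -f "$tokens"
echo "$picked"
```

With three hosts and four tokens the last token is never handed out, which matches the run shown earlier where uname was not picked.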

M4 Hooks

These are covered in detail in the msrc manual page. But we should explain here why one might change some of them, and how.

The hooks generally come from attribute macros defined in any of the configuration files specified under -C, -X, or -Z or by an explicit definition under -D.

HXMD_PHASE and -j
Use -j to include a file of m4 markup, then use the fixed values of HXMD_PHASE to unpack only the macros you need for the context you are in. This is best done by include'ing a file detailing what you need, rather than by nested ifelse logic. See the description in the hxmd HTML page.
By using -dX switches to msrc we can see that the order the files are provisioned is
hostcmd mapped-files... provision
Since most of the action happens in the host update command script, you should key on HXMD_PHASE equal to 1. In version 1 of msrc there are no m4 tunable hooks in the provision file.
Never depend on the order of the MAP files, unless you force it in the control recipe. Msrc does not sort them, it accepts them in directory order, which is not stable.
SSH
This is almost always set to the path to a script that runs ssh with options to suppress X11 forwarding. Sometimes other performance options are enabled (e.g. -c might specify arcfour). But it could also be used to log out-bound requests, for example.
ENTRY_DEFS
This is a great hook. If you want to keep the authoritative host from pushing (updating) a host, you can either change the path to a file that just contains an exit command, or update the default file to contain a (conditional) exit. Since the file is sourced into the remote shell, any command that terminates the shell (or renders it lame, like setting -n) renders the update powerless. This is often used to "freeze" a machine for testing or critical production.
Putting a set -x in the file traces the update actions at the shell level. See also the snoopy library, which lets the trace descend across a fork via setting LD_PRELOAD.
INCLUDE_CMD(mode)
This often overlooked hook allows a complete rewrite of the remote update script. If you don't like my version, just insert your own, then exit before mine starts (or divert(-1)).
divert
Reorder the code from an include file. A file included under -j may define macros to be expanded later in the file, or may divert code. See the odd numbered diversions in the output of the control script, under -d S, viz.:
$ msrc -d S -DSSH=`which false` -Cauto.cf : 2>&1 | ${PAGER:-less}
The even numbered diversions are left for any internal or debug actions local site policy dictates.
m4wrap
Catch the end of the update script with this m4 built-in. This allows some clever hooks, for example, to remove remote files with an ssh back to the target host.
m4wrap(ifelse(MSRC_MODE,`local',`cd ${1} &&',`SSH ${2} -n cd ${5} \&\&')` make -s clean
')dnl
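
The ENTRY_DEFS "freeze" described above can be sketched in plain shell. This is an illustration under assumptions, not the default file shipped with msrc: the flag file, the message text, and the stand-in update script (which only sources the file and then prints "updated") are all hypothetical.

```shell
#!/bin/sh
# Sketch of a "freeze" file of the kind ENTRY_DEFS could point at.
# The flag file and the stand-in update script are hypothetical.
defs=$(mktemp)
flag=$(mktemp)		# while this file exists the host is frozen
cat >"$defs" <<EOF
# conditional exit: a frozen host refuses the update
if [ -f "$flag" ]; then
	echo "frozen, update skipped" 1>&2
	exit 0
fi
EOF

# Stand-in for the remote update script, which sources the file first.
run_update() {
	sh -c '. "$1"; echo updated' sh "$defs" 2>/dev/null
}
frozen=$(run_update)
rm -f "$flag"
thawed=$(run_update)
rm -f "$defs"
echo "frozen: '$frozen' thawed: '$thawed'"
```

Because the file is sourced rather than executed, the exit ends the whole update shell, so nothing after the source line runs while the flag exists.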

M4 Includes

Coding good m4 components is a skill one can master. Here are some pointers I teach my staff:
Reserve a prefix for local variables in macro functions
I use U_ in most of my functions. I used to use L__, but the double underscore made it hard to read and type. When you need a local, pushdef the macro; when you are done with it, popdef it back to the value it had before. No macro should pass a local name down unevaluated. Feel free to use my name-space.
Each include should have a module_pop macro
This macro is used to undefine all the macros the module defined. If you need a set of macros for a limited scope, include the module, use the macros needed, then call the pop macro to cleanup the m4 namespace.

Make macros

The best hook for any of these is to set the macro to a command prefixed with a semicolon (;). That forces the output of the command to be the value. Msrc uses a prefix of "echo", followed by the macro it wants, and a shell markup to redirect the output to an already open file descriptor (like 1>&7):
echo ${MAP} 1>&7
So if we set MAP like this:
# pick a file at random to send today
MAP= ; random-file *.host
Then make runs the shell command:
echo ; random-file *.host 1>&7
Since the stdout of the make process is /dev/null the blank line is ignored, and the output from the random-file command is read as the value from MAP.
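
The same command shape can be exercised outside of make and msrc. In this sketch, sh -c stands in for the shell that make starts, a temporary file stands in for whatever reads descriptor 7, and printf stands in for the random-file command. The helper name read_macro is hypothetical.

```shell
#!/bin/sh
# Emulate the command shape described above: stdout is discarded and the
# value is read from descriptor 7.  The temporary file is a stand-in reader.
read_macro() {
	out=$(mktemp)
	sh -c "echo $1 1>&7" 7>"$out" >/dev/null
	cat "$out"
	rm -f "$out"
}
# A plain value: echo itself writes to descriptor 7.
plain=$(read_macro 'today.host')
# A semicolon-prefixed value: echo writes a blank line to /dev/null and the
# command after the semicolon supplies the value on descriptor 7.
hooked=$(read_macro '; printf "%s\n" picked.host')
echo "$plain"
echo "$hooked"
```

The redirection binds to the last command on the line, which is why the semicolon prefix moves the value from echo to your own command.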

A few make recipe macro hooks:

INTO=_multi-word message
This is clearly documented as a way to prevent a package build recipe from being mistaken for an msrc control recipe. But it might also output other hints about how to use the recipe.
INTO=.
Setting this to dot (.) forces the remote target directory to the home directory of the remote login.
SEND
Adding files that end in .host to this forces them to be sent without m4 processing, which means they might be processed (by msrc) on the remote host.
MAP
Force a directory into the list to use it as a Cache directory. Cache directories are explained well in several places.

Clever use of options to hxmd

Some options are best used as a lever for other options. For example the include path option (-I) might change the path to a file specified under -j or one included by a MAP'd file's m4 include markup. Changing from a "test" include directory to a "production" one might change many included files by cascaded inclusion. This limits mismatched part-test/part-production files from mistakenly being built (or deployed). A special recipe is intentionally created to make any chimeras.

It is possible to use -j to provide a FIFO to m4, which will be opened once for each host processed. This allows a few hooks (for example rate-limits, unique content per-host), but doesn't really tell you which host is getting which content. Also note that the -j files are used for guard processes and cleanup work, so you'd have to handle those as well (one might use HXMD_PHASE). Never be tempted to reverse-engineer the target host via lsof, use a cache directory for such things. The cache recipe knows a lot more context, so it is clearly safer.

It is also possible to select files to send by parameters passed via an m4 file. For example, we want to select a specific set of package releases to push to a set of hosts. We'll create a file (versions.m4) to specify the patch number (from some parent script or process):

dnl machine generated by ...
define(`APACHE_VERSION',`2.0.63')dnl
define(`BUILD',`4')dnl
define(`RADIUS_VERSION',`1.5.7')dnl
define(`OPENLDAP_VERSION',`2.4.7')dnl
In the Makefile we should ignore the parameter files and use the fact that msrc calls the shell to output macro values:
IGNORE=versions.m4
SEND="/var/ftp/pub/apache/source/./httpd-APACHE_VERSION.tar.gz \
	/var/ftp/pub/apache/source/./openldap-OPENLDAP_VERSION.tar.gz \
	/var/ftp/pub/apache/source/./mod_auth_radius-RADIUS_VERSION.tar" |\
	m4 versions.m4 -

At this point running a debug "mmsrc -d P :" shows that the SEND macro does get bound to the expansion of the macros (viz. "/var/ftp/pub/apache/source/./httpd-2.0.63.tar.gz" is part of the list). This is because msrc's plundering of the make recipe codes a call to echo before the value of the SEND (or any other) macro. So the command executed becomes a pipeline that includes the m4 filter on the end.

To allow other files to see the same mapping we include this line in Msrc.hxmd (or any other HXINCLUDE file):

# $Id: Msrc.hxmd,v ...
-j versions.m4

Prerequisite or MAP'd file

The home directory pull spell needs a perl script that deletes the first line from a non-text file. This script is generated by a very short command: i.e. s2p -e 1d.

There are 2 ways we could construct that script:

with a recipe __msrc prerequisite
We might code a recipe like:
GEN= 1d

...
1d:
	s2p -e 1d >$@
...
source: ${SOURCE} ${GEN}

# msrc hooks below
__msrc: source
with a MAP file 1d.host
Which adds 1 line (at most) to the recipe file:
MAP=1d.host
to which we must add a file 1d.host that looks like:
dnl $Id: ...
syscmd(`s2p -e 1d')dnl
`'dnl

The difference between these two is subtle, and may lead to unintended service failures.

In the first case the recipe file needs to update a file in the master directory before it provides a push or pull service. That might be impossible for the effective uid/gid of the update process. For example the msrcmux service may run as an unprivileged user.

With the MAP the s2p is run once for each target host, but the output is sent to a temporary file owned by the invoker, which always works. The only likely failures would be permissions on $TMPDIR or a full filesystem.

Configuration file leverage

There is nothing in the configuration file that limits the number of times a host is defined. Human authors tend to build configuration files which define a host only once, but automation may be less limited.

This helps you in a few ways.

Column headers of dot (.) discard values
If you need to remove a macro binding from a column it is usually easier to edit the percent-header than to parse the host binding lines. Just replace the (whole-word) name of the macro you need to ignore with a dot (.), which effectively removes the column until the next header line.
Adding definitions to an existing file doesn't change HXMD_B
This allows you to add a table at the end of an existing stream to add your own bindings. I would usually use -X to avoid defining new hosts, but that could be what you want to do. Since stdin is a possible element of configs (as -), just produce a modified configuration file on that channel. If you need stdin for something else, make it a temporary file (via mktemp).
The first macro binding sticks (-Z, -C, then -X)
When you bind an attribute macro to a host in a zero configuration file (under -Z) that binding cannot be rebound by any subsequent file. Since all -C files are read next, they bind before any -X files, which come last.
Use -dD to debug bindings
When you confuse yourself with bindings it is easier to trace them in the raw m4 stream with hxmd's debug flag D.
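
The "first binding sticks" rule can be modeled with a short awk filter. This is only a model of the rule as stated above, not hxmd's parser: the one-binding-per-line input layout and the macro names COLOR and RACK are hypothetical, and the lines are assumed to arrive already in reading order (-Z, then -C, then -X).

```shell
#!/bin/sh
# Model of "the first binding sticks"; this is not hxmd's parser.
# Input lines: option host macro value, already in reading order (-Z, -C, -X).
first_binding() {
	# Print a host/macro pair only the first time it is seen.
	awk '!seen[$2, $3]++ { print $2, $3 "=" $4 }'
}
result=$(printf '%s\n' \
	'-Z sulaco COLOR red' \
	'-C sulaco COLOR blue' \
	'-C sulaco RACK 12' \
	'-X sulaco RACK 99' | first_binding)
echo "$result"
```

In the model the -Z value for COLOR survives the later -C line, and the -C value for RACK survives the later -X line.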

Compilation of code

One of the best uses for msrc is the compilation and deployment of compiled code (C, C++, or the like) and configuration files. Many of the levers and hooks you need to make compile-time tunes are easy with this layering.

Layer 2 product builds

The layer 2 package specification file (ITO.spec) comments (via mk)
These comments allow you to set several options for level2s. See the manpage for level2s(8l) for details.
The layer 2 package control recipe (Msrc.mk, or Makefile) comments (via mk)
The comments of a control recipe may contain mk markup to automate common usage of the spells contained within. This is largely an issue of local site policy: I can't tell you what operations you'll want to automate -- but I can tell you how to do it. See mk(1l), and the mk HTML document.
The layer 2 package control recipe's macros and targets
This might set an HXINCLUDE to change its behavior. It uses the __msrc hook to force prerequisites into the master-side build process. It can use a __cleanup target to cleanup any local cached data as well.
Map filename with m4 or efmd
See above.
Any macros msrc fills in for you
If you add files to a directory msrc picks which macro gets the new file. The . and +++ markup allows you to pick which macro gets unclaimed resources. See the HTML explanation.
Any cache directories built per-host
These could have anything you can build with make presented to any following step. There is nearly unlimited power in this step. For example, a recipe might reach out to each target host to recover some current state and mix that into the new state.
For example we could base the new resolv.conf on the current one. Or automation might check ntpdc's sysinfo for a target host before we update ntpd.conf.
The platform recipe (Makefile.host) while being processed by m4
The Makefile.host file might force some -DHAVE_feature or -DNEED_resource macros based on HOSTTYPE or other values in site.cf for each host.
The Makefile.host file might include library options, or add/replace source files using m4 logic.
Another common meme is to add -D'HOSTTYPE` to the C compiler flags. This allows cpp markup to leverage the definition of the host's type. Nearly all ksb's tools use this rather than auto-configuration, for good or bad.
The platform recipe (having become Makefile) on the remote host
The recipe passes down all compiler options produced by the m4 markup above (viz. -D'HOSTTYPE` becomes -DDARWIN). This clues machine.h to set the apropos default values for any HAVE_feature macros (or the like).
Dependencies included by m4 markup that force a command to build emulation code on the target host (see below).
Any autoconf spell to make config.h
This and machine.h are the only places where the host type or operating system version is directly examined, on the target host. By converting the host type into a set of property macros we may consolidate to more common code. Few of my tools use autoconf. (Presently only mmsrc, to get us started, and scdpn, just to show I know how.)
Any machine.h or config.h file sets default property macros.
A machine.h file looks for any already defined HAVE_feature macros and any clues set in common include files to refine the picture of the target host. This usually only sets more macros. Sometimes it forces the inclusion of specific #include files.
Below #include "machine.h" in a C source file
We may conditionally include a header file (<strings.h> vs <string.h>). These are often conditional on a property macro set by machine.h or autoconf's config.h.
Any external declarations for any emulation code might be #if'd in based on any property macros set above.
Next a section of optional data declarations (e.g. struct fsent FSNew;) might be #if'd into the file. These might be declared as a macro type to unify the code even more.
In the C source files (e.g. main.c)
At the actual usage of the facility we call through a macro or through some emulation code to unify the alternatives as much as possible.
In another file (machine.c) any platform specific emulation code is instanced.
This allows the consolidation of the emulation code to a single file, or a few files. Some of the emulation code is pulled from the explode repository, so we can use it over and over again (for example the emulation for strcasecmp or strlcpy).
This also is a good place to use the C preprocessor to record any of the macros or options that may change the behavior of your application. Here is some clever code like the code I used in op to record the HOSTOS as a C string:
static const char acHostOs[] =
#if defined(HOSTOS)
#define op_sTrInG(Mx) #Mx /*space*/
#define op_InDirEct(My) op_sTrInG(My)
		"(" op_InDirEct(HOSTOS) ")"
#undef op_sTrInG
#undef op_InDirEct
#else
		""
#endif
			;
To understand this you may need to read the documentation on the GNU website for cpp. Once you convert the macros to strings, the output becomes trivial. In op the options are listed under -V. This is a good way to display them.
How we get the explode spell
We might pull generic emulation code from explode's repository with a trigger from the Makefile.host m4 logic, or by creating an explicit dependency on the emulation file in the recipe via m4 markup. (When building the platform recipe.)
Using mkcmd to reuse code and options
At this point mkcmd does path searches for executables we need to get us back to the shell level at run-time.
We also use mkcmd to incorporate options from other programs. For example mmsrc imports very large portions of hxmd and msrc into itself.
Load against local libraries
When emulation code is so common that you need it for many applications just build a shared library and load against it.
Load against system libraries
Some systems provide emulation for others' interfaces. You may not need your own emulation code.
Like every other configuration management task this is a loop. Always refine your local base of libraries, options, explode and mkcmd modules, machine.h logic, autoconf parts, recipe files, cache directories, and rpm spec files until you have the correct balance between efficiency and robustness/fragility.

The best lever you have is the shell

Since msrc and all the related tools are largely configured with shell code and hxmd format files (which are usually generated with shell, m4, or perl code quite easily), the best friend you have is your shell skills. You don't have to learn another specially-built language to use this structure.

Learning to use m4 to markup files is required to customize sendmail, autoconf, syslog and many other facilities.

The make recipe driver is used by countless packages.

The columnar configuration file format required by hxmd (and therefore msrc and mmsrc), efmd, and distrib is not hard to use. It expresses detailed information about a large number of instances in a format that is both human and machine friendly. I find it much easier to read and write than some crazy XML scheme, or even YAML.




$Id: levers.html,v 1.19 2013/10/24 21:48:10 ksb Exp $