BackupScheduler
CEDAR 10.1

Carl Hauser

Copyright 1992 Xerox Corporation. All rights reserved.

Abstract: BackupScheduler helps unix system administrators assign file system dumps to tapes. With a large number of file systems of widely varying sizes this can be a formidable task, especially when the tape drives have differing capacities and the file systems' contents change at differing rates. BackupScheduler provides two commands to assist in the assignment task.

The first, ScheduleDumps, reads a dump_config file (the same file used by the parc_dump system) to produce weekly schedules for dumps. A weekly schedule assigns each file system to a particular tape drive, specifying which days will have full dumps and which days incremental dumps. ScheduleDumps attempts to fit all of the file systems' full and incremental dumps on the available tapes. Incremental dump sizes are estimated from the volatility estimate for each file system found in the dump_config file.

The second, AnalyzeDumpLog, examines the dump logs from preceding weeks to help administrators choose reasonable volatility estimates. It produces a simple linear model of the volatility: one day following a full dump, b% of the file system will have changed, and thereafter an additional a% changes every day. I greatly doubt the validity of this model for any one file system, though it might be reasonable when averaged over classes of file systems. For example, workstation file systems might be considered one class, file server system volumes (/, /usr, /var, etc.) might be another, and file server volumes available to clients might be a third. If you're serious about using AnalyzeDumpLog please see me so we can get some good data on its use.

Created by: Carl Hauser
Maintained by: Carl Hauser:PARC:Xerox
Keywords: Backups, Dumps, Tapes, File systems, 8mm, parc_dump

XEROX
Xerox Corporation
Palo Alto Research Center
3333 Coyote Hill Road
Palo Alto, California 94304

1. ScheduleDumps

Cedar Command

    % ScheduleDumps configFileName {MM/DD/YY | default}*

The default form of configFileName is <prefix>.dump_config, in which case <prefix> alone is acceptable. The program writes <prefix>.weekly_schedule (for the default tape change schedule; see below for the dump_config format), and <prefix>.week_of.YY.MM.DD for each week with a different tape change schedule indicated in the dump_config. If any dates (or "default") are listed on the command line, then only those schedules are produced. Note that listing a date cannot force ScheduleDumps to produce a schedule for a week that is not listed in the config file.

Unix Command

    % enable backup/BackupScheduler
    % schedDumps configFileName {MM/DD/YY | default}*

Output files are written in the current unix working directory.

Limitations

Since the dump_config file can indicate "default" as the file system name on a given server, and even for named file systems does not contain their sizes, ScheduleDumps contacts each file server to enumerate its file systems and determine their sizes. If for some reason this doesn't work, ScheduleDumps displays a message and continues. Schedules produced in this situation are suspect because file systems may be omitted and other file systems' sizes may be incorrect.

ScheduleDumps will also display messages if it has calculated that a tape is likely to be overfilled. Whether or not the tape is actually overfilled depends on how full and how volatile the file systems dumped on it are. Such schedules should be examined carefully for reasonableness.

In operational use, I think it is important to introduce new .weekly_schedule files only on a particular day of the week. If they are introduced frequently, at arbitrary times, it is possible that some file systems will never have full dumps done! I therefore suggest including MM/DD/YY specs on the command line to avoid producing gratuitous new .weekly_schedule files when all you need is a schedule for a particular week.
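
The output naming convention described above can be sketched as a small function. This is only an illustration, not code from the package: the function name schedule_file_name and its prefix parameter are hypothetical, and it assumes that each output name is the config file's prefix followed by the suffix given in the text.

```python
from datetime import datetime


def schedule_file_name(arg: str, prefix: str = "") -> str:
    """Return the schedule file selected by one command-line argument.

    `arg` is either the word "default" or a week given as MM/DD/YY.
    `prefix` is a hypothetical stand-in for the config file's prefix.
    """
    if arg == "default":
        # The default tape change schedule goes to .weekly_schedule.
        return f"{prefix}.weekly_schedule"
    # strptime raises ValueError if the argument is not a valid MM/DD/YY date.
    week = datetime.strptime(arg, "%m/%d/%y")
    # Exceptional weeks go to .week_of.YY.MM.DD.
    return f"{prefix}.week_of.{week:%y.%m.%d}"
```

Under these assumptions, schedule_file_name("12/23/91", "SSL") names the file for the week of 12/23/91. Whether that file is actually written still depends on the week being listed in the dump_config.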
Dependencies

ScheduleDumps relies on /project/dumps/findRawDevices, an awk script, to put the output of the unix "df" command in a standard form. The problem is that df, used to enumerate file systems on each file server, has various output formats depending on the server's OS and manufacturer. The knowledge of these formats is captured in /project/dumps/findRawDevices.

dump_config Format

This file (e.g. SSL.dump_config) lists the host name, tape name, log directory, and notification email addresses needed by the parc_run_dump script for each dump server. It contains (at least) the following sections, each delimited by a section header standing alone on a line, e.g. [DUMP SERVERS] below. Comments are indicated by beginning a line with #.

    [DUMP SERVERS]
    # name   size  machine   device      log dir              mail
    # ----   ----  -------   ------      -------              ----
    SSL-A    2.0   sapphire  /dev/nrst0  /project/dump/SSL-A  root
    SSL-B    5.0   solo      /dev/nrst8  /project/dump/SSL-B  root

This section names the available dump servers. A dump server can be thought of as a named tape drive residing at a particular device address on a particular host. ScheduleDumps is concerned only with the name, size (in gigabytes), and machine fields of the dump server records.

    [DUMP CLIENTS]
    # List of file systems, dump levels to use, scheduling priority (1 = highest),
    # and percentage of filesystem that changes in a day.
    #   norm = schedule both full and incremental dumps
    #   inc  = schedule only incremental dumps
    #   full = schedule only full dumps
    #   -    = schedule no dumps
    #
    # machine  filesystem   levels  pri  change %  comments
    # -------  ----------   ------  ---  --------  --------
    default    /            full    3    0.1
    omnibus    /dev/id000g  norm    2    0.1       # /usr
    omnibus    /dev/id000h  norm    1    1.0       # /omnibus
    omnibus    /dev/id001a  norm    1    1.0       # /omnibus1
    nebula     default      norm    2    0.1       # any on nebula except /

These entries describe the file systems to be dumped. ScheduleDumps uses all of this information.

"default" as the machine name should be followed by a mount point name, such as / or /usr.
Such entries supply dump parameters for file systems found at those mount points. As a special case, "/machine" as a mount point name means a mount point that matches the name of the host on which it is found. For example, on host nebula it matches a file system mounted at /nebula.

"default" as the file system name on a named machine means that the entry applies to any file systems on the named machine that are not explicitly mentioned, either for that machine or for the "default" machine. In the example above, therefore, nebula's root file system will have a full dump every day (as permitted by the tape change schedule), because of the "default /" entry, and the rest of its file systems will get full dumps once a week and incrementals on the other days, because of the "nebula default" entry.

The scheduler has a notion of "affinity" for dump clients and dump servers that reside on the same machine: it tries to schedule those client dumps on the server(s) on the same machine. It gracefully moves dumps that don't fit to other servers. For example, dumps for a 10GB Bullwinkle-class server won't fit on a 2.3GB tape drive that only gets fed 5 tapes a week. Most of the dumps will be done locally, but some will be done on another server.

The scheduling algorithm uses priority in two ways. First, and most importantly, on any given day the dumps to a given tape are done in order from highest priority to lowest. Thus a failure of a low priority dump is less likely to eliminate a high priority one. Second, the actual assignment of tape space to file systems is attempted in priority order. This tends to spread high priority dumps over the available tapes. Sometimes, though, this results in unnecessarily overcommitted tapes, in which case a second attempt is made using a "greedy" assignment algorithm that assigns tape space to file systems in order from largest to smallest.

The change percentage is the administrator's estimate of how much of the file system will have to be dumped on the first day following a full dump.
ScheduleDumps inflates this estimate as more days pass since the full dump.

    [TAPE CHANGE SCHEDULE]
    # Indicates on which days of the week to change tapes and schedule full dumps.
    #   norm  = schedule both full and incremental dumps (change tape)
    #   inc   = schedule only incremental dumps (change tape)
    #   +norm = schedule both full and incremental dumps (append to tape)
    #   +inc  = schedule only incremental dumps (append to tape)
    #   -     = schedule no dumps
    # (day of week on which tape is LOADED, even if dump is run after midnight)
    # Week    Mon  Tue  Wed  Thu  Fri  Sat  Sun
    # ----    ---  ---  ---  ---  ---  ---  ---
    default   F    F    F    F    F    +I   +I
    #
    11/18/91  F    F    F    +I   F    +I   +I
    12/23/91  F    +I   -    +I   +I   +I   +I
    12/30/91  F    F    +I   F    F    +I   +I

Tape change schedules for exceptional weeks allow backups to occur even when there's no one around to change tapes.

2. AnalyzeDumpLog

The dump analysis feature has not been ported to Cedar10.

Command

    % AnalyzeDumpLog

For each file system mentioned in the filtered dump log (see below for the format and how to create it), AnalyzeDumpLog solves a linear program to find the optimal choice of c0 and c1 such that

    (c0 + c1*(days since full dump)) * fulldumpsize / 100

is bigger than the incremental dump size on every day.

Prior to experimenting with this program, I expected that c0 and c1 could be given to ScheduleDumps to guide it in picking tapes. Now I'm not so sure. Maybe c0 is usable as the estimate of volatility in the dump_config, but historically c0 varies quite a bit from one dump cycle to the next. It also varies a lot from file system to file system on a given server, complicating the use of default file system parameters in the dump_config file.

I would like to try using symbolic designators for file system volatility (workstation, server system disk, file server disk) and calculating reasonable estimates over the whole classes.
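
The bound that AnalyzeDumpLog fits, as given earlier in this section, can be written out directly. The sketch below is not the package's code and does not solve the linear program. It only evaluates the bound for given c0 and c1, and checks whether a candidate pair covers a set of observations. The function names and the (days, full size, incremental size) tuple layout are assumptions made for illustration.

```python
def incremental_bound(c0, c1, days_since_full, full_dump_size):
    """Upper bound on incremental dump size under the linear model.

    c0 and c1 are percentages; the result is in the same units
    as full_dump_size (e.g. tape blocks).
    """
    return (c0 + c1 * days_since_full) * full_dump_size / 100.0


def covers(c0, c1, samples):
    """True if the bound is at least the observed incremental size
    for every (days_since_full, full_dump_size, incremental_size) sample."""
    return all(
        incremental_bound(c0, c1, days, full) >= inc
        for days, full, inc in samples
    )
```

A pair (c0, c1) that satisfies covers() for one dump cycle's samples is feasible for the linear program; which feasible pair counts as optimal depends on the program's objective, which is not restated here.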
Default file system entries in the dump_config file would direct ScheduleDumps to determine the kind of each file system based on whatever it knew about it: size, name, position in df output, etc. Please see me if you think AnalyzeDumpLog would help your life. I think we can improve it.

Filtered dump logs

Format

The filtered dump log contains any number of entries like

    drive gar:/dev/rsd10c
    level5 Jan 24 1992 01:00:19
    level0 Jan 22 1992 01:00:42
    size 161866

The "drive" line indicates the server and raw file system to which the entry applies. The level5 line gives the date of the incremental dump, the level0 line the date of the corresponding full dump, and the size line the number of tape blocks used by the incremental dump. If the level5 line is missing, the entry is for a full dump and the size is the size of the full dump. The program is fairly forgiving about missing lines (entries with missing lines are ignored), but may complain vehemently about badly formatted lines.

Producing one

There are two awk scripts in this package for processing dump logs into filtered dump logs. filterlog processes .log files produced by the new parc_dump package, and filteroldlog processes .log files from 8mm_dump. I use filteroldlog like this:

    % filteroldlog /project/dumps/purple*of6.log >purplelogs.filtered

purplelogs.filtered can then be used as input to AnalyzeDumpLog.

3. Producing and using the packaged world

To help make these commands more stable for the non-Cedar PARC community, they have been put into a packaged world, and scripts have been created for using them from a unix shell. Here's how the packaged world is made. To a unix shell, type

    % CedarCommander
    Commander % preregister ScheduleDumps; packageit BackupSchedulerWorld
    Commander % sh1 BackupSchedulerWorld.ld; exitworld

The C shell command files schedDumps and analyzeDumps invoke the packaged world to do their computation.
schedDumps takes the same arguments as ScheduleDumps, and analyzeDumps takes the same argument as AnalyzeDumpLog.

After SModelling BackupScheduler.df, cd to /project/backup/BackupScheduler and

    Commander % qbo -p /r/BackupScheduler.df

BackupSchedulerDoc.tioga
Copyright 1992 by Xerox Corporation. All rights reserved.
Chauser, June 12, 1992 11:18 am PDT