Using DOS Batch Files to Run Experiments on Windows
When developing research code, one often wants to test experimental code
on a series of problems, without necessarily building a full-fledged interactive interface.
Especially for those familiar with Linux or Unix, script files seem like a natural solution
for automating test runs, as the performance of the test script is generally far less important
than that of the tests themselves. As I'm doing my thesis work in Windows, I delved into the world
of .bat files to perform this scripting. I discovered that batch files are both
more and less powerful than I expected, and have some weird quirks. This is a hopefully useful
catalog of my findings. And before you ask, yes, I am aware of Cygwin,
and use it for some tasks as well, but the native scripting interface is easier to work with
in many cases.
The Basics
First off, you'll need to create a batch file with the extension .bat.
I gave mine the rather uninspired name script.bat. You can run this file
either by double-clicking it in Windows Explorer, or by typing script
at the command line in the directory in which
it is stored. The second is often desirable when you're debugging, as the command window
will automatically disappear when the batch file exits if you run it from Explorer, taking
any output with it. Within
the file you can place any commands that you would normally execute from the command line,
essentially as you would normally use them. Read on for the exceptions.
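If you do prefer double-clicking from Explorer, one common workaround for the vanishing window is to end the file with PAUSE, which waits for a keypress before the script exits. The following is only a minimal sketch of what such a file might contain; the echoed message is made up for illustration.

```batch
@echo off
rem script.bat: a minimal illustrative skeleton, not a complete experiment script.
rem "@echo off" stops each command from being printed before it runs.
echo Starting experiments...

rem ...the commands you would otherwise type at the prompt go here...

rem Wait for a keypress so the window stays open when launched from Explorer.
pause
```

Remember to take the PAUSE back out before leaving a long run unattended, or the script will sit waiting at the end.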
Running a Series of Tests
If you had to type out every test individually, you probably wouldn't gain much from using
a script. What you likely want to do is feed your data files to your code one at a time.
For example, I work on problems in the AMPL modelling language,
which stores individual problems in .mod files. To run my program once on every
problem in a directory, I use a FOR statement like this:
FOR %%s IN (*.mod) DO thesis.exe %%s
As you might guess from the statement syntax, the %%s gets expanded into
the name of each .mod file in the directory. When you run the batch file it
will execute, for instance:
thesis.exe problem1.mod
thesis.exe problem2.mod
thesis.exe problem3.mod
And so on. Parsing the filename passed to your program into something useful is highly
dependent on the language you use, and you'll need to work that one out yourself.
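One thing that can trip up the loop above is a file name containing spaces, which would reach the program as several separate arguments. A hedged variant, assuming thesis.exe parses its command line in the usual way (so the surrounding quotes are stripped before it sees the name), is to quote the loop variable:

```batch
@echo off
rem Quote the loop variable so a name like "my problem.mod" arrives as one argument.
rem thesis.exe is the program name used throughout this article.
FOR %%s IN (*.mod) DO thesis.exe "%%s"
```

The doubled percent sign is only needed inside a batch file. If you try the same loop directly at the command prompt, use a single one: FOR %s IN (*.mod) DO thesis.exe "%s"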
Saving the Output of Your Program to a File
If you're running a lot of tests, any output from your program will likely scroll
off the top of your screen while you're off having a coffee or sleeping or any
number of other useful things. If you ran the script from Windows Explorer this is even worse,
as the fatal error message your program helpfully printed at 3:23 AM disappeared when the script
aborted. So it's a good idea to redirect the output to a file.
The simplest way to direct output to a file is with the > filename and
>> filename operators. The major difference between the two is that a single
angle bracket will overwrite filename if it exists, whereas the double bracket
version will append to the end of the file. If you're running a batch of tests you probably
want to append to a log file, but if you reuse the same script remember to change the filename
or move the original log out of the way if you want to separate your results.
Extending the example from the previous section, you can use a statement like this
to capture all the output from every run of your program into one file:
FOR %%s IN (*.mod) DO thesis.exe %%s >> scriptout.txt
.txt is the standard extension for text files recognized by Windows Notepad and
the like, so it's easiest to use that, although any extension will do. If you want to add
some other information to the file that your program doesn't output, you can use the
ECHO statement. Also useful are the date and time
commands; the /t switch makes them print the current value without prompting for a new one.
To write the date and time at the top of your output file, you can use the following:
echo Experiment Time: >> scriptout.txt
date /t >> scriptout.txt
time /t >> scriptout.txt
This will produce something like the following in scriptout.txt (the exact format depends on your regional settings):
Experiment Time:
30/01/2006
05:21 PM
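Note that > and >> on their own capture only standard output. If your program prints its fatal error messages to standard error, they will still go to the screen and be lost. A sketch of one way to capture both, assuming an NT-based Windows where cmd.exe runs the script:

```batch
@echo off
rem Send both standard output and standard error to the log file.
rem Order matters: redirect stdout to the file first, then point stderr (2) at stdout (1).
FOR %%s IN (*.mod) DO thesis.exe %%s >> scriptout.txt 2>&1
```

If you would rather keep errors separate, 2>> followed by a second file name (errors.txt, say, a name made up here) appends standard error to its own log instead.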
Compiling Your Output Data
Probably the easiest spreadsheet-readable format your program can output data to is
Comma Separated Value (CSV) format. Each line of the file is a row of the spreadsheet, with columns on each row
separated by commas (other separators are also possible). Originally I had my program
append directly to a shared results file, with each run adding a row. This was problematic
for several reasons:
- One bad run could corrupt the rest of the file
- It made it hard to restart half finished runs
- Examining results partway through a run was perilous, as my program may be trying
to access the one shared file
Instead, I opted to have each run output to its own result file, and then concatenate the
individual files together using the script. This only really addresses the third point
and part of the second one, but it's a start. For ease of scripting, the results for
each problem file, such as problem1.mod, are output to a corresponding problem1.mod.csv
file. Yes, Windows has sort of caught on to the multiple file extensions thing. They'll
open (basically) fine in Excel or OpenOffice. To accomplish this, I use something like
the following:
copy ..\results.csv .\results.csv
FOR %%s IN (*.mod) DO thesis.exe %%s
FOR %%s IN (*.mod.csv) DO copy results.csv+%%s results.csv
The first line copies a file with the headers for each column from the directory
above the current one (..\ is the representation for going up a directory level). I
keep a fresh copy there so it doesn't get clobbered. The second line you'll recognize
from above; it invokes my code on each problem. Somewhere in there the results of the test
get output to a .mod.csv file, which again you'll have to figure out for yourself.
The last line goes through all the output result files and appends them to results.csv.
Note that if you have any stray result files from previous runs in the directory they'll get appended
as well, so make sure to clear those out. There may be a cleaner way to do this inside a single
FOR loop, but I haven't put the effort in to figure that syntax out yet. Also note that appending them
results in an "unknown character" square showing up at the start of each line in Excel and Notepad,
although this does not show up in OpenOffice. I haven't tracked down the culprit, but as my first
column is just the model name it's easy to ignore.
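Both loose ends can probably be tied up. When COPY combines files with +, it defaults to ASCII mode and writes an end-of-file character (Ctrl+Z) at the end of the destination, so a likely explanation for the stray square is that each append leaves one of these behind. That is an unverified guess for this particular case; if it is right, adding the /b switch (copy /b results.csv+%%s results.csv) should avoid it. As for the single loop, a FOR body can hold several commands if they are wrapped in parentheses. The sketch below assumes thesis.exe writes its results to a file named after the problem file with .csv added, as described above:

```batch
@echo off
rem Start from a fresh copy of the header row, as in the script above.
copy ..\results.csv .\results.csv

rem Run each problem, then append its result file straight away.
rem TYPE with >> appends the text as-is and adds no end-of-file marker.
FOR %%s IN (*.mod) DO (
    thesis.exe %%s
    if exist %%s.csv type %%s.csv >> results.csv
)
```

Because this version only appends the result file belonging to each .mod file it has just run, result files for problems that are no longer in the directory are left out. A stale file for a problem that is still there can be picked up if the run fails to overwrite it, so clearing out old results remains a good habit.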
Sending Yourself the Output
As alluded to above, while running these large batches of tests you probably have other things
to do. Like sleeping. Especially sleeping. Which you generally do at your apartment, not your lab,
so a little "ding" when your program finishes may not help you much. If you're like me, what you'd
really like is to get your data without having to go out in the blizzard or blazing heat or whatever
Ottawa has decided to hit you with this month. The IT departments at most labs are clever enough to
not let you indiscriminately access your university workstation remotely, so other solutions are in
order. One of the simplest is to just e-mail your results out as an attachment. To that end I've discovered
this handy tool called Blat. The setup is not too complicated,
although you may have to poke at the registry a bit to get your default server set up properly. Once that's done,
you can finish off your batch file with something as simple as:
"C:\Program Files\Blat250\full\blat.exe" - -body results -subject results -to your.name@dept.university.ca -attach results.csv
And you're done. Nothing like waking up to a steaming hot pot of results in the morning. Well, maybe.
Command Line Options
Hard coding everything into your script isn't that flexible, as you'll probably want to
try your program with a series of different options. Any options you pass to the batch file
are accessed using the % operator. The first option is %1, the second
%2, etc. My particular program has 7 options, so my final script looks like this:
copy ..\results.csv .\results_a%1_b%2_minp%3_maxp%4_n%5_w%6_pb%7.csv
FOR %%s IN (*.mod) DO thesis.exe %%s %1 %2 %3 %4 %5 %6 %7 >> scriptout.txt
FOR %%s IN (*.mod.csv) DO copy results_a%1_b%2_minp%3_maxp%4_n%5_w%6_pb%7.csv+%%s results_a%1_b%2_minp%3_maxp%4_n%5_w%6_pb%7.csv
"C:\Program Files\Blat250\full\blat.exe" - -body results -subject results -to your.name@dept.university.ca -attach results_a%1_b%2_minp%3_maxp%4_n%5_w%6_pb%7.csv
As you can see, all 7 options are passed to the program, as well as being incorporated into
the name of the result file for easier sorting later.
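The long result file name is repeated four times in that script, which makes it easy to mistype. One possible tidy-up, offered as a sketch and not as a tested replacement, is to build the name once with SET and to refuse to run when an option is missing. The variable name RESULTS and the option names in the usage message are guesses based on the file name pattern; everything else is taken from the script above.

```batch
@echo off
rem Refuse to run unless all 7 options were supplied.
rem %~7 is the seventh option with any surrounding quotes removed.
if "%~7"=="" (
    echo Usage: %~n0 a b minp maxp n w pb
    exit /b 1
)

rem Build the result file name once and reuse it below.
set RESULTS=results_a%1_b%2_minp%3_maxp%4_n%5_w%6_pb%7.csv

copy ..\results.csv .\%RESULTS%
FOR %%s IN (*.mod) DO thesis.exe %%s %1 %2 %3 %4 %5 %6 %7 >> scriptout.txt
FOR %%s IN (*.mod.csv) DO copy %RESULTS%+%%s %RESULTS%
"C:\Program Files\Blat250\full\blat.exe" - -body results -subject results -to your.name@dept.university.ca -attach %RESULTS%
```

Here exit /b leaves the batch file without closing the command window it was started from.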
Scripting Your Script
Now that we've got all these options set up, it would be nice to run the script repeatedly
with different options without having to come in and restart it manually. To do this,
we'll want to write a script that calls our script. This was the major "gotcha" I encountered
with batch files: if you are invoking another batch file from within a batch file, you
should use the CALL statement. Otherwise the calling script will exit when
the called script exits. This is obviously not what you want if you're planning on calling
it several times. One of my "meta-scripts" (or script of scripts) looks like this:
call script.bat 1 0.001 0 0.75 20 50 1
call script.bat 1 0.001 0 0.75 20 50 2
call script.bat 1 0.001 0 0.75 20 50 3
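When only one option changes between runs, as in the three calls above, the same FOR statement used earlier can generate them. A minimal sketch, assuming the first six values stay fixed and the loop variable name %%p is arbitrary:

```batch
@echo off
rem Vary only the seventh option; the other six values are copied from the calls above.
rem CALL is still required so control returns here after each run.
FOR %%p IN (1 2 3) DO call script.bat 1 0.001 0 0.75 20 50 %%p
```

For longer numeric ranges, FOR /L %%p IN (1,1,3) counts from 1 to 3 in steps of 1, which saves listing every value by hand.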