
DaRT (Data [or Document] Retrieval Tool) is a Java application used for simultaneous retrieval of documents and archived scientific instrument data that can be accessed via the World Wide Web. It is easy for administrators to plug support for their servers into DaRT. Known archives which support DaRT are the BIMA Data Archive and the Astronomy Digital Image Library.
Although DaRT is a Java application, you do not need to install the Java Development Kit (JDK) on your system to run it. Several distributions of DaRT come bundled with the Java Runtime Environment (JRE) which interprets the byte-code written by the Java compiler. On the other hand, if the JDK or JRE is already installed on your system, you will probably want to download the DaRT distribution which does not come bundled with the JRE. In this case, getting DaRT to run on your system is nearly as easy as if you downloaded the JRE-bundled version.
This manual is separated into two parts. The User Manual explains how to use DaRT. The Server Administrator Manual outlines the few easy steps which an administrator must take to get DaRT to work with her server.
Version 3.0 of DaRT has the following new features:
Whether you start DaRT from a Web browser or
from the command line, the DaRT GUI will look
similar to this:
By default, the destination directory (where the data files will
be written) is the directory in which you started DaRT. You can set a new
directory for a given run by typing the full path to the new directory
in the Destination Directory text field (be sure to hit <ENTER>
after typing this directory name) or by clicking on the Destination...
item in the File menu. If you plan on using this directory often,
consider saving it to your .DaRTrc file using the Preferences
GUI. For more information on these topics, see Setting
the destination directory and Setting preferences
below. Once you have set the destination directory, click the Download
button. When data are being downloaded, the GUI will look like:
That is, DaRT displays the status of the download for each data set (e.g.,
percentage of file that has been downloaded, if the file is being untarred,
if the download and unpack have completed, etc.).
Although most of DaRT's functionality should work on all platforms, DaRT
has been written with Unix systems in mind. The start-up scripts for DaRT
are Unix shell scripts; there are currently no start-up scripts for other
platforms (such as MS-WindowsXX). DaRT uses Unix-like system calls for
untarring files and mailing messages, so these features probably won't
work on non-Unix-like systems. Several DaRT distributions come bundled
with the Java Runtime Environment for popular platforms, and there are
also two distributions (compiled using versions 1.1.6 and 1.1.7 of the
JDK) which can be used on any Unix-like platform which already has a similar
version of the JDK or JRE installed. The platforms for which the bundled-JRE
versions are available are:
You can retrieve a tar file containing the most recent version for your
OS and architecture from the DaRT Download
Page at http://monet.ncsa.uiuc.edu/MDC/DaRT/DaRTDownload.html.
See the section entitled Supported Platforms
for information on platforms on which DaRT can run.
To install DaRT, move the gzipped tar file to the directory
in which you want it to be installed. Uncompress and unpack the file
by typing
gunzip -c <DaRT_tarfile.gz> | tar xvf -
or, if you have GNU tar,
tar zxvf <DaRT_tarfile.gz>
The directory under which DaRT is installed will
be named DaRT-v_<version> (e.g., DaRT-v_3.0.0). The DaRT
executable is DaRT-v_<version>/bin/dart. At this point, it
is a good idea to create a symbolic link to the DaRT-v_<version>
directory. This will make it easier for users to configure their browsers
to use DaRT. To create this symlink, type
If you are installing DaRT for many users and would like the executable
to be in a standard location (e.g. /usr/local/bin), copy the file
DaRT-v_<version>/aux/dart
to this directory and edit its one non-comment line to reflect the full
path to your actual DaRT startup script. (In case you are curious, simply
copying the DaRT-v_<version>/bin/dart script won't work. This
script looks for files using relative paths, and so it relies on being
in the DaRT-v_<version>/bin directory.)
If you have downloaded a distribution which includes the JRE, you are
done with the installation. You may proceed to the section entitled Starting
DaRT.
If you have downloaded a distribution that does not include the JRE
(currently the DaRT_v_3.0.0_VANILLA_UNIX_1.1.x distributions), you will need
to do only a little more work to complete the installation. You will need to
edit the top of the file named DaRT-v_<version>/bin/dart (e.g.,
DaRT-v_3.0.0/bin/dart).
This should be painless; follow the instructions in this file. You should
only need to set the JAVA_HOME and THREADS variables.
The following method is now preferred over editing the
.mailcap file (a description of which appeared in previous
version README files).
To configure Netscape to use DaRT,
click on the Preferences... item in the Netscape Edit menu.
A Preferences GUI appears. In this GUI, click on the right arrow by the
Navigator
option on the menu on the left side. A sub-menu appears. In this sub-menu,
click on Applications. In the area on the right, click the New...
button. An Application GUI appears. In the MIME Type: text field,
type application/x-multiget. Click on the
Application:
radio button, and in the text field next to it type
<path to dart> %s
where <path to dart> is the path to the
dart executable. The %s is important! Without it, DaRT will not
function as expected.
Everything else in the Application GUI can be ignored.
Finally, click the OK button in the Application GUI and then click
the OK button in the Preferences GUI. Netscape should now be
configured to run DaRT.
To download BIMA data sets once you have configured your browser to
run DaRT, start by going to the BIMA DATABASE QUERY PAGE at http://bima-server.ncsa.uiuc.edu/secure/bimaquery.html,
enter your search parameters, and click the QUERY DATABASE button.
On the Query Results page which is returned, check the items which
you wish to download and click the Request Selected Items button.
This will start DaRT. Usually it takes several (10 or so) seconds for DaRT
to start after clicking the Request Selected Items button.
DaRT can also be run from the command line. The following command line
syntax is supported:
dart [-h] [-v] [-r resource file] [ [-d || -n] multiget file]
The switches are:
DaRT can be used to download any document that is available on the
web. In order to do this, you need to create a file which contains the
URLs (one per line) of the documents you wish to download. For example,
consider the file named foo which looks like:
The cells in the Local File column are editable; in this way
you may specify the local file name to which the downloaded documents will
be written, e.g.,
If no file name is specified, the last field of the URL will be used
as the local file name. In the example above, http://www.csclub.uwaterloo.ca/u/relipper/tolkien/rootpage.html
will be written to a file named rootpage.html and http://www.cheaptickets.com/
will be written to a file named www.cheaptickets.com.
DaRT menu items support keyboard accelerators.
Keyboard accelerators can be used to activate a menu item instead of clicking
on it with the mouse. To use a keyboard accelerator, press the ALT
and accelerator keys simultaneously. The accelerator key is the key that
contains the letter which is underlined on the menu item. For example,
pressing the ALT and E keys simultaneously after the
File
menu has been chosen will cause the Exit command to be
performed.
This GUI contains a number of tabbed panes in which you
specify various preferences.
By clicking the Save Only button, the new preferences are saved
to a disk file, but not applied to the current session. Clicking the
Apply
Only button applies the preferences to the current session, but they
are not saved for future use. Clicking the Save & Apply button does
both. Saved preferences are written to a file named .DaRTrc which
is located in your home directory.
The Preferences GUI includes
help documentation; just click on the Help menu and choose the
tabbed pane on which you would like information.
Most preferences you enter
apply only to the archive that is currently loaded. This means that you
can have one set of preferences for the "default" archive (which is really
not an archive at all but the object that lets you download general web
documents), for the BIMA Data archive, and for any other archives for which
you "plug in" DaRT support. The preferences which are archive
independent are Directory containing multiget files and
the Fonts. Various preferences are described in detail below.
DaRT uses two fonts for labeling components. The medium
font is used for components such as text field labels, buttons, table
column headers, menus, and menu items. The small font is used
for text fields, table cells, tabbed panes, lists, and combo boxes. A
given font may look dramatically different on different CPU/display
combinations. A font that is easily legible on one pair of these
devices may be too small to read on another. The only global solution
to the "font" problem is to let the user set the fonts that work best
on her system.
The preferences GUI contains a Font panel. This panel looks like:
The top half of this panel is used to set the medium font while the
bottom half is used to set the small font. A font is set by choosing
various combinations of values in the Name, Style, and Size
combo boxes. Whenever a new choice is made in one of these boxes, the
corresponding font is displayed on the label below the boxes.
When you apply the new preferences, the new fonts may not show up.
You will have to save your preferences, exit DaRT, and restart it in
order to see your font changes.
As of version 3.0.0, the user can specify the background color of
the GUIs which DaRT uses. To configure your preferred color, choose
the Preferences... item in the Options menu. In the
Preferences GUI which appears, click on the Colors tabbed
pane. A color chooser tabbed pane appears which looks like
You can choose a color from the Swatches panel, or (for more
choices) by using the HSB or RGB panels. The currently
chosen color appears in the preview panel below the main color
chooser. By clicking the Reset button, the chosen color is
reset to the current GUI background color.
Table columns preferences are specified on the Table Columns
tabbed pane of the Preferences
GUI which looks like:
This interface allows the user to set the column names, column widths,
and order in which the columns appear in the DaRT GUI. This panel consists
of a table on the left where the user sets her preferences, a (non-editable)
list on the right which consists of acceptable column names for the chosen
archive, and Add and Remove buttons near the bottom which
allow the user to add or delete a column from her preferences. Note that
the column named "" is the untitled check box column.
To change the location of a column, select the row which describes that
column in the Actual Column Names table by clicking on that row
with the left mouse button. Then, while keeping that button pressed, drag
the row to its new position and release the mouse button. The row containing
the column information will move to its new location in the
Actual Column
Names table. While you are dragging the row, you will notice that a
rectangle representing the row moves with the pointer.
To have a column removed from the DaRT GUI table, select it in the
Actual
Column Names table by clicking on it with the left mouse button. Then
click the Remove button. Poof! The information on the column is
gone!
To have a column added to the DaRT GUI table, select its name in the
Acceptable Column Names list by clicking on it with the left mouse
button. Then, in the Actual Column Names table, click on the row
of column information before which you want the new column to appear. Then
click the Add button. The information on the new column is entered
into the Actual Column Names table. The default width of a newly
added column is 75 pixels.
To set the width of a column in pixels, double click on the value in
the cell in the Column Width column of the Actual Column Names
table. The font will change and a cursor will appear in this cell which
means you can edit it. After you have completed your edit, hit the <ENTER>
key. The cell's font will return to what it originally was. If you enter
something other than a positive integer, the old value for the width will
be restored.
After starting DaRT, you may want to set
the directory to which data sets should be written. You can do this in one
of two ways. If you know the name of the directory, you can simply type
it in the Destination text field near the bottom of the DaRT GUI.
Be sure to hit the <ENTER> key when you are done typing the
directory name. If you have typed the name of a directory which does not
exist or is not writable, a dialog box will appear informing you of this
and the directory name will be reset to its previous value.
Alternatively, you can search for a directory if you do not have
one in mind. To do this, click on the File menu and then click on
the Destination... item. A "file chooser" which displays only directories
will appear which looks like:
To select a directory, click only once on its name in the file
chooser. The directory name will appear in the text field near the bottom
of the 'chooser. To use this directory as the destination, click the OK
button in the 'chooser. The 'chooser will disappear and the new directory
will appear in the Destination text field in the DaRT GUI. To traverse
a directory using the file chooser, double click on the folder icon to
the left of its name. The file chooser will then display the subdirectories
of the directory that was just clicked.
If the directory you choose is the directory you will want to use in
the future as well, you can set this preference. Click on the Options
menu in the DaRT GUI, and then click on the Preferences... item.
The preferences GUI appears which looks like:
In this new GUI, type the name of the destination directory you want
to use in the Directory in which to download data sets: text field.
To save this preference (and others that you enter using this GUI) and
apply it to this run, click the Save & Apply button. Now when
you run DaRT in the future, this directory will be the default area to
which data sets are written. Your DaRT preferences are stored in an ASCII
file named .DaRTrc which is located in your home directory.
This string can be composed of ordinary characters as well as macros.
A macro is a string surrounded by curly brackets ({}) which indicates that
a meta-datum associated with the data set should be used in creating the
directory structure (see the examples below). Each archive has its own
set of valid macros, and these macros are listed in the (uneditable) box
below the text field mentioned above. Note that the macro names are case
sensitive.
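The macro substitution described above can be sketched in a few lines of Python. This is only an illustration of the idea, not DaRT's own code: the function name expand_directory_structure and the metadata dictionary are hypothetical, and the values are those of the BIMA data set discussed in this section.

```python
import re

# Hypothetical helper, not part of DaRT: replace each {macro} in a
# directory-structure string with the matching metadata value.
MACRO = re.compile(r"\{([^{}]+)\}")

def expand_directory_structure(structure, metadata):
    # Names are looked up exactly as written, so {project} does not match
    # a "Project" entry; an unknown macro name raises KeyError.
    return MACRO.sub(lambda match: metadata[match.group(1)], structure)

# Metadata for BIMA data set 1733-130 (project t240c115.L43, 99may30).
bima = {"year": "99", "month": "may", "day": "30",
        "Project": "t240c115.L43", "Dataset": "1733-130"}

print(expand_directory_structure("{year}/{month}/{day}/{Project}", bima))
```

Ordinary characters pass through untouched, which is why a string such as foo{year}/bar can mix fixed text and macros freely.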
These examples illustrate how to use the directory structure property
in conjunction with the BIMA Data Archive. If you are using DaRT to retrieve
files from another archive, these examples will still be useful to you.
There are five valid macros which can be used for BIMA data. They
are {year}, {month}, {day}, {Project}, and {Dataset}.
Consider the BIMA data set 1733-130 which was observed as part
of project t240c115.L43 on 99may30.
The default directory structure for BIMA data is
{year}/{month}/{day}/{Project}.
So, by default, this data set will be written to the directory
99/may/30/t240c115.L43
under the destination directory.
If the directory structure is set to
{Project}/{day}{month}{year},
the data set will be written to the directory
t240c115.L43/30may99.
If
foo{year}/bar/{Project}_{day}/stuff
were specified, the data set would be written to the directory
foo99/bar/t240c115.L43_30/stuff.
You can de-select data sets (or documents)
to be downloaded by clicking the check boxes in the DaRT table (you may
have to click twice to de-select an item and twice again to re-select it).
Only items which have checks by them will be downloaded. Of course, this
assumes that the check box column is displayed in the table. If not, all
displayed data sets will be retrieved.
As of version 3.0, DaRT supports download retries. This
feature is useful when not all data sets have been successfully
retrieved on the first attempt. For example, if you are trying to
download BIMA data sets and there is a problem with mass-store, you
may not be able to get all your data sets in one go. In this case,
DaRT will automatically try to download these data sets again if you
have set these preferences. You set the maximum number of re-tries
DaRT should make and the time interval between subsequent tries using
the Preferences GUI. In the DaRT GUI, click on the Options menu
and then click on the Preferences... item. The Preferences GUI
will appear. In this GUI, click on the Re-Tries tab. The
panel which is displayed looks like
Set the maximum number of download re-tries DaRT should attempt by
typing this non-negative integer in the text field entitled Maximum
number of download re-tries. Set the time interval between
attempts (the end of one download cycle to the beginning of the next)
by filling in the time in the Hours: and Minutes: text
fields. If a download fails and the number in the Maximum
number of download re-tries text field is positive, DaRT will
automatically retry the download in the specified time. During the
waiting period between downloads, the Status text field of the
DaRT GUI is periodically updated with the time remaining until the
next download re-try begins.
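The re-try behaviour described above amounts to a simple loop. The sketch below is an assumption about the general shape of such a loop, not DaRT's implementation; the names download_with_retries and try_download are hypothetical, and the sleep function is passed in so the waiting period can be replaced in tests.

```python
import time

def download_with_retries(items, try_download, max_retries, wait_seconds,
                          sleep=time.sleep):
    """Return the items that still failed after all permitted attempts.

    try_download(item) is a caller-supplied function returning True on success.
    """
    pending = list(items)
    retries_used = 0
    while True:
        # One download cycle: keep only the items whose download failed.
        pending = [item for item in pending if not try_download(item)]
        if not pending or retries_used >= max_retries:
            return pending
        retries_used += 1
        # Wait from the end of this cycle to the beginning of the next one.
        sleep(wait_seconds)
```

With max_retries set to 0 the loop makes a single attempt and returns, which matches the statement that re-tries only happen when the maximum is positive.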
Once you have chosen the destination directory and
the data sets to be retrieved, you are ready to download. Click the Download
button to do this. If the Status column is displayed, you will see
messages in its cells on the status of data sets which are currently being
retrieved. When the status says "Staging...", the URL connection is being
made (and, for BIMA data, the data set is being transferred from mass-store
to the server if necessary). "Downloading..." indicates that the data set
is being transferred from the server to your machine. During this phase,
if the Progress column is displayed, the corresponding progress
bar is updated to indicate how much of the data set has been written to
your disk (optionally, if the Size column is displayed, the estimated
time of completion can also be displayed on the progress bar by setting
the appropriate item in the preferences GUI). After a data set has been
downloaded, it can be unpacked (untarred) if the archive which you are
accessing delivers tar files (the BIMA Data Archive does). This message
appears in the status field. After the data set has been downloaded and
optionally unpacked, a "Done" message appears in the status area and the
corresponding check box is unchecked.
You can abort downloads in progress. Just click the Stop
button, and an abort will occur for all data which are currently being
staged or downloaded or for any pending ("Idle") downloads. Any partially
retrieved data sets will be removed from local disk. Data sets which have
been completely downloaded will remain on disk. Aborts will not occur on
data sets which have been downloaded and are being unpacked (untarred).
The untar will complete normally, and the unpacked data set will remain
on disk.
As of version 3.0, DaRT supports download scheduling.
In order to schedule a download, click on the File
menu and then click on the Schedule download... item. A
download scheduler
GUI appears which looks like
In the text fields, fill in how long after you press the Submit button you want the download to begin.
After the Submit button is pressed, the
batch GUI disappears. The Status text field of the DaRT GUI is
periodically updated to show how much time remains before the download
will begin.
To quit DaRT, click on the File menu and then click
the Exit item.
To get help, click on the Help menu and then click on
the DaRT GUI Help... item. A GUI is created which looks like:
It contains a list of items for which help is available on the left
and the help description in the HTML pane on the right. To get help on
a specific item, click on it in the list.
DaRT alerts you when something bad (like an error)
occurs by changing the label to the right of the Stop button which
normally says "No new errors" to "New error message". During each run,
DaRT maintains an internal error log which can be viewed by clicking on
the View menu and then clicking on the Error Log... item.
An error log GUI is created which looks like:
If there is an error that you can't decipher, your best bet is to mail
the error log to the server administrators and the developers. This is
easily achieved by clicking the Mail... button in the Error Log
GUI. A Mailer GUI is created which contains the text of the error log.
It looks like:
Click the Send button to send the log.
dart <Path to previously saved file>
where <Path to previously saved file>
is the full path to the file you previously saved. Alternatively, you can
load this file after DaRT has been started. In this case, start DaRT from
the command line without any arguments:
dart
Once DaRT is running, click on the File menu and then click on
the Open... item. A file chooser is created which you can use to
find the file you had previously saved. Once you have found this file,
click the Open button in the file chooser.
If you plan on saving multiget files frequently, you should put
these files in their own directory. Furthermore, you can let DaRT know
the name of this directory by setting it in the Preferences GUI. This item
is the Directory containing URLs Files text field in the Preferences
GUI. If you use this feature, the Open... and Save... file
choosers will be created with a view of this directory, which will make
it unnecessary for you to use them to navigate the file system. In
addition, if you start DaRT from the command line and specify the file
name, you need not give DaRT the full path to the file; it will
automatically look for this file first in the current directory and
then in the directory which you have specified in your preferences.
This preference is archive independent.
This means that you may choose only one directory in which to put all
of your multiget files, regardless of which archive they represent.
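The lookup order described in this section (current directory first, then the directory set in the preferences) can be sketched as follows. The function name find_multiget_file and its parameters are hypothetical; this illustrates the search order only and is not DaRT's code.

```python
import os

def find_multiget_file(name, current_dir, multiget_dir=None):
    """Return the first existing path for name, or None if it is not found."""
    candidates = [os.path.join(current_dir, name)]
    if multiget_dir:
        # Directory saved in the preferences; it may be unset.
        candidates.append(os.path.join(multiget_dir, name))
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None
```

A file with the same name in the current directory therefore shadows the one in the preferences directory under this ordering.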
Type your message in the text area and click the Send button.
What's up with that?
A. The GUI is too small to accommodate all of the components within
it. Try making it bigger by resizing it with the mouse.
For a permanent fix, set the height and width preferences and save them.
Q. I'm running DaRT from Netscape. However, when DaRT starts,
no table appears and the status text field says " No
valid file selected". What's going on?
A. You probably forgot to put in the %s when you configured Netscape
to run DaRT. Read the section entitled
Running DaRT from Netscape and pay close attention to the bits
about the %s.


ln -s DaRT-v_<version> DaRT
For example, to create a symlink which points to the version 3.0.0 release,
type
ln -s DaRT-v_3.0.0 DaRT
This is useful when installing new versions; just delete and remake
the symlink to point to the new version. This way your users won't
need to reconfigure their browsers when a new version of DaRT is installed.
This is also useful if you want to have more than one version
of DaRT installed. If there is a problem with the new version (which, of
course, should never happen :) ), all you need to do is to remove the symlink
and re-create it so that it points to the old version.
-h
Prints a usage message and exits.
-v
Verbose output for debugging.
-r
User preferences are loaded from resource file. If not specified,
preferences will be loaded from .DaRTrc in the user's home
directory.
-d
Attempt to download all the data sets specified in multiget file
automatically upon startup.
-n
Same as -d, only do not start a GUI. Program exits after files
have been downloaded.
multiget file
The file which lists the URLs and optional other information of the requested data sets. DaRT first
looks in the directory specified by the URLFilesDir property in
the .DaRTrc file (see Saving multiget files for later use
for more information).
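To make the relationship between the switches concrete, here is a model of the command line syntax written with Python's argparse module. It is not how DaRT parses its arguments; the attribute names (resource_file, download, no_gui, multiget_file) are made up for this sketch, and the rule that -d and -n need a multiget file is inferred from the bracketing in the syntax line.

```python
import argparse

def build_parser():
    # argparse provides -h (usage message and exit) automatically.
    parser = argparse.ArgumentParser(prog="dart")
    parser.add_argument("-v", dest="verbose", action="store_true",
                        help="verbose output for debugging")
    parser.add_argument("-r", dest="resource_file",
                        help="load preferences from this file instead of .DaRTrc")
    mode = parser.add_mutually_exclusive_group()  # -d and -n cannot be combined
    mode.add_argument("-d", dest="download", action="store_true",
                      help="start downloading automatically upon startup")
    mode.add_argument("-n", dest="no_gui", action="store_true",
                      help="same as -d, but without a GUI")
    parser.add_argument("multiget_file", nargs="?",
                        help="file listing the URLs of the requested data sets")
    return parser

def parse_args(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    if (args.download or args.no_gui) and args.multiget_file is None:
        parser.error("-d and -n require a multiget file")
    return args
```

For example, parse_args(["-n", "foo"]) models an unattended download of the data sets listed in foo.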
http://monet.ncsa.uiuc.edu/ADC/DaRT/index.html
http://www.yahoo.com/
http://www.csclub.uwaterloo.ca/u/relipper/tolkien/rootpage.html
http://www.cheaptickets.com/
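When no local file name is given, the last field of the URL is used as the file name. A minimal sketch of that rule, assuming that a URL with no path falls back to the host name (as the www.cheaptickets.com example suggests), could look like this. The function name default_local_name is hypothetical.

```python
from urllib.parse import urlsplit

def default_local_name(url):
    """Return the last non-empty path field of url, or the host name."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    return segments[-1] if segments else parts.netloc

for url in ("http://www.csclub.uwaterloo.ca/u/relipper/tolkien/rootpage.html",
            "http://www.cheaptickets.com/"):
    print(url, "->", default_local_name(url))
```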
If you start DaRT by typing
dart foo &
the DaRT GUI will look similar to this:








{month}
{day}
{Project}
{Dataset}







Getting DaRT to work with your archive requires a small amount of work on both the server and client side. I'll describe the client side development first.
The minimum you will need to do is to create a simple text
(properties) file called Archive_<your archive
name>.properties, where <your archive name> is a
short (six or so character) name for your archive (e.g., for the BIMA
and ADIL archives, these files are named
Archive_BIMA.properties and
Archive_ADIL.properties).
Furthermore, this file should be
located under the directory tree ncsa/sciarch/archive, e.g.,
the BIMA properties file is
<someTopDirectory>/ncsa/sciarch/archive/Archive_BIMA.properties. This directory
structure is what DaRT looks for, so this is important.
In this file are the properties (actually key-value pairs) which
describe your archive. This is just a Java resources file, so if you
know Java, enough said. If you don't, the way one of these is set up
is that you either have a single key-value pair or a comment on each
line (blank lines are also permitted). Comment lines begin with a "#". Key-value pairs take
the form of
key: value
or
key=value.
For example, the line
# This is the DaRT properties file for the BIMA data archive
is an example of a comment, and
contactEmail: bimadata@bima-server.ncsa.uiuc.edu
and
contactEmail=bimadata@bima-server.ncsa.uiuc.edu
are examples of how to set the contactEmail property.
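If you would like to sanity-check a properties file before packaging it, the line rules above are easy to model. The sketch below handles only what this section describes (comments, blank lines, and key: value or key=value pairs); the full Java format has further rules, such as escapes and line continuations, which it deliberately ignores. The function name parse_properties is hypothetical.

```python
def parse_properties(text):
    """Parse the simple subset of the properties format described above."""
    properties = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        # The key ends at the first ":" or "=", whichever comes first.
        positions = [p for p in (line.find(":"), line.find("=")) if p != -1]
        if not positions:
            continue  # not a key-value pair; ignored in this sketch
        split_at = min(positions)
        properties[line[:split_at].strip()] = line[split_at + 1:].strip()
    return properties

sample = """# This is the DaRT properties file for the BIMA data archive
contactEmail: bimadata@bima-server.ncsa.uiuc.edu
fileType=tar
"""
print(parse_properties(sample))
```

Splitting at the first separator matters for values that themselves contain a colon, such as the http:// URLs used in the URLformat property.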
Below is a list of currently supported properties. You do not need
to set all of them; the ones which you must set are marked with a
*. The properties which the user can override are marked
with a !.
| Property Name | Description |
|---|---|
| contactEmail* | Email address to which users can send questions, comments, etc. |
| columnNames*! | A list of the types of data that can be displayed |
| URLformat* | The format of the URLs to your archive's files. |
| additionalDescriptors | Describes how entries listed in the columnNames property which are not simple macros should be constructed. |
| fileType | The type of file your archive delivers, used for post-download processing. Currently only fileType=tar is supported. |
| columnWidths! | A list of positive integers which specify the widths in pixels of the columns listed in the columnNames property (defaults to 75) |
| directoryStructure! | Instructs DaRT to put downloaded files under the named directory structure. |
Since this section details the minimum effort that you need to expend, I will describe only the mandatory properties that you must set. Before I do this, however, it will be useful for you to see what a working resources file looks like. Below is the resource file for the BIMA Data Archive.
# DaRT resource file for the BIMA data archive by Dave Mehringer
#contact email address
contactEmail: bimadata@bima-server.ncsa.uiuc.edu
#format of the URL to data sets
URLformat: http://bima-server.ncsa.uiuc.edu/bima/data/archive/{year}/{month}/{day}/{Project}/{Dataset}.t
#permitted column names
columnNames: "",Date,Project,Dataset,Size,Location,Progress,Status
#column widths in pixels expressed as a list of comma delimited positive
#integers
columnWidths: 15,65,100,70,70,70,70,70
#additional column descriptions
#describe how to create the contents of columns whose names are not
#specified in the URLformat property. A "@" at the beginning of a
#definition means that all the values for the cells in this column will be
#obtained simultaneously by executing the method getColumnData() in the
#archive administrator defined Archive_ class.
#A "$" at the beginning of a definition means that the values for the
#cells in this column will be obtained individually by executing the method
#getCellValue() in the archive administrator defined
#Archive_ class.
additionalDescriptors: Date = {year}{month}{day}, Size = @sizes, Location = @locations
#file type
fileType: tar
#format of output directory structure (this will be the default, the
#user can define it herself)
directoryStructure: {year}/{month}/{day}/{Project}
OK, so now that you've seen it, let me explain exactly what it means.
The contactEmail property is obvious; DaRT uses this in its mail GUI on the To: line. This expedites the mailing of comments and error logs to you.
The URLformat property describes the general format of the URLs used to access your data. The bits in curly braces are called macros and represent metadata which describe the data set. As can be seen in the file above, the BIMA archive uses
URLformat: http://bima-server.ncsa.uiuc.edu/bima/data/archive/{year}/{month}/{day}/{Project}/{Dataset}.t
for its URL format. Thus, a valid URL could be http://bima-server.ncsa.uiuc.edu/bima/data/archive/87/jan/31/project141/omc1.t. DaRT uses this information to determine which archive the URLs represent. The macros defined in the URLformat property can be used as column names in the DaRT table. So, in our example above for the BIMA archive, the DaRT GUI would look something like:

if the user overrode the default columnNames property in the file above.
Because of this, it is useful for the user if you give your macros descriptive names, although, in principle, you can call them anything you like. Here's what the URLformat property for the Astronomy Digital Image Library looks like:
URLformat: http://imagelib.ncsa.uiuc.edu/project/download/{Year}.{Initials}.{Project Number}.{Image ID}
Thus, a valid URL to a data set in this archive could be http://imagelib.ncsa.uiuc.edu/project/download/45.DM.22.01 and the GUI in this case might look like

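To see how a URLformat value can both identify an archive and supply the metadata for the table columns, consider the following sketch. It is one possible way to do the matching, not a description of DaRT's internals: the function names are hypothetical, and the sketch assumes that a macro value never contains a "/".

```python
import re

def compile_url_format(url_format):
    """Turn a URLformat string into a regular expression plus macro names."""
    # re.split with a capturing group alternates literal text and macro names.
    pieces = re.split(r"\{([^{}]+)\}", url_format)
    names = pieces[1::2]
    pattern = "".join(
        "([^/]+?)" if index % 2 else re.escape(piece)
        for index, piece in enumerate(pieces)
    )
    return re.compile("^" + pattern + "$"), names

def match_url(url_format, url):
    """Return a dict of macro values if url fits url_format, else None."""
    regex, names = compile_url_format(url_format)
    match = regex.match(url)
    return dict(zip(names, match.groups())) if match else None

BIMA = ("http://bima-server.ncsa.uiuc.edu/bima/data/archive/"
        "{year}/{month}/{day}/{Project}/{Dataset}.t")
print(match_url(BIMA, "http://bima-server.ncsa.uiuc.edu/bima/data/archive/"
                      "87/jan/31/project141/omc1.t"))
```

Positional groups are used instead of named groups because macro names such as Project Number contain spaces. Formats that separate macros with a character that may also occur inside a value (for example the dots in the ADIL format) can be ambiguous under this simple approach.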
The final property that you are required to supply is columnNames. The truth is, you don't have to supply this, but I highly recommend that you do; your users will appreciate it. The value taken by columnNames is a comma-delimited list of strings. It is important to understand that, to DaRT, a column name is more than just a label. It tells DaRT what type of data to display in the column. You are limited in the choices you can use for column names. You can use any of the macros that you defined in the URLformat property. You can use any of the keys which are specified in the additionalDescriptors property (see the section entitled Beyond the minimum for more information on this property). Finally, you can use any of the pre-defined column names which are described in the next paragraph.
Pre-defined column names: Currently, there are six pre-defined column names. These are listed in the table below. The first five are pretty easy to understand. In order to be able to use the Size column name, you must specify it in the list of values associated with the additionalDescriptors property. To use the Size column, you must write a Java class. For these reasons, the Size column will be described in the section entitled Writing an archive-specific Java class. The pre-defined column names are
| Pre-defined Column Name | Description |
|---|---|
| "" | Check boxes which the user can click to specify if the data set should be retrieved. |
| Progress | Progress bars are displayed to give information on how much of a data set has been downloaded. |
| Status | A string (actually a JLabel) is displayed which provides information on the status of the download. |
| URL | The URL to the data set is displayed. |
| Output File | The cells in this column are editable so the user can specify the name of the local output file. |
| Size | The sizes of the files are displayed. The sizes are usually obtained from a CGI script, so this is beyond "the minimum". |
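Before releasing a properties file, it can help to check that every entry in columnNames is one DaRT can interpret. The sketch below applies the rules from this section: a column name must be a URLformat macro, a key of additionalDescriptors, or a pre-defined name, with Size accepted only when it is listed in additionalDescriptors. The function names are hypothetical, and the simple comma splitting assumes descriptor definitions contain no commas.

```python
import re

# Pre-defined names that need no further definition ("" is the check box column).
ALWAYS_AVAILABLE = {"", "Progress", "Status", "URL", "Output File"}

def allowed_column_names(url_format, additional_descriptors=""):
    macros = set(re.findall(r"\{([^{}]+)\}", url_format))
    keys = {entry.split("=", 1)[0].strip()
            for entry in additional_descriptors.split(",") if "=" in entry}
    return macros | keys | ALWAYS_AVAILABLE

def invalid_columns(column_names, url_format, additional_descriptors=""):
    """Return the column names that none of the three rules allows."""
    allowed = allowed_column_names(url_format, additional_descriptors)
    names = [name.strip().strip('"') for name in column_names.split(",")]
    return [name for name in names if name not in allowed]

url_format = ("http://bima-server.ncsa.uiuc.edu/bima/data/archive/"
              "{year}/{month}/{day}/{Project}/{Dataset}.t")
descriptors = "Date = {year}{month}{day}, Size = @sizes, Location = @locations"
columns = '"",Date,Project,Dataset,Size,Location,Progress,Status'
print(invalid_columns(columns, url_format, descriptors))
```

An empty result means every column name in the list can be resolved under these rules.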
When you have finished creating your Archive_<archive name>.properties file, you will next need to put it in a jar (Java Archive) file. The jar utility is distributed with the Java Development Kit, which is free, so if you don't already have the JDK you will have to get it. You should get version 1.1 of the JDK, which, for Solaris, can be found at http://java.sun.com/products/jdk/1.1/ . JDK distributions for other platforms can be found via http://java.sun.com/cgi-bin/java-ports.cgi.
This is where the directory structure for the properties file described above comes into play. Using the notation from above, change directory to <someTopDirectory> and then execute the command
jar cvf <your archive name>.jar ncsa/sciarch/archive/Archive_<your archive name>.properties
It is imperative that <your archive name> be the same in the two places where it occurs in the above command. So, for example, to create the required jar file for the FOO archive, you would execute
jar cvf FOO.jar ncsa/sciarch/archive/Archive_FOO.properties
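Because the path inside the jar matters, it may be worth checking the finished jar before you distribute it. The following Java sketch is one way to do that; PluginJarCheck is a hypothetical helper and is not part of DaRT.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical pre-release check, not part of DaRT.
final class PluginJarCheck {
    // Path the properties file is expected to have inside the jar.
    static String expectedEntry(String archiveName) {
        return "ncsa/sciarch/archive/Archive_" + archiveName + ".properties";
    }

    // True if the jar at jarPath contains the expected properties entry.
    static boolean hasPropertiesEntry(String jarPath, String archiveName)
            throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(expectedEntry(archiveName)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java PluginJarCheck FOO.jar FOO
        boolean found = hasPropertiesEntry(args[0], args[1]);
        System.out.println(found ? "entry found" : "entry missing");
    }
}
```

If the entry is reported missing, the most likely causes are running jar from the wrong directory or using two different archive names in the command.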
The jar file containing the properties file is all DaRT needs to recognize your archive. You must make this jar file available to your users (e.g., via ftp, http, etc.). After a user downloads it, she must put it in the lib directory under the top directory of her DaRT distribution. The BIMA.jar and ADIL.jar files which are included in the standard DaRT distribution as well as files named classes.jar and swingall.jar will be in this directory already.
PLEASE TEST DaRT WITH YOUR ARCHIVE PLUG-IN THOROUGHLY BEFORE YOU RELEASE IT TO YOUR USERS. This will save you much time sending email to your users when they complain that things aren't working. If you need help getting your plug-in to work, let us know and we will help as we can.
If you want DaRT to support additional features for your archive, this section is for you. I'll start by describing the optional properties which can be set in the Archive_<your archive name>.properties file.
The fileType property specifies the type of file your archive delivers and allows the user to process this file after downloading it. Currently, only fileType=tar for tar files is supported. In this case, the user has the option of unpacking the file after it has been downloaded. If your archive delivers anything other than tar files, don't bother setting this property. If you'd like your users to be able to do post-download processing on your data sets, let us know and maybe we can modify DaRT to support your file type.
The columnWidths property specifies the widths, in pixels, of the columns listed in the columnNames property. The value is a comma-delimited list of positive integers, one per column. For example, if you set
columnNames="",Status,Progress
you could set
columnWidths=20,80,75
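The correspondence between the two lists can be made explicit with a short Java sketch. It is only an illustration: ColumnWidths is a hypothetical class, and the requirement that both lists have the same length is an assumption rather than documented DaRT behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of DaRT.
final class ColumnWidths {
    // Pair each column name with the width at the same position.
    static Map<String, Integer> pair(String columnNames, String columnWidths) {
        String[] names = columnNames.split(",", -1);
        String[] widths = columnWidths.split(",", -1);
        if (names.length != widths.length) {
            throw new IllegalArgumentException(
                names.length + " column names but " + widths.length + " widths");
        }
        Map<String, Integer> result = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            int width = Integer.parseInt(widths[i].trim());
            if (width <= 0) {
                throw new IllegalArgumentException("width must be positive: " + width);
            }
            result.put(names[i].trim(), width);
        }
        return result;
    }

    public static void main(String[] args) {
        // The example values from the text: "" is 20 pixels wide,
        // Status 80 and Progress 75.
        System.out.println(pair("\"\",Status,Progress", "20,80,75"));
    }
}
```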
The directoryStructure property allows you to define the default directory structure under which your archive files are written. For details of how to specify this, see the section entitled Setting the directory tree structure to which data sets are written in the USER MANUAL. The macros which you can use to specify this property are any subset of the strings listed in the columnNames property.
The last property you need to know about is the additionalDescriptors property. This property allows you to specify additional column names not already specified by the macros in the URLformat property. The value taken by this property is a list of comma-delimited key=value pairs. For example, the BIMA archive defines this property as
additionalDescriptors: Date = {year}{month}{day}, Size = @sizes, Location = @locations
The keys in the key-value pairs in this property come in two flavors. The first is called a compound macro. A compound macro is just a bunch of simple macros which are defined in the URLformat property strung together. For example, consider the BIMA archive's Date compound macro. This macro is defined to be Date = {year}{month}{day} (recall from the Archive_BIMA.properties file listed in the section entitled The minimum that the macros {year}, {month}, and {day} are all defined in the URLformat property). So, for a URL that has {year}=98, {month}=jan, and {day}=13, the value that would appear in the Date column of the DaRT GUI would be 98jan13. In a sense, the directoryStructure property is like a compound macro.
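The substitution that a compound macro performs can be sketched in Java as follows. This illustrates the idea and is not DaRT's implementation; CompoundMacro is a hypothetical class, and leaving unknown macros untouched is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of DaRT.
final class CompoundMacro {
    // Replace every {name} in the template with its value.
    static String expand(String template, Map<String, String> values) {
        Matcher m = Pattern.compile("\\{([^}]+)\\}").matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            // Assumption: a macro with no known value is left as it is.
            String replacement = (value != null) ? value : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("year", "98");
        values.put("month", "jan");
        values.put("day", "13");
        // The BIMA Date compound macro from the text
        System.out.println(expand("{year}{month}{day}", values)); // prints 98jan13
    }
}
```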
The second flavor of keys that can be specified in the key-value pairs in the additionalDescriptors property is a method parameter. A method parameter is a key whose value starts with a $ or a @. DaRT takes all but the first character of the value as a String object and passes it to a method in a Java class that you must define. The method to which this String is passed depends on the first character of the value. If the first character is a $, DaRT will call the method getCellValue() in the class you define. If the first character is a @, DaRT will call the method getColumnData() to get the data for this column. Thoroughly confused? Let's back up a bit.
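As a first step toward untangling this, here is a Java sketch of how the two flavors can be told apart. It is not DaRT's code: DescriptorKind is a hypothetical class, and splitting the property value on commas assumes that no value itself contains a comma.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of DaRT.
final class DescriptorKind {
    enum Kind { COMPOUND_MACRO, CELL_METHOD, COLUMN_METHOD }

    // Split "key = value, key = value" into an ordered map.
    static Map<String, String> parse(String propertyValue) {
        Map<String, String> pairs = new LinkedHashMap<>();
        for (String pair : propertyValue.split(",")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue;  // skip anything that is not key = value
            }
            pairs.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return pairs;
    }

    // "$..." leads to getCellValue(), "@..." to getColumnData(),
    // anything else is a compound macro.
    static Kind kindOf(String value) {
        if (value.startsWith("$")) return Kind.CELL_METHOD;
        if (value.startsWith("@")) return Kind.COLUMN_METHOD;
        return Kind.COMPOUND_MACRO;
    }

    // The identifier passed to the method: the value minus its first character.
    static String identifierOf(String value) {
        return (kindOf(value) == Kind.COMPOUND_MACRO) ? value : value.substring(1);
    }

    public static void main(String[] args) {
        String bima = "Date = {year}{month}{day}, Size = @sizes, Location = @locations";
        for (Map.Entry<String, String> e : parse(bima).entrySet()) {
            System.out.println(e.getKey() + ": " + kindOf(e.getValue())
                + " (" + identifierOf(e.getValue()) + ")");
        }
    }
}
```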
In this section, I assume you know how to write Java code. If you don't, then only use compound macros in your additionalDescriptors property (assuming you choose to define it at all) and skip to the next section. You can use DaRT to retrieve more information on data sets in your archive than just the metadata contained in the URL. For example, DaRT can obtain the sizes of BIMA data sets and the locations of these data sets. In order to do this, however, you must create an archive-specific Java class.
Like the archive-specific resource file described above, your Java class must have a specific name. This class must be called Archive_<your archive name>.class (the source file obviously should be named Archive_<your archive name>.java), where <your archive name> is identical to what you used in the simple text resource file name. Like the resource file, it should reside under the directory ncsa/sciarch/archive. This class must follow these guidelines:
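The source file must declare the ncsa.sciarch.archive package:
package ncsa.sciarch.archive;
The class must extend ncsa.sciarch.util.ArchiveBundle. For the BIMA archive, for example, the class declaration is
public class Archive_BIMA extends ncsa.sciarch.util.ArchiveBundle {
You will obviously need the ncsa.sciarch.util.ArchiveBundle.class file in order to compile your class; it is part of lib/classes.jar which comes with the DaRT distribution. If you add this jar file to your CLASSPATH, the javac compiler should be able to find this class.
The class should have a public constructor which takes no arguments, for example
public Archive_BIMA() {}
The methods which DaRT calls to get additional information have the signatures
public String getCellValue(int row, String identifier, String[] macros, String[] values)
and
public String[] getColumnData(int rowCount, String identifier, String[] macros, String[][] values)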
The two methods defined above tell DaRT how to get additional information about your archive files. The getCellValue() method should return the value that DaRT will place in its table located at the row number specified by the row parameter and in the column specified by the identifier. In addition, this method is passed the macros (both simple and compound) as a String array (the macros parameter) that are defined for this archive as well as the substitutions for these macros for the current data set (the values parameter). As mentioned above, DaRT calls this method if a $ is specified as the first character of a value in a key-value pair in the additionalDescriptors property.
Let's look at a simple example. Consider the FOO archive which sets, among others, the following properties in Archive_FOO.properties:
URLformat: http://www.foo.com/{day}/{month}/{year}/{dataset}
additionalDescriptors: Date = {year}/{month}/{day}, Julian Day Number = $julian
The key Julian Day Number in the additionalDescriptors property is a method parameter, since the first character of its value $julian is a $. Because of this, DaRT will call the getCellValue() method in Archive_FOO.class separately on each data set the user requests to determine the Julian day number for that data set. This method, which is defined in Archive_FOO.class, might look something like
// requires: import java.util.Calendar;
//           import java.util.GregorianCalendar;
//           import java.util.TimeZone;
public String getCellValue(int rowNum, String identifier,
                           String[] macros, String[] values) {
    // for the FOO archive, macros contains day, month, year, dataset
    // and Date
    // put the values of the macros in variables with useful names
    String year = "";
    String month = "";
    String day = "";
    String date = "";
    for (int i = 0; i < macros.length; i++) {
        if (macros[i].equals("year"))
            year = values[i];
        else if (macros[i].equals("month"))
            month = values[i];
        else if (macros[i].equals("day"))
            day = values[i];
        else if (macros[i].equals("Date"))
            date = values[i];
    }
    // identifier can be any of the values in the additionalDescriptors
    // property which start with a "$" (with the "$" removed), so use an
    // if block to do the processing for each
    // calculate the Julian Day Number
    if (identifier.equals("julian")) {
        int y = 0;
        int m = 0;
        int d = 0;
        try {
            y = Integer.parseInt(year);
            m = Integer.parseInt(month);
            d = Integer.parseInt(day);
        }
        catch (NumberFormatException exc) {
            DaRTManager.writeErrorMessage(date + " is not a valid date!");
            return "";
        }
        // milliseconds between 1 January 1970 00:00:00 UTC and
        // 00:00:00 UTC on the requested date; note that Calendar
        // months are zero-based
        Calendar gc = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        gc.clear();
        gc.set(y, m - 1, d, 0, 0, 0);
        long msecs = gc.getTime().getTime();
        // 2440588 is the Julian Day Number of 1 January 1970 and
        // there are 86400000 milliseconds in a day
        long julian = 2440588L + msecs / 86400000L;
        return Long.toString(julian);
    }
    // should never get here, but Java requires a return outside of
    // the conditional blocks
    return "";
}
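To make the macros and values parameters more concrete, the Java sketch below derives them from a URL by matching it against a URLformat value. It is an illustration only and not DaRT's own parsing code: UrlMacros is a hypothetical class, it assumes that macros are separated by literal text and that no value contains a slash, and it handles only simple macros (DaRT also passes compound macros such as Date).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of DaRT.
final class UrlMacros {
    private static final Pattern MACRO = Pattern.compile("\\{([^}]+)\\}");

    // Macro names in the order in which they appear in the URLformat value.
    static String[] macroNames(String urlFormat) {
        List<String> names = new ArrayList<>();
        Matcher m = MACRO.matcher(urlFormat);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names.toArray(new String[0]);
    }

    // Values of those macros in a concrete URL, or null if the URL does
    // not fit the format.
    static String[] macroValues(String urlFormat, String url) {
        Matcher m = MACRO.matcher(urlFormat);
        StringBuilder regex = new StringBuilder();
        int last = 0;
        int count = 0;
        while (m.find()) {
            // literal text before the macro, then one capturing group
            regex.append(Pattern.quote(urlFormat.substring(last, m.start())));
            regex.append("([^/]+)");
            last = m.end();
            count++;
        }
        regex.append(Pattern.quote(urlFormat.substring(last)));
        Matcher u = Pattern.compile(regex.toString()).matcher(url);
        if (!u.matches()) {
            return null;
        }
        String[] values = new String[count];
        for (int i = 0; i < count; i++) {
            values[i] = u.group(i + 1);
        }
        return values;
    }

    public static void main(String[] args) {
        String format = "http://www.foo.com/{day}/{month}/{year}/{dataset}";
        System.out.println(Arrays.toString(macroNames(format)));
        // prints [day, month, year, dataset]
        System.out.println(Arrays.toString(
            macroValues(format, "http://www.foo.com/20/9/1984/bio")));
        // prints [20, 9, 1984, bio]
    }
}
```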
If the user requests the following data sets
http://www.foo.com/20/9/1984/bio
http://www.foo.com/11/7/1994/omc1.gz
from the FOO archive, the resulting DaRT GUI would look something like:

If your archive will supply extra information using a CGI script which runs on your server, using getCellValue() is very inefficient because the CGI script will be run separately for each requested data set. A better solution is to use the getColumnData() method to send information on all the data sets to the CGI script simultaneously. In this case the CGI script only needs to be run once. This is the method the BIMA archive uses to retrieve information on the sizes and locations of its data sets. The parameters that getColumnData() takes are similar to those that getCellValue() uses. The differences are that the rowCount parameter holds the number of rows in the DaRT GUI table (which is the same as the number of data sets the user is requesting) and the values parameter is a two dimensional String array which holds the substitutions for the macros from all the URLs requested by the user. The dimensions for this array are values[rowCount][macros.length]. This method returns a one-dimensional String array containing the entries for all the cell values in the relevant column. This array should obviously contain rowCount members.
For example, consider the BAR archive which defines (among others) the following properties in Archive_BAR.properties:
columnNames: "",Project ID,Dataset,Location,Status,Progress
URLformat: http://www.bar.com/{Project ID}/{Dataset}
additionalDescriptors: Location = @location
In order to determine the entries for the Location column, DaRT will call getColumnData(), passing the String "location" as the identifier. This method, defined by the BAR server administrator in the Archive_BAR.java source file, might look something like
// requires: import java.net.MalformedURLException;
//           import java.net.URL;
public String[] getColumnData(int rowCount, String identifier,
                              String[] macros, String[][] values) {
    // put the values of the macros in arrays with useful names
    String[] pid = new String[rowCount];
    String[] dataset = new String[rowCount];
    if (identifier.equals("location")) {
        for (int i = 0; i < rowCount; i++) {
            for (int j = 0; j < macros.length; j++) {
                if (macros[j].equals("Project ID"))
                    pid[i] = values[i][j];
                else if (macros[j].equals("Dataset"))
                    dataset[i] = values[i][j];
            }
        }
        String cgiURL = "http://www.bar.com/cgi-bin/getLocations.pl?";
        for (int i = 0; i < rowCount; i++) {
            cgiURL = cgiURL + "file" + i + "=" + pid[i] + "/"
                + dataset[i];
            // append an ampersand to all but the final key-value pair
            if (i < (rowCount - 1)) {
                cgiURL = cgiURL + "&";
            }
        }
        String cgiOutput = "";
        try {
            URL url = new URL(cgiURL);
            // run the CGI script using the retrieve method
            // in ncsa.sciarch.util.GetURL. The output from
            // this script is put in cgiOutput
            cgiOutput = GetURL.retrieve(url);
        }
        catch (MalformedURLException exc) {
            DaRTManager.writeErrorMessage("Malformed URL " + cgiURL
                + "!");
            return null;
        }
        // parse the cgiOutput string and put the values into a
        // String array so they can be returned to the caller
        // I do not define parse in this example
        String[] columnData = parse(cgiOutput);
        if (columnData.length != rowCount) {
            DaRTManager.writeErrorMessage(
                "Archive_BAR: Something freaky happened");
            return null;
        }
        return columnData;
    }
    // should never get here, but Java requires a return outside
    // of the conditional blocks
    return null;
}
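The parse method is left undefined above. If, purely as an assumption, the hypothetical getLocations.pl script printed one location per line in the same order as the files in the query string, parse could be sketched as follows. CgiOutputParser is a made-up class name, and skipping blank lines is a design choice of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the one-value-per-line output format is an assumption.
final class CgiOutputParser {
    // Split the CGI output into one entry per non-blank line.
    static String[] parse(String cgiOutput) {
        List<String> entries = new ArrayList<>();
        for (String line : cgiOutput.split("\r?\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                entries.add(trimmed);
            }
        }
        return entries.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] locations = parse("tape\ndisk\n");
        System.out.println(locations.length + " locations");
    }
}
```

Because blank lines are dropped, a script that reports an unknown location as an empty line would produce too few entries; the length check against rowCount in getColumnData() would then catch the mismatch.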
In the above method examples, I've used some classes from the ncsa.sciarch package to make life a little easier. If you'd like to use classes and methods from this API in your class, the documentation can be found at http://monet.ncsa.uiuc.edu/ADC/DaRT/doc/packages.html. Some classes are documented better than others; if you would like more information on a class or method than what is contained in the documentation, or if you would like to know if you can do something specific with this package, send me mail with your query.
If you plan on using the Size pre-defined column name as a key in your list of additionalDescriptors, there are a few things of which you need to be aware. You can use this as a method parameter, as described above. The values that are returned should still be a String (if you use getCellValue()) or a String array (if you use getColumnData()), as you would expect. DaRT will then take these values and try to shove them into long primitives using java.lang.Long.parseLong(). If a NumberFormatException is thrown when DaRT tries to do this, it will set the value of the size to -1. DaRT treats the Size column specially because it uses the sizes to calculate the percentage of a file that has been downloaded when it updates the progress bars.
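The conversion described above can be sketched in Java as follows. The parseSize() method mirrors the documented behavior (Long.parseLong() with a fallback of -1); the percentDone() method and the class name SizeColumn are hypothetical, and how DaRT actually draws a progress bar for a file of unknown size is not covered by this manual.

```java
// Hypothetical helper, not part of DaRT.
final class SizeColumn {
    // Convert a Size string to a long; -1 means the size is unknown.
    static long parseSize(String value) {
        try {
            return Long.parseLong(value);
        } catch (NumberFormatException exc) {
            return -1L;
        }
    }

    // Percentage downloaded, or -1 if the size is unknown (assumption).
    static int percentDone(long bytesRead, long size) {
        if (size <= 0) {
            return -1;
        }
        return (int) Math.min(100L, bytesRead * 100L / size);
    }

    public static void main(String[] args) {
        long size = parseSize("2048");
        System.out.println(percentDone(512, size)); // prints 25
        System.out.println(parseSize("2 MB"));      // prints -1
    }
}
```

Since Long.parseLong() accepts only digits with an optional sign, your method should return sizes as plain byte counts such as 2048, without units or surrounding whitespace.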
If you have opted to create a Java class, it should be put in the same jar file as your properties file. Doing
jar cvf <your archive name>.jar ncsa/sciarch/archive/Archive_<your archive name>.*
is an easy way to do this.
In order to have DaRT run from a user's web browser, you must have a CGI script that runs on your archive server and returns a list of URLs (one per line) to the data sets the user has requested. Furthermore, this list must be preceded by a Content-type header giving the MIME type application/x-multiget. The user configures her browser to run DaRT when this MIME type is returned by a CGI application. Below is a Perl code snippet which shows how to do this.
#!/usr/local/bin/perl
# The data sets that the user requests will be supplied as key-value
# pairs to this script. This script should take key-value arguments
# which reference the data sets the user is requesting. The URLs to
# the data sets can be determined by processing these arguments
# in the subroutine getURLs()
use CGI;
# create a CGI object
$query = new CGI;
# print the Content type
# if you can't or don't want to use the CGI package, use
# print "Content-type: application/x-multiget\n\n";
# instead
print $query->header('application/x-multiget');
# @urls is an array of urls (one url per element)
# getURLs does the work of getting the URLs for the data sets which
# the user has requested
@urls = &getURLs();
# print the URLs, one per line
foreach $url (@urls) {
    print "$url\n";
}
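When testing such a script, it can help to confirm that its body really is a list of URLs, one per line. The Java sketch below reads a body of that shape into a list; MultigetList is a hypothetical class used only for illustration and does not show how DaRT itself reads the list.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of DaRT.
final class MultigetList {
    // Read one URL per line, ignoring blank lines.
    static List<String> readUrls(Reader body) throws IOException {
        List<String> urls = new ArrayList<>();
        BufferedReader in = new BufferedReader(body);
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (!line.isEmpty()) {
                urls.add(line);
            }
        }
        return urls;
    }

    public static void main(String[] args) throws IOException {
        String body = "http://www.foo.com/20/9/1984/bio\n"
                    + "http://www.foo.com/11/7/1994/omc1.gz\n";
        for (String url : readUrls(new StringReader(body))) {
            System.out.println(url);
        }
    }
}
```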
That's all the server-side work that you are required to do. If you use method parameters in your additionalDescriptors property which require CGI scripts to get additional information on data sets, you will obviously need to write these scripts as well. See the sections entitled Beyond the minimum and Writing an archive-specific Java class if you are interested in taking advantage of this functionality.
If you need assistance configuring your server and/or creating your client-side plug in, contact us and we will try to help.
Copyright © 1999, 2000 David Mehringer