There are four command line tools: the message loader, the monitor, query and message housekeeping.
The data hub can be loaded from the command line using the MessageLoader component.
This can be run directly from the datahub jar file, using the following syntax.
java -jar datahub-dist.jar [options] filename
The message is read from the file.
Options are:
| --properties propertyFile | Specify the location of the properties file. Defaults to datahub.properties in the current directory. On a Linux system with the standard configuration, this should be --properties /etc/datahub/datahub.properties. |
| --system system | This specifies a pipe-delimited list of valid values for system. If a single value for system is given, it is used as a default. Use * to indicate the default system mapping (where incoming messages conform to the target entities). |
| --entity entity | Similar to --system, a pipe-delimited list of valid values for entity. If a single value for entity is given, it is used as a default. This option is required with the --allData option. |
| --timestamp timestamp | The effective timestamp to use in the message if the message does not have a timestamp, or if the data or file --format option is used. Defaults to the current timestamp. |
| --user user | Similar to --system, a pipe-delimited list of valid values for user. If a single value for user is given, it is used as a default. |
| --options options | Set message options. Options should be a JSON-formatted string. Their use is implementation-dependent. |
| --refresh true | If set to true, indicates that the message contains all the data for the target entity, and that existing target system records should be deleted. If multiple entities are targeted by the source entity, the refresh option only applies to the first. |
| --format format | Indicates how the file should be treated. Set to one of: |
| --process true | If set to true, indicates that the message should be processed. If set to false, the default, the message is loaded but not processed. |
| --reprocess true | If set to true, reprocess an existing message. The GUID of the message is given in place of the file name. Options other than --properties are ignored. For example: --reprocess true 74cd9a05-e9c3-4a12-b1bf-7ab1e53ba9ee |
| --help | Print help text. |
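To illustrate how the loader options combine, the following shell sketch wraps the loader in a function that loads a file for one system and entity and processes it immediately. The function name load_message, the DATAHUB_JAR and DATAHUB_PROPERTIES variables, and the example values (erp, product, products.json) are assumptions made for illustration and are not part of the data hub.

```shell
# Sketch: load one file through the message loader and process it immediately.
# load_message, DATAHUB_JAR and DATAHUB_PROPERTIES are hypothetical names.
load_message() {
    system=$1   # value passed to --system
    entity=$2   # value passed to --entity
    file=$3     # file that contains the message
    java -jar "${DATAHUB_JAR:-datahub-dist.jar}" \
        --properties "${DATAHUB_PROPERTIES:-datahub.properties}" \
        --system "$system" \
        --entity "$entity" \
        --process true \
        "$file"
}

# Example call (values are illustrative):
# load_message erp product products.json
```

Leaving out --process true would load the message without processing it, which leaves it for the monitor to pick up.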
The data hub monitor can be run from the command line. There are
two options: you can start the monitor and let it run
indefinitely, or you can perform a single run of the monitor which
will look for unprocessed messages.
To run the monitor and leave it running, use
java -cp datahub-dist.jar com.metrici.datahub.DataHubMonitor --properties datahub.properties
To perform a single run of the monitor, use
java -cp datahub-dist.jar com.metrici.datahub.DataHubMonitorRun --properties instance/datahub.properties
The monitor reads properties using the --properties parameter, and also has a --help option, like the loader.
Properties for the monitor are described in the web server section. If the multi=true
property is set, the property file should be the base property
file, with the individual instances in separate directories, as
described in the web server topic. The multi=true property only applies
when running the whole monitor. If submitting a single run, use
the property file for the individual instance (as in the example
above).
If you leave the monitor running, it will manage work identically to the monitor run from the web server, but has no controlled shutdown method. You will need to provide some method of controlling the running monitor. It may be more convenient to use something like supervisorctl to schedule single runs periodically than to run the monitor continuously.
The single run will complete all work it identifies (it ignores the monitor.shutdownTimeout). If you want a single run to perform all outstanding work, set monitor.queue to 0.
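If single runs are scheduled periodically, a small wrapper can stop a new run from starting while the previous one is still working. The sketch below is one possible approach and makes several assumptions: the function name run_monitor_once, the DATAHUB_JAR variable and the lock directory are hypothetical, and this document does not say whether overlapping single runs are actually harmful, so the lock is only a precaution.

```shell
# Sketch: perform a single monitor run, skipping it if a previous run is still active.
# run_monitor_once, DATAHUB_JAR and the lock directory are hypothetical names.
run_monitor_once() {
    props=$1                                   # property file for the individual instance
    lock_dir=${2:-/tmp/datahub-monitor.lock}   # mkdir is atomic, so the directory acts as a lock
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "previous monitor run still active, skipping" >&2
        return 0
    fi
    status=0
    java -cp "${DATAHUB_JAR:-datahub-dist.jar}" \
        com.metrici.datahub.DataHubMonitorRun --properties "$props" || status=$?
    rmdir "$lock_dir"                          # release the lock even if the run failed
    return "$status"
}

# Example call from a scheduler (path is illustrative):
# run_monitor_once instance/datahub.properties
```

If the process is killed before it releases the lock, the lock directory has to be removed by hand before the next run will start.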
The query component can be used to extract data from the data
store. See Data hub query for details.
Query can be run from the command line, using the following syntax.
java -cp datahub-dist.jar com.metrici.datahub.Query [options]
Options are:
| --properties propertyFile | Specify the location of the properties file. Defaults to datahub.properties in the current directory. |
| --user user | The user under whose authority the data is to be retrieved. No authentication check is made on the user (command line access is assumed to be authenticated), but a user may be required to navigate through the entity authorization. |
| --query query.json | Identifies the file that contains the query JSON. See Data hub query for syntax. |
| --entity entity | Entity for the query. Will overwrite any entity in the query JSON. |
| --timestamp timestamp | Timestamp for the query. Will overwrite any timestamp in the query JSON. |
| --where.field value | Provide a where object value for field. This will overwrite any where clause for the field in the query JSON. |
| --out file | Write the output to the file. If not provided, the output is written to standard output. |
At a minimum, you must pass either an entity or a query file with an entity property.
Example 1
This shows how to list all data from the product entity.
java -cp /var/datahub/lib/datahub-dist.jar com.metrici.datahub.Query \
--properties /var/datahub/config/acme/datahub.properties \
--entity product
Example 2
This more complex example shows what might be required to
retrieve product data for a particular range, from a configuration
where only the admin user has authority to read the entity.
java -cp /var/datahub/lib/datahub-dist.jar com.metrici.datahub.Query \
--properties /var/datahub/config/acme/datahub.properties \
--user admin \
--query product_query.json \
--where.range_reference ELEC1 \
--out range_ELEC1.json
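The same query file can be reused for several ranges by overriding the where value on each call. The following sketch loops over a list of range references and writes one output file per range. The function name export_ranges, the DATAHUB_JAR variable and the second range ELEC2 are assumptions for illustration. The options themselves are those listed above.

```shell
# Sketch: run the same query once per range and write one output file per range.
# export_ranges and DATAHUB_JAR are hypothetical names.
export_ranges() {
    props=$1; shift          # first argument is the properties file, the rest are ranges
    for range in "$@"; do
        java -cp "${DATAHUB_JAR:-datahub-dist.jar}" com.metrici.datahub.Query \
            --properties "$props" \
            --user admin \
            --query product_query.json \
            --where.range_reference "$range" \
            --out "range_${range}.json" || return 1   # stop at the first failing query
    done
}

# Example call (ELEC2 is an illustrative second range):
# export_ranges /var/datahub/config/acme/datahub.properties ELEC1 ELEC2
```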
Message housekeeping scans the message store and file store and deletes messages and files that have passed their retention period.
Although files are deleted out of the file store by the
housekeeping, a copy of them can be kept in a purge area. This can
then be cleared out by external routines.
| messageRetentionPeriod | For how long messages for this system/entity should be retained, in days. The default value of 0 means "indefinitely". May be fractional. Use a negative value to mean that messages can be discarded as soon as they have been processed. |
| retainFiles | If set to true, files associated with this system/entity should be retained even if the associated messages are deleted. Default is false. |
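The three cases for messageRetentionPeriod (zero, negative and positive) can be summarised in a short sketch. This is only a reading of the description above, not the housekeeping implementation: the helper name is_past_retention is hypothetical, and for positive periods the sketch looks only at the age of the message, which is an assumption.

```shell
# Sketch of the messageRetentionPeriod rules as described above.
# is_past_retention is a hypothetical helper, not part of the data hub.
# Usage: is_past_retention AGE_DAYS RETENTION_DAYS PROCESSED(true|false)
# Returns 0 if the message would be past its retention period, 1 otherwise.
is_past_retention() {
    awk -v age="$1" -v period="$2" -v processed="$3" 'BEGIN {
        if (period == 0) exit 1                             # 0 means retain indefinitely
        if (period < 0)  exit (processed == "true" ? 0 : 1) # negative: discard once processed
        exit (age > period ? 0 : 1)                         # positive: retain for that many days (may be fractional)
    }'
}

# Example: a processed message that is 0.6 days old, with a retention period of 0.5 days
# is_past_retention 0.6 0.5 true && echo "eligible for deletion"
```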
To run the housekeeping, use:
java -cp datahub-dist.jar com.metrici.datahub.MessageHousekeeping [options]
Options are:
| --properties propertyFile | Specify the location of the properties file. Defaults to datahub.properties in the current directory. |
| --check true | Show what would be deleted, but do not perform the delete. |
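A cautious way to use these options is to preview the deletions with --check true before running housekeeping for real. The sketch below shows the two calls side by side. The function names housekeeping_preview and housekeeping_run and the DATAHUB_JAR variable are assumptions for illustration.

```shell
# Sketch: preview housekeeping first, then run it for real.
# housekeeping_preview, housekeeping_run and DATAHUB_JAR are hypothetical names.
housekeeping_preview() {
    # Shows what would be deleted without deleting anything.
    java -cp "${DATAHUB_JAR:-datahub-dist.jar}" com.metrici.datahub.MessageHousekeeping \
        --properties "$1" --check true
}

housekeeping_run() {
    # Performs the deletion.
    java -cp "${DATAHUB_JAR:-datahub-dist.jar}" com.metrici.datahub.MessageHousekeeping \
        --properties "$1"
}

# Example (path is illustrative):
# housekeeping_preview /etc/datahub/datahub.properties
# housekeeping_run /etc/datahub/datahub.properties
```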
Housekeeping reads the message store, the message process table and the schema definitions. A message and associated process and history records are deleted when:
Messages without an associated message process record are deleted
when the message timestamp is older than the number of days
identified by the invalid message retention period set in
datahub.properties. These orphaned messages indicate errors, but
are retained for a short period for diagnostic purposes.
The file store is also scanned. A file is deleted when:
The schema properties for the file's system/entity do not indicate that files should be retained.
Messages and files may have invalid systems or invalid entities. These are processed using an invalid message retention period, and with retain files set to false.
In addition to the standard messageStore, messageControl and fileStore properties, housekeeping uses the following properties, which can be preceded by "housekeeping.".
| multi | Specifies that this is a multi-instance configuration. This runs housekeeping for each of the instances. |
| invalidMessageRetentionPeriod | For how many days messages without process entries or with unrecognised systems or entities should be retained. May be fractional. A value of 0 indicates that invalid messages should be retained indefinitely. For semantic consistency, the default is 0. Production systems should set this to a value that reflects how long the organisation would take to detect and diagnose errors, for example to 10. Setting this to below 0 or close to 0 is permitted. However, this could potentially impact messages as they are written and would not allow for diagnosis after errors. A value of at least 1 is recommended. |
| fileStorePurge | A directory into which files will be moved, rather than deleted. If not set, files are deleted and not moved to a purge area. |
Set the verbosity property to 1 or more to list messages and
files that are deleted, or 2 or more for more diagnostics.
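As an illustration, the following sketch appends a set of housekeeping settings to a properties file. It assumes the file uses the usual key=value form with # comments, as suggested by multi=true elsewhere in this document. The function name write_housekeeping_properties and the purge directory are hypothetical. The value 10 is the example given in the table above.

```shell
# Sketch: append example housekeeping settings to a properties file.
# write_housekeeping_properties and /var/datahub/purge are hypothetical.
write_housekeeping_properties() {
    cat >> "$1" <<'EOF'
# Keep invalid messages for 10 days to allow diagnosis
housekeeping.invalidMessageRetentionPeriod=10
# Move files here instead of deleting them (illustrative path)
housekeeping.fileStorePurge=/var/datahub/purge
# List the messages and files that are deleted
verbosity=1
EOF
}

# Example call (path is illustrative):
# write_housekeeping_properties /etc/datahub/datahub.properties
```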
When running in multi mode, the invalidMessageRetentionPeriod on the top-level properties is used as a default for the value on instance properties.
When running in multi mode, the fileStorePurge area on the
top-level properties is used to create a default for the value on
instance properties, by adding the instance reference as a suffix.