ICAT Java Client - ICAT Java Client User Manual

Introduction

The ICAT4 API is a layer on top of a relational DBMS. The database is wrapped as a web service so that the tables are not exposed directly. Each table in the database is mapped onto a data structure exposed by the web service. When the web service interface definition (WSDL) is processed for Java then each data structure results in a class definition.

Please also consult the javadoc.

Installation and accessing from maven is explained in Java Client Installation.

Setting Up

The web service is accessed via a proxy (conventionally known as a port). The proxy (here given a variable name of icat) may be obtained by the following:

    import java.net.URL;
    import javax.xml.namespace.QName;

    URL hostUrl = new URL("https://<hostname>:8181");
    URL icatUrl = new URL(hostUrl, "ICATService/ICAT?wsdl");
    QName qName = new QName("http://icatproject.org", "ICATService");
    ICATService service = new ICATService(icatUrl, qName);
    ICAT icat = service.getICATPort();

where <hostname> should be the full name of the ICAT server. For a secure installation, just specifying localhost will not work: the name must match what is on the host certificate.
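
As a minimal sketch of how the WSDL address above is put together, the helper below assembles it from a host name using only JDK classes. The class IcatWsdlUrl and the method wsdlUrl are illustrative names, not part of the ICAT client; the port and the path ICATService/ICAT?wsdl are taken from the example above.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical helper, not part of the ICAT client.
final class IcatWsdlUrl {

    /** Builds https://<hostname>:<port>/ICATService/ICAT?wsdl for the given host. */
    static URL wsdlUrl(String hostname, int port) {
        if (hostname == null || hostname.isEmpty()) {
            throw new IllegalArgumentException("hostname must not be empty");
        }
        try {
            // The host name should be the one that appears on the server certificate.
            return new URI("https", null, hostname, port, "/ICATService/ICAT", "wsdl", null).toURL();
        } catch (URISyntaxException | MalformedURLException e) {
            throw new IllegalArgumentException("Cannot build WSDL URL for host " + hostname, e);
        }
    }
}
```

The resulting URL can be passed to the ICATService constructor in place of icatUrl.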

Session management

When you log in to ICAT you will be given back a string, the sessionId, which must be used as the first argument of almost all ICAT calls. The only exceptions are the login call itself, getEntityInfo and getApiVersion, none of which require authentication.

String login(String plugin, Credentials credentials)

where the plugin is the mnemonic defined in the ICAT installation for the authentication plugin you wish to use and credentials is essentially a map. The names of the keys and their meanings are defined by the plugin.

The sessionId returned will be valid for a period determined by the ICAT server.

The example below shows how it works for the authn_db plugin at the time of writing, where the plugin has been given the mnemonic "db".

    Credentials credentials = new Credentials();
    List<Entry> entries = credentials.getEntry();
    Entry e;

    e = new Entry();
    e.setKey("username");
    e.setValue("root");
    entries.add(e);

    e = new Entry();
    e.setKey("password");
    e.setValue("secret");
    entries.add(e);

    String sessionId = icat.login("db", credentials);

double getRemainingMinutes(String sessionId)

This returns the number of minutes left in the session. A user may have more than one session at once.

String getUserName(String sessionId)

This returns the string identifying the user of the session as provided by the authentication plugin.

void refresh(String sessionId)

This resets the time-to-live of the session to the value it had when the session was first obtained.

void logout(String sessionId)

This invalidates the sessionId.
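
A long-running client may want to call refresh only when a session is close to expiry. The following is a minimal sketch of that decision, not part of the ICAT client: SessionKeepAlive is a hypothetical class, and the two functional arguments stand in for icat.getRemainingMinutes(sessionId) and icat.refresh(sessionId).

```java
import java.util.function.DoubleSupplier;

// Hypothetical helper: decides when a session should be refreshed.
final class SessionKeepAlive {
    private final double thresholdMinutes;

    SessionKeepAlive(double thresholdMinutes) {
        if (thresholdMinutes <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.thresholdMinutes = thresholdMinutes;
    }

    /**
     * Runs refreshCall when fewer than thresholdMinutes remain.
     * remainingMinutes stands in for icat.getRemainingMinutes(sessionId),
     * refreshCall for icat.refresh(sessionId).
     * Returns true if a refresh was performed.
     */
    boolean refreshIfNeeded(DoubleSupplier remainingMinutes, Runnable refreshCall) {
        if (remainingMinutes.getAsDouble() < thresholdMinutes) {
            refreshCall.run();
            return true;
        }
        return false;
    }
}
```

Because the real ICAT calls throw a checked IcatException_Exception, the lambdas passed in would need to catch or wrap it.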

Exceptions

There is only one exception thrown by ICAT. This is the IcatException_Exception which is a wrapper around the real exception which in turn includes an enumerated code to identify the kind of exception and the usual message. The codes and their meanings are:

BAD_PARAMETER: generally indicates a problem with the arguments made to a call.
INTERNAL: may be caused by network problems, database problems, glassfish problems or bugs in ICAT.
INSUFFICIENT_PRIVILEGES: indicates that the authorization rules have not matched your request.
NO_SUCH_OBJECT_FOUND: is thrown when something is not found.
OBJECT_ALREADY_EXISTS: is thrown when you try to create something but there is already one with the same values of the constraint fields.
SESSION: is used when the sessionId you have passed into a call is not valid or if you are unable to authenticate.
VALIDATION: marks an exception which was thrown instead of placing the database in an invalid state.

For example to print what has happened you might use the following:

    String sessionId;
    try {
        sessionId = icat.login("db", credentials);
    } catch (IcatException_Exception e) {
        IcatException ue = e.getFaultInfo();
        System.out.println("IcatException " + ue.getType() + " " + ue.getMessage()
                + (ue.getOffset() >= 0 ? " at offset " + ue.getOffset() : ""));
    }

Operations which work on a list of objects, such as createMany, may fail because of failure to process one of the objects. In this case the state of the database will be rolled back and the offset within the list of the entry causing the error will be stored in the IcatException. For other calls the offset will be negative, as it is with certain internal exceptions which are not associated with any specific object in a list.
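
The offset convention described above can be sketched with plain Java types. IcatErrorReport and its Code enum are hypothetical stand-ins that mirror the codes listed earlier; in real client code the values would come from the IcatException returned by getFaultInfo().

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-ins; the real values come from IcatException.
final class IcatErrorReport {

    enum Code {
        BAD_PARAMETER, INTERNAL, INSUFFICIENT_PRIVILEGES, NO_SUCH_OBJECT_FOUND,
        OBJECT_ALREADY_EXISTS, SESSION, VALIDATION
    }

    /** Formats an error; the offset is mentioned only when it is not negative. */
    static String describe(Code code, String message, int offset) {
        String text = "IcatException " + code + " " + message;
        return offset >= 0 ? text + " at offset " + offset : text;
    }

    /** Returns the list entry that caused a failure, if the offset identifies one. */
    static <T> Optional<T> failedEntry(List<T> beans, int offset) {
        if (offset < 0 || offset >= beans.size()) {
            return Optional.empty();
        }
        return Optional.ofNullable(beans.get(offset));
    }
}
```

After a failed createMany call, failedEntry applied to the list that was passed in and the reported offset gives the object that needs attention.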

Data Manipulation

The schema

To understand exactly how the data manipulation calls work requires an understanding of the schema. Please take a look now to make sense of the following explanation.

Each table in the database, representing a set of entities, is mapped onto a class in the API so terminology mixes OO and database concepts. Each class has uniqueness constraints, relationships and other fields. Each object is identified by a field "id" which is managed by ICAT and is returned when you create an object. This is common to all objects and is not described in the schema. The "id" field is used as the primary key in the database. There will normally be some combinations of fields, some of which may be relationships, which must be unique across all entries in the table. This is marked as "Uniqueness constraint". For Dataset this is investigation, name where investigation represents a relationship. No more than one Dataset may exist with those two fields having the same value. These constraints are enforced by ICAT.

The relationship table is shown next. The first column shows the minimum and maximum cardinality of the relationships. A Dataset may be related to any number of OutputDatasets, to at most one investigation and to exactly one DatasetType. The next column shows the name of the related class and this is followed by the name of the field which is used to represent the relationship. The basic field name is normally the name of the related class unless it is ambiguous or unnecessarily long. The field name is in the plural for "to many" relationships. The next column, "cascaded", is marked yes to show that create and delete operations are cascaded. If a Dataset is deleted then all its DataCollectionDatasets, DatasetParameters and Datafiles are deleted at the same time by one call to ICAT. In a similar manner a tree, created in memory with a Dataset having a set of Datafiles and DatasetParameters, can be persisted to ICAT in a single call. This will be explained in more detail later.

Note that all "one to many" relationships are cascaded but no "many to one" relationships.

Note also that all relationships are navigable in both directions.

Creating an Object

long create(String sessionId, EntityBaseBean bean)

To create an object in ICAT, first instantiate the object of interest, for example a Dataset, and then call the setters to set its attributes and finally make a call to create the object in ICAT.

So typical code in Java might look like:

    Dataset ds = new Dataset();
    ds.setName("Name of dataset");
    ds.set ...
    Long dsid = icat.create(sessionId, ds);

You will see that no convenient constructors are generated, rather each field of the object must be set individually. Most fields are optional and may be left with null values, however some are compulsory and the call to create will fail if they are not set. Each object has a primary key that identifies it in the database - this is a value of type "long" that is generated by ICAT and is used to represent relationships in a regular manner.

Some fields represent attributes of the object but others are used to represent relationships. The relationships are represented in the class definitions by a variable which either holds a reference to a single object or a list of objects. In the case of a list it may be "cascaded". Consider creating a Dataset with a set of Datafiles. Because the relationship from Dataset to Datafile is cascaded they may be created in one call as outlined below:

    Dataset ds = new Dataset();
    ds.setName(dsName);
    ds.setType(type);
    Datafile datafile = new Datafile();
    datafile.setDatafileFormat(format);
    datafile.setName(dfName);
    ds.getDatafiles().add(datafile); // Add the datafile to the dataset
    icat.create(sessionId, ds);

The call to create returns the key of the created object. If you choose to write:

ds.setId(icat.create(sessionId, ds));

then the client copy of the Dataset will be updated to have the correct key value - however the keys in any other objects "within" the Dataset will still be null on the client side. In this case datafile.getId() will remain null.

When creating multiple objects in one call, the value of the cascaded flag must be noted. The line ds.getDatafiles().add(datafile) requires that the datafile is not already known to ICAT because the cascade flag is set. If the cascaded flag is set then objects to be included in the "create" operation must not exist. However if the cascaded flag is not set then objects which are being referenced must already exist in ICAT.
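
The two rules above can be checked on the client before calling create. The sketch below is only an illustration under one assumption: that a null id on the client side means the object has not yet been created in ICAT. CreateCheck is a hypothetical class, and the lists of ids stand in for the ids of the cascaded children and of the referenced objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-flight check; assumes a null id means "not yet in ICAT".
final class CreateCheck {

    /** Returns a list of problems; an empty list means both rules are satisfied. */
    static List<String> problems(List<Long> cascadedChildIds, List<Long> referencedIds) {
        List<String> problems = new ArrayList<>();
        for (int i = 0; i < cascadedChildIds.size(); i++) {
            // Cascaded objects are created by the call, so they must not exist yet.
            if (cascadedChildIds.get(i) != null) {
                problems.add("cascaded child " + i + " already has id " + cascadedChildIds.get(i));
            }
        }
        for (int i = 0; i < referencedIds.size(); i++) {
            // Non-cascaded references must point at objects that already exist.
            if (referencedIds.get(i) == null) {
                problems.add("referenced object " + i + " has no id");
            }
        }
        return problems;
    }
}
```

ICAT remains the authority: a client-side id can be stale or unset, so this check can only catch obvious mistakes early.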

The following example adds a datafile to an existing dataset, ds:

    Datafile datafile = new Datafile();
    datafile.setDatafileFormat(format);
    datafile.setName(name);
    datafile.setDataset(ds); // Relate the datafile to an existing dataset
    datafile.setId(icat.create(sessionId, datafile)); // Create datafile and store id on client side

List<Long> createMany(String sessionId, List<EntityBaseBean> beans)

This call, as its name suggests, creates many objects. It takes the list of objects to create and returns a list of ids. If any of the individual operations fail the whole call fails and the database will be unchanged. The objects to be created need not be of the same type. For an example (where they are of the same type) consider adding many Datafiles to an existing Dataset, ds:

    // The list is declared with the element type that createMany expects
    List<EntityBaseBean> dfs = new ArrayList<EntityBaseBean>();
    for (int i = 0; i < n; i++) {
        final Datafile datafile = new Datafile();
        datafile.setDatafileFormat(dfmt);
        datafile.setName("bill" + i);
        datafile.setDataset(ds);
        dfs.add(datafile);
    }
    icat.createMany(sessionId, dfs); // many datafiles are stored in one call

Retrieving an object when you know its id

There are two alternative syntaxes for the query: the concise syntax and the JPQL inspired syntax.

Concise syntax

EntityBaseBean get(String sessionId, String query, long id)

If dsid is the id of a Dataset then it may be retrieved by the call:

Dataset ds = (Dataset) icat.get(sessionId, "Dataset", dsid);

The second parameter is a string holding the name of the type of object to retrieve and some other optional information. By default only the requested object is returned and no related objects. If you want the Dataset along with its related Datafiles, DatasetParameters and DatafileParameters then replace: "Dataset" with "Dataset INCLUDE Datafile,DatasetParameter,DatafileParameter"

The related types must all be related to the original type or to some other type in the list. This means that you could not have "Dataset INCLUDE DatafileParameter". There must be only one route from the original type to each of the included types.
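
The connectivity rule can be illustrated with a small graph search. This is a sketch only: IncludeCheck is a hypothetical class, the three relationships it knows about are a hand-written fragment of the schema, and it does not check the "only one route" condition.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical check of a concise INCLUDE list against an assumed schema fragment.
final class IncludeCheck {
    private static final Map<String, Set<String>> RELATED = new HashMap<>();

    static {
        relate("Dataset", "Datafile");
        relate("Dataset", "DatasetParameter");
        relate("Datafile", "DatafileParameter");
    }

    private static void relate(String a, String b) {
        RELATED.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        RELATED.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    /** True if every included type can be reached from root through included types only. */
    static boolean isConnected(String root, List<String> included) {
        Set<String> allowed = new HashSet<>(included);
        Set<String> reached = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.add(root);
        while (!todo.isEmpty()) {
            String current = todo.remove();
            for (String next : RELATED.getOrDefault(current, Collections.emptySet())) {
                if (allowed.contains(next) && reached.add(next)) {
                    todo.add(next);
                }
            }
        }
        return reached.containsAll(allowed);
    }
}
```

With this fragment, including Datafile, DatasetParameter and DatafileParameter from Dataset is accepted, while including DatafileParameter alone is rejected because Datafile is missing from the chain.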

JPQL inspired syntax

This syntax is only relevant if you have an INCLUDE clause as it allows more flexibility in specifying exactly what you want to get back. The previous query to get a Dataset with all its Datafiles, DatasetParameters and DatafileParameters becomes:

Dataset ds INCLUDE ds.datafiles.parameters, ds.parameters

Note that the variable "ds" has been introduced and is then used in the INCLUDE clause. This form of the INCLUDE clause is explained in more detail later.

Updating an Object

void update(String sessionId, EntityBaseBean bean)

To update an object simply update the fields you want to change and call update. For example:

    Dataset ds = (Dataset) icat.get(sessionId, "Dataset INCLUDE 1", dsid);
    ds.setInvestigation(anotherInvestigation);
    icat.update(sessionId, ds);

As suggested by the example above "many to one" relationships, such as the investigation relationship to the dataset, will be updated as will any simple field values. Consequently it is essential to get the existing values for any "many to one" relationships. This is most reliably achieved by the notation INCLUDE 1 as shown here. The effect of the "1" is to include all "many to one" related types. "One to many" relationships are ignored by the update mechanism so you need to start at the correct end of the relationship to have the desired effect.

Deleting an Object

void delete(String sessionId, EntityBaseBean bean)

The following code will get a dataset and delete it.

    Dataset ds = (Dataset) icat.get(sessionId, "Dataset", dsid);
    icat.delete(sessionId, ds);

All cascaded "one to many" related objects will also be deleted. In the extreme case, if you delete a facility, you lose everything associated with that facility. This privilege should not be given to many - see the authorization section later.

Searching for an Object

A single search call supports two alternative query syntaxes: concise and JPQL. The concise syntax is convenient if it does what you want; otherwise you have the full power of JPQL queries to use. In addition there is a free text search mechanism.

Concise syntax

List<Object> search(String sessionId, String query)

The concise syntax will be introduced by means of examples:

List<Object> results = icat.search(sessionId, "Dataset");

will return all Datasets. If the query is:

"Dataset.name"

this will return all Dataset names. Multiple datasets with the same name are permitted and this call will include duplicates. Instead

"DISTINCT Dataset.name"

will avoid duplicates. To have related objects returned, the same INCLUDE syntax that was described for the get call may be used with exactly the same restrictions and semantics:

"Dataset INCLUDE Datafile,DatasetParameter,DatafileParameter"

You can specify an order (which may precede or follow an INCLUDE clause):

"Dataset.id ORDER BY id"

Restrictions can be placed on the data returned. For example:

"Dataset.id [type.name IN ('GS', 'GQ')]"

which could also be written:

"Dataset.id [type.name = 'GS' OR type.name = 'GQ']"

The restriction in the square brackets can be as complex as required - but must only refer to attributes of the object being restricted - in this case the Dataset. Expressions may use parentheses, AND, OR, <, <=, >, >=, =, <>, !=, NOT, IN, LIKE and BETWEEN. Currently the BETWEEN operator does not work on strings. This appears to be a JPA bug.

The functions MAX, MIN, COUNT, AVG and SUM may also be used, for example:

"MAX (Dataset.id)"

Selection may involve more than one related object. To show the relationship a "<->" token is used. For example:

"Dataset.id <-> DatasetParameter[type.name = 'TIMESTAMP']"

Note also here the use of the JPQL style path: type.name. This expression means ids of Datasets which have a DatasetParameter which has a type with a name of TIMESTAMP. Multiple "<->" may appear but all the objects involved, including the first one, must be connectable in only one way.

It is also possible to restrict the number of results returned by specifying a pair of numbers at the beginning of the query string. This construct would normally be used with an ORDER BY clause. The first number is the offset from within the full list of available results from which to start returning data and the second is the maximum number of results to return. These numbers, if specified, must be positive. If the offset is greater than or equal to the number of internal results then no data will be returned. The default values are 0 and "infinity". The numbers must be separated by a comma though either may be omitted. The following are all valid. The last example is rather pointless and does the same as the first. A number without a comma is illegal.

    " Dataset.id ORDER BY id"
    "3,5 Dataset.id ORDER BY id"
    "3, Dataset.id ORDER BY id"
    " ,5 Dataset.id ORDER BY id"
    " , Dataset.id ORDER BY id"

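To make the rules for the leading pair of numbers in a concise query concrete, here is a small sketch that splits such a query string into offset, maximum count and remaining query. It is a client-side illustration of the rules stated above, not the parser used by ICAT; ConciseLimit and NO_LIMIT are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the "offset,count" prefix of a concise query.
final class ConciseLimit {
    static final long NO_LIMIT = Long.MAX_VALUE; // stands in for "infinity"

    private static final Pattern WITH_COMMA =
            Pattern.compile("^\\s*(\\d*)\\s*,\\s*(\\d*)\\s+(.*)$");
    private static final Pattern BARE_NUMBER = Pattern.compile("^\\s*\\d+\\s.*$");

    final long offset;
    final long maxResults;
    final String query;

    private ConciseLimit(long offset, long maxResults, String query) {
        this.offset = offset;
        this.maxResults = maxResults;
        this.query = query;
    }

    static ConciseLimit parse(String text) {
        Matcher m = WITH_COMMA.matcher(text);
        if (m.matches()) {
            // Either number may be omitted; the defaults are 0 and "infinity".
            long offset = m.group(1).isEmpty() ? 0 : Long.parseLong(m.group(1));
            long max = m.group(2).isEmpty() ? NO_LIMIT : Long.parseLong(m.group(2));
            return new ConciseLimit(offset, max, m.group(3).trim());
        }
        if (BARE_NUMBER.matcher(text).matches()) {
            throw new IllegalArgumentException("A number without a comma is illegal: " + text);
        }
        return new ConciseLimit(0, NO_LIMIT, text.trim());
    }
}
```

For "3,5 Dataset.id ORDER BY id" this yields an offset of 3 and a maximum of 5; for " , Dataset.id ORDER BY id" it yields the same defaults as a query with no prefix.
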
JPQL syntax

This is simply JPQL including the SELECT keyword. For example the concise query "Dataset" in the call List<Object> results = icat.search(sessionId, "Dataset") can be replaced by:

SELECT ds FROM Dataset ds

and the concise query Dataset.id <-> DatasetParameter[type.name = 'TIMESTAMP'] becomes:

SELECT ds.id FROM Dataset ds JOIN ds.parameters p WHERE p.type.name = 'TIMESTAMP'

The only restriction is that the returned item must be a set of entities, the result of an aggregate function (such as COUNT or MAX) or a set of values of one field of an entity type. Currently nested selects are not supported - but when they are supported please define new variables for use within such a construct.

There are also two extensions to JPQL: a LIMIT clause and an INCLUDE clause, which may come in either order after the standard JPQL. The LIMIT clause follows MySQL syntax and takes the form: LIMIT 10, 100 which will skip 10 results and return the next 100. A LIMIT clause will normally be used with an ORDER BY clause.

JPQL inspired INCLUDE syntax
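
The LIMIT clause lends itself to paging through a large result set. The helper below is a minimal sketch that appends the clause for a given page; JpqlPaging and page are hypothetical names, and the query is assumed to end with an ORDER BY clause so that pages are stable.

```java
// Hypothetical helper that appends an ICAT LIMIT clause for one page of results.
final class JpqlPaging {

    /** pageIndex starts at 0; the result skips pageIndex * pageSize entries. */
    static String page(String jpql, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        long offset = (long) pageIndex * pageSize;
        return jpql + " LIMIT " + offset + ", " + pageSize;
    }
}
```

For example, page("SELECT ds.id FROM Dataset ds ORDER BY ds.id", 2, 100) produces a query ending in LIMIT 200, 100, which can be passed to icat.search.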

An example of an INCLUDE clause is:

SELECT ds FROM Dataset ds INCLUDE ds.datafiles.parameters, ds.parameters

This uses the variable "ds" defined in the FROM clause. It means that the "Dataset" field "datafiles" will be followed to include all those "Datafiles" and that for each "Datafile" the "parameters" field will be followed to get the "DatafileParameters". In addition the "DatasetParameters" will be included. Those entities which the user is not allowed to read are silently ignored. The above INCLUDE clause could also be written:

INCLUDE ds.datafiles AS dsp, dsp.parameters, ds.parameters

This introduces a new variable dsp which is used later in the clause. The keyword "AS" is optional.

The variables defined outside the INCLUDE clause are not available inside the INCLUDE clause except for the variable identifying the main object being returned. In this case the pre-defined variable is "ds".

It is permissible to visit an entity type more than once in an INCLUDE - for example following a provenance chain.

Free text syntax

List<Object> searchText (String sessionId, String query, int maxCount, String entityName)

This treats each ICAT entry as a document. The contents of that document are formed by concatenating all the non-blank text fields (with a space between them). These documents are then indexed by Lucene. Each create, update or delete call updates the set of available "documents". The indices are updated periodically, so new entries will not be immediately visible. The freshness of the data is determined by the ICAT configuration and may be adjusted. As with the other search call you will only see the data you are allowed to see by the authorization rules. Please see the following examples.

List<Object> results = icat.searchText(sessionId, "king", 50, null);

This obtains the 50 "best" documents with the word "king" in them.

List<Object> results = icat.searchText(sessionId, "king queen", 50, null);

returns documents with "king" or "queen" in them.

List<Object> results = icat.searchText(sessionId, "king AND (queen OR harp)", 50, null);

returns documents with "king" and either "queen" or "harp".

Case is ignored and there is no need to put in words with different endings but the same stem, as common suffixes are removed both when storing the document indices and when looking them up. Wild cards may be used: ? for a single character and * for zero or more. The last argument may be the simple name of an entity as shown below.

List<Object> results = icat.searchText(sessionId, "king", 50, "Investigation");

This restricts the search to the "Investigation" set of entities.

Authorization

The mechanism is rule based. Rules allow groupings of users to do things. There are four things that can be done: Create, Read, Update and Delete. The mechanism makes use of five tables: Rule, User, Grouping, UserGroup and PublicStep. The name "Grouping" has been introduced as "Group" is a reserved word in JPQL. The authentication mechanism authenticates a person with a certain name and this name identifies the User in the ICAT User table. Groupings have names and the UserGroup performs the function of a "many to many" relationship between Users and Groupings. Rules are applied to Groupings. There are special "root users" able to manipulate these five tables, but only these five unless a "root user" creates rules to give themselves further powers. Apart from the special role of "root users" these tables behave as other ICAT tables do. The set of "root users" is a configuration parameter of the ICAT installation.

Rules

By default access is denied to all objects; rules allow access. It is only necessary to be permitted by one rule, and rules are only applied to the object referenced directly in the API call. The Rule table has two exposed fields: crudFlags and what. The field crudFlags contains letters from the set "CRUD" to indicate which types of operation are being allowed (Create, Read, Update and/or Delete). The other field, what, is the rule itself. There is also a "many to one" relationship to Group which may be absent.

Consider:

    Rule rule = new Rule();
    rule.setGroup(userOffice);
    rule.setCrudFlags("CRUD");
    rule.setWhat("Investigation");
    icat.create(sessionId, rule);

allows members of the userOffice group full access to all Investigations. Note that the id field of the rule on the client side is not set on the assumption that the client side copy of the rule will not be needed further.

    Rule rule = new Rule();
    rule.setGroup(null); // Not necessary as it will be null on a newly created rule
    rule.setCrudFlags("R");
    rule.setWhat("ParameterType");
    icat.create(sessionId, rule);

allows any authenticated user (with a sessionId) to read ParameterTypes. Consider a group of users: fredReaders. To allow fredReaders to read a datafile with a name of "fred" we could have:

    Rule rule = new Rule();
    rule.setGroup(fredReaders);
    rule.setCrudFlags("R");
    rule.setWhat("Datafile [name='fred']");
    icat.create(sessionId, rule);

More complex restrictions can be added using other related objects. For example to allow read access to Datasets belonging to an Investigation which includes an InvestigationUser which has a user with a name matching the currently authenticated user (from the sessionId) we can have:

    Rule rule = new Rule();
    rule.setGroup(null);
    rule.setCrudFlags("R");
    rule.setWhat("Dataset <-> Investigation <-> InvestigationUser <-> User[name = :user]");
    icat.create(sessionId, rule);

where the :user denotes the currently authenticated user (derived from the sessionId). You will note that the syntax is very similar to that used by the search except that INCLUDE, LIMIT and ORDER BY clauses may not be used. The syntax of the "What" may be either the old concise syntax or the new JPQL syntax.

There is currently an important restriction to avoid a problem which has occurred in testing: with the JPQL syntax only one dot may appear for terms in the WHERE clause and for the old syntax no dots are allowed in the condition in square brackets. You will get an error message if you forget.

Rules which allow everyone to read a table are good for performance. For example:

    Rule rule = new Rule();
    rule.setGroup(null);
    rule.setCrudFlags("R");
    rule.setWhat("DatasetType");
    icat.create(sessionId, rule);

Such rules are also cached in memory.
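
The default-deny behaviour described at the start of this section can be pictured with a deliberately simplified model. In the sketch below, RuleSketch and its nested Rule class are hypothetical stand-ins for the ICAT entities; the "what" field is treated as a bare entity name, so restrictions in square brackets and "<->" chains are not evaluated.

```java
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of rule matching; not ICAT's implementation.
final class RuleSketch {

    static final class Rule {
        final String grouping;  // null means the rule applies to every authenticated user
        final String crudFlags; // letters from the set "CRUD"
        final String what;      // entity name only in this sketch

        Rule(String grouping, String crudFlags, String what) {
            this.grouping = grouping;
            this.crudFlags = crudFlags;
            this.what = what;
        }
    }

    /** Access is denied unless at least one rule permits the operation. */
    static boolean isAllowed(List<Rule> rules, Set<String> userGroupings,
                             String entity, char operation) {
        for (Rule rule : rules) {
            boolean groupMatches = rule.grouping == null || userGroupings.contains(rule.grouping);
            boolean operationMatches = rule.crudFlags.indexOf(operation) >= 0;
            if (groupMatches && operationMatches && rule.what.equals(entity)) {
                return true;
            }
        }
        return false;
    }
}
```

With the first two rules from this section, a member of userOffice may delete an Investigation, any user may read a ParameterType, and every other request is refused because no rule matches.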

PublicStep

This table has two columns (origin and field). An entry in this table affects the way in which INCLUDE authorization is carried out. Each entry permits all users to make a step from the origin entity by the specified relationship field without any further checking. This information is held in memory for speed.

Checking accessibility

boolean isAccessAllowed(String sessionId, EntityBaseBean bean, AccessType accessType)

This call returns true if the access to the bean specified by the accessType is permitted. For example:

    Dataset ds = new Dataset();
    ds.setName("Name of dataset");
    ds.set ...
    System.out.println(icat.isAccessAllowed(sessionId, ds, AccessType.CREATE));

This code sets up a Dataset and then prints whether or not it would be allowed to create it.

This call is expected to be made from GUIs so that they can avoid offering operations that will fail. As such, though READ access may be queried it is unlikely to be useful as the GUI user will not have found out about the object to be checked. If READ, DELETE or UPDATE access is queried for an object that does not exist it will return false.

In the case of CREATE, the entity is created within a database transaction, the check is made and the transaction is rolled back. Do not populate any one-to-many collections for the entity being tested - this will cause an exception to be raised. Also note that if a create operation would result in a duplicate this may cause an exception to be thrown but this cannot be relied upon.

Logging

Logging to a table with the entity name Log (and/or a file in the logs directory) may be enabled in the icat.properties file of the ICAT server. Records of the type requested in the icat.properties file are added to this table for each eligible call. The information in the table may be regarded as sensitive so appropriate authorization rules should be created.

Notifications

ICAT is able to send JMS messages for create, update and delete of selected entities. This is controlled by the icat.properties file of the ICAT server which can specify a list of the entity types to consider, and for each type which action to generate a message for. The JMS message is always PubSub rather than point to point. This means that there can be multiple listeners for a message. Any receiver must be set up to receive messages with a topic of "jms/ICAT/Topic". The messages all have properties of "entity" which is the type of the entity such as Dataset and "operation" which is one of the letters: C, U and D (for create, update and delete respectively). The body of the message is an object which holds the entity id as a Long. For the "xxxMany" calls multiple notifications will be generated. A receiver, typically implemented as an MDB (Message Driven Bean), should filter the messages it processes by using the properties of the message.

It should be noted that by the time you can react to a deletion notification the entity id you are sent will refer to an object which no longer exists.

This mechanism does not leak information because all the user receives is an entity id. To read the entity with that id the user must have read access to that entity instance.

There is an example MDB available which should make this easier to understand.

Information

String getApiVersion()

returns the version of the API - this should match the version of the client as it is held in Maven for a released component. In the case of a release candidate such as 4.2.0-rc03 the version returned will still be 4.2.0.
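
A client can use this to confirm at start-up that it is talking to a matching server. The sketch below compares a client version with the string returned by getApiVersion, on the assumption that any qualifier follows a hyphen as in 4.2.0-rc03; ApiVersionCheck is a hypothetical class.

```java
// Hypothetical helper; assumes qualifiers such as "-rc03" follow a hyphen.
final class ApiVersionCheck {

    /** Strips a qualifier such as "-rc03" from a client version string. */
    static String baseVersion(String clientVersion) {
        int dash = clientVersion.indexOf('-');
        return dash < 0 ? clientVersion : clientVersion.substring(0, dash);
    }

    /** apiVersion stands in for the value returned by icat.getApiVersion(). */
    static boolean matches(String clientVersion, String apiVersion) {
        return baseVersion(clientVersion).equals(apiVersion);
    }
}
```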

List<String> getEntityNames()

Returns an alphabetic list of all the entity names known to ICAT. This is of most value for tools.

EntityInfo getEntityInfo(String beanName)

returns full information about a table given its name. For example:

    EntityInfo ei = icat.getEntityInfo("Investigation");
    System.out.println(ei.getClassComment());
    for (Constraint c : ei.getConstraints()) {
        System.out.println("Constraint columns: " + c.getFieldNames());
    }
    for (EntityField f : ei.getFields()) {
        System.out.println("Field names: " + f.getName());
    }

Prints out some information about the Investigation table. For a list of all available fields in EntityInfo and the objects it references please consult the javadoc for EntityInfo.

Administration Calls

To be authorized to use these administration calls you must be authenticated with a name listed in the rootUserNames in the icat.properties file.

List<String> getProperties(String sessionId)

lists the active contents of the icat.properties file. It does this by examining the properties after they have been read in so any superfluous definitions in the original properties file will not be seen. The current physical file is not re-examined.

void lucenePopulate(String sessionId, String entityName)

instructs lucene to populate indices for the specified entityName. This is useful if the database has been modified directly rather than by using the ICAT API. This call is asynchronous and simply places the request in a set of entity types to be populated. When the request is processed all lucene entries of the specified entity type are first cleared then the corresponding icat entries are scanned to re-populate lucene. To find the processing state use the luceneGetPopulating() call described below. Note that because of caching, ICAT should ideally be reloaded after any direct database modifications are made.

List<String> luceneGetPopulating(String sessionId)

returns a list of entity types to be processed for populating lucene following calls to lucenePopulate(). Normally the first item returned will be being processed currently. If nothing is returned then processing has completed.

void luceneCommit(String sessionId)

instructs lucene to update indices. Normally this is not needed as it will be done periodically according to the value of lucene.commitSeconds in the icat.properties file.

void luceneClear(String sessionId)

clears all the lucene indices. It does not commit itself; you may simply wait for the periodic commit depending upon the value of lucene.commitSeconds in the icat.properties file.

List<String> luceneSearch(String sessionId, String query, int maxCount, String entityName)

searches lucene indices and returns a list of entity_name:entity_id values. "query" is a lucene query. Queries can contain AND and OR in upper case as well as parentheses. The default operator is OR. Wildcards of * and ? are also supported. Other features are described for the searchText call. The maxCount argument specifies the maximum number of values to return and the entityName, if not null, restricts results to entities with that name.
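
Each returned value has to be split before the entity can be fetched with get. The following sketch does that split; LuceneHit is a hypothetical class and the only assumption is the entity_name:entity_id format stated above.

```java
// Hypothetical holder for one value returned by luceneSearch.
final class LuceneHit {
    final String entityName;
    final long entityId;

    private LuceneHit(String entityName, long entityId) {
        this.entityName = entityName;
        this.entityId = entityId;
    }

    /** Parses a value of the form entity_name:entity_id, for example "Dataset:42". */
    static LuceneHit parse(String value) {
        int colon = value.lastIndexOf(':');
        if (colon <= 0 || colon == value.length() - 1) {
            throw new IllegalArgumentException("Expected entity_name:entity_id but got " + value);
        }
        return new LuceneHit(value.substring(0, colon),
                Long.parseLong(value.substring(colon + 1)));
    }
}
```

The two fields map directly onto the second and third arguments of icat.get(sessionId, entityName, entityId).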
