train_crnn

This document has been machine translated.

Trains a CRNN model.

The method is invoked via JSON-RPC 2.0.

Example request

To train a CRNN model, call `prov.process` of the Provenance API and specify `train_crnn` as the `method` in its `params`.

An example JSON-RPC request is shown below.

{
  "jsonrpc": "2.0",
  "method": "prov.process",
  "params": {
    "method": "train_crnn",
    "params": {
      "output_ddc": "ddc:model",
      "input_ddc": "ddc:train_data",
      "target_column_name": "ox",
      "prediction_time": 3,
      "crnn_param_json": "{ (see parameter entry) }",
      "no_exec": true
    }
  },
  "id": "occurrence_jsonrpc_id"
}
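Because `crnn_param_json` is typed as a string, the detailed parameters have to be serialized to JSON before they are embedded in the request. The following is a minimal sketch of one way to build the request body in Python. The helper name `build_train_crnn_request` and the abbreviated `crnn_params` value are illustrative assumptions, and sending the request is left out because the endpoint is not described here.

```python
import json

# Illustrative, deliberately incomplete parameter set: a real request must
# contain every mandatory key listed in the parameter tables.
crnn_params = {"models": ["pointwise"]}


def build_train_crnn_request(request_id, crnn_params):
    """Return the JSON-RPC 2.0 request body as a JSON string (hypothetical helper)."""
    payload = {
        "jsonrpc": "2.0",
        "method": "prov.process",
        "params": {
            "method": "train_crnn",
            "params": {
                "output_ddc": "ddc:model",
                "input_ddc": "ddc:train_data",
                "target_column_name": "ox",
                "prediction_time": 3,
                # crnn_param_json is a string, so it is serialized separately
                "crnn_param_json": json.dumps(crnn_params),
                "no_exec": True,
            },
        },
        "id": request_id,
    }
    return json.dumps(payload)


body = build_train_crnn_request("occurrence_jsonrpc_id", crnn_params)
```

Serializing with `json.dumps` also avoids hand-written mistakes such as trailing commas, which JSON does not allow.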

Parameters

The following parameters can be specified for train_crnn.

Parameter name	Data type	Contents	Default value
`output_ddc`	string	output ddc name	mandatory
`output_mode`	string	output mode (`overwrite` or `error`)	`error`
`input_ddc`	string	input ddc name	mandatory
`target_column_name`	string	target column name for prediction	mandatory
`prediction_time`	number (integer)	time to predict	mandatory
`crnn_param_json`	string	CRNN parameters	mandatory
`no_exec`	boolean	execute asynchronously	`false`

In crnn_param_json you can specify detailed parameters that control how the model is trained. The items that can be specified are as follows.

Attribute (key)	Data type	Default value
`source.date_parts`	array of string	mandatory
`source.id_column`	string	mandatory
`source.columns`	array of string	mandatory
`models`	array of string	mandatory
`data.scales`	array of object	mandatory
`data.air_rank_boundaries`	array of number	mandatory
`data.per`	number	mandatory
`data.use_idw`	boolean	`false`
`data.spatial_data_block_size`	number (integer)	`16`
`data.spatial_data_size`	number (integer)	`32`
`training.max_cnn_epoch`	number (integer)	`2000`
`training.max_lstm_epoch`	number (integer)	`51`
`training.need_continue_training`	boolean	`true`
`training.is_softmax`	boolean	`false`
`training.imbalance`	boolean	`true`
`training.cnn_batch_size`	number (integer)	`512`
`training.rnn_batch_size`	number (integer)	`32`
`training.time_step`	number (integer)	`24`
`training.k_fold`	number (integer)	`6`
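As a rough client-side aid, the table above can be turned into a check that reports missing mandatory keys and fills in the documented defaults. This is only a sketch: the function name `resolve_crnn_params` is hypothetical, the example values are made up, and the keys are handled in the flat dotted form used by the table because this page does not state whether the actual JSON nests them.

```python
# Keys marked "mandatory" in the table above.
MANDATORY_KEYS = [
    "source.date_parts", "source.id_column", "source.columns", "models",
    "data.scales", "data.air_rank_boundaries", "data.per",
]

# Default values copied from the table above.
DEFAULTS = {
    "data.use_idw": False,
    "data.spatial_data_block_size": 16,
    "data.spatial_data_size": 32,
    "training.max_cnn_epoch": 2000,
    "training.max_lstm_epoch": 51,
    "training.need_continue_training": True,
    "training.is_softmax": False,
    "training.imbalance": True,
    "training.cnn_batch_size": 512,
    "training.rnn_batch_size": 32,
    "training.time_step": 24,
    "training.k_fold": 6,
}


def resolve_crnn_params(user_params):
    """Check mandatory keys and merge documented defaults (hypothetical helper)."""
    missing = [key for key in MANDATORY_KEYS if key not in user_params]
    if missing:
        raise ValueError("missing mandatory keys: " + ", ".join(missing))
    # User-supplied values take precedence over the defaults.
    return {**DEFAULTS, **user_params}


# Example values are assumptions for illustration only.
resolved = resolve_crnn_params({
    "source.date_parts": ["start_datetime"],
    "source.id_column": "meshcode",
    "source.columns": ["ox", "temperature"],
    "models": ["pointwise"],
    "data.scales": [],  # object layout is not documented on this page
    "data.air_rank_boundaries": [0.03, 0.06],
    "data.per": 1.0,
    "training.time_step": 12,
})
```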

`source.date_parts`

Lists the names of the columns in the table specified by input_ddc that hold time information.

`source.id_column`

Specifies the name of the column in the table specified by input_ddc that identifies the measurement point.

`source.columns`

Lists the names of the columns in the table specified by input_ddc that are used as features for model training.

`models`

Specifies the training targets. One or more of the following values can be specified.

Value	Meaning
`pointwise`	Train a model for each measurement point distinguished by `source.id_column`.
`average`	Learn the average value over all measurement points.
`maximum`	Learn the maximum value over all measurement points.

`data.scales`

Specifies the normalization coefficients for each column used as a feature.

`data.air_rank_boundaries`

Specifies the class boundaries used when training a classification model.
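To illustrate how a list of boundaries can partition a continuous value into classes, here is a small sketch. It assumes the boundaries are given in ascending order and that a value equal to a boundary falls into the upper class; neither convention is confirmed by this page, and the function name `rank_of` and the boundary values are hypothetical.

```python
from bisect import bisect_right


def rank_of(value, boundaries):
    """Return the class index of value for ascending boundaries (assumed convention)."""
    # N boundaries yield N + 1 classes, numbered 0..N.
    return bisect_right(boundaries, value)


boundaries = [0.03, 0.06, 0.12]  # made-up thresholds for illustration
ranks = [rank_of(v, boundaries) for v in (0.01, 0.05, 0.06, 0.2)]
```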

`data.per`

Specifies the normalization factor for the prediction target.

`training.max_cnn_epoch`

Specifies the maximum number of epochs for the CNN part of CRNN training.

`training.max_lstm_epoch`

Specifies the maximum number of epochs for the LSTM part of CRNN training.

`training.need_continue_training`

If `true`, training is performed in two stages.

`training.is_softmax`

If `true`, a classification model is trained. If `false`, a regression model is trained.

`training.imbalance`

When training a classification model, `true` indicates that the class distribution is assumed to be imbalanced.

`training.cnn_batch_size`

Specifies the batch size for CNN training.

`training.rnn_batch_size`

Specifies the batch size for RNN training.

`training.time_step`

Specifies how many past time steps are used for prediction.
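The relationship between a look-back window and a prediction horizon can be sketched as follows. This is a generic sliding-window illustration, not the trainer's actual implementation: it assumes that `prediction_time` is counted in the same steps as the rows of the input, which this page does not state, and the function name `make_windows` is hypothetical.

```python
def make_windows(values, time_step, prediction_time):
    """Pair each window of time_step past values with the value prediction_time steps later."""
    samples = []
    last_start = len(values) - time_step - prediction_time
    for start in range(last_start + 1):
        window = values[start:start + time_step]
        # The target lies prediction_time steps after the last value in the window.
        target = values[start + time_step - 1 + prediction_time]
        samples.append((window, target))
    return samples


# Toy series 0..9 with a 4-step window and a 3-step horizon.
samples = make_windows(list(range(10)), time_step=4, prediction_time=3)
```

With a larger `time_step`, each sample sees more history but fewer samples can be cut from the same series.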

`training.k_fold`

Specifies the number of folds for cross-validation during training.
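For readers unfamiliar with k-fold cross-validation, the sketch below splits sample indices into `k` contiguous folds and uses each fold once for validation. How the trainer actually forms its folds (contiguous, shuffled, or by time) is not described on this page, so the contiguous split and the function name `k_fold_indices` are assumptions.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, validation_indices) pairs over contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, validation))
        start = stop
    return folds


folds = k_fold_indices(14, 6)  # 6 matches the documented default for k_fold
```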

Input data

The ddc specified for input_ddc must have the following schema.

Column name	Contents	Remarks
`start_datetime`	start time	required column for an event table
`end_datetime`	end time	required column for an event table
(measurement point id)	id representing the measurement point
(attribute 1)	any attribute value (numeric type)
(attribute 2)	any attribute value (numeric type)
(...)	any attribute value (numeric type)

Output data

The schema of the ddc written to output_ddc is as follows. This ddc is required when running predictions with CRNN.

Column name	Contents	Remarks
`data_table`	name of training data table	the actual table name is recorded rather than the ddc name
`meshcode`	measurement point id
`target_column`	name of column for prediction
`prediction_time`	time to predict
`crnn_param_json`	CRNN parameters
`cnn_model_path`	CNN model file name
`lstm_model_path`	LSTM model file name

Return values

Information about the output ddc.
