train_crnn

This document has been machine translated.

Runs CRNN model training.

It is invoked through JSON-RPC 2.0.

Example request

To run CRNN model training, call prov.process of the Provenance API and set method to train_crnn in its params.

An example JSON-RPC request is shown below.

{
  "jsonrpc": "2.0",
  "method": "prov.process",
  "params": {
    "method": "train_crnn",
    "params": {
      "output_ddc": "ddc:model",
      "input_ddc": "ddc:train_data",
      "target_column_name": "ox",
      "prediction_time": 3,
      "crnn_param_json": "{ (see parameter entry) }",
      "no_exec": true
    }
  },
  "id": "occurrence_jsonrpc_id"
}
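
Because crnn_param_json has the data type string, the detailed settings have to be serialized to JSON text before they are placed in the request. The following is a minimal Python sketch of assembling the request body. The helper name build_train_crnn_request and the sample values are hypothetical; only the field names come from this page. Sending the request (endpoint, authentication) is not described here and is therefore left out.

```python
import json


def build_train_crnn_request(request_id, crnn_params, **train_params):
    """Assemble a JSON-RPC 2.0 body that calls train_crnn via prov.process.

    Hypothetical helper: only the field names are taken from this page.
    """
    params = dict(train_params)
    # crnn_param_json is a string parameter, so the settings are serialized here.
    params["crnn_param_json"] = json.dumps(crnn_params)
    return {
        "jsonrpc": "2.0",
        "method": "prov.process",
        "params": {"method": "train_crnn", "params": params},
        "id": request_id,
    }


body = build_train_crnn_request(
    "occurrence_jsonrpc_id",
    {},  # placeholder; see the crnn_param_json entry for the actual keys
    output_ddc="ddc:model",
    input_ddc="ddc:train_data",
    target_column_name="ox",
    prediction_time=3,
)
print(json.dumps(body, indent=2))
```

Serializing with json.dumps also avoids hand-written syntax slips such as trailing commas, which strict JSON parsers reject.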

Parameters

The following parameters can be specified for train_crnn.

| Parameter name | Data type | Description | Default value |
| --- | --- | --- | --- |
| output_ddc | string | output ddc name | mandatory |
| output_mode | string | output mode (overwrite or error) | error |
| input_ddc | string | input ddc name | mandatory |
| target_column_name | string | target column name for prediction | mandatory |
| prediction_time | number (integer) | time to predict | mandatory |
| crnn_param_json | string | CRNN parameters | mandatory |
| no_exec | boolean | execute asynchronously | false |

In crnn_param_json you can specify detailed parameters that control how the model is trained. The items that can be specified are as follows.

| Attribute (key) | Data type | Default value |
| --- | --- | --- |
| source.date_parts | array of string | mandatory |
| source.id_column | string | mandatory |
| source.columns | array of string | mandatory |
| models | array of string | mandatory |
| data.scales | array of object | mandatory |
| data.air_rank_boundaries | array of number | mandatory |
| data.per | number | mandatory |
| data.use_idw | boolean | false |
| data.spatial_data_block_size | number (integer) | 16 |
| data.spatial_data_size | number (integer) | 32 |
| training.max_cnn_epoch | number (integer) | 2000 |
| training.max_lstm_epoch | number (integer) | 51 |
| training.need_continue_training | boolean | true |
| training.is_softmax | boolean | false |
| training.imbalance | boolean | true |
| training.cnn_batch_size | number (integer) | 512 |
| training.rnn_batch_size | number (integer) | 32 |
| training.time_step | number (integer) | 24 |
| training.k_fold | number (integer) | 6 |
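
The defaults listed above can be captured in code and merged with the mandatory items before serializing. The sketch below assumes that a dotted key such as training.k_fold denotes a nested object ({"training": {"k_fold": ...}}); this page does not state the layout, so treat that as an assumption. The helper names and all sample values (column names, boundaries, the empty data.scales list) are hypothetical placeholders, and the shape of the objects in data.scales is not documented here.

```python
import copy
import json

# Default values copied from the list above.
# Assumption: dotted keys map to nested objects.
DEFAULTS = {
    "data": {
        "use_idw": False,
        "spatial_data_block_size": 16,
        "spatial_data_size": 32,
    },
    "training": {
        "max_cnn_epoch": 2000,
        "max_lstm_epoch": 51,
        "need_continue_training": True,
        "is_softmax": False,
        "imbalance": True,
        "cnn_batch_size": 512,
        "rnn_batch_size": 32,
        "time_step": 24,
        "k_fold": 6,
    },
}

# Items marked mandatory, written as key paths.
MANDATORY = [
    ("source", "date_parts"), ("source", "id_column"), ("source", "columns"),
    ("models",),
    ("data", "scales"), ("data", "air_rank_boundaries"), ("data", "per"),
]


def merge(base, override):
    """Recursively overlay override onto a copy of base."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def build_crnn_param_json(settings):
    """Fill in defaults, check mandatory items, and return a JSON string."""
    params = merge(DEFAULTS, settings)
    for path in MANDATORY:
        node = params
        for part in path:
            if not isinstance(node, dict) or part not in node:
                raise ValueError("missing mandatory item: " + ".".join(path))
            node = node[part]
    return json.dumps(params)


settings = {  # hypothetical sample values
    "source": {
        "date_parts": ["start_datetime"],
        "id_column": "meshcode",
        "columns": ["ox", "temperature"],
    },
    "models": ["pointwise"],
    "data": {"scales": [], "air_rank_boundaries": [0.06, 0.12], "per": 1.0},
    "training": {"k_fold": 5},
}
print(build_crnn_param_json(settings))
```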

source.date_parts

Lists the names of the columns in the table specified by input_ddc that are used as time information.

source.id_column

Specifies the name of the column in the table specified by input_ddc that identifies the measurement point.

source.columns

Lists the names of the columns in the table specified by input_ddc that are used as features for model training.

models

Specifies the training target. You can specify one or more of the following values.

| Value | Meaning |
| --- | --- |
| pointwise | Train for each measurement point distinguished by source.id_column. |
| average | Train on the average value of all measurement points. |
| maximum | Train on the maximum value of all measurement points. |

data.scales

Specifies normalization coefficients for each column to be used as features.

data.air_rank_boundaries

Specifies the bounds of each class used when training a classification model.
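
One common way to turn a continuous value into a class from a list of bounds is a sorted-threshold lookup. The sketch below is illustrative only: it assumes the bounds are in ascending order and that a value equal to a bound falls into the higher class. This page specifies neither point, and the function name to_class and the sample bounds are hypothetical.

```python
from bisect import bisect_right


def to_class(value, boundaries):
    """Return the index of the class that value falls into.

    Assumes boundaries is sorted ascending; n bounds give n + 1 classes.
    """
    return bisect_right(boundaries, value)


boundaries = [0.04, 0.08, 0.12]  # hypothetical bounds
for v in (0.01, 0.04, 0.10, 0.30):
    print(v, to_class(v, boundaries))
```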

data.per

Specifies the normalization factor of the prediction target.

training.max_cnn_epoch

Specifies the maximum number of epochs for the CNN stage of CRNN model training.

training.max_lstm_epoch

Specifies the maximum number of epochs for the LSTM stage of CRNN model training.

training.need_continue_training

If true is specified, two-step training is performed.

training.is_softmax

If true, a classification model is trained. If false, a regression model is trained.

training.imbalance

When a classification model is trained, specifying true makes training assume that the class distribution is imbalanced.

training.cnn_batch_size

Specifies the batch size for CNN training.

training.rnn_batch_size

Specifies the batch size for RNN training.

training.time_step

Specifies how many past time steps are used for prediction.
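
To illustrate what using past steps means, the sketch below slices a series into windows of time_step consecutive observations and pairs each window with the value prediction_time steps after the window's last observation. This is a generic sliding-window construction and an assumption about how time_step and prediction_time interact, not a description of the actual implementation. make_windows is a hypothetical name.

```python
def make_windows(series, time_step, prediction_time):
    """Pair each run of time_step past values with a later target value."""
    samples = []
    last_start = len(series) - time_step - prediction_time
    for start in range(last_start + 1):
        window = series[start:start + time_step]
        # The target lies prediction_time steps after the window's last value.
        target = series[start + time_step - 1 + prediction_time]
        samples.append((window, target))
    return samples


series = list(range(10))  # stand-in for an hourly measurement series
for window, target in make_windows(series, time_step=4, prediction_time=3):
    print(window, "->", target)
```

With the default time_step of 24, each training sample under this assumption would carry 24 consecutive observations.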

training.k_fold

Specifies the number of folds used for cross-validation during training.
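
As a reminder of what the number of divisions controls, the following sketch splits sample indices into k folds and uses each fold once for validation. It is a generic k-fold scheme; whether the actual training shuffles the data or splits it into contiguous blocks is not stated on this page, so the contiguous split here is an assumption, and k_fold_indices is a hypothetical name.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


for train, validation in k_fold_indices(12, k=6):
    print(validation, len(train))
```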

Input data

The ddc specified for input_ddc should have the following schema.

| Column name | Contents | Remarks |
| --- | --- | --- |
| start_datetime | start time | required column in event table |
| end_datetime | end time | required column in event table |
| (measurement point id) | id representing the measurement point | |
| (attribute 1) | any attribute value (numeric type) | |
| (attribute 2) | any attribute value (numeric type) | |
| (...) | any attribute value (numeric type) | |
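
A quick check of the column names can catch schema problems before training is requested. The sketch below assumes the column names of the input table are available as a plain list; how to obtain them from a ddc is not covered on this page. check_input_columns and the sample column names are hypothetical.

```python
# Columns the schema above marks as required for an event table.
REQUIRED_EVENT_COLUMNS = ("start_datetime", "end_datetime")


def check_input_columns(table_columns, id_column, feature_columns):
    """Return the expected column names that are absent from the table."""
    needed = list(REQUIRED_EVENT_COLUMNS) + [id_column] + list(feature_columns)
    present = set(table_columns)
    return [name for name in needed if name not in present]


missing = check_input_columns(
    table_columns=["start_datetime", "end_datetime", "meshcode", "ox"],
    id_column="meshcode",
    feature_columns=["ox", "temperature"],
)
print(missing)
```

An empty list means every column named in source.id_column and source.columns, plus the two event-table columns, is present.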

Output data

The ddc written to output_ddc has the following schema. This ddc is required when running predictions with the CRNN model.

| Column name | Contents | Remarks |
| --- | --- | --- |
| data_table | name of training data table | the actual table name is recorded, not the ddc name |
| meshcode | measurement point id | |
| target_column | name of column for prediction | |
| prediction_time | time to predict | |
| crnn_param_json | CRNN parameters | |
| cnn_model_path | CNN model file name | |
| lstm_model_path | LSTM model file name | |

Return values

Information about the output ddc.