train_crnn
Note
This document has been machine translated.
Trains a CRNN model.
The method is invoked via JSON-RPC 2.0.
Example request
To train a CRNN model, call prov.process of the Provenance API and specify train_crnn as the method in its params.
An example JSON-RPC request is shown below.
{
  "jsonrpc": "2.0",
  "method": "prov.process",
  "params": {
    "method": "train_crnn",
    "params": {
      "output_ddc": "ddc:model",
      "input_ddc": "ddc:train_data",
      "target_column_name": "ox",
      "prediction_time": 3,
      "crnn_param_json": "{ (see parameter entry) }",
      "no_exec": true
    }
  },
  "id": "occurrence_jsonrpc_id"
}
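Because crnn_param_json is typed as a string, the CRNN parameter object has to be serialized before it is placed in the request. The following is a minimal Python sketch of building the request body shown above. The helper name build_train_crnn_request is hypothetical, and the empty parameter object is only a placeholder; a real request needs the mandatory items described under Parameters. How the body is sent to the server is not covered here.

```python
import json


def build_train_crnn_request(crnn_params: dict, request_id: str) -> dict:
    """Build a JSON-RPC 2.0 request body that calls train_crnn via prov.process."""
    return {
        "jsonrpc": "2.0",
        "method": "prov.process",
        "params": {
            "method": "train_crnn",
            "params": {
                "output_ddc": "ddc:model",
                "input_ddc": "ddc:train_data",
                "target_column_name": "ox",
                "prediction_time": 3,
                # crnn_param_json is a string, so the object is serialized first.
                "crnn_param_json": json.dumps(crnn_params),
                "no_exec": True,
            },
        },
        "id": request_id,
    }


# Placeholder parameters; the mandatory items are omitted in this sketch.
request = build_train_crnn_request({}, "occurrence_jsonrpc_id")
body = json.dumps(request)  # text to send as the JSON-RPC request body
```

Serializing with json.dumps rather than writing the string by hand avoids quoting mistakes in the nested JSON.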
Parameters
The following parameters can be specified for train_crnn.
| Parameter name | Data type | Contents | Default value |
|---|---|---|---|
| output_ddc | string | output ddc name | mandatory |
| output_mode | string | output mode (overwrite or error) | error |
| input_ddc | string | input ddc name | mandatory |
| target_column_name | string | target column name for prediction | mandatory |
| prediction_time | number (integer) | time to predict | mandatory |
| crnn_param_json | string | CRNN parameters | mandatory |
| no_exec | boolean | execute asynchronously | false |
In crnn_param_json you can specify detailed parameters that control how the model is trained.
The items that can be specified are as follows.
| Attribute (key) | Data type | Default value |
|---|---|---|
| source.date_parts | array of string | mandatory |
| source.id_column | string | mandatory |
| source.columns | array of string | mandatory |
| models | array of string | mandatory |
| data.scales | array of object | mandatory |
| data.air_rank_boundaries | array of number | mandatory |
| data.per | number | mandatory |
| data.use_idw | boolean | false |
| data.spatial_data_block_size | number (integer) | 16 |
| data.spatial_data_size | number (integer) | 32 |
| training.max_cnn_epoch | number (integer) | 2000 |
| training.max_lstm_epoch | number (integer) | 51 |
| training.need_continue_training | boolean | true |
| training.is_softmax | boolean | false |
| training.imbalance | boolean | true |
| training.cnn_batch_size | number (integer) | 512 |
| training.rnn_batch_size | number (integer) | 32 |
| training.time_step | number (integer) | 24 |
| training.k_fold | number (integer) | 6 |
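The mandatory items and default values in the table above can be checked on the client side before a request is sent. The sketch below is one possible way to do that in Python. It keeps the dotted keys flat, because this document does not state whether they correspond to nested JSON objects. The helper name with_defaults and all example values (column names, boundaries, factors) are hypothetical.

```python
# Items marked "mandatory" in the table.
MANDATORY_KEYS = [
    "source.date_parts", "source.id_column", "source.columns", "models",
    "data.scales", "data.air_rank_boundaries", "data.per",
]

# Default values copied from the table.
DEFAULTS = {
    "data.use_idw": False,
    "data.spatial_data_block_size": 16,
    "data.spatial_data_size": 32,
    "training.max_cnn_epoch": 2000,
    "training.max_lstm_epoch": 51,
    "training.need_continue_training": True,
    "training.is_softmax": False,
    "training.imbalance": True,
    "training.cnn_batch_size": 512,
    "training.rnn_batch_size": 32,
    "training.time_step": 24,
    "training.k_fold": 6,
}


def with_defaults(params: dict) -> dict:
    """Reject missing mandatory items and fill in the documented defaults."""
    missing = [key for key in MANDATORY_KEYS if key not in params]
    if missing:
        raise ValueError(f"missing mandatory items: {missing}")
    return {**DEFAULTS, **params}  # explicit values override defaults


params = with_defaults({
    "source.date_parts": ["start_datetime"],  # hypothetical column names
    "source.id_column": "meshcode",
    "source.columns": ["ox"],
    "models": ["pointwise"],
    "data.scales": [],  # element shape is not described here; left empty
    "data.air_rank_boundaries": [0.04, 0.08, 0.12],  # hypothetical values
    "data.per": 1.0,  # hypothetical value
    "training.k_fold": 3,  # overrides the default of 6
})
```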
source.date_parts
Lists the column names in the table specified by input_ddc that are used as time information.
source.id_column
Specifies the column name in the table specified by input_ddc that identifies the measurement point.
source.columns
Lists the column names in the table specified by input_ddc that are used as features for model training.
models
Specifies the training targets. One or more of the following values can be specified.
| value | meaning |
|---|---|
| pointwise | Train separately for each measurement point distinguished by source.id_column. |
| average | Learn the average value over all measurement points. |
| maximum | Learn the maximum value over all measurement points. |
data.scales
Specifies the normalization coefficient for each column used as a feature.
data.air_rank_boundaries
Specifies the class boundaries used when training a classification model.
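As an illustration of what class boundaries mean, the following Python sketch maps a value to a class index with the standard bisect module. It is only an assumption about how boundaries partition the value range: this document does not state whether a value equal to a boundary belongs to the lower or the upper class. The function name to_class and the boundary values are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def to_class(value: float, boundaries: Sequence[float]) -> int:
    """Return the class index of value; boundaries must be sorted ascending.

    N boundaries produce N + 1 classes. A value equal to a boundary is put
    in the upper class here, which is an assumption of this sketch.
    """
    return bisect_right(boundaries, value)


boundaries = [0.04, 0.08, 0.12]  # hypothetical values
labels = [to_class(v, boundaries) for v in (0.01, 0.05, 0.12, 0.2)]
# labels == [0, 1, 3, 3]
```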
data.per
Specifies the normalization factor for the prediction target.
training.max_cnn_epoch
Specifies the maximum number of epochs for the CNN part of CRNN model training.
training.max_lstm_epoch
Specifies the maximum number of epochs for the LSTM part of CRNN model training.
training.need_continue_training
If true, two-step training is performed.
training.is_softmax
If true, a classification model is trained. If false, a regression model is trained.
training.imbalance
If true, training of the classification model assumes that the class distribution is imbalanced.
training.cnn_batch_size
Specifies the batch size for CNN training.
training.rnn_batch_size
Specifies the batch size for RNN training.
training.time_step
Specifies how many past steps are used for prediction.
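The relationship between training.time_step and prediction_time can be pictured as a sliding window over a time series. The Python sketch below is a generic illustration, not the tool's actual preprocessing. It assumes that prediction_time is counted in the same steps as the rows of the series, which this document does not confirm. The function name make_windows is hypothetical.

```python
def make_windows(series, time_step, prediction_time):
    """Pair each window of time_step past values with the value
    prediction_time steps after the end of that window."""
    samples = []
    last_start = len(series) - time_step - prediction_time
    for start in range(last_start + 1):
        window = series[start:start + time_step]
        target = series[start + time_step + prediction_time - 1]
        samples.append((window, target))
    return samples


series = list(range(10))  # stand-in for one measurement point's values
samples = make_windows(series, time_step=4, prediction_time=3)
# samples[0] == ([0, 1, 2, 3], 6)
```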
training.k_fold
Specifies the number of folds for cross-validation during training.
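For readers unfamiliar with k-fold cross-validation, the following Python sketch splits sample indices into k contiguous folds, each used once for validation. Whether the tool uses contiguous or shuffled folds is not stated in this document, so the contiguous split is an assumption. The function name k_fold_indices is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds.

    Returns a list of (train_indices, validation_indices) pairs.
    """
    # Spread the remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, validation))
        start += size
    return folds


folds = k_fold_indices(12, 6)  # 6 is the documented default for training.k_fold
```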
Input data
The ddc specified for input_ddc should have the following schema.
| column name | contents | remarks |
|---|---|---|
| start_datetime | start time | required column in event table |
| end_datetime | end time | required column in event table |
| (measurement point id) | id representing the measurement point | |
| (attribute 1) | any attribute value (numeric type) | |
| (attribute 2) | any attribute value (numeric type) | |
| (...) | any attribute value (numeric type) | |
Output data
The schema of ddc output to output_ddc is as follows.
This ddc is required when performing predictions with CRNN.
| column name | contents | remarks |
|---|---|---|
| data_table | name of training data table | real table name will be recorded instead of ddc |
| meshcode | measurement point id | |
| target_column | name of column for prediction | |
| prediction_time | time to predict | |
| crnn_param_json | CRNN parameters | |
| cnn_model_path | CNN model file name | |
| lstm_model_path | LSTM model file name | |
Return values
Information about the output ddc.