train_lgbm

Note

This document has been machine translated.

Trains an LGBM (LightGBM) model.

The method is invoked via JSON-RPC 2.0.

Example request

To train an LGBM model, call prov.process of the Provenance API and specify train_lgbm as its method parameter.

An example JSON-RPC request is as follows:

{
  "jsonrpc": "2.0",
  "method": "prov.process",
  "params": {
    "method": "train_lgbm",
    "params": {
      "output_ddc": "ddc:model",
      "input_ddc": "ddc:train_data",
      "param_json": "{ (see parameter entry) }",
      "no_exec": true
    }
  },
  "id": "provenance_jsonrpc_id"
}
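
The request above could be assembled in Python roughly as follows. This is a minimal sketch: the helper name `build_train_lgbm_request` and the placeholder training parameters are assumptions made for illustration, and only the field names are taken from the example request. Because `param_json` is passed as a string, the training parameters are serialized separately before being embedded.

```python
import json


def build_train_lgbm_request(input_ddc, output_ddc, train_params, no_exec, request_id):
    """Build a JSON-RPC 2.0 request body for prov.process / train_lgbm.

    Hypothetical helper for illustration; only the field names come from
    the example request in this document.
    """
    return {
        "jsonrpc": "2.0",
        "method": "prov.process",
        "params": {
            "method": "train_lgbm",
            "params": {
                "output_ddc": output_ddc,
                "input_ddc": input_ddc,
                # param_json is a string, so the dict is serialized first.
                "param_json": json.dumps(train_params),
                # Passed through unchanged; its meaning is not described here.
                "no_exec": no_exec,
            },
        },
        "id": request_id,
    }


request = build_train_lgbm_request(
    input_ddc="ddc:train_data",
    output_ddc="ddc:model",
    train_params={"target": {"column": "nox"}},  # placeholder, assumed nesting
    no_exec=True,
    request_id="provenance_jsonrpc_id",
)

# Serialize the whole request; json.dumps never emits trailing commas.
body = json.dumps(request, indent=2)
print(body)
```

Building the body from a dictionary rather than by string concatenation avoids syntax slips such as trailing commas and handles the quoting of the embedded `param_json` string automatically.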

Parameters

The following parameters can be specified for train_lgbm.

Parameter name	Description
`output_ddc`	Name of the output ddc
`output_mode`	Output mode (`overwrite` or `error`)
`input_ddc`	Name of the input ddc
`param_json`	Parameters for model training

In param_json you can specify detailed parameters that control how the model is trained.

`source.date_parts`

Lists the names of the columns in the table specified by input_ddc that are used as time information.

`source.id_column`

Specifies the name of the column in the table specified by input_ddc that uniquely identifies a measurement point.

`target.column`

Specifies the name of the column in the table specified by input_ddc that is the forecast target.

`target.prediction_time`

Specifies how many steps ahead the value should be predicted.

`lgbm.features`

Lists the columns in the table specified by input_ddc that are used as features, together with their time ranges, in the following format:

"features": [
  {"column": "nox", "range": 24},
  {"column": "temp", "range": [24, -3]},
  ...
]

Specify the column name in `column` and the time range in `range`. The time range can be given in any of the following formats.

Specified format	Example	Meaning
positive integer	`24`	use the base-time value and the values from the preceding 24 hours
zero	`0`	use only the base-time value (past values are not used for prediction)
negative integer	`-3`	use the base-time value and the values from the following 3 hours
2-element array	`[24, -3]`	use the values from 24 hours before to 3 hours after the base time (both ends included)
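
The range formats in the table can be illustrated with a small Python sketch that expands a `range` value into hour offsets relative to the base time. The function name `range_to_offsets` and the sign convention of the result (negative offsets lie in the past, positive offsets in the future) are assumptions made for this illustration; they are not part of the API.

```python
def range_to_offsets(spec):
    """Expand a feature `range` value into hour offsets from the base time.

    Illustrative only. In the returned list, negative offsets are hours
    before the base time and positive offsets are hours after it.
    """
    if isinstance(spec, int) and not isinstance(spec, bool):
        # A single integer: positive looks back, negative looks ahead.
        back, ahead = (spec, 0) if spec >= 0 else (0, spec)
    elif isinstance(spec, (list, tuple)) and len(spec) == 2:
        # A 2-element array: [hours before, hours after written as a negative].
        back, ahead = spec
    else:
        raise ValueError(f"unsupported range specification: {spec!r}")
    # Both ends are included, hence the +1 on the upper bound.
    return list(range(-back, -ahead + 1))


print(range_to_offsets(0))         # base time only
print(range_to_offsets(-3))        # base time and the following 3 hours
print(len(range_to_offsets(24)))   # base time plus the preceding 24 hours
print(range_to_offsets([24, -3])[:2], range_to_offsets([24, -3])[-2:])
```

Under this reading, `24` yields 25 values per feature column (the base time plus 24 past hours) and `[24, -3]` yields 28.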

`lgbm.regressor`

You can specify parameters to be passed to LightGBM's LGBMRegressor.
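
Putting the entries above together, a complete `param_json` value might be produced as in the following sketch. The nesting of the keys is inferred from the dotted parameter names and is an assumption, as are all concrete values (the column names, the prediction horizon, and the choice of regressor settings). `n_estimators` and `learning_rate` are ordinary LGBMRegressor constructor arguments, used here only as examples.

```python
import json

# Hypothetical training parameters; the nested layout is inferred from the
# dotted names (source.date_parts, target.column, ...) and is not confirmed.
train_params = {
    "source": {
        "date_parts": ["start_datetime"],   # assumed time column
        "id_column": "location_code",       # assumed measurement point id column
    },
    "target": {
        "column": "nox",                    # column to forecast
        "prediction_time": 24,              # predict 24 steps ahead
    },
    "lgbm": {
        "features": [
            {"column": "nox", "range": 24},
            {"column": "temp", "range": [24, -3]},
        ],
        # Passed to LightGBM's LGBMRegressor.
        "regressor": {"n_estimators": 200, "learning_rate": 0.05},
    },
}

# param_json is sent as a string inside the JSON-RPC request.
param_json = json.dumps(train_params)
print(param_json)
```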

Input data

The ddc specified for input_ddc must have the following schema:

Column name	Contents	Notes
`start_datetime`	start time	required column of an event table
`end_datetime`	end time	required column of an event table
(measurement point id)	id representing the measurement point
(attribute 1)	any attribute value (numeric type)
(attribute 2)	any attribute value (numeric type)
(...)	any attribute value (numeric type)
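
As a rough illustration of this schema, the sketch below checks a few in-memory rows before they would be loaded into the input table. The helper `check_input_rows`, the id column name `location_code`, and the sample attribute columns `nox` and `temp` are hypothetical; the actual validation performed by train_lgbm is not described in this document.

```python
from datetime import datetime
from numbers import Real

# Columns that every event table must contain, per the schema above.
REQUIRED_COLUMNS = ("start_datetime", "end_datetime")


def check_input_rows(rows, id_column):
    """Return a list of schema problems found in `rows` (a list of dicts)."""
    problems = []
    for i, row in enumerate(rows):
        for name in REQUIRED_COLUMNS + (id_column,):
            if name not in row:
                problems.append(f"row {i}: missing column {name!r}")
        for name, value in row.items():
            if name in REQUIRED_COLUMNS or name == id_column:
                continue
            # Every remaining column is treated as a numeric attribute.
            if isinstance(value, bool) or not isinstance(value, Real):
                problems.append(f"row {i}: attribute {name!r} is not numeric")
    return problems


rows = [
    {"start_datetime": datetime(2024, 1, 1, 0), "end_datetime": datetime(2024, 1, 1, 1),
     "location_code": "A01", "nox": 0.021, "temp": 5.4},
    {"start_datetime": datetime(2024, 1, 1, 1), "end_datetime": datetime(2024, 1, 1, 2),
     "location_code": "A01", "nox": "n/a", "temp": 5.1},
]

for problem in check_input_rows(rows, id_column="location_code"):
    print(problem)
```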

Output data

The schema of the ddc written to output_ddc is as follows. This ddc is required when performing LGBM forecasting.

Column name	Contents	Notes
`method`	learning method	fixed to the string `lgbm`
`train_data`	input data name	the actual table name is recorded rather than the ddc name
`param_json`	parameters	parameter string used for training
`location_code`	id representing the measurement location
`target`	prediction target	value of `target.column`
`prediction_time`	prediction time	value of `target.prediction_time`
`model_path`	model file path	the model file itself is stored on the compute node

Return value

Information about the output ddc.
