import_category
This document has been machine translated. Registers category definitions to be used for extracting correlation patterns.
The API is executed via JSON-RPC v2.0.
Example request
`import_category` is a kind of analysis API, and can be executed by specifying `api_method="import_category"` in the `process` method of the Provenance API.
The following is an example of starting a Provenance session, executing `import_category`, and then closing the session.
import json
from xdata_prov.client import Api

api = Api()
api.begin_session()  # start a Provenance session
api.process(api_method="import_category", api_params={
    "output_ddc": "ddc:jartic_xrain_category",
    # category_json is passed as a JSON string, so the list is serialized with json.dumps
    "category_json": json.dumps([
        { "item": "agg_jartic_max_length", "min": 10, "max": 100, "category": "cl10" },
        { "item": "agg_jartic_max_length", "min": 100, "max": 300, "category": "cl100" },
        { "item": "agg_jartic_max_length", "min": 300, "max": 600, "category": "cl300" },
        { "item": "agg_xrain_max_value", "min": 0.1, "max": 1, "category": "rf01" },
        { "item": "agg_xrain_max_value", "min": 1, "max": 5, "category": "rf1" },
        { "item": "agg_xrain_max_value", "min": 5, "max": 10, "category": "rf5" }
    ])
})
api.commit()
api.end_session()  # close the session
Parameters
When calling the `process` method with `api_method="import_category"`, `api_params` takes a dict containing the following keys.
Parameters with a blank default value are required.
key | description | default value |
---|---|---|
output_ddc | ddc to which the output is registered | |
output_mode | output mode (overwrite or error) | error |
category_json | category definition (in JSON format) | |
Input data
Category definition
The category definition specified by `category_json` is a JSON array whose elements are JSON objects.
Each element of the array has the following keys; `min` and `max` may be omitted (see the interpretation rules below).
key | type | description |
---|---|---|
item | string | column name |
min | number | minimum value (including the specified value) |
max | number | maximum value (does not include the specified value) |
category | string | category string |
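As a rough illustration of the key and type table above, the following sketch checks a list of definitions on the client side before serializing it for `category_json`. The helper name `to_category_json` is hypothetical and not part of the Provenance API. It assumes that `min` and `max` may be omitted, as the interpretation rules in this section indicate.

```python
import json
from numbers import Number


def to_category_json(definitions):
    """Validate category definitions and return them as a JSON string.

    Hypothetical client-side helper; the server may apply different checks.
    """
    for i, d in enumerate(definitions):
        # item and category are strings according to the table above
        for key in ("item", "category"):
            if not isinstance(d.get(key), str):
                raise ValueError(f"element {i}: '{key}' must be a string")
        # min and max are numbers when present (assumed optional)
        for bound in ("min", "max"):
            if bound in d:
                v = d[bound]
                if isinstance(v, bool) or not isinstance(v, Number):
                    raise ValueError(f"element {i}: '{bound}' must be a number")
    return json.dumps(definitions)


category_json = to_category_json([
    {"item": "agg_xrain_max_value", "min": 0.1, "max": 1, "category": "rf01"},
    {"item": "start_datetime%hour", "max": 12, "category": "am"},
])
```

The returned string can then be passed as the `category_json` value in `api_params`.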
Interpretation of category definition
Category definitions are used in `extract_items`.
`extract_items` converts each record of the input transaction into symbolic information according to the category definition.
The conversion rules are as follows.
- If the value of the column specified by `item` is greater than or equal to `min` and less than `max`, output the symbol specified by `category`.
- If min is not specified, the minimum value will be unrestricted.
- If max is not specified, the maximum value will be unrestricted.
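The conversion rules above amount to a half-open interval test, `min <= value < max`, with a missing bound treated as unrestricted. The sketch below only illustrates that rule in plain Python; it is not the implementation used by `extract_items`. The function names are hypothetical, and the `%` notations for `item` and `category` are not handled.

```python
def categorize(value, definition):
    """Return the category symbol if value lies in [min, max), else None."""
    lo = definition.get("min")  # None means no lower limit
    hi = definition.get("max")  # None means no upper limit
    if lo is not None and value < lo:
        return None
    if hi is not None and value >= hi:  # max itself is excluded
        return None
    return definition["category"]


def symbols_for(record, definitions):
    """Collect the symbols of every definition that matches the record."""
    symbols = []
    for d in definitions:
        symbol = categorize(record[d["item"]], d)
        if symbol is not None:
            symbols.append(symbol)
    return symbols


definitions = [
    {"item": "agg_xrain_max_value", "min": 0.1, "max": 1, "category": "rf01"},
    {"item": "agg_xrain_max_value", "min": 1, "max": 5, "category": "rf1"},
]
# A value of exactly 1 is excluded from rf01 (max) and included in rf1 (min).
print(symbols_for({"agg_xrain_max_value": 1}, definitions))  # ['rf1']
```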
In `item`, you can extract a specific field from a timestamp-type column by appending `%field` to the column name.
For example, `start_datetime%hour` applies the conversion to the hour (0 to 23) of the `start_datetime` column.
Specifically, this is interpreted as the SQL expression `extract(hour from start_datetime)`.
Fields other than `hour` that SQL `extract` accepts can also be specified.
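To make the `%field` notation concrete, here is a minimal sketch of how an `item` string could be turned into the SQL expression described above. The function `item_to_sql` is a hypothetical illustration, not the parser used by the API. It performs no identifier quoting or validation of the field name.

```python
def item_to_sql(item):
    """Translate 'column' or 'column%field' into a SQL expression string."""
    column, sep, field = item.partition("%")
    if not sep:
        # No % notation: the item is a plain column name
        return column
    if not column or not field:
        raise ValueError(f"malformed item: {item!r}")
    return f"extract({field} from {column})"


print(item_to_sql("start_datetime%hour"))    # extract(hour from start_datetime)
print(item_to_sql("agg_xrain_max_value"))    # agg_xrain_max_value
```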
If the `category` string contains `%value`, that part is replaced by the column value.
This notation allows a column specified by `item` to be used directly as a category when it contains discrete values.
Here are examples of defining categories using the `%` notation:
- `{ "item": "start_datetime%hour", "max": 12, "category": "am" }`
  - Assigns the `am` category if the hour field of `start_datetime` is less than 12.
- `{ "item": "start_datetime%dow", "category": "dow%value" }`
  - Assigns the day of the week of `start_datetime` as the category (Sunday: `dow0` to Saturday: `dow6`).
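The `%value` substitution in the day-of-week example can be sketched as follows. Both function names are hypothetical. The day-of-week helper only mimics the Sunday=0 to Saturday=6 numbering stated above using Python's `datetime`, and the way the real system formats the substituted value is an assumption.

```python
from datetime import datetime


def render_category(template, value):
    """Replace every %value in the category string with the column value."""
    return template.replace("%value", str(value))


def sql_dow(ts):
    """Day of week numbered Sunday=0 .. Saturday=6.

    datetime.weekday() numbers Monday=0 .. Sunday=6, so shift by one.
    """
    return (ts.weekday() + 1) % 7


ts = datetime(2024, 1, 7, 9, 30)  # a Sunday
print(render_category("dow%value", sql_dow(ts)))  # dow0
```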
Output data
Category definition table
The "category definition table" is output to the destination ddc specified by `output_ddc`.
This table will have the following schema.
column name | data type | description |
---|---|---|
item | text | column name |
min | double precision | minimum value (including the specified value) |
max | double precision | maximum value (does not include the specified value) |
category | text | category string |
Return value
`import_category` returns the ddc information of the output destination ddc.
This is the behavior defined in the specification of the `process` method of the Provenance API.