import_category
Note
This document has been machine translated.
Registers category definitions to be used for extracting correlation patterns.
The API is executed via JSON-RPC v2.0.
Example request
import_category is one of the analysis APIs. It is executed by specifying api_method="import_category" in the process method of the Provenance API.
The following is an example of starting a Provenance session, executing import_category, and then closing the session.
import json
from xdata_prov.client import Api

api = Api()
api.begin_session()
# Register the category definitions; category_json is passed as a JSON string.
api.process(api_method="import_category", api_params={
    "output_ddc": "ddc:jartic_xrain_category",
    "category_json": json.dumps([
        { "item": "agg_jartic_max_length", "min": 10, "max": 100, "category": "cl10" },
        { "item": "agg_jartic_max_length", "min": 100, "max": 300, "category": "cl100" },
        { "item": "agg_jartic_max_length", "min": 300, "max": 600, "category": "cl300" },
        { "item": "agg_xrain_max_value", "min": 0.1, "max": 1, "category": "rf01" },
        { "item": "agg_xrain_max_value", "min": 1, "max": 5, "category": "rf1" },
        { "item": "agg_xrain_max_value", "min": 5, "max": 10, "category": "rf5" }
    ])
})
api.commit()
api.end_session()
Parameters
When the process method is called with api_method="import_category", api_params takes a dict containing the following keys.
Parameters with a blank default value are required.
| key | description | default value |
|---|---|---|
| output_ddc | destination ddc to register the category definitions to | |
| output_mode | output mode (overwrite or error) | error |
| category_json | category definition (in JSON format) | |
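The parameter table can be illustrated with a small helper that assembles api_params. This is only a sketch: `build_import_category_params` is a hypothetical name that is not part of the Provenance API, and it assumes that category_json is passed as a JSON string, as in the example request.

```python
import json


def build_import_category_params(output_ddc, categories, output_mode="error"):
    """Assemble the api_params dict for api_method="import_category".

    Hypothetical helper for illustration; not part of xdata_prov.
    """
    # The table lists only these two output modes; "error" is the default.
    if output_mode not in ("overwrite", "error"):
        raise ValueError("output_mode must be 'overwrite' or 'error'")
    return {
        "output_ddc": output_ddc,
        "output_mode": output_mode,
        # Serialize the list of category definitions to a JSON string.
        "category_json": json.dumps(categories),
    }


params = build_import_category_params(
    "ddc:jartic_xrain_category",
    [{"item": "agg_xrain_max_value", "min": 0.1, "max": 1, "category": "rf01"}],
    output_mode="overwrite",
)
```

The resulting dict would then be passed as api_params to the process method.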
Input data
category definition
The category definition specified by category_json is a JSON array whose elements are JSON objects.
Each element of the array has the following keys. min and max may be omitted, in which case that bound is unrestricted.
| key | type | description |
|---|---|---|
| item | string | column name |
| min | number | minimum value (including the specified value) |
| max | number | maximum value (does not include the specified value) |
| category | string | category string |
Interpretation of category definition
Category definitions are used in extract_items.
extract_items converts each record of the input transaction into symbolic information according to the category definition.
The conversion rules are as follows.
- If the value of the column specified by `item` is greater than or equal to min and less than max, the symbol specified by category is output.
- If min is not specified, the minimum value will be unrestricted.
- If max is not specified, the maximum value will be unrestricted.
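The conversion rules above can be sketched in a few lines of Python. This is not the implementation of extract_items; `categorize` is a hypothetical function, and skipping records whose column is missing is an assumption that the text does not state.

```python
def categorize(record, definitions):
    """Return the category symbols whose [min, max) range contains the value."""
    symbols = []
    for d in definitions:
        value = record.get(d["item"])
        if value is None:
            continue  # assumption: a missing column yields no symbol
        lo = d.get("min")  # None means no lower bound
        hi = d.get("max")  # None means no upper bound
        # min is inclusive, max is exclusive
        if (lo is None or value >= lo) and (hi is None or value < hi):
            symbols.append(d["category"])
    return symbols


definitions = [
    {"item": "agg_jartic_max_length", "min": 10, "max": 100, "category": "cl10"},
    {"item": "agg_jartic_max_length", "min": 100, "max": 300, "category": "cl100"},
    {"item": "agg_xrain_max_value", "min": 0.1, "max": 1, "category": "rf01"},
    {"item": "agg_xrain_max_value", "min": 1, "max": 5, "category": "rf1"},
]
record = {"agg_jartic_max_length": 100, "agg_xrain_max_value": 0.5}
print(categorize(record, definitions))  # ['cl100', 'rf01']
```

A value of exactly 100 falls into cl100 rather than cl10, because max is exclusive and min is inclusive.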
In item, a specific field can be extracted from a timestamp-type column by appending %field to the column name.
For example, start_datetime%hour applies the conversion to the hour (0 to 23) of the start_datetime column.
Specifically, this is interpreted as the SQL extract(hour from start_datetime).
Fields other than hour that SQL extract supports can also be specified.
If the category string contains %value, that part is replaced by the column value.
This notation lets a column specified by item be used directly as a category when it holds discrete values.
Here is an example of defining categories using the % notation:
- `{ "item": "start_datetime%hour", "max": 12, "category": "am" }`
  - Gives the `am` category if the hour field of `start_datetime` is less than 12
- `{ "item": "start_datetime%dow", "category": "dow%value" }`
  - Assigns the day of the week of `start_datetime` as the category (Sunday: `dow0` to Saturday: `dow6`)
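The % notation can be approximated in Python as follows. This is a rough sketch under stated assumptions: `extract_field`, `resolve_item`, and `apply_definition` are hypothetical helpers, only the hour and dow fields are handled, and the actual API delegates field extraction to SQL extract rather than to Python.

```python
from datetime import datetime


def extract_field(value, field):
    """Approximate SQL extract() for the two fields used in the examples."""
    if field == "hour":
        return value.hour
    if field == "dow":
        # isoweekday() gives Monday=1..Sunday=7; map to Sunday=0..Saturday=6
        return value.isoweekday() % 7
    raise ValueError(f"field not handled in this sketch: {field}")


def resolve_item(record, item):
    """Split 'column%field' and return the plain or extracted value."""
    column, sep, field = item.partition("%")
    value = record[column]
    return extract_field(value, field) if sep else value


def apply_definition(record, d):
    """Return the category symbol, or None if the value is out of range."""
    value = resolve_item(record, d["item"])
    lo, hi = d.get("min"), d.get("max")
    if (lo is not None and value < lo) or (hi is not None and value >= hi):
        return None
    # Substitute %value in the category string with the resolved value.
    return d["category"].replace("%value", str(value))


record = {"start_datetime": datetime(2024, 1, 7, 9, 30)}  # a Sunday morning
am = apply_definition(record, {"item": "start_datetime%hour", "max": 12, "category": "am"})
dow = apply_definition(record, {"item": "start_datetime%dow", "category": "dow%value"})
print(am, dow)  # am dow0
```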
Output data
category definition table
The "category definition table" will be output to the destination ddc specified by output_ddc.
This table will have the following schema.
| column name | data type | description |
|---|---|---|
| item | text | column name |
| min | double precision | minimum value (including the specified value) |
| max | double precision | maximum value (does not include the specified value) |
| category | text | category string |
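To make the schema concrete, the following sketch maps a category_json string to rows shaped like this table. `to_rows` is a hypothetical helper, and representing an omitted min or max as None (NULL) is an assumption; the text does not say how unrestricted bounds are stored.

```python
import json


def to_rows(category_json):
    """Convert a category_json string to (item, min, max, category) tuples."""
    rows = []
    for d in json.loads(category_json):
        lo = d.get("min")
        hi = d.get("max")
        rows.append((
            str(d["item"]),                     # text
            None if lo is None else float(lo),  # double precision, assumed NULL if omitted
            None if hi is None else float(hi),  # double precision, assumed NULL if omitted
            str(d["category"]),                 # text
        ))
    return rows


rows = to_rows(json.dumps([
    {"item": "agg_xrain_max_value", "min": 1, "max": 5, "category": "rf1"},
    {"item": "start_datetime%hour", "max": 12, "category": "am"},
]))
for row in rows:
    print(row)
```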
Return value
import_category returns the ddc information of the output destination ddc.
This is the behavior defined in the specification of the process method of the Provenance API.