fpgrowth
This document has been machine translated.
Extracts correlation patterns using FP-Growth.
The API is invoked over JSON-RPC v2.0.
Example request
fpgrowth is an analysis API. It is executed by specifying api_method="fpgrowth" in the process method of the Provenance API.
The following example starts a Provenance session, executes fpgrowth, and then ends the session.
from xdata_prov.client import Api
api = Api()
api.begin_session()
api.process(api_method="fpgrowth", api_params={
    "output_ddc": "ddc:jartic_xrain_rules",
    "input_ddc": "ddc:jartic_xrain_items",
    "min_support": 0.01,
    "min_confidence": 0.75,
    "param_json": "{ (see parameter entry) }"
})
api.commit()
api.end_session()
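Since the API is invoked over JSON-RPC v2.0, the process call above presumably travels as a request object of roughly the following shape. This is only a sketch of the generic JSON-RPC 2.0 envelope: the method name "process", the use of named params, and the id value are assumptions, and the session handling done by the Api client is not shown.

```python
import json


def build_jsonrpc_request(method, params, request_id=1):
    """Build a generic JSON-RPC 2.0 request object (envelope only)."""
    return {
        "jsonrpc": "2.0",   # version string fixed by the JSON-RPC 2.0 specification
        "method": method,   # assumed to be the Provenance API method name
        "params": params,   # named parameters (an assumption about the wire format)
        "id": request_id,   # any client-chosen identifier
    }


request = build_jsonrpc_request(
    "process",
    {
        "api_method": "fpgrowth",
        "api_params": {
            "output_ddc": "ddc:jartic_xrain_rules",
            "input_ddc": "ddc:jartic_xrain_items",
            "min_support": 0.01,
            "min_confidence": 0.75,
        },
    },
)
body = json.dumps(request)  # serialized request body
```

In practice the Api client shown above is expected to take care of this; the sketch is only meant to make the JSON-RPC framing visible.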
Parameters
When the process method is called with api_method="fpgrowth", api_params takes a dict containing the following keys.
Parameters with a blank default value are required.
key | description | default value |
---|---|---|
output_ddc | output destination for processing results ddc | |
output_mode | output mode (overwrite or error) | error |
input_ddc | input data ddc | |
min_support | minimum value of support | |
min_confidence | minimum value of confidence | |
param_json | Parameters passed to FP-Growth |
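The required/default rules in the table can be checked on the client side before calling process. The following is a minimal sketch, not part of the Provenance API: the helper name prepare_fpgrowth_params is hypothetical, and param_json is left unchecked because its default is not stated in the table.

```python
# Keys whose default value is blank in the table, i.e. required keys.
REQUIRED_KEYS = ("output_ddc", "input_ddc", "min_support", "min_confidence")
ALLOWED_OUTPUT_MODES = {"overwrite", "error"}


def prepare_fpgrowth_params(params):
    """Return a copy of params with defaults applied (hypothetical helper)."""
    missing = [key for key in REQUIRED_KEYS if key not in params]
    if missing:
        raise ValueError(f"missing required keys: {missing}")

    prepared = dict(params)
    # output_mode defaults to "error" according to the table.
    prepared.setdefault("output_mode", "error")
    if prepared["output_mode"] not in ALLOWED_OUTPUT_MODES:
        raise ValueError(f"invalid output_mode: {prepared['output_mode']!r}")
    return prepared


api_params = prepare_fpgrowth_params({
    "output_ddc": "ddc:jartic_xrain_rules",
    "input_ddc": "ddc:jartic_xrain_items",
    "min_support": 0.01,
    "min_confidence": 0.75,
})
```

The resulting dict can then be passed as api_params to the process method.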
filters
Specifies the criteria for narrowing down the correlation patterns to be output. The value is a JSON array. An example is shown below.
"filters": [
{
"pre": {"must": ["mesh", "rf"]},
"post": {"must": ["cl"], "must_not": ["mesh", "dow", "peak"] }
}
]
Each element of the array describes a filtering condition for the antecedent (pre) and a filtering condition for the consequent (post).
key | description |
---|---|
pre.must | list of categories that must be present in the antecedent (prefix match; all must be included) |
pre.must_not | list of categories that must not be present in the antecedent (prefix match; none may be included) |
post.must | list of categories that must be present in the consequent (prefix match; all must be included) |
post.must_not | list of categories that must not be present in the consequent (prefix match; none may be included) |
A rule is output if it satisfies at least one of the conditions listed as the value of filters.
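The filtering semantics described above can be sketched in Python as follows. This reflects one reading of the table, not the server implementation: an item is taken to match a category when the item string starts with the category string, and the filters array is assumed to be a key of the JSON object passed as param_json. The item names (mesh_5339, rf_heavy, and so on) are hypothetical.

```python
import json


def _has_prefix(items, category):
    """True if at least one item starts with the given category prefix."""
    return any(item.startswith(category) for item in items)


def _clause_ok(items, cond):
    """Check the must / must_not conditions against one side of a rule."""
    must = cond.get("must", [])
    must_not = cond.get("must_not", [])
    return (all(_has_prefix(items, c) for c in must)
            and not any(_has_prefix(items, c) for c in must_not))


def rule_passes(pre, post, filters):
    """A rule is kept if at least one filter element accepts both sides.

    With an empty filters list this sketch returns False; the behavior of
    the actual API in that case is not stated in this document.
    """
    return any(
        _clause_ok(pre, f.get("pre", {})) and _clause_ok(post, f.get("post", {}))
        for f in filters
    )


filters = [
    {
        "pre": {"must": ["mesh", "rf"]},
        "post": {"must": ["cl"], "must_not": ["mesh", "dow", "peak"]},
    }
]

# Assumed packaging: filters is serialized inside the param_json string.
param_json = json.dumps({"filters": filters})

kept = rule_passes(["mesh_5339", "rf_heavy"], ["cl_congested"], filters)
dropped = rule_passes(["mesh_5339", "rf_heavy"], ["cl_congested", "dow_mon"], filters)
```

Here kept is true because the antecedent contains items with both the mesh and rf prefixes and the consequent contains only a cl item, while dropped is false because the consequent also contains a dow item.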
Input data
Symbol-transformed transaction table
The input ddc specified by input_ddc is a "symbol-transformed transaction table".
This table must have the following schema.
column name | data type | description |
---|---|---|
id | integer | ID that uniquely identifies the transaction |
start_datetime | timestamp with time zone | start date and time |
end_datetime | timestamp with time zone | end date and time |
location | geometry | spatial range |
meshcode | character varying | meshcode |
items | text[] | a set of events that occurred in this space-time range |
Output data
Correlation rule table
The "correlation rule table" is output to the destination ddc specified by output_ddc.
This table will have the following schema.
column name | data type | description |
---|---|---|
id | integer | ID that uniquely identifies the rule |
pre | text[] | antecedent |
post | text[] | consequent (conclusion) |
support | integer | absolute support (the number of records satisfying the rule) |
confidence | double precision | confidence (degree of confidence) |
Note that the support column in the rule table stores the number of records (absolute support), not the support ratio.
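To make the difference between the record count and the support ratio concrete, the following sketch computes both for a single rule, using the standard association-rule definitions. It is illustrative only: the transactions and item names are made up, and how the server relates min_support (given as a fraction such as 0.01) to the stored count is not stated here.

```python
def rule_metrics(transactions, pre, post):
    """Return (absolute support, relative support, confidence) for pre -> post."""
    pre_set = set(pre)
    rule_set = pre_set | set(post)

    # Records containing the antecedent, and records containing the whole rule.
    n_pre = sum(1 for items in transactions if pre_set <= set(items))
    n_rule = sum(1 for items in transactions if rule_set <= set(items))

    relative_support = n_rule / len(transactions) if transactions else 0.0
    confidence = n_rule / n_pre if n_pre else 0.0
    return n_rule, relative_support, confidence


# Hypothetical items columns of four transaction rows.
transactions = [
    ["rf_heavy", "cl_congested"],
    ["rf_heavy", "cl_congested"],
    ["rf_heavy", "cl_free"],
    ["rf_none", "cl_free"],
]

support, relative_support, confidence = rule_metrics(
    transactions, ["rf_heavy"], ["cl_congested"]
)
```

In this example the rule holds in 2 of 4 records, so the value stored in the support column would be 2 rather than 0.5, and the confidence is 2/3 because 3 records contain the antecedent.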
Return value
fpgrowth returns the ddc information of the output destination ddc.
This is the behavior defined by the specification of the process method of the Provenance API.