
fpgrowth

This document has been machine translated.

Extracts correlation patterns (association rules) using FP-Growth.

The API is invoked via JSON-RPC v2.0.
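As a rough illustration of what a JSON-RPC v2.0 call looks like, the sketch below builds a request object in Python. Only the jsonrpc, method, params and id members come from the JSON-RPC 2.0 specification. The method name "process" and the layout of params are assumptions made for illustration and are not confirmed by this page; in practice the client library shown under "Example request" builds the request for you.

```python
import json


def build_request(api_method, api_params, request_id=1):
    """Build a JSON-RPC 2.0 request object as a plain dict."""
    return {
        "jsonrpc": "2.0",      # fixed version string required by JSON-RPC 2.0
        "method": "process",   # assumption: name of the remote method
        "params": {            # assumption: named parameters mirroring api.process()
            "api_method": api_method,
            "api_params": api_params,
        },
        "id": request_id,      # lets the caller match the response to this request
    }


payload = build_request("fpgrowth", {
    "output_ddc": "ddc:jartic_xrain_rules",
    "input_ddc": "ddc:jartic_xrain_items",
    "min_support": 0.01,
    "min_confidence": 0.75,
})
print(json.dumps(payload, indent=2))
```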

Example request

fpgrowth is an analysis API. It is executed by specifying api_method="fpgrowth" in the process method of the Provenance API. The following example starts a Provenance session, executes fpgrowth, and then ends the session.

from xdata_prov.client import Api
api = Api()
api.begin_session()

api.process(api_method="fpgrowth", api_params={
    "output_ddc": "ddc:jartic_xrain_rules",
    "input_ddc": "ddc:jartic_xrain_items",
    "min_support": 0.01,
    "min_confidence": 0.75,
    "param_json": "{ (see parameter entry) }"
})

api.commit()
api.end_session()

Parameters

When the process method is called with api_method="fpgrowth", api_params takes a dict containing the following keys. Parameters with a blank default value are required.

key             description                                      default value
output_ddc      output destination ddc for processing results
output_mode     output mode (overwrite or error)                 error
input_ddc       input data ddc
min_support     minimum value of support
min_confidence  minimum value of confidence
param_json      parameters passed to FP-Growth
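The rule that keys with a blank default are required can be checked on the client side before calling process. The following is a minimal sketch, not part of the Provenance API: prepare_params, REQUIRED_KEYS and DEFAULTS are hypothetical names, and the required set is simply read off the table above.

```python
# Hypothetical helper: keys with a blank default in the table are treated as required.
REQUIRED_KEYS = ("output_ddc", "input_ddc", "min_support", "min_confidence", "param_json")
DEFAULTS = {"output_mode": "error"}


def prepare_params(api_params):
    """Return a copy of api_params with defaults applied, or raise ValueError."""
    missing = [key for key in REQUIRED_KEYS if key not in api_params]
    if missing:
        raise ValueError("missing required keys: " + ", ".join(missing))
    merged = {**DEFAULTS, **api_params}  # caller-supplied values override defaults
    if merged["output_mode"] not in ("overwrite", "error"):
        raise ValueError("output_mode must be 'overwrite' or 'error'")
    return merged


params = prepare_params({
    "output_ddc": "ddc:jartic_xrain_rules",
    "input_ddc": "ddc:jartic_xrain_items",
    "min_support": 0.01,
    "min_confidence": 0.75,
    "param_json": "{}",  # placeholder value for illustration only
})
print(params["output_mode"])
```

The returned dict can then be passed as api_params to api.process.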

filters

filters specifies criteria for narrowing down the correlation patterns that are output. Its value is a JSON array. An example is shown below.

"filters": [
  {
    "pre": {"must": ["mesh", "rf"]},
    "post": {"must": ["cl"], "must_not": ["mesh", "dow", "peak"] }
  }
]

Each element of the array describes filtering conditions for the antecedent (pre) and the consequent (post) of a rule.

key            description
pre.must       list of categories that must all be present in the antecedent (prefix match)
pre.must_not   list of categories, none of which may be present in the antecedent (prefix match)
post.must      list of categories that must all be present in the consequent (prefix match)
post.must_not  list of categories, none of which may be present in the consequent (prefix match)

A rule is output if it satisfies at least one of the elements listed in filters.
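The filter semantics described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the service's implementation: a category is taken to match an item when the item string starts with the category name, a missing key is treated as an empty list, and the item names such as "mesh_5339" are made up for the example. The case of an empty filters array is not covered by the description and is not handled specially here.

```python
def has_prefix(items, category):
    """True if at least one item starts with the category name (prefix match)."""
    return any(item.startswith(category) for item in items)


def side_matches(items, cond):
    """Check the must / must_not lists against one side of a rule."""
    must = cond.get("must", [])
    must_not = cond.get("must_not", [])
    return (all(has_prefix(items, c) for c in must)
            and not any(has_prefix(items, c) for c in must_not))


def rule_passes(pre, post, filters):
    """A rule passes if it satisfies at least one element of filters."""
    return any(
        side_matches(pre, f.get("pre", {})) and side_matches(post, f.get("post", {}))
        for f in filters
    )


filters = [{
    "pre": {"must": ["mesh", "rf"]},
    "post": {"must": ["cl"], "must_not": ["mesh", "dow", "peak"]},
}]

# Hypothetical item names, chosen only to exercise the prefix match.
kept = rule_passes(["mesh_5339", "rf_heavy"], ["cl_congested"], filters)
dropped_post = rule_passes(["mesh_5339", "rf_heavy"], ["cl_congested", "dow_mon"], filters)
dropped_pre = rule_passes(["rf_heavy"], ["cl_congested"], filters)
print(kept, dropped_post, dropped_pre)
```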

Input data

Symbol-transformed transaction table

The input ddc specified by input_ddc is a "symbol-transformed transaction table". This table must have the following schema.

column name     data type                 description
id              integer                   ID that uniquely identifies the transaction
start_datetime  timestamp with time zone  start date and time
end_datetime    timestamp with time zone  end date and time
location        geometry                  spatial range
meshcode        character varying         mesh code
items           text[]                    set of events that occurred in this space-time range

Output data

Correlation rule table

The "correlation rule table" will be output to the destination ddc specified by output_ddc. This table will have the following schema.

column name  data type         description
id           integer           ID that uniquely identifies the rule
pre          text[]            antecedent
post         text[]            consequent
support      integer           absolute support (the number of records satisfying the rule)
confidence   double precision  confidence
  • The support column in the rule table stores the number of records, not the support ratio.
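To make the difference between the absolute support stored in the table and a support ratio concrete, here is a small worked example in Python. It uses the standard association-rule definitions: the support count of a rule is the number of transactions containing every item of pre and post, and confidence is that count divided by the number of transactions containing pre. The transactions and item names are made up, and the assumption that min_support is compared against the ratio (as the value 0.01 in the example request suggests) is not confirmed by this page.

```python
def count_support(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    needed = set(itemset)
    return sum(1 for items in transactions if needed <= set(items))


# Hypothetical "items" arrays of four transactions.
transactions = [
    ["rf_heavy", "cl_congested"],
    ["rf_heavy", "cl_congested"],
    ["rf_heavy"],
    ["cl_congested"],
]
pre, post = ["rf_heavy"], ["cl_congested"]

support = count_support(transactions, pre + post)        # absolute support (a count)
support_ratio = support / len(transactions)              # relative support
confidence = support / count_support(transactions, pre)  # confidence of pre -> post

print(support, support_ratio, round(confidence, 3))
```

In this toy data set the rule holds in 2 of 4 transactions, so the table would store 2 in the support column, while the corresponding ratio is 0.5.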

Return value

fpgrowth returns the ddc information of the output destination. This behavior is defined by the specification of the process method of the Provenance API.