# User Guide This section provides detailed examples of using Aerial with various configurations and use cases. > **📝 Note on Parameter Names:** The parameters `min_rule_frequency` and `min_rule_strength` correspond to `ant_similarity` and `cons_similarity` in the original [Aerial](https://proceedings.mlr.press/v284/karabulut25a.html) > and [PyAerial](https://doi.org/10.1016/j.softx.2025.102341) papers. > **🖥️ CPU Performance:** PyAerial runs fast on CPU. GPU acceleration is optional and only beneficial for very large > datasets. **Looking for something specific?** - 🎯 **Parameter tuning** - See the [Parameter Tuning Guide](parameter_guide.md) for how to get high/low support, confidence, etc. - ⚙️ **Configuration** - See [Configuration](configuration.md) for GPU usage, logging, and training parameters - 🔧 **Troubleshooting** - See [Debugging](configuration.md#debugging) if Aerial can't learn rules or takes too long ## 1. Association Rule Mining from Categorical Tabular Data ```python from aerial import model, rule_extraction from ucimlrepo import fetch_ucirepo # load a categorical tabular dataset from the UCI ML repository breast_cancer = fetch_ucirepo(id=14).data.features # train an autoencoder on the loaded table trained_autoencoder = model.train(breast_cancer) # extract association rules with quality metrics calculated automatically result = rule_extraction.generate_rules(trained_autoencoder, min_rule_frequency=0.1, min_rule_strength=0.8) # access rules and statistics if len(result['rules']) > 0: print(result['statistics']) print(result['rules'][0]) ``` Following is the partial output of above code: ``` >>> Output: breast_cancer dataset: age menopause tumor-size inv-nodes ... deg-malig breast breast-quad irradiat 0 30-39 premeno 30-34 0-2 ... 3 left left_low no 1 40-49 premeno 20-24 0-2 ... 2 right right_up no 2 40-49 premeno 20-24 0-2 ... 2 left left_low no ... Overall statistics: { "rule_count": 15, "average_support": 0.448, "average_confidence": 0.881, "average_coverage": 0.860, "data_coverage": 0.923, "average_zhangs_metric": 0.318 } Sample rule: { "antecedents": [ {"feature": "inv-nodes", "value": "0-2"} ], "consequent": {"feature": "node-caps", "value": "no"}, "support": 0.702, "confidence": 0.943, "zhangs_metric": 0.69, "rule_coverage": 0.744 } ``` **Working with rules:** Rules are returned in a structured dictionary format with quality metrics included: ```python # Accessing rule components and quality metrics for rule in result['rules']: # Access antecedent features for ant in rule['antecedents']: feature_name = ant['feature'] # e.g., "inv-nodes" feature_value = ant['value'] # e.g., "0-2" # Access consequent cons_feature = rule['consequent']['feature'] # e.g., "node-caps" cons_value = rule['consequent']['value'] # e.g., "no" # Access quality metrics (automatically calculated) support = rule['support'] confidence = rule['confidence'] zhangs_metric = rule['zhangs_metric'] rule_coverage = rule['rule_coverage'] # antecedent support ``` > **🔁 Reproducibility:** PyAerial may produce slightly different rules across runs due to the neural network training > process. To get the same results every time, set a fixed seed before training: > ```python > import torch > torch.manual_seed(42) > ... > trained_autoencoder = model.train(breast_cancer) > ``` > If using GPU, also add: > ```python > torch.cuda.manual_seed(42) > torch.backends.cudnn.deterministic = True > torch.backends.cudnn.benchmark = False > ``` ## 2. Specifying Item Constraints Instead of performing rule extraction on all features, Aerial allows you to extract rules only for features of interest. This is called ARM with item constraints. In ARM with item constraints, the antecedent side of the rules will contain the items of interest. However, the consequent side of the rules may still contain other feature values (to restrict the consequent side as well, see [Using Aerial for Rule-Based Classification](#7-using-aerial-for-rule-based-classification-for-interpretable-inference)). `features_of_interest` parameter of `generate_rules()` can be used to do that (also valid for `generate_frequent_itemsets()`, see below). ```python from aerial import model, rule_extraction from ucimlrepo import fetch_ucirepo # categorical tabular dataset breast_cancer = fetch_ucirepo(id=14).data.features trained_autoencoder = model.train(breast_cancer) # features of interest, either a feature with its all values (e.g., "age") or with its certain values (e.g., premeno value of menopause feature is the only feature value of interest) features_of_interest = ["age", "tumor-size", "inv-nodes", {"menopause": 'premeno'}, "node-caps"] result = rule_extraction.generate_rules(trained_autoencoder, features_of_interest, min_rule_frequency=0.1) ``` The output rules will only contain features of interest on the antecedent side: ``` >>> Output: result['rules']: [ { "antecedents": [ {"feature": "menopause", "value": "premeno"} ], "consequent": {"feature": "node-caps", "value": "no"}, "support": 0.357, "confidence": 0.68, "zhangs_metric": -0.066, "rule_coverage": 0.525 }, { "antecedents": [ {"feature": "menopause", "value": "premeno"} ], "consequent": {"feature": "breast", "value": "right"}, "support": 0.245, "confidence": 0.72, "zhangs_metric": 0.124, "rule_coverage": 0.525 }, ... ] ``` ## 3. Setting Aerial Parameters Aerial has 3 key parameters that control rule extraction: - **`min_rule_frequency`**: Controls support (how frequent patterns are) - analogous to minimum support in traditional ARM - **`min_rule_strength`**: Controls confidence and association strength - analogous to minimum confidence - **`max_antecedents`**: Maximum number of conditions in rule antecedents (complexity) **Quick example:** ```python from aerial import model, rule_extraction from ucimlrepo import fetch_ucirepo breast_cancer = fetch_ucirepo(id=14).data.features trained_autoencoder = model.train(breast_cancer) # Adjust parameters to control rule characteristics result = rule_extraction.generate_rules( trained_autoencoder, min_rule_frequency=0.5, # Synonymous to minimum antecedent support threshold min_rule_strength=0.8, # Synonymous to minimum confidence/association threshold max_antecedents=2 # Max rule length ) ``` **Want to know which parameters to set for your specific needs?** See the **[Parameter Tuning Guide](parameter_guide.md)** for detailed guidance on: - Getting high/low support rules - Getting high/low confidence rules - Controlling the number of rules - Common scenarios with examples ## 4. Fine-tuning Autoencoder Architecture and Dimensions Aerial uses an under-complete Autoencoder and in default, it decides automatically how many layers to use and the dimensions of each layer (see [API Reference](api_reference.md)). Alternatively, you can specify the number of layers and dimensions in the `train` method to improve performance. ```python from aerial import model, rule_extraction, rule_quality ... # layer_dims=[4, 2] specifies that there are gonna be 2 hidden layers with the dimensions 4 and 2, for encoder and decoder trained_autoencoder = model.train(breast_cancer, layer_dims=[4, 2]) ... ``` In general, lower number of parameters (layers and dimensions) leads to fewer but higher-quality rules. Introduce more parameters when not getting enough or no rules, or reduce parameters if the rules are too many and have very low support. Training longer with the `epochs` parameter of the `train()` function also has similar impact on the final rule set as increasing the number of parameters. ## 5. Running Aerial for Numerical Values Discretizing numerical values is required before running Aerial. PyAerial provides several discretization methods as part of the `discretization.py` module. These methods can be categorized into **unsupervised** (no target variable needed) and **supervised** (require target variable for classification tasks). **Automatic Column Filtering**: All discretization methods automatically skip columns that are already discrete or categorical-like. This includes: - Binary columns (e.g., 0/1 for class labels) - Low-cardinality columns (< 5% unique values relative to total rows) - Columns with fewer unique values than the requested number of bins When columns are skipped, an INFO-level log message will indicate which columns were filtered and why. ### 5.1. Unsupervised Discretization Methods These methods work without requiring a target variable: #### Equal-Frequency Discretization (Quantile-Based) Divides data into bins with approximately equal number of samples per bin. Useful for skewed distributions. ```python from aerial import model, rule_extraction, discretization from ucimlrepo import fetch_ucirepo iris = fetch_ucirepo(id=53).data.features iris_discretized = discretization.equal_frequency_discretization(iris, n_bins=3) trained_autoencoder = model.train(iris_discretized, epochs=10) result = rule_extraction.generate_rules(trained_autoencoder, min_rule_frequency=0.1, min_rule_strength=0.8) print(f"Found {result['statistics']['rule_count']} rules") ``` #### Equal-Width Discretization Divides the range of values into equal-width intervals. Simple and intuitive. ```python iris_discretized = discretization.equal_width_discretization(iris, n_bins=5) ``` #### K-Means Discretization Uses k-means clustering to create bins based on natural clusters in the data. Interval boundaries are created at the midpoints between consecutive cluster centers. ```python iris_discretized = discretization.kmeans_discretization(iris, n_bins=4, random_state=42) ``` #### Quantile Discretization Similar to equal-frequency but allows custom percentile specification. ```python # Using custom percentiles (quartiles) iris_discretized = discretization.quantile_discretization( iris, percentiles=[0, 25, 50, 75, 100] ) ``` #### Custom Bins Discretization Allows full control with user-specified bin edges for each feature. ```python bins_dict = { 'sepal length (cm)': [4.0, 5.0, 6.0, 7.0, 8.0], 'sepal width (cm)': [2.0, 2.5, 3.0, 3.5, 5.0], 'petal length (cm)': [1.0, 2.0, 4.0, 5.5, 7.0], 'petal width (cm)': [0.0, 0.5, 1.5, 2.0, 3.0] } iris_discretized = discretization.custom_bins_discretization(iris, bins_dict) ``` ### 5.2. Supervised Discretization Methods These methods use target variable information to create more informative bins for classification: #### Entropy-Based Discretization (MDLP) Uses decision tree splits to minimize entropy with respect to the target variable. ```python from aerial import discretization from ucimlrepo import fetch_ucirepo # Load dataset with target labels iris_data = fetch_ucirepo(id=53) features = iris_data.data.features targets = iris_data.data.targets import pandas as pd df = pd.concat([features, targets], axis=1) # Discretize using target information df_discretized = discretization.entropy_based_discretization(df, target_col='class', n_bins=4) ``` #### ChiMerge Discretization Merges adjacent intervals based on chi-square statistics to find optimal discretization. ```python df_discretized = discretization.chimerge_discretization( df, target_col='class', max_bins=5, significance_level=0.05 ) ``` #### Decision Tree Discretization Uses decision tree regression to find optimal split points based on the target variable. Works with both categorical and numerical targets. ```python df_discretized = discretization.decision_tree_discretization( df, target_col='class', max_depth=3, min_samples_leaf=5 ) ``` Following is the partial iris dataset content before and after the discretization: ``` >>> Output: # before discretization sepal length sepal width petal length petal width 0 5.1 3.5 1.4 0.2 1 4.9 3.0 1.4 0.2 ... # after discretization sepal length sepal width petal length petal width 0 (5.0, 5.27] (3.4, 3.61] (0.999, 1.4] (0.099, 0.2] 1 (4.8, 5.0] (2.8, 3.0] (0.999, 1.4] (0.099, 0.2] ... ``` ## 6. Frequent Itemset Mining with Aerial Aerial can also be used for frequent itemset mining besides association rules. ```python from aerial import model, rule_extraction, rule_quality from ucimlrepo import fetch_ucirepo # categorical tabular dataset breast_cancer = fetch_ucirepo(id=14).data.features trained_autoencoder = model.train(breast_cancer, epochs=5, lr=1e-3) # extract frequent itemsets with support values calculated automatically result = rule_extraction.generate_frequent_itemsets(trained_autoencoder) # access itemsets and statistics print(f"Found {result['statistics']['itemset_count']} itemsets") print(f"Average support: {result['statistics']['average_support']}") ``` The following is a sample output: ``` >>> Output: Found 15 itemsets Average support: 0.295 Itemsets with support values: [ { 'itemset': [{'feature': 'menopause', 'value': 'premeno'}], 'support': 0.524 }, { 'itemset': [{'feature': 'menopause', 'value': 'ge40'}], 'support': 0.451 }, { 'itemset': [{'feature': 'menopause', 'value': 'premeno'}, {'feature': 'age', 'value': '30-39'}], 'support': 0.312 }, ... ] ``` ## 7. Using Aerial for Rule-Based Classification for Interpretable Inference Aerial can be used to learn rules with a class label on the consequent side, which can later be used for inference either by themselves or as part of rule list or rule set classifiers (e.g., from [imodels](https://github.com/csinva/imodels) repository). This is done by setting `target_classes` parameter of the `generate_rules` function. This parameter refers to the class label(s) column of the tabular data. As shown in [Specifying Item Constraints](#2-specifying-item-constraints), we can also specify multiple target classes and/or their specific values. `["Class1", {"Class2": "value2"}]` array specifies that we are interested in all values of `Class1` and specifically `value2` of `Class2` in the consequent side of the rules. ```python import pandas as pd from aerial import model, rule_extraction, rule_quality from ucimlrepo import fetch_ucirepo # categorical tabular dataset breast_cancer = fetch_ucirepo(id=14) labels = breast_cancer.data.targets breast_cancer = breast_cancer.data.features # merge labels column with the actual table table_with_labels = pd.concat([breast_cancer, labels], axis=1) trained_autoencoder = model.train(table_with_labels) # generate rules with a target class(es), this learns rules that has the "target_classes" column (in this case this column is called "Class") on the consequent side result = rule_extraction.generate_rules(trained_autoencoder, target_classes=["Class"], min_rule_strength=0.5) if len(result['rules']) > 0: print(f"Generated {result['statistics']['rule_count']} classification rules") print(f"Average confidence: {result['statistics']['average_confidence']}") ``` Sample output showing rules with class labels on the right hand side: ``` >>> Output: Generated 12 classification rules Average confidence: 0.742 Sample rule: { "antecedents": [ {"feature": "menopause", "value": "premeno"} ], "consequent": {"feature": "Class", "value": "no-recurrence-events"}, "support": 0.357, "confidence": 0.68, "zhangs_metric": -0.066, "rule_coverage": 0.525 } ``` ## 8. Smart Defaults and Filtering Aerial uses smart defaults when you don't know what values to use. ### 8.1. Training Duration By default, Aerial uses `epochs=2` which produces fewer, higher-quality rules: ```python trained_autoencoder = model.train(breast_cancer) # epochs=2 by default ``` **Why shorter training is better:** - Captures only the strongest associations - Produces fewer, higher-quality rules - Avoids overfitting to noise **When to increase epochs:** - Only if you're getting no rules and suspect underfitting - Start with `epochs=3` or `epochs=5` and observe results ### 8.2. Automatic Batch Size Batch size is now automatically determined based on dataset size when not specified: ```python # Batch size is auto-selected based on number of rows trained_autoencoder = model.train(breast_cancer) # batch_size chosen automatically ``` **Auto-selection logic:** - <200 rows: batch_size=2 - <500 rows: batch_size=4 - <1000 rows: batch_size=8 - <5000 rows: batch_size=32 - ≥5000 rows: batch_size=64 ### 8.3. Filtering Rules by Quality Post-filter rules to keep only those meeting quality thresholds: ```python result = rule_extraction.generate_rules( trained_autoencoder, min_confidence=0.7, min_support=0.1 ) ``` **Parameters:** - `min_confidence`: Keep only rules with confidence ≥ this value - `min_support`: Keep only rules with support ≥ this value **Use when:** - You want to filter results without changing the extraction thresholds, and to avoid false positives - You want only high-confidence or high-support rules ### 8.4. Combining Parameters Filtering works with all other parameters: ```python result = rule_extraction.generate_rules( trained_autoencoder, target_classes=["Class"], features_of_interest=["age"], min_confidence=0.6, max_antecedents=2 ) ``` ## 9. Visualizing Association Rules Rules learned by PyAerial can be visualized using [NiaARM](https://github.com/firefly-cpp/NiaARM) library. In the following, `visualizable_rule_list()` function converts PyAerial's rule format to NiaARM `RuleList()` format. And then visualizes the rules on a scatter plot using the visualization module of NiaARM ```python ... from niaarm.visualize import scatter_plot from niaarm import RuleList, Feature, Rule def visualizable_rule_list(aerial_result: dict, dataset: pd.DataFrame): rule_list = RuleList() for rule in aerial_result['rules']: # Convert dictionary format to NiaARM Feature format antecedents = [Feature(ant['feature'], "cat", categories=[ant['value']]) for ant in rule["antecedents"]] consequent = Feature(rule["consequent"]['feature'], "cat", categories=[rule["consequent"]['value']]) rule_list.append(Rule(antecedents, [consequent], transactions=dataset)) return rule_list # learn rules with PyAerial as before breast_cancer = fetch_ucirepo(id=14).data.features trained_autoencoder = model.train(breast_cancer) result = rule_extraction.generate_rules(trained_autoencoder, min_rule_frequency=0.1) # get rules in NiaARM RuleList format visualizable_rules = visualizable_rule_list(result, breast_cancer) figure = scatter_plot(rules=visualizable_rules, metrics=('support', 'confidence', 'lift'), interactive=False) figure.show() ``` Visualization of the PyAerial rules as a scatter plot showing their quality metrics: ![visualization.png](_static/assets/visualization.png) Please see NiaARM for more visualization options: https://github.com/firefly-cpp/NiaARM?tab=readme-ov-file#visualization