# How Aerial Works ## Introduction Aerial is a **scalable neurosymbolic association rule mining (ARM) method** for tabular data. It addresses the **rule explosion** and **execution time** issues in classical ARM by combining: - **Autoencoder-based neural representation** of tabular data - **Rule extraction** from learned neural embeddings Learn more about the architecture, training, and rule extraction in our paper: [Neurosymbolic Association Rule Mining from Tabular Data](https://proceedings.mlr.press/v284/karabulut25a.html) ## Pipeline Overview The figure below shows the pipeline of operations for Aerial in 3 main stages. ![Aerial neurosymbolic association rule mining pipeline](_static/assets/pipeline.png) ### 1. Data Preparation 1. Tabular data is first one-hot encoded. This is done using `data_preparation.py:_one_hot_encoding_with_feature_tracking()`. 2. One-hot encoded values are then converted to vector format in the `model.py:train()`. 3. If the tabular data contains numerical columns, they are pre-discretized as exemplified in [Running Aerial for numerical values](user_guide.md#5-running-aerial-for-numerical-values). ### 2. Training Stage 1. An under-complete Autoencoder with either default automatically-picked number of layers and dimension (based on the dataset size and dimension) is constructed, or user-specified layers and dimension. (see [AutoEncoder](api_reference.md#autoencoder)) 2. All the training parameters can be customized including number of epochs, batch size, learning rate etc. (see [train() function](api_reference.md#train)) 3. An Autoencoder is then trained with a denoising mechanism to learn associations between input features. The full Autoencoder architecture is given in our [paper](https://proceedings.mlr.press/v284/karabulut25a.html). ### 3. Rule Extraction Stage 1. Association rules are then extracted from the trained Autoencoder using Aerial's rule extraction algorithm (see [rule_extraction.py:generate_rules()](api_reference.md#generate_rules)). Below figure shows an example rule extraction process. 2. **Example**. Assume `weather` and `beverage` are features with categories `{cold, warm}` and `{tea, coffee, soda}` respectively. The first step is to initialize a test vector of size 5 corresponding to 5 possible categories with equal probabilities per feature, `[0.5, 0.5, 0.33, 0.33, 0.33]`. Then we mark `weather(warm)` by assigning 1 to `warm` and 0 to `cold`, `[1, 0, 0.33, 0.33, 0.33]`, and call the resulting vector a *test vector*. Assume that after a forward run, `[0.7, 0.3, 0.04, 0.1, 0.86]` is received as the output probabilities. Since the probability of `p_weather(warm) = 0.7` is bigger than the given antecedent similarity threshold (`τ_a = 0.5`), and `p_beverage(soda) = 0.86` probability is higher than the consequent similarity threshold (`τ_c = 0.8`), we conclude with `weather(warm) → beverage(soda)`. ![Aerial rule extraction example](_static/assets/example.png) 3. Frequent itemsets (instead of rules) can also be extracted ([rule_extraction.py:generate_frequent_itemsets()](api_reference.md#generate_frequent_itemsets)). 4. Quality metrics (support, confidence, coverage, Zhang's metric, lift, conviction, Yule's Q, interestingness, leverage) are calculated automatically during rule extraction using optimized batch processing with optional parallelization support. ## Key Features PyAerial provides a comprehensive toolkit for association rule mining with advanced capabilities: - **Scalable Rule Mining** - Efficiently mine association rules from large tabular datasets without rule explosion - **Frequent Itemset Mining** - Generate frequent itemsets using the same neural approach - **ARM with Item Constraints** - Focus rule mining on specific features of interest - **Classification Rules** - Extract rules with target class labels for interpretable inference - **Numerical Data Support** - Built-in discretization methods (equal-frequency, equal-width) - **Customizable Architectures** - Fine-tune autoencoder layers and dimensions for optimal performance - **GPU Acceleration** - Leverage CUDA for faster training on large datasets - **Quality Metrics** - Comprehensive rule evaluation (support, confidence, coverage, Zhang's metric) - **Rule Visualization** - Integrate with NiaARM for scatter plots and visual analysis - **Flexible Training** - Adjust epochs, learning rate, batch size, and noise factors