How Aerial Works
Introduction
Aerial is a scalable neurosymbolic association rule mining (ARM) method for tabular data.
It addresses the rule explosion and execution time issues in classical ARM by combining:
Autoencoder-based neural representation of tabular data
Rule extraction from learned neural embeddings
Learn more about the architecture, training, and rule extraction in our paper: Neurosymbolic Association Rule Mining from Tabular Data
Pipeline Overview
The figure below shows the pipeline of operations for Aerial in 3 main stages.

1. Data Preparation
Tabular data is first one-hot encoded. This is done using data_preparation.py:_one_hot_encoding_with_feature_tracking().
One-hot encoded values are then converted to vector format in model.py:train().
If the tabular data contains numerical columns, they are pre-discretized as exemplified in Running Aerial for numerical values.
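To make the idea of one-hot encoding with feature tracking concrete, here is a minimal pure-Python sketch. It is not the code in data_preparation.py; the function name one_hot_encode, the "feature__category" label format, and the (start, end) range bookkeeping are assumptions made for illustration only.

```python
def one_hot_encode(rows, columns):
    """One-hot encode categorical rows and remember which vector
    positions belong to which feature (hypothetical helper)."""
    labels = []          # one label per vector position
    feature_ranges = {}  # feature -> (start, end) slice into the vector
    for col in columns:
        categories = sorted({row[col] for row in rows})
        start = len(labels)
        labels.extend(f"{col}__{cat}" for cat in categories)
        feature_ranges[col] = (start, len(labels))

    position = {label: i for i, label in enumerate(labels)}
    vectors = []
    for row in rows:
        vec = [0.0] * len(labels)
        for col in columns:
            vec[position[f"{col}__{row[col]}"]] = 1.0
        vectors.append(vec)
    return vectors, feature_ranges, labels


# Toy data, invented for this sketch.
rows = [
    {"weather": "warm", "beverage": "soda"},
    {"weather": "cold", "beverage": "tea"},
]
vectors, feature_ranges, labels = one_hot_encode(rows, ["weather", "beverage"])
```

Keeping the per-feature ranges alongside the vectors is what later allows a position in the Autoencoder output to be mapped back to a feature and category.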
2. Training Stage
An under-complete Autoencoder is constructed, either with an automatically chosen number of layers and dimensions (based on the dataset size and dimensionality) or with user-specified layers and dimensions (see AutoEncoder).
All training parameters can be customized, including the number of epochs, batch size, and learning rate (see the train() function).
An Autoencoder is then trained with a denoising mechanism to learn associations between input features. The full Autoencoder architecture is given in our paper.
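The denoising idea can be sketched without any deep learning framework: the input is corrupted, the network reconstructs it, and the loss compares the reconstruction against the clean input. The snippet below shows only those three ingredients, not a training loop. The uniform noise model, the default noise_factor of 0.5, the per-feature softmax, and the binary cross-entropy loss are assumptions for illustration; the paper describes the actual architecture.

```python
import math
import random


def add_noise(vector, noise_factor=0.5, rng=random):
    """Corrupt a one-hot vector with uniform noise, clipped to [0, 1].
    The noise model is an assumption, not necessarily Aerial's."""
    noisy = []
    for v in vector:
        shifted = v + noise_factor * rng.uniform(-1.0, 1.0)
        noisy.append(min(1.0, max(0.0, shifted)))
    return noisy


def per_feature_softmax(logits, feature_ranges):
    """Normalize raw outputs so that each feature's categories sum to 1."""
    probs = [0.0] * len(logits)
    for start, end in feature_ranges:
        peak = max(logits[start:end])  # subtract the max for numerical stability
        exps = [math.exp(x - peak) for x in logits[start:end]]
        total = sum(exps)
        probs[start:end] = [e / total for e in exps]
    return probs


def binary_cross_entropy(probs, target, eps=1e-12):
    """Mean binary cross-entropy between a reconstruction and the CLEAN input."""
    terms = (
        t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
        for p, t in zip(probs, target)
    )
    return -sum(terms) / len(target)
```

During training, the network would receive add_noise(x) as input while the loss is computed against the clean x, which pushes the model to rely on associations between features rather than copying its input.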
3. Rule Extraction Stage
Association rules are then extracted from the trained Autoencoder using Aerial’s rule extraction algorithm (see rule_extraction.py:generate_rules()). The figure below shows an example rule extraction process.
Example. Assume weather and beverage are features with categories {warm, cold} and {tea, coffee, soda} respectively, listed in the order they occupy in the vector.
The first step is to initialize a vector of size 5, corresponding to the 5 possible categories, with equal probabilities per feature: [0.5, 0.5, 0.33, 0.33, 0.33]. Then we mark weather(warm) by assigning 1 to warm and 0 to cold, [1, 0, 0.33, 0.33, 0.33], and call the resulting vector a test vector.
Assume that after a forward run, [0.7, 0.3, 0.04, 0.1, 0.86] is received as the output probabilities. Since p_weather(warm) = 0.7 is greater than the given antecedent similarity threshold (τ_a = 0.5), and p_beverage(soda) = 0.86 is greater than the consequent similarity threshold (τ_c = 0.8), we conclude with the rule weather(warm) → beverage(soda).
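The example above can be traced in a few lines of Python. This is a sketch of the thresholding logic only, not the code in rule_extraction.py: build_test_vector, stub_forward, and extract_rules are hypothetical names, the vector order (warm, cold, tea, coffee, soda) is read off the example vectors, and the trained Autoencoder is replaced by a stub that returns the output quoted in the example.

```python
# Vector layout, in the order implied by the example vectors.
FEATURES = {"weather": ["warm", "cold"], "beverage": ["tea", "coffee", "soda"]}


def build_test_vector(features, marked):
    """Equal probabilities per feature, except marked features, where the
    marked category gets 1 and the other categories get 0."""
    vector = []
    for feature, categories in features.items():
        for category in categories:
            if feature in marked:
                vector.append(1.0 if category == marked[feature] else 0.0)
            else:
                vector.append(1.0 / len(categories))
    return vector


def stub_forward(vector):
    """Stand-in for a forward run of the trained Autoencoder."""
    return [0.7, 0.3, 0.04, 0.1, 0.86]


def extract_rules(features, marked, forward, tau_a=0.5, tau_c=0.8):
    """Return (antecedent, consequent) pairs that pass both thresholds."""
    output = forward(build_test_vector(features, marked))
    index = [(f, c) for f, cats in features.items() for c in cats]
    probs = dict(zip(index, output))

    # Antecedent check: every marked category must exceed tau_a.
    if any(probs[(f, c)] <= tau_a for f, c in marked.items()):
        return []

    # Consequent check: unmarked categories above tau_c become consequents.
    return [
        (dict(marked), (f, c))
        for (f, c), p in probs.items()
        if f not in marked and p > tau_c
    ]


rules = extract_rules(FEATURES, {"weather": "warm"}, stub_forward)
```

With the stubbed output, only beverage(soda) passes the consequent threshold, so rules holds the single rule weather(warm) → beverage(soda).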
Frequent itemsets (instead of rules) can also be extracted (rule_extraction.py:generate_frequent_itemsets()).
Quality metrics (support, confidence, coverage, Zhang’s metric, lift, conviction, Yule’s Q, interestingness, leverage) are calculated automatically during rule extraction using optimized batch processing with optional parallelization support.
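For readers unfamiliar with these metrics, the sketch below computes several of them for a single rule using their standard textbook definitions. It is a plain, unoptimized illustration and not PyAerial's batch implementation; the helper names and the toy rows are assumptions, and coverage is taken here to mean the support of the antecedent.

```python
def support(rows, items):
    """Fraction of rows containing every (feature, value) pair in items."""
    hits = sum(all(row.get(f) == v for f, v in items.items()) for row in rows)
    return hits / len(rows)


def rule_metrics(rows, antecedent, consequent):
    """Standard rule quality metrics for antecedent -> consequent."""
    supp_a = support(rows, antecedent)
    supp_c = support(rows, consequent)
    supp_ac = support(rows, {**antecedent, **consequent})

    confidence = supp_ac / supp_a if supp_a else 0.0
    lift = confidence / supp_c if supp_c else 0.0
    leverage = supp_ac - supp_a * supp_c
    conviction = (
        float("inf") if confidence == 1.0 else (1 - supp_c) / (1 - confidence)
    )
    # Zhang's metric: leverage normalized into the range [-1, 1].
    zhang_den = max(supp_ac * (1 - supp_a), supp_a * (supp_c - supp_ac))
    zhang = leverage / zhang_den if zhang_den else 0.0

    return {
        "support": supp_ac,
        "confidence": confidence,
        "coverage": supp_a,  # assumed definition: antecedent support
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
        "zhang": zhang,
    }


# Toy data, invented for this sketch.
rows = [
    {"weather": "warm", "beverage": "soda"},
    {"weather": "warm", "beverage": "soda"},
    {"weather": "warm", "beverage": "tea"},
    {"weather": "cold", "beverage": "tea"},
    {"weather": "cold", "beverage": "coffee"},
]
metrics = rule_metrics(rows, {"weather": "warm"}, {"beverage": "soda"})
```

On this toy data the rule holds in 2 of 5 rows and in 2 of the 3 rows where the weather is warm, giving a support of 0.4 and a confidence of about 0.67.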
Key Features
PyAerial provides a comprehensive toolkit for association rule mining with advanced capabilities:
Scalable Rule Mining - Efficiently mine association rules from large tabular datasets without rule explosion
Frequent Itemset Mining - Generate frequent itemsets using the same neural approach
ARM with Item Constraints - Focus rule mining on specific features of interest
Classification Rules - Extract rules with target class labels for interpretable inference
Numerical Data Support - Built-in discretization methods (equal-frequency, equal-width)
Customizable Architectures - Fine-tune autoencoder layers and dimensions for optimal performance
GPU Acceleration - Leverage CUDA for faster training on large datasets
Quality Metrics - Comprehensive rule evaluation (support, confidence, coverage, Zhang’s metric)
Rule Visualization - Integrate with NiaARM for scatter plots and visual analysis
Flexible Training - Adjust epochs, learning rate, batch size, and noise factors
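As an illustration of the two discretization strategies named under Numerical Data Support, the following sketch bins a numerical column into integer bin indices. It does not use PyAerial's built-in discretization functions; equal_width_bins and equal_frequency_bins are hypothetical helpers written for this example.

```python
from bisect import bisect_right


def equal_width_bins(values, n_bins):
    """Split the value range into n_bins intervals of equal width."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    if width == 0:
        return [0] * len(values)  # constant column: a single bin
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - low) / width), n_bins - 1) for v in values]


def equal_frequency_bins(values, n_bins):
    """Choose cut points so that each bin holds roughly the same number
    of values; equal values always land in the same bin."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = [ordered[(i * n) // n_bins] for i in range(1, n_bins)]
    return [bisect_right(cuts, v) for v in values]
```

Equal-width bins are easy to interpret but can be dominated by outliers, whereas equal-frequency bins adapt to skewed distributions at the cost of uneven interval widths.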