
Case Study

Predicting Next-Year Audit Findings Using FAC Data

We built a predictive model that serves as an early warning signal for which organizations are more likely to have audit findings next year (t+1) based on what we already know about them this year (t). The output flags entities at elevated risk of future findings, helping oversight teams prioritize attention and support, especially when resources are limited.

Tags: Predictive Modeling · Python / Scikit-Learn · FAC Data · Risk Assessment

Analytical Tools & Software Used

Programming environment
Python (Jupyter Notebook)
Core libraries
pandas / NumPy for data preparation and feature engineering; scikit-learn for modeling, evaluation, and permutation feature importance; SHAP for model interpretability and stakeholder-friendly explanations
Data sources
Federal Audit Clearinghouse (FAC): general-ay, findings-ay, federal_awards-ay (Audit Year files, 2019-2022)

Methodology: Data Preparation & Preprocessing

1) Dataset grain: entity-year (t)

FAC files are naturally organized around submission/report records, so the first step was building a clean panel at the right grain: one row per entity per audit year.
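As a minimal sketch of this step, the submission-level records can be deduplicated and collapsed to one row per entity per audit year. The column names here (`ein`, `audit_year`, `total_amount_expended`) are illustrative assumptions, not the exact FAC schema:

```python
import pandas as pd

# Toy stand-in for submission-level FAC records (column names assumed)
general = pd.DataFrame({
    "ein": ["111", "111", "222", "222"],
    "audit_year": [2019, 2019, 2019, 2020],
    "total_amount_expended": [1_000_000, 1_000_000, 500_000, 750_000],
})

# Drop exact duplicate submissions, then keep one record per entity-year
panel = (
    general
    .drop_duplicates()
    .groupby(["ein", "audit_year"], as_index=False)
    .first()
)
```

The result is a clean panel: every (entity, year) pair appears exactly once, which is the grain the rest of the pipeline assumes.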

2) Entity identifier strategy (EIN vs UEI)

Older years (2019-2021) contain many placeholder UEIs (e.g., GSA_MIGRATION), making UEI unreliable for identifying unique entities in those years. To keep continuity across 2019-2022:
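The exact continuity rule isn't spelled out here, but a common pattern for this situation is to coalesce identifiers: use the UEI when it is real, and fall back to the EIN when the UEI is missing or a known placeholder. A hedged sketch (the column names and the `EIN:`/`UEI:` prefixing are assumptions):

```python
import pandas as pd

# Fall back to EIN when the UEI is missing or a known placeholder.
# The placeholder value GSA_MIGRATION appears in the 2019-2021 files.
df = pd.DataFrame({
    "uei": ["ABC123", "GSA_MIGRATION", None],
    "ein": ["111111111", "222222222", "333333333"],
})

placeholder = df["uei"].isna() | (df["uei"] == "GSA_MIGRATION")
df["entity_id"] = ("UEI:" + df["uei"].fillna("")).where(
    ~placeholder, "EIN:" + df["ein"]
)
```

Prefixing the identifier type prevents an EIN and a UEI that happen to share digits from colliding in the combined key.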

3) Outcome creation: findings at t+1

We created:

Only years with observable next-year outcomes were kept for supervised training:
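The t+1 labeling can be sketched as a self-merge: shift each year's findings count back one year so that row t carries year t+1's outcome, then drop rows whose next year is unobservable. Column names (`entity_id`, `audit_year`, `n_findings`) are assumptions:

```python
import pandas as pd

# Toy entity-year panel with a findings count per year (columns assumed)
panel = pd.DataFrame({
    "entity_id": ["A", "A", "A", "B", "B"],
    "audit_year": [2019, 2020, 2021, 2019, 2020],
    "n_findings": [0, 2, 1, 0, 0],
})

nxt = panel[["entity_id", "audit_year", "n_findings"]].copy()
nxt["audit_year"] -= 1  # shift back so year t sees year t+1's findings
nxt = nxt.rename(columns={"n_findings": "findings_next_year"})

labeled = panel.merge(nxt, on=["entity_id", "audit_year"], how="left")
labeled["target"] = (labeled["findings_next_year"] > 0).astype(int)

# Keep only rows whose t+1 outcome is observable
train = labeled.dropna(subset=["findings_next_year"])
```

In this toy panel the final year for each entity (A's 2021, B's 2020) has no observable t+1 and is excluded from supervised training.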

4) Feature engineering (award-based predictors)

From federal_awards-ay, we aggregated program activity into entity-year features such as:
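A sketch of the aggregation: group program-level award rows by entity-year and roll them up into summary features. The column names and the specific features shown (totals, line counts, distinct programs) are illustrative assumptions:

```python
import pandas as pd

# Toy stand-in for federal_awards-ay rows (column names assumed)
awards = pd.DataFrame({
    "entity_id": ["A", "A", "A", "B"],
    "audit_year": [2020, 2020, 2020, 2020],
    "federal_program_name": ["Medicaid", "Medicaid", "Highway", "Medicaid"],
    "amount_expended": [2_000_000, 500_000, 1_000_000, 300_000],
})

# Roll program-level rows up to entity-year features
features = awards.groupby(["entity_id", "audit_year"]).agg(
    total_expended=("amount_expended", "sum"),
    n_award_lines=("amount_expended", "size"),
    n_programs=("federal_program_name", "nunique"),
).reset_index()
```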

Model Choice and Performance

Chosen model: HistGradientBoosting (HGB)

We selected scikit-learn's HistGradientBoostingClassifier because it excels at detecting complex, non-linear patterns in large tabular datasets (for example, how award complexity and prior findings combine to elevate risk).

Performance metrics

We evaluated using two standard ranking metrics for binary classification:

ROC-AUC: 0.7656
Interpretation: If you randomly choose one entity-year that will have findings next year and one that won't, the model ranks the "will have findings" case higher about 77% of the time.

PR-AUC: 0.5439
PR-AUC is especially useful when the outcome is relatively uncommon (audit findings are not present for everyone). A PR-AUC of 0.5439 indicates the model does a solid job concentrating true positives toward the top of the ranking, which is useful for prioritization.
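Both metrics come directly from scikit-learn (PR-AUC via average precision). A toy computation with made-up labels and scores:

```python
from sklearn.metrics import roc_auc_score, average_precision_score

# Made-up labels and model scores, for illustration only
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

roc = roc_auc_score(y_true, scores)          # probability a positive outranks a negative
pr = average_precision_score(y_true, scores)  # PR-AUC / average precision
```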

For comparison, here's a logistic regression model run on the same data as a baseline:

The HGB model provided a meaningful lift, especially on PR-AUC.
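As a sketch of what such a baseline looks like (the feature matrix here is synthetic, and scaling plus a higher iteration cap are conventional choices, not necessarily the exact setup used):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Same synthetic stand-in data used for the HGB sketch above
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

# Scale features, then fit a plain logistic regression as the baseline
baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
baseline.fit(X, y)
baseline_scores = baseline.predict_proba(X)[:, 1]
```

Because logistic regression is linear in the features, the gap between it and HGB is a rough measure of how much of the risk signal lives in interactions and non-linearities.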

Interpreting What the Model Learned

We also want the model to be explainable, not a black box. We use permutation feature importance to accomplish this.

Permutation importance measures how much performance drops when a feature is shuffled. The top signals included:

How This Output Can Be Used

This model supports more targeted oversight, while keeping final decisions with program staff, auditors, and policymakers.

Example applications:

Recommendations for Improvement

This short case study is just a taste of how powerful analytical tools can transform public data into actionable insight. Some ways this model could be made even more effective include: