OSW

SIGNATURE WORK
CONFERENCE & EXHIBITION 2023

Environmental Impacts and Unaccounted Greenhouse Gas Emissions from Imported Wood Pellet Use

Name

Wynona Eurj Curaming

Major

Environmental Science / Public Policy

Class

2023

About

Wynona studies environmental science (public policy) and plans to pursue a career in climate and sustainability consulting, using tools such as life cycle assessment (LCA).

Signature Work Project Overview

Active learning has become a popular way to handle large unlabeled datasets when annotation resources are limited. Although it is most commonly used in image classification, active learning is also critical in the finance domain, where available data are scarce, particularly transaction data, which is inherently sensitive and private.
This study assesses the efficacy of two prevalent families of active learning query strategies: those that select the most informative data points (e.g., least confidence) and those that select the most representative data points (e.g., K-Means clustering). Our findings indicate that uncertainty sampling strategies outperform representative query frameworks on financial datasets. Specifically, with uncertainty sampling, labeling just 2.5% of the data is enough to match baseline model performance, while representative query frameworks yield subpar results. Moreover, combining uncertainty sampling with representative frameworks performs worse than uncertainty sampling alone.
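The two query families compared above can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the toy inputs, the naive centroid initialization, and the choice to return the point nearest each centroid are all assumptions made for illustration.

```python
import math


def least_confidence_query(probabilities, n_queries):
    """Return indices of the samples whose top class probability is lowest.

    probabilities[i] holds the predicted class probabilities for sample i.
    (Hypothetical helper; the name and signature are illustrative.)
    """
    # Least-confidence score: 1 minus the probability of the predicted class.
    scores = [1.0 - max(p) for p in probabilities]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_queries]


def kmeans_representative_query(points, k, n_iter=10):
    """Return, for each of k clusters, the index of the point nearest its centroid.

    Uses a plain Lloyd's-algorithm loop with a naive initialization
    (the first k points), which is an assumption made to keep the sketch short.
    """
    centroids = [list(points[i]) for i in range(k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return [
        min(range(len(points)), key=lambda i: math.dist(points[i], centroids[c]))
        for c in range(k)
    ]


# Toy data (made up for illustration, not taken from the study).
toy_probabilities = [[0.9, 0.1], [0.55, 0.45], [0.6, 0.4], [0.99, 0.01]]
toy_points = [(0.0, 0.0), (10.0, 10.0), (0.2, 0.1), (9.8, 10.1), (0.1, 0.3), (10.2, 9.9)]

uncertain_indices = least_confidence_query(toy_probabilities, 2)
representative_indices = kmeans_representative_query(toy_points, 2)
```

On the toy data, the least confidence query picks the two samples whose predictions are closest to a coin flip, while the K-Means query picks one central point per cluster regardless of how confident the model is about it.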
These observations point to the presence of noise in financial data. Additionally, we propose a novel query method for ensemble machine learning models that prioritizes the data points with the highest variance in model predictions. Our analysis shows that this method attains performance on par with least confidence query strategies while being more robust.
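The variance-based query idea could look roughly like the sketch below. The overview does not say how the variance is computed, so this assumes each ensemble member outputs one score per sample (for example, a positive-class probability) and uses the population variance across members; the function name and toy numbers are hypothetical.

```python
from statistics import pvariance


def ensemble_variance_query(member_predictions, n_queries):
    """Return indices of the samples on which ensemble members disagree most.

    member_predictions[m][i] is the score that ensemble member m assigns
    to sample i. (Hypothetical helper; the data layout is an assumption.)
    """
    n_samples = len(member_predictions[0])
    # Variance of the members' predictions, computed per sample.
    variances = [
        pvariance([preds[i] for preds in member_predictions])
        for i in range(n_samples)
    ]
    ranked = sorted(range(n_samples), key=lambda i: variances[i], reverse=True)
    return ranked[:n_queries]


# Toy predictions from three ensemble members on four samples (made up).
toy_member_predictions = [
    [0.9, 0.2, 0.5, 0.1],
    [0.9, 0.8, 0.6, 0.1],
    [0.8, 0.5, 0.4, 0.2],
]

queried_indices = ensemble_variance_query(toy_member_predictions, 2)
```

Here the second sample is queried first because the members' scores for it range from 0.2 to 0.8, while samples on which the members largely agree are left unlabeled.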

Signature Work Presentation Video