Beauty Beyond Words II
Project Overview
This research addresses the beauty industry's need for transparent, explainable methods that map product ingredients to skin-related attributes. We developed a BERT-based machine learning model that predicts product attributes from ingredients and descriptions, achieving a balanced F1 score of 0.61 and a precision of 0.75.
Our approach combined explicit model architectures with advanced explainability techniques, including Integrated Gradients, SHAP, and LIME, to uncover the relationships between ingredients and product attributes. By utilizing publicly available Amazon product metadata and reviews alongside a curated skincare glossary, we developed a scalable pipeline to extract and analyze ingredient-attribute relationships in beauty products.
The model successfully identified key attributes such as acne, hydration, and sensitive skin with high precision, while explainability methods revealed significant ingredients like salicylic acid for acne treatment and petroleum jelly for hydration. This work contributes to developing more transparent and effective tools for beauty product analysis and recommendation systems.
Model Performance: Confusion Matrices
Figure: confusion matrices showing classification performance for key skin attributes (panels: Acne, Oily Skin).
Performance Metrics
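Precision, recall, and F1 all follow directly from the counts in a per-attribute confusion matrix. The snippet below is a minimal sketch of that arithmetic, not the project's evaluation code. The class name, function names, and counts are hypothetical, and the counts are made up for illustration rather than taken from the study. The source does not define "balanced F1"; the unweighted macro average shown here is one common reading and is an assumption.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class BinaryConfusion:
    """Counts from a 2x2 confusion matrix for a single attribute."""
    tp: int  # attribute present, predicted present
    fp: int  # attribute absent, predicted present
    fn: int  # attribute present, predicted absent
    tn: int  # attribute absent, predicted absent

    def precision(self) -> float:
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    def recall(self) -> float:
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0

    def f1(self) -> float:
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r) if (p + r) else 0.0


def macro_f1(matrices: Dict[str, BinaryConfusion]) -> float:
    """Unweighted mean of per-attribute F1 (assumed reading of 'balanced')."""
    return sum(m.f1() for m in matrices.values()) / len(matrices)


# Made-up counts for illustration only; NOT the study's results.
example = {
    "acne": BinaryConfusion(tp=30, fp=10, fn=20, tn=140),
    "oily_skin": BinaryConfusion(tp=15, fp=5, fn=15, tn=165),
}

for name, cm in example.items():
    print(f"{name}: precision={cm.precision():.2f} "
          f"recall={cm.recall():.2f} f1={cm.f1():.2f}")
print(f"macro F1={macro_f1(example):.2f}")
```

A model can score well on precision while its F1 is noticeably lower whenever recall is weak, which is consistent with the gap between the reported 0.75 and 0.61.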
Explainability Analysis
Figure: ingredient importance compared across explainability methods (panels: LIME top ingredients for Acne; LIME top ingredients for Hydration).
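Perturbation-based explanations such as LIME rank ingredients by how much the model's prediction changes when inputs are altered. The sketch below illustrates the idea with leave-one-out occlusion, a simpler stand-in and not LIME itself, which instead fits a local surrogate model over many random perturbations. The scoring function, its weights, and the example product are hypothetical placeholders for the trained classifier and real data.

```python
from typing import Callable, Dict, List


def toy_acne_score(ingredients: List[str]) -> float:
    """Hypothetical stand-in for the trained model's acne probability."""
    weights = {"salicylic acid": 0.5, "niacinamide": 0.2, "water": 0.0}
    base = 0.1
    return min(1.0, base + sum(weights.get(i, 0.0) for i in ingredients))


def occlusion_importance(
    ingredients: List[str],
    score_fn: Callable[[List[str]], float],
) -> Dict[str, float]:
    """Drop in score when each ingredient is removed from the list."""
    full = score_fn(ingredients)
    importance = {}
    for idx, name in enumerate(ingredients):
        reduced = ingredients[:idx] + ingredients[idx + 1:]
        importance[name] = full - score_fn(reduced)
    return importance


# Illustrative ingredient list, not a real product record.
product = ["water", "salicylic acid", "niacinamide", "glycerin"]

ranked = sorted(
    occlusion_importance(product, toy_acne_score).items(),
    key=lambda kv: kv[1],
    reverse=True,
)
for name, delta in ranked:
    print(f"{name}: {delta:+.2f}")
```

With a real classifier, `score_fn` would wrap the model's predicted probability for one attribute. Comparing such rankings across methods is one way to check whether an ingredient like salicylic acid is consistently highlighted.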
Pipeline Architecture
Figure: flowchart of the multi-stage pipeline for extracting ingredients and attributes.
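As a rough illustration of the two extraction stages, the sketch below splits a raw ingredient string into normalized names and tags a product description with attributes by matching glossary phrases. It is a simplified sketch under stated assumptions, not the project's pipeline: the glossary entries, function names, and sample record are all hypothetical, and the real curated glossary is not reproduced in the source.

```python
import re
from typing import Dict, List, Set

# Hypothetical glossary: attribute -> phrases that signal it.
GLOSSARY: Dict[str, List[str]] = {
    "acne": ["acne", "breakout", "blemish"],
    "hydration": ["hydrating", "hydration", "moisturizing"],
    "sensitive skin": ["sensitive skin", "fragrance-free"],
}


def extract_ingredients(raw: str) -> List[str]:
    """Split a comma-separated ingredient string into lowercase names."""
    parts = (p.strip().lower() for p in raw.split(","))
    return [p for p in parts if p]


def extract_attributes(text: str) -> Set[str]:
    """Return glossary attributes with a phrase matching as whole words."""
    lowered = text.lower()
    found = set()
    for attribute, phrases in GLOSSARY.items():
        for phrase in phrases:
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
                found.add(attribute)
                break
    return found


# Illustrative record, not taken from the Amazon dataset.
record = {
    "ingredients": "Water, Salicylic Acid , Glycerin,",
    "description": "A gel cleanser for acne-prone skin that stays hydrating.",
}

ingredients = extract_ingredients(record["ingredients"])
attributes = extract_attributes(record["description"])
print(ingredients)
print(sorted(attributes))
```

The resulting (ingredients, attributes) pairs are the kind of training example a multi-label classifier could consume. Exact whole-word matching misses plurals and paraphrases, so a production glossary matcher would likely need stemming or fuzzier rules.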

Key Takeaways
- Developed a scalable, interpretable machine learning pipeline for analyzing beauty product ingredients and their attributes
- Demonstrated that a transformer-based (BERT) model can predict product attributes from ingredients and descriptions with a precision of 0.75 and a balanced F1 score of 0.61
- Applied multiple explainability techniques to uncover meaningful connections between ingredients and skin benefits
- Identified opportunities for improving recommendation systems through transparent ingredient-attribute mapping