Houseplant Health Classification

Overview

An end-to-end computer vision pipeline that classifies houseplant health (healthy vs. unhealthy) using Amazon Rekognition Custom Labels. The architecture is production-style and includes automated serverless inference via AWS Lambda. Every AWS resource was provisioned and managed through Python, so the whole pipeline can be reproduced from a single notebook.

Architecture

Local Images → S3 Upload → Rekognition Custom Labels Training
                                        ↓
New Image → S3 (uploads/) → Lambda → Rekognition Inference → S3 (results/)
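
The inference path in the second row of the diagram could look roughly like the handler below. This is a minimal sketch rather than the project's actual function: the `MODEL_ARN` and `MIN_CONFIDENCE` environment variables, the `result_key` helper, and the shape of the JSON written to `results/` are all assumptions made for illustration. Only the `uploads/` and `results/` prefixes come from the diagram.

```python
import json
import os
import posixpath
from urllib.parse import unquote_plus

# Hypothetical configuration: the trained model version ARN and the
# confidence threshold are assumed to arrive through environment variables.
MODEL_ARN = os.environ.get("MODEL_ARN", "")
MIN_CONFIDENCE = float(os.environ.get("MIN_CONFIDENCE", "50"))


def result_key(upload_key):
    """Map uploads/<name>.<ext> to results/<name>.json (assumed layout)."""
    name = upload_key.split("/", 1)[-1]
    stem = posixpath.splitext(name)[0]
    return f"results/{stem}.json"


def handler(event, context, s3=None, rekognition=None):
    # Clients can be injected so the handler can be exercised with stubs;
    # inside Lambda they default to real boto3 clients.
    if s3 is None or rekognition is None:
        import boto3
        s3 = s3 or boto3.client("s3")
        rekognition = rekognition or boto3.client("rekognition")

    written = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])

        response = rekognition.detect_custom_labels(
            ProjectVersionArn=MODEL_ARN,
            Image={"S3Object": {"Bucket": bucket, "Name": key}},
            MinConfidence=MIN_CONFIDENCE,
        )
        labels = [
            {"name": item["Name"], "confidence": item["Confidence"]}
            for item in response["CustomLabels"]
        ]

        out_key = result_key(key)
        s3.put_object(
            Bucket=bucket,
            Key=out_key,
            Body=json.dumps({"image": key, "labels": labels}).encode("utf-8"),
            ContentType="application/json",
        )
        written.append(out_key)
    return {"written": written}
```

Two practical points apply to any handler of this shape. The S3 trigger should be filtered to the `uploads/` prefix so that writing to `results/` in the same bucket does not invoke the function again, and the Rekognition model version must be running before `detect_custom_labels` will return predictions.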

Results

Performance was evaluated on 19 holdout test images:

Metric             Healthy   Unhealthy
Precision          0.93      1.00
Recall             1.00      0.80
F1 Score           0.97      0.89
Overall Accuracy   0.95

Confusion Matrix
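
The reported metrics are consistent with a single misclassification: one unhealthy image predicted as healthy. The sketch below rebuilds that matrix and recomputes the table. The per-class test counts (14 healthy, 5 unhealthy) are an assumption inferred from the 67/24 class totals and the stratified 80/20 split, not figures stated in the results.

```python
from collections import Counter

# Assumed test-set composition: 14 healthy + 5 unhealthy = 19 images.
y_true = ["healthy"] * 14 + ["unhealthy"] * 5
# Assumed predictions: every healthy image correct, one unhealthy image
# misclassified as healthy.
y_pred = ["healthy"] * 14 + ["unhealthy"] * 4 + ["healthy"]

labels = ["healthy", "unhealthy"]
counts = Counter(zip(y_true, y_pred))  # keys are (actual, predicted)


def per_class_metrics(label):
    """Return (precision, recall, f1) for one class."""
    tp = counts[(label, label)]
    fp = sum(counts[(other, label)] for other in labels if other != label)
    fn = sum(counts[(label, other)] for other in labels if other != label)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


accuracy = sum(counts[(label, label)] for label in labels) / len(y_true)

print("rows = actual, columns = predicted", labels)
for actual in labels:
    print(actual, [counts[(actual, predicted)] for predicted in labels])
for label in labels:
    p, r, f = per_class_metrics(label)
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"accuracy={accuracy:.2f}")
```

Under these assumptions the matrix is [[14, 0], [1, 4]] with rows as actual classes, and the recomputed values round to the figures in the table above.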

Dataset

91 original images (67 healthy, 24 unhealthy) collected from personal houseplants
Training split augmented to 201 images to address class imbalance
80/20 train/test split with stratification
Available on Kaggle: Houseplant Health Classification Dataset
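
A stratified split keeps the healthy/unhealthy ratio roughly equal in both partitions. The sketch below is a dependency-free illustration of the idea, not the project's code: the file names are hypothetical, and rounding each class's test count up is one convention that happens to reproduce a 19-image test set from 67 and 24 images. scikit-learn's `train_test_split` with its `stratify` argument provides the same kind of split.

```python
import math
import random


def stratified_split(items_by_class, test_fraction=0.2, seed=42):
    """Split each class separately so both partitions keep the class ratio."""
    rng = random.Random(seed)
    train, test = [], []
    for label, items in items_by_class.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        # Assumed rounding convention: round the per-class test count up.
        n_test = math.ceil(len(shuffled) * test_fraction)
        test += [(path, label) for path in shuffled[:n_test]]
        train += [(path, label) for path in shuffled[n_test:]]
    return train, test


# Hypothetical file names standing in for the 91 original images.
images = {
    "healthy": [f"healthy/img_{i:03d}.jpg" for i in range(67)],
    "unhealthy": [f"unhealthy/img_{i:03d}.jpg" for i in range(24)],
}

train, test = stratified_split(images)
print(len(train), len(test))  # 72 19
```

Splitting before augmentation matters: augmented copies of a test image never reach the training set, so the holdout metrics are not inflated by near-duplicates.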

Pipeline

Images organized into healthy/unhealthy folders
Train/test split (80/20) with stratification
Augmentation applied to training set only via Albumentations
Images uploaded to S3 with manifest files for Rekognition
Rekognition Custom Labels model trained on 201 images
Model evaluated against 19 holdout test images
Lambda function deployed for automated inference on new S3 uploads
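
The manifest step can be sketched as follows. Rekognition Custom Labels reads JSON Lines manifests in the SageMaker Ground Truth style, with one JSON object per image. The bucket name, key layout, and the `health` attribute name here are hypothetical, using the class index as the attribute value is an assumption, and the metadata field names should be checked against the current AWS documentation before use.

```python
import json


def manifest_line(bucket, key, label, class_names,
                  attribute="health",
                  creation_date="2024-01-01T00:00:00.000"):
    """Build one image-level classification entry as a JSON string."""
    return json.dumps({
        "source-ref": f"s3://{bucket}/{key}",
        # Assumption: the class index is used as the attribute value.
        attribute: class_names.index(label),
        f"{attribute}-metadata": {
            "class-name": label,
            "confidence": 1,
            "type": "groundtruth/image-classification",
            "job-name": f"labeling-job/{attribute}",
            "human-annotated": "yes",
            "creation-date": creation_date,
        },
    })


def build_manifest(bucket, labeled_keys, class_names):
    """Join one entry per (key, label) pair into JSON Lines text."""
    return "\n".join(
        manifest_line(bucket, key, label, class_names)
        for key, label in labeled_keys
    )


# Hypothetical bucket and keys, for illustration only.
classes = ["healthy", "unhealthy"]
manifest = build_manifest(
    "example-houseplant-bucket",
    [("train/healthy/img_001.jpg", "healthy"),
     ("train/unhealthy/img_001.jpg", "unhealthy")],
    classes,
)
print(manifest)
```

The resulting text would be uploaded to S3 alongside the images, with separate manifests for the training and test datasets.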

Tech Stack

Python · Amazon Rekognition Custom Labels · Amazon S3 · AWS Lambda · IAM · Boto3 · Albumentations · Scikit-learn

Acknowledgements

Training images were collected in person at two plant nurseries whose staff were kind enough to allow photography:

Holiday Foliage Orchids and Plants — 146 West 28th Street, New York, NY
Redwood Flower Shop — New Brunswick, NJ
