9. Analysis of a Kaggle H&M Personalized Fashion Recommendations Solution
User Meta data
ID, category 1, category 2, club member(binary), fashion news frequency, age, postal code
Product Meta data
ID, product_code, product_name, product_group_name, graph_code, graph_name, color_group_code, color_name
Baseline model
1. Generate Candidates for each user
- item-to-item similarities between customers' previous baskets
- user-based collaborative filtering
- model-based predictions for customers without transaction history
- only last baskets.
2. Create a Huge Table
- user, item, rating, features ...
3. Build item rank model for each user
- XGB, LightGBM, CatBoost, DNN
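The item-to-item step in the baseline can be sketched as below. This is a minimal illustration, not the competition code: the `baskets` dict, the function name `item_cf_candidates`, and the cosine-style normalization of co-purchase counts are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_cf_candidates(baskets, top_n=3):
    """baskets: dict user -> list of item ids (hypothetical toy format)."""
    item_count = defaultdict(int)  # number of users who bought each item
    co_count = defaultdict(lambda: defaultdict(int))  # co-purchase counts
    for items in baskets.values():
        unique = sorted(set(items))
        for i in unique:
            item_count[i] += 1
        for a, b in combinations(unique, 2):
            co_count[a][b] += 1
            co_count[b][a] += 1

    candidates = {}
    for user, items in baskets.items():
        owned = set(items)
        scores = defaultdict(float)
        for i in owned:
            for j, c in co_count[i].items():
                if j not in owned:
                    # cosine-style normalization of the co-occurrence count
                    scores[j] += c / sqrt(item_count[i] * item_count[j])
        # highest score first, item id as a deterministic tie-breaker
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        candidates[user] = [item for item, _ in ranked[:top_n]]
    return candidates


baskets = {
    "u1": ["a", "b"],
    "u2": ["a", "b", "c"],
    "u3": ["b", "c"],
    "u4": ["d"],
}
cands = item_cf_candidates(baskets)
print(cands)
```

In this toy data u4 gets no candidates, since "d" never co-occurs with another item. Users like this are the case the notes cover with model-based or popularity-based candidates.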
Solution
1. Generate Candidates using previously purchased items
- repurchase top N
- popularity top N
- itemCF top N
- Graph embedding top N
- items with the same product_code top N
- logistic regression with categorical information top N
2. Create a Huge Table: the six sources above give a candidate pool of top N * 6 items in total.
3. Build item rank model for each user using 5 LightGBM classifiers and 7 CatBoost classifiers
** Recently popular items are selected as candidates, because fashion items are seasonal.
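A rough sketch of how several candidate sources might be merged into one table follows. It covers only two of the six sources (repurchase and recent popularity). The transaction tuples, the 7-day window, and the helper names `recent_popular`, `repurchase`, and `build_candidate_table` are hypothetical choices for the example, not details from the solution.

```python
from collections import Counter
from datetime import date, timedelta


def recent_popular(transactions, as_of, days=7, top_n=2):
    """Top-N items by purchase count in the `days` days up to `as_of`."""
    start = as_of - timedelta(days=days)
    counts = Counter(item for _, item, day in transactions if start < day <= as_of)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


def repurchase(transactions, user, top_n=2):
    """Top-N items the user has bought most often."""
    counts = Counter(item for u, item, _ in transactions if u == user)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


def build_candidate_table(users, generators):
    """generators: dict source_name -> function(user) -> list of items.
    Returns one row per (user, item) with the sources that proposed it."""
    rows = {}
    for user in users:
        for source, gen in generators.items():
            for item in gen(user):
                rows.setdefault((user, item), set()).add(source)
    return [
        {"user": u, "item": i, "sources": sorted(s)}
        for (u, i), s in sorted(rows.items())
    ]


# Toy data: (user, item, purchase date). All values are made up.
transactions = [
    ("u1", "coat", date(2020, 9, 1)),
    ("u1", "tee", date(2020, 9, 20)),
    ("u2", "tee", date(2020, 9, 21)),
    ("u2", "scarf", date(2020, 9, 22)),
    ("u1", "tee", date(2020, 9, 22)),
]
as_of = date(2020, 9, 22)
popular = recent_popular(transactions, as_of)
table = build_candidate_table(
    ["u1", "u2"],
    {
        "repurchase": lambda u: repurchase(transactions, u),
        "popular": lambda u: popular,
    },
)
for row in table:
    print(row)
```

Because the same (user, item) pair can be proposed by more than one source, the table is deduplicated and the actual row count is at most top N * number of sources per user. The older "coat" purchase falls outside the recent window, which is the seasonality effect the note describes.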
feature engineering
- user-item interaction for repurchase
- ItemCF collaborative filtering score
- similarity for embedding retrieval
- item count for popularity
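Two of the features above (user-item interaction count and item count) can be computed directly from the transaction log. The sketch below assumes a toy list of (user, item) pairs and a hypothetical helper `pair_features`. The ItemCF score and embedding similarity are left out because they depend on models not described in these notes.

```python
from collections import Counter


def pair_features(transactions, pairs):
    """transactions: list of (user, item); pairs: candidate (user, item) rows."""
    user_item_count = Counter(transactions)  # user-item interaction (repurchase)
    item_count = Counter(item for _, item in transactions)  # popularity
    return [
        {
            "user": u,
            "item": i,
            "user_item_count": user_item_count[(u, i)],
            "item_count": item_count[i],
        }
        for u, i in pairs
    ]


transactions = [("u1", "tee"), ("u1", "tee"), ("u2", "tee"), ("u2", "scarf")]
features = pair_features(transactions, [("u1", "tee"), ("u1", "scarf")])
for row in features:
    print(row)
```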
Negative Down Sampling
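Most candidate rows are negatives (items the user did not buy), so the negatives are typically subsampled before training the ranker. A minimal sketch, assuming each row carries a binary `label`. The ratio of 2 negatives per positive and the name `downsample_negatives` are illustrative assumptions, since the notes do not give the actual ratio.

```python
import random


def downsample_negatives(rows, neg_ratio=2, seed=0):
    """Keep all positives and at most `neg_ratio` negatives per positive."""
    positives = [r for r in rows if r["label"] == 1]
    negatives = [r for r in rows if r["label"] == 0]
    keep = min(len(negatives), neg_ratio * len(positives))
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sampled = rng.sample(negatives, keep)
    return positives + sampled


# Toy table: 2 positives and 18 negatives (made-up data).
rows = [{"user": "u1", "item": f"i{k}", "label": int(k < 2)} for k in range(20)]
balanced = downsample_negatives(rows)
print(len(balanced), sum(r["label"] for r in balanced))
```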