9. Kaggle H&M Personalized Fashion Recommendation System: Solution Analysis

2023. 12. 22. 14:22 NAVER AI Tech/Project

User Meta data

ID, category 1, category 2, club member(binary), fashion news frequency, age, postal code

 

Product Meta data

ID, product_code, product_name, product_group_name, graph_code, graph_name, color_group_code, color_name

 

 

 

Baseline model

1. Generate Candidates for each user

    - item-to-item similarities between each customer's previous baskets

    - user-based collaborative filtering

    - model-based predictions for customers without transaction history

    - only the last baskets

2. Create a Huge Table

    - user, item, rating, features ...

3. Build an item ranking model for each user

    - XGBoost, LightGBM, CatBoost, DNN
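
To make the first baseline candidate source concrete, here is a minimal sketch of item-to-item similarity computed from baskets. The notes do not say which similarity measure was used, so the cosine-normalized co-occurrence below is an assumption, and the function names `item_similarities` and `top_n_similar` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarities(baskets):
    """Cosine-normalized co-occurrence similarity between items.

    baskets: iterable of iterables of item IDs, one per basket.
    Returns {item_a: {item_b: similarity}}.
    """
    item_count = defaultdict(int)  # number of baskets containing each item
    pair_count = defaultdict(int)  # number of baskets containing both items
    for basket in baskets:
        items = sorted(set(basket))
        for item in items:
            item_count[item] += 1
        for a, b in combinations(items, 2):
            pair_count[(a, b)] += 1

    sims = defaultdict(dict)
    for (a, b), n_ab in pair_count.items():
        score = n_ab / sqrt(item_count[a] * item_count[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims


def top_n_similar(sims, purchased, n):
    """Rank unseen items by summed similarity to the purchased items."""
    scores = defaultdict(float)
    for item in purchased:
        for other, score in sims.get(item, {}).items():
            if other not in purchased:
                scores[other] += score
    # Sort by score descending, then by item ID to keep ties deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]
```

Calling `top_n_similar(item_similarities(baskets), set_of_purchased_items, n)` gives the top N candidates for one user; in practice this would be run per user over their previous baskets.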

 

Solution

1. Generate Candidates using previously purchased items

    - repurchase top N

    - popularity top N

    - itemCF top N

    - Graph embedding top N

    - items with the same product_code as a purchased item, top N

    - logistic regression with categorical information top N

2. Create a Huge Table: this builds a candidate pool of top N × 6 candidates in total (top N from each of the six sources above).

3. Build an item ranking model for each user using 5 LightGBM classifiers and 7 CatBoost classifiers
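
As a rough illustration of step 2, the sketch below merges the per-source candidate lists into one table with a row per (user, item) pair. Deduplicating pairs and adding a 0/1 flag per source are my assumptions, not something stated in the notes; `build_candidate_table` and the `from_*` column names are hypothetical.

```python
def build_candidate_table(candidates_by_source):
    """Merge per-source candidates into one row per (user, item) pair.

    candidates_by_source: {source_name: {user_id: [item_id, ...]}}
    Returns a list of dict rows with one 0/1 flag column per source.
    """
    sources = sorted(candidates_by_source)
    seen = {}  # (user, item) -> set of sources that proposed the pair
    for source, per_user in candidates_by_source.items():
        for user, items in per_user.items():
            for item in items:
                seen.setdefault((user, item), set()).add(source)

    rows = []
    for (user, item), hit in sorted(seen.items()):
        row = {"user": user, "item": item}
        for source in sources:
            row[f"from_{source}"] = int(source in hit)
        rows.append(row)
    return rows
```

The resulting rows are what the ranking models in step 3 would be trained on, after a label (purchased or not in the target period) and the features are attached.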

 

** Recently popular items are selected as candidates, because fashion items are seasonal by nature.
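
A minimal sketch of "recently popular" candidates: count purchases only inside a recent time window, so that out-of-season best sellers drop out. The window length is not given in the notes, so `window_days` is an assumed parameter, and `recent_popular_items` is a hypothetical name.

```python
from collections import Counter
from datetime import date, timedelta


def recent_popular_items(transactions, as_of, window_days, n):
    """Top-n items by purchase count within the last `window_days` days.

    transactions: iterable of (user_id, item_id, purchase_date) tuples.
    as_of: the date treated as "now"; later transactions are ignored.
    """
    start = as_of - timedelta(days=window_days)
    counts = Counter(
        item
        for _, item, day in transactions
        if start < day <= as_of
    )
    # Sort by count descending, then by item ID to keep ties deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]
```

With a short window, an item that sold heavily months earlier but not recently is excluded even if its all-time count is the highest, which is the seasonality effect the note describes.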

Feature engineering

    - user-item interaction for repurchase

    - ItemCF collaborative filtering score

    - similarity for embedding retrieval

    - item count for popularity
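
Two of the features above are plain counts, and can be sketched as follows. This only covers the user-item interaction count and the item popularity count; the ItemCF and embedding scores would be attached the same way. The name `interaction_features` and the output keys are hypothetical.

```python
from collections import Counter


def interaction_features(transactions, candidate_pairs):
    """Attach two count features to each (user, item) candidate pair.

    transactions: iterable of (user_id, item_id) purchase events.
    candidate_pairs: iterable of (user_id, item_id) pairs to score.
    """
    transactions = list(transactions)
    # How often this user bought this item before (repurchase signal).
    user_item_count = Counter(transactions)
    # How often the item was bought by anyone (popularity signal).
    item_count = Counter(item for _, item in transactions)
    return [
        {
            "user": user,
            "item": item,
            "user_item_count": user_item_count[(user, item)],
            "item_count": item_count[item],
        }
        for user, item in candidate_pairs
    ]
```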

Negative Down Sampling
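
Most candidate rows are items the user did not buy, so the training table is heavily skewed toward negatives. A minimal sketch of negative down sampling, keeping every positive row and a random fraction of the negatives, could look like this. The notes do not state the ratio used, so `negative_ratio` is an assumed parameter and `downsample_negatives` is a hypothetical name.

```python
import random


def downsample_negatives(rows, negative_ratio, seed=0):
    """Keep all positives and a random fraction of the negatives.

    rows: list of dicts with a binary "label" key (1 = purchased).
    negative_ratio: fraction of negatives to keep, between 0 and 1.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    positives = [r for r in rows if r["label"] == 1]
    negatives = [r for r in rows if r["label"] == 0]
    n_keep = round(len(negatives) * negative_ratio)
    kept = rng.sample(negatives, n_keep)
    return positives + kept
```

Since the goal is ranking items for each user, the shift in predicted probabilities caused by dropping negatives usually matters less than the savings in training time and memory.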