SCRUF-D Integration
Similarities
- Both reference logged data: each uses feedback logs to improve predictions
- Both use similar data structures: UserIDs and ranked lists
Differences
- OBP uses a reward signal
- OBP is data-driven machine learning, while SCRUF-D is more model-based
- Metrics differ (OBP's are reward-based)
Possible Implementation
The following are possible ways of integrating OBP into the SCRUF-D architecture (none are currently implemented).
Partial
- Use an OBP policy as the recommender system
- Build the Algorithm and Choice Mechanism, with feedback handled separately
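For the partial option, the recommender slot could be filled by an OBP-style context-free policy. As a minimal sketch (pure numpy, not OBP itself), the class below mirrors the `select_action` / `update_params` interface that OBP's context-free policies expose; the class name and epsilon-greedy logic are illustrative assumptions, not OBP code:

```python
import numpy as np

class EpsilonGreedyRecommender:
    """Sketch of a context-free policy mirroring OBP's
    select_action / update_params interface (illustrative, not OBP code)."""

    def __init__(self, n_actions: int, len_list: int = 3,
                 epsilon: float = 0.1, seed: int = 0):
        self.n_actions = n_actions
        self.len_list = len_list          # length of the recommended list
        self.epsilon = epsilon
        self.rng = np.random.default_rng(seed)
        self.counts = np.zeros(n_actions)
        self.reward_sums = np.zeros(n_actions)

    def select_action(self) -> np.ndarray:
        """Return a ranked list of len_list distinct action (item) indices."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.n_actions, size=self.len_list,
                                   replace=False)
        means = self.reward_sums / np.maximum(self.counts, 1)
        return np.argsort(-means)[: self.len_list]

    def update_params(self, action: int, reward: float) -> None:
        """Fold one (action, reward) observation back into the policy."""
        self.counts[action] += 1
        self.reward_sums[action] += reward
```

The Choice Mechanism would then call `select_action()` per user request, while the feedback path calls `update_params()` as rewards arrive.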
Full
- Modifications (Similar to Slate)
- Evaluators / Metrics
- Bandit Feedback
- Determine data sources/storage
- Ways to manage the Rewards Problem
- Set all rewards equal to 1
- Repurpose the rewards label: Could signal protected items/lists
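Either option amounts to filling in the `reward` field when converting SCRUF-D logs into the `bandit_feedback` dictionary OBP's estimators consume. A sketch of option one (all rewards set to 1) follows; the key names (`n_rounds`, `n_actions`, `context`, `action`, `position`, `reward`, `pscore`) follow OBP's documented dictionary but should be checked against the installed version, and the uniform `pscore` is an assumption:

```python
import numpy as np

def scruf_log_to_bandit_feedback(user_contexts, recommended_items,
                                 positions, n_actions):
    """Convert SCRUF-D recommendation logs into an OBP-style bandit_feedback
    dict. SCRUF-D logs carry no reward signal, so every reward is set to 1
    (option one). pscore assumes uniform-random logging, which is a
    placeholder until the real logging propensities are known."""
    n_rounds = len(recommended_items)
    return {
        "n_rounds": n_rounds,
        "n_actions": n_actions,
        "context": np.asarray(user_contexts),
        "action": np.asarray(recommended_items),
        "position": np.asarray(positions),
        "reward": np.ones(n_rounds),                   # constant reward
        "pscore": np.full(n_rounds, 1.0 / n_actions),  # assumed uniform
    }
```

Option two would instead set `reward[i] = 1` only when item `i` is protected, repurposing the label as a fairness signal.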
- Ways to manage the Fairness Agents
- Each Fairness Agent serves as an OBP policy
- The Allocation Mechanism manages policy usage based on BanditFeedback: nested actions
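One way to read "nested actions": the Allocation Mechanism picks an agent (outer action), and that agent's policy picks an item (inner action), with the observed reward credited back to the chosen agent. The sketch below is a hypothetical illustration; the class names and the protected-item preference are assumptions, not SCRUF-D's or OBP's API:

```python
import numpy as np

class FairnessAgentPolicy:
    """Hypothetical wrapper: one fairness agent exposed as a policy
    that prefers its protected items (illustrative only)."""
    def __init__(self, name, protected_items):
        self.name = name
        self.protected_items = set(protected_items)

    def select_action(self, candidates):
        # Prefer a protected item; otherwise fall back to the first candidate.
        for item in candidates:
            if item in self.protected_items:
                return item
        return candidates[0]

class AllocationMechanism:
    """Chooses which fairness-agent policy acts each round, using the
    observed (nested) bandit feedback to track per-agent reward."""
    def __init__(self, agents):
        self.agents = agents
        self.counts = np.zeros(len(agents))
        self.rewards = np.zeros(len(agents))

    def choose_agent(self):
        # Try each agent once, then pick the one with the best mean reward.
        untried = np.where(self.counts == 0)[0]
        if len(untried) > 0:
            return int(untried[0])
        return int(np.argmax(self.rewards / self.counts))

    def record(self, agent_idx, reward):
        self.counts[agent_idx] += 1
        self.rewards[agent_idx] += reward
```

In practice `record()` would be fed from the BanditFeedback conversion above rather than hand-computed rewards.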
- Ways to manage the List Structure
- OBP uses three dimensions (context, action, position); SCRUF-D uses (userID, itemID, position)
- A (context, listID) pair could be used in OBP, but we would need to pull the list from the data source
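The dimension mapping above can be sketched as a per-record conversion: the item plays the role of the action, position carries over directly, and context must be pulled from an external per-user data source. `user_features` below is a hypothetical lookup standing in for that data source, not part of either library:

```python
import numpy as np

def scruf_tuple_to_obp(user_id, item_id, position, user_features):
    """Map one SCRUF-D log entry (userID, itemID, position) onto OBP's
    (context, action, position) triple. user_features is a hypothetical
    lookup for per-user context pulled from the data source."""
    context = np.asarray(user_features[user_id])  # context from data source
    action = item_id                              # item acts as the action
    return context, action, position
```

The same lookup step is what the listID variant would require: resolving a listID back to its member items before OBP can score them.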