
LATEST PROJECTS

Project |01 Time-series Model for Crowd Work Quality Prediction

Although real crowd work exhibits temporal behavioral patterns, prior studies have typically modeled worker performance under a simplified i.i.d. assumption. To better model such temporal worker behavior, we propose a time-series label prediction model for crowd work. This latent variable model captures and summarizes past worker behavior, enabling us to better predict the quality of each worker's next label. Given the inherent uncertainty in prediction, we also investigate a reject option for decisions to balance the trade-off between prediction accuracy and coverage. Results show that our model improves both the accuracy of label prediction on real crowd worker data and overall data quality.



(published in AAAI HComp'14; ECIR'15 under review)
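
The accuracy vs. coverage trade-off of a reject option can be illustrated with a minimal sketch. This is not the model from the paper: it assumes some model already outputs a probability that a worker's next label is correct, and the function names, threshold rule, and numbers below are all hypothetical.

```python
from typing import List, Optional, Tuple


def predict_with_reject(prob_correct: float, threshold: float) -> Optional[int]:
    """Return 1 (label predicted correct), 0 (incorrect), or None (reject).

    prob_correct is a hypothetical model output: the estimated probability
    that the worker's next label is correct. The prediction is rejected
    when the confidence max(p, 1 - p) falls below the threshold.
    """
    confidence = max(prob_correct, 1.0 - prob_correct)
    if confidence < threshold:
        return None
    return 1 if prob_correct >= 0.5 else 0


def accuracy_and_coverage(
    probs: List[float], truths: List[int], threshold: float
) -> Tuple[float, float]:
    """Accuracy on accepted predictions and the fraction of items accepted."""
    decisions = [predict_with_reject(p, threshold) for p in probs]
    accepted = [(d, t) for d, t in zip(decisions, truths) if d is not None]
    coverage = len(accepted) / len(probs) if probs else 0.0
    if not accepted:
        return 0.0, coverage
    accuracy = sum(d == t for d, t in accepted) / len(accepted)
    return accuracy, coverage


# Made-up probabilities and ground truth, for illustration only.
probs = [0.95, 0.55, 0.10, 0.48, 0.80]
truths = [1, 0, 0, 1, 1]
low = accuracy_and_coverage(probs, truths, threshold=0.5)
high = accuracy_and_coverage(probs, truths, threshold=0.75)
```

On this toy data, a threshold of 0.5 accepts every prediction (coverage 1.0, accuracy 0.6), while a threshold of 0.75 rejects the two uncertain cases (coverage 0.6, accuracy 1.0). Raising the threshold trades coverage for accuracy.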

Project |02 Crowdsourcing via Probabilistic Matrix Factorization for Reliable and Efficient IR Evaluation

Probabilistic matrix factorization (PMF) is widely used in recommendation engines such as those of Amazon and Netflix. We attempt to improve the quality of crowdsourced labels by inferring unobserved labels with PMF. We also use matrix factorization to find pseudo-expert assessors, in order to improve the quality of crowdsourced relevance judgments.



(published in AAAI HComp'12, SIGIR'12, Technical Report 13)
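
As a rough sketch of how unobserved labels might be inferred by matrix factorization, the code below fits low-rank worker and item factors to a toy worker-by-item label matrix with stochastic gradient descent. It is an assumption-laden illustration, not the published method: the toy matrix, the hyperparameters, and the helper names (`factorize`, `complete`) are all hypothetical. Minimizing squared error with L2 penalties corresponds to MAP estimation in PMF with Gaussian priors.

```python
import random
from typing import List, Optional, Tuple

Matrix = List[List[Optional[float]]]
Factors = List[List[float]]


def dot(u: List[float], v: List[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def factorize(labels: Matrix, rank: int = 2, steps: int = 2000,
              lr: float = 0.05, reg: float = 0.01,
              seed: int = 0) -> Tuple[Factors, Factors]:
    """Fit worker factors W and item factors V to the observed labels."""
    rng = random.Random(seed)
    n_workers, n_items = len(labels), len(labels[0])
    W = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_workers)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    # Only observed (non-None) entries contribute to the loss.
    observed = [(i, j, x) for i, row in enumerate(labels)
                for j, x in enumerate(row) if x is not None]
    for _ in range(steps):
        rng.shuffle(observed)
        for i, j, x in observed:
            err = x - dot(W[i], V[j])
            for k in range(rank):
                w, v = W[i][k], V[j][k]
                W[i][k] += lr * (err * v - reg * w)
                V[j][k] += lr * (err * w - reg * v)
    return W, V


def complete(W: Factors, V: Factors) -> List[List[float]]:
    """Predict a score for every worker-item pair, observed or not."""
    return [[dot(w, v) for v in V] for w in W]


# Toy matrix: rows are workers, columns are items, None means unlabeled.
labels: Matrix = [
    [1, 0, 1, None],
    [1, 0, None, 0],
    [None, 0, 1, 0],
    [1, None, 1, 0],
]
W, V = factorize(labels)
filled = complete(W, V)
```

The entries of `filled` at the `None` positions are the inferred scores; thresholding them (for example at 0.5) would turn them into binary labels.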

Project |03 Reliable IR Evaluation in the Absence of Expert Labels

We investigate the feasibility of evaluating classifiers without expert judgments, using evaluation methodologies that rely on no expert labels or on crowdsourced judgments alone.



 

Project |04 Stock Trend Forecasting with Machine Learning

We predict stock trends with a binary stock event model built on a simple, fast naive Bayes classifier. In backtesting, it outperforms well-known trading algorithms.

Project experience from the past 5 years

 

  1. Crowdsourcing and Information Retrieval

  • Task-level Search Engine Evaluation with Crowd Assessors at Microsoft Research

  • Finding qualified crowdworkers via Probabilistic Matrix Factorization at UT Austin

  • Achieving Quality Crowdsourcing Across Tasks, Data Scales, and Operational Settings at UT Austin (NSF project)

  • TREC 2012 Crowdsourcing track Image Relevance Judgment at UT Austin

  • TREC 2011 Crowdsourcing track evaluation at UT Austin

  • Large-scale Crowdsourcing for Graph Search at Intelius Inc. (summer internship)

  • Spam worker control in crowdsourcing at UT Austin

  • Machine learning for user-generated social data retrieval at Microsoft Bing Social

  • Personalized Video Content Recommendation at KT with EU-FP7

 

  2. Big Data Analysis (Machine Learning and Data Mining)

  • A Prediction Model for Quality of Applications at Indeed.com (summer internship)

  • Temporal Analysis on Job Market Transition at Indeed.com (summer internship)

  • Stock event model based stock prediction at UT Austin

 

  3. Information Visualization

  • Information visualization for debugging of linked data inference at UT Austin

Just a sample of my work. To see more or discuss possible work >>
