Research



An Overview
What am I doing now?
I am working with Prof. Panagiotis G. Ipeirotis and Prof. Anindya Ghose in the Center for Digital Economy Research (CeDER) on The EconoMining Project @ NYU. My general interests are econometric modeling and the economics of information systems. More specifically, I am trying to understand the economic value of different product characteristics by analyzing various types of user-generated content from social media, combining text mining, image classification, geo-tagging, and human annotation with demand estimation.
Housing Market Analysis and Forecast:
I worked as a summer research intern in the Microsoft Virtual Earth & Local Search group in 2008, analyzing and forecasting the housing market in the Greater Seattle area using econometric and spatio-temporal modeling.
Medical Image Mining:
Previously, I worked on data preprocessing and data mining in brain imaging. We were looking for an efficient way to transform brain images (multidimensional arrays of data) into representations that are amenable to data mining techniques.
First, we transform the 3-D images into a 2-D matrix in which each row represents one subject and each column holds the feature value of one pixel in an image slice. We then run an SVM classification algorithm on the matrix to identify subjects whose brain images contain abnormal regions compared with normal ones. The goal is to detect certain diseases as early as possible.
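The flattening and classification steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the code used in the project: the toy 2x2x2 volumes, the function names, and the simple hinge-loss trainer (a stand-in for a real SVM package, with no regularization and no kernel) are all assumptions made for the example.

```python
def flatten_volume(volume):
    """Flatten one subject's 3-D image (slices x rows x columns) into one feature row."""
    return [float(v) for image_slice in volume for row in image_slice for v in row]


def build_subject_matrix(volumes):
    """Build the 2-D matrix: one row per subject, one column per pixel."""
    rows = [flatten_volume(v) for v in volumes]
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all volumes must have the same dimensions")
    return rows


def score(weights, bias, features):
    """Signed score of a feature row under a linear model."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def train_hinge_classifier(matrix, labels, learning_rate=0.1, epochs=200):
    """Hypothetical stand-in for an SVM: a linear model updated whenever a
    sample violates the hinge-loss margin. Labels must be +1 or -1."""
    weights = [0.0] * len(matrix[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(matrix, labels):
            if label * score(weights, bias, features) < 1.0:
                weights = [w + learning_rate * label * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * label
    return weights, bias


def predict(weights, bias, features):
    """Return +1 (abnormal) or -1 (normal)."""
    return 1 if score(weights, bias, features) > 0 else -1


# Toy data (assumed): two "normal" volumes and two with a bright region.
normal_a = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
normal_b = [[[0.1, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
abnormal_a = [[[0.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 0.0]]]
abnormal_b = [[[0.1, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 0.0]]]

matrix = build_subject_matrix([normal_a, normal_b, abnormal_a, abnormal_b])
labels = [-1, -1, 1, 1]

weights, bias = train_hinge_classifier(matrix, labels)
predictions = [predict(weights, bias, row) for row in matrix]
```

Real brain images have far more pixels than subjects, so in practice the matrix would normally go through feature reduction before being handed to an established SVM implementation.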

Weblog Data Mining:

I also explored data mining techniques for web data with special features, such as the recently emerging blog data. People tend to share their real feelings and true thoughts on their blogs rather than in surveys or media interviews. Mining blog data can therefore help us learn what people truly think about certain social issues, as well as their real opinions of particular business products.

To retrieve useful information from the blog documents, we first built a blog document database by converting the HTML web pages into text files. We then transformed these text files into a Word-by-Page matrix, which was used for further analysis. After this preprocessing step, we performed feature selection to reduce the feature space and remove some of the noise. Finally, we ran our data mining algorithms on the resulting data to look for interesting information.
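The preprocessing pipeline described above can be sketched as follows, using only the Python standard library. This is an illustrative sketch rather than the tooling used in the project: the sample pages, the function names, and the choice of a document-frequency threshold as the feature selection rule are assumptions.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Convert one HTML page into plain text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def word_by_page_matrix(pages, min_pages=2):
    """Return (vocabulary, matrix) with one row per word and one column per page.

    Feature selection here is an assumed, simple rule: keep only words that
    occur in at least `min_pages` pages.
    """
    counts = [Counter(tokenize(html_to_text(page))) for page in pages]
    doc_freq = Counter(word for page_counts in counts for word in page_counts)
    vocab = sorted(w for w, df in doc_freq.items() if df >= min_pages)
    matrix = [[page_counts[w] for page_counts in counts] for w in vocab]
    return vocab, matrix


# Hypothetical blog pages used only to exercise the pipeline.
pages = [
    "<html><body><h1>Hotel review</h1><p>Great hotel, great view.</p></body></html>",
    "<html><body><p>The hotel was noisy.</p><script>var x = 1;</script></body></html>",
    "<html><body><p>Lovely view of a lake.</p></body></html>",
]

vocab, matrix = word_by_page_matrix(pages)
```

With these three sample pages, only "hotel" and "view" appear in at least two pages, so the matrix has two rows and three columns. A real corpus would typically also need stop-word removal and term weighting before clustering.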




Projects
  Microsoft Virtual Earth Award, Aug. 2007 - May 2008
  The EconoMining Project @ NYU, Aug. 2007-
  Brain imaging data Classification & Analysis for early Alzheimer's Disease detection, Jan 2006-May 2007.
  Mining the Blogosphere, Nov 2006-May 2007.



Papers & Publications
Towards Designing Ranking Systems for Hotels on Travel Search Engines: Combining Text mining with Demand Estimation in the Hotel Industry. Proceedings of the 2009 Workshop on Information Technology and Systems (WITS 2009), Phoenix, December, 2009. (with Anindya Ghose and Panagiotis G. Ipeirotis)
Stay Elsewhere? Improving Local Search for Hotels Using Econometric Modeling and Image Classification. WebDB 2008, in conjunction with ACM SIGMOD/PODS 2008, Vancouver, Canada. Continuing work based on [1] and [2]. (with Anindya Ghose and Panagiotis G. Ipeirotis)
[2] Improving Local Search for Hotels Using Econometric Modeling and Image Classification. Microsoft Virtual Earth and Location Summit, Redmond, WA, May, 2008. (with Anindya Ghose and Panagiotis G. Ipeirotis)
[1] Stay Elsewhere? The Economic Impact of Location-based Hotel Features: A View from Remote Sensing Image Analysis. Winter Conference on Business Intelligence, March, 2008, Salt Lake City, Utah. (with Anindya Ghose and Panagiotis G. Ipeirotis)
Master's Thesis: Clustering Weblog Documents, May, 2007.
Beibei Li, Shuting Xu, Jun Zhang: Enhancing Clustering Blog Documents by Utilizing Author/Reader Comments. In Proceedings of the 45th ACM Southeast Conference (ACMSE 2007), pp. 94-99, March 23-24, 2007, Winston-Salem, North Carolina, U.S.A.
Beibei Li, Jiajin Le: Privacy Preserving Association Rule Mining in Distributed Environments, National Data Base Conference 2005 (NDBC 2005), Inner Mongolia, China, August 2005.
Feng He, Jiajin Le, Beibei Li: Research About Integration Based on Web Service Through An E-learning System, IEEE 2005 International Conference on Service Operations and Logistics, and Informatics, Beijing, China, August 2005.
Beibei Li, Jiajin Le: Application of Web Service in Web Mining. International Symposium on Computational and Information Sciences (CIS'04), Shanghai, China, 2004. (Certificate of Appreciation Award) Springer-Verlag's Lecture Notes in Computer Science (LNCS) series.
Beibei Li, Xiaoping Zhong, et al: Web-Based Courseware Implementation. Chapter 1: Web-Based Courseware Foundation, Chapter 8: Interactive Web-Based Teaching System Development, edited by Jingcheng Zhao, Posts & Telecom Press, 2004.



Awards
Doctoral Student Fellowship, Leonard N. Stern School of Business, New York University, Sept. 2008-present.
Student Travel Support from Graduate School Fellowship of University of Kentucky, Spring, 2007.
First Prize for paper presentation, 20th Annual EKU Symposium in the Mathematical, Statistical and Computer Sciences, Mar 31, 2006.
Editor's Choice Award for Outstanding Achievement in Poetry, International Library of Poetry, July 2006.
Seventeen other fellowships and awards during my B.S./M.S. studies in China, 1999-2005.



Presentations
Stay Elsewhere? Improving Local Search for Hotels Using Econometric Modeling and Image Classification WebDB 2008, in conjunction with ACM SIGMOD/PODS 2008, Vancouver, Canada.
Local Search for Hotel and Restaurants Using Econometrics Modeling and Image Classification. Virtual Earth Academic Research Collaboration 2007 RFP Awards Summits. April 30 - May 1, 2008. Redmond, Washington.
Stay Elsewhere? The Economic Impact of Location-based Hotel Features: A View from Remote Sensing Image Analysis. Winter Conference on Business Intelligence, March 20-22, 2008, Salt Lake City, Utah.
Enhancing Clustering Blog Documents by Utilizing Author/Reader Comments. Mar. 2007. North Carolina.



Bookmarks
< Econometric & Statistical Data Analysis >
TOOLs
STATA
Matlab
Resources
UCLA Stat Computing
Online Discussion Panel
Econ-PhDs.net
PhD in Economics Forum
< Web Data mining >
Fundamentals & News
DM Tutorials - By Andrew Moore
KDnuggets: Data Mining, Web Mining, Text Mining, and Knowledge Discovery Guide
Data Mining Blogs
Software
TMG: Text to Matrix Generator (A Matlab Tool)
PDDP: Principal Direction Divisive Partitioning for Clustering (A Matlab Tool)
SVM-Light: A Support Vector Machine (SVM) implementation for classification.
Some other Tools for DM
< Privacy-Preserving Data mining >
The Privacy Lab @ CMU - PPDM References
Christopher W. Clifton @ Purdue University (Privacy-Preserving Distributed DM)
My Master's Thesis Abstract: Privacy Preserving Distributed Association Rule Mining
< Others >
Unix & Solaris
Solaris 8----User commands
Unix programming
Unix Power
UNIX-C System Interfaces & Headers
LaTeX
LaTeX Help
Hypertext Help with LaTeX
LaTeX Symbols