Publications of Panagiotis G. Ipeirotis

Working Papers

  1. The Dimensions of Reputation in Electronic Markets,
    A. Ghose, P. Ipeirotis, and A. Sundararajan
  2. Deriving the Pricing Power of Product Features by Mining Consumer Reviews,
    N. Archak, A. Ghose, and P. Ipeirotis
  3. Estimating the Socio-Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics,
    A. Ghose and P. Ipeirotis
  4. Answering General Time Sensitive Queries,
    W. Dakka, L. Gravano, and P. Ipeirotis
  5. Improving Data Quality and Data Mining Using Multiple, Noisy Labelers
    V. Sheng, F. Provost, and P. Ipeirotis
  6. Modeling Dependencies in Prediction Markets, (blog post)
    N. Archak and P. Ipeirotis

Papers in Refereed Journals

  1. A Quality-Aware Optimizer for Information Extraction,
    A. Jain and P. Ipeirotis,
    ACM Transactions on Database Systems (TODS), March 2009
  2. Classification-Aware Hidden-Web Text Database Selection,
    P. Ipeirotis and L. Gravano,
    ACM Transactions on Information Systems (TOIS), vol. 26, no. 2, article 6, March 2008
  3. Towards a Query Optimizer for Text-Centric Tasks,
    P. Ipeirotis, E. Agichtein, P. Jain, and L. Gravano,
    ACM Transactions on Database Systems (TODS), vol. 32, no. 4, article 21, November 2007
  4. Modeling and Managing Changes in Text Databases,
    P. Ipeirotis, A. Ntoulas, J. Cho, and L. Gravano,
    ACM Transactions on Database Systems (TODS), vol. 32, no. 3, article 14, August 2007
  5. Duplicate Record Detection: A Survey,
    A. Elmagarmid, P. Ipeirotis, and V. Verykios,
    IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 19, no. 1, January 2007
  6. QProber: A System for Automatic Classification of Hidden-Web Databases,
    L. Gravano, P. Ipeirotis, and M. Sahami,
    ACM Transactions on Information Systems (TOIS), vol. 21, no. 1, January 2003

Papers in Refereed Conferences

  7. Modeling Volatility in Prediction Markets, (blog post)
    N. Archak and P. Ipeirotis
    Proceedings of the 10th ACM Conference on Electronic Commerce (EC 2009), 2009 (40/158 = 25% accepted)
  8. Query by Document,
    Y. Yang, N. Bansal, W. Dakka, P. Ipeirotis, N. Koudas, and D. Papadias
    Second ACM International Conference on Web Search and Data Mining (WSDM 2009), 2009 (29/170 = 17% accepted)
  9. Join Optimization of Information Extraction Output: Quality Matters!,
    A. Jain, P. Ipeirotis, A. Doan, and L. Gravano
    Proceedings of the 25th IEEE International Conference on Data Engineering (ICDE 2009), 2009
  10. Get Another Label? Improving Data Quality and Data Mining Using Multiple, Noisy Labelers, Best Paper Award Runner Up, (slides)
    V. Sheng, F. Provost, and P. Ipeirotis
    Proceedings of the Fourteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2008), 2008 (50/~500 < 10% accepted)
  11. Automatic Extraction of Useful Facet Hierarchies from Text Databases,
    W. Dakka and P. Ipeirotis
    Proceedings of the 24th IEEE International Conference on Data Engineering (ICDE 2008), 2008
  12. Show me the money! Deriving the Pricing Power of Product Features by Mining Consumer Reviews, (slides)
    N. Archak, A. Ghose, and P. Ipeirotis
    Proceedings of the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2007), 2007 (~100/513 < 20% accepted)
  13. Opinion Mining Using Econometrics: A Case Study on Reputation Systems, (slides)
    A. Ghose, P. Ipeirotis, and A. Sundararajan
    Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL 2007), 2007 (132/588 = 22% accepted)
  14. To Search or to Crawl? Towards a Query Optimizer for Text-Centric Tasks, Best Paper Award, (slides, extended slides)
    P. Ipeirotis, E. Agichtein, P. Jain, and L. Gravano,
    Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data (SIGMOD 2006), 2006 (58/446 = 13% accepted)
  15. Automatic Construction of Multifaceted Browsing Interfaces, (slides)
    W. Dakka, P. Ipeirotis, and K. Wood,
    Proceedings of the 2005 ACM CIKM International Conference on Information and Knowledge Management (CIKM 2005), 2005 (76/425 = 18% accepted)
  16. Modeling and Managing Content Changes in Text Databases, Best Paper Award, (slides)
    P. Ipeirotis, A. Ntoulas, J. Cho, and L. Gravano,
    Proceedings of the 21st IEEE International Conference on Data Engineering (ICDE 2005), 2005 (67/521 = 13% accepted)
  17. When one Sample is not Enough: Improving Text Database Selection Using Shrinkage, (slides)
    P. Ipeirotis and L. Gravano,
    Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data (SIGMOD 2004), 2004 (69/431 = 16% accepted)
  18. Text Joins in an RDBMS for Web Data Integration, (slides, demo)
    L. Gravano, P. Ipeirotis, N. Koudas, and D. Srivastava,
    Proceedings of the 12th International World-Wide Web Conference (WWW 2003), 2003 (13% accepted)
  19. Distributed Search over the Hidden-Web: Hierarchical Database Sampling and Selection,
    P. Ipeirotis and L. Gravano,
    Proceedings of the 28th International Conference on Very Large Databases (VLDB 2002), 2002 (16% accepted)
  20. Extending SDARTS: Extracting Metadata from Web Databases and Interfacing with the Open Archives Initiative,
    P. Ipeirotis, T. Barry, and L. Gravano,
    Proceedings of the Second ACM+IEEE Joint Conference on Digital Libraries (JCDL 2002), 2002 (33% accepted)
  21. Approximate String Joins in a Database (Almost) for Free, (erratum, slides)
    L. Gravano, P. Ipeirotis, H. V. Jagadish, N. Koudas, S. Muthukrishnan, and D. Srivastava
    Proceedings of the 27th International Conference on Very Large Databases (VLDB 2001), 2001 (17% accepted)
  22. Probe, Count, and Classify: Categorizing Hidden-Web Databases, (slides)
    P. Ipeirotis, L. Gravano, and M. Sahami
    Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data (SIGMOD 2001), 2001 (15% accepted)
  23. SDLIP + STARTS = SDARTS. A Protocol and Toolkit for Metasearching, Best Paper Award Runner Up, (slides)
    N. Green, P. Ipeirotis, and L. Gravano
    Proceedings of the First ACM+IEEE Joint Conference on Digital Libraries (JCDL 2001), 2001

Papers in Refereed Workshops, Posters, and Demonstration Sessions

  24. Answering General Time Sensitive Queries,
    W. Dakka, L. Gravano, and P. Ipeirotis,
    Proceedings of the 2008 ACM CIKM International Conference on Information and Knowledge Management (CIKM 2008), 2008
  25. Stay Elsewhere? Improving Local Search for Hotels Using Econometric Modeling and Image Classification,
    B. Li, A. Ghose, and P. Ipeirotis
    Proceedings of the Eleventh International Workshop on the Web and Databases (WebDB 2008), 2008 (14/30 = 47% accepted)
  26. The Impact of Information Disclosure on Stock Market Returns: The Sarbanes-Oxley Act and the Role of Media as an Information Intermediary,
    K. Balakrishnan, A. Ghose, and P. Ipeirotis,
    Proceedings of the Seventh Workshop on the Economics of Information Security (WEIS 2008), 2008
  27. Multifaceted Browsing over Large Databases of Text-Annotated Objects,
    W. Dakka, P. Ipeirotis, and K. Wood,
    Proceedings of the 23rd IEEE International Conference on Data Engineering, Demonstrations (ICDE 2007), 2007 (28/73 = 38% accepted)
  28. Designing Ranking Systems for Consumer Reviews: The Impact of Review Subjectivity on Product Sales and Review Quality,
    A. Ghose and P. Ipeirotis,
    Proceedings of the 2006 Workshop on Information Technology and Systems (WITS 2006), 2006 (36/105 = 34% accepted)
  29. Automatic Discovery of Useful Facet Terms, (slides)
    W. Dakka, R. Dayal, and P. Ipeirotis
    ACM SIGIR 2006 Workshop on Faceted Search, 2006
  30. Reputation Premiums in Electronic Peer-to-Peer Markets: Analyzing Textual Feedback and Network Structure,
    A. Ghose, P. Ipeirotis, and A. Sundararajan
    ACM SIGCOMM 2005 Workshop Proceedings, Third Workshop on Economics of Peer-to-Peer Systems (P2PEcon 2005), 2005
  31. Modeling Query-Based Access to Text Databases, (slides)
    E. Agichtein, P. Ipeirotis, and L. Gravano,
    Proceedings of the Sixth International Workshop on the Web and Databases (WebDB 2003), 2003 (17/67 = 25% accepted)
  32. Text Joins for Data Cleansing and Integration in an RDBMS, (slides, poster, demo)
    L. Gravano, P. Ipeirotis, N. Koudas, and D. Srivastava,
    Proceedings of the 19th IEEE International Conference on Data Engineering (ICDE 2003), 2003
  33. PERSIVAL Demo: Categorizing Hidden-Web Resources, (poster)
    P. Ipeirotis, L. Gravano, and M. Sahami
    Proceedings of the First ACM+IEEE Joint Conference on Digital Libraries (JCDL 2001), 2001
  34. Automatic Classification of Text Databases through Query Probing, (slides)
    P. Ipeirotis, L. Gravano, and M. Sahami
    Proceedings of the Third International Workshop on the Web and Databases (WebDB 2000) (also in LNCS 1997), 2000 (20/69 = 29% accepted)

Invited Papers

  35. The EconoMining Project at NYU: Studying the Economic Value of User-Generated Content on the Internet,
    A. Ghose and P. Ipeirotis
    Journal of Revenue and Pricing Management, vol. 8, no. 2-3, March 2009
  36. Building Query Optimizers for Information Extraction: The SQoUT Project,
    A. Jain, P. Ipeirotis, and L. Gravano,
    SIGMOD Record, Special Issue on "Managing Information Extraction," vol. 37, no. 4, December 2008
  37. Searching Digital Libraries,
    P. Ipeirotis,
    Encyclopedia of Database Systems, 2008
  38. Designing Novel Review Ranking Systems: Predicting Usefulness and Impact of Reviews, (slides)
    A. Ghose and P. Ipeirotis,
    Proceedings of the Ninth International Conference on Electronic Commerce (ICEC 2007), 2007
  39. Query- vs. Crawling-based Classification of Searchable Web Databases,
    L. Gravano, P. Ipeirotis, and M. Sahami,
    IEEE Data Engineering Bulletin, vol. 25, no. 1, March 2002
  40. Using q-grams in a DBMS for Approximate String Processing, (erratum)
    L. Gravano, P. Ipeirotis, H. V. Jagadish, N. Koudas, S. Muthukrishnan, L. Pietarinen, and D. Srivastava,
    IEEE Data Engineering Bulletin, vol. 24, no. 4, December 2001

Miscellaneous Publications and Presentations

  41. Multi-labeling When Data Preprocessing Is Costly,
    V. Sheng, F. Provost, and P. Ipeirotis
    INFORMS Annual Meeting, 2008
  42. Improving Data Quality and Data Mining Using Multiple, Noisy Labelers,
    V. Sheng, F. Provost, and P. Ipeirotis
    3rd Annual Machine Learning Symposium, 2008
  43. Detecting Important Events Using Prediction Markets, Text Mining, and Volatility Modeling,
    G. Tziralis and P. Ipeirotis
    Third Workshop on Prediction Markets, 2008
  44. Noisy Multi-Labeling for Data Mining
    V. Sheng, F. Provost, and P. Ipeirotis
    Fourth Research Symposium on Statistical Challenges in E-Commerce, 2008
  45. Measuring the Pricing Power of User-Generated Reviews for Hedonic Goods
    N. Archak, A. Ghose, and P. Ipeirotis,
    Fourth Research Symposium on Statistical Challenges in E-Commerce, 2008
  46. Stay Elsewhere? The Economic Impact of Location-based Hotel Features: A View from Remote Sensing Image Analysis
    B. Li, A. Ghose, and P. Ipeirotis
    Winter Conference on Business Intelligence, 2008
  47. Designing Ranking Systems for Consumer Reviews: The Economic Impact of Customer Sentiment in Electronic Markets,
    A. Ghose and P. Ipeirotis,
    Proceedings of the International Conference on Decision Support Systems (ICDSS 2007)
  48. Towards Automating the Pricing Power of Product Attributes: An Analysis of Online Product Reviews
    N. Archak, A. Ghose, and P. Ipeirotis,
    Winter Conference on Business Intelligence, 2007
  49. The Dimensions of Reputation in Electronic Markets,
    A. Ghose, P. Ipeirotis, and A. Sundararajan
    Second Research Symposium on Statistical Challenges in E-Commerce, 2006
  50. Classifying and Searching Hidden-Web Text Databases,
    P. Ipeirotis
    Ph.D. Dissertation, Columbia University (advisor: L. Gravano), September 2004