This page contains links to copyrighted material.
Please note that by downloading these documents, you automatically agree to abide by the obligations set forth by the respective authors and publishers.
Publications and Technical Reports
Adaptive Oversampling Method for Imbalanced Data Classification
Under review.
Topic Trend Detection in Text Collections using Latent Dirichlet Allocation
Under review.
Learning on the Border: Active Learning in Imbalanced Data Classification
Seyda Ertekin, Jian Huang, Léon Bottou, C. Lee Giles
In Proc. of ACM 16th Conference on Information and Knowledge Management (CIKM 2007), Lisboa, Portugal, November 2007.
*Also an NEC Laboratories Technical Report, May 2007.
K-SVMeans: A Hybrid Clustering Algorithm for Multi-Type Interrelated Datasets
Levent Bolelli, Seyda Ertekin, Ding Zhou, C. Lee Giles
In Proc. of IEEE/WIC/ACM International Conference on Web Intelligence (WI 2007), San Jose, California, November 2007.
Active Learning for Class Imbalance Problem
Seyda Ertekin, Jian Huang, C. Lee Giles
To appear in Proc. of the 30th International Conference on Research and Development in Information Retrieval (ACM SIGIR 2007), Amsterdam, Netherlands, July 2007 (short paper).
*SIGIR Travel Award Winner
A Clustering Method for Web Data with Multi-Type Interrelated Components
Levent Bolelli, Seyda Ertekin, Ding Zhou, C. Lee Giles
In Proc. of 16th International World Wide Web Conference (WWW 2007), Banff, Alberta, Canada, May 2007, published by ACM (short paper).
Efficient Multiclass Boosting Classification with Active Learning
Jian Huang, Seyda Ertekin, Yang Song, Hongyuan Zha, C. Lee Giles
In Proc. of 2007 SIAM 7th International Conference on Data Mining (SDM 2007), Minneapolis, Minnesota, April 2007.
*IBM Research Travel Award Winner
Efficient Name Disambiguation for Large Scale Datasets
Jian Huang, Seyda Ertekin, C. Lee Giles
In Proc. of 10th European Conference on Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD 2006), pp. 536-544, Berlin, Germany, September 2006.
Published in Lecture Notes in Computer Science, volume 4213/2006.
*The poster of this work also won the Best Poster Award at the Greater NY Area DB/IR Day, held at New York University (NYU), New York, in October 2006.
*Media coverage of this paper appeared under the title: The New System Solves The "Who is J. Smith?" Puzzle.
Document Clustering Using Sparse Citation Graph Analysis
Levent Bolelli, Seyda Ertekin, C. Lee Giles
In Proc. of 10th European Conference on Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD 2006), pp. 30-41, Berlin, Germany, September 2006.
Published in Lecture Notes in Computer Science, volume 4213/2006.
Fast Author Name Disambiguation in Citeseer
Jian Huang, Seyda Ertekin, C. Lee Giles
IST Technical Report No. 0019. The Pennsylvania State University, University Park, September 2006.
Longer version of ECML/PKDD 2006 paper.
*Media coverage of this paper appeared under the title: The New System Solves The "Who is J. Smith?" Puzzle.
Fast Classification with Support Vector Machines 
Seyda Ertekin, Léon Bottou, C. Lee Giles
Grace Hopper Celebration of Women in Computing, San Diego, October 2006.
The extended abstract of the poster was selected for the ACM Student Research Competition (ACM-SRC) at the same conference.
Efficient Support Vector Learning for Large Scale Datasets
Seyda Ertekin
In Proc. of Doctoral Consortium at Grace Hopper Celebration of Women in Computing, San Diego, October 2006.
Active Learning, Loss Function Convexity, and Support Vectors
Léon Bottou, Ronan Collobert, Seyda Ertekin, Jason Weston
NIPS Workshop on Value of Information in Inference, Learning and Decision-Making, Vancouver, Canada, December 2005.
Fast Kernel Classifiers with Online and Active Learning
Antoine Bordes, Seyda Ertekin, Jason Weston, Léon Bottou
Journal of Machine Learning Research (JMLR), vol. 6, pp. 1579-1619, 2005.
The source code of LASVM (an online SVM algorithm) is provided at http://leon.bottou.com/projects/lasvm
Can Computers Learn Faster? 
Seyda Ertekin
In Proc. of Second COE Research Symposium, The Pennsylvania State University, University Park, April 2005.
Comparative Study of Representation of Web Pages in Automatic Text Categorization
Seyda Ertekin, C. Lee Giles
IIS-IST Technical Report-62003, The Pennsylvania State University, University Park, June 2003.
The Shape of the Web and its Implications for Searching the Web
Kemal Efe, Vijay Raghavan, C. Henry Chu, Adrienne L. Broadwater, Levent Bolelli, Seyda Ertekin
In Proc. of International Conference on Advances in Infrastructure for Electronic Business, Science, and Education on the Internet, Italy, July 2000.
Refereed Poster Presentations
Efficient Name Disambiguation for Large Scale Datasets
Jian Huang, Seyda Ertekin, C. Lee Giles
Greater NY Area DB/IR Day, New York University (NYU), New York, October 2006.
Received the Best Poster Award.
Fast Online Classification with SVMs
Seyda Ertekin, Léon Bottou, C. Lee Giles
Workshop for Women in Machine Learning, San Diego, October 2006.
Also presented as a spotlight talk at the workshop.
Fast Classification with Online Support Vector Machines
Seyda Ertekin, Léon Bottou, C. Lee Giles
National Conference on Artificial Intelligence (AAAI) Doctoral Consortium, Boston, July 2006.
Scalable Online Support Vector Machines
Seyda Ertekin, Léon Bottou, C. Lee Giles
North East Student Colloquium on Artificial Intelligence (NESCAI), Cornell University, Ithaca, NY, April 2006.
Towards Fast Machine Learning Algorithms for Automatic Classification
Seyda Ertekin, Léon Bottou, C. Lee Giles
In Proc. of Turkish-American Scientists and Scholars Association Annual Meeting, Philadelphia, March 2006.
Received the Young Scientist Grant award from TASSA.