Accepted Papers

PAKDD 2008 received 312 submissions from 34 countries and regions in Asia, Australasia, North America, South America, Europe, and Africa. Each paper was rigorously reviewed by at least two program committee members, discussed by the reviewers under the supervision of an area chair, and judged by the program committee chairs. Where the reviewers disagreed substantially, the area chair and/or the program committee co-chairs provided an additional review. Of the 312 submissions, only 37 (11.9%) were accepted as long papers, 40 (12.8%) as regular papers, and 36 (11.5%) as short papers.

Long Papers
Pr226 Minimum Variance Associations --- Discovering Relationships in Numerical Data
  Szymon Jaroszewicz
Pr228 Semi-Supervised Local Fisher Discriminant Analysis for Dimensionality Reduction
  Masashi Sugiyama, Tsuyoshi Ide, Shinichi Nakajima, and Jun Sese
Pr243 Ambiguous Frequent Itemset Mining and Polynomial Delay Enumeration
  Takeaki Uno and Hiroki Arimura
Pr244 An Efficient Algorithm for Finding Similar Short Substrings from Large Scale String Data
  Takeaki Uno
Pr260 A Mixture Model for Expert Finding
  Jing Zhang, Jie Tang, Liu Liu, and Juanzi Li
Pr267 Unusual Pattern Detection in High Dimensions
  Minh Nguyen, Leo Mark, and Edward Omiecinski
Pr269 Handling Numeric Attributes in Hoeffding Trees
  Bernhard Pfahringer, Geoff Holmes, and Richard Kirkby
Pr278 On Privacy in Time Series Data Mining
  Ye Zhu, Yongjian Fu, and Huirong Fu
Pr286 SEM: Mining Spatial Events from the Web
  Kaifeng Xu, Rui Li, Shenghua Bao, Dingyi Han, and Yong Yu
Pr291 Protecting Privacy in Incremental Maintenance for Distributed Association Rule Mining
  Wai Kit Wong, David Wai Lok Cheung, Edward Hung, and Huan Liu
Pr292 Large-scale k-means Clustering with User-Centric Privacy Preservation
  Jun Sakuma and Shigenobu Kobayashi
Pr293 ANEMI: An Adaptive Neighborhood Expectation-Maximization Algorithm with Spatial Augmented Initialization
  Tianming Hu, Hui Xiong, Xueqing Gong, and Sam Yuan Sung
Pr295 Mining Correlated Subgraphs in Graph Databases
  Tomonobu Ozaki and Takenao Ohkawa
Pr297 A Decremental Approach for Mining Frequent Itemsets from Uncertain Data
  Chun-Kit Chui and Ben Kao
Pr299 Person Name Disambiguation in Web Pages using Social Network, Compound Words and Latent Topics
  Shingo Ono, Issei Sato, Minoru Yoshida, and Hiroshi Nakagawa
Pr309 On Addressing Accuracy Concerns in Privacy Preserving Association Rule Mining
  Ling Guo, Songtao Guo, and Xintao Wu
Pr319 Towards Region Discovery in Spatial Datasets
  Wei Ding, Rachsuda Jiamthapthaksin, Rachana Parmar, Dan Jiang, Tomasz Stepinski, and Christoph Eick
Pr321 LCM over ZBDDs: Fast Generation of Very Large-Scale Frequent Itemsets Using a Compact Graph-Based Representation
  Shin-ichi Minato, Takeaki Uno, and Hiroki Arimura
Pr326 Privacy-Preserving Linear Fisher Discriminant Analysis
  Shuguo Han and Wee Keong Ng
Pr344 Feature Selection by Nonparametric Bayes Error Minimization
  Shuang-Hong Yang and Bao-Gang Hu
Pr349 Accurate and Efficient Retrieval of Multimedia Time Series Data under Uniform Scaling and Time Warping
  Waiyawuth Euachongprasit and Chotirat Ann Ratanamahatana
Pr350 Extreme Support Vector Machine
  Qiuge Liu, Qing He, and Zhongzhi Shi
Pr369 Unsupervised Change Analysis using Supervised Learning
  Shohei Hido, Tsuyoshi Ide, Hisashi Kashima, Harunobu Kubo, and Hirofumi Matsuzawa
Pr370 Multi-Class Named Entity Recognition via Bootstrapping with Dependency Tree-based Patterns
  Van Dang and Akiko Aizawa
Pr389 A Decomposition Algorithm for Learning Bayesian Network Structures from Data
  Yifeng Zeng and Jorge Cordero Hernandez
Pr398 Scaling Record Linkage to Non-Uniform Distributed Class Sizes
  Steffen Rendle and Lars Schmidt-Thieme
Pr415 A Framework for Modeling Positive Class Expansion with Single Snapshot
  Yang Yu and Zhi-Hua Zhou
Pr432 Mining Bulletin Board Systems Using Community Generation
  Ming Li, Zhongfei (Mark) Zhang, and Zhi-Hua Zhou
Pr435 BOAI: Fast Alternating Decision Tree Induction based on Bottom-up Evaluation
  Bishan Yang, Tengjiao Wang, Dongqing Yang, and Lei Chang
Pr439 Feature Construction based on Closedness Properties is not that Simple
  Dominique Gay, Nazha Selmaoui, and Jean-Francois Boulicaut
Pr440 Characteristic-based Descriptors for Motion Sequence Recognition
  Liang Wang, Xiaozhe Wang, Christopher Leckie, and Ramamohanarao Kotagiri
Pr447 An Efficient Unordered Tree Kernel and its Application to Glycan Classification
  Tetsuji Kuboyama, Kouichi Hirata, and Kiyoko F. Aoki-Kinoshita
Pr448 Mining Quality-Aware Subspace Clusters
  Ying-Ju Chen, Yi-Hong Chu, and Ming-Syan Chen
Pr476 Learning Rules for Multiple Target Classification
  Bernard Zenko and Saso Dzeroski
Pr486 SubClass: Classification of Multidimensional Noisy Data Using Subspace Clusters
  Ira Assent, Ralph Krieger, Petra Welter, Jorg Herbers, and Thomas Seidl
Pr508 A Minimal Description Length Scheme for Polynomial Regression
  Aleksandar Peckov, Saso Dzeroski, and Ljupco Todorovski
Pr513 Generation of Globally Relevant Continuous Features for Classification
  Sylvain Letourneau, Stan Matwin, and A. Fazel Famili

Regular Papers
Pr209 Using Supervised and Unsupervised Techniques to Determine Groups of Patients with Different Continuity of Care
  Eu-Gene Siew, Leonid Churilov, Kate A. Smith-Miles, and Joachim P. Sturmberg
Pr212 Designing a System for a Process Parameter Determined through Modified PSO and Fuzzy Neural Network
  Jui-Tsung Wong, Kuei-Hsien Chen, and Chwen-Tzeng Su
Pr214 Tradeoff Analysis of Different Markov Blanket Local Learning Approaches
  Shunkai Fu and Michel C. Desmarais
Pr221 Maintaining Optimal Multi-way Splits for Numerical Attributes in Data Streams
  Tapio Elomaa and Petri Lehtinen
Pr225 A Clustering-Oriented Star Coordinate Translation Method for Reliable Clustering Parameterization
  Chieh-Yuan Tsai and Chuang-Cheng Chiu
Pr231 Forecasting Urban Air Pollution Using HMM-fuzzy Model
  M. Maruf Hossain, Md. Rafiul Hassan, and Michael Kirley
Pr239 Automatic Training Example Selection for Unsupervised Record Linkage
  Peter Christen
Pr240 Data-Aware Clustering Hierarchy for Wireless Sensor Networks
  Xiaochen Wu, Peng Wang, Wei Wang, and Baile Shi
Pr250 Tracking Topic Evolution in On-line Postings: 2006 IBM Innovation Jam Data
  Mei Kobayashi and Raylene Yung
Pr256 Relational Pattern Mining based on Equivalent Classes of Properties Extracted from Samples
  Nobuhiro Inuzuka, Jun-ichi Motoyama, Shinpei Urazawa, and Tomofumi Nakano
Pr264 Fast On-line Estimation of the Joint Probability Distribution
  Jan Peter Patist
Pr266 Improving the Robustness to Outliers of Mixtures of Probabilistic PCAs
  Nicolas Delannay, Cedric Archambeau, and Michel Verleysen
Pr272 Term Committee Based Event Identification Within News Topics
  Kuo Zhang, JuanZi Li, Gang Wu, and KeHong Wang
Pr279 Connectivity Based Stream Clustering Using Localised Density Exemplars
  Sebastian Luhr and Mihai Lazarescu
Pr284 A Creditable Subspace Labeling Method based on D-S Evidence Theory
  Yu Zong, Xianchao Zhang, He Jiang, and Mingchu Li
Pr307 Mining a Complete Set of both Positive and Negative Association Rules from Large Databases
  Hao Wang, Xing Zhang, and Guoqing Chen
Pr327 Concept Lattice-Based Mutation Control for Reactive Motifs Discovery
  Kitsana Waiyamai, Peera Liewlom, Thanapat Kangkachit, and Thanawin Rakthanmanon
Pr330 Bootstrap based Pattern Selection for Support Vector Regression
  Dongil Kim and Sungzoon Cho
Pr333 A Simple Characterization on Serially Constructible Episodes
  Takashi Katoh and Kouichi Hirata
Pr339 A More Topologically Stable Locally Linear Embedding Algorithm Based on R*-Tree
  Tian Xia, Jintao Li, Yongdong Zhang, and Sheng Tang
Pr342 A Comparison of Different Off-centered Entropies to Deal with Class Imbalance for Decision Trees
  Philippe Lenca, Stéphane Lallich, Thanh-Nghi Do, and Nguyen-Khang Pham
Pr343 Sparse Kernel-based Feature Weighting
  Shuang-Hong Yang, Yu-Jiu Yang, and Bao-Gang Hu
Pr346 Locally Linear Online Mapping for Mining Low-Dimensional Data Manifolds
  Huicheng Zheng, Wei Shen, Qionghai Dai, and Sanqing Hu
Pr373 Learning User Purchase Intent From User-Centric Data
  Rajan Lukose, Jiye Li, Jing Zhou, and Satyanarayana Raju Penmetsa
Pr374 Applying Latent Semantic Indexing in Frequent Itemset Mining for Document Relation Discovery
  Thanaruk Theeramunkong, Kritsada Sriphaew, and Manabu Okumura
Pr377 Efficient Mining of High Utility Itemsets from Large Datasets
  Alva Erwin, Raj P. Gopalan, and Narasimaha Achuthan
Pr379 Constrained Clustering for Gene Expression Data Mining
  Vincent S. Tseng, Lien-Chin Chen, and Ching-Pin Kao
Pr385 Exploratory Hot Spot Profile Analysis using Interactive Visual Drill-Down Self-Organizing Maps
  Denny, Graham Williams, and Peter Christen
Pr394 G-TREACLE: A New Grid-based and Tree-alike Pattern Clustering Technique for Large Databases
  Cheng-Fa Tsai and Chia-Chen Yen
Pr413 FIsViz: A Frequent Itemset Visualizer
  Carson Kai-Sang Leung, Pourang P. Irani, and Christopher L. Carmichael
Pr417 A Tree-Based Approach for Frequent Pattern Mining from Uncertain Data
  Carson Kai-Sang Leung, Mark Anthony F. Mateo, and Dale A. Brajczuk
Pr421 Evaluating Standard Techniques for Implicit Diversity
  Ulf Johansson, Tuve Lofstrom, and Lars Niklasson
Pr422 Local Projection in Jumping Emerging Patterns Discovery in Transaction Databases
  Pawel Terlecki and Krzysztof Walczak
Pr428 Exploiting Propositionalization based on Random Relational Rules for Semi-Supervised Learning
  Grant Anderson and Bernhard Pfahringer
Pr433 Fast k Most Similar Neighbor Classifier for Mixed Data based on an Approximation and Elimination algorithm
  Selene Hernández Rodríguez, J. Ariel Carrasco-Ochoa, and J. Fco. Martínez-Trinidad
Pr438 Query Expansion for the Language Modelling Framework using the Naive Bayes Assumption
  Laurence Park and Kotagiri Ramamohanarao
Pr451 PAID: Packet Analysis for Anomaly Intrusion Detection
  Kuo-Chen Lee, Jason Chang, and Ming-Syan Chen
Pr464 On Discrete Data Modeling
  Nizar Bouguila and Walid Elguebaly
Pr472 Entity Network Prediction using Multitype Topic Models
  Hitohiro Shiozaki, Koji Eguchi, and Takenao Ohkawa
Pr519 Analyzing PETs on Imbalanced Datasets when Training and Testing Class Distributions Differ
  David Cieslak and Nitesh Chawla

Short Papers
Pr219 Rule Extraction with Rough-Fuzzy Hybridization Method
  Nan-Chen Hsieh
Pr229 A New Credit Scoring Method Based on Rough Sets and Decision Tree
  XiYue Zhou, DeFu Zhang, and Yi Jiang
Pr237 Efficient Mining of Minimal Distinguishing Subgraph Patterns from Graph Databases
  Zhiping Zeng, Jianyong Wang, and Lizhu Zhou
Pr258 Combined Association Rule Mining
  Huaifeng Zhang, Yanchang Zhao, Longbing Cao, and Chengqi Zhang
Pr262 R-map: Mapping Categorical Data for Clustering and Visualization based On Reference Sets
  Zhi-Yong Shen, Ming Li, Yi-Dong Shen, and Jun Sun
Pr275 Mining Changes in Patent Trends for Competitive Intelligence
  Meng-Jung Shih, Duen-Ren Liu, and Ming-Li Hsu
Pr277 Clustering Transaction Datasets Using Seeds
  Yun Sing Koh and Russel Pears
Pr289 Forward Semi-Supervised Feature Selection
  Jiangtao Ren, Zhengyuan Qiu, Wei Fan, Hong Cheng, and Philip S. Yu
Pr301 Discovering New Orders of the Chemical Elements through Genetic Algorithms
  Alexandre Blansché and Shuichi Iwata
Pr302 Combining Context and Existing Knowledge When Recognizing Biological Entities -- Early results
  Mika Timonen and Antti Pesonen
Pr312 What is Frequent in a Single Graph
  Bjoern Bringmann and Siegfried Nijssen
Pr317 Enriching WordNet with Folksonomies
  Hao Zheng, Xian Wu, and Yong Yu
Pr322 Detecting Near-Duplicates in Large-Scale Short Text Databases
  Caichun Gong, Yulan Huang, Xueqi Cheng, and Shuo Bai
Pr323 Text Categorization of Multilingual Web Pages on Specific Domain
  Jicheng Liu and Chunyan Liang
Pr338 Active Learning with Misclassification Sampling Using Diverse Ensembles Enhanced by Unlabeled Instances
  Jun Long, Jianping Yin, En Zhu, and Wentao Zhao
Pr345 Fighting WebSpam: Detecting Spam on the Graph via Content and Link Features
  Yu-Jiu Yang, Shuang-Hong Yang, and Bao-Gang Hu
Pr352 A New Model for Image Annotation
  Sanparith Marukatat
Pr356 Structure-based Hierarchical Transformations for Interactive Visual Exploration of Social Networks
  Lisa Singh, Mitchell Beard, Brian Gopalan, and Gregory Nelson
Pr364 Mining Non-Coincidental Rules Without A User Defined Support Threshold
  Yun Sing Koh
Pr365 I/O Scalable Bregman Co-clustering
  Kuo-Wei Hsu, Arindam Banerjee, and Jaideep Srivastava
Pr368 Efficient Joint Clustering Algorithms in Optimization and Geography Domains
  Chia-Hao Lo and Wen-Chih Peng
Pr372 Using Ontology-Based User Preferences to Aggregate Rank Lists in Web Search
  Lin Li, Zhenglu Yang, and Masaru Kitsuregawa
Pr384 Automatic Extraction of Basis Expressions that Indicate Economic Trends
  Hiroki Sakaji, Hiroyuki Sakai, and Shigeru Masuyama
Pr388 Seeing Several Stars: a Rating Inference Task for a Document Containing Several Evaluation Criteria
  Kazutaka Shimada and Tsutomu Endo
Pr396 Semantic Video Annotation by Mining Association Patterns from Visual and Speech Features
  Vincent S. Tseng, Ja-Hwung Su, Jhih-Hong Huang, and Chih-Jen Chen
Pr399 A Cluster-Based Genetic-Fuzzy Mining Approach for Items with Multiple Minimum Supports
  Chun-Hao Chen, Tzung-Pei Hong, and Vincent S. Tseng
Pr406 A Framework for Discovering Spatio-Temporal Cohesive Networks
  Jin Soung Yoo and Joengmin Hwang
Pr408 Cell-based Outlier Detection Algorithm: A Fast Outlier Detection Algorithm for Large Datasets
  You Wan and Fuling Bian
Pr419 A New Framework for Taxonomy Discovery from Text
  Ahmad El Sayed, Hakim Hacid, and Djamel Zighed
Pr420 Jumping Emerging Patterns with Occurrence Count in Image Classification
  Lukasz Kobylinski and Krzysztof Walczak
Pr455 Customer Churn Time Prediction in Mobile Telecommunication Industry using Ordinal Regression
  Rupesh Gopal and Saroj Meher
Pr468 Unmixed Spectrum Clustering for Template Composition in Lung Sound Classification
  Tomonari Masada, Senya Kiyasu, and Sueharu Miyahara
Pr473 A Selective Classifier for Incomplete Data
  Jingnian Chen, Houkuan Huang, Fengzhan Tian, and Shengfeng Tian
Pr489 The Application of Echo State Network in Stock Data Mining
  Xiaowei Lin, Zehong Yang, and Yixu Song
Pr491 CP-tree: A Tree Structure for Single-Pass Frequent Pattern Mining
  Syed Khairuzzaman Tanbeer, Chowdhury Farhan Ahmed, Byeong-Soo Jeong, and Young-Koo Lee
Pr511 Analyzing the Propagation of Influence and Concept Evolution in Enterprise Social Networks Through Centrality and Latent Semantic Analysis
  Weizhong Zhu, Chaomei Chen, and Robert B. Allen