Yuni Xia is an Associate Professor
in the Computer and Information Science Department at Indiana
University - Purdue University Indianapolis (IUPUI). She
received her B.S. in Computer Science from Huazhong University of
Science and Technology in China, and her M.S. and Ph.D. in Computer
Science from Purdue University. She joined IUPUI in 2005. Before
that, she worked as a research intern at the IBM T.J. Watson
Research Center. Xia's research is on data mining and
databases, focusing on mining and management of uncertain data and
constantly evolving data such as sensor data, biomedical
data and moving object data. She also works on data query,
retrieval, management and mining in data-intensive applications,
and managing uncertainty in the decision support process.
Research
Data Mining: Uncertain
Data Mining, Data Stream Mining, Biomedical Informatics
Databases: Constantly
Evolving Data Management, Sensor and Moving Object Databases, Data
Uncertainty Management
Research Projects (we gratefully acknowledge
the support of funding agencies):
Health-Terrain: Visualizing Large Scale Health
Data, Supported by US Department of the Army, Co-PI (PI: Fang),
2013-2015.
This project aims to
design a framework and new techniques for mining and visualizing
large-scale health data.
Development of Key Technologies for Big Data Analysis and
Management Software Based on Next Generation Memory,
Supported by ETRI, Co-PI
(Institute PI: Lee), 2012-2017.
This collaborative project seeks to
develop a big-data main-memory database management system and a
distributed stream processing system using hardware
acceleration techniques, including FPGAs (programmable /
reconfigurable chips) and GPGPUs (general-purpose graphics
processing units).
Large
Scale Sensor Stream Analysis and Mining for Geriatric Care, Supported by IBM.
This project aims to design and
develop a real-time distributed sensor stream monitoring and
analysis system for geriatric care. This enables effective
home-based continuous geriatric care, which not only
saves costs but also improves the quality of life of the
elderly and their families.
DisProt Database: A Central
Repository of Information on Intrinsically Disordered Proteins,
Supported by NSF, Co-PI (PI: Dunker), 2009-2012.
The goal of this project is to
fully develop DisProt, a database that provides an essential
depository of information about intrinsically disordered
proteins (IDPs). DisProt will be not only a collection of data
on intrinsically disordered proteins and their functions, but
also a unique research tool to conduct various computational
studies on these proteins and to help design better research
strategies for studying individual IDPs in the laboratory.
DisProt is expected to see widespread use, both
for carrying out bioinformatics experiments and
by the broader community involved in understanding cell and
molecular biology.
TrafficAnalyzer:
A Real-time Traffic Stream Processing and Analyzing System,
Supported by IBM.
Modern traffic monitoring systems
are required to process and analyze petabit-scale continuous
data streams in real time. In this project, we propose to
design and develop a real-time traffic stream processing and
analysis system. The most important feature of
TrafficAnalyzer is its real-time performance. Results
need to be produced with virtually zero latency,
because in a traffic monitoring system, real-time response is
crucial for reducing accident rates and smoothing traffic flow.
TrafficAnalyzer must support sophisticated time-windowed
processing operations, since streaming data changes continually,
often at high rates. These operations should be executed in a
way that produces results incrementally as new data arrives,
since the complete data set is never available.
TrafficAnalyzer also carefully manages
historical data, as it needs to compare and combine "present"
data with the "past" to study how traffic flow changes over
time. TrafficAnalyzer is also resilient to inaccuracy and
uncertainty in the data streams, because inherent variations,
losses, or reordering can cause data to arrive
out of order or with variable delays.
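The incremental, time-windowed processing described above can be illustrated with a minimal Python sketch. It is not the TrafficAnalyzer implementation; the class name SlidingWindowAverage, the 10-second window, and the speed readings are all hypothetical, and the sketch assumes timestamps arrive in non-decreasing order (it does not handle the reordering mentioned above).

```python
from collections import deque


class SlidingWindowAverage:
    """Average of the readings seen in the last `window_seconds` (hypothetical helper)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.readings = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0         # running sum of the values currently in the window

    def add(self, timestamp, value):
        # Assumption: timestamps are non-decreasing (no out-of-order arrivals).
        self.readings.append((timestamp, value))
        self.total += value
        # Evict readings that have fallen out of the window, updating the sum
        # incrementally instead of rescanning the whole window.
        cutoff = timestamp - self.window_seconds
        while self.readings[0][0] <= cutoff:
            _, old_value = self.readings.popleft()
            self.total -= old_value
        return self.total / len(self.readings)


# Example: average vehicle speed over a 10-second window.
window = SlidingWindowAverage(window_seconds=10)
averages = [window.add(t, v) for t, v in [(0, 60.0), (5, 40.0), (12, 20.0), (15, 50.0)]]
```

Each call to add does a constant amount of work per arriving or expiring reading, which is the property that lets results be updated as data arrives rather than recomputed over the full history.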
Development of SYMBIOTE: A
Reconfigurable Logic Assisted Data Stream Management System for
Multimedia Sensor Networks, Supported by NSF, Co-PI (PI: Lee), 2008-2011.
Numerous emerging applications
require real-time processing of high bandwidth multimedia data
streams. In this project, we propose a novel class of data
stream management systems (DSMS) called Reconfigurable Logic
Assisted DSMS (RLADSMS). The prototype RLADSMS, called SYMBIOTE,
will provide one of the first comprehensive demonstrations of
reconfigurable logic coprocessors used as data stream
accelerators. This project will investigate key
issues such as data models, query languages, hardware DSMS
operators, and the corresponding cost models of query execution,
which consider the hardware complexity of database operators,
the run-time complexity of hardware and software operators,
interconnect latencies, bandwidth, and resource allocation, as
well as optimization techniques for this new class of data stream
management systems.
Invention of a Consumer-Side
Geriatric Health Care Knowledge Management and Decision Support
System, Supported by 21st Century Research and Development
Fund, State of Indiana, Co-PI (Institute PI: Palakal),
2008-2010.
This project proposes to build an
innovative Knowledge Management system unique in the Geriatric
Care Management Industry. This system will accelerate the
adoption of standards of care and accumulate
knowledge from current Social Science, Psychology, and Health
disciplines. It will also build a basis, comparable to the
Health Care Industry model, for evidence-based outcomes
validation.
Publications
Chandima Hewa Nadungodage, Jaehwan John Lee, Yuni Xia,
Miyoung Lee, Myungcheol Lee, GPU-based Memory Efficient
Recommendation System for Big Data Applications, the
International Conference on GPU Technology, 2013.
Chandima Hewa Nadungodage, Yuni Xia, Jaehwan John Lee,
Yi-cheng Tu, Hyper-Structure Mining of Frequent Patterns in
Uncertain Data Streams, Journal of Knowledge and Information
Systems (KAIS), 2012.
Shaun Grannis, Brian Dixon, Yuni Xia, Jianmin Wu, Using
Information Entropy to Monitor Chief Complaint Characteristics
and Quality, the 2012 International Society for Disease
Surveillance Conference.
Chandima H. Nadungodage, Yuni Xia, Pranav S. Vaidya, Yu
Chen, J. Lee, “Online Multidimensional Regression Analysis on
Concept-drifting Data Streams,” International Journal of Data
Mining, Modeling and Management (IJDMMM), Accepted.
Omkar Tilak, Andrew Hoblitzell, Snehasis Mukhopadhyay, Qian
You, Shiaofen Fang, Yuni Xia, Joseph Bidwell, Multi-Level Text
Mining for Bone Biology, Concurrency and Computation: Practice
and Experience, 23(17): 2355-2364 , 2011
Yu Chen, Pranav Vaidya, Jaehwan John Lee, Chandima Hewa
Nadungodage, Yuni Xia, Renfa Li, Qiang Wu, A New
Hardware/Software Partitioning Methodology Combining Search
Space Smoothing and Discrete Particle Swarm Optimization,
International Conference on Engineering of Reconfigurable
Systems and Algorithms (ERSA), 2011.
Biao Qin, Yuni Xia, Rakesh Sathyesh, Jiaqi Ge, Sunil
Prabhakar, Classify
Uncertain Data with Decision Tree, Demo, International
Conference on Database Systems for Advanced Applications
(DASFAA) 2011.
Sandeep Raghuram, Yuni Xia, Jiaqi Ge, Mathew Palakal,
Josette Jones, Dave Pecenka, Eric Tinsley, Jean Bandos, and
Jerry Geesaman. AutoBayesian:
Developing Bayesian Networks Based on Text Mining, Demo,
International Conference on Database Systems for Advanced
Applications (DASFAA) 2011. (Best Demo Award)
Biao Qin, Yuni Xia, Sunil Prabhakar, Rule Induction for
Uncertain Data, Knowledge and Information Systems (KAIS),
23(17): 2355-2364, 2011
Pranav Vaidya, Y. Chen, Jaehwan John Lee, Chandima Hewa
Nadungodage, and Yuni Xia, "A General
Purpose FPGA Data Filter For Data Stream Processing",
International Conference on Engineering of Reconfigurable
Systems and Algorithms (ERSA), pp. 247-250, 2010.
Jiaqi Ge, Yuni Xia, Yicheng Tu, A
Discretization Algorithm for Uncertain Data, the 21st
International Conference on Database and Expert Systems
Applications (DEXA), 2010. (Acceptance Rate: 22.7%)
Andrew Hoblitzell, Snehasis Mukhopadhyay, Qian You, Shiaofen
Fang, Yuni Xia, Joseph Bidwell, Text Mining
for Bone Biology, Proceeding of the Workshop on Emerging
Computational Methods for the Life Sciences, 2010.
Jiaqi Ge, Yuni Xia, Chandima Nadungodage, Classify Uncertain Data
with Neural Network, the 14th Pacific-Asia
Conference on Knowledge Discovery and Data Mining (PAKDD),
2010. (Acceptance Rate: 10.2%)
Sandeep Raghuram, Yuni Xia, Mathew Palakal, Josette Jones,
Dave Pecenka, Eric Tinsley, Jean Bandos, and Jerry Geesaman.
"Bridging Text Mining and Bayesian Networks", Proc. of the
Workshop on Intelligent Biomedical Information Systems (IBIS),
2009.
Biao Qin, Yuni Xia, Fang Li, "DTU: A Decision Tree for
Classifying Uncertain Data", the Pacific-Asia Conference
on Knowledge Discovery and Data Mining (PAKDD), 2009
(Acceptance Rate: 11.5%).
Biao Qin, Yuni Xia, Sunil Prabhakar, Yicheng Tu, "A Rule-Based
Classification Algorithm for Uncertain Data", the IEEE
Workshop on Management and Mining of Uncertain Data (MOUND), in
conjunction with the International Conference on Data Engineering,
2009.
Jiangang Liu, Andrew Campen, Shuguang Huang, Sheng-Bin Peng,
Xiang Ye, Mathew Palakal, A. Keith Dunker, Yuni Xia and Shuyu
Li, "Identification of a gene signature in cell cycle pathway
for breast cancer prognosis using gene expression profiling
data", BMC Medical Genomics, 2008, 1:39.
Yuni Xia, Andrew Campen, Dan Rigsby, Ying Guo, Xingdong
Feng, Eric Su, Mathew Palakal, Shuyu Li, "DGEM - a Microarray
Gene Expression Database for Primary Human Disease Tissues",
Molecular Diagnosis and Therapy, Issue 3, 2007.
Yuni Xia, Yicheng Tu, Mikhail Atallah, Sunil Prabhakar,
"Reducing Data Redundancy in Location-based Services", the
International Conference on Geosensor Networks (GeoSensor),
pp. 30-35, Boston, USA, 2006.
Reynold Cheng, Sarvjeet Singh, Sunil Prabhakar, Rahul Shah,
Jeffrey Scott Vitter, Yuni Xia, "Efficient Join Processing
over Uncertain Data", the ACM 15th Conference on Information
and Knowledge Management (CIKM), pp. 738-747, Arlington, USA,
2006. (Acceptance Rate: 15%)
Yicheng Tu, Mohamed Hefeeda, Yuni Xia, Sunil Prabhakar, Song
Liu, "Control-Based
Quality Adaptation in Data Stream Management Systems",
the International Conference on Database and Expert Systems
Applications (DEXA), pp. 746-755, Copenhagen, Denmark, 2005.
(Acceptance Rate: 23%)
Yuni Xia, Sunil Prabhakar, Shan Lei, Reynold Cheng, Rahul
Shah, "Indexing Continuously Changing Data with Mean Variance
Tree", the 20th ACM Symposium on Applied Computing (SAC), pp.
1125 - 1132, Santa Fe, New Mexico, USA, 2005. (Acceptance
Rate: 30%)
Reynold Cheng, Yuni Xia, Sunil Prabhakar, Rahul Shah, "Change Tolerant
Indexing for Constantly Evolving Data", the
International Conference on Data Engineering (ICDE), pp.
391-402, Tokyo, Japan, 2005. (Acceptance Rate: 13%)
Yuni Xia, Sunil Prabhakar, "Efficient VNG Indexing in
Location-aware Services", the International Workshop on Mobile
and Distributed Computing (MDC), pp. 414-419, Providence,
Rhode Island, USA, 2003.
Yuni Xia, Sunil Prabhakar, "Q+Rtree: Efficient
Indexing for Moving Object Databases", the 8th
International Conference on Database Systems for Advanced
Applications (DASFAA), pp. 175-182, Kyoto, Japan, 2003.
(Acceptance Rate: 25%)
Yuni Xia, Jonathon Munson, David Wood, Alan Cole,
"Location-based Service System (LBS) Analysis and
Design", Handbook of Research on Modern Systems
Analysis and Design Technologies and Applications, ISBN:
978-1-59904-887-1; 698 pp, 2008.
Meeta Pradhan and Yuni Xia, "Bioterrorism and Biosecurity",
Handbook of Research on Information Security and Assurance,
ISBN: 978-1-59904-855-0, 586 pp, 2008.
Sunil Prabhakar, Dmitri V. Kalashnikov, and Yuni Xia, "Query
Indexing and Velocity Constrained Indexing", Encyclopedia of
GIS, Springer Science, 2008.
Teaching
Please log into Oncourse for
lecture notes, readings, assignments, projects, etc.
CSCI340: Discrete Computational Structures
CSCI441: Client Server Databases
CSCI443: Database Systems
CSCI481: Data Mining
CSCI541: Database Management Systems
CSCI573: Data Mining
CSCI590: Advanced Database Systems
Professional Services
Program Committee:
The International Conference
on Collaborative Computing (CollaborateCom), 2010 to 2013
The IEEE International
Conference on Computer and Information Technology (CIT), 2010,
2011
The International Conference
on Frontier Computing (FC), 2010
The International Workshop
on Smart Homes for Tele-Health (SmarTel), 2010
The IEEE 12th International
Conference on Computational Science and Engineering (CSE), 2009
The International Workshop
on Smart Homes for Tele-Health (SmarTel), 2009
The International Workshop
on Information Fusion and Dissemination in Wireless Sensor
Networks (SensorFusion), 2009
The International Conference
on Intelligent Pervasive Computing (IPC), 2008
The IEEE 11th International
Conference on Computational Science and Engineering (CSE), 2008
The IEEE 21st International
Conference on Advanced Information Networking and Applications
(AINA), 2007
The IEEE/ACS 5th
International Conference on Computer Systems and Applications
(AICCSA), 2007
The Third International
Conference on Intelligent Environments (IE), 2007
The International
Workshop on Information Fusion and Dissemination in Wireless
Sensor Networks (SensorFusion), 2007
The International
Workshop on Knowledge Management and Discovery for Ubiquitous and
Pervasive Applications (KUPA), 2007
Local Chair:
ACM SIGMOD/PODS Conference, 2010
Journal Review:
IEEE Transactions on
Knowledge and Data Engineering
IEEE Transactions on
Parallel and Distributed Systems
ACM Transactions on Database
Systems
ACM Transactions on Knowledge
Discovery from Data
Knowledge and Information
Systems
Data and Knowledge
Engineering
Information Systems
Information Sciences
The International
Journal of Telemedicine and Applications
The Information Fusion
Journal
The Journal of Systems and
Software
The International
Journal of Data Mining and Bioinformatics
The Electronics and
Telecommunications Research Institute Journal
The International
Journal of Computer Science and Technology
The Journal of
Ubiquitous Computing and Intelligence
Panelist:
Panelist, National Science
Foundation, CISE, 2007, 2009, 2011
Program Co-Chair:
The ACM workshop on Health
Information and Knowledge Management (HIKM) 2006
Awards
Best Demo Award,
International Conference on Database Systems for Advanced
Applications (DASFAA), 2011
IBM Scalable Data
Analytics Innovation Award, 2010
Trustees Teaching Award,
IUPUI, 2009
Research Venture Award,
IUPUI, 2009
IBM Real Time Innovation
Award, 2008
TechPoint MIRA Award,
with Purdue University Knowledge Projection Team, 2005
Leading Light Award / Ice Miller Graduate Student
Scholarship, 2004
IBM Grace Hopper
/ Anita Borg Scholarship, 2004
Excellent Graduate
Student Scholarship, Huazhong University of Science and Technology, 1998
Outstanding
Graduate Award, Huazhong University of Science and Technology, 1996