Yuni Xia
Associate Professor
Department of Computer Science
Indiana University - Purdue University Indianapolis

Address:  723 W. Michigan St, SL280E, Indianapolis, IN 46202, U.S.A.
Phone:     (317) 274-9738
Fax:         (317) 274-9742



Yuni Xia is an Associate Professor in the Computer and Information Science Department at Indiana University - Purdue University Indianapolis (IUPUI). She received her PhD and MS in Computer Science from Purdue University and her BS in Computer Science from Central China (HuaZhong) University of Science and Technology in China. Before joining IUPUI, she worked at the IBM T.J. Watson Research Center in 2003. Xia's research is on data mining and databases, focusing on the mining and management of big data and data streams, including biomedical data, sensor data, and moving object data. She also works on managing uncertainty in the decision support process.


        Data Mining: Big Data, Uncertain Data Mining, Data Stream Mining, Biomedical Informatics
        Databases: Data Streams, Data Uncertainty Management

    Research Projects (we gratefully acknowledge the support of funding agencies):
    Health-Terrain: Visualizing Large Scale Health Data, Supported by US Department of the Army, Co-PI (PI: Fang)
    This project aims to design a framework and new techniques for mining and visualizing large-scale health data.

    Development of Key Technologies for Big Data Analysis and Management Software Based on Next Generation Memory, Supported by ETRI, Co-PI (Institute PI: Lee)
    This collaborative project seeks to develop a big-data main-memory database management system and a distributed stream processing system using hardware acceleration techniques, including FPGA (programmable/reconfigurable chips) and GPGPU (general-purpose graphics processing unit).

    Large Scale Sensor Stream Analysis and Mining for Geriatric Care, Supported by IBM
    This project aims to design and develop a real-time distributed sensor stream monitoring and analysis system for geriatric care. Such a system enables effective, continuous home-based geriatric care, which not only saves costs but also improves the quality of life of the elderly and their families.

    DisProt Database: A Central Repository of Information on Intrinsically Disordered Proteins, Supported by NSF, Co-PI (PI: Dunker)
    The goal of this project is to fully develop DisProt, a database that provides an essential repository of information about intrinsically disordered proteins (IDPs). DisProt will be not only a collection of data on intrinsically disordered proteins and their functions, but also a unique research tool for conducting computational studies on these proteins and for helping to design better research strategies for studying individual IDPs in the laboratory. DisProt is expected to see widespread use, both for carrying out bioinformatics experiments and across the entire community involved in understanding cell and molecular biology.

    TrafficAnalyzer: A Real-time Traffic Stream Processing and Analyzing System, Supported by IBM
    Modern traffic monitoring systems are required to perform real-time processing and analysis of petabit continuous data streams. In this project, we propose to design and develop a real-time traffic stream processing and analysis system. The most important feature of TrafficAnalyzer is real-time performance: results must be produced with virtually zero latency, because in a traffic monitoring system real-time response is crucial for reducing the accident rate and smoothing traffic flow. TrafficAnalyzer must support sophisticated time-windowed processing operations, since streaming data changes continually, often at high rates. These operations should produce results incrementally as new data arrives, since the complete data set is never available at once. TrafficAnalyzer also manages historical data carefully, as it needs to compare and combine present data with past data to study how traffic flow changes over time. Finally, TrafficAnalyzer is resilient to inaccuracy and uncertainty in the data streams, because inherent variations, losses, or reordering can cause data to arrive out of order or with variable delays.
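
    As an illustration of the time-windowed, incremental processing described above, the following is a minimal sketch and not TrafficAnalyzer's actual implementation. It keeps a running mean of vehicle speed readings over a sliding time window and updates the result as each reading arrives, without rescanning the stream. The class name SlidingWindowAverage, the window_seconds parameter, and the sample readings are all hypothetical, and the sketch assumes readings arrive in timestamp order, which the project explicitly does not take for granted.

```python
from collections import deque


class SlidingWindowAverage:
    """Incrementally maintained mean over the last `window_seconds` of readings.

    Hypothetical illustration: assumes timestamps are non-decreasing and
    window_seconds is positive.
    """

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._readings = deque()  # (timestamp, value) pairs, oldest first
        self._total = 0.0         # running sum of the values in the window

    def add(self, timestamp: float, value: float) -> float:
        """Insert one reading, evict expired ones, and return the current mean."""
        self._readings.append((timestamp, value))
        self._total += value
        # Evict readings outside the half-open window (timestamp - w, timestamp].
        cutoff = timestamp - self.window_seconds
        while self._readings[0][0] <= cutoff:
            _, old_value = self._readings.popleft()
            self._total -= old_value
        return self._total / len(self._readings)


if __name__ == "__main__":
    # Made-up (time in seconds, speed) readings from a single road sensor.
    window = SlidingWindowAverage(window_seconds=60)
    for t, speed in [(0, 50.0), (30, 70.0), (90, 30.0)]:
        print(t, window.add(t, speed))
```

    Each update does work proportional to the number of readings that expire, not to the size of the window, which is what makes the result available incrementally. Handling late or out-of-order readings would need additional buffering that this sketch leaves out.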

    Development of SYMBIOTE: A Reconfigurable Logic Assisted Data Stream Management System for Multimedia Sensor Networks, Supported by NSF, Co-PI (PI: Lee)
    Numerous emerging applications require real-time processing of high-bandwidth multimedia data streams. In this project, we propose a novel class of data stream management systems called Reconfigurable Logic Assisted DSMS (RLADSMS). The prototype RLADSMS, called SYMBIOTE, will provide one of the first comprehensive demonstrations of using reconfigurable logic coprocessors as data stream accelerators. This project will investigate key issues such as data models, query languages, hardware DSMS operators, and the corresponding cost models of query execution (considering hardware complexity of database operators, run-time complexity of hardware and software operators, interconnect latencies, bandwidth, and resource allocation), as well as optimization techniques for this new class of data stream management systems.

    Invention of a Consumer-Side Geriatric Health Care Knowledge Management and Decision Support System, Supported by 21st Century Research and Development Fund, State of Indiana, Co-PI (Institute PI: Palakal)
    This project proposes to build an innovative knowledge management system unique in the geriatric care management industry. The system will accelerate the adoption of standards of care and accumulate knowledge from the current social science, psychology, and health disciplines. It will also build a basis, comparable to the health care industry model, for evidence-based outcomes validation.




Please log in to Oncourse for lecture notes, readings, assignments, projects, etc.
CSCI340: Discrete Computational Structures
CSCI441: Client Server Databases
CSCI443: Database Systems
CSCI481: Data Mining
CSCI541: Database Management Systems
CSCI573: Data Mining
CSCI590: Advanced Database Systems

Professional Services
Program Committee:
        The International Conference on Collaborative Computing (CollaborateCom), 2010 to present
        The IEEE International Conference on Computer and Information Technology (CIT)
        The International Conference on Frontier Computing (FC), 2010
        The International Workshop on Smart Homes for Tele-Health (SmarTel), 2010
        The IEEE 12th International Conference on Computational Science and Engineering (CSE), 2009
        The International Workshop on Smart Homes for Tele-Health (SmarTel), 2009
        The International Workshop on Information Fusion and Dissemination in Wireless Sensor Networks (SensorFusion), 2009
        The International Conference on Intelligent Pervasive Computing (IPC), 2008
        The IEEE 11th International Conference on Computational Science and Engineering (CSE), 2008
        The IEEE 21st International Conference on Advanced Information Networking and Applications (AINA), 2007
        The IEEE/ACS 5th International Conference on Computer Systems and Applications (AICCSA), 2007
        The Third International Conference on Intelligent Environments (IE), 2007
        The International Workshop on Information Fusion and Dissemination in Wireless Sensor Networks (SensorFusion), 2007
        The International Workshop on Knowledge Management and Discovery for Ubiquitous and Pervasive Applications (KUPA), 2007

Local Chair:
       ACM SIGMOD/PODS Conference, 2010

Journal Review:
        IEEE Transactions on Knowledge and Data Engineering
        IEEE Transactions on Parallel and Distributed Systems
        ACM Transactions on Database Systems
        ACM Transactions on Knowledge Discovery from Data
        Knowledge and Information Systems
        Data and Knowledge Engineering
        Information Systems
        Information Sciences
        The International Journal of Telemedicine and Applications
        The Information Fusion Journal
        The Journal of Systems and Software
        The International Journal of Data Mining and Bioinformatics
        The Electronics and Telecommunications Research Institute Journal
        The International Journal of Computer Science and Technology
        The Journal of Ubiquitous Computing and Intelligence

        Panelist, National Science Foundation, CISE, 2007, 2009, 2011

Program Co-Chair:
        The ACM Workshop on Health Information and Knowledge Management (HIKM), 2006


Awards
         Best Demo Award, International Conference on Database Systems for Advanced Applications (DASFAA), 2011
         Scalable Data Analytics Innovation Award, IBM Research, 2010
         TechPoint Mira Award, with the Senior Care Navigation System development team at My Health Care Manager LLC, 2010
         Trustees Teaching Award, IUPUI, 2009
         Research Venture Award, IUPUI, 2009
         Real Time Innovation Award, IBM Research, 2008
         TechPoint Mira Award, with the Purdue University Knowledge Projection Team, 2005
         Leading Light Award, Indiana TechPoint Organization, 2004
         IBM Grace Hopper/Anita Borg Scholarship, 2004