INTRODUCTION

SIDD aims to establish semantic associations among disease-related databases and to integrate them to provide a global view of disease at the biological level.

The current version (Jul 2013) of SIDD documents 4,465,131 entries that relate 139,365 disease-associated molecular, phenotypic and environmental features (DR-MPEs) to 3,824 human diseases, obtained by integrating 18 disease-related databases. Each entry in SIDD is a relationship between a DR-MPE and a disease, and includes the DR-MPE name, DR-MPE identifier, disease name, literature reference, and a link to the original database for further information. To establish semantic associations, 4,284 disease terms in Medical Subject Headings (MeSH) and 4,139 disease terms in Online Mendelian Inheritance in Man (OMIM) are also mapped to the Disease Ontology (DO); these mappings are shown in SIDD.
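
As a rough illustration of the entry structure described above, the sketch below models one entry as a Python dataclass. The class name `Entry`, the field names and the sample values are all assumptions made for illustration; they are not taken from the actual SIDD schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical model of one entry: a DR-MPE/disease relationship."""
    drmpe_name: str    # name of the molecular, phenotypic or environmental feature
    drmpe_id: str      # identifier of the DR-MPE in its source database
    disease_name: str  # disease the DR-MPE is related to
    reference: str     # literature reference supporting the relationship
    source_url: str    # link back to the original database record


# Placeholder values only; this is not a real SIDD record.
example = Entry(
    drmpe_name="EXAMPLE_GENE",
    drmpe_id="GENE:0000",
    disease_name="example disease",
    reference="PMID:0000000",
    source_url="https://example.org/record/0000",
)
```

Making the record immutable (`frozen=True`) also makes it hashable, which is convenient when entries from several sources have to be compared or de-duplicated.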

We designed a web interface that allows users to search and browse the data in several ways: ‘Browsing’, ‘Searching’ and ‘Network Visualization’. We also support querying and submitting mappings from disease terms to DO.

The overall system design of SIDD is illustrated in the following figure. Starting from the bottom of the figure, integrating the relationships between diseases and DR-MPEs involves three major steps: (1) extracting the DR-MPE records from the 18 source databases, (2) mapping all disease names to DO, and (3) filtering out redundant records among databases of the same DR-MPE type. MySQL version 5.5.1 is used to manage the results of all three steps. To make the database accessible, the web interface, including submission forms and graphical outputs, is built using JSP and Servlets. The web interface offers three main functionalities (shown at the top of Figure 2). First, disease terms can be queried and shown in the DO tree. Second, disease-related DR-MPEs can be browsed and downloaded. Third, a network containing diseases and their co-occurring DR-MPEs can be visualized in the webpage.
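
The three integration steps can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the SIDD implementation: the function names, the record layout, the lookup table `name_to_do`, the placeholder DO identifiers, and the rule that two records are redundant when they share DR-MPE type, DR-MPE identifier and DO term are all hypothetical. Skipping disease names that have no DO mapping is likewise a simplification made here.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Raw record as extracted from a source: (DR-MPE type, DR-MPE id, disease name)
RawRecord = Tuple[str, str, str]
# Integrated record: (DR-MPE type, DR-MPE id, DO identifier)
MappedRecord = Tuple[str, str, str]


def map_to_do(name: str, name_to_do: Dict[str, str]) -> Optional[str]:
    """Step 2: look up a disease name in a hypothetical name-to-DO table."""
    return name_to_do.get(name.strip().lower())


def integrate(
    sources: Iterable[Iterable[RawRecord]],
    name_to_do: Dict[str, str],
) -> List[MappedRecord]:
    """Run the three steps over records already extracted from each source."""
    seen = set()
    integrated: List[MappedRecord] = []
    for records in sources:  # Step 1: one iterable of records per source database
        for drmpe_type, drmpe_id, disease_name in records:
            do_id = map_to_do(disease_name, name_to_do)
            if do_id is None:
                continue  # assumption: unmapped disease names are skipped
            key = (drmpe_type, drmpe_id, do_id)
            if key in seen:
                continue  # Step 3: drop redundant records of the same DR-MPE type
            seen.add(key)
            integrated.append(key)
    return integrated


# Placeholder data: two sources that describe the same relationship differently.
name_to_do = {"disease a": "DOID:X1", "disease a syndrome": "DOID:X1"}
source_1 = [("gene", "G1", "Disease A")]
source_2 = [("gene", "G1", "Disease A syndrome"), ("gene", "G2", "Unknown disease")]
merged = integrate([source_1, source_2], name_to_do)
```

In this toy run both sources resolve to the same DO term, so only one record for `G1` survives, and the record with an unmapped disease name is left out.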