Integrating and updating domain knowledge in data mining

Users should be familiar with DDlog or SQL, with relational databases, and with Python in order to build DeepDive applications or to integrate DeepDive with other tools.

A developer who wants to modify and improve DeepDive itself needs the basic background knowledge described in the DeepDive developer's guide.

Examples of DeepDive applications are described on the showcase page. The complete code for these examples is available with DeepDive (where permitted).

DeepDive differs from traditional systems in several ways:

DeepDive is a trained system that uses machine learning to cope with various forms of noise and imprecision.

DeepDive is designed to make it easy for users to train the system, both through low-level feedback via the Mindtagger interface and through rich, structured domain knowledge expressed as rules.

By contrast, previous pipeline-based systems require developers to build extractors, integration code, and other components without any clear idea of how their changes improve the quality of their data product.

DeepDive is a project led by Christopher Ré at Stanford University. Current group members include: Michael Cafarella, Xiao Cheng, Raphael Hoffman, Dan Iter, Thomas Palomares, Alex Ratner, Theodoros Rekatsinas, Zifei Shan, Jaeho Shin, Feiran Wang, Sen Wu, and Ce Zhang.

Data mining is defined as extracting information from huge sets of data. Once all these processes are complete, the extracted information can be used in many applications, such as fraud detection, market analysis, production control, and science exploration.

We gratefully acknowledge the support of the Defense Advanced Research Projects Agency (DARPA) XDATA Program under No. FA8750-13-2-0039, DARPA's MEMEX program, the National Science Foundation (NSF) CAREER Award under No. ACI-1343760, the Sloan Research Fellowship, the Office of Naval Research (ONR) under award No. N000141310129, the Moore Foundation, American Family Insurance, CHTC, Google, Lightspeed Ventures, and Toshiba.

