Abstract
Acquiring knowledge has long been the major bottleneck preventing the rapid spread of AI systems. Manual approaches are slow and costly. Machine-learning approaches have limitations in the depth and breadth of knowledge they can acquire. The spread of the Internet has made possible a third solution: building knowledge bases by mass collaboration, with thousands of volunteers contributing simultaneously. While this approach promises large improvements in the speed and cost of knowledge base development, it can only succeed if the problem of ensuring the quality, relevance and consistency of the knowledge is addressed, if contributors are properly motivated, and if the underlying algorithms scale. In this paper we propose an architecture that meets all these desiderata. It uses first-order probabilistic reasoning techniques to combine potentially inconsistent knowledge sources of varying quality, and it uses machine-learning techniques to estimate the quality of knowledge. We evaluate the approach using a series of synthetic knowledge bases and a pilot study in the domain of printer troubleshooting.