DebConf10
Statistical Machine Learning Analysis of Debian Mailing Lists
In this talk, I will discuss the use of state-of-the-art machine learning techniques to analyze Debian mailing lists and discover political, social, and technical patterns that could inform project decisions. I will concentrate on statistical topic models, a class of techniques that automatically infer groups of semantically related words, known as topics, from word co-occurrence patterns in documents. The resulting topics can then be used to detect emergent areas of technical activity, identify subcommunities, and track trends over time. In addition to a brief overview of statistical topic models and their application to Debian mailing list data, I will present examples of topics inferred from Debian mailing lists, along with some preliminary political, social, and technical findings based on these topics.
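To make the idea of inferring topics from word co-occurrence concrete, here is a minimal sketch of one widely used topic model, latent Dirichlet allocation (LDA), fitted by collapsed Gibbs sampling. The abstract does not say which model or inference method the talk uses, so LDA is an assumption here. The toy "messages", the function names fit_lda and top_words, and the hyperparameters alpha and beta are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling (sketch)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                       # all tokens in topic k

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        row = []
        for w in doc:
            k = rng.randrange(num_topics)
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(row)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove this token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Unnormalized probability of each topic for this token,
                # given the assignments of all other tokens.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return doc_topic, topic_word


def top_words(topic_word, n=3):
    """Return up to n most frequent words per topic, skipping zero counts."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]


# Hypothetical, hand-made stand-ins for tokenized mailing list messages.
docs = [
    "package upload build dependency package".split(),
    "build dependency package upload".split(),
    "vote election ballot candidate vote".split(),
    "candidate ballot election vote".split(),
]

doc_topic, topic_word = fit_lda(docs, num_topics=2)
for k, words in enumerate(top_words(topic_word)):
    print(f"topic {k}: {words}")
```

Words that tend to appear in the same messages are pulled toward the same topic, and the per-message counts in doc_topic give a rough picture of what each message is about. Real mailing list data would also need tokenization, stop-word removal, and far more iterations, none of which is shown here.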