DebConf10
Speakers |
---|
Hanna Wallach |
Schedule | |
---|---|
Day | DebConf Day 3 (2010-08-03) |
Room | Davis Auditorium |
Start time | 14:00 |
Duration | 01:00 |

Info | |
---|---|
ID | 592 |
Event type | lecture |
Track | |
Language | en |
Statistical Machine Learning Analysis of Debian Mailing Lists
In this talk, I will discuss the use of state-of-the-art machine learning techniques to analyze Debian mailing lists in order to discover political, social, and technical patterns that could be used to inform project decisions. I will concentrate on a class of techniques known as statistical topic models, which automatically infer groups of semantically related words, known as topics, from word co-occurrence patterns in documents. The resulting topics can then be used to detect emergent areas of technical activity, identify subcommunities, and track trends over time. In addition to providing a brief overview of statistical topic models and their application to Debian mailing list data, I will present examples of topics inferred from Debian mailing lists, as well as some preliminary political, social, and technical findings discovered via these topics.
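The abstract does not say which topic model the talk uses. As a rough illustration of how topics can be inferred from word co-occurrence, here is a minimal sketch that assumes latent Dirichlet allocation (LDA), one widely used statistical topic model, fitted by collapsed Gibbs sampling. The function names `fit_lda` and `top_words`, the hyperparameter values, and the toy messages are all hypothetical and are not taken from the talk or from real Debian mailing list data.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative, not optimized).

    docs is a list of token lists. Returns (topic_word, doc_topic) counts.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    topic_word = [Counter() for _ in range(num_topics)]  # count(topic, word)
    topic_total = [0] * num_topics                       # tokens per topic
    doc_topic = [[0] * num_topics for _ in docs]         # count(doc, topic)
    assign = []                                          # topic of each token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        assign.append([rng.randrange(num_topics) for _ in doc])
        for w, z in zip(doc, assign[d]):
            topic_word[z][w] += 1
            topic_total[z] += 1
            doc_topic[d][z] += 1

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                # Take the token out of the counts before resampling it.
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                doc_topic[d][z] -= 1
                # P(topic k | everything else) is proportional to
                # (doc-topic count + alpha) * (topic-word count + beta)
                # / (topic size + vocab_size * beta).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                topic_word[z][w] += 1
                topic_total[z] += 1
                doc_topic[d][z] += 1
    return topic_word, doc_topic


def top_words(topic_word, n=3):
    """Return the n most frequent words assigned to each topic."""
    return [[w for w, _ in counts.most_common(n)] for counts in topic_word]


if __name__ == "__main__":
    # Made-up toy "messages"; not real Debian mailing list data.
    toy_docs = [
        "kernel driver patch build kernel".split(),
        "package build upload patch".split(),
        "vote ballot election candidate".split(),
        "election vote platform candidate vote".split(),
    ]
    topics, _ = fit_lda(toy_docs, num_topics=2)
    for k, words in enumerate(top_words(topics)):
        print(f"topic {k}: {' '.join(words)}")
```

Words that tend to appear in the same messages are pulled toward the same topic because each token's new topic is weighted both by how common that topic already is in its message and by how often the word already belongs to that topic. The per-message topic counts returned as `doc_topic` are the kind of quantity that could then be aggregated by date or by author to look at trends or subcommunities, though the analysis in the talk itself may differ.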
Hanna Wallach is a senior postdoctoral research associate at the University of Massachusetts Amherst, where she develops machine learning techniques for identifying and answering social science questions. In her not-so-spare time, Hanna used to maintain some Debian packages and has run several projects that encourage and promote women's involvement in free software development -- most notably Debian Women, with Erinn Clark and Helen Faulkner.