DebConf10
| Speakers | |
|---|---|
| Hanna Wallach | |

| Schedule | |
|---|---|
| Day | DebConf Day 3 (2010-08-03) |
| Room | Davis Auditorium |
| Start time | 14:00 |
| Duration | 01:00 |
| Info | |
| ID | 592 |
| Event type | lecture |
| Track | |
| Language | en |
Statistical Machine Learning Analysis of Debian Mailing Lists
In this talk, I will discuss the use of state-of-the-art machine learning techniques to analyze Debian mailing lists in order to discover political, social, and technical patterns that could be used to inform project decisions. I will concentrate on a class of techniques known as statistical topic models, which automatically infer groups of semantically related words, known as topics, from word co-occurrence patterns in documents. The resultant topics can then be used to detect emergent areas of technical activity, identify subcommunities, and track trends over time. In addition to providing a brief overview of statistical topic models and their application to Debian mailing list data, I will present examples of topics inferred from Debian mailing lists, as well as some preliminary political, social, and technical findings discovered via these topics.
Hanna Wallach is a senior postdoctoral research associate at the University of Massachusetts Amherst, where she develops machine learning techniques for identifying and answering social science questions. In her not-so-spare time, Hanna used to maintain some Debian packages and has run several projects that encourage and promote women's involvement in free software development -- most notably Debian Women, with Erinn Clark and Helen Faulkner.