For reasons that are difficult to articulate, I can’t get a particular Mitsou song out of my head (a little glimpse into the horrors of the late-80s, early-90s Canadian music scene).
But anyway…
The Eclipse Foundation has decided to shut down the Eclipse Usage Data Collector. This decision was made based on a simple value calculation: the value that we have been able to extract from the data is effectively null when compared against the effort and resources required to capture and maintain it.
Yesterday (September 15, 2011), we collected more than 13 million lines of usage data from almost 20k users. Volumes of data on this scale make university researchers salivate. I’ve been working with a few researchers in various parts of the world over the years and, while there have been some interesting papers produced, the actual value of the results to Eclipse Committers, member companies, and the greater community that we serve has been difficult to detect, let alone measure.
I have tried to do some of my own analysis of the data. I’ve come up with a few results that I find interesting. I had, for example, some early successes in determining project pairings in the wild.
While those early results were interesting to me and a handful of others, I have no data to suggest that this information has made any difference to development or testing plans. I hoped that, by identifying pairings of projects, we could focus testing energy on the most common pairings (as the number of projects in the simultaneous release grows, the number of potential combinations grows exponentially). I did discuss this with some researchers who found the concept interesting, but ultimately weren’t able to come up with actually valuable results.
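For the curious, here is a rough Python sketch of what counting project pairings might look like, along with the arithmetic behind the combinatorial growth. It is not the analysis that was actually run against the UDC data: the function names, the input shape (one set of project names per user), and the sample data are all made up for illustration.

```python
from collections import Counter
from itertools import combinations


def count_pairings(installs):
    """Count how often each pair of projects shows up together.

    `installs` is a hypothetical input: one collection of project
    names per user. The real UDC data is not shaped like this.
    """
    counts = Counter()
    for projects in installs:
        # Sort so that each pair has one canonical (a, b) ordering.
        for pair in combinations(sorted(set(projects)), 2):
            counts[pair] += 1
    return counts


def possible_pairs(n):
    """Number of distinct project pairs among n projects: n choose 2."""
    return n * (n - 1) // 2


def possible_combinations(n):
    """Number of subsets containing two or more of n projects."""
    return 2 ** n - n - 1


# Invented sample data, for illustration only.
sample = [
    {"jdt", "mylyn", "egit"},
    {"jdt", "mylyn"},
    {"cdt", "egit"},
    {"jdt", "egit"},
]

pairings = count_pairings(sample)
for pair, count in pairings.most_common():
    print(pair, count)

# Pairs grow quadratically; multi-project combinations grow exponentially.
for n in (10, 20, 40):
    print(n, possible_pairs(n), possible_combinations(n))
```

The point of ranking pairs by frequency is that the list of pairs stays manageable even when the full space of combinations does not, which is why the most common pairings looked like a sensible place to focus testing.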
So this is the end of usage data collection by the Eclipse Foundation. At the end of September, I’ll flip a bit in the server that will make it stop collecting data (clients will continue to function as though the server is accepting the data; i.e. they will clean up their local caches of usage data). The client itself will still be included in the Indigo SR-1 and SR-2 releases, but will not be included in the Juno release. Discussion is happening in Bug 347069: your comments are most welcome.
The fate of the UDC code in the EPP project has not been decided. With this change, there is an opportunity for somebody to step up and take over this code (I believe that there are a few organizations using the UDC code for their own data-mining purposes). Or, we can archive it.