Research
Building on the work by Nick Arnett that was the basis of
Opion Inc. (now part of PlanetFeedback, a unit of Intelliseek),
Senti-Metrics Partners uses similar systems to track what is
happening in the global software development community, with a
particular focus on open source
systems. Some of this information will be available on OpenSector.org,
a web site focused on news and discussion about open source in the
public sector. Senti-Metrics also develops proprietary reports
for its clients, based on this research and analysis.
Privacy note:
In plain language, some of our information-gathering appears quite
similar to unethical harvesting of e-mail addresses. Therefore,
we realize that we must not make addresses or other information that
would individually identify discussion participants available to our
clients or any other third parties. We do not permit any third
party direct access to the database in which that information is
stored. We publish summary data and interpretation accompanied by
example text that typifies the trends we discover. We will never
use this information for direct marketing, promotion or similar
purposes on anyone's behalf, including our own.
Our tools include:
-
Robots (developed in
Python) intelligently gather meta-data and message content (selected
according to evolving rules) from numerous public discussions on the
Internet. The robots are able to extract data from web-based forums,
mailing lists and Usenet.
-
A relational database (MySQL)
stores the meta-data and the results of analysis. So far this calendar year,
we have collected more than 5 million messages from more than 5,000 sources.
-
Commercial statistical tools (SPSS)
perform analyses such as regression, clustering and multi-dimensional scaling.
-
Time-series analysis (via SPSS, Excel
and custom Python) reveals time-based patterns in the discussions, such as:
-
Excitation levels -- the number of participants.
-
Subject diversity -- range of discussion topics.
-
Correlation to real-world events, such as stock prices and
sales.
-
Churn -- how rapidly the participants turn over.
-
Link and connection analysis (custom)
shows which people "hold together" and "bridge" electronic
communities, allowing us to report on the interests and attitudes of opinion leaders.
-
Content analysis (custom) uncovers
characteristics such as:
-
Affect -- the positive or negative tone of a message;
-
Links -- web pages of interest to a community.
-
Visualization allows us to see the results of
link and other kinds of analysis.
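To make two of the time-series measures listed above more concrete, here is a minimal Python sketch of excitation levels and churn. It is an illustration only, not our production code. The function names, the input shape (pairs of a period number and an opaque author token), and the exact churn definition (the share of one period's participants who are absent from the next period) are all assumptions made for this example.

```python
from collections import defaultdict


def participants_by_period(messages):
    """Group distinct author tokens by period.

    `messages` is an iterable of (period, author_token) pairs.
    This input shape is an assumption made for the example.
    """
    authors = defaultdict(set)
    for period, author in messages:
        authors[period].add(author)
    return dict(authors)


def excitation_levels(messages):
    """Number of distinct participants in each period, in period order."""
    by_period = participants_by_period(messages)
    return {period: len(by_period[period]) for period in sorted(by_period)}


def churn(messages):
    """Share of the previous period's participants missing from each period.

    This is one possible definition of churn, assumed for illustration.
    """
    by_period = participants_by_period(messages)
    periods = sorted(by_period)
    result = {}
    for previous, current in zip(periods, periods[1:]):
        before = by_period[previous]
        departed = before - by_period[current]
        result[current] = len(departed) / len(before)
    return result


# Hypothetical sample data: periods are numbered, authors are opaque tokens.
messages = [
    (1, "a1"), (1, "a2"), (1, "a1"),
    (2, "a2"), (2, "a3"), (2, "a4"),
    (3, "a4"),
]

print(excitation_levels(messages))  # {1: 2, 2: 3, 3: 1}
print(churn(messages))
```

In this sample, half of the period 1 participants do not return in period 2, and two thirds of the period 2 participants do not return in period 3. The sketch works on opaque tokens rather than addresses, in keeping with the privacy note above.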
Robot Note:
If you administer a discussion venue, you may have seen us subscribing
to one or more of your lists. Please do not confuse this activity
with spammer address-harvesting tools.