Introduced by Milind Gadre, Senior Director R&D at VMware, Inc.
Human society today is being roiled by a combination of negative / disruptive trends such as the rise of authoritarianism and populism, fueled by xenophobia and a fear of social change. At the same time, we can see a rise in positive / cohesive concerns for expanded human rights such as LGBT rights and empowerment of women. Religious fundamentalism continues to drive many societies and contributes to several of the negative / disruptive trends identified above. At times, it feels like we are regressing to social behaviors of the past.
If one looks at humanity in the aggregate, it appears that variations of the same social trends rise and fall in waves across seemingly disconnected societies and time frames. There is an ebb and flow to these trends: positive trends seem to almost inevitably give rise to negative ones and (sometimes) vice versa. Trends may start as one or many localized events of limited scope that fade without effect, but sometimes they coalesce and balloon to global scope (for example, the Syrian Civil War started with a little graffiti). Furthermore, communication today is near-instantaneous at a global scale, and it can amplify or dampen the growth of a trend.
Have you ever wondered about the possible evolution of human society? Is a continuous oscillation between cohesive and disruptive trends the “normative” state for human societies? Will the current set of negative trends be amplified, leading to conflict on a global scale (as happened in the early part of the 20th century)? Will there be a new cold war between authoritarian and liberal societies? Or will the positive / cohesive trends prevail, leading to a better society for all? How will external disruptions such as climate change affect the outcomes?
Is it possible for us to learn from the past when similar trends prevailed? How can we analyze past social data to make predictions of how human society will evolve?
Ideas from Data Center Ops Management
Most large-scale data analytics is now stream based. Time series analytics is a common approach to data center ops monitoring applications (e.g. Log Insight or Splunk for log analytics, Wavefront or Datadog for time series-based metrics). Commercial data center Operations Management applications routinely monitor, analyze and provide actionable feedback on highly complex, seemingly random data, at scale. Ops Management solutions can auto-identify a “normative state” and “violations” for data center systems simply by evaluating operating data as a time series over some period of time.
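To make the idea of a "normative state" and its "violations" concrete, here is a minimal sketch that uses a trailing-window z-score. This is an illustrative assumption, not the method used by any of the products named above. The function name `find_violations`, the window size, the threshold and the sample data are all hypothetical.

```python
from statistics import mean, stdev

def find_violations(series, window=7, threshold=3.0):
    """Return indices whose value deviates from the trailing-window baseline
    by more than `threshold` standard deviations (hypothetical helper)."""
    violations = []
    for i in range(window, len(series)):
        baseline = series[i - window:i]   # the recent "normative state"
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Flat baseline: any change at all counts as a deviation.
            if series[i] != mu:
                violations.append(i)
            continue
        if abs(series[i] - mu) / sigma > threshold:
            violations.append(i)
    return violations

# Synthetic daily values of some metric; the spike at index 10 is made up.
daily_counts = [10, 12, 11, 9, 10, 11, 12, 10, 11, 9, 40, 11, 10]
print(find_violations(daily_counts))
```

The baseline here is learned purely from the data over a recent period, which is the property the paragraph above relies on. A real system would likely also need seasonality handling and a more robust baseline.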
Today, we have a large enough population size (over 8B humans), instant global communication – and the tools to analyze data efficiently at scale. By modeling human social evolution as a stream of events – a time series – we can apply AI / ML techniques to correlate events and to extract (even possibly predict) social trends.
What It Is Not
This approach is not the same as micro-targeting, which is employed by advertisers, political parties and the like (e.g. Cambridge Analytica). The proposal works by analyzing events in the aggregate as a time series, not on the basis of individual personal characteristics. Hence, we do not need personal / private information.
Challenges / Learning Opportunities
The challenges fall into 3 buckets:
Challenge 1 – Collecting Data
We need access to both historical and real-time event data.
Event: Any happening such as demonstrations, political events, court cases, etc.
We have to concern ourselves with both:
1. Historical event data
a. We need access to historical data
b. We would need to crowdsource the shaping of the data as an event stream
2. Contemporary event data, which would rely on
a. Mining of data streams such as Twitter feeds
b. Mining of news sources such as Reuters and AP feeds
c. Social media apps for crowdsourced submission of real-time events
Challenge 2 – Shaping the Data
The event data will need to be shaped as a time series and tagged with a variety of information so as to support the AI / ML routines.
Some of the possible ways in which the event data would be tagged –
Basic Tags on the Event: date, time, location, country, number of people involved, etc.
Social Trend Tags on the Event: Racism, xenophobia, authoritarianism, populism, civil rights, etc.
Information related to the Event locale (incomplete list): Coefficient of social friction (0 to 1), economic conditions, Gini coefficient, weather, climate change, etc.
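The tagging scheme above could be sketched as a simple record type plus an aggregation step that turns tagged events into a time series. This is only one possible shape. The `Event` class, its field names, the `monthly_tag_counts` helper and the sample events are hypothetical, and the record deliberately holds no personal information.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Event:
    # Basic tags
    day: date
    country: str
    participants: int
    # Social trend tags, e.g. {"populism", "xenophobia"}
    trend_tags: frozenset
    # Locale information, e.g. {"gini": 0.41, "social_friction": 0.6}
    locale: dict = field(default_factory=dict)

def monthly_tag_counts(events):
    """Count events per ((year, month), trend tag): an aggregate time series."""
    counts = Counter()
    for ev in events:
        month = (ev.day.year, ev.day.month)
        for tag in ev.trend_tags:
            counts[(month, tag)] += 1
    return counts

# Made-up sample events, for illustration only.
events = [
    Event(date(2020, 1, 5), "A", 500, frozenset({"populism"}), {"gini": 0.41}),
    Event(date(2020, 1, 20), "B", 1200, frozenset({"populism", "xenophobia"})),
    Event(date(2020, 2, 3), "A", 300, frozenset({"civil rights"})),
]
counts = monthly_tag_counts(events)
print(counts[((2020, 1), "populism")])
```

Counting per month and per tag is an arbitrary choice for the sketch. The same records could be bucketed by week, by country, or weighted by the number of participants.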
Challenge 3 – Developing the AI / ML Analytics
The final challenge is the actual development of the AI / ML analytics and the fine-tuning so that we can:
1. Based on historical event data, accurately model and predict historical trends – this is the Learning part.
2. Predict future trends based on a combination of historical and contemporary event data.
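One way to check the "Learning part" is a backtest: fit a model on an early slice of the historical series, predict the later slice that was held out, and measure the error. The sketch below uses an ordinary least-squares line as a stand-in for whatever AI / ML model is eventually chosen. The function names and the sample series are hypothetical.

```python
def fit_line(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...; needs len(ys) >= 2."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

def backtest(series, train_size):
    """Fit on series[:train_size], predict the rest, return (predictions, MAE).
    Assumes at least one point is held out."""
    slope, intercept = fit_line(series[:train_size])
    held_out = series[train_size:]
    preds = [slope * (train_size + k) + intercept for k in range(len(held_out))]
    mae = sum(abs(p - a) for p, a in zip(preds, held_out)) / len(held_out)
    return preds, mae

# Synthetic monthly event counts for one trend tag (perfectly linear on purpose).
monthly_events = [2, 4, 6, 8, 10, 12, 14, 16]
preds, mae = backtest(monthly_events, train_size=6)
print(preds, mae)
```

A low held-out error on historical data is a precondition for trusting forecasts that combine historical and contemporary events. It is not a guarantee, since real social data is far noisier than this example.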
This project offers a challenging opportunity for students who are interested in learning about AI / ML and are also curious about making sense of the complex political and social trends in our societies. There are opportunities to learn about the end-to-end challenges involved in developing a real-world AI / ML application: collecting the data, then shaping it, followed by developing and fine-tuning the algorithms. There are challenges to be addressed in scraping social media and building apps for collecting contemporaneous events. There are opportunities for developing interesting data visualizations and user interfaces. Finally, there are opportunities to write and publish technical papers.