Concept: Learning AI / ML by Modeling Human Social Trends as a Time Series

Human society today is being roiled by a combination of negative trends such as the rise of authoritarianism and populism, fueled by xenophobia and a fear of social change. At the same time, we can see a rise in positive / cohesive concerns for expanded human rights such as LGBT rights and empowerment of women. Religious fundamentalism continues to drive many societies and contributes to some of the negative / disruptive trends identified above. At times, it feels like we are regressing to social behaviors of the past.

If one looks at humanity in the aggregate, it appears that social trends rise and fall in waves across seemingly disconnected societies. There is an ebb and flow to these trends … positive trends seem to almost inevitably give rise to negative ones and (sometimes) vice versa. Trends may start as one or many localized events of limited scope that fade without effect, but sometimes they coalesce and balloon to global scope. Furthermore, communication today is near-instantaneous at a global scale, which can amplify or dampen the growth of a trend.

Questions

Have you ever wondered about the future of human society? Is a continuous oscillation between cohesive and disruptive trends the “normative” state for human societies? Will the current set of negative trends be amplified, leading to conflict on a global scale (as happened in the early part of the 20th century)? Will there be a new cold war between authoritarian and liberal societies?

Or will the positive / cohesive trends prevail, leading to a better society for all? How will external disruptions such as climate change affect the outcomes?

Is it possible for us to learn from the past when similar trends prevailed? How can we analyze past social data to make predictions of how human society will evolve?

Ideas from Data Center Ops Management

Most large-scale data analytics is now stream-based. Time series analytics is a common approach in data center ops monitoring applications (e.g. Log Insight or Splunk for log analytics, Wavefront or Datadog for time series-based metrics). Commercial data center Operations Management applications routinely monitor, analyze and provide actionable feedback on highly complex, seemingly random data, at scale. Ops Management solutions can automatically identify a “normative state” and “violations” for data center systems simply by evaluating operating data as a time series over some period of time.
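
As a rough illustration of how a “normative state” and its “violations” might be derived from a time series, the sketch below flags points that stray far from a trailing-window baseline. This is a minimal sketch under stated assumptions, not how any of the products named above work: the function name, the window size, the threshold and the sample values are all hypothetical.

```python
from statistics import mean, stdev


def find_violations(series, window=5, threshold=3.0):
    """Return indices whose value deviates strongly from the trailing baseline.

    The "normative state" is approximated by the mean and standard deviation
    of the previous `window` points (an illustrative assumption).
    """
    violations = []
    for i in range(window, len(series)):
        baseline = series[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # A perfectly flat baseline: any change at all counts as a violation.
            if series[i] != mu:
                violations.append(i)
            continue
        if abs(series[i] - mu) / sigma > threshold:
            violations.append(i)
    return violations


# Hypothetical metric samples: steady load with one spike at index 8.
cpu_load = [50, 52, 49, 51, 50, 48, 51, 50, 95, 51, 50]
print(find_violations(cpu_load))  # prints [8]
```

The same idea would carry over to social data by replacing the metric samples with, say, weekly counts of events carrying a given tag.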

Hypothesis

Today, we have a large enough population size (over 8B humans), instant global communication – and the tools to analyze data efficiently at scale. By modeling human social evolution as a stream of events – a time series – we can apply AI / ML techniques to correlate events, and to extract (possibly predict) social trends.

What It Is Not

This approach is not the same as micro targeting, which is employed by advertisers, political parties and the like (e.g. Cambridge Analytica). The proposal works by analyzing events in the aggregate as a time series, not on the basis of individual personal characteristics. Hence, we do not need personal / private information.

Challenges / Learning Opportunities

The challenges fall into 3 buckets:

Collecting Data

Event: Any happening, such as a demonstration, political event or court case.

We have to concern ourselves with both

  1. Historical event data –
    1. We need access to historical data
    2. We would need to crowdsource the preparation of the data as an event stream
  2. Contemporary event data – this would rely on
    1. Mining of data streams such as Twitter feeds
    2. Mining of news sources such as Reuters and AP feeds
    3. Social media apps for crowdsourced submission of real-time events

Shaping the Data

The event data will need to be tagged (with a variety of information) and shaped as a time series so as to support the AI / ML routines.

Some of the possible ways in which the event data would be tagged –

Basic Tags on the Event: date, time, location, country, number of people involved, etc.
Social Trend Tags on the Event: racism, xenophobia, authoritarianism, populism, civil rights, etc.
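
To make the tagging and shaping step more concrete, here is one possible way to represent a tagged event and roll a list of events up into a weekly time series per trend tag. It is only a sketch: the `Event` fields mirror the tags listed above, but the class, the function name, the placeholder country codes and the sample events are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    day: date            # basic tag: date
    country: str         # basic tag: country (placeholder codes below)
    participants: int    # basic tag: number of people involved
    trend_tags: tuple    # social trend tags, e.g. ("populism",)


def weekly_tag_counts(events):
    """Count events per (ISO year, ISO week, trend tag)."""
    counts = Counter()
    for event in events:
        iso = event.day.isocalendar()
        year, week = iso[0], iso[1]
        for tag in event.trend_tags:
            counts[(year, week, tag)] += 1
    return counts


# Hypothetical events, invented purely for illustration.
events = [
    Event(date(2024, 1, 2), "AA", 500, ("civil rights",)),
    Event(date(2024, 1, 4), "BB", 1200, ("populism", "xenophobia")),
    Event(date(2024, 1, 10), "AA", 300, ("populism",)),
]
series = weekly_tag_counts(events)
for key in sorted(series):
    print(key, series[key])
```

Counting by week is an arbitrary choice here; a real project would need to decide on the bucket size and on whether to weight events by the number of participants.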

Developing the AI / ML Analytics

The final challenge is the actual development of the AI / ML analytics and the fine tuning so that we can

  1. Accurately model and predict historical trends based on the historical event data – this is the Learning
  2. Predict future trends based on contemporary event data
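
The two steps above can be sketched as a fit-then-forecast loop: fit a model on the older part of a series, check it against a held-out later part, and only then trust it for forward prediction. The example below uses a plain least-squares linear trend as a stand-in for whatever AI / ML model the project would eventually choose; the function names and the weekly counts are hypothetical.

```python
def fit_linear_trend(values):
    """Least-squares fit of value = slope * t + intercept for t = 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    cov = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    return slope, v_mean - slope * t_mean


def forecast(slope, intercept, start, steps):
    """Extrapolate the fitted line for time steps start .. start + steps - 1."""
    return [slope * t + intercept for t in range(start, start + steps)]


# Hypothetical weekly counts of events carrying one trend tag.
history = [4, 6, 5, 8, 9, 11, 10, 13, 15, 14]
train, held_out = history[:7], history[7:]

slope, intercept = fit_linear_trend(train)            # the "learning" step
predicted = forecast(slope, intercept, len(train), len(held_out))

# Mean absolute error on data the model never saw during fitting.
mae = sum(abs(p - a) for p, a in zip(predicted, held_out)) / len(held_out)
print(f"slope={slope:.2f}, held-out MAE={mae:.2f}")
```

Holding out the most recent points, rather than a random sample, keeps the evaluation honest for time series: the model is always tested on data that comes after what it was fitted on.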

In Summary

This project offers a challenging opportunity for a student who is interested in learning about AI / ML and is curious about making sense of the complex political and social trends in our societies. There are opportunities to learn about the end-to-end challenges involved in developing a real-world AI / ML application – from collecting the data, to shaping it, to developing and fine-tuning the algorithms. There are also aspects of scraping social media and building apps for collecting contemporaneous events.

You can find your dream basketball lineup with NBA Make-A-Team

Introduced by Vikram Singh, Dong-Eun Suh, Karl Walter, Dian Yu and Francis Yang 

NBA Make-A-Team, a new project developed by UC Berkeley students, gives NBA coaches, general managers and everyday fans the opportunity to select hypothetical starting lineups and get accurate feedback on how that basketball team would perform.

This simple-to-use tool can project a team’s winning percentage and predict its statistical strengths and weaknesses. It can also suggest a player swap that could help a user build a stronger team.

The tool empowers the user to assemble their dream team – choosing from players who were part of the NBA between 2000 and 2017 – and compare that team against their friends’ squads-of-choice. Additionally, for those who hope to make their dream team a realistic team, there is a salary cap option that imposes the league’s spending limits on chosen players.

This project was developed by a team of students from Data-X, a project-oriented data and machine learning course at UC Berkeley. The largest hurdles the team faced were cleaning and processing the data and developing the models used to estimate a team’s winning percentage and to suggest player swaps.

This project is tremendously exciting because it provides instantaneous feedback on both real and hypothetical rosters and gamifies roster selection by highlighting team weaknesses and strengths.

See how it works here.

Want to know the best time to book your flight? UC Berkeley students created a data set that can help

Introduced by Deep Dave, David Lin, Sharon Ng, Vanessa Salas and Alexandre Vincent

As you can imagine, there is a wealth of data on most topics in the Internet Age. Data on stock prices, the housing market and energy consumption are just a few of the areas with a large amount of data available to the public. But how can you use this data to make informed decisions in your own life?

Four students from Data-X, a data and machine learning course at UC Berkeley, set out to answer this question and make data-driven decisions a little bit easier for travelers.

Originally, the students planned to work with available data sets to answer the question “When is the best time to buy plane tickets?” They found, however, that data relevant to this specific inquiry was limited. After weeks of searching for the right data set, they found that data sets from Kaggle, a website dedicated to providing open data, only included flight times, and Department of Transportation data sets only included yearly flight prices.

So the students decided to alter their project and build a data set themselves. They scraped flight data on a daily basis in order to create a workable database that can help others pinpoint the right time to book their flight.

UC Berkeley students start project to aid water supply research efforts in China

Introduced by Brandon Chou, Djavan De Clercq, Andrew Gonzalez, Charles Li and Rinitha Reddy

Urbanization, one of the prevailing trends of the 21st century, places great stress on the water resources of cities across the globe. This stress is particularly pronounced in China, a country that has seen rapid urban growth in the past few decades.

This problem prompted a team of UC Berkeley students to begin a statistical learning project that could aid researchers by providing them with greater insight into urban water supply patterns.

They applied statistical learning methods to 12 years of urban water supply data for 627 cities across China in order to identify the factors most responsible for variance in patterns of urban water distribution and management. For instance, they found that Chinese cities have consistently suffered water loss and leakage rates above 20 percent since 2001, and water prices are closely associated with leakage.

Additionally, they developed an urban water sustainability index in order to compare cities. From there they were able to identify the cities and regions in China that face sustainability issues.

Aside from their research, they also provided a general, systems-level perspective of major urban drinking water use trends in China for the benefit of public-sector stakeholders.

UC Berkeley students create algorithm that can predict your personality from your online photos

Introduced by Souhail Bentaleb, Guillaume Drugeot, Luna Izpisua Rodriguez, Farbod Nowzad and Ajay Shah

Using data science and machine learning, a team of UC Berkeley students has created a program that can tell an accurate and informative story about a person’s life — all from the photos that they post on social media.

Typically, advertisers target consumers by analyzing a person’s most frequented online sites or “clicks” to collect basic demographic information, such as age, gender and location. But they’re overlooking an entire other aspect of people’s online lives, which could be a rich source of consumer information: personal photos.

The 1.8 billion images that are being posted to Facebook, Instagram, Flickr, Snapchat and WhatsApp every day hold valuable information about a person’s lifestyle, daily activities and consumer behaviors.

In order to address this problem, the team of students created a machine learning algorithm that tags photos from social media profiles to reveal the most prominent keywords that define each image. From there, the algorithm uses this information to classify each consumer profile into one of four categories: outdoorsy, sporty, family or foodie.
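
The source does not describe how the students’ algorithm maps photo keywords to a profile category, so the following is only a hypothetical sketch of the final classification step: each photo’s tags vote for the category whose keyword list they match, and the profile takes the category with the most votes. The keyword lists, function name and sample profile are all invented for illustration; only the four category names come from the description above.

```python
from collections import Counter

# Hypothetical keyword lists; the source names only the four categories.
CATEGORY_KEYWORDS = {
    "outdoorsy": {"mountain", "hiking", "beach", "forest"},
    "sporty": {"gym", "soccer", "running", "stadium"},
    "family": {"baby", "wedding", "children", "birthday"},
    "foodie": {"pizza", "restaurant", "dessert", "coffee"},
}


def classify_profile(photo_tags):
    """Pick the category whose keywords match the most photo tags.

    `photo_tags` is a list of tag lists, one per photo. Returns None when
    no tag matches any category.
    """
    votes = Counter()
    for tags in photo_tags:
        for category, keywords in CATEGORY_KEYWORDS.items():
            votes[category] += len(keywords & set(tags))
    if not votes or max(votes.values()) == 0:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical tags, as an image-tagging model might produce them.
profile = [
    ["pizza", "table", "friends"],
    ["coffee", "dessert"],
    ["beach", "sunset"],
]
print(classify_profile(profile))  # prints foodie
```

A learned classifier over tag frequencies would likely replace the hand-written keyword sets in practice; the voting scheme is just the simplest version of the idea.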

This classification can provide advertisers with personalized insights about a consumer’s preferences.

The general public can use the tool as well in order to gain insight into the image that they project online.

A new tool can predict the best time for you to trade Bitcoin

Introduced by Olabode Faleye, Kazuomori Lewis, Vicente Izquierdo and Pedro Pablo Correa

Bitcoin is one of the newest financial instruments to take the world by storm. This decentralized digital currency has been experiencing unprecedented levels of growth in recent months, proving itself as one of the most lucrative investment opportunities of our generation. As of Jan. 16, it was up more than 1000 percent.

Its extreme volatility, however, makes it difficult for even seasoned investors to predict market trends and maximize their return on investment. As a result, a team of UC Berkeley students from Data-X built a new tool, CryptoTitans, that aims to predict market trends and tell even a novice investor when it is best to buy, sell or hold Bitcoin.

To develop informed trading strategies, the team used Bayesian Regression to predict prices on the minute scale, Recurrent Neural Networks for prediction on a day scale and the more traditional Bollinger Bands for trading strategies in accordance with long-term market trends. With these different approaches, the tool could help investors with all levels of experience to maximize the profits they make from Bitcoin.
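
Of the three techniques mentioned, Bollinger Bands are the simplest to illustrate: a moving average with an upper and lower band placed a fixed number of standard deviations away. The sketch below is not the CryptoTitans implementation. The buy / sell / hold rule is a toy assumption, the prices are made up, and for brevity the newest price is compared against bands computed from the preceding window only.

```python
from statistics import mean, pstdev


def bollinger_bands(prices, window=20, num_std=2.0):
    """Return one (middle, upper, lower) tuple per full window of prices."""
    bands = []
    for end in range(window, len(prices) + 1):
        recent = prices[end - window:end]
        middle = mean(recent)                  # simple moving average
        spread = num_std * pstdev(recent)      # band half-width
        bands.append((middle, middle + spread, middle - spread))
    return bands


def signal(price, upper, lower):
    """Toy rule (an assumption): sell above the upper band, buy below the lower."""
    if price > upper:
        return "sell"
    if price < lower:
        return "buy"
    return "hold"


# Hypothetical prices: a quiet stretch followed by a sharp jump.
prices = [100, 102, 101, 103, 102, 120]
history, latest = prices[:-1], prices[-1]
middle, upper, lower = bollinger_bands(history, window=5)[-1]
print(signal(latest, upper, lower))  # prints sell
```

The conventional defaults are a 20-period window and two standard deviations; the short window here only keeps the example small.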

UC Berkeley students create a new way of assessing knee injuries

Introduced by Claudia Iriondo, Devina Jain, Raouf Muhamedrahimov, Vasileious Papanikolau, Kosta Trotskovsky and Leo Sun

More than 25 percent of adults in the United States suffer from knee pain, creating an urgent need for an efficient way to match patients who have cartilage and bone lesions with effective treatments. Patients across the country undergo Magnetic Resonance Imaging (MRI) scans, but assessing those scans requires a large amount of time and access to a radiology expert.

But Cartilage-X offers a new way of assessing knee injuries. The new tool uses AI-guided lesion detection to predict the location and severity of cartilage and bone lesions, reducing the amount of time and expertise needed to assess these injuries.

The team that created this tool consists of graduate and undergraduate students from UC Berkeley. With the help of Python, MATLAB and TensorFlow, the team used a data set of more than 1,800 patients’ knee MRIs and integrated it with existing UCSF research to create an automated pipeline for MRI analysis. The pipeline ingests MRI images stored in its database, segments cartilage compartments, automatically preprocesses them to include only the signal of interest and predicts the presence of a cartilage or bone lesion using neural networks.

The team’s future work will focus on improving model performance, introducing an image quality control step before training and testing model generalizability on a fresh data set from a different cohort.

By providing insight into lesion location and severity, and doing so in under a minute, Cartilage-X brings quantitative metrics for knee MRIs into the clinic.

UC Berkeley students create app that can tell you why you’re having car troubles

Introduced by Nicholas Hirons, Julian Kudszus, Soham Kudtarkar and Spencer Lee

Mechanics spend a lot of time diagnosing car issues and performing tests, but oftentimes customers aren’t sure if their cars are getting the service they need.

But a new web app created by UC Berkeley students could change this. The algorithm takes standard information about the type of car a user owns and takes error codes from car telematics data, or wireless information transmitted from your car. From this information, the algorithm is able to predict what issues need to be addressed and what services need to be performed.

The algorithm takes into account a variety of vehicle data, such as make, mileage and engine control units to give the most precise prediction possible.

Because it takes information from vehicle telematics, the tool could let dealerships and auto repair shops preemptively reach out to customers to tell them they need service, preventing a potential breakdown on the road.

A feature of the app that suggests likely causes and solutions for car issues also means that customers could avoid paying for services they don’t need once they arrive at an auto repair shop.

While initial results are mixed, the team was able to predict when certain service operations were required fairly well. For example, the model accurately predicted the need for tire inflation about 70 percent of the time. With more data, the team believes that they could significantly improve prediction accuracy by building models specific to each car brand.

The students hope their project can help members of the vehicle service industry understand the benefit of using connected car telematics to improve efficiency and customer service.