Concept: Inferred Information via Probabilistic Joins

Concept also introduced by Shomit Ghose, Data-X Advisor

This topic relates to projects that make predictions on topics where not all the data is available. For example, the goal may be to predict a feature like “voting preference”, and some training data may exist based on name, age, sex, and zip code. However, there may be other macro-level data available for various zip codes that is not tied directly to the target person. A method of probabilistically joining new features can be used as part of the prediction. For example, there may be a relationship between voting preference and income, and another relationship between zip code and income, so the two can be chained in a probabilistic manner to sharpen the estimate.
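As a rough illustration of the zip code, income, and voting preference example, the chaining can be written as a marginalization over the unobserved income bracket: P(vote | zip) = sum over income of P(vote | income) * P(income | zip). The Python sketch below is only one possible reading of the idea, not a definitive method. The zip labels, income brackets, party labels, and every probability in it are invented for illustration, and it assumes that voting preference depends on zip code only through income.

```python
# Hypothetical conditional tables; every number here is made up for illustration.

# P(income bracket | zip code), e.g. taken from aggregate data about each zip code.
p_income_given_zip = {
    "zip_1": {"low": 0.5, "mid": 0.3, "high": 0.2},
    "zip_2": {"low": 0.1, "mid": 0.3, "high": 0.6},
}

# P(voting preference | income bracket), e.g. taken from a separate survey.
p_vote_given_income = {
    "low": {"party_x": 0.6, "party_y": 0.4},
    "mid": {"party_x": 0.5, "party_y": 0.5},
    "high": {"party_x": 0.3, "party_y": 0.7},
}


def probabilistic_join(zip_code):
    """Estimate P(vote | zip) by summing over the unobserved income bracket.

    Assumes voting preference is conditionally independent of zip code
    given income, which is a simplification.
    """
    result = {}
    for income, p_income in p_income_given_zip[zip_code].items():
        for vote, p_vote in p_vote_given_income[income].items():
            result[vote] = result.get(vote, 0.0) + p_income * p_vote
    return result


estimate_zip_1 = probabilistic_join("zip_1")
estimate_zip_2 = probabilistic_join("zip_2")
print(estimate_zip_1)
print(estimate_zip_2)
```

The two tables share no person-level key. Income acts as the bridge between them, and the result is a distribution over voting preference for each zip code rather than a single matched record. In practice such an estimate would be one more input feature alongside name, age, and sex, not a replacement for them.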

Topic: Probabilistic Joins of Disparate Data Sets

Big Data is vast in volume and also vast in variety, being drawn from a seemingly infinite set of sources. But Big Data’s full benefit can only be gained if arbitrary data from arbitrary sources can be stitched together in statistically valid ways. If data from different sources cannot be combined through common keys, as is traditionally done in database applications, it must instead be stitched together using probability-driven connections. For example, how do you tie a data set that includes gender to one that includes location if there is no common key? What statistical truths can be applied to tie these two data sets together? Probability-driven data joins promise to enable combining and correlating data drawn from different organizations and different endpoints to yield wholly unique insights.
