Introduced by Jordan Cox, Rob Holbrook, Ramon Lim and Stephanie Zhu
Preparing data for analysis often takes significant time, effort and knowledge. Now a group of UC Berkeley students is introducing a solution to make data analysis a little easier: the Match ‘n’ Merge Application.
Powered by Python, JavaScript and HTML, Match ‘n’ Merge is an intuitive web-based application that lets you load disparate data sets and merge them into one, helping you jumpstart your data analysis project.
The application uses predictive models to match columns across your separate data sets and applies string pattern recognition algorithms to match the rows, producing a single, consistent merged data set.
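To make the two-stage idea concrete, here is a minimal sketch in Python. It is not the application's actual code: plain string similarity from the standard library's `difflib` stands in for both the predictive column models and the string pattern recognition algorithms, and the names `similarity`, `match_columns`, `merge_rows` and the thresholds are hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0..1 similarity score for two strings, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_columns(cols_a, cols_b, threshold=0.5):
    """Pair each column name in cols_a with its most similar name in cols_b.

    Assumption: name similarity is a simple stand-in for a predictive model.
    """
    mapping = {}
    for col_a in cols_a:
        best = max(cols_b, key=lambda col_b: similarity(col_a, col_b))
        if similarity(col_a, best) >= threshold:
            mapping[col_a] = best
    return mapping


def merge_rows(rows_a, rows_b, key_a, key_b, threshold=0.8):
    """Join rows whose key values are approximately equal strings."""
    merged = []
    for row_a in rows_a:
        best = max(
            rows_b,
            key=lambda row_b: similarity(row_a[key_a], row_b[key_b]),
            default=None,
        )
        if best is not None and similarity(row_a[key_a], best[key_b]) >= threshold:
            combined = dict(best)   # start from the matched row in the second set
            combined.update(row_a)  # values from the first set win on name clashes
            merged.append(combined)
    return merged


# Two tiny, made-up data sets with differently named columns.
rows_a = [{"Name": "Jon Smith", "City": "Berkeley"}]
rows_b = [
    {"full_name": "John Smith", "sales": "10"},
    {"full_name": "Ann Lee", "sales": "7"},
]

columns = match_columns(["Name", "City"], ["full_name", "city", "sales"])
merged = merge_rows(rows_a, rows_b, "Name", columns["Name"])
```

In this sketch a row from the first data set is dropped when nothing in the second set clears the threshold. A real tool would more likely keep unmatched rows and flag them for review.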
Version 1.0 allows merging of two CSV data sets and focuses on a specific set of alphanumeric features.
Future releases will allow you to merge multiple data sets of varying forms. They will also automatically generate relevant features for stronger predictive modeling and use probabilistic joins to fill in missing data.