Concept: Deep-learning startup focused on optimizing supply chains for manufacturers

Introduced by Moataz Rashad, Founder & CEO of DeepVu, @deepvuhq, @moatazr.

DeepVu is a deep-learning startup focused on optimizing supply chains for manufacturers. We work with tier-1 manufacturers in the US and Asia.

One of the secondary use cases that we encounter involves forecasting the price of certain commodities that are key constituents of our customers’ product Bills of Materials. It is often necessary to forecast the price of such a commodity (for example, copper, PVC, aluminum, or iron ore) several months into the future, and in some cases a year out, in order to inform our model’s predictions of the price of the manufactured parts.

For this competition project, you’ll be given a commodity’s market data, typically covering 5 years, with the following attributes: open price, close price, trading volume, day high, day low, etc.
You are free to add columns that you think may enrich your model, for example, the price of gasoline, GDP data, etc.
You are free to choose any deep-learning or traditional machine-learning model.

Framework Requirements:
Only TensorFlow or PyTorch is allowed, and we encourage you to use GPUs.

We recommend splitting the dataset into 80% for training, 10% for validation, and 10% for testing.
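The recommended 80/10/10 split could be sketched as follows. This is a minimal illustration, not part of the competition kit: it assumes the rows are already sorted by date and splits them in order without shuffling, a common choice for price series that the brief itself does not mandate. The function name `chronological_split` and the toy data are hypothetical.

```python
def chronological_split(rows, train_frac=0.8, val_frac=0.1):
    """Split time-ordered rows into train/validation/test without shuffling.

    Assumption: `rows` is sorted from oldest to newest, so the test set
    holds the most recent observations.
    """
    n = len(rows)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return rows[:train_end], rows[train_end:val_end], rows[val_end:]


# Toy stand-in for 1000 daily records (hypothetical data).
daily_rows = list(range(1000))
train, val, test = chronological_split(daily_rows)
print(len(train), len(val), len(test))
```

Keeping the order intact means the model is never validated or tested on dates that precede its training data.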

Performance Goal: Mean Absolute Error of 7.5% as measured against actual prices for the time period from 4 weeks out (from the last date in the given training set) to 12 weeks out.
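One way to check a forecast against this goal is sketched below. Because the target is stated as a percentage, the sketch assumes the error is measured relative to each actual price (a mean absolute percentage error); that reading is an assumption, as are the function name and the sample numbers.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / |actual|, expressed in percent.

    Assumption: the 7.5% goal is read as a percentage of the actual price.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical prices for a few days inside the 4-to-12-week horizon.
actual_prices = [100.0, 200.0, 400.0]
predicted_prices = [110.0, 190.0, 400.0]

score = mean_absolute_percentage_error(actual_prices, predicted_prices)
meets_goal = score <= 7.5
print(f"error = {score:.2f}%, meets goal: {meets_goal}")
```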

Submission Deadline: April 15th, 2018. You’ll need to submit:
a. Your source code in Python (a GitHub repo is fine)
b. A README file with steps that include any framework/library dependencies and how to train and run inference
c. A CSV file with the predicted vs. actual prices on the given test set
d. A CSV file with predictions for daily prices from 4 weeks out to 12 weeks out
e. A performance numbers and analysis document

1st place: $2500 cash award + priority placement for a paid summer internship
2nd place: $1200 cash award + priority placement for a paid summer internship

Concept: 3D body scanner to revolutionize how we discover & shop for clothing

Introduced by Richard D. Berwick, Co-founder, Twindom, 3D Scanning, 3D Printing, and 3D SAAS, (765) BER-WICK
What Twindom does:
Twindom is leveraging the photorealistic quality of our 3D body scanner and the data we’ve captured to date to produce Drapr, a Virtual Fitting product. Our goal is to revolutionize how we discover and shop for clothing by applying bleeding-edge research in body scanning, body modeling, physics simulation, geometry processing, machine learning, and more.
Who we are:
As a team, we (1) seek to be world class, (2) get things done, (3) love to learn, (4) thirst to build products customers love, and (5) believe you have to act differently and creatively to stand out. If you believe in these principles, you’ll fit in well at Twindom. We’re looking for the best in the world – and if that is you, you’re a good candidate for the job.
What you’ll do:
You will be responsible for designing and developing major components of the virtual fitting pipeline. Our small and dedicated team is currently tackling a wide variety of problems in computer vision, including image segmentation, mesh segmentation, image recognition, mesh manipulation, rendering, body modeling, physics simulation, and 3D scanning. The challenges we encounter are endless — and as a current/past Cal student, you know how to solve problems.

Concept: Science-based sleep coaching

Introduced by Dax Vivid, Postdoctoral Researcher at UC Berkeley.

About SleepWeb:
We are harnessing the capabilities of existing sensors to provide science-based sleep coaching. Our team includes a mechanical engineer and a biologist. We are looking for software developers to join our team. This is an opportunity for specialists in AI, statistics, and machine learning to develop the backend of software that integrates the basic environmental and physiological constituents of good sleep and then offers a framework for advice based on user inputs.

Concept: Creating a real-time newsbot

Introduced by Jason Best, Vectr Ventures; David Law, SCET Berkeley; and Anand Gomes, CEO & Founder at RiskEx.

Concept: Creating a real-time newsbot that scrapes and publishes the most relevant news based on one or more tickerized asset classes and related filters. These news items are then fed into a single-page web widget. This will be the first truly “unbundled” news bot for the financial markets.

Problem: Even in 2018, there is no free or inexpensive way to aggregate news relevant to a specific asset class in a single-page application simply by toggling the asset class using its ticker. Certain paid services such as Bloomberg and Reuters do a pretty good job, but they cost $25K per year and there is no way to “unbundle” their news offerings. This is similar to what is happening with cable television and the desire to purchase only the specific channels that you watch or care about.

Concept: Transform dirty financial data into actionable market intelligence for commodities

Introduced by Jason Best, Vectr Ventures; David Law, SCET Berkeley; and Anand Gomes, CEO & Founder at RiskEx.

Concept 1: Transform dirty financial data, supplied for regulatory compliance, into actionable market intelligence for traders of commodities (potentially extendable to crypto assets). This would be the first unbundled service of this type in the U.S. financial markets, and it has the potential for massive adoption within the commodities markets. The result of this project will be real-time scraping of Swap Data Repository data to extract relevant and accurate insight on derivative trading volumes for the Over-The-Counter (OTC) commodity markets.

Problem: A “Swap Data Repository” (SDR) is an entity created by the Dodd-Frank Act in 2010 to provide a central facility for derivative trading data reporting and record-keeping by financial institutions, so that regulators have a better idea of the risk held on those institutions’ balance sheets.

However, the data suffers greatly from a lack of standardization and is extremely difficult to understand (by the own admission of the regulatory body, the CFTC; see “CFTC Data Cleanup”), mainly due to:

  • The bespoke nature of the derivatives (tailored to the risk/investment profiles of the participants)
  • Different internal record-keeping conventions for units and instruments (gallons, barrels, MTs, swaps vs futures, different strategy naming conventions)

Further, relative to other financial asset classes, OTC derivatives data has a high degree of dimensionality (amortization schedule, collateralization type, settlement type, day-count conventions, payment frequency, reset frequency, strike, underlying, and term, in addition to price, quantity, and other attributes). Today, each OTC trading desk tasks a human analyst with cleaning up the data on a daily or weekly basis, which is extremely time-consuming and costly.
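As a small illustration of the unit-convention problem described above, the sketch below maps a few spellings of gallons, barrels, and metric tons to canonical names and converts gallons to barrels (1 US oil barrel is 42 US gallons). The alias table and function name are hypothetical and do not reflect any actual SDR field layout. Metric tons are left unconverted because converting mass to volume needs a product-specific density.

```python
GALLONS_PER_BARREL = 42  # 1 US oil barrel = 42 US gallons

# Hypothetical alias table; real SDR records may use other spellings.
UNIT_ALIASES = {
    "bbl": "barrel", "bbls": "barrel", "barrel": "barrel", "barrels": "barrel",
    "gal": "gallon", "gallon": "gallon", "gallons": "gallon",
    "mt": "metric_ton", "mts": "metric_ton",
}


def normalize_quantity(quantity, unit):
    """Return (quantity, canonical_unit), expressing volumes in barrels."""
    canonical = UNIT_ALIASES.get(unit.strip().lower())
    if canonical is None:
        raise ValueError(f"unrecognized unit: {unit!r}")
    if canonical == "gallon":
        return quantity / GALLONS_PER_BARREL, "barrel"
    # Barrels and metric tons pass through unchanged.
    return float(quantity), canonical


print(normalize_quantity(84, "Gallons"))
print(normalize_quantity(5, "BBL"))
print(normalize_quantity(10, "MT"))
```

Raising an error on unknown units, rather than guessing, keeps unrecognized records visible for the analyst to review.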

Concept: Early Earthquake Warning

Introduced by Alejandro Cantu, CEO of SkyAlert

A seismic network is deployed along the Pacific Coast, one of the seismically active zones in Mexico. The network is composed of sensors grouped into clusters. Each cluster is located in a different city and can hold from 1 to 5 sensors. Every time the system confirms an earthquake, the parameter MDACT changes from OFF to ON.


Based on the database, determine the best way to notify the occurrence of an earthquake, taking into account:
1. One single station trigger
2. Multi-station trigger
3. Correlation between the seismic catalog and the sensors database.
4. Additionally, take into account that we want to sense and report medium- to large-magnitude earthquakes, which are usually the ones that affect cities the most, and avoid sensing or reporting small earthquakes.
5. Some of the earthquakes were felt in Mexico City; determine a rule for notifying about those earthquakes.
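To make the single-station versus multi-station distinction concrete, here is a minimal sketch of a coincidence rule: an alert fires only when enough distinct stations trigger within a short time window. The station identifiers, the `min_stations` threshold, and the `window_s` value are all hypothetical and would need tuning against the actual sensor database and seismic catalog. The sketch does not address magnitude filtering.

```python
def multi_station_trigger(detections, min_stations=3, window_s=10.0):
    """Return True if at least `min_stations` distinct stations triggered
    within any `window_s`-second window.

    `detections` is an iterable of (station_id, timestamp_seconds) pairs.
    Setting min_stations=1 reduces this to a single-station trigger.
    """
    events = sorted(detections, key=lambda d: d[1])
    counts = {}  # station_id -> detections currently inside the window
    start = 0
    for station, t in events:
        counts[station] = counts.get(station, 0) + 1
        # Drop detections that fell out of the window ending at time t.
        while t - events[start][1] > window_s:
            old_station = events[start][0]
            counts[old_station] -= 1
            if counts[old_station] == 0:
                del counts[old_station]
            start += 1
        if len(counts) >= min_stations:
            return True
    return False


# Hypothetical detections: three stations within 5 seconds.
clustered = [("A", 0.0), ("B", 2.0), ("C", 5.0)]
# One noisy station plus a late, unrelated trigger.
scattered = [("A", 0.0), ("A", 1.0), ("B", 30.0)]

print(multi_station_trigger(clustered))
print(multi_station_trigger(scattered))
```

Requiring agreement across stations trades a little latency for fewer false alarms from a single noisy sensor.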

Concept: Trademark infringement

Introduced by SB Master, Founder & CEO of Naming Matters

About Naming Matters

Naming Matters is an automated branding platform that provides creative and vetting tools for naming new companies, brands, products, and services. Naming Matters reimagines this stressful and expensive process, saving hundreds of hours and fostering creativity with its beautiful name space visualization. Its automated risk assessment of potential names requires no legal knowledge from the user; Naming Matters performs the assessment algorithmically, covering registered and pending trademarks, URLs, and social website usernames. Patented and patent-pending, Naming Matters is the latest invention by world naming expert SB Master, founder of Master-McNeil, who previously named PayPal, Ariba, Concur, Affirm, Eos, Zesty, Causes, Clarium, various cars, and 60+ products for Apple, among hundreds of additional companies, products, and services. For more background, two recent articles about Naming Matters are included below, from TechCrunch and the Harvard Business School magazine.
About The Project

Being sued for trademark infringement is a serious problem, and selecting names is thus risky business. Trademark infringement is a product of similarity of name combined with similarity of goods or services; the more similar your proposed name and goods are to someone else’s, the riskier that name. Our current product addresses Registered trademarks. But in many countries, users are able to build “Common Law” rights to names merely through use, without ever registering them anywhere as a trademark. Our challenge here is to find these Common Law uses of the names our users are considering. We need to identify both identical names and phonetically similar names. Ideally, we will be able to find these uses and also provide some information on what the name is being used for. This could include a “clip” or excerpt showing how the name is being used in text, or something more definitive such as an industry classification. We expect the richest places to look for name use to be big datasets such as the archives of major newspapers and business publications; business announcement services such as PR Newswire; and industry-specific publications.
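As one illustration of what “phonetically similar” could mean in practice, the sketch below implements American Soundex, a classic phonetic code, and compares candidate names by their codes. Soundex is used here only as a stand-in; the project description does not name a phonetic algorithm, and Soundex has known blind spots (for example, it never matches names that start with different letters, such as “Kwik” and “Quick”).

```python
# Consonant groups of American Soundex; vowels, y, h, and w carry no code.
_SOUNDEX_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _SOUNDEX_CODES[_ch] = _digit


def soundex(name):
    """Return the 4-character American Soundex code of `name`."""
    letters = [c for c in name.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = _SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


def sounds_alike(name_a, name_b):
    """True when both names share a non-empty Soundex code."""
    code = soundex(name_a)
    return code != "" and code == soundex(name_b)


print(soundex("Robert"), soundex("Rupert"))
print(sounds_alike("Lyft", "Lift"))
```

A production system would likely combine several phonetic and string-distance measures rather than rely on a single code.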

Predicting New Venture Success

If you are interested in predicting the success of new ventures, look at this article:

Concept: Deep Learning and NLP for Personalized Sales Email Generation

Introduced by Yaron Oren, Founder & CEO.

B2B lead generation is big business. Hundreds of billions of dollars are spent on it, yet generating high-quality business leads remains every Sales & Marketing organization’s top challenge, from small startups to large enterprises. We all get inbound email solicitations that are broad, unoriginal, and appear to have been generated from a robotic template. When we get something personalized, there is usually a high level of manual and expensive effort behind it. Can we automate and improve the current email generation process by leveraging several deep learning algorithms and the social genome to compose personal emails? Can we create an AI engine that generates sophisticated emails that are personal and hook the reader? In tackling this problem, you will both source publicly available email training data (we can point you in the right direction) and explore advanced NLP techniques and supervised and unsupervised deep learning algorithms ranging from Recurrent Neural Networks (RNNs) and LSTMs to Deep Boltzmann Machines and autoencoders.

Concept: Blockchain-based social currency to regulate social platforms such as Twitter

Introduced by Sanjeev Verma, Blockchain and Mobile Security Architect at Samsung Electronics.

Concept: To improve the quality of social media content, this project will use a blockchain-based social currency to regulate social media platforms such as Twitter.

Problem or Why?: Social media platforms are free-for-all platforms. They are a great medium for sharing information and interacting with each other, and most social platforms started with this objective in mind. However, the quality of posts has degenerated over time due to a lack of regulation. The value of a social platform such as Twitter depends on the quality of its posts, which is what attracts advertising dollars.

Here are some of the issues with Twitter:

  • Use of fake accounts—sometimes by the same person.
  • Use of Trolls to bring down the reputation of someone—some politicians use paid Trolls against their political opponents.
  • Abusive & Threatening Posts.
  • Sharing of fake News, fake Photos, fake Videos
  • People take pride (social status) in the number of followers, and businesses have sprung up that create fake followers.

This list is not comprehensive. These problems discourage people, and some leave Twitter because of harassment.


A blockchain-based cryptocurrency can be used to self-regulate social media such as Twitter. A cryptocurrency is something of value; in the case of social media, that value is “reputation”. People participating in social media such as Twitter should be rewarded for good behavior and punished for bad behavior. They should also be rewarded for improving the quality of the platform through their posts. People could be further incentivized to use their earned “social cryptocurrency” to buy products advertised on the platform. Machine learning algorithms can also be used to flag abusive posts.


A regulated platform will improve the quality of content shared through social media—attracting more people to the platform. This will in turn attract more advertising dollars to the platform.