Concept: Predicting future outcomes based on historical records

Introduced by Pawel Gniewek, PhD student researcher at UC Berkeley. The ability to predict future outcomes is of tremendous interest in daily life, for example when we make decisions about our careers.

To that end, I have obtained the publication record of the American Physical Society. The database contains the papers published in the Physical Review journals dating back to the middle of the twentieth century. These records will serve as a proxy for trends in physics over that period.

We observe that some individuals are better than others at recognizing promising and important scientific topics. Some of them are also responsible for the emergence of new scientific fields. It is therefore tempting to use a comprehensive record of published papers to investigate this observation.

The project has two major goals:

  1. Using natural language processing tools, we will process the abstracts and titles of the papers to extract keywords. Those keywords are meant to serve as a proxy for the scientific fields in physics, for example: granular materials, quantum dots, dark matter, etc. Based on that, we will trace how these trends evolve over time and pinpoint where and when they emerge.
  2. Using the papers' keywords (obtained in step 1), authors, and affiliations, we will train a neural network to predict the evolution of the fields and subfields of physics over time.
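
As a rough illustration of step 1, the sketch below counts two-word phrases in paper titles per year and reports the first year in which a phrase appears. It is only a minimal sketch under stated assumptions: the `PAPERS` records are invented toy data, not entries from the APS database, and the stop-word filter on adjacent word pairs stands in for whatever natural language processing tools the project ends up using. All function names are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy records (year, title); NOT real entries from the APS database.
PAPERS = [
    (1990, "Flow of granular materials in a rotating drum"),
    (1996, "Optical properties of quantum dots"),
    (1998, "Electron transport in quantum dots"),
    (1998, "Segregation in granular materials"),
    (2001, "Quantum dots as single photon sources"),
]

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "and", "as", "for", "in", "of", "on", "the", "with"}


def extract_bigrams(text):
    """Return adjacent word pairs in which neither word is a stop word."""
    words = re.findall(r"[a-z]+", text.lower())
    return [
        f"{first} {second}"
        for first, second in zip(words, words[1:])
        if first not in STOPWORDS and second not in STOPWORDS
    ]


def keyword_counts_by_year(papers):
    """Map each year to a Counter of how many papers mention each bigram."""
    counts = defaultdict(Counter)
    for year, title in papers:
        # set() so that a phrase is counted at most once per paper.
        counts[year].update(set(extract_bigrams(title)))
    return counts


def yearly_series(counts, keyword):
    """Number of papers per year that mention the keyword, in year order."""
    return {year: counts[year][keyword] for year in sorted(counts)}


def first_year(counts, keyword, min_papers=1):
    """First year with at least min_papers papers mentioning the keyword."""
    for year in sorted(counts):
        if counts[year][keyword] >= min_papers:
            return year
    return None


if __name__ == "__main__":
    counts = keyword_counts_by_year(PAPERS)
    print(yearly_series(counts, "quantum dots"))
    print(first_year(counts, "granular materials"))
```

The per-year series produced this way is the kind of signal that step 2 could take as input, alongside author and affiliation data.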

If successful, this model may serve as an advisory tool for young scientists deciding on their future careers and for academic boards that distribute public resources.
