Web-based startup success prediction

B Sharchilev, M Roizner, A Rumyantsev, D Ozornin, P Serdyukov, M de Rijke
Proceedings of the 27th ACM international conference on information and …, 2018 - dl.acm.org
We consider the problem of predicting the success of startup companies at their early development stages. We formulate the task as predicting whether a company that has already secured initial (seed or angel) funding will attract a further round of investment in a given period of time. Previous work on this task has mostly been restricted to mining structured data sources, such as databases of the startup ecosystem consisting of investors, incubators and startups. Instead, we investigate the potential of using web-based open sources for the startup success prediction task and model the task using a very rich set of signals from such sources. In particular, we enrich structured data about the startup ecosystem with information from a business- and employment-oriented social networking service and from the web in general. Using these signals, we train a robust machine learning pipeline encompassing multiple base models using gradient boosting. We show that utilizing companies' mentions on the Web yields a substantial performance boost in comparison to only using structured data about the startup ecosystem. We also provide a thorough analysis of the obtained model that allows one to obtain insights into both the types of useful signals discoverable on the Web and market mechanisms underlying the funding process.
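The task formulation above (a seed- or angel-funded company is labeled successful if it attracts a further round within a given period) can be illustrated with a minimal sketch. The function name `label_startup`, the 365-day horizon, and the example dates are assumptions made for illustration only; the abstract does not state the length of the period or how the labels were actually built.

```python
from datetime import date, timedelta


def label_startup(seed_date, later_round_dates, horizon_days=365):
    """Return 1 if a further funding round falls within the horizon, else 0.

    seed_date: date of the initial (seed or angel) round.
    later_round_dates: dates of any subsequent rounds (may be empty).
    horizon_days: assumed prediction window; 365 is a placeholder value.
    """
    deadline = seed_date + timedelta(days=horizon_days)
    # A round counts only if it comes strictly after the seed round
    # and no later than the end of the window.
    return int(any(seed_date < d <= deadline for d in later_round_dates))


# Hypothetical companies, not data from the paper.
followed_up = label_startup(date(2015, 1, 10), [date(2015, 9, 1)])
too_late = label_startup(date(2015, 1, 10), [date(2017, 3, 1)])
no_rounds = label_startup(date(2015, 1, 10), [])
```

Under this formulation each company becomes one binary-labeled example, so any standard classifier (the abstract mentions gradient boosting) can be trained on features gathered before the window starts.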