New data analysis competitions

  • On Kaggle: Quora Question Pairs.

    Currently, Quora uses a Random Forest model to identify duplicate questions. In this competition, Kagglers are challenged to tackle this natural language processing problem by applying advanced techniques to classify whether question pairs are duplicates or not. Doing so will make it easier to find high-quality answers to questions, resulting in an improved experience for Quora writers, seekers, and readers.


  • Beijing to release national artificial intelligence development plan.

    Beijing is drafting a national development plan on artificial intelligence and setting up a special fund as part of an effort to push the technology's application in the economy and national security, said China's top technology official.

  • New Machine Learning Framework Uncovers Twitter's Vast Bot Population.

    Up to 15 percent of active Twitter accounts are really bots: autonomous agents driven by algorithms rather than actual human personalities. That's about 48 million fakes. The 15 percent figure comes courtesy of a new analysis by computer scientists at Indiana University and the University of Southern California using a machine learning framework designed to detect bots based on nearly a thousand distinct Twitter user characteristics. The group's work is described in a paper posted this week to the arXiv preprint server.

  • AI's PR Problem.

    I'd suggest that one problem with AI is the name itself, coined more than 50 years ago to describe efforts to program computers to solve problems that required human intelligence or attention. Had artificial intelligence been named something less spooky, it might seem as prosaic as operations research or predictive analytics.

  • Deep Learning First: Drive.ai's Path to Autonomous Driving.

    It's only been about a year since Drive.ai went public, but already, the company has a fleet of four vehicles navigating around the San Francisco Bay Area (mostly) autonomously, even in situations that are notoriously difficult for self-driving cars, like at night, or when it's raining.


Data Links is a periodic blog post, published on Sundays (the specific time may vary), that contains interesting links about data science, machine learning, and related topics. You can subscribe to it using the general blog RSS feed or, if you are not interested in other things I might publish, this one, which contains only these articles.

Have you read an article you liked and would you like to suggest it for the next issue? Just contact me!