New data analysis competitions

  • Kaggle has a new competition using financial data, and the novelty this time is that everything will be run using Kaggle's script system: you code and run directly on their servers. I'm curious to see how this develops. There is up to $100,000 in prizes, paid out down to 7th place, which always helps.

How-to

Privacy

On Monday, a top prosecutor in the Brooklyn district attorney's office was arrested amid allegations that she used an illegal wiretap to eavesdrop on a coworker and an NYPD detective.

But in a twist worthy of a Mary Higgins Clark novel, the New York Times reports that the incident was part of a "love triangle gone wrong," and an anonymous law enforcement official said that prosecutor Tara Lenich carried out the wiretap because of "a personal entanglement between her and the detective." The New York Daily News reports that the two "worked closely [together] on a high-profile gun trafficking case."

Tech

Google researchers have worked with doctors to develop an AI that can automatically identify diabetic retinopathy, a leading cause of blindness among adults. Using deep learning—the same breed of AI that identifies faces, animals, and objects in pictures uploaded to Google’s online services—the system detects the condition by examining retinal photos. In a recent study, it succeeded at about the same rate as human ophthalmologists, according to a paper published today in the Journal of the American Medical Association.

These "sentence compression algorithms" just went live on the desktop incarnation of the search engine. They handle a task that's pretty simple for humans but has traditionally been quite difficult for machines. They show how deep learning is advancing the art of natural language understanding, the ability to understand and respond to natural human speech. "You need to use neural networks—or at least that is the only way we have found to do it," Google research product manager David Orr says of the company's sentence compression work. "We have to use all of the most advanced technology we have."

Using artificial intelligence to flag live video is still at the research stage, and has two challenges, Candela said. "One, your computer vision algorithm has to be fast, and I think we can push there, and the other one is you need to prioritize things in the right way so that a human looks at it, an expert who understands our policies, and takes it down."

One result of this new fashion is that a few big new applications are being explored, in places with enough data and potential prediction value to make them decent candidates. But another result is the one described in my tweet above: fashion-induced overuse of more expensive new methods on smaller problems to which they are poorly matched. We should expect this second result to produce a net loss on average. The size of this loss could be enough to outweigh all the gains from the few big new applications; after all, most value is usually achieved in many small problems.

Visualizations