New data analysis competitions
- Mercari Price Suggestion Challenge on Kaggle. This is a kernels-only competition: submissions must be made from Kaggle's own Kernels environment. $100,000 in prizes.
The idea of websites tracking users isn’t new, but research from Princeton University released last week indicates that online tracking is far more invasive than most users understand. In the first installment of a series titled “No Boundaries,” three researchers from Princeton’s Center for Information Technology Policy (CITP) explain how third-party scripts that run on many of the world’s most popular websites track your every keystroke and then send that information to a third-party server.
I’ve come to a different conclusion: The assault we face is driven in large measure by the exceptional appetites of a wholly new genus of capitalism, a systemic coherent new logic of accumulation that I call surveillance capitalism. Capitalism has been hijacked by a lucrative surveillance project that subverts the “normal” evolutionary mechanisms associated with its historical success and corrupts the unity of supply and demand that has for centuries, however imperfectly, tethered capitalism to the genuine needs of its populations and societies, thus enabling the fruitful expansion of market democracy.
The federal government is considering allowing private companies to use its national facial recognition database for a fee, documents released under Freedom of Information laws reveal. The partially redacted documents show that the Attorney General’s Department is in discussions with major telecommunications companies about pilot programs for private sector use of the Facial Verification Service in 2018. The documents also indicate strong interest from financial institutions in using the database.
Thomas Hargrove is a homicide archivist. For the past seven years, he has been collecting municipal records of murders, and he now has the largest catalogue of killings in the country—751,785 murders carried out since 1976, which is roughly twenty-seven thousand more than appear in F.B.I. files. States are supposed to report murders to the Department of Justice, but some report inaccurately, or fail to report altogether, and Hargrove has sued some of these states to obtain their records. Using computer code he wrote, he searches his archive for statistical anomalies among the more ordinary murders resulting from lovers’ triangles, gang fights, robberies, or brawls. Each year, about five thousand people kill someone and don’t get caught, and a percentage of these men and women have undoubtedly killed more than once. Hargrove intends to find them with his code, which he sometimes calls a serial-killer detector.
Intelligence agencies have a limited number of trained human analysts looking for undeclared nuclear facilities, or secret military sites, hidden among terabytes of satellite images. But the same sort of deep learning artificial intelligence that enables Google and Facebook to automatically filter images of human faces and cats could also prove invaluable in the world of spy versus spy. An early example: US researchers have trained deep learning algorithms to identify Chinese surface-to-air missile sites—hundreds of times faster than their human counterparts.
Big data and social science are coming together and sharing equal footing at McGill’s new Centre for Social and Cultural Data Science.
Called CSCDS (pronounced as “cascades”) for short, the centre represents an interdisciplinary alliance in which data scientists join forces with research partners in the humanities, arts and social sciences.
A neural network designs Halloween costumes. Thanks to Dani for this link!
I train neural networks, a type of machine learning algorithm, to write humor by giving them datasets that they have to teach themselves to mimic. They can sometimes do a surprisingly good job, coming up with a metal band called Chaosrug, a craft beer called Yamquak and another called The Fine Stranger (which now exists!), and a My Little Pony called Blue Cuss.
So, I wanted to find out if a neural network could help invent Halloween costumes. I couldn’t find a big enough dataset, so I crowdsourced it by asking readers to list awesome Halloween costumes. I got over 4,500 submissions.
Data Links is a periodic blog post, published on Sundays (the exact time may vary), that collects interesting links about data science, machine learning and related topics. You can subscribe through the general blog RSS feed or, if you are not interested in other things I might publish, through this one, which contains only these articles.
Have you read an article you liked that you would like to suggest for the next issue? Just contact me!