Please note: there's a chance there won't be a Data Links post next weekend, as I have some family in town and will be busy in that large room with a blue ceiling that other people call "outside". Fear not, I'll be back on the regular schedule by the end of the month.



  • [PDF] The Princeton Web Transparency and Accountability Project.

    When you browse the web, hidden “third parties” collect a large amount of data about your behavior. This data feeds algorithms to target ads to you, tailor your news recommendations, and sometimes vary prices of online products. The network of trackers comprises hundreds of entities, but consumers have little awareness of its pervasiveness and sophistication.

  • Improve Your Privacy in the Age of Mass Surveillance.

    Today we’ll reclaim our privacy and improve our browsing experience step by step. There is a difference between protecting your grandma sharing cake recipes and a human rights activist in a hostile country. Your granny might not be the right person to sell a prepaid SIM and burner phone to. An activist might consider the steps below entry-level basics, even dangerous if not tailored to the individual. But we all need protection. Even more so if you assume that «you got nothing to hide».


  • After the privacy section, let's stay grounded in dystopia just a bit longer to appreciate this series of posts that analyses ways in which Black Mirror could become reality. Data analysis / machine learning plays a big part in that: I, II, III, IV, V.

  • AI winter is well on its way. Hacker News discussion.

    Visibly, the sentiment has declined quite considerably: there are far fewer tweets praising deep learning as the ultimate algorithm, and the papers are becoming less "revolutionary" and much more "evolutionary". Deepmind hasn't shown anything breathtaking since their Alpha Go zero [and even that wasn't that exciting, given the obscene amount of compute necessary and applicability to games only - see Moravec's paradox]. OpenAI was rather quiet, with their last media outburst being the Dota 2 playing agent [which I suppose was meant to create as much buzz as Alpha Go, but fizzled out rather quickly]. In fact, articles began showing up suggesting that even Google does not know what to do with Deepmind, as their results are apparently not as practical as originally expected... As for the prominent researchers, they've been generally touring around, meeting with government officials in Canada or France to secure their future grants; Yann Lecun even stepped down (rather symbolically) from Head of Research to Chief AI Scientist at Facebook. This gradual shift from rich, big corporations to government-sponsored institutes suggests to me that the interest in this kind of research within these corporations (I think of Google and Facebook) is actually slowly winding down. Again, these are all early signs, nothing spoken out loud, just the body language.

  • Thousands of AI researchers are boycotting the new Nature journal. Hacker News discussion. Less to do with machine learning than with the traditional publisher business model.

    Machine learning is a young and technologically astute field. It does not have the historical traditions of other fields and its academics have seen no need for the closed-access publishing model. The community itself created, collated, and reviewed the research it carried out. We used the internet to create new journals that were freely available and made no charge to authors. The era of subscriptions and leatherbound volumes seemed to be behind us.

    The public already pays taxes that fund our research. Why should people have to pay again to read the results? Colleagues in less well-funded universities also benefit. Makerere University in Kampala, Uganda, has as much access to the leading machine-learning research as Harvard or MIT. The ability to pay no longer determines the ability to play.

  • Google Started a Political Shitstorm Because of Its Over-Reliance on Wikipedia. Take results from a third party, treat them as The Truth and place them next to your own. What could go wrong?

  • Artificial intelligence footstep recognition system could be used for airport security.

    Researchers at The University of Manchester, in collaboration with the Universidad Autónoma de Madrid, Spain, have developed a state-of-the-art artificial intelligence (AI) biometric verification system that can measure a human’s individual gait or walking pattern. It can successfully verify individuals simply by their walking on a pressure pad in the floor and analysing the 3D and time-based footstep data.
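    The quoted description stays high-level, so here is a minimal sketch of what template-based verification on pressure-pad footstep data could look like. Everything below is my own assumption for illustration, not the Manchester/UAM system: the hand-picked features (peak pressure, stance duration, centre-of-pressure path length), the distance threshold, and all names are hypothetical, whereas the real system presumably learns its representation from the raw 3D and time-based data.

    ```python
    import numpy as np

    def footstep_features(pressure_frames):
        """Reduce one footstep recording (time x sensor grid) to a small feature vector.

        pressure_frames: array of shape (T, H, W), pressure readings over time.
        The three features below are illustrative stand-ins, not the real model.
        """
        frames = np.asarray(pressure_frames, dtype=float)
        total = frames.sum(axis=(1, 2))              # total pressure per time step
        active = total > 0.05 * total.max()          # crude stance-phase detection
        stance_duration = float(active.sum())

        # Centre of pressure (pressure-weighted mean position) per active frame.
        h_idx, w_idx = np.indices(frames.shape[1:])
        cop = []
        for t in np.flatnonzero(active):
            w = frames[t] / frames[t].sum()
            cop.append([(h_idx * w).sum(), (w_idx * w).sum()])
        cop = np.array(cop)
        path_len = (np.linalg.norm(np.diff(cop, axis=0), axis=1).sum()
                    if len(cop) > 1 else 0.0)

        return np.array([total.max(), stance_duration, path_len])

    def verify(template, candidate, threshold=1.0):
        """Accept the candidate if its features are close enough to the enrolled template."""
        return bool(np.linalg.norm(template - candidate) < threshold)
    ```

    In practice the features would be normalised (stance duration and pressure live on very different scales) and the threshold tuned on enrolment data; this sketch only shows the enrol-then-compare shape of the idea.
    
    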


Data Links is a periodic blog post published on Sundays (specific time may vary) which contains interesting links about data science, machine learning and related topics. You can subscribe using the general blog RSS feed or, if you are not interested in other things I might publish, this one, which contains only these articles.

Have you read an article you liked and would like to suggest it for the next issue? Just contact me!