How-to

  • Two Decades of Recommender Systems at Amazon.com. Hacker News discussion here.

    Amazon.com launched item-based collaborative filtering in 1998, enabling recommendations at a previously unseen scale for millions of customers and a catalog of millions of items. Since we wrote about the algorithm in IEEE Internet Computing in 2003[2] it has seen widespread use across the Web, including YouTube, Netflix, and many others. The algorithm's success has been from its simplicity, scalability, and often surprising and useful recommendations, as well as desirable properties such as updating immediately based on new information about a customer and being able to explain why it recommended something in a way that's easily understandable.

    What was described in our 2003 IEEE Internet Computing article has faced many challenges and seen much development over the years. Here, we describe some of the updates, improvements, and adaptations for item-based collaborative filtering, and offer our view on what the future holds for collaborative filtering, recommender systems, and personalization.
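To make the item-based collaborative filtering idea in the excerpt above more concrete, here is a minimal sketch of my own. It is not Amazon's production code, and it assumes a plain cosine similarity over binary purchase data. The names `item_similarities`, `recommend`, and the toy `baskets` data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarities(baskets):
    """Cosine similarity between items, from binary customer->items purchases."""
    item_counts = defaultdict(int)   # number of customers who bought each item
    pair_counts = defaultdict(int)   # number of customers who bought both items
    for items in baskets.values():
        unique = sorted(set(items))
        for item in unique:
            item_counts[item] += 1
        for a, b in combinations(unique, 2):
            pair_counts[(a, b)] += 1

    sims = defaultdict(dict)
    for (a, b), together in pair_counts.items():
        score = together / sqrt(item_counts[a] * item_counts[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims


def recommend(customer_items, sims, top_n=3):
    """Return (recommended item, owned item that best explains it) pairs."""
    owned = set(customer_items)
    scores = defaultdict(float)
    reasons = {}  # recommended item -> (most similar owned item, similarity)
    for item in owned:
        for other, score in sims.get(item, {}).items():
            if other in owned:
                continue
            scores[other] += score
            if other not in reasons or score > reasons[other][1]:
                reasons[other] = (item, score)
    # Highest total similarity first; item name breaks ties deterministically.
    ranked = sorted(scores, key=lambda i: (-scores[i], i))
    return [(i, reasons[i][0]) for i in ranked[:top_n]]


# Toy data, invented purely for illustration.
baskets = {
    "ann": ["book", "lamp"],
    "bob": ["book", "lamp", "pen"],
    "cat": ["book", "pen"],
    "dan": ["lamp", "desk"],
}
sims = item_similarities(baskets)
print(recommend(["book", "lamp"], sims))
```

The similarity table can be computed ahead of time, so a recommendation only needs a lookup over the items a customer already has. That is one way to read the "updating immediately" property the authors mention. The second element of each pair is a simple stand-in for the "because you bought X" explanation.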

Privacy

  • New documents reveal Kenya's worrying attempts to monitor the Internet.

    In January 2017, Kenya’s information and communication technology regulator, the Communications Authority of Kenya, announced that it was spending over 2 billion shillings (around 14 million USD) on new initiatives to monitor Kenyans’ communications and regulate their communications devices. The press lit up with claims of spying, and members of Kenya’s ICT community vowed to reject the initiatives as violating Kenyans’ constitutional rights, including the right to privacy (Article 31). Nevertheless, the Communications Authority claimed that these projects would help prevent a repeat of the post-election violence following the 2007 presidential elections.

    As political tension continues to mount in the run up to next month’s presidential elections, the Kenyan government has also rushed to operationalise a cybersecurity strategy that was first articulated in 2014. But what does the development of Kenya’s cybersecurity practices practically mean for Kenyan citizens? 

  • China's Surveillance Plans Include 600 Million CCTV Cameras Nationwide, And Pervasive Facial Recognition.

    Two of the recurrent themes here on Techdirt recently are China's ever-widening surveillance of its citizens, and the rise of increasingly powerful facial recognition systems. Those two areas are brought together in a fascinating article in the Wall Street Journal that explores China's plans to roll out facial recognition systems on a massive scale. That's made a lot easier by the pre-existing centralized image database of citizens, all of whom must have a government-issued photo ID by the age of 16, together with billions more photos found on social networks, to which the Chinese government presumably has ready access.

Tech

  • The Real Threat of Artificial Intelligence.

    Unlike the Industrial Revolution and the computer revolution, the A.I. revolution is not taking certain jobs (artisans, personal assistants who use paper and typewriters) and replacing them with other jobs (assembly-line workers, personal assistants conversant with computers). Instead, it is poised to bring about a wide-scale decimation of jobs — mostly lower-paying jobs, but some higher-paying ones, too.

  • CAN: Creative Adversarial Networks, Generating "Art" by Learning About Styles and Deviating from Style Norms.

    We propose a new system for generating art. The system generates art by looking at art and learning about style; and becomes creative by increasing the arousal potential of the generated art by deviating from the learned styles. We build over Generative Adversarial Networks (GAN), which have shown the ability to learn to generate novel images simulating a given distribution. We argue that such networks are limited in their ability to generate creative products in their original design. We propose modifications to its objective to make it capable of generating creative art by maximizing deviation from established styles and minimizing deviation from art distribution. We conducted experiments to compare the response of human subjects to the generated art with their response to art created by artists. The results show that human subjects could not distinguish art generated by the proposed system from art generated by contemporary artists and shown in top art fairs. Human subjects even rated the generated images higher on various scales.

  • Banks Deploy AI to Cut Off Terrorists’ Funding.

    Banks have long used anti-money laundering systems to flag suspicious activity, and in the aftermath of September 11th, they have turned to those same legacy tools to catch terror-related transactions, too. But these legacy tools are not up to the job. They rely upon hard-coded “if-then” rules about predictably suspicious behavior. If the software spots a seven-figure transfer of funds from Miami to Bogota, for example, it knows to flag it. But as terrorist groups like ISIS recruit people internationally for smaller, targeted attacks, those tools become far less effective. There are just too many rules and possibilities to consider.
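The hard-coded "if-then" rules that the excerpt above describes can be pictured with a small sketch. This is only my illustration, not how any real anti-money-laundering product works. The `Transfer` type, the rule name, and the choice to treat "seven-figure" as 1,000,000 USD or more are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    amount_usd: float
    origin: str
    destination: str


def large_miami_to_bogota(t: Transfer) -> bool:
    # Assumption: "seven-figure" is read as at least 1,000,000 USD.
    return (
        t.amount_usd >= 1_000_000
        and t.origin == "Miami"
        and t.destination == "Bogota"
    )


# Each rule is a (label, predicate) pair. Covering a new pattern means
# writing another rule by hand, which is the scaling problem the article
# points at.
RULES = [
    ("seven-figure Miami-to-Bogota transfer", large_miami_to_bogota),
]


def flag(transfer: Transfer) -> list[str]:
    """Return the labels of every rule the transfer triggers."""
    return [label for label, rule in RULES if rule(transfer)]


print(flag(Transfer(1_500_000, "Miami", "Bogota")))
print(flag(Transfer(4_000, "Miami", "Bogota")))
```

The second transfer triggers nothing because no rule was written for small amounts, which shows how small, targeted payments can slip past a fixed rule list.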


Data Links is a periodic blog post published on Sundays (the specific time may vary) that collects interesting links about data science, machine learning, and related topics. You can subscribe through the general blog RSS feed or, if you are not interested in the other things I might publish, through this one, which contains only these articles.

Have you read an article you liked and would you like to suggest it for the next issue? Just contact me!