New data analysis competitions
- If you are in Madrid, this will interest you: Madrid's City Hall, in collaboration with Medialab-Prado, has started two competitions related to data analysis and data journalism: 1, 2. Be sure to check them out.
How-to
- (PDF) Automated Crowdturfing Attacks and Defenses in Online Review Systems.
In this paper, we identify a new class of attacks that leverage deep learning language models (Recurrent Neural Networks or RNNs) to automate the generation of fake online reviews for products and services. Not only are these attacks cheap and therefore more scalable, but they can control rate of content output to eliminate the signature burstiness that makes crowdsourced campaigns easy to detect.
Privacy
- Alibaba launches ‘smile to pay’ facial recognition system at KFC in China.
The service allows customers to process their payment simply by smiling after placing their order at one of the fast food restaurant's self-serve screens. A 3-D camera then scans the customer's face to verify their identity. An additional phone number verification option is available for added security.
- AI Will Soon Identify Protesters With Their Faces Partly Concealed.
A new paper to be presented at the IEEE International Conference on Computer Vision Workshops (ICCVW) introduces a deep-learning algorithm—a subset of machine learning used to detect and model patterns in large heaps of data—that can identify an individual even when part of their face is obscured. The system was able to correctly identify a person concealed by a scarf 67 percent of the time when they were photographed against a "complex" background, which better resembles real-world conditions.
Tech
- This is a very nice story with a fairly straightforward lesson: set your PRNG seeds, and keep a record of them. Source 1, source 2.
When Pixar wanted to release its 2003 film Finding Nemo for Blu-ray 3D in 2012, the studio had to rerender the film to produce the 3D effects. The studio by then was no longer using the same animation software system, and it found that certain aspects of the original could not be emulated in its new software. The movement of seagrass, for instance, had been controlled by a random number generator, but there was no way to retrieve the original seed value for that generator. So animators manually replicated the plants’ movements frame by frame, a laborious process. The fact that the studio had lost access to its own film after less than a decade is a sobering commentary on the challenges of archiving computer-generated work.
Thanks to the cryptography mailing list for this link.
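To make the lesson concrete, here is a minimal sketch in Python of what "set your PRNG seeds" can look like. Everything in it is hypothetical and for illustration only: the `seagrass_offsets` function, the seed value and the idea of one sway offset per frame are my assumptions, not a description of how Pixar's software worked.

```python
import random


def seagrass_offsets(seed, frames=5):
    """Return one pseudo-random sway offset per frame (hypothetical example)."""
    # A private generator keeps this sequence independent of the global
    # random state, so other code calling random.* cannot disturb it.
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(frames)]


# Arbitrary illustrative seed; the point is to store it with the output.
SEED = 2003

first_render = seagrass_offsets(SEED)
second_render = seagrass_offsets(SEED)

# Same seed and same generator: the "animation" is reproduced exactly.
print(first_render == second_render)  # True
```

Note that a stored seed only helps as long as the same generator algorithm is still available, so it is worth recording the library and version next to the seed as well.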
Visualizations
Data Links is a periodic blog post published on Sundays (the specific time may vary) that collects interesting links about data science, machine learning, and related topics. You can subscribe to it through the general blog RSS feed or through this one, which contains only these articles, if you are not interested in the other things I might publish.
Have you read an article you liked and would you like to suggest it for the next issue? Just contact me!