New data analysis competitions
- Kaggle's Toxic Comment Classification Challenge. $35,000 in prizes.
China has been building what it calls "the world's biggest camera surveillance network". Across the country, 170 million CCTV cameras are already in place and an estimated 400 million new ones will be installed in the next three years.
Many of the cameras are fitted with artificial intelligence, including facial recognition technology. The BBC's John Sudworth has been given rare access to one of the new hi-tech police control rooms.
Safari in Arms Race Against Trackers - Criteo Feels the Heat. A very good example of current privacy wars.
All popular browsers give users control over who gets to set cookies, but Safari is the only one that blocks third-party cookies (those set by a domain other than the site you are visiting) by default. (Safari's choice is important because only 5-10% of users ever change default settings in software.) Criteo relies on third-party cookies. Since users have little reason to visit Criteo's own website, the company gets its cookies onto users' machines through its integration on many online retail websites. Safari's cookie blocking is a major problem for Criteo, especially given how large and lucrative the iPhone user base is. Rather than accept this, Criteo has repeatedly implemented ways to defeat Safari's privacy protections.
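The "third-party" distinction boils down to comparing the domain that sets the cookie against the domain of the page you are visiting. A minimal sketch (domain names are made up, and real browsers compare registrable domains, i.e. eTLD+1, rather than doing a plain suffix check):

```python
from urllib.parse import urlsplit

def is_third_party(page_url: str, cookie_domain: str) -> bool:
    """Treat a cookie as third-party when its domain does not match
    the host of the page the user is actually visiting.
    (Simplified: real browsers compare registrable domains.)"""
    page_host = urlsplit(page_url).hostname or ""
    cookie_domain = cookie_domain.lstrip(".")
    return not (page_host == cookie_domain
                or page_host.endswith("." + cookie_domain))

# A Criteo cookie set while the user browses a retailer's site is
# third-party; the retailer's own cookie is first-party.
print(is_third_party("https://shop.example.com/cart", ".criteo.com"))  # True
print(is_third_party("https://shop.example.com/cart", "example.com"))  # False
```

This is exactly why Criteo's embed-on-retail-sites strategy collides with Safari's default: every cookie it sets in that position falls on the "third-party" side of the check.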
In August 2016, Australia's federal Department of Health published the medical billing records of about 2.9 million Australians online. These records, drawn from the Medicare Benefits Scheme (MBS) and the Pharmaceutical Benefits Scheme (PBS), contained one billion lines of historical health data covering around 10 per cent of the population.
These longitudinal records were de-identified, a process intended to prevent a person’s identity from being connected with information, and were made public on the government’s open data website as part of its policy on accessible public data.
Now we find that patients can be re-identified: known information about a person is enough to locate their record. This was disclosed to the Department in December 2016.
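The attack described is linkage re-identification: the published records carry no names, but a handful of facts known about a person can still single out their row. A toy sketch with invented data (real longitudinal health records have far more quasi-identifiers, which makes matching easier, not harder):

```python
# "De-identified" records: no names, just attributes. Invented data.
records = [
    {"id": "A", "birth_year": 1957, "service_dates": {"2014-03-02", "2015-07-19"}},
    {"id": "B", "birth_year": 1957, "service_dates": {"2014-03-02", "2016-01-05"}},
    {"id": "C", "birth_year": 1983, "service_dates": {"2015-07-19", "2015-08-01"}},
]

# What an attacker might already know about a neighbour, colleague or
# public figure: their birth year and the dates of two medical visits.
known = {"birth_year": 1957, "service_dates": {"2014-03-02", "2015-07-19"}}

matches = [r for r in records
           if r["birth_year"] == known["birth_year"]
           and known["service_dates"] <= r["service_dates"]]

# Only one candidate survives, so the "anonymous" record is re-identified.
print([r["id"] for r in matches])  # ['A']
```

The point is that de-identification removed the names but not the uniqueness: once a combination of attributes is rare in the dataset, it works as well as a name.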
The algorithms that play increasingly central roles in our lives often emanate from Silicon Valley, but the effort to hold them accountable may have another epicenter: New York City. Last week, the New York City Council unanimously passed a bill to tackle algorithmic discrimination — the first measure of its kind in the country.
The algorithmic accountability bill, waiting to be signed into law by Mayor Bill de Blasio, establishes a task force that will study how city agencies use algorithms to make decisions that affect New Yorkers’ lives, and whether any of the systems appear to discriminate against people based on age, race, religion, gender, sexual orientation or citizenship status. The task force’s report will also explore how to make these decision-making processes understandable to the public.
London's Metropolitan Police is training image-recognition software to scan seized devices for indecent images, and the system keeps tripping over an unexpected subject: deserts. "Sometimes it comes up with a desert and it thinks it's an indecent image or pornography," Mark Stokes, the department's head of digital and electronics forensics, recently told The Telegraph. "For some reason, lots of people have screen-savers of deserts and it picks it up thinking it is skin colour."
Data Links is a periodic blog post published on Sundays (specific time may vary) which contains interesting links about data science, machine learning and related topics. You can subscribe to it using the general blog RSS feed or, if you are not interested in other things I might publish, this one, which only contains these articles.
Have you read an article you liked and would like to suggest it for the next issue? Just contact me!