  • YouTube Is Improperly Collecting Children’s Data, Consumer Groups Say.

    A coalition of more than 20 consumer advocacy groups is expected to file a complaint with federal officials on Monday claiming that YouTube has been violating a children’s privacy law.

    The complaint contends that YouTube, a subsidiary of Google, has been collecting and profiting from the personal information of young children on its main site, although the company says the platform is meant only for users 13 and older.

  • As Banks Embrace Biometric Tracking of Customers, Cybertheft Explodes in Mexico.

    Criminal organizations in Mexico have branched out into a lucrative new market and revenue stream: big data. They have developed innovative practices to obtain sensitive user information, lifting data from the databases of government agencies such as Condusef, Consar and Buró de Crédito. They call bank customers while spoofing, on the caller ID screen, the phone number of the bank they claim to represent. To gain the target’s trust, they read out the credit card security code and ask whether it matches the one on the back of the card. And it goes from there. Now they are about to be gifted an invaluable cache of data: the biometric identifiers of Mexican bank customers.

  • Chinese man caught by facial recognition at pop concert (via RISKS).

    Chinese police have used facial recognition technology to locate and arrest a man who was among a crowd of 60,000 concertgoers.

    The suspect, who has been identified only as Mr Ao, was attending a concert by pop star Jacky Cheung in Nanchang city last weekend when he was caught.


  • Mark Zuckerberg Says Facebook Will Have AI to Detect Hate Speech In ‘5-10 years’.

    Zuckerberg was responding to a question from Republican Sen. John Thune, who asked what steps Facebook currently takes to identify hate speech on the platform and what some of the challenges are in doing so. In his response, Zuckerberg noted that in Facebook’s early days the company didn’t have AI tools that could automatically flag content, but that such tools have since been deployed. Zuckerberg claimed that more than 90 percent of pro-ISIS or al-Qaeda content is currently flagged automatically by machines, and that last year the social network started using automated tools to detect when users may be at risk of self-harm and to intervene.
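
Facebook’s actual classifiers are proprietary machine-learning models, but the flag-for-review pipeline Zuckerberg describes can be caricatured with a deliberately naive sketch. Everything below (the phrase list, the function name) is invented for illustration; real systems learn from labelled data rather than matching keywords, which is partly why a narrow category like known terrorist propaganda is easier to flag automatically than hate speech in general.

```python
# Toy content-flagging sketch: score each post against a (hypothetical)
# phrase list and queue anything over a threshold for review. Real
# moderation systems use learned classifiers, not keyword matching.

BANNED_PHRASES = ["join the caliphate", "death to"]  # illustrative only

def flag_for_review(posts, threshold=1):
    """Return the subset of posts whose banned-phrase count meets the threshold."""
    flagged = []
    for post in posts:
        score = sum(phrase in post.lower() for phrase in BANNED_PHRASES)
        if score >= threshold:
            flagged.append(post)
    return flagged
```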

  • Up Next: A Better Recommendation System. But what is the purpose of a recommender system: to give you something you really like, or to keep you clicking around the site with near misses so it can load more ads?

    I’ve been a Pinterest user for a long time. I have boards going back years, spanning past interests (art deco weddings) and more recent ones (rubber duck-themed first birthday parties). When I log into the site, I get served up a slate of relevant recommendations—pins featuring colorful images of baby clothes alongside pins of hearty Instant Pot recipes. With each click, the recommendations get more specific. Click on one chicken soup recipe, and other varieties appear. Click on a pin of rubber duck cake pops, and duck cupcakes and a duck-shaped cheese plate quickly populate beneath the header “More like this.”

    These are welcome, innocuous suggestions. And they keep me clicking.
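
The “More like this” behaviour described above can be sketched as a minimal content-based recommender: score every item’s tag overlap with the clicked item and surface the closest matches. The catalogue, tags and function names below are all made up for illustration; a production system like Pinterest’s uses learned embeddings rather than hand-written tags.

```python
# Minimal content-based "More like this" sketch: rank items by the
# Jaccard similarity of their tag sets to the item just clicked.

CATALOG = {  # hypothetical items and tags
    "duck cake pops":    {"rubber duck", "cake", "party"},
    "duck cupcakes":     {"rubber duck", "cupcake", "party"},
    "duck cheese plate": {"rubber duck", "cheese", "party"},
    "chicken soup":      {"soup", "chicken", "recipe"},
}

def jaccard(a, b):
    """Overlap of two tag sets: |intersection| / |union|."""
    return len(a & b) / len(a | b)

def more_like_this(clicked, k=2):
    """Return the k catalogue items most similar to the clicked one."""
    scores = {name: jaccard(CATALOG[clicked], tags)
              for name, tags in CATALOG.items() if name != clicked}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Clicking “duck cake pops” surfaces the other duck-themed items first, mirroring how each click narrows the slate.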

  • Google works out a fascinating, slightly scary way for AI to isolate voices in a crowd. Original research post.

    The company says this tech works on videos with a single audio track and can isolate voices in a video algorithmically, depending on who's talking, or by having a user manually select the face of the person whose voice they want to hear.

    Google says the visual component here is key, as the tech watches for when a person's mouth is moving to better identify which voices to focus on at a given point and to create more accurate individual speech tracks for the length of a video.
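
Google’s system is a learned audio-visual network, but the selection idea (use mouth movement to decide whose voice to keep) can be caricatured in a few lines. Everything here is a made-up toy: per-frame mouth-activity scores stand in for the visual model, and per-speaker tracks stand in for what the real model has to separate out of a single mixed track.

```python
# Toy sketch of visually guided voice selection: weight each speaker's
# audio by their normalised mouth activity at each step, so whoever is
# visibly talking dominates the output. (The real model separates voices
# from ONE mixed track; here the tracks are conveniently pre-separated.)

def isolate_by_mouth_activity(speaker_audio, mouth_activity):
    """speaker_audio: {name: samples}; mouth_activity: {name: per-frame scores in [0, 1]}."""
    n = len(next(iter(speaker_audio.values())))
    output = []
    for t in range(n):
        total = sum(act[t] for act in mouth_activity.values()) or 1.0
        output.append(sum(speaker_audio[name][t] * mouth_activity[name][t] / total
                          for name in speaker_audio))
    return output
```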

  • Unethical growth hacks: A look into the growing YouTube news bot epidemic. Hacker News discussion. This is, once again, something entirely created by the incentive structure. As Dr. Ian Malcolm would say, capitalism finds a way.

    Someone has effectively created a fully automated process, running 24/7, that takes recent articles, strips them down, converts them into video format, and posts them on YouTube as their own. And while doing so, they take credit for the work and reap all the rewards, such as revenue and influence, that come with it.

  • Police use Experian Marketing Data for AI Custody Decisions (via Techdirt).

    Durham Police has paid global data broker Experian for UK postcode stereotypes built on 850 million pieces of information to feed into an artificial intelligence (AI) tool used in custody decisions, a Big Brother Watch investigation has revealed.

    Durham Police is feeding Experian’s ‘Mosaic’ data, which profiles all 50 million adults in the UK to classify UK postcodes, households and even individuals into stereotypes, into its AI ‘Harm Assessment Risk Tool’ (HART). The 66 ‘Mosaic’ categories include ‘Disconnected Youth’, ‘Asian Heritage’ and ‘Dependent Greys’.

    Durham Police’s AI tool processes Experian’s ‘Mosaic’ data and other personal information to predict whether a suspect might be at low, medium or high risk of reoffending.
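
HART itself is a random-forest model over many variables, but the concern raised above can be illustrated with a deliberately simple additive sketch. The weights and category names below are hypothetical; the point is only that two suspects with identical records can land in different risk bands purely because of the postcode stereotype fed into the score.

```python
# Toy risk-banding sketch (NOT the real HART model): an additive score
# over criminal history plus a hypothetical weight for the suspect's
# Experian-style postcode category, cut into low/medium/high bands.

POSTCODE_WEIGHT = {"category_a": 0, "category_b": 2}  # hypothetical weights

def risk_band(prior_offences, postcode_category):
    """Band a suspect as low/medium/high from a toy additive score."""
    score = prior_offences + POSTCODE_WEIGHT[postcode_category]
    if score >= 4:
        return "high"
    if score >= 2:
        return "medium"
    return "low"
```

With identical histories, `risk_band(1, "category_a")` comes out low while `risk_band(1, "category_b")` comes out medium, which is exactly the kind of postcode effect the investigation flags.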

Data Links is a periodic blog post, published on Sundays (specific time may vary), containing interesting links about data science, machine learning and related topics. You can subscribe to it using the general blog RSS feed or, if you are not interested in other things I might publish, this one, which only contains these articles.

Have you read an article you liked and would like to suggest it for the next issue? Just contact me!