  • The House That Spied on Me.

    In December, I converted my one-bedroom apartment in San Francisco into a “smart home.” I connected as many of my appliances and belongings as I could to the internet: an Amazon Echo, my lights, my coffee maker, my baby monitor, my kid’s toys, my vacuum, my TV, my toothbrush, a photo frame, a sex toy, and even my bed.

  • The CLOUD Act: A Dangerous Expansion of Police Snooping on Cross-Border Data.

    The Clarifying Overseas Use of Data (CLOUD) Act expands American and foreign law enforcement’s ability to target and access people’s data across international borders in two ways. First, the bill creates an explicit provision for U.S. law enforcement (from a local police department to federal agents in Immigration and Customs Enforcement) to access “the contents of a wire or electronic communication and any record or other information” about a person regardless of where they live or where that information is located on the globe. In other words, U.S. police could compel a service provider—like Google, Facebook, or Snapchat—to hand over a user’s content and metadata, even if it is stored in a foreign country, without following that foreign country’s privacy laws.


  • [PDF] The Future Computed. Artificial Intelligence and its role in society.

  • Customer Satisfaction at the Push of a Button. The important thing about gathering data... is gathering enough (valid) data. Then you can process it. So make it easy.

    What HappyOrNot’s gas-station data lacked in substance, though, they made up for in volume. A perennial challenge in polling is gathering responses from enough people to support meaningful conclusions. The challenge grows as the questions become more probing, since people who have the time and the inclination to fill out long, boring surveys aren’t necessarily representative customers. Even ratings on Amazon and on, which are visited by millions of people every day, are often based on so few responses that a single positive or negative review can affect customer purchases for months. In 2014, a study of more than a million online restaurant reviews, on sites including Foursquare, GrubHub, and TripAdvisor, found that the ratings were influenced by a number of “exogenous” factors, unrelated to food quality—among them menu prices (higher is better) and the weather on the day the reviews were written (worse is worse).

  • Greedy, Brittle, Opaque, and Shallow: The Downsides to Deep Learning.

    According to skeptics like Marcus, deep learning is greedy, brittle, opaque, and shallow. The systems are greedy because they demand huge sets of training data. Brittle because when a neural net is given a “transfer test”—confronted with scenarios that differ from the examples used in training—it cannot contextualize the situation and frequently breaks. They are opaque because, unlike traditional programs with their formal, debuggable code, the parameters of neural networks can only be interpreted in terms of their weights within a mathematical geography. Consequently, they are black boxes, whose outputs cannot be explained, raising doubts about their reliability and biases. Finally, they are shallow because they are programmed with little innate knowledge and possess no common sense about the world or human psychology.

Data Links is a periodic blog post, published on Sundays (the specific time may vary), that collects interesting links about data science, machine learning, and related topics. You can subscribe to it through the general blog RSS feed or through this one, which contains only these articles, if you are not interested in the other things I might publish.

Have you read an article you liked and would you like to suggest it for the next issue? Just contact me!