(When) might algorithms know us better than we know ourselves?

Nadia Piet
8 min read · Dec 14, 2019

--

Disclaimer: this is not speculation (nor is it 100% peer-reviewed, flawlessly replicable science, ok).

Who knows you?

What does it mean to say an algorithm knows you better than your colleagues? Better than your friends? Better than yourself?

The first glimpses of this phenomenon appeared to me when I learned about the use of psychographic personas in Growth Tribe’s AI & marketing course preview. Psychographic personas are tools providing marketers with a deep data-driven analysis of each user’s behavior, psyche, preferences, and vulnerabilities to inform their targeting.

Then I came across the concept again watching WIRED editor Nick Thompson’s interview with Tristan Harris and Yuval Noah Harari (some of the great thinkers of our time), in which Harari says “it’s the end of the poker face, the end of the hidden parts of your personality”.

Since The Great Hack came out on Netflix it seems everyone has woken up to this truth, hailing the Cambridge Analytica situation as its prime example. To refer to it as a scandal would imply that this is an exception — it isn’t.

While it sounds like an intriguing Black Mirror plot, it is an emerging reality. Over the following months, I kept an eye out for present-day examples and developments of algorithms predicting increasingly intimate matters.

Equal parts fascinating and frightening, my findings are shared below, including examples and interactive experiences. I’ll leave the conclusion to you.

Personality predictions based on FB page likes

A Cambridge and Stanford University study from 2015 compared “a set of personality predictions made by the computer (based on Facebook page likes), with a set of predictions made by friends and family, and by the person themselves in a self-assessment”.

What they found was that with the input of just 10 page likes, the algorithm’s predictions were as accurate as a work colleague’s. With 300, it predicts your personality as well as your spouse does.

In short:

The study reports how many page likes the algorithm needs to match the personality judgments of the people around you: roughly 10 to rival a work colleague, 70 a friend, 150 a family member, and 300 a spouse. One blank remains: how much data does the algorithm need to know you better than you know yourself?
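For intuition, here’s a minimal Python sketch of the kind of pipeline such studies describe: compress the sparse user-by-page like matrix into a low-dimensional embedding, then fit one regression model per Big Five trait. Everything below is random stand-in data, so the held-out correlation will hover near zero; the point is the shape of the pipeline, not the score.

```python
# Minimal sketch: personality prediction from page likes.
# Synthetic data; loosely follows the SVD + per-trait regression
# approach reported in the likes-based prediction studies.
import numpy as np
from scipy.sparse import random as sparse_random
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_users, n_pages = 2_000, 5_000

# Binary user-by-page matrix: 1 where the user liked the page.
likes = sparse_random(n_users, n_pages, density=0.01, random_state=0)
likes.data[:] = 1.0

# Stand-in "ground truth" Openness scores from a questionnaire.
openness = rng.normal(size=n_users)

# Compress likes into a 100-dimensional embedding per user.
svd = TruncatedSVD(n_components=100, random_state=0)
X = svd.fit_transform(likes)

# One linear model per trait; repeat for the other Big Five traits.
X_tr, X_te, y_tr, y_te = train_test_split(X, openness, random_state=0)
model = Ridge(alpha=1.0).fit(X_tr, y_tr)
print("r on held-out users:", np.corrcoef(model.predict(X_te), y_te)[0, 1])
```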

Inferences from behavioral data for deeper insights

In the real world, the algorithm’s analysis isn’t restricted to what you’re explicitly sharing, such as status updates or page likes. It also draws inferences from many subtler (and sometimes surprising) data points: your typing style, clicking speed, number of unlocks, distance from other devices, conversational tone, internet speed, number of devices, and so on.

Panoptykon Foundation’s visualization of the 3 layers of digital identity

Panoptykon Foundation elaborates on these three layers of inference in their article “Your digital identity has three layers, and you can only protect one of them”.

DIY: View your own psychographic persona / personality prediction

If you’re skeptical, or curious what the algorithm would make of you, you’re in luck. You can run a personality test similar to the one discussed above in your own browser with:

  • IBM Watson Personality Insights — web page, based on plain text or a Twitter profile (see the API sketch below)
  • Crystal — Chrome extension, based on a LinkedIn profile
  • Apply Magic Sauce — web page, based on Twitter, Facebook, and LinkedIn data. A bit more work, as you have to download and upload your own user data rather than use a social sign-in
Preview of Crystal’s analysis of Bill Gates
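If you’d rather query Watson programmatically than paste text into the demo page, a minimal sketch with IBM’s Python SDK (`pip install ibm-watson`) looks roughly like this. The API key, service URL, and input file are placeholders you’d swap for your own IBM Cloud credentials and writing sample:

```python
# Minimal sketch: score a plain-text writing sample with IBM Watson
# Personality Insights. Placeholders: YOUR_API_KEY, YOUR_SERVICE_URL,
# and my_tweets.txt (any text sample of a few hundred words or more).
from ibm_watson import PersonalityInsightsV3
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

service = PersonalityInsightsV3(
    version="2017-10-13",
    authenticator=IAMAuthenticator("YOUR_API_KEY"),
)
service.set_service_url("YOUR_SERVICE_URL")

with open("my_tweets.txt") as f:
    text = f.read()

profile = service.profile(
    text, accept="application/json", content_type="text/plain"
).get_result()

# Big Five traits come back as percentiles relative to Watson's sample.
for trait in profile["personality"]:
    print(f'{trait["name"]}: {trait["percentile"]:.0%}')
```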

Seeing is believing and mine is pretty on-point:

IBM Watson Personality Predictions run on my Twitter profile @nadiapiet

How do you feel about its analysis? Did it pick up on anything you wish was as obvious to some of your friends or family members?

Psychographic personas are a typical example, but they’re just the tip of the iceberg. From predicting your sexual orientation to knowing when you’re about to quit your job, let’s look at a handful of examples where the algorithm exceeds human judgment in increasingly intimate matters.

A handful of weird predictions the algorithm can make about you

💑 Knows if you’re staying with your partner (with 80% accuracy)

In a study at the University of Southern California led by Shri Narayanan, couples recorded 10-minute conversations about their disagreements; the researchers then predicted relationship strength, and whether the couple would still be together 2 years later, from those recordings.

Human therapists assessed video recordings, including facial expressions and body language, while the algorithm analyzed only features of speech such as intonation, speech duration, and how the couple took turns speaking, excluding even the meaning of the words. Most notably, it picked up on features of speech beyond human perception, like spectral tilt.

Ultimately, the algorithm could predict whether a couple would be together 2 years later with 79.3% accuracy. A tiny bit better than the therapists (at 75.6%), and substantially better than most of us.
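To make the setup concrete, here’s an illustrative Python sketch of the general technique: summarize prosodic features of each recording and train a classifier on them. This is not the USC team’s actual feature set or model, and the clip paths and labels below are hypothetical:

```python
# Illustrative sketch, not the USC pipeline: crude prosodic features
# (pitch level/variability, voiced fraction, spectral shape) feeding
# a stock classifier. Clip paths and labels are hypothetical.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def prosodic_features(path):
    y, sr = librosa.load(path, sr=16000)
    # Frame-by-frame fundamental frequency (the intonation contour).
    f0, voiced, _ = librosa.pyin(y, fmin=65, fmax=400, sr=sr)
    f0 = f0[~np.isnan(f0)]
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr)
    return np.array([
        f0.mean(), f0.std(),              # pitch level and variability
        voiced.mean(),                    # fraction of voiced frames
        centroid.mean(), centroid.std(),  # rough spectral shape
    ])

clips = ["couple_01.wav", "couple_02.wav"]  # hypothetical recordings
still_together = np.array([1, 0])           # hypothetical 2-year labels

X = np.array([prosodic_features(p) for p in clips])
model = RandomForestClassifier(random_state=0).fit(X, still_together)
```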

Would you want to run your relationship data through this algorithm? Would it become a self-fulfilling prophecy?

💼 Knows when you’re about to quit your job (with 95% accuracy)

IBM has developed and deployed an algorithm that can predict when employees are planning to leave their jobs with 95% accuracy.

Patented as the “predictive attrition program”, it analyzes many data points to assess an employee’s flight risk and suggest engagement actions for managers. Although we don’t have quantitative data on the human benchmark, we know our efforts don’t match up. IBM’s CEO Ginni Rometty claims the algorithm has saved IBM $300 million in retention costs and, along with a handful of other AI applications, has allowed the company to cut its HR workforce by 30%.
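For flavor, here’s a hypothetical sketch of what such an attrition model could look like. The features, data, and model choice below are invented for illustration; IBM’s patented program is not public:

```python
# Hypothetical attrition model: synthetic HR features, synthetic
# "quit within a year" labels, stock gradient-boosted classifier.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 5_000
df = pd.DataFrame({
    "years_since_promotion": rng.integers(0, 10, n),
    "salary_vs_market": rng.normal(1.0, 0.15, n),  # ratio to market rate
    "overtime_hours": rng.exponential(5, n),
    "manager_changes": rng.integers(0, 4, n),
})

# Synthetic labels loosely tied to the features via a logistic link.
logit = (0.3 * df.years_since_promotion
         - 4.0 * (df.salary_vs_market - 1.0)
         + 0.1 * df.overtime_hours - 2.0)
y = rng.random(n) < 1 / (1 + np.exp(-logit))

X_tr, X_te, y_tr, y_te = train_test_split(df, y, random_state=0)
model = GradientBoostingClassifier().fit(X_tr, y_tr)
print("held-out accuracy:", model.score(X_te, y_te))

# HR would act on ranked probabilities ("flight risk"), not labels.
flight_risk = model.predict_proba(X_te)[:, 1]
```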

Could it know you’re quitting before you’re really planning to? If HR takes action based on the prediction, can it prevent you from quitting?

👅 Knows your sexuality from your face (with up to 91% accuracy)

Yilun Wang and Michael Kosinski conducted a study on 5,000 facial images from a U.S. dating website, training an algorithm to distinguish between straight and gay faces. It was right 81% of the time for men and 74% of the time for women. Accuracy rose to 91% and 83% respectively when the computer evaluated five images per person, while humans benchmarked at just 61% and 54%.

Surrounded by a whole lot of (justified) controversy and questionable research methods, AI-powered “gaydar” is, or might soon be, a thing. Makes you wonder: could it know you’re gay before you do?

And there’s more.

These are just a handful of examples among a myriad of highly personal predictive algorithms. Aware of the flawed research methods of some of these studies, I’d like to point out that this article isn’t about the credibility of any one of them individually, but about the larger development they collectively make apparent.

“A golden age of behavioural research”

Eric Siegel, chairman of the Predictive Analytics World conference, framed it as “a golden age of behavioural research. It’s amazing how much we can figure out about how people think [and act] now.”

With all this in mind, we can answer the question in the title. Yes, certain algorithms and pre-trained models are able to conclude things about you that you (or your close ones) don’t know (yet). So what does that mean?

While this instantly recalls a 1984-type plot, and has found its early adoption in advertising and propaganda, there are a few ways to think about it.

  1. We must imagine alternative use cases where we can leverage these insights for good. For health, education, justice.
  2. We can perhaps learn about ourselves as individuals and societies by gazing into our algorithmic reflections.
  3. We have a responsibility to think, discuss, and steer these developments as they increasingly shape the world around us. If you’re feeling triggered, here are some of the questions worth pondering / discussing:
  • What does it mean when an algorithm knows things about me that I don’t?
  • Does the algorithm know what’s best for me? Better than I do, perhaps?
    Say it makes better decisions, in which cases would I give up the autonomy to make those choices? What are the implications of outsourcing agency to intelligent, (semi-)autonomous systems?
  • Who is able to collect and access the data? Companies, employers, governments? In which cases is it justified to act on it?
  • What if a prediction is false at a point where we’ve grown to trust the system? How would dodging data points influence one’s ability to participate in society? What remains of our private life (vs public) when data is available on increasingly intimate matters, especially in nations with limited freedom?
  • How can we create systems mindful of the limitations of capturing subjective human values in quantitative measures?

As we generate, share, and analyze more data, fine-tune algorithms, and gain access to pre-trained models, these predictive machines will only become more accurate, nuanced, and widespread.

As we buy Google Homes, Fitbits, and smart fridges, we must begin to think about, and design, the principles and parameters of these supposed insights now rather than later.

Know thyself

Relying on algorithmic insights as a source of truth reminds me of the tradition of scrying, except instead of peering into ponds, crystal balls, or coffee grounds, we now dig into datasets. Or of how the ancient Greeks would turn to the Oracle of Delphi for answers. As we enter this era, let’s gracefully remember the inscription at the Oracle’s entrance urging us to “know thyself”, and proceed with a similar awareness.

--

Nadia Piet

Designer & researcher focussed on AI/ML, data, digital culture & the human condition mediated through computing