I asked an A.I. “what does a scientist look like?”

Dane DeSutter, Ph.D.
8 min read · Jan 9, 2023


Artificial Intelligence (A.I.) reflects the inherent biases of the data it is trained on. My experiment with Stable Diffusion suggests that these tools encode male-dominant gender-science stereotypes. Despite this limitation, such technologies can still be directly guided to represent various intersections of science and identity.

This is an article in a series exploring my journey to blend user experience research into my data science work.

Take a moment and close your eyes. Now imagine a scientist.

What do you see?

Perhaps you see a neat white lab coat. Maybe you see a messy experiment bench adorned with a steampunky undulation of glassware, metal, and tubes. The glassware may even be filled with bubbling tonics with wispy vapor tumbling over the necks of oddly shaped flasks.

Stable Diffusion portraits based on the prompt “scientist in lab coat looking away from the camera, colored pencil, simple”

Now ask yourself, what did the person wearing the safety goggles look like?

Children increasingly sketch women when asked to depict a scientist

When researchers ask children aged 6–16 to “draw a scientist” (a task for which we have had data since the 1960s), their drawings reveal an interesting reflection of historical social influences on gender-science stereotypes.

Dr. David Miller and his colleagues published a meta-analysis in 2018 examining the evolution of children’s gender-science stereotypes by analyzing the “Draw-A-Scientist Test” (DAST) dataset. The authors were looking for evidence that the way children depict scientists has changed as the representation of women in science has increased.

What the authors concluded from 78 studies is that:

  1. Children increasingly draw women when they draw scientists
  2. Girls, in particular, have shown the largest change, depicting women more than 50% of the time

Such findings indicate that cultural stereotypes of gender and science have shifted measurably as more women participate in fields like chemistry, biology, astronomy, and physics.

Note: Percentage of children who drew male scientists, presented cross-sectionally by decade, gender, and age cohort. Reprinted from “The development of children’s gender-science stereotypes: A meta-analysis of 5 decades of U.S. Draw-A-Scientist studies,” by D. I. Miller, 2018, Child Development, 89(6), 1943–1955. Copyright 2018 by the Society for Research in Child Development.

A new cultural mirror

As part of a larger exploration of the ways that humans and intelligent algorithms interact, I wanted to better understand A.I. models that generate images.

Stable Diffusion is a generative, text-to-image deep learning model trained on the LAION-5B dataset, a collection of roughly 5 billion image-text pairs scraped from across the web. Images and their captions are grouped into subsets by resolution, caption language, and subject matter.

I wondered whether datasets generated from A.I. image tools like Stable Diffusion—similar to children’s drawings in DAST—might provide a lens into how social stereotypes are encoded in intelligent algorithms.

What does Stable Diffusion think a scientist looks like?

To run this experiment, I generated 132 images from a pre-trained Stable Diffusion model (version 1.5). Images were generated with the positive prompt “illustration of a scientist, colored pencil, simple.” A negative prompt was included to reduce the likelihood that the scientist was drawn “out of frame” or with “bad symmetry,” minimizing random artifacts in the images.
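The article does not specify the tooling used to generate the images, but the Hugging Face diffusers library is a common way to run Stable Diffusion 1.5. The sketch below reproduces the prompting setup described above under that assumption; the checkpoint name and sampling settings are illustrative, not the author’s exact configuration.

```python
# A minimal sketch of the generation step, assuming the Hugging Face
# `diffusers` library and a public Stable Diffusion v1.5 checkpoint.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # public v1.5 weights (assumed)
    torch_dtype=torch.float16,
).to("cuda")

positive_prompt = "illustration of a scientist, colored pencil, simple"
negative_prompt = "out of frame, bad symmetry"

# Generate the 132-image sample described in the article.
for i in range(132):
    image = pipe(
        positive_prompt,
        negative_prompt=negative_prompt,
        num_inference_steps=50,   # assumed sampling settings
        guidance_scale=7.5,
    ).images[0]
    image.save(f"scientist_{i:03d}.png")
```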

Images were then categorized as depicting a man, depicting a woman, or not determined; this scheme was chosen to be consistent with methods in the prior literature. If you would like to analyze these images with a different categorization scheme, the dataset is available here: download Stable Diffusion DAST images (54 MB).
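Because the labels were assigned by hand, tallying them into the percentages reported below takes only a few lines. The file and column names here are hypothetical stand-ins for however the labels were actually recorded.

```python
# A small sketch for tallying hand-assigned labels into percentages.
# The file "sd_dast_labels.csv" and its "label" column are assumptions.
import pandas as pd

labels = pd.read_csv("sd_dast_labels.csv")   # one row per generated image
counts = labels["label"].value_counts()      # man / woman / not determined
shares = (counts / counts.sum() * 100).round(1)
print(shares)
```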

So what did I find?

1. Majority of depicted scientists were men

What I found was that in the sample of generated images, a majority (89%) depicted men. For reference, this is about 30 percentage points higher than the 2016 estimate across all drawings in the DAST dataset.

While I suspected that men might be overrepresented in the data, the magnitude of the discrepancy relative to DAST was surprising.

Sampling of depictions of men generated from prompt to draw a scientist

There was one silver lining in the data: not all scientists were white men.

Though I did not specifically attempt to analyze the intersection of gender and race, a subset of the scientists depicted appeared to be of non-Anglo-European descent.

Non-white scientists received some representation among male scientists

2. Around 1 in 10 illustrations could not be readily categorized

I also found that 8% of the generated scientists could not be tagged as male or female. In some cases, images lacked gender indicators or contained conflicting ones; some did not clearly depict a human at all.

The inability to clearly code drawings by gender also occurs in the DAST dataset, especially when children draw rudimentary sketches like stick figures.

Illustrations that could not be categorized as including archetypal male or female characteristics

3. Women received sparse representation

How did A.I. do when representing women in scientific roles?

Unfortunately, female scientists were very rarely depicted in this dataset. I estimate that only 3% of images generated with the given positive and negative prompt depict women.

Women were only sketched in 3% of images

The representation of non-white female scientists was worse—none were generated in my sample.

The four depictions of women generated from prompt to draw a scientist

These estimates are troubling when viewed against current workforce participation data. Recent data from Pew Research shows that women hold 50% of science, technology, engineering, and mathematics (STEM) jobs.

Women are most heavily represented in health-related careers (74%), followed by life sciences (48%), mathematics (47%), and physical sciences (40%). Computer (25%) and engineering (15%) roles lag further behind.

Data from Pew Research on the representation of women in professions (dot) organized by field (row)

A.I. is our future, but its data is our past

I found it troubling that the gap in depicting women, relative to both the DAST data and workforce participation, was so vast.

Though women’s representation in science has shifted significantly toward parity since the 1960s, the A.I.’s picture of who a scientist is showed a skewed, outdated view of gender-science stereotypes.

Framed against the meta-analysis from Miller and colleagues, A.I. drew women around as often as children did in 1965. This was much more extreme than I had anticipated going into this experiment.

A.I. drew women around as often as children did in 1965

This brings up important questions about how we reckon with the bias in the machine: how do we know that our A.I. is not just repackaging old social problems for a future generation? How do we identify the sources of bias in our data and catch them before they go live?

Theories of how children internalize gender stereotypes point to both direct observation of their social environments, through interactions with people in their families and communities, and indirect observation of gendered roles through traditional and online media.

The relative importance of gender as a social category also means that children frequently look for cues about what is appropriate for their gender. This makes reducing bias especially consequential as intelligent algorithms become more integrated into everyday life.

Bias may seem like an abstract concept: something we collectively know is important but do not have a good grasp of how to address in concrete terms.

Image generation tools like Stable Diffusion make the scope and scale of concepts like gender-science bias visible in a very real way.

Can we alter the mirror’s reflection?

This experiment presented an interesting opportunity to directly observe how intelligent algorithms reflect both the data they learn from and the society that produced that data.

Despite the limitations of A.I. tools when given minimally specified prompts like “draw a scientist,” there may still be potential to wield these tools toward positive social ends.

Models like Stable Diffusion are open source and can be fine-tuned to generate images of novel classes of objects and people.

One such possibility that has gained popularity recently is using A.I. to generate artistic avatars of people from their own camera reels.

While these apps are limited in what they produce, it is possible to fine-tune a model like Stable Diffusion on as few as 3–5 images of a person and still have the entire language model available for generating novel images.

To illustrate this, I trained a model on images of myself and asked Stable Diffusion to draw me as a scientist.
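The article does not name the fine-tuning method, but DreamBooth-style fine-tuning, which the diffusers examples support, is one common way to teach Stable Diffusion a new subject from a handful of photos. The sketch below assumes such a fine-tuned checkpoint already exists; the checkpoint path and the placeholder token “sks person” are illustrative assumptions, not the author’s actual setup.

```python
# A hedged sketch of the personalization step: load a checkpoint fine-tuned
# (e.g., DreamBooth-style) on a few photos of one person, then reuse the
# original style prompt anchored to the learned placeholder token.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "./sd-finetuned-on-my-photos",   # hypothetical fine-tuned checkpoint
    torch_dtype=torch.float16,
).to("cuda")

negative_prompt = "out of frame, bad symmetry"

# The same style prompt as before, now tied to the learned subject.
prompt = "illustration of sks person as a scientist, colored pencil, simple"
pipe(prompt, negative_prompt=negative_prompt).images[0].save("me_as_scientist.png")

# Because the full language model is still available, the prompt can be
# augmented with other concepts about the person (interests, identities, settings).
augmented = "illustration of sks person as an astronomer on a space station, colored pencil, simple"
pipe(augmented, negative_prompt=negative_prompt).images[0].save("me_augmented.png")
```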

Admittedly some of the images captured, erm, more of my sense of humor than science skill.

The algorithm somehow guessed I don’t take things too seriously

But some of the images did a really wonderful job of showing me what I might have sketched as a child envisioning myself as a scientist (though with much more artistic skill).

How Stable Diffusion illustrated me as a scientist

With the entire language model at my fingertips, I was able to personalize the drawing prompt to also include relevant concepts about me. Here we can see that I was able to generate some representations of the intersection of my queer and scientific identities.

Illustrations depicting both my queer and scientific identities with an augmented prompt

Focusing the reflection with our own lens

From this experiment I relearned an important lesson. Stable Diffusion, like many other machine learning models, is just a tool.

Tools can be wielded in a number of ways — for good or ill — but the person holding the tool gets to make those choices. By understanding how these and similar ML/A.I. tools work, we have the capability to shape their outputs in new and interesting ways.

Imagine, for example, an intervention where we use A.I. to let children see themselves as scientists. They could, quite literally, see what they would look like as a veterinarian, an astronomer, a video game developer, or even an inhabitant of an interstellar space station.

Children could see themselves as a veterinarian, physicist, video game developer, or even an inhabitant of an interstellar space station

Stable Diffusion could be used to let children explore seeing themselves as scientists

A.I. tools surely will come to us with biases, but we can still exert some creative control over how those algorithms represent our world.

In the case of image generation, we can personalize the experience by training the model to render a particular person. The language model can then be used to layer on other relevant concepts about that individual, including their interests, salient identity characteristics, and their own vision of what a scientist is and does.

The challenge now lies in understanding how A.I. reflects a world that is fundamentally static and rooted in the past. We need to be vigilant about the ways this past world no longer represents our lived experience and be prepared to guide these tools toward different outputs.

We may not be putting pencil to paper, but we have an opportunity to sketch a new generation of ML/A.I. tools that can have a positive social impact.

Disclaimer: These writings reflect my own opinions and not necessarily those of my employer.


Dane DeSutter, Ph.D.

Dane is a data scientist at Catalyst Education. He likes to build things, break things, and ask "what is that thing and what could it do?"